tf-serving

"tf"

Posted by zwt on December 19, 2020

Environment setup

Install Docker
docker pull tensorflow/serving
Create the folder tmp/tfserving
git clone https://github.com/tensorflow/serving

Training and saving the model

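# Export directory; the trailing "1" is the model version. A raw string keeps the backslashes literal.
model_output_dir = r"D:\zwt\work_test\qa_robot_test\logs\1"
# The keys of inputs/outputs become the tensor names used in the serving signature.
tf.saved_model.simple_save(session, model_output_dir,
                           inputs={"char_inputs": self.model.char_inputs},
                           outputs={"outputs": self.model.prediction})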

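Under the tmp folder, create a new folder named after the model and copy the exported model files (including the numeric version folder) into it.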
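The copy step can also be scripted. The following is a minimal sketch, assuming the export directory ends in a numeric version folder (as logs\1 does above) and that the target layout is <model name>/<version>/. The helper name stage_model and its arguments are hypothetical and not part of the original setup.

```python
import shutil
from pathlib import Path


def stage_model(export_dir, serving_root, model_name):
    """Copy an exported SavedModel version into serving_root/model_name/<version>."""
    export_dir = Path(export_dir)
    version = export_dir.name
    if not version.isdigit():
        raise ValueError(f"expected a numeric version directory, got {version!r}")
    target = Path(serving_root) / model_name / version
    # copytree creates the intermediate directories and raises if the target
    # already exists, so a staged version is never silently overwritten.
    shutil.copytree(export_dir, target)
    return target
```

With the paths from this post, a call might look like stage_model(model_output_dir, r"D:\zwt\docker\tmp", "transformer").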

Start the service:

    - docker run -t --rm -p 8500:8500 -p 8501:8501 --mount type=bind,source=D:\\zwt\\docker\\tmp\\transformer\\,target=/models/transformer  -e MODEL_NAME=transformer  tensorflow/serving &

Check that the service is up: curl http://localhost:8501/v1/models/transformer

The response is:
{
 "model_version_status": [
  {
   "version": "1",
   "state": "AVAILABLE",
   "status": {
    "error_code": "OK",
    "error_message": ""
   }
  }
 ]
}
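The same check can be done programmatically by reading the state field of each entry. A small sketch, assuming the response body has the shape shown above; the function name available_versions is hypothetical.

```python
import json


def available_versions(status_json):
    """Return the versions whose state is AVAILABLE in a model-status response."""
    status = json.loads(status_json)
    return [
        entry["version"]
        for entry in status.get("model_version_status", [])
        if entry.get("state") == "AVAILABLE"
    ]


sample = (
    '{"model_version_status": [{"version": "1", "state": "AVAILABLE", '
    '"status": {"error_code": "OK", "error_message": ""}}]}'
)
print(available_versions(sample))
```

In practice the string would come from the body of a GET request to http://localhost:8501/v1/models/transformer.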

curl http://localhost:8501/v1/models/transformer/metadata

{
 "model_spec": {
  "name": "transformer",
  "signature_name": "",
  "version": "1"
 },
 "metadata": {
  "signature_def": {
   "signature_def": {
    "serving_default": {
     "inputs": {
      "myInput": {
       "dtype": "DT_FLOAT",
       "tensor_shape": {
        "dim": [
         {"size": "-1", "name": ""},
         {"size": "100", "name": ""}
        ],
        "unknown_rank": false
       },
       "name": "myInput:0"
      }
     },
     "outputs": {
      "myOutput": {
       "dtype": "DT_FLOAT",
       "tensor_shape": {
        "dim": [
         {"size": "-1", "name": ""},
         {"size": "1", "name": ""}
        ],
        "unknown_rank": false
       },
       "name": "Sigmoid:0"
      }
     },
     "method_name": "tensorflow/serving/predict"
    }
   }
  }
 }
}
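The metadata response tells you which input names, dtypes, and shapes a request must use. As a rough sketch, assuming the response has already been parsed into a Python dict with the nesting shown above, the inputs of a signature could be summarized like this; describe_inputs is a hypothetical helper.

```python
def describe_inputs(metadata, signature="serving_default"):
    """Map each input name of a signature to (dtype, list of dimension sizes)."""
    sig = metadata["metadata"]["signature_def"]["signature_def"][signature]
    result = {}
    for name, spec in sig["inputs"].items():
        # Sizes arrive as strings; -1 marks a dimension of unknown size (the batch).
        dims = [int(d["size"]) for d in spec["tensor_shape"]["dim"]]
        result[name] = (spec["dtype"], dims)
    return result
```

For the response above this would report one input, myInput, of type DT_FLOAT with dimensions -1 and 100.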

Calling the model from the command line:

curl -d "{\"instances\": [[45, 181, 121, 22, 246, 88, 146, 25, 16, 117, 145, 231, 187, 157, 206, 191, 77, 236, 124, 68, 198, 19, 84, 193, 251, 7, 105, 229, 119, 243, 103, 90, 250, 82, 138, 9, 1, 240, 242, 133, 214, 216, 111, 189, 54, 212, 209, 169, 167, 179, 13, 249, 99, 95, 137, 55, 199, 237, 159, 183, 15, 150, 76, 104, 2, 27, 220, 149, 38, 3, 86, 8, 23, 235, 83, 12, 190, 239, 5, 140, 155, 114, 53, 135, 148, 141, 89, 232, 219, 226, 62, 33, 116, 46, 225, 49, 154, 4, 234, 222, 59, 217, 184, 163, 196, 213, 48, 42, 122, 101, 170, 233, 158, 35, 30, 52, 197, 178, 210, 56, 165, 120, 106, 129, 172, 32, 176, 18, 28, 26, 174, 221, 173, 20, 60, 40, 123, 161, 66, 72, 134, 29, 69, 100, 180, 152, 200, 11, 51, 151, 238, 252, 41, 160, 215, 218, 131, 21, 58, 203, 34, 102, 75, 207, 255, 175, 201, 164, 81, 47, 139, 162, 244, 126, 211, 192, 6, 57, 182, 195, 73, 254, 143, 70, 125, 185, 227, 130, 228, 153, 94, 248, 144, 127, 91, 202, 96, 92, 17, 98, 65, 67, 241, 87, 80, 224, 24, 147, 166, 230, 78, 112, 97, 118, 204, 186, 107, 108, 253, 43, 74, 10, 37, 245, 208, 14, 194, 110, 64, 113, 128, 79, 115, 223, 171, 247, 61, 142, 109, 132, 63, 36, 205, 31, 93, 0, 136, 177, 71, 156, 44, 188, 50, 168, 85, 39]]}" -X POST http://localhost:8501/v1/models/transformer:predict

Access through the REST API

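import json

import numpy
import requests

# Character-to-index vocabulary; index 0 is reserved for the space character.
char_index = {' ': 0}


def load_dict():
    # Raw string so the backslashes in the Windows path are not read as escapes.
    path = r'D:\zwt\work_test\qa_robot_test\resource\vocab\dictionary.json'
    with open(path, "r", encoding="utf-8") as reader:
        items = json.load(reader)
    # Indices start at 1 because 0 is already taken.
    for i, charvalue in enumerate(items, start=1):
        char_index[charvalue.strip()] = i


def convert_vector(input_text, limit):
    # Normalize first so the length check and the lookups use the same string.
    text = input_text.strip().lower()
    char_vector = numpy.zeros(limit, dtype=numpy.int32)
    for i, char in enumerate(text[:limit]):
        # Characters missing from the vocabulary stay at 0.
        char_vector[i] = char_index.get(char, 0)
    # Add a batch dimension: the result has shape (1, limit).
    return numpy.array([char_vector])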


load_dict()
t = convert_vector('哪里有汉堡', 70)
pdata = {"instances": t.tolist()}
param = json.dumps(pdata)
res = requests.post('http://localhost:8501/v1/models/transformer:predict', data=param)
print(res.text)
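The script above prints the raw response body. A minimal sketch of unpacking it, assuming the request used the "instances" format as above, in which case a successful reply carries a "predictions" key and a failed one an "error" key; the function name parse_predict_response is hypothetical.

```python
import json


def parse_predict_response(body):
    """Return the predictions list, or raise if the server reported an error."""
    payload = json.loads(body)
    if "error" in payload:
        raise RuntimeError(payload["error"])
    return payload["predictions"]
```

It would be called as parse_predict_response(res.text) in place of the final print.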

Access through gRPC:

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


# Connect over gRPC (the grpc.beta API used by older examples is deprecated)
channel = grpc.insecure_channel("127.0.0.1:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
request = predict_pb2.PredictRequest()
# Model name
request.model_spec.name = "transformer"
# Signature name
# request.model_spec.signature_name = "predict"

# A single sample is sent per request; the data must match the dtype and shape of the model's input.
# t is the array built by convert_vector in the REST example; for shape=[1, 256] it must hold 256 values.
request.inputs['char_inputs'].CopyFrom(tf.make_tensor_proto(t[0], dtype=tf.int32, shape=[1, 256]))

# Predict.future returns a future; result() blocks until the protobuf response arrives
future = stub.Predict.future(request)
# Extract the predicted values; a many-to-many LSTM returns several values, read here as a list
res_list = future.result().outputs["outputs"].float_val
print(res_list)

Multi-model configuration:

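Under the tmp folder, create a multiModel folder and place each saved model in it
In the multiModel folder, create the config file models.config (the name passed to --model_config_file below)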

Write the following into the file:

model_config_list {
  config {
    name: "test1"
    base_path: "/models/models/test1"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
  }
  config {
    name: "model2"
    base_path: "/models/models/model2"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
  }
}
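When the list of models grows, writing this file by hand becomes error-prone. The sketch below generates the same text format from a list of tuples; it is only an illustration, the function name render_model_config is hypothetical, and it always emits a specific version policy like the file above.

```python
def render_model_config(models):
    """Render a model config file from (name, base_path, versions) tuples."""
    blocks = []
    for name, base_path, versions in models:
        version_lines = "\n".join(f"        versions: {v}" for v in versions)
        blocks.append(
            "  config {\n"
            f'    name: "{name}"\n'
            f'    base_path: "{base_path}"\n'
            '    model_platform: "tensorflow"\n'
            "    model_version_policy {\n"
            "      specific {\n"
            f"{version_lines}\n"
            "      }\n"
            "    }\n"
            "  }\n"
        )
    return "model_config_list {\n" + "".join(blocks) + "}\n"


print(render_model_config([
    ("test1", "/models/models/test1", [1, 2]),
    ("model2", "/models/models/model2", [1, 2]),
]))
```

The printed text can be saved as the config file in the multiModel folder.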

Versions can also be controlled with:
model_platform: "tensorflow"
model_version_policy {
  all {}
}
which serves every version found under base_path.
      
      
Version switching:
model_version_policy {
  specific {
    versions: 42
    versions: 43
  }
}
version_labels {
  key: 'stable'
  value: 43
}
version_labels {
  key: 'canary'
  value: 43
}

If version control is needed, then after adding the file above, the following is required:

Append --model_config_file=/models/models/models.config to the start command
When calling: http://localhost:8501/v1/models/model1/versions/1:predict (model1 stands for a model name defined in the config file)
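The versioned URL follows a fixed pattern, and the REST API accepts a /labels/<label> segment in the same position for labels such as stable or canary. A small sketch of building either form; the helper name predict_url and its default host and port are assumptions taken from the examples in this post.

```python
def predict_url(model, version=None, label=None, host="localhost", port=8501):
    """Build a predict URL, optionally pinned to a version or a version label."""
    if version is not None and label is not None:
        raise ValueError("pass either version or label, not both")
    url = f"http://{host}:{port}/v1/models/{model}"
    if version is not None:
        url += f"/versions/{version}"
    elif label is not None:
        url += f"/labels/{label}"
    return url + ":predict"


print(predict_url("model1", version=1))
print(predict_url("model1", label="stable"))
```

Without a version or label the URL targets whichever version the server treats as the default for that model.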