TF Serving Model Policies: A Summary

Introduction

This post summarizes how TensorFlow Serving decides which model versions to load, and how it decides which loaded version serves a request.

Model Loading Policy

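tf-serving supports multiple platforms, multiple models, and multiple versions, and lets you set a loading policy per model.
The policy is configured in serving-1.15.0/tensorflow_serving/config/model_server_config.proto, through the following field: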

    // Version policy for the model indicating which version(s) of the model to
    // load and make available for serving simultaneously.
    // The default option is to serve only the latest version of the model.
    //
    // (This can be changed once a model is in serving.)
    FileSystemStoragePathSourceConfig.ServableVersionPolicy model_version_policy = 7;

The type of this field is defined in serving-1.15.0/tensorflow_serving/sources/storage_path/file_system_storage_path_source.proto as follows:

    // A policy that dictates which version(s) of a servable should be served.
    message ServableVersionPolicy {
      // Serve the latest versions (i.e. the ones with the highest version
      // numbers), among those found on disk.
      //
      // This is the default policy, with the default number of versions as 1.
      message Latest {
        // Number of latest versions to serve. (The default is 1.)
        uint32 num_versions = 1;
      }

      // Serve all versions found on disk.
      message All {
      }

      // Serve a specific version (or set of versions).
      //
      // This policy is useful for rolling back to a specific version, or for
      // canarying a specific version while still serving a separate stable
      // version.
      message Specific {
        // The version numbers to serve.
        repeated int64 versions = 1;
      }

      oneof policy_choice {
        Latest latest = 100;
        All all = 101;
        Specific specific = 102;
      }
    }

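The policy_choice oneof at the end determines which policy is in effect:

  • latest: load only the newest versions (num_versions of them, 1 by default)
  • all: load every version found on disk
  • specific: load the listed versions (more than one may be given)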
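
As an illustration, the policy is typically set per model in the config file passed to tensorflow_model_server through --model_config_file. The sketch below uses protobuf text format; the model name, base path, and version numbers are hypothetical placeholders, not values from the source files quoted above.

```protobuf
# Hypothetical models.config; names, paths, and version numbers are placeholders.
model_config_list {
  config {
    name: 'my_model'
    base_path: '/models/my_model'
    model_platform: 'tensorflow'
    model_version_policy {
      # Keep the two newest versions found under base_path loaded.
      latest {
        num_versions: 2
      }
      # Alternatives (policy_choice is a oneof, so set only one):
      #   all {}
      #   specific { versions: 42 versions: 43 }
    }
  }
}
```

If model_version_policy is omitted entirely, the default described in the proto comments applies: only the single latest version is served.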

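In addition, the AspiredVersionsManager class can be given one of two policies that control how it transitions between versions:

  • AvailabilityPreservingPolicy: load the new version first, then unload the old one
  • ResourcePreservingPolicy: unload the old version first, then load the new one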

Model Serving Policy

The serving policy is built on top of the loading policy.
Each model's config has a version_labels field, defined as follows:

    // String labels to associate with versions of the model, allowing inference
    // queries to refer to versions by label instead of number. Multiple labels
    // can map to the same version, but not vice-versa.
    //
    // An envisioned use-case for these labels is canarying tentative versions.
    // For example, one can assign labels "stable" and "canary" to two specific
    // versions. Perhaps initially "stable" is assigned to version 0 and "canary"
    // to version 1. Once version 1 passes canary, one can shift the "stable"
    // label to refer to version 1 (at that point both labels map to the same
    // version -- version 1 -- which is fine). Later once version 2 is ready to
    // canary one can move the "canary" label to version 2. And so on.
    map<string, int64> version_labels = 8;

This field assigns string labels to version numbers.
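
A minimal sketch of how labels might be attached in a per-model config, written in protobuf text format. The model name, path, version numbers, and label names are hypothetical. Both labeled versions are also listed under specific, because a label is only useful if the version it points to is actually loaded.

```protobuf
# Hypothetical models.config; names, paths, versions, and labels are placeholders.
model_config_list {
  config {
    name: 'my_model'
    base_path: '/models/my_model'
    model_platform: 'tensorflow'
    # Load exactly the two versions that the labels refer to.
    model_version_policy {
      specific {
        versions: 42
        versions: 43
      }
    }
    # A map<string, int64> field is written as repeated key/value entries.
    version_labels {
      key: 'stable'
      value: 42
    }
    version_labels {
      key: 'canary'
      value: 43
    }
  }
}
```

The official serving configuration guide notes that, by default, a label can only be assigned to a version that is already loaded and available. It also describes an --allow_version_labels_for_unavailable_models flag that relaxes this; whether a given release has that flag is worth checking before relying on it.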

A request then carries a model_spec field, in which the caller can name either a version number or a label:

    // Metadata for an inference request such as the model name and version.
    message ModelSpec {
      // Required servable name.
      string name = 1;

      // Optional choice of which version of the model to use.
      //
      // Recommended to be left unset in the common case. Should be specified only
      // when there is a strong version consistency requirement.
      //
      // When left unspecified, the system will serve the best available version.
      // This is typically the latest version, though during version transitions,
      // notably when serving on a fleet of instances, may be either the previous or
      // new version.
      oneof version_choice {
        // Use this specific version number.
        google.protobuf.Int64Value version = 2;

        // Use the version associated with the given label.
        string version_label = 4;
      }

      // A named signature to evaluate. If unspecified, the default signature will
      // be used.
      string signature_name = 3;
    }
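
For illustration, here are two ways the model_spec of a request might be filled in, written in protobuf text format. The model name, version number, and label are hypothetical. Because version and version_label belong to the same oneof, a request sets at most one of them.

```protobuf
# Hypothetical request fragments; name, version, and label are placeholders.

# Option 1: pin an exact version number.
# version is a google.protobuf.Int64Value wrapper, hence the nested value field.
model_spec {
  name: 'my_model'
  version { value: 43 }
}

# Option 2: refer to a version through its label.
model_spec {
  name: 'my_model'
  version_label: 'canary'
}
```

Leaving both fields unset corresponds to the "best available version" behavior described in the proto comments.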

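When the server handles a request, it picks the version based on which version_choice field is set. If neither is set, it asks for the latest version. The code is as follows: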

    switch (model_spec.version_choice_case()) {
      case ModelSpec::kVersion: {
        *servable_request = ServableRequest::Specific(
            model_spec.name(), model_spec.version().value());
        break;
      }
      case ModelSpec::kVersionLabel: {
        if (!options_.allow_version_labels) {
          return errors::InvalidArgument(
              "ModelSpec has 'version_label' set, but it is not currently "
              "allowed by the server.");
        }
        int64 version;
        TF_RETURN_IF_ERROR(GetModelVersionForLabel(
            model_spec.name(), model_spec.version_label(), &version));
        *servable_request = ServableRequest::Specific(model_spec.name(), version);
        break;
      }
      case ModelSpec::VERSION_CHOICE_NOT_SET: {
        *servable_request = ServableRequest::Latest(model_spec.name());
        break;
      }
    }

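This version selection has to be coordinated with the loading policy. If only one version of a model is loaded and a request names a specific version or a label, the lookup may find no matching model.
The server then returns a "label not found" or "model version not found" error and the request fails. There is no fallback to another version.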