1. build a container image, based on tensorflow/serving, which contains the model to serve

[1]

# run a tensorflow/serving container; its image is the base
# for the new image.
(base) maye@maye-Inspiron-5547:~$ sudo nerdctl run -d --name serving_base tensorflow/serving
854cb6ca8bd3de4a97b4fc4aa3c29caf37525c536a2277e18f27c3f216a4461c

(base) maye@maye-Inspiron-5547:~$ nerdctl commit --help
Create a new image from a container's changes

Usage: nerdctl commit [flags] CONTAINER REPOSITORY[:TAG]

Flags:
  -a, --author string        Author (e.g., "nerdctl contributor <nerdctl-dev@example.com>")
  -c, --change stringArray   Apply Dockerfile instruction to the created image (supported directives: [CMD, ENTRYPOINT])
  -h, --help                 help for commit
  -m, --message string       Commit message
  -p, --pause                Pause container during commit (default true)


# copy model directory from host to container
$ sudo nerdctl cp /home/maye/maye_temp/wafer serving_base:/models/wafer

"""
Build a container image from the running container.
Kubernetes runs containers as user root, and root can only see images and containers created (pulled or loaded) by root, not those created by other users.
Also, crictl (the Kubernetes container runtime interface CLI) can only see images in the namespace k8s.io.
"""
(base) maye@maye-Inspiron-5547:~$ sudo nerdctl commit  serving_base maye/wafer_serving --namespace k8s.io
sha256:324b3a421d43bcedd16820bbbe7060691cced88a3f919a7e74aca74f6f7aed8d
(base) maye@maye-Inspiron-5547:~$ 

# check the built image
(base) maye@maye-Inspiron-5547:~$ sudo crictl images list | grep wafer_serving
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead. 
ERRO[0000] validate service connection: validate CRI v1 image API for endpoint "unix:///var/run/dockershim.sock": rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/run/dockershim.sock: connect: no such file or directory" 
docker.io/maye/wafer_serving                                                    latest                                            324b3a421d43b       147MB
(base) maye@maye-Inspiron-5547:~$ 

$ sudo nerdctl stop serving_base
$ sudo nerdctl rm serving_base

Note:

  1. an environment variable cannot be set with nerdctl commit; the ENV directive raises the following error:
(base) maye@maye-Inspiron-5547:~$ nerdctl commit --change "ENV MODEL_NAME wafer" serving_base maye/wafer_serving
FATA[0000] unknown change directive "ENV" 

Instead, set the environment variable in the container's spec in the YAML file (see step 2).
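
Why MODEL_NAME has to match the directory copied to /models/wafer: the sketch below assumes that the stock tensorflow/serving entrypoint builds the server flags from MODEL_NAME and MODEL_BASE_PATH, with defaults model and /models. That behaviour is an assumption about the base image, not something shown in the transcripts above, and the function name model_server_args is hypothetical.

```python
def model_server_args(env: dict) -> list:
    """Sketch of how the serving flags are assumed to be derived from env vars."""
    # Assumed image defaults; verify against the base image you actually use.
    model_name = env.get("MODEL_NAME", "model")
    base_path = env.get("MODEL_BASE_PATH", "/models")
    return [
        f"--model_name={model_name}",
        # The model is looked up under <base path>/<model name>.
        f"--model_base_path={base_path}/{model_name}",
    ]


if __name__ == "__main__":
    # With MODEL_NAME=wafer the server would read the model from /models/wafer,
    # the directory filled by "nerdctl cp" in step 1.
    print(model_server_args({"MODEL_NAME": "wafer"}))
```

Under that assumption, leaving MODEL_NAME unset would make the server look in /models/model, which does not exist in the committed image.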

2. generate yaml file for model_serving service

## file wafer_k8s.yaml
# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wafer-deployment
spec:
  selector:
    matchLabels:
      app: wafer-server
  replicas: 3
  template:
    metadata:
      labels:
        app: wafer-server
    spec:
### set nodeAffinity according to your actual needs.
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - maye-inspiron-5547    
    
      containers:
      - name: wafer-container
      
        #image: gcr.io/tensorflow-serving/resnet
### With imagePullPolicy: Never, the image must be present on every
### node this pod may be scheduled to. Alternatively, push the image to
### an image repository and set imagePullPolicy: IfNotPresent.
        image: maye/wafer_serving
        imagePullPolicy: Never

### set the MODEL_NAME environment variable
        env:
        - name: MODEL_NAME
          value: "wafer"
        
        ports:
        - containerPort: 8500
        - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: wafer-service
  name: wafer-service
spec:
  ports:
  - port: 8500
    targetPort: 8500
    name: grpc-api-port
    
  - port: 8501
    targetPort: 8501
    name: rest-api-port
        
  selector:
    app: wafer-server
  #type: LoadBalancer
  type: NodePort

Note:

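  1. A Service of type NodePort is still assigned a ClusterIP. In fact, the ClusterIP is the destination address to which a node forwards traffic that arrives on the NodePort, and the destination port is the port defined in the Service's spec.ports.port field.

Clients outside the cluster can therefore reach a NodePort Service through the IP address and node port of any node, while client Pods inside the cluster can still reach it through the ClusterIP. The NodePort type picks a port on each worker node's IP address and forwards requests from outside the cluster to the target Service's ClusterIP and port, so this type of Service can also be accessed by client Pods inside the cluster, just like a ClusterIP Service.
Setting a custom node port (nodePort) is not recommended unless you know in advance that it will not conflict with an existing Service. Unless there is a specific need, letting the system assign the port automatically is always the better choice. [2]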

3. deploy model_serving service on kubernetes cluster

$ kubectl create -f wafer_k8s.yaml

# check the service
(base) maye@maye-Inspiron-5547:~$ kubectl get service
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
kubernetes      ClusterIP   10.96.0.1        <none>        443/TCP                         9d
wafer-service   NodePort    10.106.157.196   <none>        8500:31491/TCP,8501:32203/TCP   56s
(base) maye@maye-Inspiron-5547:~$ 

(base) maye@maye-Inspiron-5547:~$ kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
wafer-deployment-79d77c5479-6bg24   1/1     Running   0          63m
wafer-deployment-79d77c5479-rjn6s   1/1     Running   0          63m
wafer-deployment-79d77c5479-wbcvv   1/1     Running   0          63m
(base) maye@maye-Inspiron-5547:~$ 
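
The PORT(S) column of the kubectl get service output pairs each Service port with the node port Kubernetes assigned, in the form port:nodePort/protocol. A minimal Python sketch of reading that mapping follows; the function name node_ports is hypothetical and not part of any tool.

```python
def node_ports(ports_column: str) -> dict:
    """Map each Service port to its node port from a kubectl PORT(S) value."""
    mapping = {}
    for entry in ports_column.split(","):
        # "8501:32203/TCP" -> "8501:32203" (the protocol is not needed here)
        port_pair, _, _protocol = entry.partition("/")
        if ":" not in port_pair:
            continue  # entries such as "443/TCP" have no node port
        port, node_port = port_pair.split(":")
        mapping[int(port)] = int(node_port)
    return mapping


if __name__ == "__main__":
    ports = node_ports("8500:31491/TCP,8501:32203/TCP")
    print(ports[8501])  # node port for the REST API port 8501
```

For the output above this gives 31491 for the gRPC port 8500 and 32203 for the REST port 8501.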

4. send a REST request to wafer-service from outside the kubernetes cluster via any-node-external-ip:32203, where 32203 is the node port mapped to the REST port 8501

(base) maye@maye-Inspiron-5547:~$ curl -d '{"instances": [{"b64": ""}]}' -X POST http://192.168.0.102:32203/v1/models/wafer:predict
{
    "predictions": [[0.434313387]
    ]
}
(base) maye@maye-Inspiron-5547:~$ 
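
The same request can be sent from Python. This is a minimal sketch using only the standard library. It assumes, as the curl call does, that the model accepts one base64-encoded binary input per instance; the helper names build_predict_body and predict are hypothetical.

```python
import base64
import json
import urllib.request


def build_predict_body(raw_bytes: bytes) -> bytes:
    """Encode one binary input as a TensorFlow Serving REST predict body."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return json.dumps({"instances": [{"b64": encoded}]}).encode("utf-8")


def predict(node_ip: str, node_port: int, model: str, raw_bytes: bytes) -> dict:
    """POST the body to /v1/models/<model>:predict and return the parsed reply."""
    url = f"http://{node_ip}:{node_port}/v1/models/{model}:predict"
    request = urllib.request.Request(
        url,
        data=build_predict_body(raw_bytes),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # An empty payload reproduces the body of the curl request: {"b64": ""}
    print(build_predict_body(b"").decode("utf-8"))
    # Requires the cluster from this walkthrough to be reachable:
    # print(predict("192.168.0.102", 32203, "wafer", b""))
```

Replace the empty payload with the real input bytes your model expects; the empty string is used here only because the curl example uses it.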

References:


  1. https://tensorflow.google.cn/tfx/serving/serving_kubernetes?hl=en#part_3_deploy_in_kubernetes

  2. https://www.cnblogs.com/zhangxin9/p/16757479.html