Who handles the long-lived connection behind kubectl exec: something ain't right
intro
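In k8s, kubectl exec gives you a way to log in to a specific container and look around. My long-standing impression was that kubectl talks to the cluster over RESTful HTTP API calls, whereas a tty session through kubectl necessarily needs a long-lived connection.
The feature is important enough that you can't help wondering: how is it actually implemented?
The article Container Runtime Interface streaming explained on the official k8s site describes how the long-lived connection is implemented: via SPDY or WebSocket, both of which can carry a long-lived connection.
One detail in the section What is so special about them stands out:
There, the "Connection upgrade (SPDY or WebSocket)" is initiated by the API Server.
But in the description that follows, "The client upgrades" refers, judging from the context, to the kubelet (Clients like crictl or the kubelet).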
Clients like crictl or the kubelet (via kubectl) request a new exec, attach or port forward session from the runtime using the gRPC interface. The runtime implements a streaming server that also manages the active sessions. This streaming server provides an HTTP endpoint for the client to connect to. The client upgrades the connection to use the SPDY streaming protocol or (in the future) to a WebSocket connection and starts to stream the data back and forth.
So which component/process actually initiates this long-lived connection (streaming connection)?
streaming
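How streaming itself is implemented is fairly simple and direct, and it relies on general-purpose technology. The documents below describe the main approaches that have been used for the long-lived connection.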
proposal
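The description in this proposal, from design docs and proposals, matches the flow diagram in the k8s article mentioned above. However, if you look at the k8s code, you will find that the k8s APIServer contains no such logic.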
- APIServer ⟷ Exec server
The apiserver checks the response and follows redirects as necessary. Once the final connection is established, the HTTP response and all further data is proxied back to the client connection (same way it works today).
KEP
1558-streaming-proxy-redirects explains that the final implementation rolled back the earlier proposal: the streaming connection is still opened by the kubelet (rather than by the API Server).
After the transition period, the only way to configure container streaming will be using the local redirect with Kubelet proxy approach. Under this configuration, handling streaming redirects are not needed by the apiserver, so StreamingProxyRedirects will be deprecated and eventually removed. Once StreamingProxyRedirects are removed, ValidateProxyRedirects is no longer needed and can be removed. Finally, the --redirect-container-streaming flag will be deprecated and eventually removed.
Upon completion, streaming requests will be forwarded to the Kubelet. The Kubelet will ask the CRI to prepare a streaming server, and then open an upgraded connection to the CRI streaming server. The kubelet will proxy the connection back to the apiserver (upgrade the original exec request), which will in turn proxy the connection back to the client.
The corresponding code lives in the kubelet directory:
///@file: kubernetes/pkg/kubelet/server/server.go
// getExec handles requests to run a command inside a container.
func (s *Server) getExec(request *restful.Request, response *restful.Response) {
	params := getExecRequestParams(request)
	streamOpts, err := remotecommandserver.NewOptions(request.Request)
	if err != nil {
		utilruntime.HandleError(err)
		response.WriteError(http.StatusBadRequest, err)
		return
	}
	pod, ok := s.host.GetPodByName(params.podNamespace, params.podName)
	if !ok {
		response.WriteError(http.StatusNotFound, fmt.Errorf("pod does not exist"))
		return
	}
	podFullName := kubecontainer.GetPodFullName(pod)
	url, err := s.host.GetExec(podFullName, params.podUID, params.containerName, params.cmd, *streamOpts)
	if err != nil {
		streaming.WriteError(err, response.ResponseWriter)
		return
	}
	if s.redirectContainerStreaming {
		http.Redirect(response.ResponseWriter, request.Request, url.String(), http.StatusFound)
		return
	}
	proxyStream(response.ResponseWriter, request.Request, url)
}
verify
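After running
kubectl exec --stdin --tty nginx-deployment-647677fc66-42l6h -- /bin/bash
to get a shell in the pod, check the kubelet's TCP connections on the node hosting that pod. Two new connections show up: one with the APIServer, and one over loopback to the local container side. This shows that the long-lived connection is not the APIServer bypassing the kubelet and connecting straight to the pod; it is still relayed through the kubelet.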
tsecer@harry: sudo lsof -Pnp `pidof kubelet` | fgrep TCP > kubectl_exec_before.txt
tsecer@harry: sudo lsof -Pnp `pidof kubelet` | fgrep TCP > kubectl_exec_after.txt
tsecer@harry: diff kubectl_exec_before.txt kubectl_exec_after.txt
0a1
> kubelet 1438 root 3u IPv6 62339 0t0 TCP 172.16.0.3:10250->172.16.0.2:60214 (ESTABLISHED)
1a3
> kubelet 1438 root 17u IPv4 62940 0t0 TCP 127.0.0.1:33406->127.0.0.1:35677 (ESTABLISHED)
tsecer@harry:
outro
Search engines are still the key tool for getting to the bottom of a problem; GPT could never.