Kmesh Waypoint Deep Dive: The Layer 7 Traffic Management Engine of the Kmesh Service Mesh

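1. Background

Waypoint is the core component responsible for layer 7 (L7) traffic management in the Kmesh service mesh. It adapts Istio's waypoint architecture to the Kmesh protocol and provides advanced traffic governance for the mesh. As the number and complexity of services in a microservice architecture grow, layer 4 (L4) traffic management alone can no longer cover needs such as request-level routing, retries, and fault injection, which makes L7 traffic management an essential service mesh feature.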

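1.1 Relationship to Other Components

The overall Kmesh architecture consists of three core components:

  1. Kmesh-daemon: the core daemon, responsible for eBPF orchestration lifecycle management, xDS protocol integration, observability, and related functions
  2. eBPF Orchestration: data plane traffic orchestration, providing high-performance traffic management implemented with eBPF
  3. Waypoint: Istio's waypoint adapted to the Kmesh protocol, responsible for L7 traffic management

Waypoint relates to the other components as follows:

  • Controlled by: Kmesh-daemon, and managed and configured through the kmeshctl tool
  • Works with: the eBPF Orchestration component, together providing complete L4/L7 traffic management
  • Serves: end users' service-to-service communication, providing advanced traffic governance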

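1.2 Technical Background

Traditional service mesh implementations usually rely on the sidecar proxy model for L7 traffic management, which has several limitations:

  1. Resource overhead: every Pod runs its own proxy container, which consumes significant resources
  2. Deployment complexity: a proxy must be injected into every Pod, which complicates deployment and operations
  3. Performance impact: traffic passes through multiple proxy hops, which adds latency
  4. Scalability limits: in large deployments, the number of proxies and their resource consumption become a bottleneck

Kmesh's Waypoint component takes a different approach: it concentrates L7 traffic management in dedicated waypoint services and addresses these problems as follows:

  1. Centralized management: L7 traffic management is concentrated in waypoint services, reducing resource consumption
  2. On-demand deployment: waypoint services are deployed flexibly according to business needs
  3. Efficient forwarding: traffic is forwarded efficiently with the help of eBPF
  4. Flexible configuration: multiple traffic governance policies are supported, such as routing, retries, and fault injection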

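1.3 Istio Waypoint Architecture

The design of the Waypoint component is inspired by Istio's waypoint architecture, a core part of Istio's ambient mesh mode. Istio waypoints offer a new way to deploy a service mesh: instead of injecting a sidecar into every Pod, L7 traffic is handled by centralized waypoint services.

Kmesh borrows this design and combines it with eBPF to implement high-performance L7 traffic management. As an important complement within the Kmesh architecture, Waypoint gives users a more flexible and efficient service mesh solution.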

2. Architecture Design

2.1 Core Architecture Layers

┌─────────────────────────────────────────────────────────┐
│               Management interface layer                │
│              (kmeshctl waypoint commands)               │
├─────────────────────────────────────────────────────────┤
│                     Kubernetes API                      │
│            (Gateway API, Service, Namespace)            │
├─────────────────────────────────────────────────────────┤
│                    Waypoint service                     │
│        (Gateway, Service, Deployment, ConfigMap)        │
├─────────────────────────────────────────────────────────┤
│                 Cache management layer                  │
│     (waypoint_cache, service_cache, workload_cache)     │
├─────────────────────────────────────────────────────────┤
│                 eBPF integration layer                  │
│ (sockops, sendmsg, cgroup_skb and other eBPF programs)  │
├─────────────────────────────────────────────────────────┤
│                  Data forwarding layer                  │
│             (TCP/IP stack, network devices)             │
└─────────────────────────────────────────────────────────┘

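2.2 Core Component Design

2.2.1 Management Interface Layer

The management interface layer provides the command-line tool for interacting with Waypoint. Its main subcommands are:

  • waypoint apply: apply a waypoint configuration to the cluster
  • waypoint delete: delete a waypoint configuration
  • waypoint list: list the waypoint configurations in the cluster
  • waypoint status: show the status of a waypoint
  • waypoint generate: generate the YAML configuration for a waypoint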

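2.2.2 Kubernetes API Layer

The Waypoint component builds on the Kubernetes Gateway API:

  • Gateway resource: defines the network entry point of the waypoint
  • Service resource: exposes the waypoint service
  • Namespace labels: mark the namespaces that use a waypoint
  • Pod labels: mark the Pods that use a waypoint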

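2.2.3 Waypoint Service Layer

The waypoint service layer is what actually processes L7 traffic. It consists of:

  • Gateway configuration: defines listener ports, protocol types, and so on
  • Service configuration: provides a stable access endpoint
  • Deployment configuration: deploys the waypoint proxy instances
  • ConfigMap configuration: stores the waypoint configuration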

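2.2.4 Cache Management Layer

The cache management layer maintains the associations between waypoints, services, and workloads:

  • waypoint_cache: manages the waypoint cache and its associations
  • service_cache: manages cached service information
  • workload_cache: manages cached workload information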

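2.2.5 eBPF Integration Layer

The eBPF integration layer applies the waypoint configuration to the eBPF programs:

  • sockops program: handles socket operations and identifies traffic that must go through a waypoint
  • sendmsg (sk_msg) program: adds waypoint metadata when a message is sent
  • cgroup_skb program: processes traffic at the socket buffer level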

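2.3 Workflow Design

2.3.1 Waypoint Deployment Flow

  1. Configuration generation: generate the waypoint configuration with kmeshctl
  2. Resource creation: create the Gateway, Service, and other Kubernetes resources
  3. Label application: add the waypoint label to the namespace or Pod
  4. Cache update: update the waypoint cache and establish the associations
  5. eBPF update: update the eBPF programs to enable forwarding through the waypoint

2.3.2 Traffic Processing Flow

  1. Traffic identification: the eBPF programs identify traffic that must go through a waypoint
  2. Metadata addition: waypoint metadata is added when the message is sent
  3. Waypoint forwarding: the traffic is forwarded to the waypoint service
  4. L7 processing: the waypoint applies the L7 traffic management policies
  5. Backend selection: a backend is selected according to the policies
  6. Traffic forwarding: the traffic is forwarded to the target service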

2.4 Data Structure Design

2.4.1 Waypoint Associated Objects

type waypointAssociatedObjects struct {
    mutex sync.RWMutex
    // IP address of waypoint.
    // If it is nil, it means that the waypoint service has not been processed yet.
    address *workloadapi.NetworkAddress

    // Associated services of this waypoint.
    // The key of this map is service resource name and value is corresponding service structure.
    services map[string]*workloadapi.Service

    // Associated workloads of this waypoint.
    // The key of this map is workload uid and value is corresponding workload structure.
    workloads map[string]*workloadapi.Workload
}
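
The struct above only lists fields. As a rough sketch of how helper methods on it might behave (the cache code later in this article calls isResolved and update), the following uses plain strings in place of the workloadapi types. The method bodies and the associated type are assumptions for illustration, not the Kmesh implementation.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// associated is a simplified stand-in for waypointAssociatedObjects.
// Addresses and services are plain strings here (an assumption for brevity).
type associated struct {
	mutex    sync.RWMutex
	address  string              // "" means the waypoint has not been resolved yet
	services map[string]struct{} // keyed by service resource name
}

func newAssociated(addr string) *associated {
	return &associated{address: addr, services: map[string]struct{}{}}
}

// isResolved reports whether the waypoint address is known.
func (a *associated) isResolved() bool {
	a.mutex.RLock()
	defer a.mutex.RUnlock()
	return a.address != ""
}

// addService records that a service depends on this waypoint.
func (a *associated) addService(name string) {
	a.mutex.Lock()
	defer a.mutex.Unlock()
	a.services[name] = struct{}{}
}

// update stores the new address and returns, in sorted order,
// the services that now need to be re-programmed.
func (a *associated) update(addr string) []string {
	a.mutex.Lock()
	defer a.mutex.Unlock()
	a.address = addr
	names := make([]string, 0, len(a.services))
	for name := range a.services {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	a := newAssociated("")
	a.addService("default/reviews")
	fmt.Println(a.isResolved())       // false
	fmt.Println(a.update("10.0.0.8")) // [default/reviews]
	fmt.Println(a.isResolved())       // true
}
```

The read-write lock lets many readers query the address concurrently while updates stay exclusive.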

2.4.2 Service and Backend Structures

In the eBPF programs, both the service and backend structures carry waypoint-related fields:

// service map
typedef struct {
    __u32 prio_endpoint_count[PRIO_COUNT]; // endpoint count of current service with prio
    __u32 lb_policy; // load balancing algorithm, currently supports random algorithm, locality loadbalance
                     // Failover/strict mode
    __u32 service_port[MAX_PORT_COUNT]; // service_port[i] and target_port[i] are a pair, i starts from 0 and max value
                                        // is MAX_PORT_COUNT-1
    __u32 target_port[MAX_PORT_COUNT];
    struct ip_addr wp_addr;
    __u32 waypoint_port;
} service_value;

// backend map
typedef struct {
    struct ip_addr addr;
    __u32 service_count;
    __u32 service[MAX_SERVICE_COUNT];
    struct ip_addr wp_addr;
    __u32 waypoint_port;
} backend_value;
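
The comment on service_port says that service_port[i] and target_port[i] form a pair. A minimal Go sketch of that paired lookup is shown below. The serviceValue type, the maxPortCount value, and the treatment of port 0 as an unused slot are assumptions made for illustration.

```go
package main

import "fmt"

// maxPortCount is a hypothetical value; the real MAX_PORT_COUNT
// is defined in the Kmesh eBPF headers.
const maxPortCount = 10

// serviceValue mirrors only the two paired port arrays of service_value.
type serviceValue struct {
	servicePort [maxPortCount]uint32
	targetPort  [maxPortCount]uint32
}

// lookupTargetPort returns the target port stored at the same index
// as the given service port. Port 0 is treated as an unused slot.
func lookupTargetPort(svc *serviceValue, port uint32) (uint32, bool) {
	if port == 0 {
		return 0, false
	}
	for i := 0; i < maxPortCount; i++ {
		if svc.servicePort[i] == port {
			return svc.targetPort[i], true
		}
	}
	return 0, false
}

func main() {
	var svc serviceValue
	svc.servicePort[0], svc.targetPort[0] = 80, 8080

	fmt.Println(lookupTargetPort(&svc, 80))  // 8080 true
	fmt.Println(lookupTargetPort(&svc, 443)) // 0 false
}
```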

3. Usage

3.1 Basic Usage

3.1.1 Applying a Waypoint

Apply a waypoint to a namespace or to a specific Pod:

# Apply a waypoint to the current namespace
kmeshctl waypoint apply

# Apply a waypoint to a specific namespace
kmeshctl waypoint apply --namespace default

# Wait for the waypoint to become ready
kmeshctl waypoint apply --namespace default --wait

# Apply a waypoint to a specific Pod
kmeshctl waypoint apply -n default --name reviews-v2-pod-waypoint --for workload

3.1.2 Generating a Waypoint Configuration

Generate the YAML configuration for a waypoint:

# Generate a waypoint configuration
kmeshctl waypoint generate --namespace default

# Generate a waypoint that handles service traffic
kmeshctl waypoint generate --for service -n default

3.1.3 Listing Waypoints

List the waypoint configurations in the cluster:

# List waypoints in the current namespace
kmeshctl waypoint list

# List waypoints in a specific namespace
kmeshctl waypoint list --namespace default

# List waypoints in all namespaces
kmeshctl waypoint list -A

3.1.4 Checking Waypoint Status

Check the running status of a waypoint:

# Show waypoint status in the current namespace
kmeshctl waypoint status

# Show waypoint status in a specific namespace
kmeshctl waypoint status --namespace default

3.1.5 Deleting a Waypoint

Delete waypoint configurations:

# Delete the waypoint in the current namespace
kmeshctl waypoint delete

# Delete a waypoint by name
kmeshctl waypoint delete waypoint-name --namespace default

# Delete several waypoints
kmeshctl waypoint delete waypoint-name1 waypoint-name2 --namespace default

# Delete all waypoints in a specific namespace
kmeshctl waypoint delete --all --namespace default

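3.2 Advanced Configuration

3.2.1 Specifying the Traffic Type

Specify which kind of traffic the waypoint handles:

# Handle service traffic
kmeshctl waypoint apply --for service

# Handle workload traffic
kmeshctl waypoint apply --for workload

# Handle all traffic
kmeshctl waypoint apply --for all

# Handle no traffic
kmeshctl waypoint apply --for none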
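
The four values accepted by --for can be checked before any resource is built. The following is a minimal sketch of such a check, assuming an empty value means the flag was not set and the control plane picks the default. The function name validateTrafficType is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// validTrafficTypes lists the values accepted by the --for flag.
var validTrafficTypes = map[string]struct{}{
	"service":  {},
	"workload": {},
	"all":      {},
	"none":     {},
}

// validateTrafficType accepts an empty value (flag not set) or one of
// the known traffic types, and returns a descriptive error otherwise.
func validateTrafficType(t string) error {
	if t == "" {
		return nil
	}
	if _, ok := validTrafficTypes[t]; ok {
		return nil
	}
	opts := make([]string, 0, len(validTrafficTypes))
	for k := range validTrafficTypes {
		opts = append(opts, k)
	}
	sort.Strings(opts) // map iteration order is random, so sort for a stable message
	return fmt.Errorf("invalid traffic type: %s. Valid options are: %v", t, opts)
}

func main() {
	fmt.Println(validateTrafficType("workload")) // <nil>
	fmt.Println(validateTrafficType("pod"))
}
```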

3.2.2 Namespace Enrollment

Enroll a namespace with a waypoint:

# Enroll the namespace with the waypoint
kmeshctl waypoint apply --enroll-namespace

# Overwrite an existing waypoint enrollment
kmeshctl waypoint apply --enroll-namespace --overwrite

3.2.3 Specifying the Waypoint Name

Give the waypoint a custom name:

# Use a custom name
kmeshctl waypoint apply --name my-custom-waypoint

3.2.4 Specifying the Waypoint Image

Give the waypoint a custom image:

# Use a custom image
kmeshctl waypoint apply --image my-custom-waypoint-image:latest

3.3 Kubernetes Deployment

3.3.1 Deploying a Waypoint Manually

To deploy a waypoint manually, use the generated YAML configuration:

# Generate the configuration
kmeshctl waypoint generate --namespace default > waypoint.yaml

# Apply the configuration
kubectl apply -f waypoint.yaml

3.3.2 Verifying the Waypoint Deployment

Verify that the waypoint was deployed correctly:

# Check the Gateway resource
kubectl get gateway -n default

# Check the Service resource
kubectl get svc -n default

# Check the Pod status
kubectl get pod -n default -l gateway.networking.k8s.io/waypoint-for=default-waypoint

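3.4 Integration with Istio

3.4.1 Using the Istio Control Plane

Waypoint can work together with the Istio control plane:

# Deploy the Istio control plane
istioctl install --set profile=ambient

# Deploy Kmesh as the data plane
kubectl apply -f kmesh.yaml

# Apply a waypoint
kmeshctl waypoint apply --enroll-namespace

3.4.2 Configuring Traffic Policies

Configure the waypoint with Istio traffic policies: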

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - match:
    - headers:
        end-user:
          exact: jason
    route:
    - destination:
        host: reviews
        subset: v2

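4. Source Code Internals

The Waypoint implementation spans several layers and modules. The sections below walk through the core parts of the source code.

4.1 Command-Line Tool Implementation

4.1.1 Command Structure Definition

The waypoint command-line tool is built on the Cobra framework and defines the full command structure: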

func NewCmd() *cobra.Command {
    waypointCmd := &cobra.Command{
        Use:   "waypoint",
        Short: "Manage waypoint configuration",
        Long:  "A group of commands used to manage waypoint configuration",
        Example: `  # Apply a waypoint to the current namespace
  kmeshctl waypoint apply

  # Generate a waypoint as yaml
  kmeshctl waypoint generate --namespace default

  # List all waypoints in a specific namespace
  kmeshctl waypoint list --namespace default`,
        Args: func(cmd *cobra.Command, args []string) error {
            if len(args) != 0 {
                return fmt.Errorf("unknown subcommand %q", args[0])
            }
            return nil
        },
        RunE: func(cmd *cobra.Command, args []string) error {
            cmd.HelpFunc()(cmd, args)
            return nil
        },
    }

    // Register subcommands
    waypointCmd.AddCommand(waypointListCmd)
    waypointCmd.AddCommand(waypointDeleteCmd)
    waypointCmd.AddCommand(waypointStatusCmd)
    waypointCmd.AddCommand(waypointGenerateCmd)
    waypointCmd.AddCommand(waypointApplyCmd)

    return waypointCmd
}

4.1.2 Gateway Creation Logic

Creating the Gateway resource is the core function of the waypoint command:

makeGateway := func(forApply bool) (*gateway.Gateway, error) {
    ns := namespaceOrDefault(namespace)
    if namespace == "" && !forApply {
        ns = ""
    }

    // If a user sets waypoint name to an empty string, set it to the default namespace waypoint name.
    if waypointName == "" {
        waypointName = constants.DefaultNamespaceWaypoint
    } else if waypointName == "none" {
        return nil, fmt.Errorf("invalid name provided for waypoint, 'none' is a reserved value")
    }
    gw := gateway.Gateway{
        TypeMeta: metav1.TypeMeta{
            Kind:       gvk.KubernetesGateway_v1.Kind,
            APIVersion: gvk.KubernetesGateway_v1.GroupVersion(),
        },
        ObjectMeta: metav1.ObjectMeta{
            Name:        waypointName,
            Namespace:   ns,
            Annotations: make(map[string]string, 0),
        },
        Spec: gateway.GatewaySpec{
            GatewayClassName: constants.WaypointGatewayClassName,
            Listeners: []gateway.Listener{{
                Name:     "mesh",
                Port:     15008,
                Protocol: gateway.ProtocolType(protocol.HBONE),
            }},
        },
    }

    gw.Annotations[WaypointImageAnnotation] = getKmeshWaypointImage()

    // only label if user has provided their own value, otherwise we let istiod choose a default at runtime (service)
    // this will allow for gateway class to provide a default for that class rather than always forcing service or requiring users to configure correctly
    if trafficType != "" {
        if !validTrafficTypes.Contains(trafficType) {
            return nil, fmt.Errorf("invalid traffic type: %s. Valid options are: %v", trafficType, sets.SortedList(validTrafficTypes))
        }

        if gw.Labels == nil {
            gw.Labels = map[string]string{}
        }

        gw.Labels[KmeshWaypointForTrafficTypeLabel] = trafficType
    }

    if revision != "" {
        gw.Labels = map[string]string{label.IoIstioRev.Name: revision}
    }
    return &gw, nil
}

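Key points

  1. Namespace handling: determines the target namespace from the arguments
  2. Name validation: falls back to the default name when none is given and rejects the reserved value "none"
  3. Gateway configuration: builds a standard Gateway resource with an HBONE listener named "mesh" on port 15008
  4. Label and annotation management: sets the waypoint image annotation and the required labels
  5. Traffic type: validates the traffic type and records it as a label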
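
The name handling in makeGateway can be isolated into a small function. The sketch below is a minimal version under the assumption that constants.DefaultNamespaceWaypoint equals "waypoint" (the value Istio uses). The function name resolveWaypointName is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultNamespaceWaypoint is the assumed value of
// constants.DefaultNamespaceWaypoint; check the Kmesh source for the real one.
const defaultNamespaceWaypoint = "waypoint"

// resolveWaypointName applies the same rules as makeGateway:
// an empty name falls back to the default, "none" is reserved.
func resolveWaypointName(name string) (string, error) {
	switch name {
	case "":
		return defaultNamespaceWaypoint, nil
	case "none":
		return "", errors.New("invalid name provided for waypoint, 'none' is a reserved value")
	default:
		return name, nil
	}
}

func main() {
	fmt.Println(resolveWaypointName(""))                   // waypoint <nil>
	fmt.Println(resolveWaypointName("my-custom-waypoint")) // my-custom-waypoint <nil>
	_, err := resolveWaypointName("none")
	fmt.Println(err != nil) // true
}
```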

4.1.3 Apply Command Implementation

The apply command is the core of applying a waypoint configuration:

waypointApplyCmd := &cobra.Command{
    Use:   "apply",
    Short: "Apply a waypoint configuration",
    Long:  "Apply a waypoint configuration to the cluster",
    Example: `  # Apply a waypoint to the current namespace
  kmeshctl waypoint apply

  # Apply a waypoint to a specific namespace and wait for it to be ready
  kmeshctl waypoint apply --namespace default --wait
 
  # Apply a waypoint to a specific pod
  kmeshctl waypoint apply -n default --name reviews-v2-pod-waypoint --for workload`,
    RunE: func(cmd *cobra.Command, args []string) error {
        kubeClient, err := utils.CreateKubeClient()
        if err != nil {
            return fmt.Errorf("failed to create Kubernetes client: %v", err)
        }
        ns := namespaceOrDefault(namespace)
        // If a user decides to enroll their namespace with a waypoint, verify that they have labeled their namespace as Kmesh.
        // If they don't, the user will be warned and be presented with a command to label their namespace as Kmesh if they
        // choose to do so.
        //
        // NOTE: This is a warning and not an error because the user may not intend to label their namespace as Kmesh.
        //
        // e.g. Users are handling Kmesh redirection per workload rather than at the namespace level.
        hasWaypoint, err := namespaceHasLabel(kubeClient, ns, KmeshUseWaypointLabel)
        if err != nil {
            return err
        }
        if enrollNamespace {
            if !overwrite && hasWaypoint {
                // we don't want to error on the user when they don't explicitly overwrite namespaced Waypoints,
                // we just warn them and provide a suggestion
                fmt.Fprintf(cmd.OutOrStdout(), "Warning: namespace (%s) already has an enrolled Waypoint. Consider "+
                    "adding `"+"--overwrite"+"` flag to your apply command.\n", ns)
                return nil
            }
            namespaceIsLabeledKmesh, err := namespaceHasLabelWithValue(kubeClient, ns, label.IoIstioDataplaneMode.Name, DataplaneModeKmesh)
            if err != nil {
                return fmt.Errorf("failed to check if namespace is labeled Kmesh: %v", err)
            }
            if !namespaceIsLabeledKmesh {
                fmt.Fprintf(cmd.OutOrStdout(), "Warning: namespace is not enrolled in Kmesh. Consider running\t"+
                    "`"+"kubectl label namespace %s istio.io/dataplane-mode=Kmesh"+"`\n", ns)
            }
        }
        gw, err := makeGateway(true)
        if err != nil {
            return fmt.Errorf("failed to create gateway: %v", err)
        }

        _, err = kubeClient.GatewayAPI().GatewayV1().Gateways(ns).Create(context.Background(), gw, metav1.CreateOptions{
            FieldManager: "kmeshctl",
        })
        if err != nil {
            if errors.IsNotFound(err) {
                return fmt.Errorf("missing Kubernetes Gateway CRDs need to be installed before applying a waypoint: %s", err)
            }
            return err
        }

        if waitReady {
            startTime := time.Now()
            ticker := time.NewTicker(1 * time.Second)
            defer ticker.Stop()
            for range ticker.C {
                programmed := false
                gwc, err := kubeClient.GatewayAPI().GatewayV1().Gateways(ns).Get(context.TODO(), gw.Name, metav1.GetOptions{})
                if err == nil {
                    // Check if gateway has Programmed condition set to true
                    for _, cond := range gwc.Status.Conditions {
                        if cond.Type == string(gateway.GatewayConditionProgrammed) && string(cond.Status) == "True" {
                            programmed = true
                            break
                        }
                    }
                }
                if programmed {
                    break
                }
                if time.Since(startTime) > waitTimeout {
                    return errorWithMessage("timed out while waiting for waypoint", gwc, err)
                }
            }
        }
        fmt.Fprintf(cmd.OutOrStdout(), "waypoint %v/%v applied\n", gw.Namespace, gw.Name)

        // If a user decides to enroll their namespace with a waypoint, label the namespace with the waypoint name
        // after the waypoint has been applied.
        if enrollNamespace {
            err = labelNamespaceWithWaypoint(kubeClient, ns)
            if err != nil {
                return fmt.Errorf("failed to label namespace with waypoint: %v", err)
            }
            fmt.Fprintf(cmd.OutOrStdout(), "namespace %v labeled with \"%v: %v\"\n", ns,
                KmeshUseWaypointLabel, gw.Name)
        }
        return nil
    },
}

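Key points

  1. Kubernetes client creation: creates the client used to talk to the Kubernetes API
  2. Namespace check: checks whether the namespace already has an enrolled waypoint
  3. Kmesh label verification: warns when the namespace is not labeled for Kmesh
  4. Gateway creation: creates the Gateway resource in Kubernetes
  5. Wait for readiness: polls once per second until the Gateway reports the Programmed condition or the timeout expires
  6. Namespace labeling: labels the namespace with the waypoint name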

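4.2 Cache Management Implementation

4.2.1 Waypoint Cache Structure

The waypoint cache is the core component that manages the associations between waypoints, services, and workloads: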

type waypointCache struct {
    mutex sync.RWMutex

    serviceCache ServiceCache

    // NOTE: The following data structure is used to change the waypoint
    // address of type hostname in the service or workload to type ip address. Because of
    // order in which services are processed, it may be possible that corresponding
    // waypoint service can't be found when processing the service or workload. The waypoint associated
    // with a service or a workload can also change at any time, so we need the following maps to track the
    // relationship between service & workload and its waypoint.

    // Used to track a waypoint and all services and workloads associated with it.
    // Keyed by waypoint service resource name, valued by its associated services and workloads.
    //
    // ***
    // When a service's or workload's waypoint needs to be converted, first check whether the waypoint can be found in this map.
    // If it can be found, convert it directly. Otherwise, add it to waypointAssociatedObjects and wait.
    // When the corresponding waypoint service is added to the cache, it will be processed and returned uniformly.
    // ***
    waypointAssociatedObjects map[string]*waypointAssociatedObjects

    // Used to locate relevant waypoint when deleting or updating service.
    // Keyed by service resource name, valued by associated waypoint's resource name.
    serviceToWaypoint map[string]string

    // Used to locate relevant waypoint when deleting or updating workload.
    // Keyed by workload uid, valued by associated waypoint's resource name.
    workloadToWaypoint map[string]string
}

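Key design features

  1. Bidirectional mapping: keeps both the waypoint-to-object map and the reverse service/workload-to-waypoint maps
  2. Deferred resolution: supports resolving hostname-type waypoints later, once the waypoint service is known
  3. Association tracking: tracks every service and workload associated with each waypoint
  4. Concurrency safety: uses a read-write lock to stay safe under concurrent access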

4.2.2 Adding or Updating a Service

Adding or updating a service is one of the core operations of the waypoint cache:

func (w *waypointCache) AddOrUpdateService(svc *workloadapi.Service) bool {
    w.mutex.Lock()
    defer w.mutex.Unlock()

    resourceName := svc.ResourceName()
    // If this is a service without waypoint or with an IP address type waypoint, no processing is required and
    // return directly.
    if svc.GetWaypoint() == nil || svc.GetWaypoint().GetAddress() != nil {
        // Service may become unassociated with waypoint.
        if waypoint, ok := w.serviceToWaypoint[resourceName]; ok {
            delete(w.serviceToWaypoint, resourceName)
            w.waypointAssociatedObjects[waypoint].deleteService(resourceName)
        }
        return true
    }

    var ret bool

    // If this is a svc with hostname waypoint.
    hostname := svc.GetWaypoint().GetHostname()
    waypointResourceName := hostname.GetNamespace() + "/" + hostname.GetHostname()

    if waypoint, ok := w.serviceToWaypoint[resourceName]; ok && waypoint != waypointResourceName {
        // Service updated associated waypoint, delete previous association first.
        delete(w.serviceToWaypoint, resourceName)
        w.waypointAssociatedObjects[waypoint].deleteService(resourceName)
    }

    log.Debugf("Update svc %s with waypoint %s", svc.ResourceName(), waypointResourceName)
    if associated, ok := w.waypointAssociatedObjects[waypointResourceName]; ok {
        if associated.isResolved() {
            // The waypoint corresponding to this service has been resolved.
            updateServiceWaypoint(svc, associated.waypointAddress())
            ret = true
        }
    } else {
        // Try to find waypoint service from the cache.
        waypointService := w.serviceCache.GetService(waypointResourceName)
        var addr *workloadapi.NetworkAddress
        if waypointService != nil && len(waypointService.GetAddresses()) != 0 {
            addr = waypointService.GetAddresses()[0]
            updateServiceWaypoint(svc, waypointService.GetAddresses()[0])
            ret = true
        }
        w.waypointAssociatedObjects[waypointResourceName] = newAssociatedObjects(addr)
    }
    w.serviceToWaypoint[resourceName] = waypointResourceName
    // Anyway, add svc to the association list.
    w.waypointAssociatedObjects[waypointResourceName].addService(resourceName, svc)

    return ret
}

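Key points

  1. No-waypoint handling: handles services without a waypoint or with an IP-address waypoint, removing any stale association
  2. Waypoint change: handles a service switching to a different waypoint
  3. Deferred resolution: handles hostname-type waypoints whose address is not known yet
  4. Cache lookup: looks up the waypoint service in the service cache
  5. Association update: updates the association between the service and its waypoint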
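
The re-association step (point 2) relies on the reverse index serviceToWaypoint. A stripped-down sketch of just that bookkeeping follows, using strings for all keys and omitting locking and address resolution. The cache type and associate method are hypothetical simplifications.

```go
package main

import "fmt"

// cache keeps a forward index (waypoint -> services) and a reverse
// index (service -> waypoint), like waypointCache does for services.
type cache struct {
	byWaypoint        map[string]map[string]struct{}
	serviceToWaypoint map[string]string
}

func newCache() *cache {
	return &cache{
		byWaypoint:        map[string]map[string]struct{}{},
		serviceToWaypoint: map[string]string{},
	}
}

// associate links a service to a waypoint. If the service was linked
// to a different waypoint before, the old link is removed first.
func (c *cache) associate(service, waypoint string) {
	if old, ok := c.serviceToWaypoint[service]; ok && old != waypoint {
		delete(c.byWaypoint[old], service)
	}
	if c.byWaypoint[waypoint] == nil {
		c.byWaypoint[waypoint] = map[string]struct{}{}
	}
	c.byWaypoint[waypoint][service] = struct{}{}
	c.serviceToWaypoint[service] = waypoint
}

func main() {
	c := newCache()
	c.associate("default/reviews", "default/wp-a")
	c.associate("default/reviews", "default/wp-b") // service moves to another waypoint

	fmt.Println(len(c.byWaypoint["default/wp-a"]), len(c.byWaypoint["default/wp-b"])) // 0 1
	fmt.Println(c.serviceToWaypoint["default/reviews"])                               // default/wp-b
}
```

Without the reverse index, finding the old waypoint of a service would require scanning every waypoint's association list.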

4.2.3 Adding or Updating a Workload

The logic for adding or updating a workload mirrors the service case:

func (w *waypointCache) AddOrUpdateWorkload(workload *workloadapi.Workload) bool {
    w.mutex.Lock()
    defer w.mutex.Unlock()

    uid := workload.GetUid()
    // If this is a workload without waypoint or with an IP address type waypoint, no processing is required and
    // return directly.
    if workload.GetWaypoint() == nil || workload.GetWaypoint().GetAddress() != nil {
        // Workload may become unassociated with waypoint.
        if waypoint, ok := w.workloadToWaypoint[uid]; ok {
            delete(w.workloadToWaypoint, uid)
            w.waypointAssociatedObjects[waypoint].deleteWorkload(uid)
        }
        return true
    }

    var ret bool

    // If this is a workload with hostname waypoint.
    hostname := workload.GetWaypoint().GetHostname()
    waypointResourceName := hostname.GetNamespace() + "/" + hostname.GetHostname()

    if waypoint, ok := w.workloadToWaypoint[uid]; ok && waypoint != waypointResourceName {
        // Workload updated associated waypoint, delete previous association first.
        delete(w.workloadToWaypoint, uid)
        w.waypointAssociatedObjects[waypoint].deleteWorkload(uid)
    }

    log.Debugf("Update workload %s with waypoint %s", uid, waypointResourceName)
    if associated, ok := w.waypointAssociatedObjects[waypointResourceName]; ok {
        if associated.isResolved() {
            // The waypoint corresponding to this workload has been resolved.
            updateWorkloadWaypoint(workload, associated.waypointAddress())
            ret = true
        }
    } else {
        // Try to find waypoint service from the cache.
        waypointService := w.serviceCache.GetService(waypointResourceName)
        var addr *workloadapi.NetworkAddress
        if waypointService != nil && len(waypointService.GetAddresses()) != 0 {
            addr = waypointService.GetAddresses()[0]
            updateWorkloadWaypoint(workload, waypointService.GetAddresses()[0])
            ret = true
        }
        w.waypointAssociatedObjects[waypointResourceName] = newAssociatedObjects(addr)
    }
    w.workloadToWaypoint[uid] = waypointResourceName
    // Anyway, add workload to the association list.
    w.waypointAssociatedObjects[waypointResourceName].addWorkload(uid, workload)

    return ret
}

4.2.4 Waypoint Refresh Mechanism

The refresh mechanism handles a waypoint service that is newly added or whose address has changed:

func (w *waypointCache) Refresh(svc *workloadapi.Service) ([]*workloadapi.Service, []*workloadapi.Workload) {
    if len(svc.GetAddresses()) == 0 {
        return nil, nil
    }

    address := svc.GetAddresses()[0]
    resourceName := svc.ResourceName()

    w.mutex.Lock()
    defer w.mutex.Unlock()

    // If this svc is a waypoint service, may need refreshing.
    if associated, ok := w.waypointAssociatedObjects[resourceName]; ok {
        waypointAddr := associated.waypointAddress()
        if waypointAddr != nil && waypointAddr.String() == address.String() {
            return nil, nil
        }

        log.Debugf("Refreshing services associated with waypoint %s", resourceName)
        return associated.update(address)
    }

    return nil, nil
}

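Key points

  1. Address check: returns early when the waypoint service has no address or its address is unchanged
  2. Association update: updates every service and workload associated with the waypoint
  3. Deferred processing: returns the lists of services and workloads that need to be updated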

4.3 eBPF Integration Implementation

4.3.1 Waypoint Manager Function

In the eBPF programs, the waypoint manager function handles traffic that must go through a waypoint:

static inline int waypoint_manager(struct kmesh_context *kmesh_ctx, struct ip_addr *wp_addr, __u32 port)
{
    ctx_buff_t *ctx = (ctx_buff_t *)kmesh_ctx->ctx;

    if (ctx->user_family == AF_INET)
        kmesh_ctx->dnat_ip.ip4 = wp_addr->ip4;
    else
        bpf_memcpy(kmesh_ctx->dnat_ip.ip6, wp_addr->ip6, IPV6_ADDR_LEN);
    kmesh_ctx->dnat_port = port;
    kmesh_ctx->via_waypoint = true;
    return 0;
}

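Key points

  1. Address translation: rewrites the destination address to the waypoint address, for IPv4 or IPv6
  2. Port translation: rewrites the destination port to the waypoint port
  3. Flag setting: sets the via_waypoint flag to mark that the traffic goes through a waypoint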

4.3.2 Backend Manager Function

The backend manager function decides whether traffic must go through a waypoint:

static inline int
backend_manager(struct kmesh_context *kmesh_ctx, backend_value *backend_v, __u32 service_id, service_value *service_v)
{
    int ret = -ENOENT;
    ctx_buff_t *ctx = (ctx_buff_t *)kmesh_ctx->ctx;
    __u32 i, user_port = ctx->user_port;

    if (backend_v->waypoint_port != 0) {
        BPF_LOG(
            DEBUG,
            BACKEND,
            "route to waypoint[%s:%u]\n",
            ip2str((__u32 *)&backend_v->wp_addr, ctx->family == AF_INET),
            bpf_ntohs(backend_v->waypoint_port));
        ret = waypoint_manager(kmesh_ctx, &backend_v->wp_addr, backend_v->waypoint_port);
        return ret;
    }

    ret = svc_dnat(kmesh_ctx, backend_v, service_v);
    if (ret == 0) {
        BPF_LOG(
            DEBUG,
            BACKEND,
            "svc %u dnat to [%s:%u]\n",
            service_id,
            ip2str((__u32 *)&kmesh_ctx->dnat_ip, ctx->family == AF_INET),
            bpf_ntohs(kmesh_ctx->dnat_port));
    }

    return ret;
}

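Key points

  1. Waypoint check: a non-zero waypoint_port means the backend has a waypoint configured
  2. Routing decision: traffic goes to the waypoint when one is configured, otherwise directly to the backend
  3. DNAT handling: performs destination address translation via svc_dnat when no waypoint is used
  4. Logging: logs the routing decision and the translation result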

4.3.3 Sockops Program Integration

The sockops program handles connections that must go through a waypoint:

case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
    if (!is_managed_by_kmesh(skops))
        break;
    observe_on_connect_established(skops->sk, OUTBOUND);
    if (bpf_sock_ops_cb_flags_set(skops, BPF_SOCK_OPS_STATE_CB_FLAG) != 0)
        BPF_LOG(ERR, SOCKOPS, "set sockops cb failed!\n");
    struct bpf_sock *sk = (struct bpf_sock *)skops->sk;
    if (!sk) {
        break;
    }
    struct sock_storage_data *storage = NULL;
    storage = bpf_sk_storage_get(&map_of_sock_storage, sk, 0, 0);
    if (!storage) {
        break;
    }

    if (storage->via_waypoint) {
        enable_encoding_metadata(skops);
    }
    break;

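Key points

  1. Connection handling: runs when an actively opened (outbound) connection is established
  2. Storage check: reads the via_waypoint flag from the socket storage
  3. Metadata encoding: enables metadata encoding for connections that go through a waypoint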

4.4 Metadata Encoding Implementation

4.4.1 Enabling Metadata Encoding

Metadata encoding adds waypoint-related information when a message is sent:

// update sockmap to trigger sk_msg prog to encode metadata before sending to waypoint
static inline void enable_encoding_metadata(struct bpf_sock_ops *skops)
{
    int err;
    struct bpf_sock_tuple tuple_info = {0};
    extract_skops_to_tuple(skops, &tuple_info);
    err = bpf_sock_hash_update(skops, &map_of_kmesh_socket, &tuple_info, BPF_ANY);
    if (err)
        BPF_LOG(ERR, SOCKOPS, "enable encoding metadata failed!, err is %d", err);
}

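Key points

  1. Socket tuple extraction: extracts the address and port tuple of the socket
  2. sockmap update: adds the socket to the sockmap
  3. Encoding trigger: causes the sk_msg (sendmsg) program to encode the metadata before data is sent to the waypoint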

4.5 Status Management Implementation

4.5.1 Waypoint Status Check

The waypoint status check verifies whether a waypoint is ready:

func printWaypointStatus(w *tabwriter.Writer, kubeClient kube.CLIClient, gw []gateway.Gateway) error {
    var cond metav1.Condition
    startTime := time.Now()
    ticker := time.NewTicker(1 * time.Second)
    defer ticker.Stop()
    if namespace == "" {
        fmt.Fprintln(w, "NAMESPACE\tNAME\tSTATUS\tTYPE\tREASON\tMESSAGE")
    } else {
        fmt.Fprintln(w, "NAME\tSTATUS\tTYPE\tREASON\tMESSAGE")
    }
    for _, gw := range gw {
        for range ticker.C {
            programmed := false
            gwc, err := kubeClient.GatewayAPI().GatewayV1().Gateways(namespaceOrDefault(namespace)).Get(context.TODO(), gw.Name, metav1.GetOptions{})
            if err == nil {
                // Check if gateway has Programmed condition set to true
                for _, cond = range gwc.Status.Conditions {
                    if cond.Type == string(gateway.GatewayConditionProgrammed) && string(cond.Status) == "True" {
                        programmed = true
                        break
                    }
                }
            }
            if namespace == "" {
                fmt.Fprintf(w, "%s\t%s\t%s\t%s\t%s\t%s\n", gwc.Namespace, gwc.Name, cond.Status, cond.Type, cond.Reason, cond.Message)
            } else {
                fmt.Fprintf(w, "%s\t%s\t%s\t%s\t%s\n", gwc.Name, cond.Status, cond.Type, cond.Reason, cond.Message)
            }

            if programmed {
                break
            }
            if time.Since(startTime) > waitTimeout {
                return errorWithMessage("timed out while retrieving status for waypoint", gwc, err)
            }
        }
    }
    return w.Flush()
}

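Key points

  1. Status polling: checks the waypoint status once per second
  2. Condition check: checks the Programmed condition of the Gateway
  3. Timeout handling: returns an error when the status check times out
  4. Status output: prints the waypoint status as a tab-aligned table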
posted @ 2026-02-04 20:11  Mephostopheles