Protobuf
1. What is Protocol Buffers? -- A Serialization Framework
Protocol Buffers is a language-neutral, platform-neutral, extensible mechanism for serializing structured data, developed by Google. Think of it as a counterpart to XML or JSON, but smaller, faster, and simpler.
1.2 Key Features
- Compact: Protobuf serializes data into a compact binary format that is smaller than text formats such as XML or JSON, so it is cheaper to transmit and store.
- Fast: Serialization and deserialization are very fast, which matters for high-performance systems such as the heavy data traffic inside HDFS.
- Strongly typed: You must define a schema for your data, so every field has an explicit type, which helps prevent parsing errors.
- Language-neutral: Many languages are supported (C++, Java, Python, Go, C#, and more), so a client and a server written in different languages can exchange data seamlessly.
- Forward and backward compatible: The schema can evolve gracefully; when the structure changes (for example, a new field is added), old code can still read data in the new format, and new code can still read data in the old format (see the sketch after this list).
- Code generation: After you define your data structures in a .proto file, the Protobuf compiler (protoc) can generate source code in the language of your choice; the generated code provides convenient methods for reading and writing the structured data.
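As a quick illustration of the compatibility property, here is what an evolved version of a message like the Person example (defined in section 1.3.1 below) might look like. The phone field is hypothetical; the point is that existing field numbers keep their meaning and the new field takes an unused number, so old parsers simply skip field 4 in new data, and new parsers see it as unset in old data.

syntax = "proto3";

// Hypothetical v2 of the Person message: fields 1-3 are unchanged,
// and the new field takes the next unused field number.
message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;
  string phone = 4; // new in v2; old readers skip it, old data leaves it empty
}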
1.3 How It Works
1.3.1 Define a .proto File
First, define your data structures, called messages, in a .proto file. For example:
syntax = "proto3"; // or proto2

message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;
}
Here name, id, and email are fields, and the numbers after them are each field's unique identifier (field number), which identifies the field on the wire.
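As a small worked example of why the binary encoding is compact, using the standard Protobuf varint wire format: serializing a Person in which only id = 150 is set produces just three bytes (in proto3, fields left at their default value are not written at all).

0x10        tag byte: (field number 2 << 3) | wire type 0 (varint)
0x96 0x01   the value 150 encoded as a base-128 varint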
1.3.2 Compile the .proto File
Use the Protobuf compiler protoc to compile the .proto file and generate code for a specific language. For Java, for example:
protoc --java_out=./ Person.proto
This generates a Java class containing the Java representation of the Person message along with methods for serializing and deserializing it.
1.3.3 Use the Generated Code
In your application, you use the generated classes to build, serialize, and deserialize data, as in the sketch below.
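A minimal sketch of what that looks like in Java, assuming Person.proto was compiled with option java_multiple_files = true so that Person is a top-level class (with the default options, protoc instead nests it inside an outer wrapper class):

import com.google.protobuf.InvalidProtocolBufferException;

public class PersonDemo {
  public static void main(String[] args) throws InvalidProtocolBufferException {
    // Build a message through the generated builder API.
    Person person = Person.newBuilder()
        .setName("Alice")
        .setId(1234)
        .setEmail("alice@example.com")
        .build();

    // Serialize to the compact binary wire format.
    byte[] bytes = person.toByteArray();

    // Parse the bytes back into a Person object.
    Person parsed = Person.parseFrom(bytes);
    System.out.println(parsed.getName() + " <" + parsed.getEmail() + ">");
  }
}

Note that the generated message objects are immutable; all mutation goes through the builder, which is why the build/serialize/parse cycle above is the idiomatic pattern.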
2. Usage in HDFS
As a distributed file system, HDFS requires a large amount of communication and data exchange among its components (NameNode, DataNode, clients). To make this communication efficient and reliable, HDFS has used Protocol Buffers extensively as the underlying serialization mechanism of its RPC protocols since version 0.23.
2.1 Specific Use Cases
- RPC communication: All RPC calls between HDFS clients and the NameNode or DataNodes, as well as the internal RPC calls between the NameNode and DataNodes, use Protobuf to serialize requests and responses. For example, when a client asks to read a file, it sends a Protobuf-serialized request message to the NameNode, which processes it and returns a Protobuf-serialized response (a concrete sketch follows the NamenodeProtocol.proto listing below).
- Data structure definitions: Many of HDFS's core data structures, such as file blocks (ExtendedBlockProto), datanode identities (DatanodeIDProto), and block locations (LocatedBlockProto), are defined in Protobuf .proto files. In the HDFS source tree you will find files such as hdfs.proto that define the message formats HDFS uses internally. For example, hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/hdfs.proto contains the Protobuf definitions shared by the HDFS client, server, and data transfer protocols. Another example is NamenodeProtocol.proto, which defines the protocol a subordinate NameNode uses to talk to the active/primary NameNode:
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * These .proto interfaces are private and stable.
 * Please see https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html
 * for what changes are allowed for a *stable* .proto interface.
 */

// This file contains protocol buffers that are used throughout HDFS -- i.e.
// by the client, server, and data transfer protocols.

syntax = "proto2";
option java_package = "org.apache.hadoop.hdfs.protocol.proto";
option java_outer_classname = "NamenodeProtocolProtos";
option java_generic_services = true;
option java_generate_equals_and_hash = true;
package hadoop.hdfs.namenode;

import "hdfs.proto";
import "HdfsServer.proto";

/**
 * Get list of blocks for a given datanode with the total length
 * of adding up to given size
 * datanode - Datanode ID to get list of block from
 * size - size to which the block lengths must add up to
 */
message GetBlocksRequestProto {
  required DatanodeIDProto datanode = 1; // Datanode ID
  required uint64 size = 2;              // Size in bytes
  // Minimum Block Size in bytes, adding default value to 10MB, as this might
  // cause problem during rolling upgrade, when balancers are upgraded later.
  // For more info refer HDFS-13356
  optional uint64 minBlockSize = 3 [default = 10485760];
  optional uint64 timeInterval = 4 [default = 0];
  optional StorageTypeProto storageType = 5;
}

/**
 * blocks - List of returned blocks
 */
message GetBlocksResponseProto {
  required BlocksWithLocationsProto blocks = 1; // List of blocks
}

/**
 * void request
 */
message GetBlockKeysRequestProto {
}

/**
 * keys - Information about block keys at the active namenode
 */
message GetBlockKeysResponseProto {
  optional ExportedBlockKeysProto keys = 1;
}

/**
 * void request
 */
message GetTransactionIdRequestProto {
}

/**
 * txId - Transaction ID of the most recently persisted edit log record
 */
message GetTransactionIdResponseProto {
  required uint64 txId = 1; // Transaction ID
}

/**
 * void request
 */
message RollEditLogRequestProto {
}

/**
 * signature - A unique token to identify checkpoint transaction
 */
message RollEditLogResponseProto {
  required CheckpointSignatureProto signature = 1;
}

/**
 * void request
 */
message GetMostRecentCheckpointTxIdRequestProto {
}

message GetMostRecentCheckpointTxIdResponseProto {
  required uint64 txId = 1;
}

message GetMostRecentNameNodeFileTxIdRequestProto {
  required string nameNodeFile = 1;
}

message GetMostRecentNameNodeFileTxIdResponseProto {
  required uint64 txId = 1;
}

/**
 * registration - Namenode reporting the error
 * errorCode - error code indicating the error
 * msg - Free text description of the error
 */
message ErrorReportRequestProto {
  required NamenodeRegistrationProto registration = 1; // Registration info
  required uint32 errorCode = 2; // Error code
  required string msg = 3;       // Error message
}

/**
 * void response
 */
message ErrorReportResponseProto {
}

/**
 * registration - Information of the namenode registering with primary namenode
 */
message RegisterRequestProto {
  required NamenodeRegistrationProto registration = 1; // Registration info
}

/**
 * registration - Updated registration information of the newly registered
 * datanode.
 */
message RegisterResponseProto {
  required NamenodeRegistrationProto registration = 1; // Registration info
}

/**
 * Start checkpoint request
 * registration - Namenode that is starting the checkpoint
 */
message StartCheckpointRequestProto {
  required NamenodeRegistrationProto registration = 1; // Registration info
}

/**
 * command - Command returned by the active namenode to be
 *           be handled by the caller.
 */
message StartCheckpointResponseProto {
  required NamenodeCommandProto command = 1;
}

/**
 * End or finalize the previously started checkpoint
 * registration - Namenode that is ending the checkpoint
 * signature - unique token to identify checkpoint transaction,
 *             that was received when checkpoint was started.
 */
message EndCheckpointRequestProto {
  required NamenodeRegistrationProto registration = 1; // Registration info
  required CheckpointSignatureProto signature = 2;
}

/**
 * void response
 */
message EndCheckpointResponseProto {
}

/**
 * sinceTxId - return the editlog information for transactions >= sinceTxId
 */
message GetEditLogManifestRequestProto {
  required uint64 sinceTxId = 1; // Transaction ID
}

/**
 * manifest - Enumeration of editlogs from namenode for
 *            logs >= sinceTxId in the request
 */
message GetEditLogManifestResponseProto {
  required RemoteEditLogManifestProto manifest = 1;
}

/**
 * void request
 */
message IsUpgradeFinalizedRequestProto {
}

message IsUpgradeFinalizedResponseProto {
  required bool isUpgradeFinalized = 1;
}

/**
 * void request
 */
message IsRollingUpgradeRequestProto {
}

message IsRollingUpgradeResponseProto {
  required bool isRollingUpgrade = 1;
}

message GetFilePathRequestProto {
  required uint64 fileId = 1;
}

message GetFilePathResponseProto {
  required string srcPath = 1;
}

message GetNextSPSPathRequestProto {
}

message GetNextSPSPathResponseProto {
  optional uint64 spsPath = 1;
}

/**
 * Protocol used by the sub-ordinate namenode to send requests
 * the active/primary namenode.
 *
 * See the request and response for details of rpc call.
 */
service NamenodeProtocolService {
  /**
   * Get list of blocks for a given datanode with length
   * of blocks adding up to given size.
   */
  rpc getBlocks(GetBlocksRequestProto) returns (GetBlocksResponseProto);

  /**
   * Get the current block keys
   */
  rpc getBlockKeys(GetBlockKeysRequestProto) returns (GetBlockKeysResponseProto);

  /**
   * Get the transaction ID of the most recently persisted editlog record
   */
  rpc getTransactionId(GetTransactionIdRequestProto) returns (GetTransactionIdResponseProto);

  /**
   * Get the transaction ID of the most recently persisted editlog record
   */
  rpc getMostRecentCheckpointTxId(GetMostRecentCheckpointTxIdRequestProto) returns (GetMostRecentCheckpointTxIdResponseProto);

  /**
   * Get the transaction ID of the NameNodeFile
   */
  rpc getMostRecentNameNodeFileTxId(GetMostRecentNameNodeFileTxIdRequestProto) returns (GetMostRecentNameNodeFileTxIdResponseProto);

  /**
   * Close the current editlog and open a new one for checkpointing purposes
   */
  rpc rollEditLog(RollEditLogRequestProto) returns (RollEditLogResponseProto);

  /**
   * Request info about the version running on this NameNode
   */
  rpc versionRequest(VersionRequestProto) returns (VersionResponseProto);

  /**
   * Report from a sub-ordinate namenode of an error to the active namenode.
   * Active namenode may decide to unregister the reporting namenode
   * depending on the error.
   */
  rpc errorReport(ErrorReportRequestProto) returns (ErrorReportResponseProto);

  /**
   * Request to register a sub-ordinate namenode
   */
  rpc registerSubordinateNamenode(RegisterRequestProto) returns (RegisterResponseProto);

  /**
   * Request to start a checkpoint.
   */
  rpc startCheckpoint(StartCheckpointRequestProto) returns (StartCheckpointResponseProto);

  /**
   * End of finalize the previously started checkpoint
   */
  rpc endCheckpoint(EndCheckpointRequestProto) returns (EndCheckpointResponseProto);

  /**
   * Get editlog manifests from the active namenode for all the editlogs
   */
  rpc getEditLogManifest(GetEditLogManifestRequestProto) returns (GetEditLogManifestResponseProto);

  /**
   * Return whether the NameNode is in upgrade state (false) or not (true)
   */
  rpc isUpgradeFinalized(IsUpgradeFinalizedRequestProto) returns (IsUpgradeFinalizedResponseProto);

  /**
   * Return whether the NameNode is in rolling upgrade (true) or not (false).
   */
  rpc isRollingUpgrade(IsRollingUpgradeRequestProto) returns (IsRollingUpgradeResponseProto);

  /**
   * Return the sps path from namenode
   */
  rpc getNextSPSPath(GetNextSPSPathRequestProto) returns (GetNextSPSPathResponseProto);
}
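To connect the definitions above to actual usage, here is a hedged sketch of how the generated classes can build a getBlocks request and frame it onto a byte stream with Protobuf's length-delimited helpers. This is not Hadoop's real RPC engine, which wraps messages in its own protocol headers via ProtobufRpcEngine; it only illustrates the Protobuf layer. The generated class locations follow the java_package/java_outer_classname options declared in the .proto files, and the exact set of required DatanodeIDProto fields, as well as all the example values, are assumptions that may vary across Hadoop versions.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.hadoop.hdfs.protocol.proto.HdfsProtos.DatanodeIDProto;
import org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos.GetBlocksRequestProto;

public class GetBlocksFramingDemo {
  public static void main(String[] args) throws IOException {
    // Identify the datanode whose blocks we want (field set per hdfs.proto's
    // DatanodeIDProto; required fields may differ across Hadoop versions).
    DatanodeIDProto datanode = DatanodeIDProto.newBuilder()
        .setIpAddr("10.0.0.5")                    // hypothetical address
        .setHostName("dn1.example.com")           // hypothetical hostname
        .setDatanodeUuid("datanode-uuid-demo")    // hypothetical UUID
        .setXferPort(9866)
        .setInfoPort(9864)
        .setIpcPort(9867)
        .build();

    // Ask for up to 1 GiB worth of block lengths, as the balancer might.
    GetBlocksRequestProto request = GetBlocksRequestProto.newBuilder()
        .setDatanode(datanode)
        .setSize(1L << 30)
        .build();

    // Length-delimited framing: a varint length prefix followed by the message
    // bytes, so a receiver on a stream knows where the message ends.
    ByteArrayOutputStream wire = new ByteArrayOutputStream();
    request.writeDelimitedTo(wire);

    GetBlocksRequestProto decoded = GetBlocksRequestProto
        .parseDelimitedFrom(new ByteArrayInputStream(wire.toByteArray()));
    System.out.println("requested size = " + decoded.getSize());
  }
}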
2.2 Why Use Protobuf?
- Performance: HDFS handles huge volumes of data and makes frequent RPC calls. Protobuf's compact binary format and fast serialization/deserialization significantly improve HDFS's communication efficiency and overall performance.
- Cross-language support: Although HDFS is written mostly in Java, Protobuf's language neutrality makes it easier to integrate components written in other languages (such as C++ or Python) in the future.
- Schema evolution: HDFS is a long-lived, evolving project whose data structures keep changing. Protobuf's compatibility guarantees mean that after an upgrade, old clients or old DataNodes can still talk to a new NameNode, and vice versa, which lowers the complexity of upgrades. The optional minBlockSize field with a default value in GetBlocksRequestProto above (added for HDFS-13356) is exactly this kind of compatible addition.
- Maintainability: Defining data structures centrally in .proto files and generating code from them eliminates hand-written serialization/deserialization code, reduces the chance of bugs, and improves readability and maintainability.
HDFS chose Protobuf chiefly for its RPC performance and compact wire format, both of which are critical for a core distributed file system.
