Protobuf

1. What is Protocol Buffers? A serialization framework

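Protocol Buffers (Protobuf) is a language-neutral, platform-neutral, extensible mechanism for serializing structured data, developed by Google. Think of it as XML or JSON, but smaller, faster, and simpler.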

1.2 Key features

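  • Efficiency: Protobuf serializes data into a compact binary format that is typically smaller than text formats such as XML or JSON, so it is cheaper to transmit and store.
  • Speed: Serialization and deserialization are fast, which matters for high-performance systems such as HDFS that exchange large volumes of data.
  • Strong typing: You define the structure of your data (the schema) up front, so every field has an explicit type, which helps avoid parsing errors.
  • Language neutrality: Many programming languages are supported (C++, Java, Python, Go, C#, and others), so the client and the server can be written in different languages and still exchange data seamlessly.
  • Forward and backward compatibility: Schemas evolve gracefully. When the data structure changes (for example, a new field is added), old code can still read data in the new format and new code can still read data in the old format, as long as existing field numbers are not changed or reused.
  • Code generation: Once the data structure is defined in a .proto file, the Protobuf compiler (protoc) generates source code for each target language, and the generated code provides convenient methods for reading and writing the structured data.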
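
To make the compactness claim concrete, here is a minimal sketch that encodes one record by hand in the Protobuf wire format and compares it with compact JSON. It uses only the Python standard library; the helper names and the sample values are made up for illustration, and real code would use protoc-generated classes instead. The size gap depends on the data, so treat the numbers as one data point, not a benchmark.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Length-delimited field: tag, payload length, UTF-8 payload."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


def encode_int_field(field_number: int, value: int) -> bytes:
    """Varint field: tag, then the value (non-negative values only here)."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


# Hypothetical sample record with three fields numbered 1, 2, 3.
record = {"name": "Alice", "id": 150, "email": "alice@example.com"}

binary = (
    encode_string_field(1, record["name"])
    + encode_int_field(2, record["id"])
    + encode_string_field(3, record["email"])
)
text = json.dumps(record, separators=(",", ":")).encode("utf-8")

print(len(binary), len(text))  # 29 53
```

Most of the saving comes from replacing the quoted field names with one-byte tags and storing the integer as a two-byte varint instead of decimal text.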

1.3 How it works

1.3.1 Define a .proto file

First, define your data structure in a .proto file. Each structure is called a "message". For example:

syntax = "proto3"; // or proto2

message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;
}

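Here name, id, and email are the fields, and the number after each one is its field number, which uniquely identifies that field within the message.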
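
The field number is what actually goes on the wire. In the Protobuf encoding, each field is preceded by a tag computed as (field_number << 3) | wire_type. The sketch below computes the tags for the three Person fields; the function names are illustrative, not part of any Protobuf library.

```python
WIRE_VARINT = 0  # int32, int64, uint64, bool, enum
WIRE_LEN = 2     # string, bytes, nested messages


def make_tag(field_number: int, wire_type: int) -> int:
    """Combine a field number and a wire type into the tag value."""
    return (field_number << 3) | wire_type


def split_tag(tag: int):
    """Recover (field_number, wire_type) from a tag value."""
    return tag >> 3, tag & 0x07


person_fields = [("name", 1, WIRE_LEN), ("id", 2, WIRE_VARINT), ("email", 3, WIRE_LEN)]
for field_name, number, wire_type in person_fields:
    print(f"{field_name}: tag byte 0x{make_tag(number, wire_type):02x}")
```

Because only the number is encoded, renaming a field does not change the serialized bytes, while changing its number does.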

1.3.2 Compile the .proto file

Use the Protobuf compiler, protoc, to compile the .proto file and generate code for a specific programming language. For example, for Java:

protoc --java_out=./ Person.proto

This generates Java source code containing the Java representation of the Person message, along with methods to serialize and deserialize it.

1.3.3 Use the generated code

In your application, you use the generated classes to build, serialize, and deserialize data.
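
With the Java classes generated by protoc, the usual pattern is Person.newBuilder().setName(...).build(), then toByteArray() to serialize and Person.parseFrom(bytes) to deserialize. Running that requires the Protobuf runtime and the generated sources. The sketch below is a hand-written stand-in that follows the same build, serialize, parse cycle using only the Python standard library. The class and its method names are hypothetical, it handles only non-negative id values, and it is not what protoc emits.

```python
from dataclasses import dataclass


def write_varint(out: bytearray, value: int) -> None:
    """Append a non-negative integer as a base-128 varint."""
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)


def read_varint(buf: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def write_string(out: bytearray, number: int, text: str) -> None:
    """Append a length-delimited string field."""
    data = text.encode("utf-8")
    write_varint(out, (number << 3) | 2)
    write_varint(out, len(data))
    out += data


@dataclass
class Person:
    name: str = ""
    id: int = 0
    email: str = ""

    def serialize(self) -> bytes:
        """Write the fields in field-number order; proto3 omits default values."""
        out = bytearray()
        if self.name:
            write_string(out, 1, self.name)
        if self.id:
            write_varint(out, (2 << 3) | 0)
            write_varint(out, self.id)
        if self.email:
            write_string(out, 3, self.email)
        return bytes(out)

    @classmethod
    def parse(cls, buf: bytes) -> "Person":
        """Read tag/value pairs until the buffer is exhausted."""
        person, pos = cls(), 0
        while pos < len(buf):
            tag, pos = read_varint(buf, pos)
            number, wire_type = tag >> 3, tag & 0x07
            if wire_type == 0:
                value, pos = read_varint(buf, pos)
                if number == 2:
                    person.id = value
            elif wire_type == 2:
                size, pos = read_varint(buf, pos)
                data, pos = buf[pos:pos + size], pos + size
                if number == 1:
                    person.name = data.decode("utf-8")
                elif number == 3:
                    person.email = data.decode("utf-8")
            else:
                raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        return person


original = Person(name="Alice", id=150, email="alice@example.com")
payload = original.serialize()
restored = Person.parse(payload)
print(restored == original)
```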

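2. Protobuf in HDFS

As a distributed file system, HDFS needs a great deal of communication and data exchange among its components (NameNode, DataNode, Client). To make that communication efficient and reliable, HDFS has used Protocol Buffers extensively since version 0.23 as the underlying data serialization mechanism for its RPC protocols.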

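2.1 Where it is used

  • RPC communication: RPC calls between an HDFS client and the NameNode or DataNodes, as well as internal RPC calls between the NameNode and DataNodes, use Protobuf to serialize request and response data. For example, when a client asks to read a file, it sends a Protobuf-serialized request message to the NameNode, and the NameNode processes it and returns a Protobuf-serialized response message.

  • Data structure definitions: Many core HDFS data structures, such as blocks (ExtendedBlockProto), DataNode identity (DatanodeIDProto), and block locations (LocatedBlockProto), are defined in Protobuf .proto files. The HDFS source tree contains files such as hdfs.proto that define the message formats HDFS uses internally.

    • For example, hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/hdfs.proto holds Protobuf definitions used by the HDFS client, server, and data transfer protocols. The file shown below, NamenodeProtocol.proto, imports hdfs.proto and defines the request and response messages, plus the RPC service, that a subordinate NameNode uses to talk to the active NameNode.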

      NamenodeProtocol.proto

      /**
       * Licensed to the Apache Software Foundation (ASF) under one
       * or more contributor license agreements.  See the NOTICE file
       * distributed with this work for additional information
       * regarding copyright ownership.  The ASF licenses this file
       * to you under the Apache License, Version 2.0 (the
       * "License"); you may not use this file except in compliance
       * with the License.  You may obtain a copy of the License at
       *
       *     http://www.apache.org/licenses/LICENSE-2.0
       *
       * Unless required by applicable law or agreed to in writing, software
       * distributed under the License is distributed on an "AS IS" BASIS,
       * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
       * See the License for the specific language governing permissions and
       * limitations under the License.
       */
      
      /**
       * These .proto interfaces are private and stable.
       * Please see https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html
       * for what changes are allowed for a *stable* .proto interface.
       */
      
      // This file contains protocol buffers that are used throughout HDFS -- i.e.
      // by the client, server, and data transfer protocols.
      syntax = "proto2";
      option java_package = "org.apache.hadoop.hdfs.protocol.proto";
      option java_outer_classname = "NamenodeProtocolProtos";
      option java_generic_services = true;
      option java_generate_equals_and_hash = true;
      package hadoop.hdfs.namenode;
      
      import "hdfs.proto";
      import "HdfsServer.proto";
      
      /**
       * Get list of blocks for a given datanode with the total length 
       * of adding up to given size
       * datanode - Datanode ID to get list of block from
       * size - size to which the block lengths must add up to
       */
      message GetBlocksRequestProto {
        required DatanodeIDProto datanode = 1; // Datanode ID
        required uint64 size = 2;              // Size in bytes
        // Minimum Block Size in bytes, adding default value to 10MB, as this might
        // cause problem during rolling upgrade, when balancers are upgraded later.
        // For more info refer HDFS-13356
        optional uint64 minBlockSize = 3 [default = 10485760];
        optional uint64 timeInterval = 4 [default = 0];
        optional StorageTypeProto storageType = 5;
      }
      
       
      /**
       * blocks - List of returned blocks
       */
      message GetBlocksResponseProto {
        required BlocksWithLocationsProto blocks = 1; // List of blocks
      }
      
      /**
       * void request
       */
      message GetBlockKeysRequestProto {
      }
      
      /**
       * keys - Information about block keys at the active namenode
       */
      message GetBlockKeysResponseProto {
        optional ExportedBlockKeysProto keys = 1;
      }
      
      /**
       * void request
       */
      message GetTransactionIdRequestProto {
      }
      
      /**
       * txId - Transaction ID of the most recently persisted edit log record
       */
      message GetTransactionIdResponseProto {
        required uint64 txId = 1;   // Transaction ID
      }
      
      /**
       * void request
       */
      message RollEditLogRequestProto {
      }
      
      /**
       * signature - A unique token to identify checkpoint transaction
       */
      message RollEditLogResponseProto {
        required CheckpointSignatureProto signature = 1;
      }
      
      /**
       * void request
       */
      message GetMostRecentCheckpointTxIdRequestProto {
      }
      
      message GetMostRecentCheckpointTxIdResponseProto{
        required uint64 txId = 1;
      }
      
      message GetMostRecentNameNodeFileTxIdRequestProto {
        required string nameNodeFile = 1;
      }
      
      message GetMostRecentNameNodeFileTxIdResponseProto{
        required uint64 txId = 1;
      }
      
      /**
       * registration - Namenode reporting the error
       * errorCode - error code indicating the error
       * msg - Free text description of the error
       */
      message ErrorReportRequestProto {
        required NamenodeRegistrationProto registration = 1; // Registration info
        required uint32 errorCode = 2;  // Error code
        required string msg = 3;        // Error message
      }
      
      /**
       * void response
       */
      message ErrorReportResponseProto {
      }
      
      /**
       * registration - Information of the namenode registering with primary namenode
       */
      message RegisterRequestProto {
        required NamenodeRegistrationProto registration = 1; // Registration info
      }
      
      /**
       * registration - Updated registration information of the newly registered
       *                datanode.
       */
      message RegisterResponseProto {
        required NamenodeRegistrationProto registration = 1; // Registration info
      }
      
      /**
       * Start checkpoint request
       * registration - Namenode that is starting the checkpoint
       */
      message StartCheckpointRequestProto {
        required NamenodeRegistrationProto registration = 1; // Registration info
      }
      
      /**
       * command - Command returned by the active namenode to be
       *           be handled by the caller.
       */
      message StartCheckpointResponseProto {
        required NamenodeCommandProto command = 1;
      }
      
      /**
       * End or finalize the previously started checkpoint
       * registration - Namenode that is ending the checkpoint
       * signature - unique token to identify checkpoint transaction,
       *             that was received when checkpoint was started.
       */
      message EndCheckpointRequestProto {
        required NamenodeRegistrationProto registration = 1; // Registration info
        required CheckpointSignatureProto signature = 2;
      }
      
      /**
       * void response
       */
      message EndCheckpointResponseProto {
      }
      
      /**
       * sinceTxId - return the editlog information for transactions >= sinceTxId
       */
      message GetEditLogManifestRequestProto {
        required uint64 sinceTxId = 1;  // Transaction ID
      }
      
      /**
       * manifest - Enumeration of editlogs from namenode for 
       *            logs >= sinceTxId in the request
       */
      message GetEditLogManifestResponseProto {
        required RemoteEditLogManifestProto manifest = 1; 
      }
      
      /**
       * void request
       */
      message IsUpgradeFinalizedRequestProto {
      }
      
      message IsUpgradeFinalizedResponseProto {
        required bool isUpgradeFinalized = 1;
      }
      
      /**
       * void request
       */
      message IsRollingUpgradeRequestProto {
      }
      
      message IsRollingUpgradeResponseProto {
        required bool isRollingUpgrade = 1;
      }
      
      message GetFilePathRequestProto {
        required uint64 fileId = 1;
      }
      
      message GetFilePathResponseProto {
        required string srcPath = 1;
      }
      
      message GetNextSPSPathRequestProto {
      }
      
      message GetNextSPSPathResponseProto {
        optional uint64 spsPath = 1;
      }
      
      /**
       * Protocol used by the sub-ordinate namenode to send requests
       * the active/primary namenode.
       *
       * See the request and response for details of rpc call.
       */
      service NamenodeProtocolService {
        /**
         * Get list of blocks for a given datanode with length
         * of blocks adding up to given size.
         */
        rpc getBlocks(GetBlocksRequestProto) returns(GetBlocksResponseProto);
      
        /**
         * Get the current block keys
         */
        rpc getBlockKeys(GetBlockKeysRequestProto) returns(GetBlockKeysResponseProto);
      
        /**
         * Get the transaction ID of the most recently persisted editlog record
         */
        rpc getTransactionId(GetTransactionIdRequestProto) 
            returns(GetTransactionIdResponseProto);
      
        /**
         * Get the transaction ID of the most recently persisted editlog record
         */
        rpc getMostRecentCheckpointTxId(GetMostRecentCheckpointTxIdRequestProto) 
            returns(GetMostRecentCheckpointTxIdResponseProto);
      
        /**
         * Get the transaction ID of the NameNodeFile
         */
        rpc getMostRecentNameNodeFileTxId(GetMostRecentNameNodeFileTxIdRequestProto)
            returns(GetMostRecentNameNodeFileTxIdResponseProto);
      
        /**
         * Close the current editlog and open a new one for checkpointing purposes
         */
        rpc rollEditLog(RollEditLogRequestProto) returns(RollEditLogResponseProto);
      
        /**
         * Request info about the version running on this NameNode
         */
        rpc versionRequest(VersionRequestProto) returns(VersionResponseProto);
      
        /**
         * Report from a sub-ordinate namenode of an error to the active namenode.
         * Active namenode may decide to unregister the reporting namenode 
         * depending on the error.
         */
        rpc errorReport(ErrorReportRequestProto) returns(ErrorReportResponseProto);
      
        /**
         * Request to register a sub-ordinate namenode
         */
        rpc registerSubordinateNamenode(RegisterRequestProto) returns(RegisterResponseProto);
      
        /**
         * Request to start a checkpoint. 
         */
        rpc startCheckpoint(StartCheckpointRequestProto) 
            returns(StartCheckpointResponseProto);
      
        /**
         * End of finalize the previously started checkpoint
         */
        rpc endCheckpoint(EndCheckpointRequestProto) 
            returns(EndCheckpointResponseProto);
      
        /**
         * Get editlog manifests from the active namenode for all the editlogs
         */
        rpc getEditLogManifest(GetEditLogManifestRequestProto) 
            returns(GetEditLogManifestResponseProto);
      
        /**
         * Return whether the NameNode is in upgrade state (false) or not (true)
         */
        rpc isUpgradeFinalized(IsUpgradeFinalizedRequestProto)
            returns (IsUpgradeFinalizedResponseProto);
      
        /**
         * Return whether the NameNode is in rolling upgrade (true) or not (false).
         */
        rpc isRollingUpgrade(IsRollingUpgradeRequestProto)
            returns (IsRollingUpgradeResponseProto);
      
        /**
         * Return the sps path from namenode
         */
        rpc getNextSPSPath(GetNextSPSPathRequestProto)
            returns (GetNextSPSPathResponseProto);
      }
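
As a rough illustration of how one rpc entry in the service above maps to bytes, the sketch below plays out getTransactionId in a single process: the request message has no fields, so it serializes to zero bytes, and the response carries txId as field 1. The FakeNamenode class and the helper functions are hypothetical, and real Hadoop RPC adds its own headers and framing around these payloads, which this sketch leaves out.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_txid_response(tx_id: int) -> bytes:
    """GetTransactionIdResponseProto: required uint64 txId = 1 (tag byte 0x08)."""
    return bytes([0x08]) + encode_varint(tx_id)


def decode_txid_response(buf: bytes) -> int:
    """Read txId; fail if the required field is absent."""
    if not buf or buf[0] != 0x08:
        raise ValueError("required field txId is missing")
    value, _ = decode_varint(buf, 1)
    return value


class FakeNamenode:
    """Hypothetical in-process stand-in for the server side of the service."""

    def __init__(self, last_tx_id: int) -> None:
        self.last_tx_id = last_tx_id

    def call(self, method: str, request: bytes) -> bytes:
        """Dispatch by rpc name: serialized request in, serialized response out."""
        handlers = {"getTransactionId": self._get_transaction_id}
        return handlers[method](request)

    def _get_transaction_id(self, request: bytes) -> bytes:
        # GetTransactionIdRequestProto declares no fields, so the request is ignored.
        return encode_txid_response(self.last_tx_id)


namenode = FakeNamenode(last_tx_id=300)
response = namenode.call("getTransactionId", b"")
print(response.hex(), decode_txid_response(response))  # 08ac02 300
```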
      
      

2.2 Why use Protobuf?

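  • Performance: HDFS handles huge volumes of data and issues RPC calls frequently. Protobuf's compact binary format and fast serialization and deserialization improve HDFS communication efficiency and overall performance.
  • Cross-language support: HDFS is written mainly in Java, but because Protobuf is language-neutral, integrating components written in other languages (such as C++ or Python) becomes easier.
  • Schema evolution: HDFS is a long-lived project whose data structures keep changing. Protobuf's compatibility rules help old clients or old DataNodes keep communicating with a new NameNode during an HDFS upgrade, and vice versa, which reduces the complexity of upgrades.
  • Code maintenance: Managing data structures centrally in .proto files and generating code with tools reduces the amount of hand-written serialization and deserialization code, lowers the chance of errors, and improves readability and maintainability.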
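
The schema evolution point can be sketched with GetBlocksRequestProto from the listing above, whose minBlockSize field is optional with a default of 10485760. In the minimal example below, a reader that knows the field falls back to the default when an older sender omits it, and a reader that does not know the field simply skips it. OLD_SCHEMA is a hypothetical earlier version of the message, the datanode payload is a placeholder, and the decoder is hand-written for illustration, not HDFS or Protobuf library code.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def read_varint(buf: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def varint_field(number: int, value: int) -> bytes:
    return encode_varint((number << 3) | 0) + encode_varint(value)


def bytes_field(number: int, data: bytes) -> bytes:
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


def decode(buf: bytes, varint_names: dict, defaults: dict) -> dict:
    """Decode the varint fields named in varint_names and skip everything else."""
    fields = dict(defaults)  # start from the declared defaults
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
            if number in varint_names:  # unknown numbers are dropped here
                fields[varint_names[number]] = value
        elif wire_type == 2:
            size, pos = read_varint(buf, pos)
            pos += size  # nested message (datanode): skipped in this sketch
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
    return fields


OLD_SCHEMA = {2: "size"}                     # hypothetical older reader
NEW_SCHEMA = {2: "size", 3: "minBlockSize"}  # reader that knows field 3
NEW_DEFAULTS = {"minBlockSize": 10485760}    # [default = 10485760] in the listing

datanode = bytes_field(1, b"datanode-id-placeholder")
old_message = datanode + varint_field(2, 1 << 30)
new_message = old_message + varint_field(3, 1 << 20)

new_reads_old = decode(old_message, NEW_SCHEMA, NEW_DEFAULTS)
old_reads_new = decode(new_message, OLD_SCHEMA, {})
print(new_reads_old)
print(old_reads_new)
```

The generated Protobuf code typically preserves unknown fields instead of discarding them, so this sketch shows only the part that keeps old and new readers from failing on each other's data.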

HDFS chose Protobuf mainly for its RPC performance and its compact encoding, both of which are critical for a core distributed file system.

posted @ 2025-08-16 12:15  PRdE