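Multimedia File Formats: RMVB

[Date: 2016-07] [Status: Open]

RM/RMVB is a proprietary container format from Real; the common file extensions are rm, ra, and rmvb.
It usually carries Real's proprietary codecs, such as sipro, cook, atrc, ralf, and raac for audio, and RV10, RV20, RV30, and RV40 for video.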

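0. Why study multimedia container formats

The main goal is to answer the following questions:

  1. How is data organized in the container?
  2. Which codecs can the container carry, and how is their data stored?
  3. What metadata does the container hold? What program information does it hold?
  4. For containers that support multiple programs, how do you find the matching audio, video, and subtitle streams?
  5. How do you determine the playback duration of a program in the container?
  6. How do you extract audio, video, and subtitle data from the container and hand it to a decoder, and does it carry timestamps?
  7. Does the container support seeking? What auxiliary information does it provide for that?
  8. Can it be streamed directly?
  9. Where can you find the most authoritative documentation for the container format?
  10. Which tools are available for analyzing malformed or broken files in this format?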

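1. Overview of the RM file format

RealMedia File Format (RMFF) is a tag-based file format; each tag is a four-byte identifier (FOURCC) that indicates the element type.
The basic building block of an RM file is the chunk. Each chunk is laid out as follows: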

==============
 ID(FOURCC)
--------------
 size(4 byte)
--------------
 data([size])
==============

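The ID of each chunk determines how its data field is parsed. A top-level chunk can contain sub-chunks.
A typical chunk layout is shown in the figure below:
[Figure: rmff chunk diagram]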
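
The chunk layout described above can be walked with a short loop. The following is only a minimal sketch, not a reference implementation. It assumes that multi-byte integers are big-endian and that size counts the whole chunk, including the 8 bytes of ID and size (which is how the per-header field tables describe size). The name iter_chunks and the demo buffer are made up for illustration.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (fourcc, size, payload) for each top-level chunk in stream."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of data, or a truncated trailing chunk
        fourcc, size = struct.unpack(">4sI", header)
        if size < 8:
            raise ValueError(f"chunk {fourcc!r} has implausible size {size}")
        # Assumption: size includes the 8 header bytes already consumed.
        payload = stream.read(size - 8)
        yield fourcc, size, payload

# Synthetic two-chunk buffer, for illustration only.
demo = (b".RMF" + struct.pack(">IHII", 18, 0, 0, 1)
        + b"CONT" + struct.pack(">IH4H", 18, 0, 0, 0, 0, 0))

for fourcc, size, payload in iter_chunks(io.BytesIO(demo)):
    print(fourcc, size, len(payload))
```

On a real file this loop is only suitable for the chunk-structured parts; a trailing ID3v1 tag, for example, does not follow the ID/size layout.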

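An RM file normally consists of three parts: the header section, the data section, and the index section. Each section is made up of several chunks, as shown in the figure below:
[Figure: RMFF structure diagram]

The following sections describe each part in detail.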

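2. RM file header (header section)

RMFF is a tag-based format, so the order of the chunks in the header section is not fixed, except that the RealMedia File Header must be the first chunk. The other chunks that follow are the Properties Header, the Media Properties Header, and the Content Description Header.

RealMedia File Header

The RealMedia File Header is used to identify the file format, and each RM file has exactly one. Its fields are as follows: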

RealMedia_File_Header
{
  UINT32    object_id;
  UINT32    size;
  UINT16    object_version;

  if ((object_version == 0) || (object_version == 1))
  {
    UINT32   file_version;
    UINT32   num_headers;
  }
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for a RealMedia File ('.RMF'). All RealMedia files begin with this identifier.
size UINT32 The size of the RealMedia header section in bytes.
object_version UINT16 The version of the RealMedia File Header object. All files created according to this specification have an object_version number of 0 (zero) or 1.
file_version UINT32 The version of the RealMedia file. This member is present on all RealMedia_File_Header objects with an object_version of 0 (zero) or 1.
num_headers UINT32 The number of headers in the header section that follow the RealMedia File Header. This member is present on all RealMedia_File_Header objects with an object_version of 0 (zero) or 1.

Note: the tables that follow do not repeat the object_version restrictions; refer to the standard document for details.
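
As a rough illustration of the structure above, the sketch below reads a RealMedia_File_Header with Python's struct module. It assumes big-endian byte order; the function name parse_file_header and the sample bytes are invented for this example.

```python
import struct

def parse_file_header(buf):
    """Parse a RealMedia_File_Header from the start of buf."""
    object_id, size, object_version = struct.unpack_from(">4sIH", buf, 0)
    if object_id != b".RMF":
        raise ValueError("missing '.RMF' signature")
    header = {"size": size, "object_version": object_version}
    # file_version and num_headers are only defined for versions 0 and 1.
    if object_version in (0, 1):
        file_version, num_headers = struct.unpack_from(">II", buf, 10)
        header["file_version"] = file_version
        header["num_headers"] = num_headers
    return header

# Synthetic header: size 18, version 0, file_version 0, 5 headers follow.
sample = b".RMF" + struct.pack(">IHII", 18, 0, 0, 5)
print(parse_file_header(sample))
```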

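RM Properties Header

The Properties Header describes the general media properties of an RMF file.
The RealMedia system refers to the data in this object when handling the data in an RM file or stream. An RMF file has exactly one Properties Header. Its fields are as follows: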

Properties_Header
{
  UINT32    object_id;
  UINT32    size;
  UINT16    object_version;

  if (object_version == 0)
  {
    UINT32   max_bit_rate;
    UINT32   avg_bit_rate;
    UINT32   max_packet_size;
    UINT32   avg_packet_size;
    UINT32   num_packets;
    UINT32   duration;
    UINT32   preroll;
    UINT32   index_offset;
    UINT32   data_offset;
    UINT16   num_streams;
    UINT16   flags;
  }
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for a Properties Header ('PROP').
size UINT32 The 32-bit size of the Properties Header in bytes.
object_version UINT16 The version of the Properties Header object. All files created according to this specification have an object_version number of 0 (zero).
max_bit_rate UINT32 The maximum bit rate required to deliver this file over a network.
avg_bit_rate UINT32 The average bit rate required to deliver this file over a network.
max_packet_size UINT32 The largest packet size (in bytes) in the media data.
avg_packet_size UINT32 The average packet size (in bytes) in the media data.
num_packets UINT32 The number of packets in the media data.
duration UINT32 The duration of the file in milliseconds.
preroll UINT32 The number of milliseconds to prebuffer before starting playback.
index_offset UINT32 The offset in bytes from the start of the file to the start of the index header object. This value can be 0 (zero), which indicates that no index chunks are present in this file.
data_offset UINT32 The offset in bytes from the start of the file to the start of the Data Section.
  Note: There can be a number of Data_Chunk_Headers in a RealMedia file. The data_offset value specifies the offset in bytes to the first Data_Chunk_Header. The offsets to the other Data_Chunk_Headers can be derived from the next_data_header field in a Data_Chunk_Header.
num_streams UINT16 The total number of media properties headers in the main headers section.
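flags UINT16 Flags indicating characteristics of the RealMedia file (see the standard document for the defined bits).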
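
The Properties Header answers several of the questions from section 0 in one place: duration, where the data section starts, and whether an index exists. The sketch below is one possible way to read it, assuming big-endian fields; parse_properties_header and the sample values are illustrative only.

```python
import struct

PROP_FIELDS = (
    "max_bit_rate", "avg_bit_rate", "max_packet_size", "avg_packet_size",
    "num_packets", "duration", "preroll", "index_offset", "data_offset",
    "num_streams", "flags",
)

def parse_properties_header(buf):
    """Parse a version-0 Properties_Header ('PROP') from the start of buf."""
    object_id, size, object_version = struct.unpack_from(">4sIH", buf, 0)
    if object_id != b"PROP":
        raise ValueError("not a PROP chunk")
    if object_version != 0:
        raise ValueError(f"unsupported object_version {object_version}")
    # Nine UINT32 fields followed by two UINT16 fields.
    values = struct.unpack_from(">9I2H", buf, 10)
    props = dict(zip(PROP_FIELDS, values))
    props["size"] = size
    return props

# Synthetic header with made-up values.
sample = b"PROP" + struct.pack(
    ">IH9I2H", 50, 0,
    450000, 400000, 1400, 900, 12000, 3600000, 2000, 5000000, 600, 2, 0)

props = parse_properties_header(sample)
print("duration (s):", props["duration"] / 1000)
print("has index:", props["index_offset"] != 0)
```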

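RM Media Properties Header

The Media Properties Header describes the media properties specific to each stream in an RM file. There is one Media Properties Header per stream. Its fields are as follows: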

Media_Properties_Header
{
  UINT32     object_id;
  UINT32     size;
  UINT16     object_version;

  if (object_version == 0)
  {
    UINT16                      stream_number;
    UINT32                      max_bit_rate;
    UINT32                      avg_bit_rate;
    UINT32                      max_packet_size;
    UINT32                      avg_packet_size;
    UINT32                      start_time;
    UINT32                      preroll;
    UINT32                      duration;
    UINT8                       stream_name_size;
    UINT8[stream_name_size]     stream_name;
    UINT8                       mime_type_size;
    UINT8[mime_type_size]       mime_type;
    UINT32                      type_specific_len;
    UINT8[type_specific_len]    type_specific_data;
  }
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for a Media Properties Header ("MDPR").
size UINT32 The size of the Media Properties Header in bytes.
object_version UINT16 The version of the Media Properties Header object.
stream_number UINT16 The stream_number (synchronization source identifier) is a unique value that identifies a physical stream. Every data packet that belongs to a physical stream contains the same STREAM_NUMBER. The STREAM_NUMBER enables a receiver of multiple physical streams to distinguish which packets belong to each physical stream.
max_bit_rate UINT32 The maximum bit rate required to deliver this stream over a network.
avg_bit_rate UINT32 The average bit rate required to deliver this stream over a network.
max_packet_size UINT32 The largest packet size (in bytes) in the stream of media data.
avg_packet_size UINT32 The average packet size (in bytes) in the stream of media data.
start_time UINT32 The time offset in milliseconds to add to the time stamp of each packet in a physical stream.
preroll UINT32 The time offset in milliseconds to subtract from the time stamp of each packet in a physical stream.
duration UINT32 The duration of the stream in milliseconds.
stream_name_size UINT8 The length of the following stream_name member in bytes.
stream_name UINT8[] A nonunique alias or name for the stream. The size of this member is variable.
mime_type_size UINT8 The length of the following mime_type field in bytes.
mime_type UINT8[] A nonunique MIME style type/subtype string for data associated with the stream. The size of this member is variable.
type_specific_len UINT32 The length of the following type_specific_data in bytes. The type_specific_data is typically used by the data type renderer to initialize itself in order to process the physical stream.
type_specific_data UINT8[] The type_specific_data is typically used by the data type renderer to initialize itself in order to process the physical stream. The size of this member is variable.
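
Because this header mixes fixed-size fields with length-prefixed ones, it has to be read with a moving cursor. The following sketch shows one way to do that under the assumption of big-endian fields; parse_media_properties and the sample stream are hypothetical and only meant to illustrate the layout.

```python
import struct

def parse_media_properties(buf):
    """Parse a version-0 Media_Properties_Header ('MDPR') from buf."""
    object_id, size, object_version = struct.unpack_from(">4sIH", buf, 0)
    if object_id != b"MDPR" or object_version != 0:
        raise ValueError("not a version-0 MDPR chunk")
    pos = 10
    fixed = struct.unpack_from(">H7I", buf, pos)
    pos += struct.calcsize(">H7I")
    keys = ("stream_number", "max_bit_rate", "avg_bit_rate", "max_packet_size",
            "avg_packet_size", "start_time", "preroll", "duration")
    info = dict(zip(keys, fixed))
    # stream_name and mime_type are prefixed by a one-byte length.
    for key in ("stream_name", "mime_type"):
        length = buf[pos]
        pos += 1
        info[key] = buf[pos:pos + length].decode("latin-1")
        pos += length
    # type_specific_data is prefixed by a four-byte length.
    (ts_len,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    info["type_specific_data"] = buf[pos:pos + ts_len]
    return info

# Synthetic stream description with made-up values.
name, mime = b"Audio", b"audio/x-pn-realaudio"
body = (struct.pack(">H7I", 1, 64000, 64000, 640, 640, 0, 1000, 30000)
        + bytes([len(name)]) + name
        + bytes([len(mime)]) + mime
        + struct.pack(">I", 4) + b"\x00\x01\x02\x03")
sample = b"MDPR" + struct.pack(">IH", 10 + len(body), 0) + body

info = parse_media_properties(sample)
print(info["stream_number"], info["mime_type"], info["duration"])
```

A logical stream could be recognized in the same way by checking whether mime_type starts with "logical-".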

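RM Logical Stream Properties Header

An RM file can carry multiple programs; this is normally done with RM logical streams, each of which groups several physical streams. A logical stream records which physical streams it is made of, plus some properties that identify it (such as language and packet grouping).
Logical streams are also stored in Media Properties Headers, with a mime type that carries the prefix "logical-". For example, if a RealAudio stream (physical stream) has the mime type audio/x-pn-multirate-realaudio, the corresponding logical stream has the mime type logical-audio/x-pn-multirate-realaudio. The figure below shows an example of how a logical stream is composed:
[Figure: RMFF logical stream]

In the Media Properties Header of a logical stream, the type_specific_data field holds a LogicalStream structure.
A file also has one special logical stream with the MIME type logical-fileinfo. It holds information about the whole file, and there can be only one such logical stream.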

LogicalStream Structure

Its fields are as follows:

LogicalStream
{
  ULONG32 size;
  UINT16  object_version;

  if (object_version == 0)
  {
    UINT16     num_physical_streams;
    UINT16     physical_stream_numbers[num_physical_streams];
    ULONG32    data_offsets[num_physical_streams];
    UINT16     num_rules;
    UINT16     rule_to_physical_stream_number_map[num_rules];
    UINT16     num_properties;
    NameValueProperty        properties[num_properties];
  }
};

The fields are described in the table below:

field type description
size UINT32 The size of the LogicalStream structure in bytes.
object_version UINT16 The version of the LogicalStream structure.
num_physical_streams UINT16 The number of physical streams that make up this logical stream. The physical stream numbers are stored in a list immediately following this field. These physical stream numbers refer to the stream_number field found in the Media Properties Object for each physical stream belonging to this logical stream.
physical_stream_numbers UINT16[] The list of physical stream numbers that comprise this logical stream. The size of this structure member is variable.
data_offsets UINT32[] The list of data offsets indicating the start of the data section for each physical stream. The size of this structure member is variable.
num_rules UINT16 The number of ASM rules for the logical stream. Each physical stream in the logical stream has at least one ASM rule associated with it or it will never get played. The mapping of ASM rule numbers to physical stream numbers is stored in a list immediately following this member. These physical stream numbers refer to the stream_number field found in the Media Properties Object for each physical stream belonging to this logical stream.
rule_to_physical_stream_number_map UINT16[] The list of physical stream numbers that map to each rule. Each entry in the map corresponds to a 0-based rule number. The value in each entry is set to the physical stream number for the rule. For example:
  rule_to_physical_stream_number_map[0] = 5
  This example means physical stream 5 corresponds to rule 0. All of the ASM rules referenced by this array are stored in the first name-value pair of this logical stream which must be called "ASMRuleBook" and be of type "string". Each rule is separated by a semicolon.
  The size of this structure member is variable.
num_properties UINT16 The number of NameValueProperty structures contained in this structure. These name/value structures can be used to identify properties of this logical stream (for example, language).
properties NameValueProperty[] The list of NameValueProperty structures (see NameValueProperty Structure below for more details). As mentioned above, it is required that the first name-value pair be a string named "ASMRuleBook" and contain the ASM rules for this logical stream. The size of this structure member is variable.
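
To show how the variable-length arrays in LogicalStream line up, here is a small sketch that reads everything up to the properties list. It assumes big-endian fields and takes the type_specific_data bytes of a logical stream's Media Properties Header as input; parse_logical_stream and the sample data are illustrative, not taken from a real file.

```python
import struct

def parse_logical_stream(buf):
    """Parse the fixed and array parts of a version-0 LogicalStream."""
    size, object_version = struct.unpack_from(">IH", buf, 0)
    if object_version != 0:
        raise ValueError(f"unsupported object_version {object_version}")
    pos = 6
    (num_streams,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    stream_numbers = struct.unpack_from(f">{num_streams}H", buf, pos)
    pos += 2 * num_streams
    data_offsets = struct.unpack_from(f">{num_streams}I", buf, pos)
    pos += 4 * num_streams
    (num_rules,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    rule_map = struct.unpack_from(f">{num_rules}H", buf, pos)
    pos += 2 * num_rules
    (num_properties,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    return {
        "size": size,
        "physical_stream_numbers": list(stream_numbers),
        "data_offsets": list(data_offsets),
        # Index = 0-based ASM rule number, value = physical stream number.
        "rule_to_physical_stream_number_map": list(rule_map),
        "num_properties": num_properties,
        "properties_offset": pos,  # where the NameValueProperty list begins
    }

# Synthetic logical stream: two physical streams, two rules, no properties.
body = struct.pack(">H2H2IH2HH", 2, 0, 1, 1000, 2000, 2, 0, 1, 0)
sample = struct.pack(">IH", 6 + len(body), 0) + body

print(parse_logical_stream(sample))
```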

NameValueProperty Structure

Its fields are as follows:

NameValueProperty
{
  ULONG32             size;
  UINT16              object_version;

  if (object_version == 0)
  {
    UINT8      name_length;
    UINT8      name[name_length];
    INT32      type;
    UINT16     value_length;
    UINT8      value_data[value_length];
  }
}

The fields are described in the table below:

field type description
size UINT32 The size of the NameValueProperty structure in bytes.
object_version UINT16 The version of the NameValueProperty structure.
name_length UINT8 The length of the name data.
name UINT8[] The name string data.
type UINT32 The type of the value data. This member can take on one of three values (any other value is undefined):
  0: 32-bit unsigned integer property
  1: buffer
  2: string
value_length UINT16 The length of the value data.
value_data UINT8[] The value data.
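
A NameValueProperty can be decoded according to its type field. The sketch below is a hedged illustration: it assumes big-endian fields, reads type as a signed 32-bit value as in the structure listing, and strips trailing NUL bytes from strings in case they are present (the text above does not say whether they are). The function name and sample are invented.

```python
import struct

def parse_name_value_property(buf, pos=0):
    """Return (name, type, value, next_pos) for a version-0 NameValueProperty."""
    size, object_version = struct.unpack_from(">IH", buf, pos)
    if object_version != 0:
        raise ValueError(f"unsupported object_version {object_version}")
    cursor = pos + 6
    name_length = buf[cursor]
    cursor += 1
    name = buf[cursor:cursor + name_length].decode("latin-1")
    cursor += name_length
    value_type, value_length = struct.unpack_from(">iH", buf, cursor)
    cursor += 6
    raw = buf[cursor:cursor + value_length]
    if value_type == 0:      # 32-bit unsigned integer property
        value = int.from_bytes(raw, "big")
    elif value_type == 2:    # string
        value = raw.rstrip(b"\0").decode("latin-1")
    else:                    # 1 = buffer; anything else is left as raw bytes
        value = raw
    return name, value_type, value, pos + size

# Synthetic property with a made-up rule book string.
name, value = b"ASMRuleBook", b"priority=5;priority=7;"
body = bytes([len(name)]) + name + struct.pack(">iH", 2, len(value)) + value
sample = struct.pack(">IH", 6 + len(body), 0) + body

prop_name, prop_type, prop_value, next_pos = parse_name_value_property(sample)
print(prop_name, prop_value.split(";"))
```

Returning next_pos makes it easy to walk the properties array of a LogicalStream one entry at a time.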

RM Content Description Header

The Content Description Header holds the title, author, copyright, and comment information of an RM file. Its fields are as follows:

Content_Description
{
  UINT32     object_id;
  UINT32     size;
  UINT16     object_version;

  if (object_version == 0)
  {
    UINT16    title_len;
    UINT8[title_len]  title;
    UINT16    author_len;
    UINT8[author_len]  author;
    UINT16    copyright_len;
    UINT8[copyright_len]  copyright;
    UINT16    comment_len;
    UINT8[comment_len]  comment;
  }
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for the Content Description Header ("CONT").
size UINT32 The size of the Content Description Header in bytes.
object_version UINT16 The version of the Content Description Header object.
title_len UINT16 The length of the title data in bytes. Note that the title data is not null-terminated.
title UINT8[title_len] An array of ASCII characters that represents the title information for the RealMedia file. The size of this member is variable.
author_len UINT16 The length of the author data in bytes. Note that the author data is not null-terminated.
author UINT8[author_len] An array of ASCII characters that represents the author information for the RealMedia file. The size of this member is variable.
copyright_len UINT16 The length of the copyright data in bytes. Note that the copyright data is not null-terminated.
copyright UINT8[] An array of ASCII characters that represents the copyright information for the RealMedia file. The size of this member is variable.
comment_len UINT16 The length of the comment data in bytes. Note that the comment data is not null-terminated.
comment UINT8[] An array of ASCII characters that represents the comment information for the RealMedia file. The size of this member is variable.
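
The four strings in the Content Description Header share the same length-prefixed pattern, so a loop is enough to read them. This is a sketch under the assumption of big-endian lengths; parse_content_description and the sample text are made up. Bytes outside ASCII are replaced rather than rejected, since real files may not strictly follow the ASCII description.

```python
import struct

def parse_content_description(buf):
    """Parse a version-0 Content_Description ('CONT') chunk from buf."""
    object_id, size, object_version = struct.unpack_from(">4sIH", buf, 0)
    if object_id != b"CONT" or object_version != 0:
        raise ValueError("not a version-0 CONT chunk")
    pos = 10
    fields = {}
    for key in ("title", "author", "copyright", "comment"):
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        # The strings are not null-terminated; the length prefix is authoritative.
        fields[key] = buf[pos:pos + length].decode("ascii", errors="replace")
        pos += length
    return fields

def _field(text):
    """Hypothetical helper: encode one length-prefixed string for the demo."""
    data = text.encode("ascii")
    return struct.pack(">H", len(data)) + data

body = _field("Demo clip") + _field("Tocy") + _field("(c) 2016") + _field("")
sample = b"CONT" + struct.pack(">IH", 10 + len(body), 0) + body

print(parse_content_description(sample))
```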

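3. RM Data Section

The start of the Data Section can be obtained from the data_offset field of the Properties Header. The data section normally consists of one Data Chunk Header followed by a number of interleaved media data packets.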

Data Chunk Header

Marks the start of a data chunk. An RM file usually has a single data chunk; very large files may have several. Its fields are as follows:

Data_Chunk_Header
{
  UINT32     object_id;
  UINT32     size;
  UINT16     object_version;

  if (object_version == 0)
  {
    UINT32    num_packets; 
    UINT32    next_data_header;
  }
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for the Data Chunk Header ('DATA').
size UINT32 The size of the Data Chunk in bytes. The size includes the size of the header plus the size of all the packets in the data chunk.
object_version UINT16 The version of the Data Chunk Header object.
num_packets UINT32 Number of packets in the data chunk.
next_data_header UINT32 Offset from start of file to the next data chunk. A non-zero value refers to the file offset of the next data chunk. A value of zero means there are no more data chunks in this file. This field is not typically used.

Data Packet

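The Data Chunk Header is immediately followed by num_packets data packets. These packets may come from several streams, but they are stored in ascending timestamp order. Each data packet is laid out as follows: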

Media_Packet_Header
{
  UINT16                object_version;

  if ((object_version == 0) || (object_version == 1))
  {
    UINT16        length;
    UINT16        stream_number;
    UINT32        timestamp;
    if (object_version == 0)
    {
      UINT8        packet_group;
      UINT8        flags;
    }
    else if (object_version == 1)
    {
      UINT16        asm_rule;
      UINT8         asm_flags;
    }

    UINT8[length]        data;
  }
  else
  {
    StreamDone();
  }
}

The fields are described in the table below:

field type description
object_version UINT16 The version of the Media Packet Header object.
length UINT16 The length of the packet in bytes.
stream_number UINT16 The 16-bit alias used to associate data packets with their associated Media Properties Header.
timestamp UINT32 The time stamp of the packet in milliseconds.
packet_group UINT8 The packet group to which the packet belongs. If packet grouping is not used, set this field to 0 (zero).
flags UINT8 Flags describing the properties of the packet. The following flags are defined:
  HX_RELIABLE_FLAG = 1: If this flag is set, the packet is delivered reliably.
  HX_KEYFRAME_FLAG = 2: If this flag is set, the packet is part of a key frame or in some way marks a boundary in your data stream.
asm_rule UINT16 The ASM rule assigned to this packet.
asm_flags UINT8 Contains HX_ flags that dictate stream switching points.
data UINT8[length] The application-specific media data. The size of this member is variable.
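
The sketch below reads a Data Chunk Header and then its packets, which is the core of demuxing. It rests on two assumptions that the tables above do not spell out: fields are big-endian, and length covers the whole packet including its own header (12 bytes for version 0, 13 bytes for version 1), so the payload is length minus the header size. The function names and the sample chunk are illustrative.

```python
import struct

HX_RELIABLE_FLAG = 0x01
HX_KEYFRAME_FLAG = 0x02

def parse_packet(buf, pos):
    """Return (packet_dict, next_pos); packet_dict is None for unknown versions."""
    version, length, stream_number, timestamp = struct.unpack_from(">HHHI", buf, pos)
    packet = {"stream_number": stream_number, "timestamp": timestamp}
    if version == 0:
        packet["packet_group"], packet["flags"] = struct.unpack_from(">BB", buf, pos + 10)
        header_len = 12
    elif version == 1:
        packet["asm_rule"], packet["asm_flags"] = struct.unpack_from(">HB", buf, pos + 10)
        header_len = 13
    else:
        return None, pos  # corresponds to StreamDone() in the listing
    if length < header_len:
        raise ValueError("packet length smaller than its header")
    # Assumption: length includes the packet header.
    packet["data"] = buf[pos + header_len:pos + length]
    return packet, pos + length

def read_data_chunk(buf, pos=0):
    """Parse a version-0 DATA chunk; return (packets, next_data_header)."""
    object_id, size, version = struct.unpack_from(">4sIH", buf, pos)
    if object_id != b"DATA" or version != 0:
        raise ValueError("not a version-0 DATA chunk")
    num_packets, next_data_header = struct.unpack_from(">II", buf, pos + 10)
    cursor = pos + 18
    packets = []
    for _ in range(num_packets):
        packet, cursor = parse_packet(buf, cursor)
        if packet is None:
            break
        packets.append(packet)
    return packets, next_data_header

# Synthetic chunk with one version-0 and one version-1 packet.
pkt0 = struct.pack(">HHHIBB", 0, 12 + 3, 0, 0, 0, HX_KEYFRAME_FLAG) + b"abc"
pkt1 = struct.pack(">HHHIHB", 1, 13 + 2, 1, 40, 0, 0) + b"de"
chunk = b"DATA" + struct.pack(">IHII", 18 + len(pkt0) + len(pkt1), 0, 2, 0) + pkt0 + pkt1

packets, next_header = read_data_chunk(chunk)
for p in packets:
    print(p["stream_number"], p["timestamp"], p["data"])
```

Each packet's stream_number can then be matched against the Media Properties Headers to pick the right decoder.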

4. RM Index Section

The Index Section stores the time-to-offset mapping for audio and video key frames.
An index chunk normally consists of an Index Chunk Header followed by a series of index records.

Index Chunk Header

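The Index Chunk Header marks the start of an index chunk; its offset can be obtained from the index_offset field of the Properties Header. It holds the properties of the index chunk. Its fields are as follows: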

Index_Chunk_Header
{
  u_int32     object_id;
  u_int32     size;
  u_int16     object_version;

  if (object_version == 0)
  {
    u_int32     num_indices;
    u_int16     stream_number;
    u_int32     next_index_header;
  }
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for the Index Chunk Header ("INDX").
size UINT32 The size of the Index Chunk in bytes.
object_version UINT16 The version of the Index Chunk Header object.
num_indices UINT32 Number of index records in the index chunk.
stream_number UINT16 The stream number for which the index records in this index chunk are associated.
next_index_header UINT32 Offset from start of file to the next index chunk. This member enables RealMedia file format readers to find all the index chunks quickly. A value of zero for this member indicates there are no more index headers in this file.

Index Records

Each index record maps a timestamp to the offset of a data packet. Its fields are as follows:

IndexRecord
{
  UINT16   object_version;

  if (object_version == 0)
  {
    u_int32  timestamp;
    u_int32  offset;
    u_int32  packet_count_for_this_packet;
  }
}

The fields are described in the table below:

field type description
object_version UINT16 The version of the Index Record object.
timestamp UINT32 The time stamp (in milliseconds) associated with this record.
offset UINT32 The offset from the start of the file at which this packet can be found.
packet_count_for_this_packet UINT32 The packet number of the packet for this record. This is the same number of packets that would have been seen had the file been played from the beginning to this point.

Note that there is normally one index chunk per stream, so the file may contain several index chunks.
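
Seeking with the index comes down to finding the last record whose timestamp is not later than the target time. The following sketch parses one index chunk and performs that lookup with a binary search. It assumes big-endian fields and records sorted by timestamp; parse_index_chunk, seek_offset, and the sample records are hypothetical.

```python
import bisect
import struct

def parse_index_chunk(buf):
    """Parse a version-0 INDX chunk; return (stream_number, next_index_header, records)."""
    object_id, size, version = struct.unpack_from(">4sIH", buf, 0)
    if object_id != b"INDX" or version != 0:
        raise ValueError("not a version-0 INDX chunk")
    num_indices, stream_number, next_index_header = struct.unpack_from(">IHI", buf, 10)
    records = []
    pos = 20
    for _ in range(num_indices):
        rec_version, timestamp, offset, packet_count = struct.unpack_from(">HIII", buf, pos)
        if rec_version != 0:
            raise ValueError(f"unsupported index record version {rec_version}")
        records.append((timestamp, offset, packet_count))
        pos += 14  # 2 + 4 + 4 + 4 bytes per version-0 record
    return stream_number, next_index_header, records

def seek_offset(records, target_ms):
    """Return the file offset of the last indexed packet at or before target_ms."""
    if not records:
        return None
    timestamps = [r[0] for r in records]
    i = bisect.bisect_right(timestamps, target_ms) - 1
    return records[max(i, 0)][1]

# Synthetic index for stream 1 with three made-up key frames.
raw = b"".join(struct.pack(">HIII", 0, ts, off, n)
               for ts, off, n in [(0, 100, 0), (2000, 500, 40), (4000, 900, 80)])
sample = b"INDX" + struct.pack(">IHIHI", 20 + len(raw), 0, 3, 1, 0) + raw

stream_number, next_index, records = parse_index_chunk(sample)
print(stream_number, seek_offset(records, 2500))
```

A reader would repeat this for each chunk reachable through next_index_header until that field is zero.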

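6. RM Metadata Section

RealMedia metadata consists of a single tag that holds a set of named metadata items describing the properties of the media file. A metadata item can be text, an integer, or binary data. The Metadata Section consists of a header and a tag body.

Metadata Section Header

It is defined as follows: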

MetadataSectionHeader
{
  u_int32        object_id; // The unique object ID for the Metadata Section Header ("RMMD")
  u_int32        size;
}

Metadata Tag

A metadata tag is made up of multiple properties organized as a tree. Each property has a type and a value, and may also have multiple sub-properties. It contains the following fields:

MetadataTag
{
  u_int32        object_id;
  u_int32        object_version;
  u_int8[]       properties;
}

The fields are described in the table below:

field type description
object_id UINT32 The unique object ID for the Metadata Tag ("RJMD").
object_version UINT32 The version of the Metadata Tag.
properties UINT8[] The MetadataProperty structure that makes up the metadata tag (see "Metadata Property Structure" for more details). As mentioned above, the properties will be represented as one unnamed root metadata property with multiple sub-properties, each with their own optional sub-properties. These will be nested, as in a tree.

Metadata Property Structure

This structure contains the following fields:

MetadataProperty
{
  u_int32        size;
  u_int32        type;
  u_int32        flags;
  u_int32        value_offset;
  u_int32        subproperties_offset;
  u_int32        num_subproperties;
  u_int32        name_length;
  u_int8[name_length]    name;
  u_int32        value_length;
  u_int8[value_length]    value;
  PropListEntry[num_subproperties]    subproperties_list;
  MetadataProperty[num_subproperties]    subproperties;
}

The fields are described in the table below:

field type description
size UINT32 The size of the MetadataProperty structure in bytes.
type UINT32 The type of the value data. The data in the value array can be one of the following types:
  MPT_TEXT: The value is string data.
  MPT_TEXTLIST: The value is a separated list of strings; the separator is specified as a sub-property/type descriptor.
  MPT_FLAG: The value is a boolean flag of either 1 byte or 4 bytes; check the size value.
  MPT_ULONG: The value is a four-byte integer.
  MPT_BINARY: The value is a byte stream.
  MPT_URL: The value is string data.
  MPT_DATE: The value is a string representation of the date in the form YYYYmmDDHHMMSS (m = month, M = minutes).
  MPT_FILENAME: The value is string data.
  MPT_GROUPING: This property has subproperties, but its own value is empty.
  MPT_REFERENCE: The value is a large buffer of data; use sub-properties/type descriptors to identify the mime-type.
flags UINT32 Flags describing the property. The following flags are defined; they can be used in combination:
  MPT_READONLY: Read only, cannot be modified.
  MPT_PRIVATE: Private, do not expose to users.
  MPT_TYPE_DESCRIPTOR: Type descriptor used to further define the type of the value.
value_offset UINT32 The offset to the value_length, relative to the beginning of the MetadataProperty structure.
subproperties_offset UINT32 The offset to the subproperties_list, relative to the beginning of the MetadataProperty structure.
num_subproperties UINT32 The number of subproperties for this MetadataProperty structure.
name_length UINT32 The length of the name data, including the null-terminator.
name UINT8[] The name of the property (string data). The size of this member is designated by name_length.
value_length UINT32 The length of the value data.
value UINT8[] The value of the property (data depends on the type specified for the property). The size of this member is designated by value_length.
subproperties_list PropListEntry[] The list of PropListEntry structures. The PropListEntry structure identifies the offset for each property (see "PropListEntry Structure" for more details). The size of this member is num_subproperties * sizeof(PropListEntry).
subproperties MetadataProperty[] The sub-properties. Each sub-property is a MetadataProperty structure with its own size, name, value, sub-properties, and so on. The size of this member is variable.

PropListEntry Structure

This structure contains the following fields:

PropListEntry
{
  u_int32        offset;
  u_int32        num_props_for_name;
}

The fields are described in the table below:

field type description
offset UINT32 The offset for this indexed sub-property, relative to the beginning of the containing MetadataProperty.
num_props_for_name UINT32 The number of sub-properties that share the same name. For example, a lyrics property could have multiple versions as differentiated by the language sub-property type descriptor.
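
Since every MetadataProperty carries its own offsets, the property tree can be read recursively. The sketch below is only an illustration of that idea: it assumes big-endian fields and follows value_offset, subproperties_offset, and each PropListEntry offset relative to the start of the containing property. The numeric values of the MPT_ types are not listed above, so the demo uses placeholder type codes; build_property is a hypothetical helper that exists only to produce test data.

```python
import struct

FIXED = struct.Struct(">7I")  # size .. name_length (28 bytes)

def parse_metadata_property(buf, base=0):
    """Recursively parse the MetadataProperty that starts at buf[base]."""
    (size, ptype, flags, value_offset, sub_offset,
     num_sub, name_len) = FIXED.unpack_from(buf, base)
    name_start = base + FIXED.size
    # name_length includes the null terminator, so strip it.
    name = buf[name_start:name_start + name_len].rstrip(b"\0").decode("latin-1")
    (value_len,) = struct.unpack_from(">I", buf, base + value_offset)
    value_start = base + value_offset + 4
    value = buf[value_start:value_start + value_len]
    children = []
    for i in range(num_sub):
        child_offset, _same_name = struct.unpack_from(">II", buf, base + sub_offset + 8 * i)
        children.append(parse_metadata_property(buf, base + child_offset))
    return {"name": name, "type": ptype, "flags": flags,
            "value": value, "children": children}

def build_property(name, ptype, value, children=()):
    """Hypothetical helper: serialize a property tree for the demo below."""
    name_bytes = name.encode("latin-1") + b"\0"
    value_offset = FIXED.size + len(name_bytes)
    sub_offset = value_offset + 4 + len(value)
    blobs = [build_property(*child) for child in children]
    child_offset = sub_offset + 8 * len(blobs)
    entries = b""
    for blob in blobs:
        entries += struct.pack(">II", child_offset, 1)
        child_offset += len(blob)
    body = name_bytes + struct.pack(">I", len(value)) + value + entries + b"".join(blobs)
    return FIXED.pack(FIXED.size + len(body), ptype, 0, value_offset,
                      sub_offset, len(blobs), len(name_bytes)) + body

TYPE_A, TYPE_B = 100, 101  # placeholder codes, NOT the real MPT_* values
root = build_property("", TYPE_B, b"",
                      [("Title", TYPE_A, b"demo"), ("Year", TYPE_A, b"2016")])
tree = parse_metadata_property(root)
print([(c["name"], c["value"]) for c in tree["children"]])
```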

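Metadata Section Footer

The metadata section footer marks the end of the metadata section of a RealMedia file. Because it sits at the end of the RM file, its position is fixed: 140 bytes before the end of the file. The size field in the footer gives the length of the metadata tag and can be used to locate the metadata quickly. Its fields are as follows: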

MetadataSectionFooter
{
  u_int32        object_id; // The unique object ID for the Metadata Section Footer ("RMJE").
  u_int32        object_version; // The version of the metadata tag.
  u_int32        size; // The size of the preceding metadata tag
}
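
Locating the metadata from the end of the file could look like the sketch below. It assumes big-endian fields, relies on the fixed position of 140 bytes before the end of the file described above, and infers that the metadata tag starts size bytes before the footer because size is the size of the preceding tag. The function name and the fake file are invented for illustration.

```python
import io
import struct

FOOTER_DISTANCE_FROM_END = 140  # 12-byte footer + 128-byte ID3v1 tag

def read_metadata_footer(stream):
    """Return footer info for a seekable binary stream, or None if absent."""
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    if file_size < FOOTER_DISTANCE_FROM_END:
        return None
    footer_offset = file_size - FOOTER_DISTANCE_FROM_END
    stream.seek(footer_offset)
    raw = stream.read(12)
    object_id, object_version, size = struct.unpack(">4sII", raw)
    if object_id != b"RMJE":
        return None  # no metadata section at the expected position
    return {
        "object_version": object_version,
        "size": size,
        "footer_offset": footer_offset,
        # Assumption: the tag ends exactly where the footer begins.
        "tag_offset": footer_offset - size,
    }

# Fake file: 50 bytes of other data, a 20-byte tag, the footer, an ID3v1 tag.
fake = (b"x" * 50
        + b"RJMD" + b"\0" * 16
        + b"RMJE" + struct.pack(">II", 0, 20)
        + b"TAG" + b"\0" * 125)

footer = read_metadata_footer(io.BytesIO(fake))
print(footer)
```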

ID3v1 Tag

The ID3v1 Tag is the last part of the metadata section and has a fixed length of 128 bytes. For its layout, see the ID3v1 standard.
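
For completeness, here is a minimal sketch of reading that 128-byte block using the standard ID3v1 layout ("TAG", then title, artist, and album of 30 bytes each, a 4-byte year, a 30-byte comment, and a 1-byte genre). The function name parse_id3v1 and the sample tag are made up, and text is decoded as Latin-1 as a simplifying assumption.

```python
import struct

def parse_id3v1(tag):
    """Parse a 128-byte ID3v1 tag; return None if the block is not a tag."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None
    title, artist, album, year, comment, genre = struct.unpack(
        "<30s30s30s4s30sB", tag[3:])

    def text(raw):
        # Fields are padded with NUL bytes (or spaces) up to their fixed width.
        return raw.split(b"\0", 1)[0].decode("latin-1").strip()

    return {"title": text(title), "artist": text(artist), "album": text(album),
            "year": text(year), "comment": text(comment), "genre": genre}

# Synthetic tag with made-up values.
sample = (b"TAG"
          + b"Demo clip".ljust(30, b"\0")
          + b"Tocy".ljust(30, b"\0")
          + b"".ljust(30, b"\0")
          + b"2016"
          + b"".ljust(30, b"\0")
          + bytes([255]))

print(parse_id3v1(sample))
```

In an RM file the input would be the last 128 bytes of the file.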

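7. Other questions

Overall, the rm/rmvb file structure is fairly simple.
The most official documentation can be found in Helix DNA Common Components.
As far as the author knows, only one tool is currently available for analyzing RM/RMVB files: RM文件分析器 (RM File Analyzer).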

References

  1. multimedia-wiki-RealMedia
  2. RealMedia File Format-helix
  3. RM文件解析 (RM file parsing)
posted @ 2016-07-23 14:41  Tocy