[Original] 桓泽 Learns Audio Coding (8): A Study of MP3 and AAC Quantizer Design

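I. Basics of quantization

Quantization is an essential building block of data compression.

Quantizers can be divided into vector quantizers and scalar quantizers. They can also be divided into uniform and non-uniform quantizers, and, by memory, into memoryless quantizers and quantizers with memory. A quantizer with memory is an adaptive quantizer.

For source sampling, an ADC and a DAC act as a quantizer and an inverse quantizer, respectively. Take the quantization of a PCM signal as an example: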

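[Figure: a segment of a sine wave with 3-bit quantization and the quantization error. Blue line: sine wave; red line: quantized value; green line: quantization error.]

[Figure: 4-bit quantization]

[Figure: 5-bit quantization]

[Figure: 16-bit quantization]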

As the bit depth grows, the quantization error steadily shrinks. This is typical uniform quantization.
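The trend in the figures can be reproduced with a short sketch. It assumes a mid-rise uniform quantizer on the range [-1, 1) and a single period of a sine wave; the function names `uniform_quantize` and `rms_error` are illustrative, not taken from any codec.

```python
import math

def uniform_quantize(x, bits):
    """Mid-rise uniform quantizer for x in [-1, 1] (assumed input range)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(x / step)
    # Clamp so that x == 1.0 falls into the top cell instead of overflowing.
    index = max(-levels // 2, min(levels // 2 - 1, index))
    # Reconstruct at the centre of the cell.
    return (index + 0.5) * step

def rms_error(bits, n=1000):
    """RMS quantization error over one period of a sine wave."""
    total = 0.0
    for i in range(n):
        x = math.sin(2 * math.pi * i / n)
        err = x - uniform_quantize(x, bits)
        total += err * err
    return math.sqrt(total / n)

if __name__ == "__main__":
    for bits in (3, 4, 5, 16):
        print(bits, "bit RMS error:", rms_error(bits))
```

Each extra bit halves the step size, so the printed RMS error drops by roughly a factor of two per bit.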

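Basic non-uniform quantization is covered in many textbooks and is easy to look up; since the goal here is to discuss quantization in MP3 and AAC, I will not go into it.

Scalar and vector quantization are also explained in detail in many books and papers, so I will skip them as well and only give one figure to illustrate the difference.

[Figure: scalar quantization versus vector quantization with the same 2-bit allocation.] Vector quantization is the more efficient of the two, and its RMS error is smaller.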

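II. Quantization in MP3 and AAC

The MP3 quantization formula

The AAC quantization formula

MAGIC_NUMBER is 0.4054.

MP3 and AAC are typical exponential quantizers; the AAC quantizer is inherited from MP3 with some modifications.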
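For reference, the two formulas as they are commonly written in the informative encoder descriptions of ISO/IEC 11172-3 (MP3) and ISO/IEC 13818-7 (AAC) are, with $s$ standing for the combined step exponent (a name chosen here for illustration):

$ix = \mathrm{nint}\left(\left(|xr| / 2^{s/4}\right)^{0.75} - 0.0946\right)$ for MP3, and

$x_{quant} = \mathrm{int}\left(\left(|x| \cdot 2^{-s/4}\right)^{0.75} + \mathrm{MAGIC\_NUMBER}\right)$ for AAC.

The sketch below implements both under the assumption that `nint` means round-to-nearest, computed as `floor(v + 0.5)`, and that `int` truncates. Under that assumption the two rounding rules coincide, because 0.5 - 0.0946 = 0.4054. The function and parameter names are hypothetical.

```python
import math

MP3_OFFSET = 0.0946      # constant subtracted before nint() in the MP3 formula
MAGIC_NUMBER = 0.4054    # constant added before int() in the AAC formula

def mp3_quantize(x, step_exp):
    """MP3-style: nint((|x| / 2^(step_exp/4))^0.75 - 0.0946)."""
    v = (abs(x) / 2.0 ** (step_exp / 4.0)) ** 0.75
    # nint() is assumed to be round-to-nearest, written as floor(v + 0.5).
    return math.floor(v - MP3_OFFSET + 0.5)

def aac_quantize(x, step_exp):
    """AAC-style: int((|x| * 2^(-step_exp/4))^0.75 + MAGIC_NUMBER)."""
    v = (abs(x) * 2.0 ** (-step_exp / 4.0)) ** 0.75
    return int(v + MAGIC_NUMBER)

if __name__ == "__main__":
    for x, s in [(100.0, 0), (10.0, 4), (1.0, 0), (0.0, 0)]:
        print(x, s, mp3_quantize(x, s), aac_quantize(x, s))
```

Both functions quantize the magnitude only; the sign of the coefficient is handled separately.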

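From these formulas we can see that:

1. Ignoring the 0.75 exponent, raising the scalefactor by 1 changes the quantized coefficient by

about 1.5 dB = 20 log10(2^(1/4)). The exponent appears as -1/4 in the formula because scf is smaller than common_scf.

With the 3/4 non-linearity included, raising the scalefactor by 1 changes the quantized coefficient by

about 1.1 dB = 20 log10(2^((1/4)*(3/4))).

2. The global gain is the quantization step factor that the rate-distortion analysis derives over all bands or coefficients. The scf, by contrast, is a fine-tuning factor applied within a local subband or scale factor band. The two are combined when the spectrum as a whole is quantized.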
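The two dB figures in item 1 can be checked numerically. This is plain arithmetic; the helper name `step_change_db` is made up for this sketch.

```python
import math

def step_change_db(power=1.0):
    """dB change caused by one scalefactor step of 2^(1/4),
    optionally passed through a power-law exponent."""
    return 20.0 * math.log10(2.0 ** (0.25 * power))

if __name__ == "__main__":
    print(round(step_change_db(), 3))      # without the 0.75 exponent
    print(round(step_change_db(0.75), 3))  # with the 0.75 exponent
```

The results are about 1.505 dB and 1.129 dB, which round to the 1.5 dB and 1.1 dB quoted above.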

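These two formulas raise the following questions:

1. Why are they designed as exponential quantizers?

2. Given an exponential quantizer, why a base-2 exponent in steps of -0.25 and an overall power of 0.75? Where do these coefficients come from?

3. Where do MAGIC_NUMBER and 0.0946 come from?

Questions 1, 2, and 3 are really one question.

TO 1: After the MDCT, the sample data show a pronounced non-linear characteristic, and human hearing is non-linear as well (in particular, roughly logarithmic). Hence a non-linear quantizer is used. In addition, the non-linearity gives small coefficients a finer effective step size, while large coefficients get a coarser one.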
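The step-size effect can be made concrete with a small sketch. It assumes the decoder-side inverse of the 0.75 power, that is, reconstruction as q^(4/3), with the scalefactor gain and the sign left out; `dequantize` and `reconstruction_gaps` are illustrative names.

```python
def dequantize(q):
    """Inverse of the 0.75 power law: magnitude = q^(4/3) (gain omitted)."""
    return q ** (4.0 / 3.0)

def reconstruction_gaps(max_index):
    """Distance between neighbouring reconstruction levels for q = 0..max_index-1."""
    return [dequantize(q + 1) - dequantize(q) for q in range(max_index)]

if __name__ == "__main__":
    gaps = reconstruction_gaps(100)
    # Gap near zero, near q = 10, and near q = 100.
    print(gaps[0], gaps[9], gaps[99])
```

The gap between adjacent levels grows with the index, so the quantization noise grows with the coefficient magnitude instead of staying constant as in a uniform quantizer.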

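TO 2: Where do 1/4, 3/4, and 0.0946 come from?

According to the literature [1], the quantization used in MP3 and AAC is a power-law quantizer. A power-law quantizer is a variant application of the power law, which first appeared in [2].

1. For a detailed explanation of the power law, see the wiki:

http://en.wikipedia.org/wiki/Power_law#Power-law_functions

In short, some real-world data (seismic signals, for example) show a clear power characteristic. To describe such signals better, scientists devised a function,

namely the power-law function,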

$y = ax^k + \varepsilon.\!$

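The power law also has many concrete application functions, for example [3].

There are also several power-law distributions (power-law probability distributions),
and several methods for estimating power-law coefficients, such as those described on the wiki.

2. How, then, are the coefficients a, k, and ε obtained? This brings in the next topic, the Power Law of Practice.

Wiki link: http://en.wikipedia.org/wiki/Power_Law_of_Practice

In short, the parameters are obtained by training, that is, by fitting them to measured data.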
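As an illustration of what fitting such parameters can look like, the sketch below estimates a and k of y = a x^k by least squares in log-log space. It uses synthetic, noise-free data and ignores the ε term; it is not a claim about how the constants in the MP3 or AAC standards were actually derived, and `fit_power_law` is a hypothetical helper.

```python
import math

def fit_power_law(xs, ys):
    """Fit log(y) = log(a) + k*log(x) by ordinary least squares; return (a, k).
    Requires strictly positive xs and ys."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    var = sum((u - mean_x) ** 2 for u in lx)
    k = cov / var
    a = math.exp(mean_y - k * mean_x)
    return a, k

if __name__ == "__main__":
    xs = [1.0, 2.0, 4.0, 8.0, 16.0]
    ys = [2.0 * x ** 0.75 for x in xs]   # synthetic data with a = 2, k = 0.75
    print(fit_power_law(xs, ys))
```

With real measurements the fit would only be approximate, and the log-log regression is one of several estimation methods.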

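4. Why do the AAC and MP3 formulas differ?

With the modelling of the AAC and MP3 quantization schemes settled, why do the two differ? The answer is straightforward: MP3 uses a PQF followed by an MDCT, while AAC uses an MDCT only.

Because the front-end time-frequency transforms differ, the quantization parameters computed at the back end differ as well.

III. Besides exponential quantizers, there are also quantizers based on other transcendental functions, such as arctangent quantizers.

IV. Are the time-frequency quantization models of all standards designed this way?

Most audio coding standards use a scalefactor, an exponent, or something similar (for example the RMS in COOK/G.722.1) as the first stage of quantization. The methods used to quantize the remainder vary from codec to codec. Overall, though, the MP3 and AAC quantization is not easy to understand, whereas other non-uniform scalar quantizers are comparatively easy to follow.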

【1】LOW BITRATE AUDIO CODING STATE OF THE ART，by K Brandenburg

【2】Mechanisms of skill acquisition and the law of practice, by Allen Newell and Paul S. Rosenbloom, 1981

【3】The power law of practice in adaptive training applications

The Stevens' power law of psychophysics
The Stefan–Boltzmann law
The Ramberg–Osgood stress–strain relationship
The input-voltage–output-current curves of field-effect transistors and vacuum tubes approximate a square-law relationship, a factor in "tube sound".
A 3/2-power law can be found in the plate characteristic curves of triodes.
The inverse-square laws of Newtonian gravity and electrostatics
Electrostatic potential and gravitational potential
Model of van der Waals force
Force and potential in simple harmonic motion
Kepler's third law
The initial mass function of stars
The M-sigma relation
Gamma correction relating light intensity with voltage
Kleiber's law relating animal metabolism to size, and allometric laws in general
Behaviour near second-order phase transitions involving critical exponents
Proposed form of experience curve effects
The differential energy spectrum of cosmic-ray nuclei
Square-cube law (ratio of surface area to volume)
Constructal law
Fractals
The Pareto principle also called the "80–20 rule"
Zipf's law in corpus analysis and population distributions amongst others, where frequency of an item or event is inversely proportional to its frequency rank (i.e. the second most frequent item/event occurring half as often the most frequent item and so on).
The safe operating area relating to maximum simultaneous current and voltage in power semiconductors.
The unequal participation and traffic in relation to the use of communications tools as noted by Clay Shirky in Here Comes Everybody.

Newell, A., & Rosenbloom, P. S. (1981). Mechanisms of skill acquisition and the law of practice. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 1-55). Hillsdale, NJ: Erlbaum. ISBN 0898590930
Delaney, P. F., Reder, L. M., Staszewski, J. J., & Ritter, F. E. (1998). The strategy specific nature of improvement: The power law applies by strategy within task. Psychological Science, 9(1), 1-8.
Heathcote, A., Brown, S., & Mewhort, D. J. K. (2000). The power law repealed: The case for an exponential law of practice. Psychonomic Bulletin & Review, 7(2), 185-207.
Logan, G. (1992). Shapes of reaction-time distributions and shapes of learning curves: A test of the instance theory of automaticity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18(5), 883-914.
Anderson, J., Fincham, J., & Douglass, S. (1999). Practice and retention: A unifying analysis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(5), 1120-1136.

posted @ 2012-05-07 07:58 杭州桓泽
