Code page

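"Code page" is IBM's traditional term; it simply means "a character encoding table". Such a table can be very large or very small. Examples are the IBM PC (OEM) code page and the Chinese GBK code page.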

Code page is the traditional IBM term used for a specific character encoding table: a mapping in which a sequence of bits, usually a single octet representing integer values 0 through 255, is associated with a specific character. IBM and Microsoft often allocate a code page number to a character set even if that charset is better known by another name.
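To make the "one octet, one character" mapping concrete, here is a minimal Python sketch that reads the same byte through three code pages. The helper name `decode_with` is made up for this illustration; the codec names `cp437`, `cp1252` and `latin-1` are the ones Python's standard library uses for IBM PC (OEM) code page 437, Windows code page 1252 and ISO 8859-1.

```python
# One byte value, interpreted through three different code pages.
raw = bytes([0xE9])


def decode_with(code_page: str, data: bytes) -> str:
    """Look the bytes up in the given code page (hypothetical helper)."""
    return data.decode(code_page)


for name in ("cp437", "cp1252", "latin-1"):
    ch = decode_with(name, raw)
    # Show which Unicode code point this code page assigns to the byte.
    print(f"{name}: 0x{raw[0]:02X} -> U+{ord(ch):04X}")
```

The byte itself never changes; only the table used to look it up does. In code page 437 the byte 0xE9 is the Greek letter Θ, while in code page 1252 it is é.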

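The GB2312 code page is a double-byte encoding: a table of two-byte codes in which each byte is greater than 0xA0 (that is, in the range 0xA1 to 0xFE). Not every position in that range is used, so a code page may contain gaps: a small number of code values that have no character assigned.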
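The double-byte layout and its gaps can be checked with a short Python sketch. It assumes Python's built-in `gb2312` codec; the function name `gb2312_lookup` is made up for this example.

```python
from typing import Optional


def gb2312_lookup(lead: int, trail: int) -> Optional[str]:
    """Return the character at a two-byte GB2312 code, or None for a gap.

    Hypothetical helper: both bytes must lie in 0xA1..0xFE.
    """
    if not (0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE):
        raise ValueError("both bytes must be in the range 0xA1..0xFE")
    try:
        return bytes([lead, trail]).decode("gb2312")
    except UnicodeDecodeError:
        # The position is inside the table but has no character assigned.
        return None


print(gb2312_lookup(0xD6, 0xD0))  # an assigned position
print(gb2312_lookup(0xAA, 0xA1))  # an unassigned position in row 0xAA
```

The first call lands on an assigned position (the character 中); the second lands in a row that GB2312 leaves empty, so the codec rejects it and the helper returns None.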

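UTF is the transfer encoding of Unicode, that is, the byte form a Unicode value takes after it has been encoded. The UTF encoding method is simple and can be computed with plain arithmetic expressions, so there is little point in reading raw 3-byte UTF-8 data by hand. Unicode corresponds to the character set; UTF-8 corresponds to Unicode values (code points).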
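The arithmetic can be sketched in a few lines of Python. This is an illustration only, not a replacement for the built-in codec; the function name `utf8_encode` is made up for this example, and the result is compared against Python's own `str.encode`.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using shifts and masks."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if cp < 0x80:  # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:  # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:  # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# U+4E2D falls in the 3-byte range.
encoded = utf8_encode(0x4E2D)
print(encoded.hex())
print(encoded == chr(0x4E2D).encode("utf-8"))
```

For U+4E2D the top 4 bits go into the first byte (0xE4), the next 6 into the second (0xB8) and the last 6 into the third (0xAD), which matches what the built-in encoder produces.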
posted @ 2008-02-22 16:35  911