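A number such as 100.2 can be written in fixed-point form, 100.2, or in floating-point form (that is, scientific notation), 1.002 × 10².
By convention the first non-zero digit is placed before the decimal point. This is called normalized form, so the number above is not written as 100.2 × 10⁰ or 0.1002 × 10³.

The advantage of floating point is that it can represent very large and very small magnitudes that a fixed-point format of the same width cannot.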
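
To make the normalized form concrete, here is a small sketch that splits a decimal value into a mantissa with one non-zero digit before the point and a power-of-ten exponent. The class and method names (NormalizedForm, exponentOf, mantissaOf) are made up for this illustration, and the sketch assumes a non-zero input.

```java
import java.math.BigDecimal;

class NormalizedForm {
    // Decimal exponent e such that value = m x 10^e with 1 <= |m| < 10
    // (assumes value is non-zero).
    static int exponentOf(BigDecimal value) {
        // precision() = number of significant digits,
        // scale()     = number of digits after the decimal point.
        return value.precision() - value.scale() - 1;
    }

    // Normalized mantissa m: exactly one non-zero digit before the point.
    static BigDecimal mantissaOf(BigDecimal value) {
        return value.movePointLeft(exponentOf(value));
    }

    public static void main(String[] args) {
        BigDecimal v = new BigDecimal("100.2");
        System.out.println(mantissaOf(v).toPlainString() + " x 10^" + exponentOf(v));
    }
}
```

BigDecimal is used here only so the decimal digits stay exact; a binary float or double normalizes the same way, but in base 2.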

2. IEEE Standard 754 defines how floating-point numbers are represented in a computer.
Java's float (single precision) and double (double precision) are designed to follow this standard.

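The sign bit is 0 for positive, 1 for negative.
The exponent's base is two.
The exponent field contains 127 plus the true exponent for single precision,
or 1023 plus the true exponent for double precision.
The mantissa is assumed to have the form 1.f, where f is the field of fraction bits.
For a normalized binary number the digit before the point is always 1 (it is the first non-zero digit, and the only non-zero binary digit is 1), so it takes no storage and all the fraction bits are used for .f. Zero and subnormal values are the exception: they are encoded with an all-zero exponent field and have no implicit leading 1.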
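
The three fields described above can be pulled out of a float with Float.floatToIntBits and a few shifts and masks. This is only a sketch; the class name FloatBits and its helper methods are invented for the example.

```java
class FloatBits {
    // Bit 31: the sign bit (0 = positive, 1 = negative).
    static int sign(float x) {
        return Float.floatToIntBits(x) >>> 31;
    }

    // Bits 30..23: the exponent field, stored as true exponent + 127.
    static int biasedExponent(float x) {
        return (Float.floatToIntBits(x) >>> 23) & 0xFF;
    }

    // Bits 22..0: the fraction field f of the mantissa 1.f.
    static int fraction(float x) {
        return Float.floatToIntBits(x) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float x = -6.5f; // 6.5 is 110.1 in binary, i.e. 1.101 x 2^2
        int e = biasedExponent(x);
        System.out.println("sign           = " + sign(x));
        System.out.println("exponent field = " + e + " (true exponent " + (e - 127) + ")");
        System.out.println("fraction bits  = " + Integer.toBinaryString(fraction(x)));
    }
}
```

For -6.5f the sign is 1, the exponent field is 129 (2 + 127), and the fraction starts with 101, which is the .101 of 1.101 with the leading 1 left out.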

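The layouts are as follows (the original figures are not available):

float (32 bits): 1 sign bit, 8 exponent bits, 23 fraction bits

double (64 bits): 1 sign bit, 11 exponent bits, 52 fraction bits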

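3. Why is precision lost?

Take single-precision float as an example. An ordinary 32-bit integer can use all 32 bits to represent its value, whereas single precision has only 24 bits for the significant digits (23 stored fraction bits plus the implicit leading 1). Those 24 bits cannot match 32 bits of precision, so the low-order bits are lost. For example: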

      11110000 11001100 10101010 00001111  // 32-bit integer

  = +1.1110000 11001100 10101010 x 2³¹     // Single-Precision Float

  =   11110000 11001100 10101010 00000000  // Corresponding Value
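
The bit patterns above can be checked in Java. Since Java's int is signed, this sketch holds the 32-bit pattern in a long so that it is treated as a positive value; the class name PrecisionLoss and the helper throughFloat are invented for the example.

```java
class PrecisionLoss {
    // Converts an integer value to float and back again.
    // Anything that does not fit in the 24-bit significand is rounded away.
    static long throughFloat(long value) {
        return (long) (float) value;
    }

    public static void main(String[] args) {
        long original = 0xF0CCAA0FL; // 11110000 11001100 10101010 00001111
        long roundTripped = throughFloat(original);
        System.out.println(Long.toBinaryString(original));
        System.out.println(Long.toBinaryString(roundTripped));

        // The smallest positive integer a float cannot hold exactly is 2^24 + 1.
        System.out.println(throughFloat(16_777_217L));
    }
}
```

The low 8 bits (00001111) are less than half of the last representable step, so the value rounds down and the round trip yields 0xF0CCAA00. Likewise 16777217 comes back as 16777216.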

4. How to solve it?

long and BigDecimal are commonly used in place of float and double. For example, eBay's internal Money class
uses a long as its internal value to store the amount.
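
As a rough illustration of the idea (this is not eBay's actual class, whose code is not shown here), a money type can keep the amount as a long count of the smallest currency unit. The Money class below, its method names, and the choice of two decimal places are all assumptions made for the sketch.

```java
import java.math.BigDecimal;

final class Money {
    // Amount in the smallest unit (for example cents), so arithmetic stays exact.
    private final long cents;

    private Money(long cents) {
        this.cents = cents;
    }

    static Money ofCents(long cents) {
        return new Money(cents);
    }

    Money plus(Money other) {
        // addExact throws ArithmeticException on overflow instead of wrapping silently.
        return new Money(Math.addExact(cents, other.cents));
    }

    long cents() {
        return cents;
    }

    // Exact decimal view of the amount, assuming two decimal places.
    BigDecimal toBigDecimal() {
        return BigDecimal.valueOf(cents, 2);
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2 == 0.3); // false: binary doubles cannot hold 0.1 exactly
        Money sum = Money.ofCents(10).plus(Money.ofCents(20));
        System.out.println(sum.toBigDecimal()); // 0.30
        BigDecimal exact = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        System.out.println(exact.compareTo(new BigDecimal("0.3")) == 0); // true
    }
}
```

Note that BigDecimal values are built from strings here. new BigDecimal(0.1) would start from the already inexact double and carry the error along.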

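5. A common Java exam question

A decimal literal without a suffix is a double by default, so to get a float you must append f to the literal.

For example: float a=1.3;
fails to compile with an error saying that a double cannot be converted to float, because this is a narrowing conversion.

To declare the variable as a float, write float a=1.3f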
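
A short sketch of the options, with the failing line left as a comment so that the rest compiles. The class name FloatLiteral and its methods are made up for the example.

```java
class FloatLiteral {
    // The f suffix makes the literal a float.
    static float withSuffix() {
        return 1.3f;
    }

    // An explicit cast also works: the narrowing conversion is then requested on purpose.
    static float withCast() {
        return (float) 1.3;
    }

    public static void main(String[] args) {
        // float bad = 1.3; // does not compile: 1.3 is a double, and double -> float is narrowing

        float a = withSuffix();
        float b = withCast();
        System.out.println(a == b);   // true for this value
        System.out.println(a == 1.3); // false: a is widened to double, but it has already lost precision
    }
}
```

The last comparison is a reminder that 1.3f and 1.3 are different values: the float holds fewer bits of the binary expansion than the double does.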

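6. Value ranges of Java variables

byte ranges from -128 to 127 and takes 1 byte (-2⁷ to 2⁷-1)
short ranges from -32768 to 32767 and takes 2 bytes (-2¹⁵ to 2¹⁵-1)
int ranges from -2147483648 to 2147483647 and takes 4 bytes (-2³¹ to 2³¹-1)
long ranges from -9223372036854775808 to 9223372036854775807 and takes 8 bytes (-2⁶³ to 2⁶³-1)
float (single precision) takes 4 bytes (8-bit exponent field); its finite values span roughly ±3.4 × 10³⁸ (just under 2¹²⁸), and the smallest positive normalized value is 2⁻¹²⁶
double (double precision) takes 8 bytes (11-bit exponent field); its finite values span roughly ±1.8 × 10³⁰⁸ (just under 2¹⁰²⁴), and the smallest positive normalized value is 2⁻¹⁰²²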
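
Rather than memorizing these numbers, they can be read from the constants on the wrapper classes. The sketch below also recomputes the integer limits from the bit width; the class name PrimitiveRanges and the helpers maxSigned and minSigned are invented for the example.

```java
class PrimitiveRanges {
    // Largest value of a signed two's-complement integer of the given width
    // (valid for 2 <= bits <= 64): bits - 1 one-bits.
    static long maxSigned(int bits) {
        return -1L >>> (65 - bits);
    }

    // Smallest value: the bitwise complement of the maximum, i.e. -max - 1.
    static long minSigned(int bits) {
        return ~maxSigned(bits);
    }

    public static void main(String[] args) {
        System.out.println("byte:   " + Byte.MIN_VALUE + " ~ " + Byte.MAX_VALUE);
        System.out.println("short:  " + Short.MIN_VALUE + " ~ " + Short.MAX_VALUE);
        System.out.println("int:    " + Integer.MIN_VALUE + " ~ " + Integer.MAX_VALUE);
        System.out.println("long:   " + Long.MIN_VALUE + " ~ " + Long.MAX_VALUE);
        // For float and double, MIN_VALUE is the smallest positive (subnormal) value,
        // not the most negative one; MIN_NORMAL is the smallest positive normalized value.
        System.out.println("float:  max " + Float.MAX_VALUE + ", min normal " + Float.MIN_NORMAL);
        System.out.println("double: max " + Double.MAX_VALUE + ", min normal " + Double.MIN_NORMAL);
        System.out.println("8-bit range recomputed: " + minSigned(8) + " ~ " + maxSigned(8));
    }
}
```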

posted on 2011-11-19 10:19 significantfrank
