JVM in Depth -- A Detailed Look at the class File

A Warm-up Example

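The class file structure is fairly complex and hard to digest, so let's start with a lighter example that already shows some of the design ideas behind it. Take this passage from Mencius:

鱼我所欲也,熊掌亦我所欲也,二者不可得兼,舍鱼而取熊掌者也。
生亦我所欲也,义亦我所欲也,二者不可得兼,舍生而取义者也。

(Roughly: "Fish is what I desire; bear's paw is also what I desire; I cannot have both, so I give up the fish and take the bear's paw. Life is what I desire; righteousness is also what I desire; I cannot have both, so I give up life and take righteousness.")

Would you believe me if I said it can be written like this instead, with exactly the same meaning?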

01,231,4,50627。
831,931,4,58697。

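Whether you can read it is another matter, but you have to admit it is compact. The real hero is the dictionary below, which plays the role of the constant pool in a class file: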

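0: 鱼 (fish)
1: 我所欲也 (is what I desire)
2: 熊掌 (bear's paw)
3: 亦 (also)
4: 二者不可得兼 (I cannot have both)
5: 舍 (give up)
6: 而取 (and take)
7: 者也 (sentence-final particle)
8: 生 (life)
9: 义 (righteousness)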

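First store every literal in one place, then express the text by concatenating indexes. For example, with 0: 鱼 and 1: 我所欲也, the code 01 is the two entries joined together, which reads 鱼我所欲也. A longer run works the same way:

012314 -> 鱼 我所欲也 熊掌 亦 我所欲也 二者不可得兼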
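
The lookup-and-concatenate idea can be sketched in a few lines of Java. This is only an illustration of the toy encoding above; the class name DictionaryDecoder and the method decode are names I made up for the sketch, and it assumes every digit is a single-digit index into the ten-entry dictionary.

```java
final class DictionaryDecoder {
    // The ten-entry dictionary from the text; the array index is the code digit.
    static final String[] DICT = {
        "鱼", "我所欲也", "熊掌", "亦", "二者不可得兼",
        "舍", "而取", "者也", "生", "义"
    };

    // Replaces each ASCII digit with its dictionary entry and copies every
    // other character (the punctuation) through unchanged.
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder();
        for (char c : encoded.toCharArray()) {
            if (c >= '0' && c <= '9') {
                out.append(DICT[c - '0']);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the two original sentences, punctuation included.
        System.out.println(decode("01,231,4,50627。"));
        System.out.println(decode("831,931,4,58697。"));
    }
}
```

The encoded text never repeats a phrase; it only repeats a one-digit index, which is the same saving a class file gets from pointing into its constant pool.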

Handling Sentence Breaks

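This is still not perfect: the commas and full stops are left over, yet the dictionary has no entries for them, so how do we recover the clause boundaries? Let's add a length in front of each clause to guide the split. For example, 01 is 2 digits long, so we prefix it with 2 and get 201. The first sentence then becomes: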

Encoded   Length   Data    Decoded
201       2        01      鱼 我所欲也
3231      3        231     熊掌 亦 我所欲也
14        1        4       二者不可得兼
550627    5        50627   舍 鱼 而取 熊掌 者也
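
Reading such a stream back is mechanical: read one length digit, then consume exactly that many index digits. Here is a minimal Java sketch, assuming (as in the toy table) that a length always fits in one digit; LengthPrefixedDecoder is a hypothetical name. A real class file uses fixed-width u2 or u4 lengths instead of a single digit.

```java
import java.util.ArrayList;
import java.util.List;

final class LengthPrefixedDecoder {
    // Same ten-entry dictionary as in the text.
    static final String[] DICT = {
        "鱼", "我所欲也", "熊掌", "亦", "二者不可得兼",
        "舍", "而取", "者也", "生", "义"
    };

    // Splits the stream into records of the form
    // <one length digit><that many index digits> and decodes each record.
    static List<String> decode(String stream) {
        List<String> clauses = new ArrayList<>();
        int pos = 0;
        while (pos < stream.length()) {
            int length = stream.charAt(pos++) - '0';
            StringBuilder clause = new StringBuilder();
            for (int i = 0; i < length; i++) {
                clause.append(DICT[stream.charAt(pos++) - '0']);
            }
            clauses.add(clause.toString());
        }
        return clauses;
    }

    public static void main(String[] args) {
        // The four encoded clauses from the table, written back to back.
        for (String clause : decode("201" + "3231" + "14" + "550627")) {
            System.out.println(clause);
        }
    }
}
```

No separator characters are needed: the length in front of each record tells the reader where the next record starts.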

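The class file format is organized around just three ideas:

  1. Concatenating (or referencing) constant pool indexes
  2. A byte length that announces how much data follows
  3. Defined info structures

If the points above make sense, read on.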

A.java and A.class

Now for the main topic. Here is the source file A.java:

package basic.object;

public class A {

    // a rap verse
    private String song = "你看这个面它又长又宽,还有这个碗它又大又圆,两者之间并没有关系,但我要用rap把它们缝在一起,耶!";

    public void rap() {
        System.out.println(this.song);
    }

    public static void main(String[] args) throws Exception {
        A obj = new A();
        obj.rap();
        obj.getClass();
        while (true) {
        }
    }
}

Compiling the source with javac A.java produces A.class. Opened in a hex viewer, it is a wall of cryptic hexadecimal.

Let's read through that cryptic text together.
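
If you have no hex viewer at hand, a few lines of Java can print the same kind of dump. This is a small sketch, not a polished tool: HexDump and toHex are names I chose, and the path to A.class is assumed to be passed as the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class HexDump {
    // Formats bytes as uppercase hex, 16 bytes (32 hex digits) per line.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sb.append(String.format("%02X", data[i] & 0xFF));
            boolean endOfRow = i % 16 == 15;
            boolean lastByte = i == data.length - 1;
            if (endOfRow && !lastByte) {
                sb.append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the location of A.class on your machine.
        if (args.length == 1) {
            System.out.println(toHex(Files.readAllBytes(Paths.get(args[0]))));
        }
    }
}
```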

Disassembly

Run javap -v A.class to view the bytecode. It turns out that those few hexadecimal bytes carry all of the information below. Surprising, isn't it?

Classfile /path/to/java-oops/target/classes/basic/object/A.class
  Last modified 2025-9-26; size 969 bytes
  MD5 checksum 7b641ea8aee513d0afbf8492ea0fd19e
  Compiled from "A.java"
public class basic.object.A
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #10.#32        // java/lang/Object."<init>":()V
   #2 = String             #33            // 你看这个面它又长又宽,还有这个碗它又大又圆,两者之间并没有关系,但我要用rap把它们缝在一起,耶!
   #3 = Fieldref           #6.#34         // basic/object/A.song:Ljava/lang/String;
   #4 = Fieldref           #35.#36        // java/lang/System.out:Ljava/io/PrintStream;
   #5 = Methodref          #37.#38        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #6 = Class              #39            // basic/object/A
   #7 = Methodref          #6.#32         // basic/object/A."<init>":()V
   #8 = Methodref          #6.#40         // basic/object/A.rap:()V
   #9 = Methodref          #10.#41        // java/lang/Object.getClass:()Ljava/lang/Class;
  #10 = Class              #42            // java/lang/Object
  #11 = Utf8               song
  #12 = Utf8               Ljava/lang/String;
  #13 = Utf8               <init>
  #14 = Utf8               ()V
  #15 = Utf8               Code
  #16 = Utf8               LineNumberTable
  #17 = Utf8               LocalVariableTable
  #18 = Utf8               this
  #19 = Utf8               Lbasic/object/A;
  #20 = Utf8               rap
  #21 = Utf8               main
  #22 = Utf8               ([Ljava/lang/String;)V
  #23 = Utf8               args
  #24 = Utf8               [Ljava/lang/String;
  #25 = Utf8               obj
  #26 = Utf8               StackMapTable
  #27 = Class              #39            // basic/object/A
  #28 = Utf8               Exceptions
  #29 = Class              #43            // java/lang/Exception
  #30 = Utf8               SourceFile
  #31 = Utf8               A.java
  #32 = NameAndType        #13:#14        // "<init>":()V
  #33 = Utf8               你看这个面它又长又宽,还有这个碗它又大又圆,两者之间并没有关系,但我要用rap把它们缝在一起,耶!
  #34 = NameAndType        #11:#12        // song:Ljava/lang/String;
  #35 = Class              #44            // java/lang/System
  #36 = NameAndType        #45:#46        // out:Ljava/io/PrintStream;
  #37 = Class              #47            // java/io/PrintStream
  #38 = NameAndType        #48:#49        // println:(Ljava/lang/String;)V
  #39 = Utf8               basic/object/A
  #40 = NameAndType        #20:#14        // rap:()V
  #41 = NameAndType        #50:#51        // getClass:()Ljava/lang/Class;
  #42 = Utf8               java/lang/Object
  #43 = Utf8               java/lang/Exception
  #44 = Utf8               java/lang/System
  #45 = Utf8               out
  #46 = Utf8               Ljava/io/PrintStream;
  #47 = Utf8               java/io/PrintStream
  #48 = Utf8               println
  #49 = Utf8               (Ljava/lang/String;)V
  #50 = Utf8               getClass
  #51 = Utf8               ()Ljava/lang/Class;
{
  private java.lang.String song;
    descriptor: Ljava/lang/String;
    flags: ACC_PRIVATE

  public basic.object.A();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: aload_0
         5: ldc           #2                  // String 你看这个面它又长又宽,还有这个碗它又大又圆,两者之间并没有关系,但我要用rap把它们缝在一起,耶!
         7: putfield      #3                  // Field song:Ljava/lang/String;
        10: return
      LineNumberTable:
        line 3: 0
        line 6: 4
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      11     0  this   Lbasic/object/A;

  public void rap();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: aload_0
         4: getfield      #3                  // Field song:Ljava/lang/String;
         7: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        10: return
      LineNumberTable:
        line 9: 0
        line 10: 10
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      11     0  this   Lbasic/object/A;

  public static void main(java.lang.String[]) throws java.lang.Exception;
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=2, args_size=1
         0: new           #6                  // class basic/object/A
         3: dup
         4: invokespecial #7                  // Method "<init>":()V
         7: astore_1
         8: aload_1
         9: invokevirtual #8                  // Method rap:()V
        12: aload_1
        13: invokevirtual #9                  // Method java/lang/Object.getClass:()Ljava/lang/Class;
        16: pop
        17: goto          17
      LineNumberTable:
        line 13: 0
        line 14: 8
        line 15: 12
        line 16: 17
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      20     0  args   [Ljava/lang/String;
            8      12     1   obj   Lbasic/object/A;
      StackMapTable: number_of_entries = 1
        frame_type = 252 /* append */
          offset_delta = 17
          locals = [ class basic/object/A ]
    Exceptions:
      throws java.lang.Exception
}
SourceFile: "A.java"

class Structure

The class file layout follows the official specification: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}

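Here u4 is an unsigned 4-byte value and u2 an unsigned 2-byte value, both stored big-endian. A name of the form xx_info denotes another nested structure; for example, cp_info is the structure of a constant pool entry: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.4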

The class Bytes in Detail

Take another look at A.class. We will now read its bytes following the structure laid out in the specification.

General Information

Look at the first 10 bytes, CAFEBABE000000340034. They hold the magic number, the minor and major versions, and the constant pool count.

# u4 magic;
CAFEBABE            // magic number
# u2 minor_version;
0000                // minor version: 0
# u2 major_version;
0034                // major version: 52 (Java 8)
# u2 constant_pool_count;
0034                // constant pool count: 52
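
The same four fields can be read programmatically. Below is a minimal sketch using java.nio.ByteBuffer, whose default byte order is big-endian like the class file; HeaderReader, hex and readHeader are hypothetical names for this illustration, and the input is just the 10 header bytes quoted above rather than the whole file.

```java
import java.nio.ByteBuffer;

final class HeaderReader {
    // Converts a hex string such as "CAFEBABE" into bytes.
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Returns {magic, minor_version, major_version, constant_pool_count}.
    static long[] readHeader(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);   // big-endian by default
        long magic = buf.getInt() & 0xFFFFFFFFL;        // u4, kept unsigned
        int minor = buf.getShort() & 0xFFFF;            // u2
        int major = buf.getShort() & 0xFFFF;            // u2
        int constantPoolCount = buf.getShort() & 0xFFFF; // u2
        return new long[] {magic, minor, major, constantPoolCount};
    }

    public static void main(String[] args) {
        long[] h = readHeader(hex("CAFEBABE000000340034"));
        System.out.printf("magic=%08X minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
    }
}
```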

Constant Pool

Next comes the centerpiece, the constant pool. In the class file it is laid out as a list of cp_info entries. Its bytes are:

                    0A000A002008
0021090006002209002300240A002500
260700270A000600200A000600280A00
... (many lines omitted)
6176612F6C616E672F537472696E673B
2956010008676574436C617373010013
28294C6A6176612F6C616E672F436C61
73733B
cp_info constant_pool[constant_pool_count-1];

cp_info structure

cp_info {
    u1 tag;
    u1 info[];
}

The constant pool tags are:

Constant Type                 Value
CONSTANT_Class                7
CONSTANT_Fieldref             9
CONSTANT_Methodref            10
CONSTANT_InterfaceMethodref   11
CONSTANT_String               8
CONSTANT_Integer              3
CONSTANT_Float                4
CONSTANT_Long                 5
CONSTANT_Double               6
CONSTANT_NameAndType          12
CONSTANT_Utf8                 1
CONSTANT_MethodHandle         15
CONSTANT_MethodType           16
CONSTANT_InvokeDynamic        18

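constant_pool_count is 52, yet the array is declared one entry shorter ([constant_pool_count-1]). The reason is that constant pool indexes start at 1, so the valid indexes are 1 to 51, which is 51 entries. Index 0 is reserved to mean "nothing" or "no reference" and has no bytes in the file; counting it as well gives 52.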

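Here are a few common examples:

  1. CONSTANT_Methodref, a reference type, e.g. 0A000A0020

     Bytes     0A                   000A   0020
     Meaning   CONSTANT_Methodref   #10    #32
     Result    java/lang/Object."<init>":()V

  2. CONSTANT_Class, a reference type, e.g. 070027

     Bytes     07               0027
     Meaning   CONSTANT_Class   0x27 = 39, refers to constant #39
     Result    basic/object/A

  3. CONSTANT_Utf8, a string type, e.g. 010004736F6E67 (the content bytes are ASCII codes; see any ASCII table)

     Bytes     01              0004              736F6E67
     Meaning   CONSTANT_Utf8   length: 4 bytes   song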
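
The three examples share one pattern: read the one-byte tag, then read a body whose shape depends on that tag. Here is a rough Java sketch of that loop. ConstantPoolSketch and its methods are names I made up, only the tags that actually occur in A.class are handled, and Utf8 content is decoded as standard UTF-8, which is a simplification of the specification's modified UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolSketch {
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Reads one unsigned big-endian u2.
    static int u2(ByteBuffer buf) {
        return buf.getShort() & 0xFFFF;
    }

    // Decodes consecutive cp_info entries until the buffer is exhausted.
    static List<String> parse(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        List<String> entries = new ArrayList<>();
        while (buf.hasRemaining()) {
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: { // CONSTANT_Utf8: u2 length, then that many bytes
                    byte[] text = new byte[u2(buf)];
                    buf.get(text);
                    entries.add("Utf8 " + new String(text, StandardCharsets.UTF_8));
                    break;
                }
                case 7: // CONSTANT_Class: u2 name_index
                    entries.add("Class #" + u2(buf));
                    break;
                case 8: // CONSTANT_String: u2 string_index
                    entries.add("String #" + u2(buf));
                    break;
                case 9: // CONSTANT_Fieldref: u2 class_index, u2 name_and_type_index
                    entries.add("Fieldref #" + u2(buf) + ".#" + u2(buf));
                    break;
                case 10: // CONSTANT_Methodref: same shape as Fieldref
                    entries.add("Methodref #" + u2(buf) + ".#" + u2(buf));
                    break;
                case 12: // CONSTANT_NameAndType: u2 name_index, u2 descriptor_index
                    entries.add("NameAndType #" + u2(buf) + ":#" + u2(buf));
                    break;
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // The three example entries from the text, written back to back.
        for (String entry : parse(hex("0A000A0020" + "070027" + "010004736F6E67"))) {
            System.out.println(entry);
        }
    }
}
```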

Here are the bytes of the whole constant pool; the // comments are my annotations:

0A 000A 0020       // constant #1; 0A: CONSTANT_Methodref; #10 and #32 combine into java/lang/Object."<init>":()V
08 0021            // constant #2; 08: CONSTANT_String; refers to the literal at #33, "你看这个..."
09 00060022        // constant #3; 09: CONSTANT_Fieldref; meaning: basic/object/A.song:Ljava/lang/String;
09 00230024
0A 00250026
07 0027                 // 07: CONSTANT_Class; refers to constant #39, basic/object/A
0A 00060020
0A 00060028
0A 000A0029
07 002A
01 0004 736F6E67   // 01: CONSTANT_Utf8; 0004: content length of 4 bytes; 736F6E67: ASCII codes for "song"
01 0012 4C6A6176612F6C616E672F537472696E
673B
01 0006 3C696E69743E
01 0003 282956
01 0004 436F6465
01 000F 4C696E654E
756D6265725461626C65
01 0012 4C6F63
616C5661726961626C655461626C65
01 0004 74686973
01 0010 4C62617369632F6F626A6563742F413B
01 0003 726170
01 0004 6D61696E
01 0016 285B4C6A617661
2F6C616E672F537472696E673B2956
01 0004 61726773
01 0013 5B4C6A6176612F
6C616E672F537472696E673B
01 0003 6F626A
01 000D 537461636B4D6170546162
6C65
07 0027
01 000A 457863657074696F
6E73
07 002B
01 000A 536F757263654669
6C65
01 0006 412E6A617661
0C 000D 000E                       // constant #32; 0C: CONSTANT_NameAndType; #13:#14, i.e. "<init>":()V
01 008D E4BDA0E79C8BE8BF99E4B8AAE9
9DA2E5AE83E58F88E995BFE58F88E5AE
BDEFBC8CE8BF98E69C89E8BF99E4B8AA
E7A297E5AE83E58F88E5A4A7E58F88E5
9C86EFBC8CE4B8A4E88085E4B98BE997
B4E5B9B6E6B2A1E69C89E585B3E7B3BB
EFBC8CE4BD86E68891E8A681E794A872
6170E68A8AE5AE83E4BBACE7BC9DE59C
A8E4B880E8B5B7EFBC8CE880B6EFBC81
0C 000B 000C
07 002C
0C 002D 002E
07 002F
0C 0030 0031
01 000E 62617369632F6F62
6A6563742F41
0C 0014 000E
0C 0032 0033
01 0010 6A6176612F6C616E672F4F626A
656374
01 0013 6A6176612F6C616E672F
457863657074696F6E
01 0010 6A6176612F6C616E672F53797374656D
01 0003 6F7574
01 0015 4C6A6176612F696F2F5072
696E7453747265616D3B
01 0013 6A6176
612F696F2F5072696E7453747265616D
01 0007 7072696E746C6E
01 0015 284C6A
6176612F6C616E672F537472696E673B
2956
01 0008 676574436C617373
01 0013
28294C6A6176612F6C616E672F436C61
73733B
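
The longest entry in this dump is constant #33, the rap lyric, announced by 01 008D. As a sanity check, the sketch below decodes its payload and confirms that the declared length 0x8D (141) matches the number of payload bytes. Utf8EntryCheck is a hypothetical name; the hex is copied from the dump above, and decoding with standard UTF-8 is an assumption that holds for this particular string.

```java
import java.nio.charset.StandardCharsets;

final class Utf8EntryCheck {
    // Payload of constant #33: the bytes that follow "01 008D" in the dump.
    static final String PAYLOAD_HEX =
          "E4BDA0E79C8BE8BF99E4B8AAE9"
        + "9DA2E5AE83E58F88E995BFE58F88E5AE"
        + "BDEFBC8CE8BF98E69C89E8BF99E4B8AA"
        + "E7A297E5AE83E58F88E5A4A7E58F88E5"
        + "9C86EFBC8CE4B8A4E88085E4B98BE997"
        + "B4E5B9B6E6B2A1E69C89E585B3E7B3BB"
        + "EFBC8CE4BD86E68891E8A681E794A872"
        + "6170E68A8AE5AE83E4BBACE7BC9DE59C"
        + "A8E4B880E8B5B7EFBC8CE880B6EFBC81";

    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decodes the payload bytes back into the Java string.
    static String decode() {
        return new String(hex(PAYLOAD_HEX), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("payload bytes: " + hex(PAYLOAD_HEX).length);
        System.out.println("characters:    " + decode().length());
        System.out.println(decode());
    }
}
```

Each Chinese character and full-width punctuation mark takes 3 bytes while the letters of rap take 1 byte each, so 46 * 3 + 3 gives the 141 bytes announced by the length field.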

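These entries correspond one-to-one to the Constant pool section of the javap output shown earlier; reading the two side by side makes the layout clear.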


Class-level Information

Bytes:

      00210006000A00000001000200
0B000C00000003

Explanation:

### u2 access_flags;
0021                        // 0000 0000 0010 0001: ACC_PUBLIC (0x0001) | ACC_SUPER (0x0020)
### u2 this_class;
0006                        // constant #6: basic/object/A
### u2 super_class;
000A                        // constant #10: java/lang/Object
### u2 interfaces_count;
0000                        // 0 interfaces
### u2 interfaces[interfaces_count]
(no bytes here, because the interface count is 0)
### u2 fields_count;
0001                        // 1 field: song
### field_info     fields[fields_count];
0002 000B 000C 0000         // access_flags 0x0002 (ACC_PRIVATE), name #11 (song), descriptor #12 (Ljava/lang/String;), attributes_count 0
### u2 methods_count;
0003                        // 3 methods: the constructor, rap(), and main()
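
The access_flags values are bit masks, so decoding them is a matter of testing each bit. Below is a small sketch with two partial tables taken from the specification: one for class-level flags and one for the flags fields and methods have in common. The names AccessFlags, CLASS_FLAGS and MEMBER_FLAGS are my own, and the tables are deliberately incomplete. Note that the same bit can mean different things at different levels; 0x0020 is ACC_SUPER on a class but ACC_SYNCHRONIZED on a method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AccessFlags {
    // Partial tables; insertion order is kept so output is stable.
    static final Map<Integer, String> CLASS_FLAGS = new LinkedHashMap<>();
    static final Map<Integer, String> MEMBER_FLAGS = new LinkedHashMap<>();

    static {
        CLASS_FLAGS.put(0x0001, "ACC_PUBLIC");
        CLASS_FLAGS.put(0x0010, "ACC_FINAL");
        CLASS_FLAGS.put(0x0020, "ACC_SUPER");
        CLASS_FLAGS.put(0x0200, "ACC_INTERFACE");
        CLASS_FLAGS.put(0x0400, "ACC_ABSTRACT");

        MEMBER_FLAGS.put(0x0001, "ACC_PUBLIC");
        MEMBER_FLAGS.put(0x0002, "ACC_PRIVATE");
        MEMBER_FLAGS.put(0x0004, "ACC_PROTECTED");
        MEMBER_FLAGS.put(0x0008, "ACC_STATIC");
        MEMBER_FLAGS.put(0x0010, "ACC_FINAL");
    }

    // Returns the names of all flags in the table whose bit is set.
    static List<String> decode(int flags, Map<Integer, String> table) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Integer, String> e : table.entrySet()) {
            if ((flags & e.getKey()) != 0) {
                names.add(e.getValue());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021, CLASS_FLAGS));  // class A
        System.out.println(decode(0x0002, MEMBER_FLAGS)); // field song
        System.out.println(decode(0x0009, MEMBER_FLAGS)); // method main
    }
}
```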

Method Information

Next comes the method information, the part that deserves the most attention, because the executable code lives here.

 method_info methods[methods_count];

Let's see how it is organized. The relevant sections of the specification are:

https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.6

https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7

method_info {
    u2             access_flags;
    u2             name_index;
    u2             descriptor_index;
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
attribute_info {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 info[attribute_length];
}
Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
LineNumberTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 line_number_table_length;
    {   u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}
LocalVariableTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 descriptor_index;
        u2 index;
    } local_variable_table[local_variable_table_length];
}
Exceptions_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_exceptions;
    u2 exception_index_table[number_of_exceptions];
}
SourceFile_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 sourcefile_index;
}
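
Because every attribute starts with a name index and a byte length, a reader can walk a Code_attribute without understanding the nested attributes: it just skips attribute_length bytes each time. The sketch below does that for the 57-byte body of the constructor's Code attribute in A.class (the bytes after attribute_name_index and attribute_length). CodeAttributeWalker, CodeInfo and walk are hypothetical names for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class CodeAttributeWalker {
    // Body of the constructor's Code attribute, copied from A.class.
    static final String CONSTRUCTOR_CODE_BODY =
          "00020001"                       // max_stack, max_locals
        + "0000000B"                       // code_length
        + "2AB700012A1202B50003B1"         // code
        + "0000"                           // exception_table_length
        + "0002"                           // attributes_count
        + "00100000000A00020000000300040006" // LineNumberTable
        + "00110000000C00010000000B001200130000"; // LocalVariableTable

    static final class CodeInfo {
        int maxStack;
        int maxLocals;
        int codeLength;
        final List<Integer> nestedNameIndexes = new ArrayList<>();
    }

    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static CodeInfo walk(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body);
        CodeInfo info = new CodeInfo();
        info.maxStack = buf.getShort() & 0xFFFF;
        info.maxLocals = buf.getShort() & 0xFFFF;
        info.codeLength = buf.getInt();
        buf.position(buf.position() + info.codeLength);       // skip code[]
        int exceptionTableLength = buf.getShort() & 0xFFFF;
        buf.position(buf.position() + exceptionTableLength * 8); // 4 x u2 per entry
        int attributesCount = buf.getShort() & 0xFFFF;
        for (int i = 0; i < attributesCount; i++) {
            info.nestedNameIndexes.add(buf.getShort() & 0xFFFF);
            int length = buf.getInt();
            buf.position(buf.position() + length);            // skip unknown body
        }
        if (buf.hasRemaining()) {
            throw new IllegalStateException("trailing bytes: " + buf.remaining());
        }
        return info;
    }

    public static void main(String[] args) {
        CodeInfo info = walk(hex(CONSTRUCTOR_CODE_BODY));
        System.out.println("max_stack=" + info.maxStack + " max_locals=" + info.maxLocals
                + " code_length=" + info.codeLength + " nested=" + info.nestedNameIndexes);
    }
}
```

The walk consuming exactly 57 bytes with nothing left over also confirms the arithmetic behind attribute_length: 2 + 2 + 4 + 11 + 2 + 2 + 16 + 18 = 57.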

The annotated bytes follow. Both # and // mark my own annotations; a line starting with # names the field from the official structure definition.

# u2 access_flags;
0001                    // ACC_PUBLIC
# u2 name_index;
000D                    // #13: <init>
# u2 descriptor_index;
000E                    // #14: ()V
# u2 attributes_count;
0001                    // 1 attribute follows
# Code_attribute
# attribute_info attributes[attributes_count];
# u2 attribute_name_index;
000F                    // constant #15, Code: what follows is a Code_attribute
# u4 attribute_length;
0000 0039               // length: 57 bytes
# u2 max_stack;
0002                    // maximum operand stack depth: 2
# u2 max_locals;
0001                    // number of local variable slots: 1
# u4 code_length;
0000 000B               // code length: 11 bytes
# u1 code[code_length]; // the 11 code bytes: 2AB700012A1202B50003B1
2A                      // aload_0
B7 0001                 // invokespecial #1 java/lang/Object."<init>":()V
2A                      // aload_0
12 02                   // ldc #2
B5 0003                 // putfield #3
B1                      // return
# u2 exception_table_length;
0000                    // the method has 0 exception table entries
# exception_table[exception_table_length];
(no bytes)
# u2 attributes_count;
0002                    // 2 nested attributes follow
# attribute_info attributes[attributes_count];
# u2 attribute_name_index;
0010                    // #16: LineNumberTable
# u4 attribute_length;
0000 000A               // length: 10 bytes
# u2 line_number_table_length;
0002                    // 2 line number entries follow
# line_number_table[line_number_table_length];
# u2 start_pc;
0000
# u2 line_number;
0003                    // line 3: 0
# u2 start_pc;
0004
# u2 line_number;
0006                    // line 6: 4
# u2 attribute_name_index;
0011                    // #17: LocalVariableTable
# u4 attribute_length;
0000 000C               // length: 12 bytes
0001 0000 000B 0012 0013 0000  // 1 entry: start_pc 0, length 11, name #18 (this), descriptor #19 (Lbasic/object/A;), slot 0
# rap() method
0001                    // ACC_PUBLIC
0014                    // #20: rap
000E                    // #14: ()V
0001                    // attributes_count: 1
000F                    // #15: Code
0000 0039               // attribute_length: 3*16+9 = 57 bytes
00020001
0000000BB200042AB40003B60005B100
00000200100000000A00020000000900
0A000A00110000000C00010000000B00
1200130000
# main()
0009                    // 0000 1001: ACC_PUBLIC | ACC_STATIC
0015                    // #21: main
0016                    // #22: descriptor ([Ljava/lang/String;)V
0002                    // attributes_count: 2 (Code and Exceptions)
000F                    // #15: Code
0000 0062               // attribute_length: 16*6+2 = 98 bytes
0002000200000014BB000659B7
00074C2BB600082BB6000957A7000000
00000300100000001200040000000D00
08000E000C000F001100100011000000
16000200000014001700180000000800
0C001900130001001A000000080001FC
001107001B
001C                    // constant #28: Exceptions
0000 0004               // u4 attribute_length: 4 bytes
# u2 number_of_exceptions;
0001
# u2 exception_index_table[number_of_exceptions];
001D                    // #29: java/lang/Exception
# u2 attributes_count;  (back at the ClassFile level: the class's own attributes)
0001
# u2 attribute_name_index;
001E                    // #30: SourceFile
# u4 attribute_length;
0000 0002               // attribute length: 2 bytes
# u2 sourcefile_index;
001F                    // #31: A.java
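
To tie the code[] bytes back to the mnemonics javap prints, here is a tiny disassembler sketch. It knows only the 14 opcodes that appear in A.class, so it is nowhere near a complete implementation of the instruction set; MiniDisassembler and its method names are my own. For goto the two operand bytes are a signed offset relative to the instruction's own address, which is why A7 0000 at pc 17 prints as goto 17.

```java
import java.util.ArrayList;
import java.util.List;

final class MiniDisassembler {
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static String mnemonic(int op) {
        switch (op) {
            case 0x12: return "ldc";
            case 0x2A: return "aload_0";
            case 0x2B: return "aload_1";
            case 0x4C: return "astore_1";
            case 0x57: return "pop";
            case 0x59: return "dup";
            case 0xA7: return "goto";
            case 0xB1: return "return";
            case 0xB2: return "getstatic";
            case 0xB4: return "getfield";
            case 0xB5: return "putfield";
            case 0xB6: return "invokevirtual";
            case 0xB7: return "invokespecial";
            case 0xBB: return "new";
            default: throw new IllegalArgumentException("opcode not handled in this sketch: " + op);
        }
    }

    // Number of operand bytes that follow the opcode.
    static int operandWidth(int op) {
        switch (op) {
            case 0x12:
                return 1;
            case 0xA7: case 0xB2: case 0xB4: case 0xB5:
            case 0xB6: case 0xB7: case 0xBB:
                return 2;
            default:
                return 0;
        }
    }

    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc] & 0xFF;
            int width = operandWidth(op);
            StringBuilder line = new StringBuilder(pc + ": " + mnemonic(op));
            if (width == 1) {
                line.append(" #").append(code[pc + 1] & 0xFF);
            } else if (width == 2) {
                int operand = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                if (op == 0xA7) {
                    line.append(' ').append(pc + (short) operand); // branch target
                } else {
                    line.append(" #").append(operand);             // constant pool index
                }
            }
            lines.add(line.toString());
            pc += 1 + width;
        }
        return lines;
    }

    public static void main(String[] args) {
        // code[] of the constructor, then of main(), both copied from A.class.
        disassemble(hex("2AB700012A1202B50003B1")).forEach(System.out::println);
        disassemble(hex("BB000659B700074C2BB600082BB6000957A70000")).forEach(System.out::println);
    }
}
```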

Posted on 2025-10-04 21:47 by slgkaifa