JVM Source Code Analysis: The Java Object Header

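  Anyone who has studied Java seriously probably knows that a Java object is made up of three parts: the object header, the instance data, and the alignment padding. These three parts are what hold a Java object up. The instance data is simply the data stored in the object's fields. The alignment padding exists so that memory is allocated in a regular way: the size of a Java object is always a multiple of 8 bytes (by default), and since we can hardly ask programmers to take care of that themselves, the JVM automatically pads any object that falls short up to the next multiple of 8 bytes. The object header records the hash value, the GC age, the lock state (a biased lock also records the thread ID), the GC state and so on, and it also holds the object's class pointer, so it really is the core of the core. If you are interested, have a look at my earlier introduction to objects: https://www.cnblogs.com/gmt-hao/p/13817564.html. Next, let's dissect how the object header is implemented at the JVM level. As usual, code first.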
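
The padding rule can be sketched in a few lines. This is only an illustration, not HotSpot code: the class `ObjectAlignment` and the method `alignUp` are names made up for this example, and the 12-byte header is an assumption (an 8-byte mark word plus a 4-byte compressed class pointer on a 64-bit JVM).

```java
// Hypothetical helper for illustration only; not part of the JDK or HotSpot.
final class ObjectAlignment {

    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        long header = 12; // assumed: 8-byte mark word + 4-byte compressed class pointer
        long padded = alignUp(header, 8);
        System.out.println("raw size " + header + " bytes, padded to " + padded + " bytes");
    }
}
```

Under these assumed numbers an object with no fields needs 12 bytes and is padded to 16, so 4 bytes are pure alignment padding.
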

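  Java is an object-oriented language, and the name of the primitive class that represents an object inside the JVM is fittingly representative: oop. Let's open oop.hpp and take a look: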

// oopDesc is the top baseclass for objects classes.  The {name}Desc classes describe
// the format of Java objects so the fields can be accessed from C++.
// oopDesc is abstract.
// (see oopHierarchy for complete oop class hierarchy)
//
// no virtual functions allowed
// ... (omitted)
class oopDesc {
  friend class VMStructs;
 private:
  volatile markOop  _mark;
  union _metadata {
    Klass*      _klass;
    narrowKlass _compressed_klass;
  } _metadata;

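Start with the comment: oopDesc is the top-level base class of all object classes. As for the next sentence, my understanding is that the {name}Desc classes define the format of Java objects in terms of C++ fields, so that the fields can be accessed from C++. Now look at the fields defined below. _mark is the mark word, and _metadata has two members, _klass and _compressed_klass. The former is an ordinary pointer and the latter is a compressed pointer. Compressed class pointers are enabled by default in 64-bit JDK 1.8 (-XX:+UseCompressedClassPointers), and turning off compressed oops with -XX:-UseCompressedOops disables them as well. I won't go into detail here; just remember that both are class pointers that point to the concrete klass. Next, the comment on Klass: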

// A Klass provides:                                         
// 1: language level class object (method dictionary etc.)
// 2: provide vm dispatch behavior for the object
// Both functions are combined into one C++ class.
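In other words, a Klass provides the language-level class object (the method dictionary and so on) and the VM dispatch behavior for the object, and both roles are combined in a single C++ class.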

// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object. Thus,
// normal oops don't have any virtual functions. Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's
// actual type. (See oop.inline.hpp for some of the forwarding code.)
// ALL FUNCTIONS IMPLEMENTING THIS DISPATCH ARE PREFIXED WITH "oop_"!
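This passage explains why the implementation splits the klass and the object itself into two parts: the authors don't want a C++ vtable pointer stored in every object, so ordinary oops have no virtual functions at all. Instead they forward every "virtual" call to their klass, which does have a vtable and performs the C++ dispatch according to the object's actual type.
Now it starts to make sense to me: isn't this just polymorphism? So this is how polymorphism is played out. At compile time the object doesn't know which concrete method will be called; at run time the lookup goes through the klass to find the method that belongs to the actual type.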
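
To make the forwarding idea concrete, here is a toy model written in Java rather than C++. It is a sketch under loud assumptions: `ToyOop`, `ToyKlass`, the two klass constants and the sizes they return are all invented for this example and do not correspond to real HotSpot types or numbers.

```java
// Toy model of the oop/klass split. Every name and number here is made up for
// illustration; none of it is HotSpot code.
final class KlassDispatchSketch {

    // Plays the role of Klass: the only place where dispatch on the real type happens.
    interface ToyKlass {
        int oopSize(ToyOop obj);
    }

    // Plays the role of oopDesc: header data plus a klass pointer, no overridable behaviour.
    static final class ToyOop {
        final long mark;
        final ToyKlass klass;
        final int length; // only used by array-like klasses

        ToyOop(long mark, ToyKlass klass, int length) {
            this.mark = mark;
            this.klass = klass;
            this.length = length;
        }

        // Forwards the "virtual" call to the klass instead of implementing it itself.
        int size() {
            return klass.oopSize(this);
        }
    }

    // Assumed sizes: a 16-byte plain instance, and a 16-byte array header plus 4 bytes per element.
    static final ToyKlass INSTANCE_KLASS = obj -> 16;
    static final ToyKlass INT_ARRAY_KLASS = obj -> 16 + 4 * obj.length;

    public static void main(String[] args) {
        ToyOop plain = new ToyOop(1L, INSTANCE_KLASS, 0);
        ToyOop array = new ToyOop(1L, INT_ARRAY_KLASS, 3);
        System.out.println(plain.size() + " " + array.size());
    }
}
```

The point of the design is that every object pays only for one klass pointer, while the dispatch table lives once per class.
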
Let's look at InstanceKlass, the Klass subclass that is actually used when an ordinary class is loaded:
class InstanceKlass: public Klass {
  friend class VMStructs;
  friend class ClassFileParser;
  friend class CompileReplay;

 protected:
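  // Constructor
  InstanceKlass(int vtable_len,             // length of the virtual method table
                int itable_len,             // length of the interface method table
                int static_field_size,      // size of the static fields, in words
                int nonstatic_oop_map_size, // size of the non-static oop map blocks, in words
                ReferenceType rt,           // reference type
                AccessFlags access_flags,   // access flags of the class (public, private, ...)
                bool is_anonymous);         // whether the class is anonymous
// ... (omitted)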

 // See "The Java Virtual Machine Specification" section 2.16.2-5 for a detailed description
  // of the class loading & initialization procedure, and the use of the states.
  enum ClassState {
    allocated,                          // allocated (but not yet linked)
    loaded,                             // loaded and inserted in class hierarchy (but not linked yet)
    linked,                             // successfully linked/verified (but not initialized yet)
    being_initialized,                  // currently running class initializer
    fully_initialized,                  // initialized (successfull final state)
    initialization_error                // error happened during initialization
  };

 protected:
  // Annotations for this class
  Annotations*    _annotations;
  // Array classes holding elements of this class.
  Klass*          _array_klasses;
  // Constant pool for this class.
  ConstantPool*   _constants;
  // The InnerClasses attribute and EnclosingMethod attribute. The
  // _inner_classes is an array of shorts. If the class has InnerClasses
  // attribute, then the _inner_classes array begins with 4-tuples of shorts
  // [inner_class_info_index, outer_class_info_index,
  // inner_name_index, inner_class_access_flags] for the InnerClasses
  // attribute. If the EnclosingMethod attribute exists, it occupies the
  // last two shorts [class_index, method_index] of the array. If only
  // the InnerClasses attribute exists, the _inner_classes array length is
  // number_of_inner_classes * 4. If the class has both InnerClasses
  // and EnclosingMethod attributes the _inner_classes array length is
  // number_of_inner_classes * 4 + enclosing_method_attribute_size.
  Array<jushort>* _inner_classes;

  // the source debug extension for this klass, NULL if not specified.
  // Specified as UTF-8 string without terminating zero byte in the classfile,
  // it is stored in the instanceklass as a NULL-terminated UTF-8 string
  char*           _source_debug_extension;
  // Array name derived from this class which needs unreferencing
  // if this class is unloaded.
  Symbol*         _array_name;

  // Number of heapOopSize words used by non-static fields in this klass
  // (including inherited fields but after header_size()).
  int             _nonstatic_field_size;
  int             _static_field_size;     // number words used by static fields (oop and non-oop) in this klass
  // Constant pool index to the utf8 entry of the Generic signature,
  // or 0 if none.
  u2              _generic_signature_index;
  // Constant pool index to the utf8 entry for the name of source file
  // containing this klass, 0 if not specified.
  u2              _source_file_name_index;
  u2              _static_oop_field_count; // number of static oop fields in this klass
  u2              _java_fields_count;      // The number of declared Java fields
  int             _nonstatic_oop_map_size; // size in words of nonstatic oop map blocks

  // _is_marked_dependent can be set concurrently, thus cannot be part of the
  // _misc_flags.
  bool            _is_marked_dependent;    // used for marking during flushing and deoptimization

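As you can see, the constructor that initializes the Klass takes basic information such as the vtable length and the reference type. Further down, the fields add the annotations, the constant pool of the class, the inner classes and so on.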

  With klass covered, let's move on to today's main event, the mark word. As before, start with the authors' comment:

// The markOop describes the header of an object.
That is, markOop describes the header of an object.

//
// Note that the mark is not a real oop but just a word.
// It is placed in the oop hierarchy for historical reasons.
Note that the mark is just a word (32 bits on a 32-bit machine, 64 bits on a 64-bit machine), not a real object. It stays in the oop hierarchy for historical reasons.

//
// Bit-format of an object header (most significant first, big endian layout below):
// (the layout below is written most significant bit first, i.e. in big-endian order)
// 32 bits:
// --------
// hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
// size:32 ------------------------------------------>| (CMS free block)
// PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
// 64 bits:
// --------
// unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
// PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
// size:64 ----------------------------------------------------->| (CMS free block)

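  The very first sentence establishes the mark word as the star of this article: markOop describes the header of an object. Well now, so this is the real object header. Most of the articles I found online describe the object header as the mark word plus the klass pointer and so on, but that doesn't matter; it is only a difference in definition.

  Now look at the bit format. We'll focus on 64-bit systems and go through the four cases listed above: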

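  1. Not locked, but with the hash computed (normal object):

     unused:25 | hash:31 | unused:1 | age:4 | biased_lock:1 | lock:2

  2. Biased-locked and biased toward a specific thread (biased object):

     JavaThread*:54 | epoch:2 | unused:1 | age:4 | biased_lock:1 | lock:2

  3. CMS mark (CMS promoted object):

     PromotedObject*:61 | promo_bits:3

  4. I won't go into the reclaimed case (CMS free block); there is no live object left there.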
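
The 64-bit layouts quoted in the comment can be pulled apart with a few shifts and masks. The sketch below is a minimal illustration, not HotSpot code: the class `MarkWordFields` and its method names are hypothetical, only the bit widths are taken from the comment, and the sample values in `main` are made up.

```java
// Sketch of decoding the 64-bit mark word layouts quoted above.
// Class and method names are hypothetical; only the bit widths come from the comment.
final class MarkWordFields {

    // Normal object: unused:25 hash:31 unused:1 age:4 biased_lock:1 lock:2
    static int lockBits(long mark)  { return (int) (mark & 0b11); }
    static int biasedBit(long mark) { return (int) ((mark >>> 2) & 0b1); }
    static int age(long mark)       { return (int) ((mark >>> 3) & 0b1111); }
    static int hash(long mark)      { return (int) ((mark >>> 8) & 0x7FFF_FFFFL); }

    // Biased object: JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2
    static int epoch(long mark)       { return (int) ((mark >>> 8) & 0b11); }
    static long threadBits(long mark) { return mark >>> 10; }

    public static void main(String[] args) {
        // Made-up sample: hash 0x1234567, age 3, not biased, lock bits 01.
        long normal = (0x1234567L << 8) | (3L << 3) | 0b001;
        System.out.println(Integer.toHexString(hash(normal)) + " age=" + age(normal)
                + " biased=" + biasedBit(normal) + " lock=" + lockBits(normal));

        // Made-up sample: thread bits 0xABCDEF, epoch 1, age 0, biased, lock bits 01.
        long biased = (0xABCDEFL << 10) | (1L << 8) | 0b101;
        System.out.println(Long.toHexString(threadBits(biased)) + " epoch=" + epoch(biased)
                + " biased=" + biasedBit(biased) + " lock=" + lockBits(biased));
    }
}
```

Note that bits 8 and up are shared: the same positions hold the hash in the normal layout and the epoch plus thread pointer in the biased layout, which is why the two cannot coexist in one word.
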

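  There is actually a problem here: in the second case, the biased-lock layout, there is no room left to store the hash value. Does that mean I can no longer get the hash once the object is biased-locked? Of course not. To analyze this, let's first look at some code: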

 

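// Response.java
public class Response {
}

// TestHeader.java
import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.openjdk.jol.info.ClassLayout;

import static java.lang.Thread.sleep;

@Slf4j
public class TestHeader {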

    static Response response = new Response();
    public static void aaa(Response response) throws InterruptedException {
        log.info(Thread.currentThread().getName() + "out" +ClassLayout.parseInstance(response).toPrintable());

        synchronized (response){
            log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
            sleep(5000);
            log.info(Thread.currentThread().getName());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread("t1"){
            @SneakyThrows
            @Override
            public void run(){
                sleep(2000);
                aaa(response);
            }
        };
        Thread t2 = new Thread("t2"){
            @SneakyThrows
            @Override
            public void run(){
                aaa(response);
            }
        };
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}

Response here is an empty object whose hash has not been computed. Here is the output:

16:03:40.326 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:03:40.330 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 b0 59 1f (00000101 10110000 01011001 00011111) (525971461)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:03:42.368 [t1] INFO com.example.demo.TestHeader - t1outcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 b0 59 1f (00000101 10110000 01011001 00011111) (525971461)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:03:45.331 [t2] INFO com.example.demo.TestHeader - t2
16:03:45.331 [t1] INFO com.example.demo.TestHeader - t1com.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           ba 16 ee 1c (10111010 00010110 11101110 00011100) (485365434)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)  // klass reference
     12     4        (loss due to the next object alignment)    // alignment padding

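From the description of the object header above we know that the lock flag is the last two bits, and the third bit from the end is the biased-lock flag.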

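  Here is what the other fields mean. age records the GC age (it is only 4 bits wide, so it can count no higher than 15, which is why the maximum GC age is 15). biased_lock is the biased-lock flag: 0 means not biased, 1 means biasable or biased. lock is the lock state: 01 means unlocked or biased (the biased_lock bit tells the two apart), 00 a lightweight lock, 10 a heavyweight lock, and 11 an object marked by the GC (in the CMS promoted layout the last three bits are the promo bits).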
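
As a quick way to read these flags, the following sketch classifies a mark word by its low three bits. It is an illustration only: `LockStateSketch`, `describe` and `maxAge` are names invented here, and the returned strings are just labels for this article's terminology.

```java
// Hypothetical helper that names the lock state encoded in the low bits of a mark word.
final class LockStateSketch {

    static String describe(long mark) {
        int lock = (int) (mark & 0b11);              // last two bits: lock state
        boolean biased = ((mark >>> 2) & 1) == 1;    // third bit from the end: biased_lock
        switch (lock) {
            case 0b00: return "lightweight-locked";
            case 0b10: return "heavyweight-locked";
            case 0b11: return "marked by GC";
            default:   return biased ? "biasable or biased" : "unlocked";
        }
    }

    // The age field is 4 bits wide, so this is the largest value it can hold.
    static int maxAge() {
        return (1 << 4) - 1;
    }

    public static void main(String[] args) {
        // 0x05 and 0xba are first header bytes taken from the JOL dumps above.
        System.out.println(describe(0x05));
        System.out.println(describe(0xba));
        System.out.println("max age = " + maxAge());
    }
}
```

Only the lowest byte of the mark word is needed for this check, which is why passing the first printed byte is enough.
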

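 Then there is byte order. The diagram is written big-endian, but on a little-endian machine such as x86 the dump prints the bytes in memory order, so what we see is not what we might expect. Apart from the alignment padding and the klass, everything else is the mark word: 64 zeros and ones, 8 bytes in total, printed in reverse order (the first byte printed is really the last byte of the diagram). So to read the lock flag, just look at the last three bits of the first byte on the offset 0 line.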
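
The byte reversal can be checked by reassembling the eight printed bytes into one 64-bit value. This is a small sketch assuming a little-endian dump; the class `HeaderBytes` and the method `markWord` are hypothetical names, and the bytes are copied from the second dump above (05 b0 59 1f 00 00 00 00).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: turns the printed header bytes back into the 64-bit mark word.
final class HeaderBytes {

    // The bytes are given in the order they are printed (lowest address first),
    // so they are interpreted as a little-endian long.
    static long markWord(byte[] printed) {
        return ByteBuffer.wrap(printed).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        byte[] printed = {0x05, (byte) 0xb0, 0x59, 0x1f, 0, 0, 0, 0};
        long mark = markWord(printed);
        System.out.println(Long.toHexString(mark));            // prints 1f59b005
        System.out.println(Long.toBinaryString(mark & 0b111)); // prints 101
    }
}
```

The low three bits come out as 101, i.e. the biased pattern, and they are exactly the last three bits of the first printed byte 0x05.
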

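  Now let's analyze the code in detail. There are two threads, t1 and t2. t1 waits 2 seconds after starting, so t2 runs first and, once it has the lock, rests for 5 seconds. t1 arrives after 2 seconds, so the two compete for the lock. We can see that when t2 acquires the lock for the first time, its thread ID is recorded in the header (biased, 101), and once t1 comes to grab the lock, the biased lock is upgraded straight to a heavyweight lock (10).

  Let's try again with the 5-second sleep removed and look at the result: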

16:45:17.873 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:45:17.876 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 48 27 1f (00000101 01001000 00100111 00011111) (522668037)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:45:17.876 [t2] INFO com.example.demo.TestHeader - t2
16:45:19.843 [t1] INFO com.example.demo.TestHeader - t1outcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 48 27 1f (00000101 01001000 00100111 00011111) (522668037)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:45:19.844 [t1] INFO com.example.demo.TestHeader - t1com.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           f0 f3 ac 1f (11110000 11110011 10101100 00011111) (531428336)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)

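The first three dumps are the same as before. Because t2 does not sleep, it releases the lock as soon as it has taken it. When t1 comes for the lock after its 2-second sleep, the bias is revoked and the lock becomes a lightweight lock (00).

Now let's look at the hashCode case mentioned earlier: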

@Slf4j
public class TestHeader {

    static Response response = new Response();
    public static void aaa(Response response) throws InterruptedException {
        log.info(Thread.currentThread().getName() + "out" +ClassLayout.parseInstance(response).toPrintable());
        response.hashCode();
        log.info(Thread.currentThread().getName() + "hash" +ClassLayout.parseInstance(response).toPrintable());
        synchronized (response){
            log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
//            sleep(5000);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t2 = new Thread("t2"){
            @SneakyThrows
            @Override
            public void run(){
                aaa(response);
            }
        };
        t2.start();
        t2.join();
    }
}

Only one thread is started here, and the header is printed before the hash is computed, after it is computed, and after locking:

16:50:19.440 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:50:19.443 [t2] INFO com.example.demo.TestHeader - t2hashcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           01 63 bb 3f (00000001 01100011 10111011 00111111) (1069245185)
      4     4        (object header)                           50 00 00 00 (01010000 00000000 00000000 00000000) (80)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:50:19.444 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           10 ee 1e 1f (00010000 11101110 00011110 00011111) (522120720)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12     4        (loss due to the next object alignment)

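The first dump is the usual anonymous biasable state. After the hash is computed, the object becomes non-biasable and the hash value is stored in the header. After locking, it is no longer a biased lock; it goes straight to a lightweight lock (00), and the header now holds a pointer to the lock record on the owning thread's stack rather than a thread ID. Now let's see what happens when hashCode is called after the object is already biased toward a thread: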

@Slf4j
public class TestHeader {

    static Response response = new Response();
    public static void aaa(Response response) throws InterruptedException {
        log.info(Thread.currentThread().getName() + "out" +ClassLayout.parseInstance(response).toPrintable());

        synchronized (response){
            log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
            response.hashCode();
            log.info(Thread.currentThread().getName() + "hash" +ClassLayout.parseInstance(response).toPrintable());
//            sleep(5000);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t2 = new Thread("t2"){
            @SneakyThrows
            @Override
            public void run(){
                aaa(response);
            }
        };
        t2.start();
        t2.join();
    }
}

Output:

16:59:12.601 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:59:12.604 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 68 40 1f (00000101 01101000 01000000 00011111) (524314629)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:59:12.604 [t2] INFO com.example.demo.TestHeader - t2hashcom.example.demo.Response object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           2a 12 d4 1c (00101010 00010010 11010100 00011100) (483660330)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
     12     4        (loss due to the next object alignment)

As you can see, the biased lock is upgraded straight to a heavyweight lock (10).

 

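Summary:

  As I see it, the object header by itself is a static concept; what matters more is how it is used by GC, by locking, and perhaps by other operations in the future. In the JDK source and in the JVM I have seen many places where an int or some other multi-byte field is split up into parts. The read-write lock in the JDK, for example, uses the high and low bits for different purposes, and here a single word is made to express this many different things. I wasn't originally planning to write this article, but when I started on a source-code analysis of synchronized I got stuck after a short while and found there was simply no way around it. I suppose that also shows how important the object header is.

  As for lock upgrades, the examples above show that an object starts out anonymous biasable by default (the biased-locking startup delay is removed here; you can do this with -XX:BiasedLockingStartupDelay=0). When one thread comes along, the object is biased toward that thread. When several threads take turns (one finishes before the next starts, so two threads are never inside the critical section at the same time), the lock is upgraded to a lightweight lock. When several threads compete (two or more threads want the critical section at the same time), it is upgraded to a heavyweight lock. Once the hash is involved: computing the hash on an anonymous biasable object and then locking gives a lightweight lock, while computing the hash while the biased lock is held upgrades it straight to a heavyweight lock.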
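
The transitions observed in this article can be written down as a small table-like function. This is a deliberately simplified toy model, not the JVM's algorithm: the enums `State` and `Event`, the class `LockUpgradeSketch` and the method `next` are all invented here, lock release is not modelled, and only the cases demonstrated above are encoded.

```java
// Toy model of the lock upgrades observed in this article. All names are hypothetical,
// and anything not shown in the experiments above is left unchanged.
final class LockUpgradeSketch {

    enum State { ANONYMOUS_BIASABLE, BIASED, UNLOCKED_HASHED, LIGHTWEIGHT, HEAVYWEIGHT }

    enum Event { LOCK_UNCONTENDED, LOCK_BY_ANOTHER_THREAD_IN_TURN, LOCK_WHILE_HELD, HASH_CODE }

    static State next(State s, Event e) {
        switch (e) {
            case LOCK_UNCONTENDED:
                if (s == State.ANONYMOUS_BIASABLE) return State.BIASED;   // first thread: bias toward it
                if (s == State.UNLOCKED_HASHED) return State.LIGHTWEIGHT; // hash already stored: no biasing
                return s;
            case LOCK_BY_ANOTHER_THREAD_IN_TURN:
                return s == State.BIASED ? State.LIGHTWEIGHT : s;         // bias revoked, no overlap
            case LOCK_WHILE_HELD:
                return State.HEAVYWEIGHT;                                 // real contention
            case HASH_CODE:
                if (s == State.ANONYMOUS_BIASABLE) return State.UNLOCKED_HASHED;
                if (s == State.BIASED) return State.HEAVYWEIGHT;          // hash requested while biased lock is held
                return s;
            default:
                return s;
        }
    }

    public static void main(String[] args) {
        State s = State.ANONYMOUS_BIASABLE;
        s = next(s, Event.LOCK_UNCONTENDED);
        System.out.println(s);
        s = next(s, Event.HASH_CODE);
        System.out.println(s);
    }
}
```

Reading the function top to bottom reproduces the four experiments: bias on first lock, lightweight on alternating threads, heavyweight on contention, and the two hashCode variations.
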

 

 

 

 

posted @ 2020-12-17 21:06  吃肉不长肉的小灏哥