Part 32: Resolving the invokevirtual Bytecode Instruction

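In the earlier discussion of the invokevirtual instruction, we saw that if the _f2 part of the _indices field in a ConstantPoolCacheEntry is empty, the target method is considered not yet linked; that is, no information about the callee has been saved into the ConstantPoolCacheEntry. In that case InterpreterRuntime::resolve_invoke() must be called to link the method. The function is fairly long, so we will look at it in several parts.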

Part 1 of InterpreterRuntime::resolve_invoke():

Handle receiver(thread, NULL);
if (bytecode == Bytecodes::_invokevirtual || bytecode == Bytecodes::_invokeinterface) {
    ResourceMark rm(thread);
    // method() fetches the method executing in the current stack frame (the caller that contains the invoke instruction)
    Method* m1 = method(thread);
    methodHandle m (thread, m1);

    // bci() fetches the bytecode index of the instruction currently executing in that frame
    int i1 = bci(thread);
    Bytecode_invoke call(m, i1);

    // Signature of the method being invoked
    Symbol* signature = call.signature();

    frame fm = thread->last_frame();
    oop x = fm.interpreter_callee_receiver(signature);
    receiver = Handle(thread,x);
}

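The logic above runs when the bytecode is invokevirtual or invokeinterface, both of which dispatch dynamically; it obtains the value of receiver. The implementation continues as follows: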

Part 2 of InterpreterRuntime::resolve_invoke():

CallInfo info;
constantPoolHandle pool(thread, method(thread)->constants());

{
    JvmtiHideSingleStepping jhss(thread);
    int cpcacheindex = get_index_u2_cpcache(thread, bytecode);
    LinkResolver::resolve_invoke(info, receiver, pool,cpcacheindex, bytecode, CHECK);
    ...
} 

// Return immediately if the call information has already been written to the ConstantPoolCacheEntry
if (already_resolved(thread))
  return;

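The operand of the bytecode instruction is read through the bcp stored in the current frame; this operand is normally the index of a constant pool cache entry. LinkResolver::resolve_invoke() is then called to link the method. It indirectly calls LinkResolver::resolve_invokevirtual(), implemented as follows: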

void LinkResolver::resolve_invokevirtual(
 CallInfo&           result,
 Handle              recv,
 constantPoolHandle  pool,
 int                 index,
 TRAPS
){

  KlassHandle  resolved_klass;
  Symbol*      method_name = NULL;
  Symbol*      method_signature = NULL;
  KlassHandle  current_klass;

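  // When the constant pool entry is resolved, both pool (the constant pool found through the method
  // executing in the current frame) and index (the index of a constant pool cache entry, which still has
  // to be mapped to an index in the original constant pool) are set. From these two values we can
  // resolve resolved_klass, the method name method_name, and the method signature method_signature.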
  resolve_pool(resolved_klass, method_name,  method_signature, current_klass, pool, index, CHECK);

  KlassHandle  recvrKlass(THREAD, recv.is_null() ? (Klass*)NULL : recv->klass());

  resolve_virtual_call(result, recv, recvrKlass, resolved_klass, method_name, method_signature, current_klass, true, true, CHECK);
}

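This function calls resolve_pool() and resolve_virtual_call() to resolve the constant pool entry and the method call, respectively. The functions involved are shown in the figure below.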

The following sections describe resolve_pool() and resolve_virtual_call(), together with the functions they call.

1. LinkResolver::resolve_pool()

resolve_pool() calls a number of other functions, as shown in the figure below.

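A call to LinkResolver::resolve_pool() does not always follow the call chain above. When the class has not yet been resolved, however, SystemDictionary::resolve_or_fail() is normally called to resolve it. This yields a pointer to the Klass instance, which is then written back into the constant pool.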

resolve_pool() is implemented as follows:

void LinkResolver::resolve_pool(
 KlassHandle& resolved_klass,
 Symbol*&     method_name,
 Symbol*&     method_signature,
 KlassHandle& current_klass,
 constantPoolHandle pool,
 int          index,
 TRAPS
) {
  resolve_klass(resolved_klass, pool, index, CHECK);

  method_name      = pool->name_ref_at(index);
  method_signature = pool->signature_ref_at(index);
  current_klass    = KlassHandle(THREAD, pool->pool_holder());
}

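Here index is the index of a constant pool cache entry. The resolved_klass parameter is the class to be resolved. (Resolution is part of the linking phase of the class lifecycle, which is why we have sometimes called it linking; strictly speaking, we mean resolution.) current_klass is the class that owns the constant pool. Because these parameters are passed by C++ reference, assigning to them inside the function directly changes the corresponding variables in the caller.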

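resolve_klass() is called to resolve the class. In general, class resolution happens when the constant pool entry is resolved. This was covered in the book 《深入剖析Java虚拟机：源码剖析与实例详解（基础卷）》, but it is worth going over again here.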

resolve_klass() and its related functions are implemented as follows:

void LinkResolver::resolve_klass(
 KlassHandle&         result,
 constantPoolHandle   pool,
 int                  index,
 TRAPS
) {
  Klass* result_oop = pool->klass_ref_at(index, CHECK);
  // result is a reference parameter, so the caller sees this assignment
  result = KlassHandle(THREAD, result_oop);
}

Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  int x = klass_ref_index_at(which);
  return klass_at(x, CHECK_NULL);
}

int klass_ref_index_at(int which) {
  return impl_klass_ref_index_at(which, false);
}

impl_klass_ref_index_at() is implemented as follows:

int ConstantPool::impl_klass_ref_index_at(int which, bool uncached) {
  int i = which;
  if (!uncached && cache() != NULL) {
    // Get the original constant pool index from the ConstantPoolCacheEntry that which refers to
    i = remap_instruction_operand_from_cache(which);
  }

  assert(tag_at(i).is_field_or_method(), "Corrupted constant pool");
  // Read the 32-bit value stored in slot i
  jint ref_index = *int_at_addr(i);
  // The low 16 bits are the class_index
  return extract_low_short_from_int(ref_index);
}

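The assertion tells us that the entry at index i of the original constant pool must be a JVM_CONSTANT_Fieldref, JVM_CONSTANT_Methodref, or JVM_CONSTANT_InterfaceMethodref. These entries have the following formats: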

CONSTANT_Fieldref_info{
  u1 tag;
  u2 class_index;
  u2 name_and_type_index; // must be a field descriptor
}

CONSTANT_InterfaceMethodref_info{
  u1 tag;
  u2 class_index; // must be an interface
  u2 name_and_type_index; // must be a method descriptor
}

CONSTANT_Methodref_info{
  u1 tag;
  u2 class_index; // must be a class
  u2 name_and_type_index; // must be a method descriptor
}

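All three entries share the same format. The entry at class_index must be a CONSTANT_Class_info structure representing a class or interface, and the field or method is a member of that class or interface. The entry at name_and_type_index must be a CONSTANT_NameAndType_info.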

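class_index is obtained by calling int_at_addr() and extract_low_short_from_int(). These functions are easy to follow once you understand the memory layout of the constant pool, so they are not described here.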
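
For readers who have not seen that layout, the following is a minimal sketch of the packing idea. The text above only states that class_index sits in the low 16 bits; placing name_and_type_index in the high 16 bits is an assumption made for this illustration, and pack_ref_entry(), low_short(), and high_short() are hypothetical helper names, not HotSpot functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs the two u2 indexes of a Fieldref/Methodref entry
// into one 32-bit slot. Assumed layout: [ name_and_type_index | class_index ].
inline std::uint32_t pack_ref_entry(std::uint16_t class_index,
                                    std::uint16_t name_and_type_index) {
  return (static_cast<std::uint32_t>(name_and_type_index) << 16) | class_index;
}

// Same idea as extract_low_short_from_int(): keep only the low 16 bits.
inline std::uint16_t low_short(std::uint32_t v) {
  return static_cast<std::uint16_t>(v & 0xFFFFu);
}

// Counterpart for the high 16 bits.
inline std::uint16_t high_short(std::uint32_t v) {
  return static_cast<std::uint16_t>((v >> 16) & 0xFFFFu);
}
```

With this layout, low_short(pack_ref_entry(7, 12)) gives back the class_index 7 and high_short() gives back 12.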

klass_ref_at() calls klass_at(), which is implemented as follows:

Klass* klass_at(int which, TRAPS) {
    constantPoolHandle h_this(THREAD, this);
    return klass_at_impl(h_this, which, CHECK_NULL);
}

klass_at_impl() is implemented as follows:

Klass* ConstantPool::klass_at_impl(
 constantPoolHandle this_oop,
 int                which,
 TRAPS
) {
  
  CPSlot entry = this_oop->slot_at(which);
  if (entry.is_resolved()) { // already resolved
    return entry.get_klass();
  }

  bool do_resolve = false;
  bool in_error = false;

  Handle  mirror_handle;
  Symbol* name = NULL;
  Handle  loader;
  {
     MonitorLockerEx ml(this_oop->lock());

    if (this_oop->tag_at(which).is_unresolved_klass()) {
      if (this_oop->tag_at(which).is_unresolved_klass_in_error()) {
        in_error = true;
      } else {
        do_resolve = true;
        name   = this_oop->unresolved_klass_at(which);
        loader = Handle(THREAD, this_oop->pool_holder()->class_loader());
      }
    }
  } // unlocking constantPool

  // The handling for in_error == true is omitted
 
  if (do_resolve) {
    oop protection_domain = this_oop->pool_holder()->protection_domain();
    Handle h_prot (THREAD, protection_domain);
    Klass* k_oop = SystemDictionary::resolve_or_fail(name, loader, h_prot, true, THREAD);
    KlassHandle k;
    if (!HAS_PENDING_EXCEPTION) {
      k = KlassHandle(THREAD, k_oop);
      mirror_handle = Handle(THREAD, k_oop->java_mirror());
    }

    if (HAS_PENDING_EXCEPTION) {
      ...
      return 0;
    }

    if (TraceClassResolution && !k()->oop_is_array()) {
      ...      
    } else {
      MonitorLockerEx ml(this_oop->lock());
      do_resolve = this_oop->tag_at(which).is_unresolved_klass();
      if (do_resolve) {
        ClassLoaderData* this_key = this_oop->pool_holder()->class_loader_data();
        this_key->record_dependency(k(), CHECK_NULL); // Can throw OOM
        this_oop->klass_at_put(which, k()); // Updates the constant pool entry: the class is now resolved and will not be resolved again
      }
    }
  }

  entry = this_oop->resolved_klass_at(which);
  assert(entry.is_resolved() && entry.get_klass()->is_klass(), "must be resolved at this point");
  return entry.get_klass();
}

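The function first calls slot_at() to read the value stored in a constant pool slot and wraps it in a CPSlot. The slot holds one of two kinds of value: a pointer to a Symbol instance (the class name is a CONSTANT_Utf8_info entry, and the virtual machine represents such strings internally as Symbol objects) or a pointer to a Klass instance. If the class has been resolved, the lowest bit of the pointer is 0; if it has not, the lowest bit is 1.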
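
The low-bit tagging can be sketched as follows. This is an illustration only: ToySymbol, ToyKlass, and ToySlot are hypothetical stand-ins, and the sketch assumes both pointee types are at least 2-byte aligned so that the lowest address bit is free to use as a tag.

```cpp
#include <cassert>
#include <cstdint>

struct ToySymbol { const char* text; };  // hypothetical stand-in for Symbol
struct ToyKlass  { const char* name; };  // hypothetical stand-in for Klass

// One machine word that holds either a tagged Symbol pointer (low bit 1,
// unresolved) or a plain Klass pointer (low bit 0, resolved).
class ToySlot {
 public:
  static ToySlot unresolved(ToySymbol* s) {
    return ToySlot(reinterpret_cast<std::uintptr_t>(s) | 1u);
  }
  static ToySlot resolved(ToyKlass* k) {
    return ToySlot(reinterpret_cast<std::uintptr_t>(k));
  }

  bool is_resolved() const { return (_ptr & 1u) == 0; }

  ToyKlass* get_klass() const {
    assert(is_resolved());
    return reinterpret_cast<ToyKlass*>(_ptr);
  }
  ToySymbol* get_symbol() const {
    assert(!is_resolved());
    // Clear the tag bit before using the value as a pointer again.
    return reinterpret_cast<ToySymbol*>(_ptr & ~static_cast<std::uintptr_t>(1));
  }

 private:
  explicit ToySlot(std::uintptr_t p) : _ptr(p) {}
  std::uintptr_t _ptr;
};
```

A single word can therefore answer "resolved or not?" without a separate flag, which is why klass_at_impl() can test entry.is_resolved() before taking any lock.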

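If the class has not been resolved, SystemDictionary::resolve_or_fail() is called to obtain the Klass instance, and the constant pool is updated so that the class does not have to be resolved again. Finally, the pointer to the Klass instance is returned.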
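
The overall shape of klass_at_impl() (check under the lock, load outside the lock, then re-check and publish under the lock) can be sketched like this. ToyPool, ToyPoolEntry, and the loader callback are hypothetical; the sketch uses std::mutex in place of the HotSpot monitor and leaves out error handling entirely.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct ToyKlass { std::string name; };

struct ToyPoolEntry {
  std::string name;   // class name, used while the entry is unresolved
  ToyKlass*   klass;  // nullptr until the entry has been resolved
};

class ToyPool {
 public:
  explicit ToyPool(std::vector<ToyPoolEntry> entries)
      : _entries(std::move(entries)) {}

  // load plays the role of SystemDictionary::resolve_or_fail() in this sketch.
  ToyKlass* klass_at(int which,
                     const std::function<ToyKlass*(const std::string&)>& load) {
    std::string name;
    {
      std::lock_guard<std::mutex> guard(_lock);
      if (_entries[which].klass != nullptr) {
        return _entries[which].klass;  // fast path: already resolved
      }
      name = _entries[which].name;
    }  // lock released: loading a class can be slow

    ToyKlass* k = load(name);

    std::lock_guard<std::mutex> guard(_lock);
    if (_entries[which].klass == nullptr) {
      _entries[which].klass = k;  // publish unless another thread already did
    }
    return _entries[which].klass;
  }

 private:
  std::mutex _lock;
  std::vector<ToyPoolEntry> _entries;
};
```

Calling klass_at() twice for the same index invokes the loader only once; the second call returns the cached pointer.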

Back in LinkResolver::resolve_pool(), the next step is to read the name_and_type_index of the JVM_CONSTANT_Fieldref, JVM_CONSTANT_Methodref, or JVM_CONSTANT_InterfaceMethodref entry. It refers to a CONSTANT_NameAndType_info entry, whose format is:

CONSTANT_NameAndType_info{
  u1 tag;
  u2 name_index;
  u2 descriptor_index;
}

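The lookup first maps the constant pool cache entry index to the index of the original constant pool entry, then locates the CONSTANT_NameAndType_info entry, reads the indexes of the method name and signature from it, and finally obtains the name and signature of the target method. This information is used by resolve_virtual_call(), which is called next.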
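
The chain of indexes (Methodref, then NameAndType, then two Utf8 entries) can be sketched with a toy constant pool. ToyEntry and name_and_signature_at() are hypothetical names introduced for this illustration; the real ConstantPool stores these values in a far more compact form, and the cache-index remapping step is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified constant pool entry.
struct ToyEntry {
  enum Tag { Utf8, NameAndType, Methodref } tag;
  std::string   utf8;  // Utf8: the string itself
  std::uint16_t a;     // NameAndType: name_index;       Methodref: class_index
  std::uint16_t b;     // NameAndType: descriptor_index; Methodref: name_and_type_index
};

struct NameAndSignature {
  std::string name;
  std::string signature;
};

// Follows Methodref -> NameAndType -> Utf8 to recover name and signature.
inline NameAndSignature name_and_signature_at(const std::vector<ToyEntry>& pool,
                                              int methodref_index) {
  const ToyEntry& ref = pool.at(methodref_index);
  assert(ref.tag == ToyEntry::Methodref);
  const ToyEntry& nt = pool.at(ref.b);
  assert(nt.tag == ToyEntry::NameAndType);
  return NameAndSignature{pool.at(nt.a).utf8, pool.at(nt.b).utf8};
}
```

Given a pool in which entry 4 is a Methodref whose NameAndType points at the strings "toString" and "()Ljava/lang/String;", name_and_signature_at(pool, 4) returns exactly that pair.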

2. LinkResolver::resolve_virtual_call()

The functions called by resolve_virtual_call() are shown in the figure below.

LinkResolver::resolve_virtual_call() is implemented as follows:

void LinkResolver::resolve_virtual_call(
 CallInfo&     result,
 Handle        recv,
 KlassHandle   receiver_klass,
 KlassHandle   resolved_klass,
 Symbol*       method_name,
 Symbol*       method_signature,
 KlassHandle   current_klass,
 bool         check_access,
 bool         check_null_and_abstract,
 TRAPS
) {
  methodHandle resolved_method;

  linktime_resolve_virtual_method(resolved_method, resolved_klass, method_name, method_signature, current_klass, check_access, CHECK);

  runtime_resolve_virtual_method(result, resolved_method, resolved_klass, recv, receiver_klass, check_null_and_abstract, CHECK);
}

LinkResolver::linktime_resolve_virtual_method() is called first; it in turn calls the following function:

void LinkResolver::resolve_method(
 methodHandle&  resolved_method,
 KlassHandle    resolved_klass,
 Symbol*        method_name,
 Symbol*        method_signature,
 KlassHandle    current_klass,
 bool          check_access,
 bool          require_methodref,
 TRAPS
) {

  // Look the method up in the resolved class and its superclasses
  lookup_method_in_klasses(resolved_method, resolved_klass, method_name, method_signature, true, false, CHECK);

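  // The method was not found in the class hierarchy of the resolved class
  if (resolved_method.is_null()) { 
    // Look the method up in all interfaces implemented by the resolved class, directly or indirectly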
    lookup_method_in_interfaces(resolved_method, resolved_klass, method_name, method_signature, CHECK);
    // ...

    if (resolved_method.is_null()) {
      // No matching method was found
      ...
    }
  }

  // ...
}

The main job of this function is to find a suitable method in resolved_klass based on method_name and method_signature; if one is found, it is assigned to resolved_method.

The lookup itself is done by lookup_method_in_klasses(), lookup_method_in_interfaces(), and similar functions, which are not covered for now.
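
Although those functions are not examined here, the two-step order (superclass chain first, then interfaces) can be sketched under simplifying assumptions. ToyClass, ToyMethod, and the lookup helpers below are hypothetical; the sketch ignores access checks, overpass methods, static methods, and every other rule the real lookup applies.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyMethod {
  std::string name;
  std::string signature;
};

struct ToyClass {
  const ToyClass*              super;       // nullptr for a root class or an interface
  std::vector<const ToyClass*> interfaces;  // directly declared (super)interfaces
  std::vector<ToyMethod>       methods;     // methods declared by this type itself
};

inline const ToyMethod* find_local(const ToyClass* k, const std::string& name,
                                   const std::string& sig) {
  for (const ToyMethod& m : k->methods) {
    if (m.name == name && m.signature == sig) return &m;
  }
  return nullptr;
}

// Step 1: walk the class and its superclasses.
inline const ToyMethod* lookup_in_klasses(const ToyClass* k, const std::string& name,
                                          const std::string& sig) {
  for (; k != nullptr; k = k->super) {
    if (const ToyMethod* m = find_local(k, name, sig)) return m;
  }
  return nullptr;
}

// Step 2: walk all interfaces, including those inherited indirectly.
inline const ToyMethod* lookup_in_interfaces(const ToyClass* k, const std::string& name,
                                             const std::string& sig) {
  for (; k != nullptr; k = k->super) {
    for (const ToyClass* itf : k->interfaces) {
      if (const ToyMethod* m = find_local(itf, name, sig)) return m;
      if (const ToyMethod* m = lookup_in_interfaces(itf, name, sig)) return m;
    }
  }
  return nullptr;
}

inline const ToyMethod* resolve_method(const ToyClass* k, const std::string& name,
                                       const std::string& sig) {
  const ToyMethod* m = lookup_in_klasses(k, name, sig);
  if (m == nullptr) m = lookup_in_interfaces(k, name, sig);
  return m;  // nullptr means "no such method"
}
```

A method declared only in an implemented interface is found by the second step, which is the situation that later leads to the miranda-method branch.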

Next comes runtime_resolve_virtual_method(), implemented as follows:

void LinkResolver::runtime_resolve_virtual_method(
 CallInfo&      result,
 methodHandle   resolved_method,
 KlassHandle    resolved_klass,
 Handle         recv,
 KlassHandle    recv_klass,
 bool          check_null_and_abstract,
 TRAPS
) {

  int vtable_index = Method::invalid_vtable_index;
  methodHandle selected_method;

  // A method declared in an interface is a miranda method here
  if (resolved_method->method_holder()->is_interface()) { 
    vtable_index = vtable_index_of_interface_method(resolved_klass,resolved_method);

    InstanceKlass* inst = InstanceKlass::cast(recv_klass());
    selected_method = methodHandle(THREAD, inst->method_at_vtable(vtable_index));
  } else {
    // On this path resolved_method is not a miranda method; it needs dynamic dispatch and must have a valid vtable index
    vtable_index = resolved_method->vtable_index();

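    // Some methods look as if they need dynamic dispatch, but a final method can be statically bound and called directly.
    // A final method is not placed in the vtable unless it overrides a method of a superclass.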
    if (vtable_index == Method::nonvirtual_vtable_index) {
      selected_method = resolved_method;
    } else {
      // Dispatch dynamically using the vtable of inst and vtable_index
      InstanceKlass* inst = (InstanceKlass*)recv_klass();
      selected_method = methodHandle(THREAD, inst->method_at_vtable(vtable_index));
    }
  }  
 
  // Set up result: it is a CallInfo, which receives the information produced by linking
  result.set_virtual(resolved_klass, recv_klass, resolved_method, selected_method, vtable_index, CHECK);
}

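For a miranda method, LinkResolver::vtable_index_of_interface_method() is called to find the vtable index. For a final method, resolved_method is itself the call target, because a final method cannot be overridden by a subclass. In all remaining cases the method is dispatched dynamically using the vtable and vtable_index.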
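
The last two cases can be sketched with a toy vtable. ToyMethod, ToyKlass, and the kNonVirtual sentinel are hypothetical names chosen for this illustration, and the miranda branch is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sentinel playing the role of Method::nonvirtual_vtable_index.
const int kNonVirtual = -1;

struct ToyMethod {
  std::string owner;         // name of the declaring class, for illustration
  int         vtable_index;  // kNonVirtual for statically bound methods
};

struct ToyKlass {
  std::vector<const ToyMethod*> vtable;  // one slot per virtual method
};

// Picks the method that will actually run for the given receiver class.
inline const ToyMethod* select_method(const ToyMethod* resolved,
                                      const ToyKlass* recv_klass) {
  if (resolved->vtable_index == kNonVirtual) {
    return resolved;  // statically bound: the resolved method is the target
  }
  // Dynamic dispatch: same slot number, but in the receiver's own vtable.
  return recv_klass->vtable.at(resolved->vtable_index);
}
```

If a subclass overrides the method, its vtable holds the override in the same slot, so select_method() returns the subclass version even though resolution found the superclass version.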

The function gathers all the information needed for the call and stores it in result, which is of type CallInfo.

Once that information is in the CallInfo, the ConstantPoolCacheEntry can be filled in from info. We now return to InterpreterRuntime::resolve_invoke().

Part 3 of InterpreterRuntime::resolve_invoke():

switch (info.call_kind()) {
  case CallInfo::direct_call: // direct call
    cache_entry(thread)->set_direct_call(
		  bytecode,
		  info.resolved_method());
    break;
  case CallInfo::vtable_call: // vtable dispatch
    cache_entry(thread)->set_vtable_call(
		  bytecode,
		  info.resolved_method(),
		  info.vtable_index());
    break;
  case CallInfo::itable_call: // itable dispatch
    cache_entry(thread)->set_itable_call(
		  bytecode,
		  info.resolved_method(),
		  info.itable_index());
    break;
  default:  ShouldNotReachHere();
}

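Whether the call is direct or dispatched through the vtable or itable, the relevant information is stored in the constant pool cache entry once the method has been resolved. cache_entry() returns the corresponding ConstantPoolCacheEntry, and set_vtable_call() is then called on it. That function calls the following function to update the ConstantPoolCacheEntry: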

void ConstantPoolCacheEntry::set_direct_or_vtable_call(
 Bytecodes::Code  invoke_code,
 methodHandle     method,
 int              vtable_index
) {
  bool is_vtable_call = (vtable_index >= 0);  // FIXME: split this method on this boolean
 
  int byte_no = -1;
  bool change_to_virtual = false;

  switch (invoke_code) {
    case Bytecodes::_invokeinterface:
       change_to_virtual = true;

    // ...
    // As shown here, an _invokevirtual call is not always dispatched dynamically; it can also be statically bound
    case Bytecodes::_invokevirtual: // we are already inside the ConstantPoolCacheEntry class here
      {
        if (!is_vtable_call) {
          assert(method->can_be_statically_bound(), "");
          // set_f2_as_vfinal_method checks if is_vfinal flag is true.
          set_method_flags(as_TosState(method->result_type()),
                           (                             1      << is_vfinal_shift) |
                           ((method->is_final_method() ? 1 : 0) << is_final_shift)  |
                           ((change_to_virtual         ? 1 : 0) << is_forced_virtual_shift), // a method defined in Object that is invoked through an interface
                           method()->size_of_parameters());
          set_f2_as_vfinal_method(method());
        } else {
          // On this path the method is a non-final method that is not statically bound; it needs dynamic dispatch, so vtable_index is >= 0
          set_method_flags(as_TosState(method->result_type()),
                           ((change_to_virtual ? 1 : 0) << is_forced_virtual_shift),
                           method()->size_of_parameters());
          // For dynamic dispatch, ConstantPoolCacheEntry::_f2 holds the vtable_index
          set_f2(vtable_index);
        }
        byte_no = 2;
        break;
      }
      // ...
  }

  if (byte_no == 1) {
    // invoke_code is neither invokevirtual nor invokeinterface
    set_bytecode_1(invoke_code);
  } else if (byte_no == 2)  {
    if (change_to_virtual) {
      if (method->is_public()) 
         set_bytecode_1(invoke_code);
    } else {
      assert(invoke_code == Bytecodes::_invokevirtual, "");
    }
    // set up for invokevirtual, even if linking for invokeinterface also:
    set_bytecode_2(Bytecodes::_invokevirtual);
  } 
}

After linking, the fields of the ConstantPoolCacheEntry are as shown in the figure below.
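
As a rough illustration of what set_bytecode_2() records, the sketch below packs a bytecode into a 32-bit word. The layout used here ([ bytecode_2 | bytecode_1 | constant pool index ], with 8, 8, and 16 bits) is an assumption for this example, and ToyIndices is a hypothetical type, not the HotSpot class.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit positions, for illustration only.
const int kBytecode1Shift = 16;
const int kBytecode2Shift = 24;
const std::uint32_t kInvokevirtual = 0xb6;  // opcode value of invokevirtual

struct ToyIndices {
  std::uint32_t bits;

  explicit ToyIndices(std::uint16_t cp_index) : bits(cp_index) {}

  void set_bytecode_1(std::uint32_t code) { bits |= (code & 0xFFu) << kBytecode1Shift; }
  void set_bytecode_2(std::uint32_t code) { bits |= (code & 0xFFu) << kBytecode2Shift; }

  std::uint32_t bytecode_1() const { return (bits >> kBytecode1Shift) & 0xFFu; }
  std::uint32_t bytecode_2() const { return (bits >> kBytecode2Shift) & 0xFFu; }
  std::uint16_t cp_index()   const { return static_cast<std::uint16_t>(bits & 0xFFFFu); }

  // An entry counts as linked for invokevirtual once bytecode_2 holds its opcode.
  bool is_resolved_for_invokevirtual() const { return bytecode_2() == kInvokevirtual; }
};
```

A freshly created ToyIndices reports "not resolved"; after set_bytecode_2(kInvokevirtual) the check succeeds, while the constant pool index in the low bits is unchanged.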

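So for invokevirtual, methods are dispatched through the vtable. In the ConstantPoolCacheEntry the _f1 field is unused. As for _f2: if the callee is a non-final virtual method, it holds the index of the target method in the vtable; if the callee is a virtual final method, it points directly to the Method instance of the target.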
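
The dual use of _f2 can be sketched as one word plus a flag that says how to read it. ToyCacheEntry and its members are hypothetical; the real entry keeps the is_vfinal bit inside its flags word rather than in a separate bool.

```cpp
#include <cassert>
#include <cstdint>

struct ToyMethod { const char* name; };  // hypothetical stand-in for Method

class ToyCacheEntry {
 public:
  // Non-final virtual method: _f2 stores the vtable index.
  void set_vtable_call(int vtable_index) {
    _is_vfinal = false;
    _f2 = vtable_index;
  }
  // Virtual final method: _f2 stores the Method pointer itself.
  void set_vfinal_call(ToyMethod* m) {
    _is_vfinal = true;
    _f2 = reinterpret_cast<std::intptr_t>(m);
  }

  bool is_vfinal() const { return _is_vfinal; }

  int f2_as_index() const {
    assert(!_is_vfinal);
    return static_cast<int>(_f2);
  }
  ToyMethod* f2_as_vfinal_method() const {
    assert(_is_vfinal);
    return reinterpret_cast<ToyMethod*>(_f2);
  }

 private:
  std::intptr_t _f2 = 0;
  bool _is_vfinal = false;
};
```

The interpreter-side consumer would test is_vfinal() first and then either jump straight to the method or index the receiver's vtable with f2_as_index().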
