Reading the Android source to understand how Android loads a .so

Original reference: http://bbs.pediy.com/thread-217656.htm

Android Security: how the linker loads a .so, and setting a breakpoint at .init: http://www.blogfshare.com/linker-load-so.html

My source version: android-4.4.4_r1

I use [java.lang.Runtime -> load()] as the example (loadLibrary() ends up on the same path as load(); interested readers can trace it themselves). The corresponding Android source is [srcAndroid/libcore/luni/src/main/java/java/lang/Runtime.java], starting at line 320.

    /**
     * Loads and links the dynamic library that is identified through the
     * specified path. This method is similar to {@link #loadLibrary(String)},
     * but it accepts a full path specification whereas {@code loadLibrary} just
     * accepts the name of the library to load.
     *
     * @param pathName
     *            the absolute (platform dependent) path to the library to load.
     * @throws UnsatisfiedLinkError
     *             if the library can not be loaded.
     */
    public void load(String pathName) {
        load(pathName, VMStack.getCallingClassLoader());
    }

    /*
     * Loads and links the given library without security checks.
     */
    void load(String pathName, ClassLoader loader) {
        if (pathName == null) {
            throw new NullPointerException("pathName == null");
        }
        String error = doLoad(pathName, loader);
        if (error != null) {
            throw new UnsatisfiedLinkError(error);
        }
    }

load() ends up calling doLoad(String name, ClassLoader loader), which is also in Runtime.java:

    private String doLoad(String name, ClassLoader loader) {
        // Android apps are forked from the zygote, so they can't have a custom LD_LIBRARY_PATH,
        // which means that by default an app's shared library directory isn't on LD_LIBRARY_PATH.

        // The PathClassLoader set up by frameworks/base knows the appropriate path, so we can load
        // libraries with no dependencies just fine, but an app that has multiple libraries that
        // depend on each other needed to load them in most-dependent-first order.

        // We added API to Android's dynamic linker so we can update the library path used for
        // the currently-running process. We pull the desired path out of the ClassLoader here
        // and pass it to nativeLoad so that it can call the private dynamic linker API.

        // We didn't just change frameworks/base to update the LD_LIBRARY_PATH once at the
        // beginning because multiple apks can run in the same process and third party code can
        // use its own BaseDexClassLoader.

        // We didn't just add a dlopen_with_custom_LD_LIBRARY_PATH call because we wanted any
        // dlopen(3) calls made from a .so's JNI_OnLoad to work too.

        // So, find out what the native library search path is for the ClassLoader in question...
        String ldLibraryPath = null;
        if (loader != null && loader instanceof BaseDexClassLoader) {
            ldLibraryPath = ((BaseDexClassLoader) loader).getLdLibraryPath();
        }
        // nativeLoad should be synchronized so there's only one LD_LIBRARY_PATH in use regardless
        // of how many ClassLoaders are in the system, but dalvik doesn't support synchronized
        // internal natives.
        synchronized (this) {
            return nativeLoad(name, loader, ldLibraryPath);
        }
    }

    // TODO: should be synchronized, but dalvik doesn't support synchronized internal natives.
    private static native String nativeLoad(String filename, ClassLoader loader, String ldLibraryPath);

doLoad() in turn calls "String nativeLoad(String filename, ClassLoader loader, String ldLibraryPath)". This is a native function, defined in [srcAndroid/dalvik/vm/native/java_lang_Runtime.cpp], starting at line 64:

/*
 * static String nativeLoad(String filename, ClassLoader loader, String ldLibraryPath)
 *
 * Load the specified full path as a dynamic library filled with
 * JNI-compatible methods. Returns null on success, or a failure
 * message on failure.
 */
static void Dalvik_java_lang_Runtime_nativeLoad(const u4* args,
    JValue* pResult)
{
    StringObject* fileNameObj = (StringObject*) args[0];
    Object* classLoader = (Object*) args[1];
    StringObject* ldLibraryPathObj = (StringObject*) args[2];

    assert(fileNameObj != NULL);
    char* fileName = dvmCreateCstrFromString(fileNameObj);

    if (ldLibraryPathObj != NULL) {
        char* ldLibraryPath = dvmCreateCstrFromString(ldLibraryPathObj);
        void* sym = dlsym(RTLD_DEFAULT, "android_update_LD_LIBRARY_PATH");
        if (sym != NULL) {
            typedef void (*Fn)(const char*);
            Fn android_update_LD_LIBRARY_PATH = reinterpret_cast<Fn>(sym);
            (*android_update_LD_LIBRARY_PATH)(ldLibraryPath);
        } else {
            ALOGE("android_update_LD_LIBRARY_PATH not found; .so dependencies will not work!");
        }
        free(ldLibraryPath);
    }

    StringObject* result = NULL;
    char* reason = NULL;
    bool success = dvmLoadNativeCode(fileName, classLoader, &reason);
    if (!success) {
        const char* msg = (reason != NULL) ? reason : "unknown failure";
        result = dvmCreateStringFromCstr(msg);
        dvmReleaseTrackedAlloc((Object*) result, NULL);
    }

    free(reason);
    free(fileName);
    RETURN_PTR(result);
}

This is again mostly argument passing and checking, after which it executes [bool success = dvmLoadNativeCode(fileName, classLoader, &reason);]. Here is the code of dvmLoadNativeCode(...), located in [srcAndroid/dalvik/vm/Native.cpp] at line 301.

/*
 * Load native code from the specified absolute pathname.  Per the spec,
 * if we've already loaded a library with the specified pathname, we
 * return without doing anything.
 *
 * TODO? for better results we should absolutify the pathname.  For fully
 * correct results we should stat to get the inode and compare that.  The
 * existing implementation is fine so long as everybody is using
 * System.loadLibrary.
 *
 * The library will be associated with the specified class loader.  The JNI
 * spec says we can't load the same library into more than one class loader.
 *
 * Returns "true" on success. On failure, sets *detail to a
 * human-readable description of the error or NULL if no detail is
 * available; ownership of the string is transferred to the caller.
 */
bool dvmLoadNativeCode(const char* pathName, Object* classLoader,
        char** detail)
{
    SharedLib* pEntry;
    void* handle;
    bool verbose;

    /* reduce noise by not chattering about system libraries */
    verbose = !!strncmp(pathName, "/system", sizeof("/system")-1);
    verbose = verbose && !!strncmp(pathName, "/vendor", sizeof("/vendor")-1);

    if (verbose)
        ALOGD("Trying to load lib %s %p", pathName, classLoader);

    *detail = NULL;

    /*
     * See if we've already loaded it.  If we have, and the class loader
     * matches, return successfully without doing anything.
     */
    pEntry = findSharedLibEntry(pathName);
    if (pEntry != NULL) {
        if (pEntry->classLoader != classLoader) {
            ALOGW("Shared lib '%s' already opened by CL %p; can't open in %p",
                pathName, pEntry->classLoader, classLoader);
            return false;
        }
        if (verbose) {
            ALOGD("Shared lib '%s' already loaded in same CL %p",
                pathName, classLoader);
        }
        if (!checkOnLoadResult(pEntry))
            return false;
        return true;
    }

    /*
     * Open the shared library.  Because we're using a full path, the system
     * doesn't have to search through LD_LIBRARY_PATH.  (It may do so to
     * resolve this library's dependencies though.)
     *
     * Failures here are expected when java.library.path has several entries
     * and we have to hunt for the lib.
     *
     * The current version of the dynamic linker prints detailed information
     * about dlopen() failures.  Some things to check if the message is
     * cryptic:
     *   - make sure the library exists on the device
     *   - verify that the right path is being opened (the debug log message
     *     above can help with that)
     *   - check to see if the library is valid (e.g. not zero bytes long)
     *   - check config/prelink-linux-arm.map to ensure that the library
     *     is listed and is not being overrun by the previous entry (if
     *     loading suddenly stops working on a prelinked library, this is
     *     a good one to check)
     *   - write a trivial app that calls sleep() then dlopen(), attach
     *     to it with "strace -p <pid>" while it sleeps, and watch for
     *     attempts to open nonexistent dependent shared libs
     *
     * This can execute slowly for a large library on a busy system, so we
     * want to switch from RUNNING to VMWAIT while it executes.  This allows
     * the GC to ignore us.
     */
    Thread* self = dvmThreadSelf();
    ThreadStatus oldStatus = dvmChangeStatus(self, THREAD_VMWAIT);
    handle = dlopen(pathName, RTLD_LAZY);
    dvmChangeStatus(self, oldStatus);

    if (handle == NULL) {
        *detail = strdup(dlerror());
        ALOGE("dlopen(\"%s\") failed: %s", pathName, *detail);
        return false;
    }

    /* create a new entry */
    SharedLib* pNewEntry;
    pNewEntry = (SharedLib*) calloc(1, sizeof(SharedLib));
    pNewEntry->pathName = strdup(pathName);
    pNewEntry->handle = handle;
    pNewEntry->classLoader = classLoader;
    dvmInitMutex(&pNewEntry->onLoadLock);
    pthread_cond_init(&pNewEntry->onLoadCond, NULL);
    pNewEntry->onLoadThreadId = self->threadId;

    /* try to add it to the list */
    SharedLib* pActualEntry = addSharedLibEntry(pNewEntry);

    if (pNewEntry != pActualEntry) {
        ALOGI("WOW: we lost a race to add a shared lib (%s CL=%p)",
            pathName, classLoader);
        freeSharedLibEntry(pNewEntry);
        return checkOnLoadResult(pActualEntry);
    } else {
        if (verbose)
            ALOGD("Added shared lib %s %p", pathName, classLoader);

        bool result = false;
        void* vonLoad;
        int version;

        vonLoad = dlsym(handle, "JNI_OnLoad");
        if (vonLoad == NULL) {
            ALOGD("No JNI_OnLoad found in %s %p, skipping init", pathName, classLoader);
            result = true;
        } else {
            /*
             * Call JNI_OnLoad.  We have to override the current class
             * loader, which will always be "null" since the stuff at the
             * top of the stack is around Runtime.loadLibrary().  (See
             * the comments in the JNI FindClass function.)
             */
            OnLoadFunc func = (OnLoadFunc)vonLoad;
            Object* prevOverride = self->classLoaderOverride;

            self->classLoaderOverride = classLoader;
            oldStatus = dvmChangeStatus(self, THREAD_NATIVE);
            if (gDvm.verboseJni) {
                ALOGI("[Calling JNI_OnLoad for \"%s\"]", pathName);
            }
            version = (*func)(gDvmJni.jniVm, NULL);
            dvmChangeStatus(self, oldStatus);
            self->classLoaderOverride = prevOverride;

            if (version == JNI_ERR) {
                *detail = strdup(StringPrintf("JNI_ERR returned from JNI_OnLoad in \"%s\"",
                                              pathName).c_str());
            } else if (dvmIsBadJniVersion(version)) {
                *detail = strdup(StringPrintf("Bad JNI version returned from JNI_OnLoad in \"%s\": %d",
                                              pathName, version).c_str());
                /*
                 * It's unwise to call dlclose() here, but we can mark it
                 * as bad and ensure that future load attempts will fail.
                 *
                 * We don't know how far JNI_OnLoad got, so there could
                 * be some partially-initialized stuff accessible through
                 * newly-registered native method calls.  We could try to
                 * unregister them, but that doesn't seem worthwhile.
                 */
            } else {
                result = true;
            }
            if (gDvm.verboseJni) {
                ALOGI("[Returned %s from JNI_OnLoad for \"%s\"]",
                      (result ? "successfully" : "failure"), pathName);
            }
        }

        if (result)
            pNewEntry->onLoadResult = kOnLoadOkay;
        else
            pNewEntry->onLoadResult = kOnLoadFailed;

        pNewEntry->onLoadThreadId = 0;

        /*
         * Broadcast a wakeup to anybody sleeping on the condition variable.
         */
        dvmLockMutex(&pNewEntry->onLoadLock);
        pthread_cond_broadcast(&pNewEntry->onLoadCond);
        dvmUnlockMutex(&pNewEntry->onLoadLock);
        return result;
    }
}
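
How dvmLoadNativeCode treats the value returned by JNI_OnLoad can be sketched on its own. This is a minimal sketch, not Dalvik code: the constants are redeclared with the values from jni.h so that it compiles without the NDK, the helper names are hypothetical, and it assumes that dvmIsBadJniVersion accepts only JNI 1.2, 1.4 and 1.6.

```cpp
#include <cstdint>
#include <string>

// Values as defined in jni.h, redeclared so the sketch has no NDK dependency.
constexpr int32_t kJniErr = -1;                 // JNI_ERR
constexpr int32_t kJniVersion1_2 = 0x00010002;  // JNI_VERSION_1_2
constexpr int32_t kJniVersion1_4 = 0x00010004;  // JNI_VERSION_1_4
constexpr int32_t kJniVersion1_6 = 0x00010006;  // JNI_VERSION_1_6

// Assumption: same rule as dvmIsBadJniVersion (only 1.2, 1.4 and 1.6 are accepted).
inline bool is_bad_jni_version(int32_t version) {
    return version != kJniVersion1_2 && version != kJniVersion1_4 &&
           version != kJniVersion1_6;
}

enum class OnLoadResult { kOkay, kFailed };

// Hypothetical helper: classifies a JNI_OnLoad return value the way the
// listing above does, filling `detail` with a message on failure.
inline OnLoadResult classify_onload_return(int32_t version, std::string& detail) {
    detail.clear();
    if (version == kJniErr) {
        detail = "JNI_ERR returned from JNI_OnLoad";
        return OnLoadResult::kFailed;
    }
    if (is_bad_jni_version(version)) {
        detail = "Bad JNI version returned from JNI_OnLoad: " + std::to_string(version);
        return OnLoadResult::kFailed;
    }
    return OnLoadResult::kOkay;
}
```

In practice this means a library whose JNI_OnLoad returns anything other than one of the accepted version constants is recorded as kOnLoadFailed, even though dlopen itself succeeded and the .init code has already run.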

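dvmLoadNativeCode performs some routine checks that I won't go through. The important line is [version = (*func)(gDvmJni.jniVm, NULL);], which is where JNI_OnLoad is called. Just before it is the log call [ALOGI("[Calling JNI_OnLoad for \"%s\"]", pathName);]; make a note of that string, because it makes this spot easy to find when reversing.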

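From reversing experience, whatever the .init and .init_array sections define runs before JNI_OnLoad. Since dlopen is the function that loads the SO, it is the likely place where .init gets executed. Here is dlopen, defined in [srcAndroid/bionic/linker/dlfcn.cpp] at line 63.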

void* dlopen(const char* filename, int flags) {
  ScopedPthreadMutexLocker locker(&gDlMutex);
  soinfo* result = do_dlopen(filename, flags);
  if (result == NULL) {
    __bionic_format_dlerror("dlopen failed", linker_get_error_buffer());
    return NULL;
  }
  return result;
}

So dlopen just calls do_dlopen, which is defined in [srcAndroid/bionic/linker/linker.cpp] at line 823. The code is as follows.

soinfo* do_dlopen(const char* name, int flags) {
  if ((flags & ~(RTLD_NOW|RTLD_LAZY|RTLD_LOCAL|RTLD_GLOBAL)) != 0) {
    DL_ERR("invalid flags to dlopen: %x", flags);
    return NULL;
  }
  set_soinfo_pool_protection(PROT_READ | PROT_WRITE);
  soinfo* si = find_library(name);  // look up the SO; if it is not loaded yet, load it
  if (si != NULL) {
    si->CallConstructors();  // call the SO's init functions
  }
  set_soinfo_pool_protection(PROT_READ);
  return si;
}
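
The flag check at the top of do_dlopen can be tried in isolation. The following is a small sketch that reuses the same mask with the RTLD_* constants from the host's <dlfcn.h>; the helper name dlopen_flags_are_valid is hypothetical and does not exist in the linker.

```cpp
#include <dlfcn.h>

// Same mask as in do_dlopen: any bit outside these four flags is rejected.
constexpr int kAllowedDlopenFlags = RTLD_NOW | RTLD_LAZY | RTLD_LOCAL | RTLD_GLOBAL;

// Hypothetical helper: true when do_dlopen's first check would let the flags through.
inline bool dlopen_flags_are_valid(int flags) {
    return (flags & ~kAllowedDlopenFlags) == 0;
}
```

Dalvik always passes RTLD_LAZY (see dvmLoadNativeCode), so this check only matters for native code that calls dlopen directly with unusual flags.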

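do_dlopen does two things first: it checks that the flags are valid for dlopen, and find_library determines whether the SO has already been loaded, loading it if it has not. If a soinfo comes back, it calls [si->CallConstructors();], which returns immediately when the constructors of that SO have already run. Here is the definition of CallConstructors(), also in linker.cpp, at line 1192: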

void soinfo::CallConstructors() {
  if (constructors_called) {
    return;
  }

  // We set constructors_called before actually calling the constructors, otherwise it doesn't
  // protect against recursive constructor calls. One simple example of constructor recursion
  // is the libc debug malloc, which is implemented in libc_malloc_debug_leak.so:
  // 1. The program depends on libc, so libc's constructor is called here.
  // 2. The libc constructor calls dlopen() to load libc_malloc_debug_leak.so.
  // 3. dlopen() calls the constructors on the newly created
  //    soinfo for libc_malloc_debug_leak.so.
  // 4. The debug .so depends on libc, so CallConstructors is
  //    called again with the libc soinfo. If it doesn't trigger the early-
  //    out above, the libc constructor will be called again (recursively!).
  constructors_called = true;

  if ((flags & FLAG_EXE) == 0 && preinit_array != NULL) {
    // The GNU dynamic linker silently ignores these, but we warn the developer.
    PRINT("\"%s\": ignoring %d-entry DT_PREINIT_ARRAY in shared library!",
          name, preinit_array_count);
  }

  if (dynamic != NULL) {
    for (Elf32_Dyn* d = dynamic; d->d_tag != DT_NULL; ++d) {
      if (d->d_tag == DT_NEEDED) {
        const char* library_name = strtab + d->d_un.d_val;
        TRACE("\"%s\": calling constructors in DT_NEEDED \"%s\"", name, library_name);
        find_loaded_library(library_name)->CallConstructors();
      }
    }
  }

  TRACE("\"%s\": calling constructors", name);

  // DT_INIT should be called before DT_INIT_ARRAY if both are present.
  CallFunction("DT_INIT", init_func);
  CallArray("DT_INIT_ARRAY", init_array, init_array_count, false);
}
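
The ordering that CallConstructors enforces (dependencies first, then DT_INIT, then DT_INIT_ARRAY, with the early-out breaking dependency cycles) is easier to see in a toy model. The sketch below is not linker code: ToyLib, ToyLibTable and call_constructors are hypothetical names, and instead of running anything it only records a trace of what would run.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for soinfo, reduced to what the ordering depends on.
struct ToyLib {
    std::vector<std::string> needed;   // DT_NEEDED entries
    bool has_init = false;             // library has a DT_INIT function
    int init_array_count = 0;          // number of DT_INIT_ARRAY entries
    bool constructors_called = false;  // same role as soinfo::constructors_called
};

using ToyLibTable = std::map<std::string, ToyLib>;

// Appends to `log`, in order, what the real linker would call.
// Throws std::out_of_range if `name` or a dependency is missing from `libs`.
inline void call_constructors(ToyLibTable& libs, const std::string& name,
                              std::vector<std::string>& log) {
    ToyLib& lib = libs.at(name);
    if (lib.constructors_called) {
        return;  // early-out, as in the listing
    }
    lib.constructors_called = true;  // set before recursing so cycles terminate

    for (const std::string& dep : lib.needed) {
        call_constructors(libs, dep, log);  // dependencies run first
    }
    if (lib.has_init) {
        log.push_back(name + ":DT_INIT");
    }
    for (int i = 0; i < lib.init_array_count; ++i) {
        log.push_back(name + ":DT_INIT_ARRAY[" + std::to_string(i) + "]");
    }
}
```

For an analyst the consequence is that, by the time the target library's own DT_INIT runs, the constructors of every library it lists in DT_NEEDED have already finished.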

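The key part is the last two calls, [CallFunction("DT_INIT", init_func);] and [CallArray("DT_INIT_ARRAY", init_array, init_array_count, false);], which clearly execute what .init and .init_array define. I won't paste the code of CallArray here, since it just calls CallFunction in a loop. Here is the code of CallFunction, at linker.cpp line 1172.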

void soinfo::CallFunction(const char* function_name UNUSED, linker_function_t function) {
  if (function == NULL || reinterpret_cast<uintptr_t>(function) == static_cast<uintptr_t>(-1)) {
    return;
  }

  TRACE("[ Calling %s @ %p for '%s' ]", function_name, function, name);
  function();
  TRACE("[ Done calling %s @ %p for '%s' ]", function_name, function, name);

  // The function may have called dlopen(3) or dlclose(3), so we need to ensure our data structures
  // are still writable. This happens with our debug malloc (see http://b/7941716).
  set_soinfo_pool_protection(PROT_READ | PROT_WRITE);
}
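
Since the CallArray source is not reproduced in this post, here is a rough sketch of the loop it is described as performing. It is an approximation rather than the bionic source: the function names are hypothetical, the guard is copied from CallFunction above, and treating the final bool argument as "walk the array in reverse" is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

using linker_function_t = void (*)();

// Same guard as CallFunction: null and -1 entries are skipped, not called.
// Returns true if the function was actually invoked.
inline bool call_function(linker_function_t function) {
    if (function == nullptr ||
        reinterpret_cast<uintptr_t>(function) == static_cast<uintptr_t>(-1)) {
        return false;
    }
    function();
    return true;
}

// Assumed shape of CallArray: hand every entry to call_function, front to back,
// or back to front when `reverse` is true. Returns how many entries were invoked.
inline size_t call_array(linker_function_t* functions, size_t count, bool reverse) {
    if (functions == nullptr) {
        return 0;
    }
    size_t called = 0;
    for (size_t i = 0; i < count; ++i) {
        const size_t index = reverse ? (count - 1 - i) : i;
        if (call_function(functions[index])) {
            ++called;
        }
    }
    return called;
}
```

Whatever the exact loop looks like, every .init_array entry passes through the single indirect call inside CallFunction, so one breakpoint there catches DT_INIT and all DT_INIT_ARRAY entries.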

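The line [function();] confirms that this is where the contents of .init and .init_array are finally executed. As before, make a note of [TRACE("[ Calling %s @ %p for '%s' ]", function_name, function, name);] so that this spot is easy to find when reversing.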

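Conclusion: when the system loads a so, once loading, mapping and relocation are done, it first executes the code in .init and .init_array, and only afterwards calls JNI_OnLoad if the library has one. So when analyzing a so, first check whether it has an .init_array section or an .init section, because a packed so usually unpacks itself in its initialization functions.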
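
The ordering in this conclusion can be reproduced on a desktop toolchain. The sketch below assumes GCC or Clang, where __attribute__((constructor)) places a pointer to the function in .init_array; fake_jni_onload is a hypothetical stand-in for JNI_OnLoad, which in a real library takes (JavaVM*, void*) and returns a JNI version.

```cpp
#include <string>
#include <vector>

// Records the order in which the functions below run. A function-local static
// is used so the vector is guaranteed to exist when the constructor fires.
inline std::vector<std::string>& init_trace() {
    static std::vector<std::string> trace;
    return trace;
}

// GCC/Clang emit a pointer to this function into .init_array, so the loader
// calls it before any code that is invoked explicitly after loading.
__attribute__((constructor)) static void early_init() {
    init_trace().push_back("init_array");
}

// Stand-in for JNI_OnLoad: it only runs when something calls it explicitly,
// just as Dalvik calls the real one after dlopen has returned.
inline void fake_jni_onload() {
    init_trace().push_back("JNI_OnLoad");
}
```

A packer can use exactly this hook: by the time a breakpoint on JNI_OnLoad is hit, early_init (or its obfuscated equivalent) has already finished.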

 

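Putting it all together,

the procedure for locating the functions in a so's init entry point is:
1. Open the libdvm.so module and search for the dvmLoadNativeCode function.
2. Analyze dvmLoadNativeCode, find the call to dlopen, and step into it.
3. This step may differ between ROM versions. On a 4.4.2 ROM, dlopen is a wrapper around do_dlopen, so step into do_dlopen(soName, flags).
4. do_dlopen calls find_library(soName), which loads the so and returns a soinfo object.
5. The soinfo object's member function CallConstructors() is then called to run the library's initialization code.
6. CallConstructors() calls CallFunction("DT_INIT", init_func); the callback init_func is the function from the .init section.
7. Step into CallFunction and find BLX R4, which is the call to init_func.
The method above is admittedly rather complicated. An alternative is to flash a debug ROM and use the characteristic strings in do_dlopen, dlopen, or CallFunction to jump straight to the key point (BLX R4). But some phones are hard to flash with a stock debug ROM, my Huawei test device being one I can only sigh at, so it is well worth knowing the method above.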
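
Before setting breakpoints as described above, it helps to know whether the target so has DT_INIT or DT_INIT_ARRAY entries at all. The following sketch scans an in-memory 32-bit dynamic section for them. It makes several assumptions: Dyn32 is a simplified stand-in for Elf32_Dyn, the tag values are the ones from the ELF specification, find_init_entries is a hypothetical helper, and the returned values are addresses relative to the load base, not absolute pointers.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified 32-bit dynamic entry with the same layout as Elf32_Dyn.
struct Dyn32 {
    int32_t d_tag;
    uint32_t d_val;  // stands for d_un.d_val / d_un.d_ptr
};

// Tag values from the ELF specification.
constexpr int32_t kDtNull = 0;          // DT_NULL, terminates the array
constexpr int32_t kDtInit = 12;         // DT_INIT
constexpr int32_t kDtInitArray = 25;    // DT_INIT_ARRAY
constexpr int32_t kDtInitArraySz = 27;  // DT_INIT_ARRAYSZ, size in bytes

struct InitInfo {
    uint32_t init_func = 0;         // address of the .init function, 0 if absent
    uint32_t init_array = 0;        // address of .init_array, 0 if absent
    uint32_t init_array_count = 0;  // number of 32-bit entries in .init_array
};

// Walks the DT_NULL-terminated dynamic array and collects the init entries.
inline InitInfo find_init_entries(const Dyn32* dynamic) {
    InitInfo info;
    if (dynamic == nullptr) {
        return info;
    }
    for (const Dyn32* d = dynamic; d->d_tag != kDtNull; ++d) {
        switch (d->d_tag) {
            case kDtInit:
                info.init_func = d->d_val;
                break;
            case kDtInitArray:
                info.init_array = d->d_val;
                break;
            case kDtInitArraySz:
                info.init_array_count = d->d_val / sizeof(uint32_t);
                break;
            default:
                break;
        }
    }
    return info;
}
```

If both init_func and init_array come back as 0, there is nothing for CallFunction to call for that library and the analysis can start at JNI_OnLoad.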

 

Posted on 2017-05-24 14:50 by 寻步