ImportError: dlopen: cannot load any more object with static TLS: locating the root cause and how to fix it

The thread at https://github.com/pytorch/pytorch/issues/2575 discusses this problem in depth. Besides the common workaround of changing the import order, it also identifies the root cause:

 

By suo

Summarizing what I've learned so far. I may be wrong about any of these things, so please correct/take with a grain of salt.

glibc has a table called the DTV. There is a slot for every dlopen'd library. Its use is not important for this discussion.

The DTV is resizable. However, in older versions of glibc, adding a library with static TLS will not resize the DTV, but do a conservative check that amounts to "have I loaded more than 14 libraries of any kind".
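As a rough diagnostic for the "how many libraries are already loaded" question above, the sketch below counts the distinct shared-object files mapped into a process by parsing text in the `/proc/<pid>/maps` format. This is an assumption-laden approximation: it is Linux-only, the helper name `count_shared_objects` is made up for illustration, and the count includes libraries loaded at program start, so it is not the exact number glibc compares against.

```python
import re


def count_shared_objects(maps_text):
    """Count distinct shared-object files named in /proc/<pid>/maps text.

    Hypothetical helper: an approximation, not the check glibc performs.
    """
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            # Match names such as libfoo.so, libc.so.6, libc-2.19.so
            if re.search(r"\.so(\.[0-9]+)*$", path):
                paths.add(path)
    return len(paths)


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            print("shared objects mapped:", count_shared_objects(f.read()))
    except OSError:
        print("/proc/self/maps is not available on this platform")
```

Calling it before and after a suspect `import` gives a sense of how many libraries that import pulls in.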

That's why #24911 failed to fix anything, because reducing the amount of thread-local storage is irrelevant.

That's also why changing import order can fix things, because if you change it in a way that loads all your "static TLS" libraries first, then future "dynamic TLS" libraries will resize the DTV like normal.
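The import-order workaround can be sketched as a small helper that imports modules strictly in the order given, so the module that raises the static TLS error goes first in the list. The helper name `import_in_order` is hypothetical, and which of your modules actually use static TLS depends on your environment. The standard-library names below are only stand-ins.

```python
import importlib


def import_in_order(module_names):
    """Import modules one by one in the given order.

    Returns a dict of the modules that loaded; failures are reported
    and skipped. Hypothetical helper for illustration only.
    """
    loaded = {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as exc:
            # The static TLS failure surfaces as an ImportError as well.
            print(f"could not import {name}: {exc}")
    return loaded


# Stand-in names; in practice, list the module that fails with the
# static TLS error first, followed by everything else.
modules = import_in_order(["json", "math"])
```

Running this at the very top of the entry-point script, before any other import, keeps the order under your control.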

It seems this issue was fixed by a glibc patch in 2014, which eliminates this check and lazily updates the DTV. (Solution 1: upgrade the glibc library that Python depends on.)

 

 

One additional takeaway: this issue was fixed by a 2014 glibc patch. This patch is in the glibc distributed with Xenial.
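To see whether a given machine is likely affected, one option is to compare the glibc version the Python interpreter is linked against with a threshold. The sketch below assumes 2.23 as that threshold, because that is the glibc release shipped with Xenial. The fix may have landed in an earlier release, so treat the constant and the function names as assumptions, not as facts from the thread.

```python
import platform
import re

# Assumption: glibc 2.23 is the version shipped with Ubuntu Xenial.
# The actual fix may be present in earlier releases.
ASSUMED_FIXED_VERSION = (2, 23)


def parse_glibc_version(text):
    """Return (major, minor) from a string like '2.19', or None."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def glibc_has_fix(version_text, fixed=ASSUMED_FIXED_VERSION):
    """True if the version is at least the assumed fixed version."""
    version = parse_glibc_version(version_text)
    return version is not None and version >= fixed


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc":
        print(f"glibc {version}, assumed fixed: {glibc_has_fix(version)}")
    else:
        print("could not detect glibc for this interpreter")
```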

Trusty is considered unsupported by Ubuntu (the LTS commitment expired in April of this year). We are also removing trusty from our CI (cc @jamesr66a). So if you want this problem to really really (tm) go away, upgrading your Linux is another path. (Solution 2: upgrade the operating system version, and upgrade or rebuild the Python development environment along with it.)

posted on 2020-03-15 22:15  兵者
