[Repost] Troubleshooting a NoHttpResponseException in a production environment

https://www.freesion.com/article/41531004212/

Environment:

JDK 1.8 + Tomcat 8 + HttpClient 4.5.2

Main symptom:

The project intermittently throws org.apache.http.NoHttpResponseException: The target server failed to respond.

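Root cause:

After some research, this exception turns out to be a failure mode of persistent (keep-alive) connections. When a connection has been idle on the server for longer than the server's keep-alive timeout, the server closes it with the usual four-way TCP teardown. If the client then picks up that same connection and sends a request over it, the request fails with NoHttpResponseException.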

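Resolution:

With the cause known, the fix can be targeted. There are two main approaches:

Option 1: increase the server-side keep-alive timeout. With Tomcat, for example, set the keepAliveTimeout attribute on the Connector element.

Option 2: shorten the client-side keep-alive time, so that the client closes an idle connection before the server does.

Option 1 only makes the problem less frequent; it does not eliminate it. The keep-alive timeout should not be set to -1 (never time out), because holding connections open indefinitely severely hurts server performance.

The rest of this post covers option 2, using HttpClient 4.5.2 as the example.

Here is the final code first: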

    // SslUtils is a project-specific helper that is not shown in this post
    SSLContext sslcontext = SslUtils.createIgnoreVerifySSL();
    // Register the socket factories that handle the http and https schemes
    Registry<ConnectionSocketFactory> socketFactoryRegistry = RegistryBuilder.<ConnectionSocketFactory>create()
            .register("http", PlainConnectionSocketFactory.INSTANCE)
            .register("https", new SSLConnectionSocketFactory(sslcontext))
            .build();
    ConnectionKeepAliveStrategy connectionKeepAliveStrategy = (final HttpResponse response, final HttpContext context) -> {
        Args.notNull(response, "HTTP response");
        final HeaderElementIterator it = new BasicHeaderElementIterator(
                response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            final HeaderElement he = it.nextElement();
            final String param = he.getName();
            final String value = he.getValue();
            if (value != null && param.equalsIgnoreCase("timeout")) {
                try {
                    return Long.parseLong(value) * 1000;
                } catch (final NumberFormatException ignore) {
                }
            }
        }
        // Keep-alive fallback of 3 seconds: the client treats the connection as valid for at most 3 seconds.
        // A connection older than that is closed when it is leased; see
        // org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(): entry.isExpired(System.currentTimeMillis())
        return 3 * 1000;
    };
    PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager(socketFactoryRegistry);
    connManager.setDefaultMaxPerRoute(10);
    connManager.setMaxTotal(100);
    // After a connection is leased, check again whether it has been idle too long; see
    // org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking():
    // entry.getUpdated() + this.validateAfterInactivity <= System.currentTimeMillis()
    connManager.setValidateAfterInactivity(3000);
    // evictIdleConnections: periodically reclaim idle connections before they time out.
    // Pool limits here: at most 10 connections per route and 100 in total.
    // Note: the evictIdleConnections thread sleeps for one maxIdle period when it starts.
    // Create the customized HttpClient instance
    CloseableHttpClient client = HttpClients.custom()
            // Note: the builder's setMaxConnPerRoute and setMaxConnTotal do not override the values set on connManager
            .setConnectionManager(connManager)
            .setConnectionManagerShared(false)
            .evictIdleConnections(3000, TimeUnit.MILLISECONDS)
            .setKeepAliveStrategy(connectionKeepAliveStrategy)
            // If the endpoint is idempotent and retries are acceptable, comment out disableAutomaticRetries().
            // The default is 3 retries; each retry leases a connection from the pool instead of opening a new one directly.
            .disableAutomaticRetries()
            .build();
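
The keep-alive strategy above comes down to one decision: use the timeout the server advertises in its Keep-Alive header if there is one, otherwise fall back to 3 seconds. The following is a dependency-free sketch of that decision only, not HttpClient code; the class name KeepAliveSketch and the method keepAliveMillis are hypothetical, and the header is passed in as a plain string.

```java
/** Dependency-free sketch of the keep-alive decision; all names are hypothetical. */
final class KeepAliveSketch {

    /** Fallback when the server sends no usable timeout (3 seconds, as in the post). */
    static final long FALLBACK_MILLIS = 3_000L;

    /**
     * Parses a Keep-Alive header value such as "timeout=5, max=100" and
     * returns the timeout in milliseconds, or the fallback if none is found.
     */
    static long keepAliveMillis(String headerValue) {
        if (headerValue != null) {
            for (String element : headerValue.split(",")) {
                String[] pair = element.trim().split("=", 2);
                if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                    try {
                        return Long.parseLong(pair[1].trim()) * 1000L;
                    } catch (NumberFormatException ignore) {
                        // Malformed number: keep scanning, then fall back.
                    }
                }
            }
        }
        return FALLBACK_MILLIS;
    }

    public static void main(String[] args) {
        System.out.println(keepAliveMillis("timeout=5, max=100")); // 5000
        System.out.println(keepAliveMillis(null));                 // 3000
    }
}
```

Note that a server-advertised timeout takes precedence, so the 3-second fallback only applies when the response carries no Keep-Alive timeout.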

Key configuration parameters:

  1. org.apache.http.impl.conn.PoolingHttpClientConnectionManager#setValidateAfterInactivity
  2. org.apache.http.impl.client.HttpClientBuilder#setConnectionManagerShared
  3. org.apache.http.impl.client.HttpClientBuilder#evictIdleConnections(long, java.util.concurrent.TimeUnit)
  4. org.apache.http.impl.client.HttpClientBuilder#setKeepAliveStrategy
  5. org.apache.http.impl.client.HttpClientBuilder#disableAutomaticRetries

org.apache.http.impl.conn.PoolingHttpClientConnectionManager#setValidateAfterInactivity

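  After an idle connection is taken from the pool, and before it is used, the pool checks whether the connection has been idle for longer than the configured time (in milliseconds) and validates it if so. Note that if, like the author, you create the manager with

PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager(socketFactoryRegistry);

then the constructor sets this value to 2000 ms. (PS: this default is the reason the author could never reproduce NoHttpResponseException in the local environment.)

Constructor path:

org.apache.http.impl.conn.PoolingHttpClientConnectionManager#PoolingHttpClientConnectionManager(org.apache.http.conn.HttpClientConnectionOperator, org.apache.http.conn.HttpConnectionFactory<org.apache.http.conn.routing.HttpRoute,org.apache.http.conn.ManagedHttpClientConnection>, long, java.util.concurrent.TimeUnit)

In principle, configuring this one setting is enough: a connection that has been idle too long is validated, closed if it turns out to be stale, and another one is taken from the pool (if that one is stale too, it is closed and the next one is taken). Whichever of the other options you use, always set this value explicitly; otherwise connections are checked against the default 2-second threshold. For those interested, here is the source: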

    private E getPoolEntryBlocking(
            final T route, final Object state,
            final long timeout, final TimeUnit tunit,
            final PoolEntryFuture<E> future)
                throws IOException, InterruptedException, TimeoutException {

        Date deadline = null;
        if (timeout > 0) {
            deadline = new Date
                (System.currentTimeMillis() + tunit.toMillis(timeout));
        }

        this.lock.lock();
        try {
            final RouteSpecificPool<T, C, E> pool = getPool(route);
            E entry = null;
            while (entry == null) {
                Asserts.check(!this.isShutDown, "Connection pool shut down");
                for (;;) {
                    entry = pool.getFree(state);
                    if (entry == null) {
                        break;
                    }
                    if (entry.isExpired(System.currentTimeMillis())) {
                        entry.close();
                    } else if (this.validateAfterInactivity > 0) {
                        // Look here: is the entry's last-updated time + validateAfterInactivity <= now?
                        if (entry.getUpdated() + this.validateAfterInactivity <= System.currentTimeMillis()) {
                            if (!validate(entry)) {
                                entry.close();
                            }
                        }
                    }
                    if (entry.isClosed()) {
                        this.available.remove(entry);
                        pool.free(entry, false);
                    } else {
                        break;
                    }
                }
                if (entry != null) {
                    this.available.remove(entry);
                    this.leased.add(entry);
                    onReuse(entry);
                    return entry;
                }

                // New connection is needed
                final int maxPerRoute = getMax(route);
                // Shrink the pool prior to allocating a new connection
                final int excess = Math.max(0, pool.getAllocatedCount() + 1 - maxPerRoute);
                if (excess > 0) {
                    for (int i = 0; i < excess; i++) {
                        final E lastUsed = pool.getLastUsed();
                        if (lastUsed == null) {
                            break;
                        }
                        lastUsed.close();
                        this.available.remove(lastUsed);
                        pool.remove(lastUsed);
                    }
                }

                if (pool.getAllocatedCount() < maxPerRoute) {
                    final int totalUsed = this.leased.size();
                    final int freeCapacity = Math.max(this.maxTotal - totalUsed, 0);
                    if (freeCapacity > 0) {
                        final int totalAvailable = this.available.size();
                        if (totalAvailable > freeCapacity - 1) {
                            if (!this.available.isEmpty()) {
                                final E lastUsed = this.available.removeLast();
                                lastUsed.close();
                                final RouteSpecificPool<T, C, E> otherpool = getPool(lastUsed.getRoute());
                                otherpool.remove(lastUsed);
                            }
                        }
                        final C conn = this.connFactory.create(route);
                        entry = pool.add(conn);
                        this.leased.add(entry);
                        return entry;
                    }
                }

                boolean success = false;
                try {
                    pool.queue(future);
                    this.pending.add(future);
                    success = future.await(deadline);
                } finally {
                    // In case of 'success', we were woken up by the
                    // connection pool and should now have a connection
                    // waiting for us, or else we're shutting down.
                    // Just continue in the loop, both cases are checked.
                    pool.unqueue(future);
                    this.pending.remove(future);
                }
                // check for spurious wakeup vs. timeout
                if (!success && (deadline != null) &&
                    (deadline.getTime() <= System.currentTimeMillis())) {
                    break;
                }
            }
            throw new TimeoutException("Timeout waiting for connection");
        } finally {
            this.lock.unlock();
        }
    }

Method path:

org.apache.http.pool.AbstractConnPool#getPoolEntryBlocking
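
The lease-time checks in getPoolEntryBlocking can be reduced to a three-way decision on timestamps. The sketch below restates only that decision in plain Java so the boundary cases are easy to see; the class LeaseCheckSketch, the Action enum and the onLease method are hypothetical names, and the real pool additionally performs the actual validation and bookkeeping.

```java
/** Plain-Java restatement of the lease-time checks; all names are hypothetical. */
final class LeaseCheckSketch {

    enum Action { CLOSE_EXPIRED, VALIDATE_FIRST, REUSE }

    /**
     * Decides what happens to a pooled entry when it is leased.
     *
     * @param expiryMs                  absolute expiry time of the entry
     * @param updatedMs                 time the entry was last released to the pool
     * @param validateAfterInactivityMs idle threshold; values <= 0 disable validation
     * @param nowMs                     current time
     */
    static Action onLease(long expiryMs, long updatedMs,
                          long validateAfterInactivityMs, long nowMs) {
        if (nowMs >= expiryMs) {
            return Action.CLOSE_EXPIRED;      // keep-alive or time-to-live has run out
        }
        if (validateAfterInactivityMs > 0
                && updatedMs + validateAfterInactivityMs <= nowMs) {
            return Action.VALIDATE_FIRST;     // idle long enough to warrant a stale check
        }
        return Action.REUSE;                  // recently used, handed out as-is
    }

    public static void main(String[] args) {
        // Idle for exactly 2000 ms with the default 2000 ms threshold.
        System.out.println(onLease(Long.MAX_VALUE, 0L, 2_000L, 2_000L)); // VALIDATE_FIRST
    }
}
```

With a threshold of 2000 ms, a connection reused within two seconds of its release skips validation entirely, which is why a threshold shorter than the server's keep-alive timeout matters.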

org.apache.http.impl.client.HttpClientBuilder#setConnectionManagerShared and org.apache.http.impl.client.HttpClientBuilder#evictIdleConnections(long, java.util.concurrent.TimeUnit)

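This starts an asynchronous background thread that periodically closes connections that have been idle for longer than the given time. If a stale connection is reclaimed before anyone leases it, there is no problem, but in edge cases NoHttpResponseException can still occur. For example, say the server keep-alive timeout is 20 seconds and the evictor is configured with 15 seconds. Let x be the time the connection was last used and released, and y the time the evictor last finished a pass. If y + 15 < x + 20, then at the evictor's next pass the connection has not yet reached the 20-second timeout, and the evictor may leave it in the pool. If that connection is then leased about 5 seconds later, after the server has already closed it, the request fails with NoHttpResponseException.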
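
The timing argument above can be checked with a small model. This is only a sketch under simplifying assumptions: the evictor is taken to run exactly every maxIdle milliseconds counted from its last pass, each pass is taken to be instantaneous, and a connection is evicted at the first pass where it has been idle for at least maxIdle. The class EvictorWindowSketch and the method staleWindowMillis are hypothetical names.

```java
/** Timing model for the idle-connection evictor; all names are hypothetical. */
final class EvictorWindowSketch {

    /**
     * Returns how long (ms) a connection released at releasedAt stays in the
     * pool after the server has already closed it, or 0 if the evictor gets
     * there first. Assumes one evictor pass every maxIdle ms, starting from
     * lastEvictorRun, with passes that take no time.
     */
    static long staleWindowMillis(long releasedAt, long lastEvictorRun,
                                  long maxIdle, long serverKeepAlive) {
        if (maxIdle <= 0) {
            throw new IllegalArgumentException("maxIdle must be positive");
        }
        // First evictor pass at which the connection has been idle >= maxIdle.
        long evictedAt = lastEvictorRun + maxIdle;
        while (evictedAt - releasedAt < maxIdle) {
            evictedAt += maxIdle;
        }
        long closedByServerAt = releasedAt + serverKeepAlive;
        return Math.max(0L, evictedAt - closedByServerAt);
    }

    public static void main(String[] args) {
        // The post's numbers: 20 s server keep-alive, 15 s evictor setting.
        System.out.println(staleWindowMillis(1_000, 0, 15_000, 20_000)); // 9000
        // A 3 s evictor setting leaves no window against a 20 s keep-alive.
        System.out.println(staleWindowMillis(1_000, 0, 3_000, 20_000));  // 0
    }
}
```

Under this model a connection can sit in the pool for just under twice maxIdle before it is evicted, so the window only closes when the evictor setting is comfortably below half of the server's keep-alive timeout.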

evictIdleConnections only takes effect when setConnectionManagerShared is false, which is the default. The key code is:

    if (!this.connManagerShared) {
        if (closeablesCopy == null) {
            closeablesCopy = new ArrayList<Closeable>(1);
        }
        final HttpClientConnectionManager cm = connManagerCopy;

        if (evictExpiredConnections || evictIdleConnections) {
            final IdleConnectionEvictor connectionEvictor = new IdleConnectionEvictor(cm,
                    maxIdleTime > 0 ? maxIdleTime : 10, maxIdleTimeUnit != null ? maxIdleTimeUnit : TimeUnit.SECONDS);
            closeablesCopy.add(new Closeable() {

                @Override
                public void close() throws IOException {
                    connectionEvictor.shutdown();
                }

            });
            connectionEvictor.start();
        }
        closeablesCopy.add(new Closeable() {

            @Override
            public void close() throws IOException {
                cm.shutdown();
            }

        });
    }

Note: the thread started by evictIdleConnections sleeps for one maxIdle period before its first pass. The source:

    public IdleConnectionEvictor(
            final HttpClientConnectionManager connectionManager,
            final ThreadFactory threadFactory,
            final long sleepTime, final TimeUnit sleepTimeUnit,
            final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
        this.connectionManager = Args.notNull(connectionManager, "Connection manager");
        this.threadFactory = threadFactory != null ? threadFactory : new DefaultThreadFactory();
        this.sleepTimeMs = sleepTimeUnit != null ? sleepTimeUnit.toMillis(sleepTime) : sleepTime;
        this.maxIdleTimeMs = maxIdleTimeUnit != null ? maxIdleTimeUnit.toMillis(maxIdleTime) : maxIdleTime;
        this.thread = this.threadFactory.newThread(new Runnable() {
            @Override
            public void run() {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        // Sleeps for sleepTimeMs first; tracing the callers shows that sleepTimeMs comes from maxIdleTime
                        Thread.sleep(sleepTimeMs);
                        connectionManager.closeExpiredConnections();
                        if (maxIdleTimeMs > 0) {
                            connectionManager.closeIdleConnections(maxIdleTimeMs, TimeUnit.MILLISECONDS);
                        }
                    }
                } catch (final Exception ex) {
                    exception = ex;
                }

            }
        });
    }

Constructor path:

org.apache.http.impl.client.IdleConnectionEvictor#IdleConnectionEvictor(org.apache.http.conn.HttpClientConnectionManager, java.util.concurrent.ThreadFactory, long, java.util.concurrent.TimeUnit, long, java.util.concurrent.TimeUnit)

org.apache.http.impl.client.HttpClientBuilder#setKeepAliveStrategy

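This method sets the keep-alive time for the connections maintained by the client-side pool. If a connection has been idle for longer than this time, it is closed and another one is obtained. The relevant source is below: when a pool entry is created, its expiry is initialized to the creation time plus the time-to-live, and updateExpiry (called when the connection is released back to the pool) then moves the expiry to the release time plus the keep-alive time, capped by the validity deadline.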

    // Path: org.apache.http.pool.PoolEntry#PoolEntry(java.lang.String, T, C, long, java.util.concurrent.TimeUnit)
    // timeToLive is the connection's total time-to-live; the keep-alive time is applied later through updateExpiry
    public PoolEntry(final String id, final T route, final C conn,
            final long timeToLive, final TimeUnit tunit) {
        super();
        Args.notNull(route, "Route");
        Args.notNull(conn, "Connection");
        Args.notNull(tunit, "Time unit");
        this.id = id;
        this.route = route;
        this.conn = conn;
        this.created = System.currentTimeMillis();
        if (timeToLive > 0) {
            this.validityDeadline = this.created + tunit.toMillis(timeToLive);
        } else {
            this.validityDeadline = Long.MAX_VALUE;
        }
        this.expiry = this.validityDeadline;
    }


    // Path: org.apache.http.pool.PoolEntry#updateExpiry
    public synchronized void updateExpiry(final long time, final TimeUnit tunit) {
        Args.notNull(tunit, "Time unit");
        this.updated = System.currentTimeMillis();
        final long newExpiry;
        if (time > 0) {
            newExpiry = this.updated + tunit.toMillis(time);
        } else {
            newExpiry = Long.MAX_VALUE;
        }
        this.expiry = Math.min(newExpiry, this.validityDeadline);
    }
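
The expiry arithmetic in the two methods above can be worked through with concrete numbers. The sketch below mirrors only the calculations, with explicit timestamps passed in instead of System.currentTimeMillis(); the class ExpirySketch and its method names are hypothetical.

```java
/** Expiry arithmetic from PoolEntry, restated with explicit timestamps; names are hypothetical. */
final class ExpirySketch {

    /** Hard deadline derived from the time-to-live; non-positive means "no limit". */
    static long validityDeadline(long createdMs, long timeToLiveMs) {
        return timeToLiveMs > 0 ? createdMs + timeToLiveMs : Long.MAX_VALUE;
    }

    /** Expiry after a release: release time + keep-alive, capped by the hard deadline. */
    static long expiryAfterRelease(long releasedMs, long keepAliveMs, long validityDeadlineMs) {
        long newExpiry = keepAliveMs > 0 ? releasedMs + keepAliveMs : Long.MAX_VALUE;
        return Math.min(newExpiry, validityDeadlineMs);
    }

    /** An entry counts as expired once the clock reaches its expiry. */
    static boolean isExpired(long nowMs, long expiryMs) {
        return nowMs >= expiryMs;
    }

    public static void main(String[] args) {
        long deadline = validityDeadline(1_000L, 0L);                 // no time-to-live configured
        long expiry = expiryAfterRelease(10_000L, 3_000L, deadline);  // released at t=10 s, 3 s keep-alive
        System.out.println(expiry);                                   // 13000
        System.out.println(isExpired(13_000L, expiry));               // true
    }
}
```

With the 3-second keep-alive from the configuration in this post, a connection released at t = 10 s is considered expired from t = 13 s onward and is closed the next time it would be leased.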

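The settings above are neither mutually exclusive nor all required. The author verified first-hand that configuring only setValidateAfterInactivity, or only setKeepAliveStrategy, is sufficient. evictIdleConnections on its own can still fail in edge cases.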

org.apache.http.impl.client.HttpClientBuilder#disableAutomaticRetries

As an aside, HttpClient retries up to 3 times by default. If your endpoint is not idempotent, make sure retries are not used.
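
To make the idempotency point concrete, here is a generic retry loop that only retries when the caller declares the operation idempotent. It is a sketch of the idea, not HttpClient's retry handler; RetrySketch, Call and execute are hypothetical names.

```java
/** Generic retry loop guarded by an idempotency flag; all names are hypothetical. */
final class RetrySketch {

    /** An operation that may fail with an I/O error, such as a request on a stale connection. */
    interface Call<T> {
        T run() throws java.io.IOException;
    }

    /**
     * Runs the call and retries it up to maxRetries times after an I/O failure,
     * but only if the operation is declared idempotent. A non-idempotent call
     * fails on the first error so it is never sent twice.
     */
    static <T> T execute(Call<T> call, boolean idempotent, int maxRetries)
            throws java.io.IOException {
        int failures = 0;
        while (true) {
            try {
                return call.run();
            } catch (java.io.IOException e) {
                failures++;
                if (!idempotent || failures > maxRetries) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        int[] attempts = {0};
        String result = execute(() -> {
            if (++attempts[0] < 3) {
                throw new java.io.IOException("stale connection");
            }
            return "ok";
        }, true, 3);
        System.out.println(result + " after " + attempts[0] + " attempts"); // ok after 3 attempts
    }
}
```

The risk with a non-idempotent request is that the server may have processed the first attempt before the connection failed, so a blind retry can apply the same change twice.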

OK. To solve this one problem I ended up reading through the source, so I am writing this post as a note for future reference.

posted @ 2020-10-10 23:38  _再见理想