How to stop CLOSE_WAIT ports: How do I remove a CLOSE_WAIT socket connection

104

I have written a small program that interacts with a server on a specific port. The program works fine, but:

The program once terminated unexpectedly, and ever since, that socket connection has been shown in the CLOSE_WAIT state. If I try to run the program again it hangs and I have to force it to close, which accumulates even more CLOSE_WAIT socket connections.

Is there a way to flush these connections?

  • 5
    You can't (and shouldn't). CLOSE_WAIT is a state defined by TCP for connections being closed waiting for the counterpart to acknowledge this.   Apr 9, 2013 at 14:54
  • 1
    See also unix.stackexchange.com/questions/10106/… ... which I won't vote as a duplicate, because it'd wind up closing the question as off-topic.   Apr 9, 2013 at 17:32
  • 5
    @vonbrand No it isn't, it is exactly the opposite. It is the state for a connection which has already been closed by the peer and is waiting for the local application to close its end.   Apr 9, 2013 at 22:01 
  •  
    If you are using Commons HttpClient then nuxeo.com/blog/… has a lot of relevant information. From RFC 2616, Section 14: HTTP/1.1 applications that do not support persistent connections MUST include the "close" connection option in every message.   Jun 23, 2015 at 0:53 

7 Answers

91
 

CLOSE_WAIT means your program is still running, and hasn't closed the socket (and the kernel is waiting for it to do so). Add -p to netstat to get the pid, and then kill it more forcefully (with SIGKILL if needed). That should get rid of your CLOSE_WAIT sockets. You can also use ps to find the pid.
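If netstat is not available, roughly the same information can be read from /proc/net/tcp on Linux. The following is only a sketch under a few assumptions: IPv4 only, the usual Linux column layout, and the helper names `close_wait_entries` and `_decode` are made up for illustration. State code 08 is CLOSE_WAIT in the kernel's TCP state enum.

```python
import socket
import struct

CLOSE_WAIT = 0x08  # TCP_CLOSE_WAIT in the Linux kernel's state enum


def _decode(addr):
    """Turn a '/proc/net/tcp' address such as '0100007F:1F91' into (ip, port)."""
    hex_ip, hex_port = addr.split(":")
    # The kernel prints the IPv4 address as a native-endian 32-bit word.
    ip = socket.inet_ntoa(struct.pack("=I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def close_wait_entries(proc_net_tcp_text):
    """Return (local, remote, inode) for every row in CLOSE_WAIT state."""
    found = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue  # not a complete socket row
        if int(fields[3], 16) == CLOSE_WAIT:
            local = _decode(fields[1])
            remote = _decode(fields[2])
            inode = int(fields[9])
            found.append((local, remote, inode))
    return found
```

Passing it the contents of /proc/net/tcp gives one tuple per stuck socket. The inode can then be matched against the `socket:[inode]` links under /proc/<pid>/fd to find the owning process, which is the lookup that netstat -p performs for you.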

SO_REUSEADDR is for servers and TIME_WAIT sockets, so doesn't apply here.

  • 7
    Well... killing the process may not be the best option if the program has many open connections and only a few of them are stuck in CLOSE_WAIT. In that case killing the process may be impossible or inappropriate (the program still works and serves the other connections). Just closing the pending connection would be much more appropriate. But indeed, it is usually the program itself that is not closing the connection locally (CLOSE_WAIT means it received a FIN from the other end and the program just has to close the connection locally). A bug report may be appropriate.   Jul 5, 2017 at 14:09 
 
43

As described by Crist Clark.

CLOSE_WAIT means that the local end of the connection has received a FIN from the other end, but the OS is waiting for the program at the local end to actually close its connection.

The problem is your program running on the local machine is not closing the socket. It is not a TCP tuning issue. A connection can (and quite correctly) stay in CLOSE_WAIT forever while the program holds the connection open.

Once the local program closes the socket, the OS can send the FIN to the remote end which transitions you to LAST_ACK while you wait for the ACK of the FIN. Once that is received, the connection is finished and drops from the connection table (if your end is in CLOSE_WAIT you do not end up in the TIME_WAIT state).
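To make the sequence above concrete, here is a minimal Python sketch of the read loop that avoids lingering CLOSE_WAIT sockets. It is not the asker's program; `serve_one` and `demo` are hypothetical names and the loopback setup exists only so the example is self-contained. The key point is that `recv()` returning an empty bytes object means the peer has sent its FIN, and the local side must then call `close()`.

```python
import socket


def serve_one(listener):
    """Accept one connection, read until the peer closes, then close our end."""
    conn, _ = listener.accept()
    received = b""
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                # b"" means the peer sent FIN; this socket is now in CLOSE_WAIT.
                break
            received += chunk
    finally:
        # Closing sends our own FIN: CLOSE_WAIT -> LAST_ACK -> removed.
        conn.close()
    return received


def demo():
    """Run a client and server over loopback in a single thread."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    client.close()  # the client's FIN puts the server side into CLOSE_WAIT

    data = serve_one(listener)
    listener.close()
    return data
```

If the `conn.close()` call were skipped and the process kept running, the accepted socket would stay in CLOSE_WAIT for as long as the program held it open.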

  • 6
    how to close the socket??   May 1, 2015 at 10:51
  • 1
    You close the handle you have to the socket you opened. Use close() or closesocket(), depending on which platform you are using.   Aug 22, 2015 at 23:21
  •  
    @RemyLebeau I suppose the real question is how to wire-it-up in situations where it doesn't automatically happen. Why would a socket not be closed? It couldn't be waiting for incoming data (because of the FIN it would be canceled). Is it a failure to respond to the error condition of such a read operation? 
    – sehe
     Jan 23, 2021 at 16:37
25

You can forcibly close sockets with ss command; the ss command is a tool used to dump socket statistics and displays information in similar fashion (although simpler and faster) to netstat.

To kill any socket in CLOSE_WAIT state, run this (as root)

$ ss --tcp state CLOSE-WAIT --kill

You may also filter your action

$ ss --tcp state CLOSE-WAIT '( dport = 22 or dst 1.1.1.1 )' --kill
  • 5
    That should be the top answer. 
    – Tom
     Jan 3, 2021 at 9:26
  •  
    can we apply a filter to kill only a specific port?   Feb 24, 2021 at 11:15
  •  
    @MohammadFaisal Yeah, of course; you may kill (or list if you want) all sockets by their source or destination port like this: sudo ss --tcp --kill sport = 54576 or dport = :ssh   Feb 24, 2021 at 15:12
  •  
    @MustaphaHadid: Thanks. This is a helpful answer but in my case the result of ss is tcp 0 0 1.1.16.1:57212 1.1.16.28:8081 CLOSE_WAIT - with no process id and therefore the --kill was unsuccessful. And therefore I am unable to free up this port 57212  Feb 25, 2021 at 7:19
 
9

Too many CLOSE_WAIT connections mean there is something wrong with your code in the first place, and working around them like this is not considered good practice. Even so:

You might want to check out: https://github.com/rghose/kill-close-wait-connections

What this script does is send out the ACK which the connection was waiting for.

This is what worked for me.

  •  
    You send an ACK to the CLOSE_WAIT socket. Does that work? If it does, why?   Dec 18, 2015 at 6:07
  •  
    I am guessing, the OS has already sent the FIN to the remote host. The remote host probably cannot reply with the ACK that the socket is expecting. 
    – mirage
     Dec 18, 2015 at 10:06 
  •  
    Yes, that's right (from the kernel code). But I also have doubts about the SEQ of the packet you send, which is "10". Does the kernel not check it?   Dec 21, 2015 at 8:45
  •  
    Probably not. I think I tried with many random numbers, and they seemed to work. 
    – mirage
     Dec 24, 2015 at 7:41
8

I'm also having the same issue with the very latest Tomcat server (7.0.40). It goes non-responsive once for a couple of days.

To see open connections, you may use:

sudo netstat -tonp | grep jsvc | grep --regexp="127.0.0.1:443" --regexp="127.0.0.1:80" | grep CLOSE_WAIT

As mentioned in this post, you may use /proc/sys/net/ipv4/tcp_keepalive_time to view the values. The value seems to be in seconds and defaults to 7200 (i.e. 2 hours).

To change them, you need to edit /etc/sysctl.conf.

Open/create `/etc/sysctl.conf`
Add `net.ipv4.tcp_keepalive_time = 120` and save the file
Invoke `sysctl -p /etc/sysctl.conf`
Verify using `cat /proc/sys/net/ipv4/tcp_keepalive_time`
  • 6
    the answer is confusing. you said the non-responsive states has gone for several days.. but then you also try to set the keep alive time to only 120 seconds. even with the default value (7200 sec), it shouldn't last for several days, right?   Sep 11, 2015 at 16:52
3

It should be mentioned that the Socket instance at both the client and the server end needs to explicitly invoke close(). If only one of the ends invokes close(), the socket at the other end will remain in the CLOSE_WAIT state.
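One way to make sure close() is always invoked, even when the code fails halfway through, is to let the language do it. The sketch below uses Python rather than Java, and `use_socket_then_fail` is a hypothetical name; it simulates an error mid-conversation and shows that the `with` block still closes the socket.

```python
import socket


def use_socket_then_fail():
    """Simulate a failure mid-conversation and report the socket's fileno afterwards."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        with sock:  # leaving the block calls sock.close(), even on error
            raise RuntimeError("simulated failure while talking to the server")
    except RuntimeError:
        pass
    # A closed Python socket reports -1 as its file descriptor.
    return sock.fileno()
```

A try/finally block with an explicit close() achieves the same thing in languages without this construct.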

1

It is also worth noting that if your program spawns a new process, that process may inherit all your open handles. Even after your own program closes, those inherited handles can still be alive via the orphaned child process. And they don't necessarily show up quite the same in netstat. But all the same, the socket will hang around in CLOSE_WAIT while this child process is alive.

I had a case where I was running ADB. ADB itself spawns a server process if it's not already running. This inherited all my handles initially, but did not show up as owning any of them when I was investigating (the same was true for both macOS and Windows - not sure about Linux).
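Whether a child inherits a socket depends on a per-handle flag. As an illustration only (Python 3.4 or later, with `inheritance_flags` being a made-up helper name), this sketch shows that Python creates sockets as non-inheritable by default and that inheritance has to be switched on explicitly. In C on POSIX systems the comparable control is the close-on-exec flag (FD_CLOEXEC or SOCK_CLOEXEC).

```python
import socket


def inheritance_flags():
    """Return the inheritable flag of a fresh socket, then after opting in."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        default = sock.get_inheritable()   # False by default since Python 3.4
        sock.set_inheritable(True)         # now a spawned child could inherit it
        opted_in = sock.get_inheritable()
    finally:
        sock.close()
    return default, opted_in
```

If a spawned helper is keeping your sockets alive, checking that the handles are marked non-inheritable before spawning it is a reasonable first thing to try.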

posted @ 2022-07-04 19:47 by 大数据从业者FelixZh