
Dotnet piles up CLOSE_WAIT sockets


#1

From GitHub:

As the title says, sockets in CLOSE_WAIT opened by dotnet are not being closed and are piling up.
This is probably caused by a file descriptor leak and leads to growing memory consumption, because the memory for the buffers of the affected sockets is not released.
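
One way to check the file descriptor leak theory is to watch how many socket descriptors the dotnet process holds over time. Below is a minimal Python sketch, assuming a Linux system where `/proc/<pid>/fd` is readable by the current user. The function name `count_socket_fds` and the `proc_root` parameter are made up for illustration and are not part of PTF or dotnet.

```python
import os


def count_socket_fds(pid: int, proc_root: str = "/proc") -> int:
    """Count open file descriptors of `pid` that refer to sockets.

    Assumption: Linux procfs layout, where each entry in /proc/<pid>/fd
    is a symlink and socket descriptors point to "socket:[<inode>]".
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            count += 1
    return count
```

Calling `count_socket_fds(9337)` every few minutes (9337 being the dotnet PID from the netstat output in this post) and seeing the number only ever grow would support the leak theory.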

PTF_Version : v.1.5.1.685
OS : Linux CentOS 7.4 3.10.0-693.2.2.el7.x86_64
DOTNET : 2.1.4

Short output :

netstat -neopa

tcp 32 0 150.95.191.124:60739 54.65.136.43:9443 CLOSE_WAIT 1001 2082272 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:53126 13.112.130.71:9443 CLOSE_WAIT 1001 2167051 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:46921 13.112.130.71:9443 CLOSE_WAIT 1001 2164912 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:58805 54.92.86.171:9443 CLOSE_WAIT 1001 2078195 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:58320 13.112.252.243:9443 CLOSE_WAIT 1001 2171998 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:59739 54.92.75.43:9443 CLOSE_WAIT 1001 2172200 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:36463 54.92.0.105:9443 CLOSE_WAIT 1001 2119499 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:43091 54.92.26.88:9443 CLOSE_WAIT 1001 2079040 9337/dotnet off (0.00/0/0)
tcp 32 0 150.95.191.124:36297 13.112.107.79:9443 CLOSE_WAIT 1001 2162867 9337/dotnet off (0.00/0/0)
... and many, many more

netstat -an | awk '/tcp/ {print $6}' | sort | uniq -c
846 CLOSE_WAIT
22 ESTABLISHED
11 LISTEN
1 TIME_WAIT
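
To see which process owns the stuck sockets without scrolling through hundreds of lines, the `netstat -neopa` output can be grouped by process. Here is a rough Python sketch, assuming the column layout shown above (state in the 6th column, PID/program in the 9th). The function name `count_close_wait_by_process` is hypothetical.

```python
from collections import Counter


def count_close_wait_by_process(netstat_output: str) -> Counter:
    """Count CLOSE_WAIT TCP sockets per "PID/program" entry.

    Assumption: input is the text printed by `netstat -neopa`, with
    whitespace-separated columns:
    proto recv-q send-q local foreign state user inode pid/program timer...
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP/unix lines and anything too short to parse.
        if len(fields) < 9 or not fields[0].startswith("tcp"):
            continue
        if fields[5] == "CLOSE_WAIT":
            counts[fields[8]] += 1
    return counts
```

The output of `netstat -neopa` can be captured with `subprocess.run(["netstat", "-neopa"], capture_output=True, text=True).stdout` and passed in as a string. Note that netstat usually needs root to show the PID/program column for sockets owned by other users.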


#2

Adding to this after we discussed it privately, to keep it on record for future reference. I experienced the same issue on Ubuntu 16.04 but didn't initially realize this was the cause.

The symptoms I saw were a very bad network connection, PT Monitor not being reliably reachable, and all that ugliness. After getting a hint about this issue I found the same thing, with much higher numbers (>10k CLOSE_WAIT after 2 days of uptime).

After more than 24 hours of testing, it seems that a dotnet SDK update fixed it. I have installed 2.1.300 on all affected machines and haven't seen the issue since. The CLOSE_WAIT count stays at around 3 to 4 and the network is highly reliable again.

I will post here if anything changes.


#3

Just FYI,
I never had that kind of problem on Debian 9…


#4

@JoeChip, in reply to your message here: it's also worth trying the upgrade Helmi mentions. The CLOSE_WAIT sockets seem to cause some weird issues.


#5

Can't verify now… I always use the newest stable version.
But I also can't remember that kind of problem on PTF 1.5.x; maybe the old dotnet worked better on Debian.