Thursday, August 30, 2018

Fixing kernel: "unregister_netdevice: waiting for to become free. Usage count = "

This is a kernel bug that causes Docker to hang. It is triggered when you stop a container, possibly one that does not stop gracefully (i.e. does not respond to SIGTERM). Once it happens, the only solution is a reboot. It's speculated that this is a network-namespace-related problem, reproducible with any container runtime (LXC, Docker, rkt, etc.).

What worked for me to reduce the probability of hitting this bug was removing the limits from the Docker systemd service. Newer systemd versions apply default limits even if you didn't set any. Set LimitNOFILE=1048576, LimitNPROC=infinity, LimitCORE=infinity and TasksMax=infinity in the Docker systemd unit, and this may just fix the issue. For me it also reduced the (CPU-based) load average.
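
As a rough sketch, the settings above could go into a systemd drop-in rather than into the packaged unit file. The drop-in path and file name below are my assumption (any .conf file under a docker.service.d directory should do), and I'm assuming the service is named docker.service on your distribution. The four values are the ones listed above.

```ini
# Assumed location (hypothetical file name):
#   /etc/systemd/system/docker.service.d/limits.conf
[Service]
# Raise or remove the per-service limits that newer systemd applies by default
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
```

After saving the file, run `systemctl daemon-reload` and then `systemctl restart docker` so the new limits take effect. Note that restarting the Docker daemon will normally stop running containers unless you have configured it otherwise.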