Finding and fixing a tricky memory leak in .NET code

I spent a decent chunk of my Sunday hunting an elusive memory leak in a tool that I wrote years ago and still maintain. The memory leak has always been slow enough that I hadn’t cared to fix it, but after running out of RAM and crashing my Raspberry Pi (again!), I’d had enough.

This tool performs HTTP and ICMP requests on a regular schedule. Given that I’d only see a memory-related issue after weeks of operation, and that most of the time nobody actually logs in to it to perform any maintenance tasks, it was pretty clear that the problem had to lie somewhere in the HTTP or ICMP request code.

I was pretty confident that it wasn’t the ICMP code, because it’s very small and straightforward. The HTTP code, however, is, shall we say, less concise. Not only does it handle a lot of options, but also its implementation is made complicated by the fact that I wanted that code to be testable, and HttpClient and many of its related classes are notoriously unmockable.

My testable solution from some time ago has been to add HttpClientHandler into the ASP.NET DI container and request a new (transient) one for each request. This way, my unit tests could use a fake version of this class and take over the actual network calls. This made for a somewhat awkward, but very nicely testable, class.

While investigating the memory leak (by calling the HTTP checking function 300 times between taking memory snapshots), I saw a lot of leftover delegates that were handling SSL cert validation, which were set up because the tool has an option for custom TLS validation. I wasted a lot of time trying to deal with validation event subscriptions and such. Those were not the problem, as it turned out.

The problem was my incorrect use of the DI container. Turns out, the documentation on its limitations is pretty clear:

* Don’t register IDisposable instances with a transient lifetime. Use the factory pattern instead.
* Don’t resolve IDisposable instances with a transient or scoped lifetime in the root scope. The only exception to this is if the app creates/recreates and disposes IServiceProvider, but this isn’t an ideal pattern.

Yep, that HttpClientHandler that I shoved into the DI container was absolutely a transient IDisposable. Regardless of whether I was disposing of it myself, the container still kept a reference to it. That was the memory leak. As the above quote shows, the documentation even suggests a workaround: using the factory pattern.

Changing the code to use the factory pattern to create an HttpMessageHandler instance, which I switched to the more modern SocketsHttpHandler for production code, while keeping the fake HttpClientHandler for test code, was not a difficult task. And the memory leak is gone!