I’ve spent the better part of the week diagnosing a memory leak in a Windows Azure App Service. You may find this post helpful if you find yourself in the same situation I was in. I’ll walk through what a memory leak looks like and the steps you can take to diagnose and eventually solve the issue. Every application is different, so an understanding of your codebase will lead you to make the best decision to resolve your issues.

The Leak

Hopefully, you catch the memory leak before your users start screaming bloody murder. To give yourself the best chance of that, I recommend installing Azure Application Insights. Once your application starts behaving badly you have two exploratory options:

  1. Live Metrics: Seeing how the application is behaving right now.
  2. Metrics: Seeing how the application has behaved.

Both are seen here in the Application Insights blade.

App Insights Blade

Using Live Metrics

Live Metrics is my first stop. I want to see how much memory my application is currently using.

live metrics view

The current memory footprint will be on the bottom of this page and will look similar to the screenshot above. If it is high, the next step is to look at the Metrics tab.

Metrics Tab In Azure

When on the metrics tab, you will be presented with several drop-down options. Choose process private bytes.

live metrics view

If you see a sawtooth pattern, where memory climbs steadily and only drops when the process restarts, then congratulations, you most likely have a memory leak.
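If you export the private bytes samples, the same check can be roughed out in code. The following is only a sketch in Python: the function names, the 50% drop threshold, and the sample values are my own assumptions for illustration, not anything Azure provides. It splits the series wherever memory suddenly falls (a likely restart) and counts how many ramps grew by a meaningful amount.

```python
def split_ramps(samples, drop_ratio=0.5):
    """Split a series into ramps, starting a new ramp whenever a sample
    falls below drop_ratio * the previous sample (a likely restart)."""
    if not samples:
        return []
    ramps = [[samples[0]]]
    for prev, cur in zip(samples, samples[1:]):
        if cur < prev * drop_ratio:
            ramps.append([cur])
        else:
            ramps[-1].append(cur)
    return ramps


def looks_like_sawtooth(samples, min_growth, min_ramps=2, drop_ratio=0.5):
    """True if at least min_ramps ramps each grew by min_growth or more."""
    ramps = split_ramps(samples, drop_ratio)
    growing = [r for r in ramps if r[-1] - r[0] >= min_growth]
    return len(growing) >= min_ramps


# Hypothetical private bytes samples, in megabytes.
leaking = [100, 300, 500, 700, 120, 320, 520, 720, 110, 300]
steady = [100, 110, 105, 108, 102, 109]

print(looks_like_sawtooth(leaking, min_growth=500))
print(looks_like_sawtooth(steady, min_growth=500))
```

The thresholds will need tuning for your application, and a chart is still the quickest way to confirm what the numbers suggest.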

memory leak azure

Taking a Memory Dump In Azure

Once you’ve determined you have a memory leak, it’s time to get a memory dump. Head back to your app service blade. Click the Diagnose and solve problems menu item, then click the memory dump button under Diagnostic tools.

solved

Once there, open the Collect Memory Dump tab and click the Collect Memory Dump button. You don’t need the analysis option for what follows, though it doesn’t hurt to choose it. Two dump files will be produced, and you should be able to download a .dmp file of the w3wp process, which hosts your web application.

Note: you may have to download mscordacwks.dll from your app service. It can be found in the C:\Windows\Microsoft.NET\Framework\v4.0.30319 directory.

PerfView

Note: The following is not an actual memory leak, just an example of using PerfView.

You’ll need to download PerfView from the official GitHub page. Once downloaded, run the application.

perfview

Process your .dmp file, which should produce a .gcdump file, and open the results.

perfview

When looking at this view, you’ll notice a few noteworthy elements:

  1. The first line is always the object(s) taking the most memory.
  2. The second column, labeled Exc % notes how much of your memory is of that type of object.
  3. The Exc column is how many bytes are being used.
  4. Exc Ct is how many instances exist in the memory dump.

In the screenshot above, I can see that SqlCommand accounts for 20% of my memory: 18 megabytes across 801 instances. Exc stands for exclusive, so these figures do not account for child objects.
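To make those columns a little more concrete, here is a small Python sketch of the arithmetic behind them. The SqlCommand figures come from the screenshot and are rounded, I am assuming 1 megabyte is 10^6 bytes, and the little object tree and its byte counts are entirely made up to show how an exclusive size differs from an inclusive one.

```python
def estimate_heap_total(exc_bytes, exc_percent):
    """Estimate total heap size from one row's bytes and its Exc %."""
    return exc_bytes / (exc_percent / 100.0)


def average_instance_size(exc_bytes, exc_count):
    """Average exclusive bytes per instance of a type."""
    return exc_bytes / exc_count


def inclusive_bytes(node):
    """Exclusive bytes of a node plus everything it owns beneath it."""
    children = node.get("children", [])
    return node["exc"] + sum(inclusive_bytes(child) for child in children)


# Rounded figures from the screenshot: 18 MB, 20%, 801 instances.
total = estimate_heap_total(18_000_000, 20)
average = average_instance_size(18_000_000, 801)
print(f"estimated heap: {total / 1e6:.0f} MB")
print(f"average per SqlCommand: {average / 1e3:.1f} KB")

# Hypothetical object tree: one command that owns two child objects.
command = {
    "exc": 200,
    "children": [
        {"exc": 1000},
        {"exc": 300, "children": [{"exc": 50}]},
    ],
}
print(inclusive_bytes(command))
```

A real heap is a graph rather than a tree, so a tool has to decide which parent owns a shared object. Treat the inclusive calculation here as an illustration of the idea only.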

Flame Graph

The next thing I like to look at is the flame graph. The documentation in PerfView suggests this reading:

The graph starts at the bottom. Each box represents a method in the stack. Every parent is the caller, children are the callees. The wider the box, the more time it was on-CPU.

flame graph

Because this is a heap snapshot rather than a CPU trace, the width of each box here reflects memory rather than CPU time. In this example, I can see that most of my memory is held by static variables. Additionally, if you look at the towers from left to right, you’ll note that Entity Framework holds the most memory through its LazyInternalContext. This aligns with our first view, which showed SqlCommand as the object type using the most memory.

If we head back to the By Name tab, located at the top, we can double-click the SqlCommand line. This shows us which objects hold references to SqlCommand, and therefore what is keeping that memory alive. By continuing to double-click, we expand the references until we reach .NET Roots. This is where it all ends.

perfview

What can we tell about SqlCommand?

  1. SqlCommand is referred to by something called a QueryCacheEntry.
  2. A QueryCacheEntry is referred to by something called a QueryCacheManager.
  3. Finally, down near the bottom, all these objects are part of an internal static variable called _cachedModels inside of LazyInternalContext.
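Walking from an object back to a root is a plain graph search. Below is a minimal Python sketch of that idea, not PerfView's actual implementation. The `referrers` map is a simplified, hand-written version of the chain above with the intermediate objects left out, and `path_to_root` is a name I made up for illustration.

```python
from collections import deque


def path_to_root(referrers, start, roots):
    """Breadth-first search along 'referred to by' edges.

    Returns the shortest chain from start to any root, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in roots:
            return path
        for parent in referrers.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(path + [parent])
    return None


# Simplified chain from the list above; intermediate objects omitted.
referrers = {
    "SqlCommand": ["QueryCacheEntry"],
    "QueryCacheEntry": ["QueryCacheManager"],
    "QueryCacheManager": ["LazyInternalContext._cachedModels"],
    "LazyInternalContext._cachedModels": [".NET Roots"],
}

chain = path_to_root(referrers, "SqlCommand", {".NET Roots"})
print(" <- ".join(chain))
```

If an object has no chain back to a root, the garbage collector is free to reclaim it. Anything that does have one, such as a static cache, stays in memory.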

This is expected behavior for Entity Framework, but it is nice to know what the library is doing under the covers and whether it’s worth the memory footprint.

Conclusion

I’ve shown you how to spot a memory leak in Azure using the Live Metrics and Metrics tabs. I’ve shown you how to download a current memory dump from your application. Finally, utilizing PerfView, you can determine which objects in your application may be misbehaving. With this knowledge, you should be able to track down a memory leak, free up resources, and provide a more stable user experience.

I do want to note that I ran into strange behaviors in Azure. I would deploy my application with dependencies removed, but objects from those dependencies were still in memory. If you run into this issue, restart your application, then take a memory dump afterward.

I also want to thank the Twitter .NET Community, who gave me a lot of good direction and help when solving my own memory leak. Thank you!