Leaking Memory in Single Page Javascript Applications

Back in the bad old days, not so very long ago, lots of development teams had to support IE6. One of the major problems of writing large JavaScript applications in antediluvian versions of Internet Explorer was that memory would leak very easily, and you’d suddenly find that your initially snappy application was slowing your whole machine to a crawl if left running for any significant period of time. What made it worse was that IE6 wouldn’t even reclaim some leaked memory when you reloaded the page.

That particular leak gained a great deal of infamy and you can find a lot of information online about the DOM/JS circular reference memory leak (e.g. here or here). This leak was so serious because most of the natural patterns for event handlers would end up creating circular references between JavaScript objects and DOM elements that IE didn’t know how to garbage collect. This was finally fixed in IE8.

Fortunately, we (at Caplin) don’t have to support versions of IE lower than 8 any more, so does that mean we can stop worrying about memory leaks? Sadly not. The applications we work on are single page applications with a requirement that they can run for days without needing to be reloaded. In that time the user might make multiple trades, hundreds of thousands of updates might come in over the network and need to be rendered to screen, elements might be added to and removed from the screen, and grids with many thousands of live updating rows might be scrolled, sorted, reordered or filtered.

In situations like that, there are plenty of opportunities to leak memory in your own code, let alone (for the moment) from browser bugs.

Memory Leaks That Are Your Fault

By far the most likely way you are going to leak memory is by adding listeners and failing to remove them. You need to develop discipline to remember that every time you add a listener you should be writing the mirror image code to remove the listener.
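As a minimal sketch of that mirror-image discipline: the Emitter class below is a hypothetical stand-in for whatever you are listening to (a DOM node, a model, a message bus), and PriceLabel is an invented example. Modern class syntax is used for brevity. The point is only that the object keeps hold of the exact function it registered, so that dispose can undo precisely what the constructor did.

```javascript
// Hypothetical minimal event source, standing in for a DOM node or model object.
class Emitter {
  constructor() { this.listeners = new Set(); }
  addListener(fn) { this.listeners.add(fn); }
  removeListener(fn) { this.listeners.delete(fn); }
  emit(value) { this.listeners.forEach((fn) => fn(value)); }
}

// Hypothetical component that listens for as long as it is alive.
class PriceLabel {
  constructor(source) {
    this.source = source;
    this.text = "";
    // Keep a reference to the exact function we register, so we can remove it later.
    this.onPrice = (price) => { this.text = String(price); };
    this.source.addListener(this.onPrice);
  }
  dispose() {
    // Mirror image of the constructor: remove what was added.
    this.source.removeListener(this.onPrice);
  }
}

const prices = new Emitter();
const label = new PriceLabel(prices);
prices.emit(1.25);   // reaches the label
label.dispose();
prices.emit(1.3);    // no longer reaches the label
```

Once dispose has run, the emitter no longer holds a reference to the label, so the label can be collected as soon as your own code lets go of it.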

Hierarchy of Possibly Listening Objects

Something that can make it more difficult to remember to remove listeners is when you have a tree of objects, and you need to remove an object and its children from the tree. You might remember to clear all the listeners from the object you’re removing, but did you remember to clear the listeners from the children?

This is the kind of situation where it’s useful to have a dispose method that calls the dispose methods of its children so that each child can worry about cleaning its own references up and then instructing its children to do the same.
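A cascading dispose might look something like the sketch below. The Component class and its children, cleanups and onDispose members are names made up for illustration, not part of any particular framework.

```javascript
// Hypothetical base class for objects that live in a tree.
class Component {
  constructor(name) {
    this.name = name;
    this.children = [];
    this.cleanups = [];   // functions that undo each registration this object made
    this.disposed = false;
  }
  addChild(child) { this.children.push(child); return child; }
  onDispose(cleanup) { this.cleanups.push(cleanup); }
  dispose() {
    if (this.disposed) return;   // calling dispose twice is harmless
    this.disposed = true;
    // Clean up our own references first...
    this.cleanups.forEach((cleanup) => cleanup());
    this.cleanups = [];
    // ...then instruct each child to do the same.
    this.children.forEach((child) => child.dispose());
    this.children = [];
  }
}

const disposedOrder = [];
const grid = new Component("grid");
const row = grid.addChild(new Component("row"));
const cell = row.addChild(new Component("cell"));
[grid, row, cell].forEach((c) => c.onDispose(() => disposedOrder.push(c.name)));

grid.dispose();   // one call at the top cleans up the whole subtree
```

The caller only has to remember one thing (dispose the object you are removing), and each level of the tree only has to know about its own registrations.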

Running Initialisation Code Twice

Something else that can catch you out is if you have a bug that causes your initialisation code to run twice. It’s very common that initialisation code registers listeners and then keeps hold of a reference that is later used to deregister the listener. If you run a naïve version of this code twice, it will register the listener twice and then throw away one of the references, meaning that when you do eventually clean up your object, you’ll only remove the most recently added listener.

For this reason, if it’s possible for initialisation code to run twice, I like it to check to see if it has already been initialised and throw an exception if so. That means that your code fails fast and you can’t just have a double-initialisation bug lurking there without you realising. If your code needs to be able to initialise twice (e.g. perhaps a setModel method) then it should check to see if it needs to deregister a listener before it starts adding any listeners.
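Here is a rough sketch of both approaches in one place. Emitter and Panel are hypothetical names: initialise fails fast if it is called a second time, while setModel is written so that it can safely be called repeatedly.

```javascript
// Hypothetical minimal event source.
class Emitter {
  constructor() { this.listeners = new Set(); }
  addListener(fn) { this.listeners.add(fn); }
  removeListener(fn) { this.listeners.delete(fn); }
  emit(value) { this.listeners.forEach((fn) => fn(value)); }
}

class Panel {
  constructor() {
    this.initialised = false;
    this.model = null;
    this.renderCount = 0;
    this.onChange = () => { this.renderCount += 1; };
  }
  initialise(model) {
    // Fail fast: a second initialisation is a bug, so make it loud.
    if (this.initialised) throw new Error("Panel already initialised");
    this.initialised = true;
    this.setModel(model);
  }
  setModel(model) {
    // Legitimately callable many times, so deregister from the old model first.
    if (this.model !== null) this.model.removeListener(this.onChange);
    this.model = model;
    if (model !== null) model.addListener(this.onChange);
  }
}

const first = new Emitter();
const second = new Emitter();
const panel = new Panel();
panel.initialise(first);
panel.setModel(second);   // swaps models without leaving a listener on `first`

let failedFast = false;
try { panel.initialise(first); } catch (e) { failedFast = true; }
```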

Third Party Libraries

There are sometimes leaks in third party libraries, particularly since many third party libraries haven’t been written with week-long run times in mind, but more often, those libraries provide a mechanism for you to clean up that people forget or don’t know to use. If you’re using jQuery controls, you should be calling destroy. If you’re using Knockout you should remember to call ko.cleanNode (despite the fact that there doesn’t seem to be any documentation for it).

Obvious Stuff

The simplest kind of memory leak is adding things to a data structure and never removing them. One easy way to do this is when logging. If you’re adding a log line to a DOM element or an array and never removing it, that’s a memory leak. One solution is to cap your log at a maximum size: once it has reached that size, remove the oldest line every time you add a new one.
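A size-capped log can be as small as the sketch below. BoundedLog and maxLines are illustrative names; the same idea applies if the lines are child nodes of a DOM element rather than entries in an array.

```javascript
// Hypothetical log that never holds more than `maxLines` entries.
class BoundedLog {
  constructor(maxLines) {
    this.maxLines = maxLines;
    this.lines = [];
  }
  add(line) {
    this.lines.push(line);
    // Drop the oldest entries once we are over the limit.
    while (this.lines.length > this.maxLines) {
      this.lines.shift();
    }
  }
}

const log = new BoundedLog(3);
for (let i = 1; i <= 5; i++) {
  log.add("update " + i);
}
// log.lines is now ["update 3", "update 4", "update 5"]
```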

Memory Leaks That Are Not Your Fault (But you need to work around anyway)

Despite the fact that things have improved a great deal, there are still browser bugs that are likely to bite you when it comes to memory management.

XHR

It’s common when making a request to a server to create a local XMLHttpRequest object and add a listener to it, and then allow that XMLHttpRequest to go out of scope. Now nothing reachable from your own code has a reference to that XHR, but of course it would be nonsense for the JavaScript engine to clean it up as if it were garbage: you’re actually expecting the request to happen and ultimately for the callbacks to get called.

Browsers like Chrome keep the XHR from being garbage collected only while the request is outstanding, although even that is longer than you might expect. Once the request is completed, if there are no further references to the XHR, then it will be collected. In IE8, any reference to the XHR from within the onreadystatechange callback function would stop the whole XHR (including any data it might have loaded) from being collected. There is more information here. My testing seems to suggest that it’s been fixed in IE9.

IE Map

Even basic object access can cause a leak in Internet Explorer. In most browsers, if you add a key/value pair to an object, then delete that mapping, the memory of that mapping is reclaimed. In IE, even IE9, while the mapping itself is reclaimed, the key is not. You can see this by writing code that stores something under an enormous key and then deletes that key.

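var globalObject = {};
function start() {
    // Build an enormous, effectively unique key (tens of megabytes of text).
    var randomKey = new Array(5000000).join(String(Math.random()));
    // Store something under that key, then delete the mapping straight away.
    globalObject[randomKey] = true;
    delete globalObject[randomKey];
}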

If you run start a few times in Chrome, it will use a lot of memory and then reclaim it all, while if you run it in IE9, it will grab the memory and never let go. You can easily push it up to its maximum at which point it will start throwing Out Of Memory errors.

This is usually not a serious problem, as in most maps you use a small number of short keys. However, if you are creating something like a cache, or anything else that will have a large number of unique keys added and removed over a long period of time, it can become significant.

Interestingly, this leak manifests itself in another way. If you iterate over the keys of an object, they are returned in the order in which they were added. In most browsers, if you add a key, then remove it, then add it again later, iteration over the keys will show that key in the position of the second addition. In IE, the key is returned in the position where it was first added, indicating that IE has ‘remembered’ that it had been previously added. Here’s some example code:

var x = {};
x.first = 1;
x.second = 2;
x.third = 3;
// Logs "first", "second", "third" in all browsers.
for (var i in x) { console.log(i); }
delete x.second;
// Logs "first", "third" in all browsers.
for (var i in x) { console.log(i); }
x.second = 2;
// Logs "first", "third", "second" in Firefox and Chrome
// but "first", "second", "third" in IE.
for (var i in x) { console.log(i); }

Things that Make Leaks Worse

Functions have access to variables from the scope in which they are defined, a feature called ‘lexical scoping’. When a function accesses a variable from its defining scope, it is said to ‘close over’ the variable, and such functions are often called closures. This means that if you have a reference to a closure, then it may hold references to variables from where it was defined that cannot be garbage collected.

IE will keep all the variables within the lexical environment alive while a closure is referenced, but many browsers will work out exactly which ones are accessed and only keep those from being garbage collected.
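One way to limit what a long-lived closure can hold on to, regardless of how clever the engine is, is to create it in a scope that simply doesn’t contain anything large. The sketch below uses invented names (makeCounterHandler, setup, snapshot), and how much a given engine actually retains will vary, so treat it as an illustration of the idea rather than a measured fix.

```javascript
// Created at the top level, so the handler's lexical environment
// contains only `counter` and nothing large.
function makeCounterHandler(counter) {
  return () => { counter.count += 1; };
}

function setup(counter) {
  // Large, and only needed while setting up.
  const snapshot = new Array(100000).fill("row");
  counter.rowsSeen = snapshot.length;

  // A closure written inline here would have `snapshot` in its lexical
  // environment even though it never reads it. Building the handler in a
  // helper keeps `snapshot` out of reach of the long-lived function.
  return makeCounterHandler(counter);
}

const counter = { count: 0, rowsSeen: 0 };
const handler = setup(counter);
handler();   // behaves the same as an inline closure would
```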

You’ll sometimes find lots of nulling of references in clean up code. That usually indicates that the developer knows that this object is leaking, but isn’t sure why. Nulling references in clean up code doesn’t stop the leak, but reduces how much is leaked. Far better to find out why and fix that.

Finding Leaks

Finding leaks can be pretty dispiriting since if the same objects are leaked in multiple places, you can fix a leak and see no improvement in memory usage. It’s only when you remove the final leaked reference to that object that you start to see an improvement.

Years ago, tracking down leaks for us would be a tedious process of observing a leak, commenting out swathes of functionality and then trying the code again to see how much it leaks, rinsing and repeating. We also used to insert special code around large data structures to make it easy to find out if they were growing more than expected.
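That kind of instrumentation doesn’t need to be elaborate. The following is only a sketch of the idea, not the code we actually used: watchSize, reportSizes and the two example structures are all invented names.

```javascript
// Hypothetical registry of named "how big is it now?" probes.
const sizeProbes = new Map();

function watchSize(name, getSize) {
  sizeProbes.set(name, getSize);
}

function reportSizes() {
  const report = {};
  sizeProbes.forEach((getSize, name) => { report[name] = getSize(); });
  return report;
}

// Example structures that we suspect might grow without bound.
const subscriptions = {};
const pendingOrders = [];
watchSize("subscriptions", () => Object.keys(subscriptions).length);
watchSize("pendingOrders", () => pendingOrders.length);

subscriptions["/FX/EURUSD"] = true;
pendingOrders.push({ id: 1 }, { id: 2 });

const sizes = reportSizes();   // { subscriptions: 1, pendingOrders: 2 }
```

Calling reportSizes from the console every so often during a long soak test shows which structures keep creeping up. Bear in mind that the probes hold references to the structures they watch, so this belongs in debugging builds only.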

There are tools now that allow you to inspect heaps and compare them with each other. In IE, we often use dynatrace, although these days we tend to start by debugging with Chrome heap snapshots. A typical session might involve taking a snapshot, doing something that might leak 7 times, then taking another snapshot and examining the difference, starting by looking for objects with counts of multiples of 7. There’s a good article talking about how to go about finding leaks here.
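To see why counting in sevens helps, here is a deliberately leaky toy. Everything in it is hypothetical (liveWidgets, openWidget, closeWidget, repeat): the close path forgets to remove the widget from a registry, so running the open/close cycle seven times between two snapshots leaves exactly seven extra objects behind.

```javascript
// Hypothetical registry that the clean up path forgets about.
const liveWidgets = [];

function openWidget() {
  const widget = { id: liveWidgets.length };
  liveWidgets.push(widget);
  return widget;
}

function closeWidget(widget) {
  // Bug: the widget is never removed from liveWidgets, so it leaks.
  widget.closed = true;
}

function repeat(times, action) {
  for (let i = 0; i < times; i++) {
    action();
  }
}

// 1. Take the first heap snapshot here.
repeat(7, () => closeWidget(openWidget()));
// 2. Take the second snapshot here and compare it with the first:
//    the seven leaked widgets show up as a count of 7 in the difference.
```

Seven is useful simply because few things in an application naturally come in sevens, so a count of 7, 14 or 21 in the comparison stands out.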

While fixing things in Chrome first doesn’t mean you’ve definitely fixed everything in IE, it’s an easy way to get started, as things that leak in Chrome almost certainly leak everywhere, and often leak worse in IE.