JavaScript is Hard Part 1: You Can’t Trust Arrays


We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don’t know. – Donald Rumsfeld, February 2002

I recently had a chance to attend a workshop by Damjan Vujnovic on advanced JavaScript, where most of the attendees (including myself) were Java developers looking to dip their toes into the world of JavaScript.

Coming from a company like Caplin I had something of a head start, because we have been JavaScript advocates for a very long time and I have in the past worked a little on that side of the client/server fence. So I knew in advance about some of the quirks and idiosyncrasies that I’d never fully understood, such as the value of this, browser differences when interacting with the DOM, and in particular the double-equals operator, which appears to have been designed by a committee of psychopaths.

NaN !== NaN


But what surprised me about the workshop was that there were a number of aspects of JavaScript which not only did I not know, but I also didn’t know that I didn’t know them. In some cases these were familiar language features, such as arrays, which turn out to have surprising behaviours. In other cases they were features I hadn’t heard of in the first place.

This is the first in a short series of blog posts about things I didn’t know that I didn’t know about JavaScript. Hopefully by the end you will find a thing or two that you didn’t know either. We’ll start with a few quirks about arrays.

You can’t trust the length property

JavaScript arrays don’t really have a concept of size. You can try to retrieve an element from any position in the array, and if no element exists it will return undefined. There is no such thing as an out of bounds error. So what would be the length of this array?

var myArray = [];
myArray[3] = "element";
myArray[6] = undefined;
console.log(myArray.length);

There are two elements in the array, although one of them is undefined. So is the length 2? Or could it be 1 if the undefined element doesn’t count? It’s actually 7, because the length property always returns:

The index of the last element, plus one

Fair enough. Although by that rule you would expect the element at position 6 to exist and any elements further in the array not to exist. However in this case you can’t see any difference:

console.log(myArray.length);    // prints 7
console.log(myArray[6]);        // prints undefined
console.log(myArray[10]);       // prints undefined

That’s because there is a difference between an index pointing to the value undefined and not having an index at all. It just isn’t particularly easy to tell the difference.
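For what it is worth, the difference can be checked by asking whether the index exists rather than reading its value. The sketch below reuses the array from above; hasIndex is a made-up helper name used only for illustration.

```javascript
var myArray = [];
myArray[3] = "element";
myArray[6] = undefined;

// hasIndex is a hypothetical helper name, not something from the workshop.
// It reports whether the array itself has a property at that index,
// whatever value is stored there.
function hasIndex(array, index) {
    return Object.prototype.hasOwnProperty.call(array, index);
}

console.log(hasIndex(myArray, 6));    // prints true (exists, holds undefined)
console.log(hasIndex(myArray, 10));   // prints false (no such index)
console.log(hasIndex(myArray, 0));    // prints false (a hole below the length)

// The in operator gives the same answers here, but it also
// looks at properties inherited through the prototype chain.
console.log(6 in myArray);            // prints true
console.log(10 in myArray);           // prints false
```

Reading myArray[6] and myArray[10] still gives undefined in both cases; only the existence check tells them apart.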

You can’t trust arrays not to behave like objects

Arrays in JavaScript are a subclass of Object, so there is nothing to stop you using strings instead of integers as keys in your array. This is because JavaScript objects are basically just maps, and maps can use strings as keys. So what would be the length of the array in this example?

var myArray = [];
myArray[2] = "elementTwo";
myArray["five"] = "elementFive";
console.log(myArray.length);

The length is 3, because that is the last numerical index plus one. The element with the key “five” is ignored by the length property.
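One wrinkle worth sketching: whether a key counts towards the length depends on whether it looks like an array index, not on whether it was written as a string. The extra keys "7" and "07" below are my own illustrative additions to the example above.

```javascript
var myArray = [];
myArray[2] = "elementTwo";
myArray["five"] = "elementFive";

var lengthWithNamedKey = myArray.length;
console.log(lengthWithNamedKey);     // prints 3
console.log(myArray.five);           // prints elementFive

// A string key that is the canonical form of a non-negative integer
// is treated as an ordinary array index, so it does affect the length.
myArray["7"] = "elementSeven";
console.log(myArray.length);         // prints 8
console.log(myArray[7]);             // prints elementSeven

// "07" is not the canonical form of 7, so it is just another named property.
myArray["07"] = "notAnIndex";
console.log(myArray.length);         // still prints 8
```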

That element also won’t appear if you iterate through the array using a for loop. The only way to get that element is to know it exists and use the key to retrieve it, or iterate the array using the for-in loop.

But, as we are about to see, there is a problem with that.

You can’t trust the array iterator

If you iterate through an array with the classic for loop then the number of times it will iterate is the length of the array. If you use the for-in loop then it will only iterate through the elements that actually exist. If you have a sparse array then this might seem more efficient.

var myArray = [];
myArray[3] = "element";
myArray[6] = undefined;

var timesIterated = 0;
for(var i=0; i < myArray.length; i++) {
    timesIterated++;
}

console.log(timesIterated);   // prints 7

timesIterated = 0;
for(var i in myArray) {
    timesIterated++;
}

console.log(timesIterated);   // prints 2

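Unfortunately the for-in loop is a poor fit for arrays, because it enumerates property keys rather than array positions. Oddly enough, that means you are not guaranteed to get the elements in index order: for a long time the language specification left the order up to the implementation, and in some engines you get them in something like insertion order. More recent engines visit integer indices in ascending order, but for-in will still pick up string keys and any enumerable properties that have been added to Array.prototype.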

var myArray = [];
myArray[1000] = "elementOneThousand";
myArray[0] = "elementOne";

for(var i in myArray) {
    console.log("index " + i + "=" + myArray[i]);
}

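// In an engine that enumerates in insertion order, this prints:
//   "index 1000=elementOneThousand"
//   "index 0=elementOne"
// Engines that list integer indices in ascending order print index 0 first.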

Coming from a Java background this seems odd. The lesson is, don’t use the for-in loop with arrays.
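If the attraction of for-in was skipping the missing elements of a sparse array, one alternative is the forEach method that ES5 added to arrays. This is only a sketch, and it assumes an ES5-capable engine (so not IE8 and earlier without a shim). forEach walks the indices in ascending order and skips indices that do not exist.

```javascript
var myArray = [];
myArray[1000] = "elementOneThousand";
myArray[0] = "elementOne";

// Collect what forEach visits so the order can be inspected afterwards.
var visited = [];
myArray.forEach(function (element, index) {
    visited.push("index " + index + "=" + element);
});

console.log(visited.length);   // prints 2
console.log(visited[0]);       // prints index 0=elementOne
console.log(visited[1]);       // prints index 1000=elementOneThousand
```

Note that forEach only skips indices that are truly absent. An index that exists and holds undefined is still visited.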

You can’t trust the typeof operator

Since arrays in JavaScript are a subclass of Object, sadly the typeof operator cannot distinguish between them.

var myArray = [];
console.log(typeof myArray);    // prints "object"

The general consensus is that the best way to find out if something is an array is to use the toString method.

function isArray(obj) {
    return Object.prototype.toString.apply(obj)
          === "[object Array]";
}
var myArray = [];
console.log(isArray(myArray));    // prints true
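ES5 also defines a built-in Array.isArray function, so where it is available the check can be delegated to the engine. The wrapper below is a sketch that falls back to the toString approach on older browsers; isArrayCompat is a made-up name for illustration.

```javascript
// isArrayCompat is a hypothetical helper name, not a standard function.
function isArrayCompat(obj) {
    // Prefer the built-in check where the engine provides it (ES5 and later).
    if (typeof Array.isArray === "function") {
        return Array.isArray(obj);
    }
    // Otherwise fall back to the toString technique shown above.
    return Object.prototype.toString.call(obj) === "[object Array]";
}

console.log(isArrayCompat([]));               // prints true
console.log(isArrayCompat({ length: 0 }));    // prints false
console.log(isArrayCompat("abc"));            // prints false

(function () {
    // The arguments object is array-like but is not an array.
    console.log(isArrayCompat(arguments));    // prints false
}());
```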

Conclusion

Arrays do follow consistent ground rules, but the ground rules may differ from what you expect. This especially applies if you approach them from a Java background, where the length is the upper bound of the array and iterating an array will always give you the elements in index order.

Next time: something a little more obscure.

This is part one of a series of posts about JavaScript quirks.


10 Comments

  • AFAIK `typeof` only returns values such as `object`, `number`, `string`, `boolean`, `function` and `undefined`, never `array`. So, for `Array` you’d use `instanceof`.

    var myArray = [];
    if(myArray instanceof Array) {
    // is an array
    }

    You may well already know this but I thought it was useful to those reading your post.

    • Adam Iley says:

      instanceof Array is probably the correct answer for many (most?) situations, but in fact it is not a reliable way to test for arrays because if an array has been passed from another frame it will return false. If you know exactly where your data is coming from, you’re probably safe to use instanceof, but if you need some code that can detect arrays no matter where it’s come from you need something more like the (slightly ugly, but guaranteed by the spec to work) code in this article.

      It’s worth remembering too that many things are ‘array-like’, for example the arguments parameter which is not an array but has a length and can be indexed into with []. In Firefox and Chrome (and even IE from version 9) even strings are ‘array-like’, while in IE prior to version 9, they are not.

      • The frame communication is a good point. However, I don’t think many people will find themselves in this scenario any more. With older-style Comet techniques, cross-frame communication was commonplace, and that is where this scenario could present a problem. The great news is that with XHR Long-Polling/Streaming using the XMLHttpRequest object (with CORS) and with WebSockets it’s now possible to avoid cross-frame communication and these edge cases.

        There may be some cases where cross frame communication is the only solution – maybe with older browsers. But cross frame communication, and messing with `document.domain` in order to allow JavaScript on two different domains to communicate, is one of the reasons Comet was frequently labelled a ‘hack’ (this is incorrect since Comet is a paradigm and not an implementation as these ‘hacks’ are – I’m sure @martintyler will agree :) ). However, thanks to more modern methods of realtime data delivery, and client/server bi-directional communication, I’m pleased to not have to worry about these older techniques too much and as we move forward with browser technology I’m very hopeful that nobody else will either.

        • Hi Phil.

          It must be great “not to have to worry about these older techniques” these days! Sadly some of our target client base are still desperately clinging to IE6 and IE7, despite the huge push to move them away, so we unfortunately have to cater for the old world as well as the new world :-(

          But we are seriously looking at mandating Chrome Frame for those older browsers at some point soon, which would free us to ditch all the old IE hacks that still persist!

        • Adam Iley says:

          It’s not just about hacks, it’s about following rule 1: http://blog.errorception.com/2012/01/writing-quality-third-party-js-part-1.html

          At the end of the day, you have to make *some* assumptions, and usually one of them can be that the person embedding your library isn’t evil and/or stupid (e.g. redefining undefined), but when you want to play nice with others, you have to be a little bit wary.

          Ultimately the fact that something as simple as the correct way to tell if something is an array or not in javascript can generate so much discussion proves the point.

  • @Pat – yeah, I understand your scenario and I’m pleased to hear you are still fighting the good fight. Chrome Frame FTW!

    @adam – nice link. Looks like this Array problem is being discussed quite a bit at the moment:
    http://uxebu.com/blog/2012/01/19/javascript-snippets-isarray-arguments-to-array-x-y-padding-array-unique/

    I think I may have been persuaded to use `isArray` :)

  • John Cowan says:

    The order you get from for/in is the internal order in the hash tables that underlie objects, so it’s not predictable.

  • Juan Mendes says:

    You can’t trust the length property? Yes you can, if you understand how it works. You didn’t seem to realize that there’s a difference between a property not existing and it being set to undefined. You can’t tell by testing if (a.b === undefined); you have to use if (a.hasOwnProperty("b")).

    You can’t trust arrays not to behave like objects? Yes, don’t use arrays as objects (hash maps). Arrays are for integer indexed members. It’s not like PHP where arrays are both a hash map and an integer indexed array.

    You can’t trust the array iterator? They are sparse arrays, your examples are kind of silly, yes. The better thing to say is do not use for in with arrays.

    You can’t trust the typeof operator? That is very true, and your example is what I believe is the best way to do it. Some people suggested instanceof, but that will fail if you are working with multiple frames. That is, window.frames[0].Array !== window.frames[2].Array, so you would never know which Array to test against.

  • Gma says:

    Blah blah blah
