WebSocket – First implementation impressions
on Dec 03, 2010 in HTML5 by Dom
Much to my surprise I ended up taking part in the inaugural Caplin HTML5 Hack Day. I say surprise because I’m more at home writing servers and APIs than working in a browser. Thankfully, it seems there’s a feature in HTML5 for all of us!
Adam I and I teamed up and implemented WebSockets within our SL4B and Liberator products respectively. I’m not going to talk about the development of it – hopefully Adam will write something about WebSockets in the browser.
Instead, I’ll write about my initial impressions of WebSockets. To avoid accusations of cheating I’d not read the proposal or followed the Internet discussions, so my first glimpse of the WebSocket specification was at 12:30pm on Monday. These impressions are written in an irreverent way, but as usual with these things, there’ll be a kernel of truth in there if you care to burrow deep enough.
The proposal itself
Over the years, I’ve read and implemented a number of RFCs, so I’m used to their format and the clarity that comes with their structure and the language used. With hindsight, I’d say it was a somewhat successful format.
For some reason, the WebSocket proposal decides on an alternate format. Instead of providing a specification of the protocol, it explicitly tells you how to implement it – all 46 steps for the client handshake.
I’m sorry to say that I didn’t follow all the steps specified for a server implementation, which probably means that we’ll fail the undoubtedly forthcoming WebSocket validation scheme. However, implementing it on the server is fairly trivial, and as a result of writing some tests I’ve had to follow some of the client steps as well. When you’re used to dealing with HTTP, some aspects of the protocol are a bit odd, which leads us on to…
It looks like HTTP, smells like HTTP, but isn’t HTTP
This really feels like a get-out clause for the hackiness that is the handshake. Given that most production WebSocket servers will also support HTTP for fallback purposes (see later), it really should be HTTP at the negotiation stage.
The negotiation stage starts by sending what looks like a HTTP/1.1 upgrade request. Except it’s not. Except when it is. The premise for the nonsense negotiation is that it could be confused with a HTTP form post, which is spurious, since from a quick glance it could easily be confused with a HTTP/1.1 upgrade request, which of course it isn’t.
As part of the negotiation stage (and to prove that it’s not HTTP really), an 8 byte body is smuggled to the server. I presume that this is to make sure that everyone has to rewrite all their proxies to support WebSocket, which seems like an excellent strategy for ensuring corporate adoption.
The handshaking algorithm had me chuckling whilst I was implementing it. I’m sure that counting spaces has prevented a lot of accidental MD5 hashing of 3 unrelated pieces of information (as well as accidental implementations of the almost-HTTP/1.1 upgrade mechanism, which caught on so well that we’ve no need for HTTPS.)
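For anyone who hasn't had the pleasure, here is a minimal sketch of the server side of that calculation as I understand draft 76: take the digits from each of the two key headers, divide by the number of spaces, pack both results as 32-bit big-endian integers, append the 8-byte body and MD5 the lot. The function names are mine, not the specification's, and the header values are only sample inputs.

```python
import hashlib
import struct


def key_number(header_value: str) -> int:
    """Concatenate the ASCII digits in the header, then divide by the space count."""
    digits = "".join(ch for ch in header_value if ch in "0123456789")
    spaces = header_value.count(" ")
    if not digits or spaces == 0:
        raise ValueError("key header needs at least one digit and one space")
    number = int(digits)
    if number % spaces != 0:
        raise ValueError("digits are not a whole multiple of the space count")
    return number // spaces


def handshake_response(key1: str, key2: str, key3: bytes) -> bytes:
    """Return the 16-byte body the server sends back to complete the handshake."""
    if len(key3) != 8:
        raise ValueError("the smuggled request body must be exactly 8 bytes")
    # Two 32-bit big-endian integers followed by the 8 raw body bytes.
    challenge = struct.pack(">II", key_number(key1), key_number(key2)) + key3
    return hashlib.md5(challenge).digest()


if __name__ == "__main__":
    body = handshake_response("4 @1  46546xW%0l 1 5", "12998 5 Y3 1  .P00", b"^n:ds[4U")
    print(body.hex())
```

The raw 16 bytes go back to the client as the response body, which is exactly the part an intermediary that only understands HTTP is likely to mangle.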
There’s nothing like a stable, backwards compatible protocol
Indeed, but it sure isn’t WebSocket! We ended up implementing version 76 since that’s what’s supported by WebKit and hence Chrome and Safari. The current version under construction changes the protocol once more, which means that if you’ve written a server you’ll need to update it again. Thankfully Chrome auto-updates, so on the client side the pain is lessened (unless you try to connect to a server that implements an older protocol version).
A text only socket connection and stop/start bytes
For that retro feel of only being able to deal with text, I’d recommend playing around with a serial cable, or failing that, using WebSockets. More seriously, being able to tunnel over port 80 is useful for software other than web browsers, so why did it take until version 100 and something for binary data to be supported?
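To make the stop/start bytes concrete, here is a rough sketch of the text framing as I understand it in version 76: each message is UTF-8 text wrapped between a 0x00 byte and a 0xFF byte. The function names are my own, and the sketch deliberately ignores the closing handshake and anything other than plain text frames.

```python
from typing import List, Tuple


def encode_text_frame(message: str) -> bytes:
    """Wrap a text message in the 0x00 start byte and 0xFF stop byte."""
    return b"\x00" + message.encode("utf-8") + b"\xff"


def decode_text_frames(buffer: bytes) -> Tuple[List[str], bytes]:
    """Split received bytes into complete messages plus any unfinished tail."""
    messages = []
    pos = 0
    while pos < len(buffer):
        if buffer[pos] != 0x00:
            raise ValueError("expected a 0x00 start byte")
        end = buffer.find(b"\xff", pos + 1)
        if end == -1:
            break  # frame not complete yet; wait for more data from the socket
        messages.append(buffer[pos + 1:end].decode("utf-8"))
        pos = end + 1
    return messages, buffer[pos:]
```

The scheme only works because 0xFF can never appear inside valid UTF-8, which is also why arbitrary binary data can't be sent without encoding it as text first.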
No inbuilt connection heartbeats
Badly configured firewalls have a habit of half-closing TCP connections with unpredictable results (both sides may not receive a close notification). Since the point of WebSockets is to create a persistent connection across the internet, bi-directional heartbeats would be a great idea to keep the connection alive. Looking at a later draft, there appears to be capability to do this.
As a result of increased lock down of corporate environments, the ability to create a bi-directional socket connection over port 80 is incredibly useful since it (along with 443) is often the only external port that can be accessed.
However, the hixie-76 WebSocket spec seems to make the widespread adoption of WebSocket almost impossible. One thing that can be relied upon these days is that web proxies interfere with HTTP (or pseudo-HTTP) connections, adding headers which will break a strict WebSocket implementation (assuming of course that a client can smuggle those critical 8 handshake bytes through in the first place.)
I look forward to the day when a WebSocket specification is produced that is fully compatible with off-the-shelf proxy servers. Until then it’s likely to remain a fringe connection strategy not suitable for widespread adoption in locked down environments.