So the first day of Velocity London 2012 has drawn to an end. And what a day it was. Lots of cool talks, interesting people and plenty of food for thought.
Monitoring and Observability
I started the day with Monitoring and Observability by @postwait. The talk went over all the different problems one has with the monitoring of distributed systems, and the sheer difficulty of trying to debug issues when the world is constantly changing state.
The talk was rather technical, including some live hacking of awk scripts to produce telemetry. This was not for the faint of heart. I consider my knowledge of awk quite average, and now realize how poor it really is.
However, a lot of good points were raised about the importance of what data should be monitored and when. For example, it may be quite common to show average latency of web requests on a graph, but this data is completely useless without considering the cardinality of said data.
And taking the mean of the data may not be nearly as important as the 95th or 99th percentile, depending on the use case.
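To make that concrete, here is a minimal sketch in Python. The latency numbers are made up by me, not taken from the talk, and `percentile` is a simple nearest-rank helper rather than anything @postwait showed.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 95 fast requests and 5 very slow ones.
latencies_ms = [100] * 95 + [5000] * 5

mean_ms = statistics.mean(latencies_ms)
p95_ms = percentile(latencies_ms, 95)
p99_ms = percentile(latencies_ms, 99)

# Always report the sample count next to the summary numbers.
print(f"n={len(latencies_ms)} mean={mean_ms} p95={p95_ms} p99={p99_ms}")
```

With this data the mean lands at a latency that no single request actually experienced, while the 95th and 99th percentiles tell two very different stories about the tail.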
There was also a lot of emphasis on monitoring everything at all times. @postwait monitors the length of every request to his webserver. Wow. But this provides a nice rich dataset to look at when it comes to debugging. Another emphasis was on keeping historical records so that you know when something changes. What is ghttrd.exe? I have no idea, but it has been running on this box for the last year so it’s probably not what is causing the new bug.
One other salient point raised was the question of why we write “sh*tty code” (his term, not mine). He puts this down to the fact that we mess around on our development machines so long that we don’t worry about production enough.
Consider if doctors spent 99% of their time examining medical cadavers. Would they be able to switch to “professional” mode when it came to dealing with patients? And yet this is what we do when we develop code on our machines every day. Hack away and see if it sticks. Perhaps if we treated every line of our code with the reverence we would treat another human being, we would actually produce something we can be proud of. Then again, we might take 3 weeks before we can book a line of code – an appointment – and then we would retire to the golf course for a few holes…
Another cool idea he raised: Have some redundant machines and then use these machines as part of a master slave architecture for development processes. So if you think you can make an improvement, run the dev box in parallel, pass some of the requests over, check the results against the production box and throw them away but log the results (naturally) somewhere they can be observed.
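As I understood the idea, it could be sketched roughly like this in Python. Everything here (the function name, the `sample_rate` parameter, the logger name) is my own invention for illustration, not something shown in the talk.

```python
import logging
import random

log = logging.getLogger("shadow")


def handle_with_shadow(request, production, candidate, sample_rate=0.1, rng=random.random):
    """Serve from production; mirror a sample of requests to the candidate and log the comparison."""
    result = production(request)
    if rng() < sample_rate:
        try:
            shadow_result = candidate(request)
            # Record whether the dev box agreed with production.
            log.info("request=%r match=%s", request, shadow_result == result)
        except Exception:
            # A broken candidate must never affect what the user sees.
            log.exception("candidate failed for request=%r", request)
    # The candidate's answer is thrown away; only production's is returned.
    return result
```

The user only ever sees the production answer, so the worst a bad experiment can do is fill the log.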
Step by Step Optimization
The second talk was my favorite of the day. Step by Step Optimization showed how to deal with running sites on mobile devices which can often be very slow, painfully so.
@guypod showed us a pretty typical Web Application (for an e-store) and took it from a 16 second load time to a sub 4 second load time.
There are many things that you (or at least I) would never consider to be a big performance hit. For example, the biggest performance gain (6 seconds) came from altering the way that redirects are performed.
So instead of using JS to detect that the browser is a mobile browser and then redirecting to a mobile site, simply return the mobile site directly from the URL they request. There were actually a lot of subtleties to efficient redirecting which would fill a lengthy blog post by themselves.
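A rough sketch of the server-side version, in Python. The `MOBILE_HINTS` list and the template names are hypothetical and far from a complete device check; the point is only that the decision happens on the server, for the URL that was requested, with no extra round trip.

```python
# Hypothetical, deliberately incomplete list of User-Agent substrings.
MOBILE_HINTS = ("iphone", "android", "ipad", "mobile")


def is_mobile(user_agent: str) -> bool:
    """Guess from the User-Agent header whether the client is a mobile browser."""
    ua = (user_agent or "").lower()
    return any(hint in ua for hint in MOBILE_HINTS)


def choose_template(headers: dict) -> str:
    """Pick the page to render for the requested URL, without issuing a redirect."""
    if is_mobile(headers.get("User-Agent", "")):
        return "mobile.html"
    return "desktop.html"
```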
The other thing that really blew my mind was the performance HIT that you take from consolidation. When you take your 20 separate JS files and consolidate them into one file, your performance actually suffers, because browsers do not start processing JS until the entire file is downloaded.
So in the example where all the files were downloaded individually, performance was actually better for users! This has my mind buzzing with a million thoughts about how we can improve our performance.
One of the solutions provided was a concept that you should split your JS into two, the JS that is required to render the page at the top (if you do dynamic page generation) and the rest of your logic into another file to be loaded afterwards. This way you can get your page to appear really quickly.
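One way to picture the split, assuming a server that builds its pages in Python. The function and file names are made up for illustration, and using the standard `defer` attribute for the second file is my choice, not necessarily the mechanism @guypod used.

```python
import html


def render_page(body_html: str, critical_js: str, deferred_js: str) -> str:
    """Return a page that loads critical_js up front and deferred_js after parsing."""
    critical = html.escape(critical_js, quote=True)
    deferred = html.escape(deferred_js, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        # Only the JS needed to render the top of the page goes here.
        f'  <script src="{critical}"></script>\n'
        "</head>\n"
        "<body>\n"
        f"{body_html}\n"
        # The rest of the logic is fetched without holding up rendering.
        f'  <script src="{deferred}" defer></script>\n'
        "</body>\n"
        "</html>\n"
    )
```

Calling `render_page("<h1>Shop</h1>", "/js/top.js", "/js/rest.js")` puts the small file ahead of the content and the large one after it.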
There were loads of other cool optimization hints and tips which this margin is too narrow to contain…
A Web Perf Dashboard: Up & Running in 90 Minutes
For my third talk I went to Web Performance Dashboard with @jeroentjepkema. I have to say I was a little disappointed by it, largely due to the fact that we have already done a lot of dashboard research at Caplin and it didn’t provide anything new.
I think at Caplin we know what a lot of the problems with dashboards are, and this talk failed to bring anything new to my table, apart from: Measure downtime in hours, not percentage. 99% uptime and 2 hours downtime can be the exact same statistic, but one is much more meaningful to the business.
Just imagine if Google was down for two hours today. And that happened roughly every eight days. That is what 99% uptime is.
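The conversion is plain arithmetic: the allowed downtime is (100 minus the uptime percentage) percent of whatever window you measure over. A small Python sketch, with a function name of my own choosing:

```python
def downtime_hours(uptime_pct: float, window_hours: float) -> float:
    """Hours of downtime that an uptime percentage permits over a window."""
    return (100 - uptime_pct) / 100 * window_hours


# Windows expressed in hours.
EIGHT_DAYS_ISH = 200
TWO_WEEKS = 14 * 24
ONE_YEAR = 365 * 24

for label, window in (("200 hours", EIGHT_DAYS_ISH), ("two weeks", TWO_WEEKS), ("one year", ONE_YEAR)):
    print(f"99% uptime over {label}: {downtime_hours(99, window):.2f} hours down")
```

At 99%, two hours of downtime corresponds to a window of 200 hours, a little over eight days. Over a full year the same percentage allows more than three and a half days of outage, which is exactly why the hours figure lands harder than the percentage.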
Hands on Performance Deep Dive
The final talk of the day was a live analysis of web site performance by a couple of guys from Google, @PatMeenan and @souders.
They took the websites of a couple of Football teams (with plenty of Anglo-American jokes about the term “Football”) and analyzed their load times on a 1 meg connection. Chelsea: 16 second load time.
One particular issue was highlighted. Two sites had image carousels. In one, all the images were below the fold and yet they were loaded immediately, really slowing down the page. In the other, the images were above the fold and were lazy loaded. However, the carousel did not wait for each image to be fully loaded before it moved on to the next one. This meant that for the first spin of the carousel it was mainly showing blank space. Not good.
Whether this can be applied to webapps is questionable, but I think there is a lot to consider about lazy loading when possible.
More interesting to me were some of the tools that I saw being used to investigate the issues. WebPageTest (http://www.webpagetest.org/) is a really cool tool, and I need to get our developers using this yesterday.
Ignite Strata + Velocity
With the talks of the day over, we retired to an Ignite for some quick fire talks: 20 slides, 15 seconds each, talk about something you know. Some very cool talks about technology by some very interesting people. We are definitely going to be seeing some very cool things coming along in the near future.
And with that, day 1 was over. Stumbling out of the Hilton at 7:30pm exhausted, but exhilarated. Can’t wait until tomorrow!
Updated: Added links to slides