The third and final day of this year's QCon, and everything seemed a little subdued and quiet. Perhaps because of the conference party last night? The keynote looked to the future, and to what the next set of disruptive technologies might be.
The past was just as interesting, thinking back to yesterday's last talk on Lambdas and Streams in Java 8. Excellently communicated, but what are they doing with interfaces? Java 8 is out in a couple of weeks, so take a look.
The pick of today’s talks below:
Real time systems at Twitter
This was a great start to the day, as the speaker explained how Twitter started with a Monorail based on Ruby on Rails and a MySQL database, and then had to scale up rapidly.
The answer was to modularize both vertically and horizontally, allowing separate teams to work on their own specialisms without their abstractions leaking through to obstruct other teams. Scala-based Futures allow a whole Twitter Timeline to be constructed ahead of time, while the actual threading of that work is the domain of another team.
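To illustrate the idea of describing a timeline as a composition of asynchronous results while leaving the scheduling to something else, here is a minimal sketch. It uses Python's asyncio rather than the Scala Futures from the talk, and every name and data value in it (FOLLOWS, TWEETS, fetch_following, fetch_tweets, build_timeline) is hypothetical, not Twitter's actual code.

```python
import asyncio

# Hypothetical in-memory data standing in for real back-end services.
FOLLOWS = {"alice": ["bob", "carol"]}
TWEETS = {
    "bob": [(3, "bob: hello"), (1, "bob: first")],
    "carol": [(2, "carol: hi")],
}


async def fetch_following(user):
    """Pretend network call: who does this user follow?"""
    await asyncio.sleep(0)
    return FOLLOWS.get(user, [])


async def fetch_tweets(author):
    """Pretend network call: (timestamp, text) pairs for one author."""
    await asyncio.sleep(0)
    return TWEETS.get(author, [])


async def build_timeline(user):
    """Describe how a timeline is assembled; the event loop decides how it runs."""
    authors = await fetch_following(user)
    # Fetch every author's tweets concurrently and wait for all of them.
    per_author = await asyncio.gather(*(fetch_tweets(a) for a in authors))
    merged = [tweet for tweets in per_author for tweet in tweets]
    merged.sort(key=lambda tweet: tweet[0], reverse=True)  # newest first
    return [text for _, text in merged]


def run_timeline(user):
    """Hand the composed description to a scheduler (here, asyncio's loop)."""
    return asyncio.run(build_timeline(user))
```

The point of the split is that build_timeline only says what depends on what; how it is scheduled is decided elsewhere, which is roughly the separation of concerns between teams described in the talk.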
All of this provides scalability that delivers 1ms Timeline responses at the 50th percentile and 4ms at the 99th. The challenge then becomes monitoring (with Viz and Zipkin). Statistics are key: a 10ms to 11ms increase in the median may mask a 100ms to 400ms increase in the tail. Managing tail performance as soon as adverse stats are detected keeps everything running smoothly and efficiently.
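A small sketch of why the median alone can hide a tail regression. The latency samples are made up to echo the figures quoted in the talk (they are not Twitter's data), and the nearest-rank percentile helper is just one simple definition among several.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(samples)
    # Ceiling of p * n / 100 using integer arithmetic, to avoid float rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds: 98 fast requests, 2 slow ones.
before = [10] * 98 + [100] * 2
after = [11] * 98 + [400] * 2

median_shift = percentile(after, 50) - percentile(before, 50)
tail_shift = percentile(after, 99) - percentile(before, 99)
```

With these samples the median moves by a single millisecond while the 99th percentile moves by hundreds, so an alert keyed only to the median would barely register the change.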
Vert.x is a lightweight, reactive application platform.
It is not necessarily inspired by Node.js, but is very like it in everything it can do. And maybe it does even a little more, with its SockJS interaction as well as WebSockets. It also has built-in failover: even a Vert.x process that is running but not currently doing anything will take over all the running verticles of the others if they fail.
Big data at NASA
NASA has always collected lots of data, but now it has had an explosion of information. Up until 2006 it had collected 20 Terabytes across all its missions. One mission (MRO) has amassed 200 Terabytes since then.
All of this has to be managed, and it has to be made available in an efficient manner. So NASA at JPL has been spending time researching how to handle big data. They are closely involved with Apache projects like Tika and Apache OODT.
These projects should help them to catalogue all NASA’s planetary science data, triage incoming big data sets, and apply scientific algorithms to disparate big data sets in diverse formats.
All of which should help to absorb the 700 Tb/sec of data that is expected to be received from the Square Kilometre Array once it comes online.