QCon London 2008 – Part 2 (Banking track details)

By libor

This post is dedicated solely to the presentations in the “Banking: Complex high volume/low latency architectures” track. For a more general conference overview, see my previous blog post.

The banking track was very interesting for me, not only because I work in exactly this domain but also because the challenges imposed by high volume/low latency systems demand a well-balanced architecture and an extremely careful selection of technology. Moreover, in this domain some of the latest and greatest emerging technologies are not always usable (for example dynamic languages, WS-*, etc.).

The track was supposed to start with Lennart Augustsson’s presentation on the use of a DSL in option pricing. Calculating option prices is considered to be the most complex task in all of applied financial mathematics, so I was quite eager to see how a DSL model could make it easier. Unfortunately Lennart Augustsson got sick, so no goodies from that one. John Davies took the slot instead of the planned DSL presentation. His quickly hammered-out presentation was actually very refreshing for me. He gave a great overview of the current state of requirements, the expected message speeds and volumes, the possibilities and limitations of various data interchange protocols, and finally what IONA can offer when dealing with large numbers of messages, specifically messages transformed from one format to another with the use of JavaSpaces.
The biggest takeaways here:

  1. The banking sector will soon (i.e. within 3-6 years) become number one in the world in processed data volume (i.e. it will surpass the telcos here!).
  2. By far the most common application interchange format is CSV, as a simple and easy-to-understand form. This I totally endorse. To the front and middle office, XML looks like a very bloated format that adds a huge amount of latency during parsing.
  3. The back office uses the SWIFT protocol. The protocol is very messy, but once it is implemented and used in production almost no errors are encountered. Why? SWIFT imposes a heavy penalty, calculated as a considerable percentage of the amount the message carried. Because messages typically carry millions or billions of dollars, the penalties are quite costly to companies. This is also one way to ensure system reliability… ;-)
  4. It seems they are using JavaSpaces extensively. Examples of “matching” and “message translation” were shown. Both examples assume a “perfectly” parallelizable process, and this is where my trouble starts. A big gain from JavaSpaces was demonstrated. I’m not 100% sure about the message translation case, but order matching definitely does not fit this assumption. In a real case one must ensure the correct order of matching (first come, first served, while respecting special order modifiers like complete volume, etc.), and I believe message translation is exactly the same case (i.e. translated messages must leave the application in exactly the same order as they arrived). Therefore I doubt you can gain as much as was presented, especially when processing must be synchronized.
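
To make the ordering constraint in point 4 concrete, here is a minimal Python sketch. It is my own illustration and not anything shown in the talk: the `translate` function, the CSV-like message layout and the artificial delays are all assumptions. Messages are translated in parallel, but the results are released strictly in arrival order, so a slow early message holds back faster later ones.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def translate(message):
    """Hypothetical translation step: 'seq,symbol,qty' text to a dict."""
    seq, symbol, qty = message.split(",")
    # Simulate uneven work: even sequence numbers take longer to translate.
    time.sleep(0.02 if int(seq) % 2 == 0 else 0.001)
    return {"seq": int(seq), "symbol": symbol, "qty": int(qty)}


def translate_in_order(messages, workers=4):
    """Translate in parallel but hand results back in arrival order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order,
        # so the consumer waits for message 0 even if message 1 is done.
        return list(pool.map(translate, messages))


incoming = [f"{i},ABC,{100 + i}" for i in range(8)]
outgoing = translate_in_order(incoming)
print([m["seq"] for m in outgoing])
```

The translation work itself scales with the number of workers, but the output side is still serialized by the slowest message at the head of the queue, which is why I would expect less gain than in the perfectly parallel case.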

The next part was “Keeping 99.95% up time at Merrill Lynch” (i.e. ML), presented by Iain Mortimer. The presentation focused primarily on designing a centralized management and monitoring system for the core bank systems (i.e. tier 0 and 1) in order to quickly pinpoint HW/SW problems and so prevent failures (e.g. when disk space is running low), or to quickly identify failures and work out what to do to fix them.

The major challenge ML faced was to unify the management and monitoring of 344+ core systems across two geo data centers (i.e. in the USA and Singapore). I’m not sure how many HW boxes and processes they actually have to manage, but according to Iain their new monitoring system generates 20 000 events/sec, which is a huge number (720M events in a 10-hour working day!).

They elected to use a slightly “customized” version of a standard monitoring/management system. Iain did not name it specifically, but I can imagine a well-known product like HP OpenView, IBM Tivoli, etc. Because of the number of updates they collect, they use hierarchical event processing. There are 4 levels of event aggregation, analysis and monitoring, from the lowest HW/SW process level up to the datacenter/global level. Each monitoring level attempts to analyze the incoming events and decides whether they should be propagated to the next higher level. This way they can greatly limit the number of events at each management level, including the top one (i.e. datacenter), and therefore not overload the system. This approach seems to me quite a smart move.
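
A minimal Python sketch of how such hierarchical filtering could work. The talk did not describe the actual rules, so the level names, the numeric severities and the thresholds below are purely my assumptions. Each level counts what it sees and forwards an event upward only if its severity reaches that level’s threshold.

```python
class Level:
    """One monitoring tier; forwards only sufficiently severe events."""

    def __init__(self, name, forward_from, parent=None):
        self.name = name
        self.forward_from = forward_from  # minimum severity sent upward
        self.parent = parent
        self.seen = 0

    def handle(self, event):
        self.seen += 1
        if self.parent is not None and event["severity"] >= self.forward_from:
            self.parent.handle(event)


# Four hypothetical tiers, lowest first; the top tier has no parent.
datacenter = Level("datacenter", forward_from=0)
system = Level("system", forward_from=3, parent=datacenter)
host = Level("host", forward_from=2, parent=system)
process = Level("process", forward_from=1, parent=host)


def severity_for(i):
    """Made-up event mix: most events are noise, very few are critical."""
    if i % 100 == 0:
        return 3
    if i % 10 == 0:
        return 2
    if i % 2 == 0:
        return 1
    return 0


for i in range(1000):
    process.handle({"id": i, "severity": severity_for(i)})

for level in (process, host, system, datacenter):
    print(level.name, level.seen)
```

With this made-up event mix only 10 of the 1000 raw events reach the top tier, which is the kind of reduction that keeps the top management level from being overloaded.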

The presentation “Real-time Java for latency critical banking applications” was probably the most important one for Java developers and less so for me. For me it was a curiosity to see what real-time systems based on GC can do and how they are constrained in terms of code and memory utilization. The presentation mainly covered an upcoming version from Sun. It runs only on the newest Solaris 10 with the latest Sun hardware, which the audience regarded as an unpleasantly closed deployment target. It would definitely be good to have a chance to test a real-time version of .NET in the enterprise world, but in the end the big question is whether it is worth doing, as coding and managing it can get quite tricky.

After a bit of real-time theory came the talk “From Betting to Gaming to Tradefair”. This one was supposed to be practical application stuff again. Matt Youill presented Betfair’s development path from betting site to trading exchange. I have mixed feelings about it. The presentation had way too much “marketing” in it, and the second half was probably the lowest quality of all the presentations. What I took from it is that Betfair has built its whole business logic into an Oracle RAC database. They utilize the DB so heavily that they are among the 5 “hottest” Oracle DB instances in the world. Given that they have over a million registered users and revenue of 200M GBP, it looks quite risky to me for the company to run the system on a single DB instance, not to mention the missing geo scaling/diversity (it might be that they actually have 2 data centers and sync the database via log shipping, but how often can they update the state at the second site? Definitely not real-time failover).

The Tradefair application was built with a different layout. They use distributed services with detailed step journaling and persistence support. The implementation is built around an abstract Actor object. This object represents user actions and plays a major role in handling user requests (submitting orders, etc.). As the Actors progress with execution they save their steps directly into plain disk files. For each master Actor instance there is a backup instance which is synchronized via those saved/updated files (i.e. log tailing). Obviously this means they use a shared SAN RAID disk array to address system reliability and recovery.
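
A minimal Python sketch of the journaling and log tailing idea as I understood it. This is not Betfair’s code: the `JournaledActor` class, the JSON step format and the submit/cancel operations are my own assumptions, and a temporary directory stands in for the shared disk. The master persists each step before applying it, and the backup replays whatever has been appended since its last read.

```python
import json
import os
import tempfile


class JournaledActor:
    """Hypothetical actor whose state is rebuilt from an append-only journal."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.orders = []
        self.offset = 0  # journal bytes this instance has already replayed

    def _apply(self, step):
        if step["op"] == "submit":
            self.orders.append(step["order_id"])
        elif step["op"] == "cancel":
            self.orders.remove(step["order_id"])

    def execute(self, step):
        """Master path: write the step to disk first, then apply it."""
        with open(self.journal_path, "ab") as journal:
            journal.write(json.dumps(step).encode("utf-8") + b"\n")
            journal.flush()
            os.fsync(journal.fileno())
        self._apply(step)

    def catch_up(self):
        """Backup path: replay complete lines appended since the last call."""
        with open(self.journal_path, "rb") as journal:
            journal.seek(self.offset)
            data = journal.read()
        # Ignore a trailing partial line that the master is still writing.
        complete = data[: data.rfind(b"\n") + 1]
        for line in complete.splitlines():
            self._apply(json.loads(line))
        self.offset += len(complete)


path = os.path.join(tempfile.mkdtemp(), "actor.journal")
master, backup = JournaledActor(path), JournaledActor(path)
master.execute({"op": "submit", "order_id": 1})
master.execute({"op": "submit", "order_id": 2})
backup.catch_up()
master.execute({"op": "cancel", "order_id": 1})
backup.catch_up()
print(master.orders, backup.orders)
```

The backup is only as fresh as its last `catch_up` call, and both instances must be able to read the same file, which is where the shared storage requirement comes from.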

It seems to me the solution will suffer when they need to go for geo scaling, as a RAID setup is not really an option there. Another question is how difficult user scaling is if each actor generates a file (I assume one specific file instance belongs to a specific user), and how easily they can address errors/failures inside the system given that they use specialized files for each actor. It really is a question how their trading adventure will end up.

The last presentation in the banking track was “LiquidityHub”, presented by Tony Harrop & Jeremy Vickers, and I have to say it was a very interesting one. The solution they showed was based on the Spring 2 container and the JRockit real-time Java version. That they were able to take the solution from nothing to production within 9 months is quite admirable.

The solution is essentially based on a 3-service configuration (gates in, calc engine, gates out), with communication among them via Fiorano JMS. They claimed the end-to-end latency is between 2 and 4 milliseconds at the 95th percentile. Asked directly whether they use Hibernate for saving data, they carefully stated NO, for performance reasons. Overall the solution seems quite lightweight, but I doubt they can really achieve the claimed latency on a Java framework even with a real-time version.
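
As a reminder of what a 95th percentile figure does and does not say, here is a small Python sketch using the nearest-rank method. The sample latencies are invented for illustration and have nothing to do with LiquidityHub’s real numbers.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, which avoids float rounding issues.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Invented end-to-end latencies in microseconds: 19 fast ones, 1 outlier.
latencies_us = [2000 + 100 * i for i in range(19)] + [41000]

print("p95:", percentile(latencies_us, 95), "us")
print("max:", percentile(latencies_us, 100), "us")
```

In this made-up sample the 95th percentile sits comfortably under 4 ms while the worst case is 41 ms, so a percentile claim on its own says nothing about the tail.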

And what was left unanswered by all the presentations? How do all those companies reliably measure latency on a distributed system without specialized HW? As I work on Windows, I have two problems here. The first is the relatively coarse tick of the standard clock (i.e. a resolution of 16ms) unless I use high-performance timers. The second is how to ensure that the clocks across distributed computers are synchronized at such a precise level (i.e. essentially microseconds if one needs to measure in milliseconds). So the lack of explanation from the presenters puts a big dent in the presented latency times.
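
One way to check the first problem on a given machine is to measure the smallest step a clock actually takes. Below is a minimal Python sketch. The `observed_tick` helper is my own, and the numbers it prints depend entirely on the OS and hardware, so I am not quoting any expected output.

```python
import time


def observed_tick(clock, attempts=100_000):
    """Smallest nonzero step seen between consecutive readings of clock()."""
    smallest = None
    previous = clock()
    for _ in range(attempts):
        current = clock()
        delta = current - previous
        if delta > 0 and (smallest is None or delta < smallest):
            smallest = delta
        previous = current
    return smallest  # None if the clock never advanced during the run


# Wall clock versus high-resolution performance counter, in nanoseconds.
print("time.time_ns tick:", observed_tick(time.time_ns), "ns")
print("time.perf_counter_ns tick:", observed_tick(time.perf_counter_ns), "ns")
```

This only covers the tick size on a single box. It does nothing for the second problem, the clock offset between machines.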

-Libor


4 Responses to “QCon London 2008 – Part 2 (Banking track details)”

  1. John Davies Says:

    Libor,
    Thanks for such a comprehensive and, in my case, complimentary write-up. A few points for future readers… I think it’s safe to say that XML is probably behind the majority of internal messaging in banks, with the exception of the front office, where you see a lot of FIX, and perhaps the back office with SWIFT. The majority of large banks have an internal canonical format, usually based on a derivative of FpML or more recently parts of ISO 20022. It is true though that we still see a large number of CSVs and Excel spreadsheets being sent around, and there are areas where this is the majority format (sadly).

    Both examples I gave (again pulled up at the last minute, I might add) were cases where, as you say, perfect parallelisation gives near-perfect scalability; order-dependent systems, however, are less well served. Obviously this reflects Amdahl’s law, and if I’d had more time I would have loved to show you some of the issues we had with parallel threads in distributed systems and how we got around them (most of them). Perhaps a topic for QCon in Denmark later this year.

    Finally, you ask about latency timing. Network monitors will frequently go down to ns timing, but most Linux systems will happily measure microseconds or in some cases better. Since we’re usually talking about a few milliseconds, a resolution of microseconds gives us plenty of resolution to measure latency at this level. A useful tool we frequently use for seeing what’s going on on the network is ethereal or tethereal (now replaced by Wireshark). I think you can get MS versions of it, but don’t try it at work as you’ll probably get fired if anyone catches you using it (i.e. try it but don’t get caught). Distributed systems are normally synchronised by NTP. It takes the network latency into account, but of course there is always some level of error; I would expect systems in a subnet to be synchronised to better than 2ms.

    Good write up, please say hi at the next show,

    -John Davies-

  2. pligg.com Says:

    QCon London 2008 – Banking Track Details

    Good overview of the recent banking track at QCon 2008 in London – talks about low-latency processing, John Davies’ presentation, SWIFT, ML, Betfair’s architecture and more.

  3. libor Says:

    John,
    Many thanks for commenting on the blog post and addressing some of my questions.

    I have to say the majority of my work so far has focused on front office applications running under a SaaS business model (i.e. a fully managed solution for banks). Only last year did I get the chance to step into middle office “waters” (i.e. clearing trades, for that matter). Therefore the majority of my experience with integrating our system with “external” banks comes from the trading part. From this point of view I have to say we do not have any client who uses FpML or XML as the data exchange format with us. What they usually ask for is a “plain” CSV file (or a DB version) based on a “variant” of the ClearVision “welcome” table. This leads me to think that the banks’ data import process is somehow accustomed to the CSV form, which serves as the basis for transformation to their internal format. As an emerging “standard” here I can see FIX drop copy IN/OUT variants, which have been getting quite big traction recently.

    Speaking of parallel execution, I’m looking forward to hearing about your experience with resolving issues on this topic, especially when an ordered execution requirement is involved.

    Latency measurement on a high-throughput system can get quite tricky on any platform, especially if you need consistent monitoring of all events with a precision of several milliseconds (say a total execution time of up to 5ms for any price update, from the incoming exchange message to the update on the user’s screen or delivery to an external system). We run our application exclusively on MSFT server versions, which do not support us well in this case. Therefore we have to take special steps to compensate for it. And through this experience I know that time correlation on a distributed system can produce quite “big” errors (well, big in terms of the short latency times we are measuring on updates here), especially when the system is under high load. From all that stems my interest in latency measurement. Thanks for shedding light on this issue from another system’s perspective.

    -Libor

