Facebook’s IPO set records for trading volume, and the start of trading in its stock was delayed by 30 minutes. The incident will be talked about for years as one of the largest debacles in the history of US trading, possibly second only to the 2010 ‘flash crash’, which briefly wiped out roughly a trillion dollars in market value. I am pretty sure Facebook’s trading fiasco has made the rounds in your coffee house and water cooler discussions, and that it will be remembered for years to come with a lot of heartburn.
The impact on the trading community was severe: mayhem, a decreased level of confidence in equity markets, and significant, irreversible damage. This is what we in the testing world call the ‘million dollar bug’: a bug that goes undiscovered, leads to a dysfunctional trading system, instills FUD (fear, uncertainty and doubt) in the minds of investors, and damages the credibility of large trading houses like NASDAQ. Not a pretty situation.
Granted, these bugs are difficult to discover as part of the standard operating procedure in the QA cycle, and granted, the damage cannot be rolled back once the bug is in action. But can the bug be avoided? Not by spending a million dollars after the bug hits you, but possibly before? You can listen to our webcast on the million dollar bug syndrome, but I’ll summarize what we want to say. There are three routes to take with these kinds of special bugs: tactical (address the disruption in a timely manner), strategic (find the root cause and address the situation), or preventive (make sure the problem does not recur).
Most times, you will be unable to prevent these bugs from occurring, primarily because they are driven by Murphy’s Law and many dependencies that are usually out of your control. But you can minimize their impact by better preparing your QA, technology, and business teams.
Interestingly, Rutesh, our CEO and founder, who presents this session, cites the example of an online brokerage firm that was hit by an unknown bug which halted all trading activity, losing more than 60k trades in less than 30 minutes. The approach to dealing with this was what we call the war room approach: find the root cause and create a solid back-up system to make sure that all lost trades are executed at the customer’s discretion. Of course, it’s very important to arm your QA teams with the right ammunition to create solid QA processes, what we call the out-of-the-box squad approach, which is used to identify the deadly bugs and build enough readiness to fight them as soon as they’re spotted. NASDAQ’s system failure was probably unavoidable, but some of the not-so-pleasant outcomes could have been taken care of.