The Story of a Failed dot.com!
In the fall of 1999, a prominent dot.com was getting ready to open its electronic doors for business. The Company had terrific hopes for a great holiday shopping season and all seemed ready to go!
The technical team had given a lot of thought to the design and implementation of their e-commerce web site. They had stress-tested the hardware systems and had 'over purchased' hardware to ensure they had adequate capacity. They adopted the latest and greatest software technologies and used Microsoft Transaction Server (MTS) to ensure their application would scale.
The system went live ahead of the start of the season. Traffic flowed, and the systems team watched the load on the web servers grow and grow. On the surface, they had lots of customers! They could see that they still had plenty of capacity: CPU headroom, RAM, and disk space.
All was well! The network bandwidth was fine. Everyone could see that they had done their job, and that the customers had come and orders had to be rolling in. But under the covers all was not as it should have been!
The IT Professionals saw this:
While the Customer was seeing this!
The application problems surfaced only late in the shopping season, when the Company received the first shipping confirmations from the fulfillment service and realized that the number of completed orders was only a fraction of what appeared to be transacted on the system. Far less business was being done than the web server activity indicated. But how could that be? What were all those customers doing on the website if not buying things?
During the post-mortem (after the doors had closed in May 2000), it was determined that many of the attempted transactions were either taking too long (and the customers left) or failing altogether (and the customers never tried again).
In all the excitement, the team had not deployed solutions that would have measured and reported on the health of the application in business terms (rather than just in system terms). Essentially, they were blind to the application failing. We would never send a doctor in to diagnose a patient's problems based on blood pressure and body temperature alone, without access to data on the brain, for instance. Here the systems were alive, but the application brain was dead, and no one knew it.
They could have set up Early Warning notifications in AppMetrics for Transactions to send alerts when key transactions aborted or took longer than a specified time.
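To make the idea concrete, here is a minimal Python sketch of an early-warning rule of the kind described above: flag a key transaction when it aborts or when it runs past a time limit. This is not AppMetrics code and does not use its API. The names TransactionEvent, check_transaction, and "PlaceOrder", along with the 5-second limit, are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class TransactionEvent:
    """One completed or failed business transaction (hypothetical record)."""
    name: str          # e.g. "PlaceOrder" (illustrative transaction name)
    duration_s: float  # wall-clock time the transaction took, in seconds
    aborted: bool      # True if the transaction did not commit


def check_transaction(event: TransactionEvent,
                      max_duration_s: float,
                      key_transactions: Set[str]) -> List[str]:
    """Return alert messages for a single transaction event.

    Only transactions named in key_transactions are monitored. An aborted
    transaction always raises an alert; a committed one raises an alert
    only if it took longer than max_duration_s.
    """
    alerts: List[str] = []
    if event.name not in key_transactions:
        return alerts
    if event.aborted:
        alerts.append(f"ABORTED: {event.name}")
    elif event.duration_s > max_duration_s:
        alerts.append(
            f"SLOW: {event.name} took {event.duration_s:.1f}s "
            f"(limit {max_duration_s:.1f}s)"
        )
    return alerts


def scan(events: List[TransactionEvent],
         max_duration_s: float,
         key_transactions: Set[str]) -> List[str]:
    """Collect alerts across a batch of transaction events."""
    alerts: List[str] = []
    for event in events:
        alerts.extend(check_transaction(event, max_duration_s, key_transactions))
    return alerts
```

Run against a stream of order transactions, a check like this would have raised alerts on the first aborted or slow orders, while the CPU, RAM, and bandwidth graphs still looked healthy.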
The net result:
This Company is no longer in business.