Diagnosing the federal exchange site's performance issues

It has been several weeks now since the federal insurance exchange website went live, was inundated with traffic, went awry, was taken down for maintenance, then brought back online still filled with glitches. Recent analysis reveals many performance problems have yet to be fixed, and end-user experiences across states remain inconsistent.

Unfortunately, what was supposed to be the dawning of a new era for the current administration quickly turned into a major boondoggle, as the site continues to wilt under pressure. After spending untold time and effort defending the Affordable Care Act from judicial and legislative challenge, how could this have happened? More importantly, what lessons can we all learn?

A common problem among web development teams is that scalability is deprioritized in favor of greater site functionality. But adding features to create a richer end-user experience can have the adverse effect of making web pages heavier and slower. This can alienate end users, who want nothing more than to get in and out of a site quickly. Indeed, Compuware’s initial analysis identified several issues resulting from the delivery of feature-rich pages, including third-party services that slowed page load times and a failure to merge CSS and JS files on the site’s registration page, which resulted in “overweight” content.
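To illustrate the merging point: each separate CSS or JS file typically costs the browser an extra request, so combining files of the same type into one bundle reduces round trips. The following is a minimal Python sketch under stated assumptions, not the build process used for the site; the function name merge_assets and the file names in the usage note are hypothetical.

```python
from typing import Iterable, Tuple


def merge_assets(sources: Iterable[Tuple[str, str]], kind: str) -> str:
    """Concatenate (name, text) pairs into a single bundle string.

    kind must be "css" or "js". JS chunks are separated by a lone
    semicolon so that a file missing its trailing semicolon cannot
    run into the next file's first statement.
    """
    if kind not in ("css", "js"):
        raise ValueError("kind must be 'css' or 'js'")

    separator = "\n;\n" if kind == "js" else "\n"
    chunks = []
    for name, text in sources:
        # Keep a comment with the original file name to aid debugging.
        chunks.append(f"/* {name} */\n{text.strip()}")
    return separator.join(chunks) + "\n"
```

For example, merge_assets([("a.js", "var a = 1"), ("b.js", "var b = 2")], "js") returns one string that can be served as a single file in place of two. A real build would usually also minify and compress the result; that step is left out here.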

Officials conceded that insufficient time was allotted for testing. Certainly, in the rush to launch the site, the IT team responsible for it missed several critical opportunities for web performance optimization. But the site's performance woes extend well beyond poor code. In fact, the site provides a perfect case in point for why several deeply rooted beliefs and attitudes about web load testing and application performance management (APM) need to change.

  1. Organizations need to adopt a more proactive approach to performance management that extends throughout the entire application lifecycle. In other words, testing should not be considered a separate phase of development that happens only once, right before production. Websites and applications should be optimized for performance based on anticipated load from inception, to design, all the way through to the development phase and roll-out. This is a cultural change necessitating tools to foster close collaboration between all those with a stake in application performance, including developers, testers, QA professionals and operations.

  2. Organizations must measure and understand performance under load from the only perspective that matters – that of end users. Today, there is an incredible amount of complexity that stands between the data center and end-user browsers, including the cloud, content delivery networks and third-party services. Measuring performance under load from the true end-user perspective is the only way to understand how all of these variables are converging to deliver a single experience. New APM solutions deliver this view. Armed with this knowledge, organizations can proactively identify end-user performance degradations and fix what is humanly possible along the complete application delivery chain.

  3. Finally, organizations need to understand that even with the best planning, technology can be complicated and unpredictable. Load testing can provide peace of mind that a website or application can hold up under load, but performance issues can arise anytime and from anywhere. Load tests therefore must be supplemented with ongoing monitoring of all transactions in production, 24x7, which enables more proactive, accurate identification of bottlenecks. This also helps reduce inefficient finger-pointing and “war-rooming” at a time when every second counts -- which is likely what the site's IT team is going through as we speak.
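As a rough illustration of measuring response times under concurrent load, here is a minimal Python sketch. It is an assumption-laden example, not any vendor's tooling: the names timed_call, run_load, percentile and summarize are hypothetical, and request_fn stands in for whatever performs one end-user transaction. Note that it measures from the test client only; capturing the true end-user view across CDNs and third-party services requires real-user or synthetic monitoring beyond this sketch.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def timed_call(request_fn: Callable[[], object]) -> Tuple[float, bool]:
    """Run one request and return (elapsed seconds, succeeded)."""
    start = time.perf_counter()
    try:
        request_fn()
        ok = True
    except Exception:
        ok = False  # Count any exception as a failed transaction.
    return time.perf_counter() - start, ok


def run_load(request_fn: Callable[[], object], concurrency: int,
             total_requests: int) -> List[Tuple[float, bool]]:
    """Issue total_requests calls using up to `concurrency` threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_call(request_fn),
                             range(total_requests)))


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(results: List[Tuple[float, bool]]) -> Dict[str, float]:
    """Summarize request count, error count, median and 95th percentile."""
    latencies = [elapsed for elapsed, _ in results]
    return {
        "requests": len(results),
        "errors": sum(1 for _, ok in results if not ok),
        "median_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
    }
```

Calling summarize(run_load(request_fn, concurrency=50, total_requests=1000)) would give a quick view of how the tail (95th percentile) compares with the median as concurrency rises. The same summary could be computed continuously over production transactions, which is the kind of ongoing monitoring the third point calls for.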

In summary, while the performance issues the site experienced under heavy load during its first few weeks are not uncommon, they are extremely painful. Surely, end-user expectations for the site might have been a bit irrational and unforgiving, especially given users' knowledge of how much traffic the newly launched site was likely experiencing. Nonetheless, these expectations are today’s reality, and IT organizations must adjust – or face similar consequences.

About the Author

Andreas Grabner is a technology strategist for the Compuware APM business unit.



Reader comments

Wed, Nov 20, 2013 Bob D New Hampshire

Everyone has software problems, even Walmart. Yesterday, I added a webcam to my shopping cart, and then executed the following protocol. 1. Land on the shopping cart page 2. Click on Check Out 3. Return to step 1

Wed, Nov 20, 2013 Anonymous

You ever heard of DTS, the big Defense Travel System mess? Over $500 million before they started forcing agencies to use it. People would take one hour of leave while on TDY to avoid using DTS (it couldn't do time off, so you had to do TDY orders by hand) because doing the paperwork was faster than DTS. DTS is better today, but it's still a big kludgy system that has terrible user flow. When you train someone how to run it, you train them how to avoid the bugs in the system so they can do orders on it. I see no difference with the exchange site. Following the same pattern.

Tue, Nov 19, 2013

Getting politicians out of the timeline might help as well. There was no reason to rush this was there? Just get good requirements (this almost certainly was not done - I've worked as an IT manager in the govt for 20 years), iterate the design requirements with the programmers, do the programming (with very close oversight - this also was not likely done well), and then test, test, test, test, test. Roll it out when it is done and tested on a much smaller "pilot" group. Then phase the rollout. What's the rush? Obamacare is the law of the land. Until the website is ready, phones could have been used to sign people up (arguably many more than have been able to sign up online) until it was ready to go "live". Politicians, primarily the President, created this embarrassing mess by setting an unreasonable deadline and sticking to his guns rather than listen to those trying to tell him this was not possible in the timeframe he wanted it. Just because he wants it fast does not mean it can be delivered fast. The President looks kind of silly.

Tue, Nov 19, 2013 OccupyIT

"Surely, end-user expectations for the site might have been a bit irrational and unforgiving, especially given their knowledge of how much traffic the newly launched site was likely experiencing." Wow, I was following you right up to that specious conclusion. Last I checked six (6) people were able to complete the process on day one. Where is better capability than that on the irrationality scale? There was a great comment on NPR by an interviewee that said something like, "If your government couldn't build a bridge that stayed up or a road that you could drive on you would call this government dysfunctional and incompetent. It's 2013. A website is part of the basic infrastructure of government. If this government cannot build and run a website for their 'best new idea' and yet still has the audacity to tax us for it we can safely call this government dysfunctional and incompetent." Are we going to see an article today about McKinsey's report to Sebelius from March saying the site was at risk?
