This post was brewing for a while in my brain, but then Mike Kavis' spot-on analysis about problems with Twitter and Technorati arrived in the blog reader today. I'm currently going through a fairly similar thing on a project I'm involved with, so here's my two cents.
I started using Twitter casually a few months ago, and have run into the "Fail Whale" screen a few too many times to take this site (not the concept, mind you, the site) seriously. Until they get their act together with respect to uptime, there's no way that I'm taking this service with much more than a grain of salt. I have used Jott and PhoneTag (formerly SimulScribe) for almost a year now, and am very satisfied with their services. Twitter? Great concept, really crappy execution when they tried to scale. Deal killer, in my book.
Technorati is another story - they had a banner going on the top of their pages for the better part of last week claiming that they were "under a heavy load." Wonderful, that's probably why I kept getting logged off the service against my wishes or had it lose where I was on the site and return me to the home page on random occasions. Not a good way to earn anyone's trust, much less loyalty.
The classic development solution to these issues is to keep writing and tweaking more code, and not to do much in the way of architecture and design up-front. This usually works for a defined (and usually short) period of time. Then, as the demands and problems escalate, the systems become so complex and jury-rigged that any further changes not only fail to fix the existing performance or usability issues, but create new ones as well.
This 'duct tape' approach to systems has its limits, and those are usually reached very quickly. Had some forethought been given to architecture and design, three things would have happened: a) the current set of needs would be met; b) the systems would be stable in production; and c) the systems could scale as demand rises. This is Architecture Benefits 101, folks, and as the Twitter example shows, these benefits are ignored at your peril.
So, I've been involved with this project where architecture has been semi-pooh-poohed by sponsors and IT management as not really necessary. That is, until it hit the fan on them within the last two weeks, when the business won a major contract and the public-facing websites are glitch-central in a lot of critical instances for their customers. Not only do they have code problems, but they also have numerous data-related issues contributing to the mess. Oops.
Now they have a double whammy: developers are struggling and scrambling to make the existing, very complex, short-cutted-to-the-max code get them through this crisis; and they now realize that there has to be some forward-looking architecture and design done so that these things don't happen again in their highly competitive marketplace. They can't afford to screw this up and lose the business that they landed - funny how stuff like this works sometimes.
Mike mentions in his post that the "Twitter Method" of "architecture" (like it or not, they do have one, even if it sucks) and the subsequent negative outcomes across the board cost them at least double what it should have. I think it's worse than that, because mistakes like this happen over and over, so they pay for it over and over and over. I don't know about you, but I don't like paying for the same work twice, much less 5, 6, or even 10 times. But that's precisely what happens when systems are developed and deployed in this manner - the duct tape fails, and the bridge falls down.
Maybe there is some good out of this - Twitter becomes the Poster Child of Crappy Architecture, and what happens to your site, and your business, when "architecture" is performed exclusively with code stop-gaps and shortcuts. You will pay. Heavily.






