During my IT career, whenever there was a possible "major fork in the road" regarding projects with aggressive timelines coupled with the introduction of large and complex pieces of infrastructure, I have always contemplated alternative plans/solutions in case the large/complex piece of infrastructure didn't work out well or couldn't be delivered within the indicated project timeframe. This used to occur frequently when infrastructure components such as relational database engines were in their infancy, and continued with subsystems like enterprise service buses (ESBs). ESBs have come a long way from a few years ago, but they are still complex to architect, design, implement, and move to production. On top of that, the technical talent required to do this is almost always not "in-house," expensive, and in short supply.
Recently, I was assisting an organization in moving and consolidating a number of proprietary databases and data sources to a unified database available to the enterprise. I had developed an architecture and implementation direction that utilized simple and generally reliable database replication from sources to feed data into the enterprise database. I chose this approach because of two factors: a) it was relatively easy to implement, put into production, and support; and b) the project timelines were aggressive - on the order of a few months.
However, technical personnel from another division with a stake in the project insisted that we use an ESB to move the source data around. Beyond the ESB being total technical overkill from the standpoint of transporting data, the organization had no technical or operational experience with the ESB product selected. I unsuccessfully argued in meetings that implementing an ESB is usually a major project in its own right, that there are lots of 'moving parts' within infrastructure subsystems like this, and that the original project's timelines wouldn't support the effort.
I got "hooted down" by the technologists in the sister division for my viewpoint. Second-guessing and chest-thumping ensued, along with claims that they would "do whatever it took to make it work," which included hiring expensive consultants and training those involved to ensure that the effort met the goals of the project.
The only one who believed what I was saying was the project manager. He came to me later and asked what we should do, because he shared my belief that the implementation and productionalization of this ESB wasn't going to meet the aggressive timelines for the data project that the business was dictating to the IT organization. I told him that it was time to specify the original, pre-ESB approach as "Plan B," to be utilized in the event that the ESB implementation wasn't going to meet the timelines. I then modified the architecture and development plans in three ways:
- I specified that the data movement apparatus utilize the ESB, with a fairly comprehensive but not incredibly detailed spec, since there were numerous design paths available within the ESB product and no decision had been made on which one(s) would be implemented, much less shown to work.
- I refactored the initial approach using database replication and other 'one-offing' of data sources into a "Plan B" approach should the ESB approach fail to meet project timelines.
- I separated the source data flows and mechanisms in each approach such that an ESB-centric approach could be fairly easily substituted for the Plan B approach when (and if) it was ready.
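The third change, keeping each source's data flow behind its own seam so the transport can be swapped, can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual design: the class names (`DataMover`, `ReplicationMover`, `EsbMover`), the `build_movers` helper, and the source names `billing` and `crm` are all hypothetical, and the two movers only simulate replication and ESB delivery by appending to an in-memory list.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Set

Record = Dict[str, object]


class DataMover(ABC):
    """Hypothetical seam between one source system and the enterprise database."""

    @abstractmethod
    def move(self, source: str, records: Iterable[Record]) -> int:
        """Deliver records from `source`; return how many were delivered."""


class ReplicationMover(DataMover):
    """Plan B: stands in for plain database replication (simulated)."""

    def __init__(self, target: List[Record]) -> None:
        self.target = target

    def move(self, source: str, records: Iterable[Record]) -> int:
        count = 0
        for record in records:
            # Tag each row so the delivery path stays visible downstream.
            self.target.append({**record, "_source": source, "_via": "replication"})
            count += 1
        return count


class EsbMover(DataMover):
    """Preferred path: stands in for delivery through the ESB (simulated)."""

    def __init__(self, target: List[Record]) -> None:
        self.target = target

    def move(self, source: str, records: Iterable[Record]) -> int:
        count = 0
        for record in records:
            self.target.append({**record, "_source": source, "_via": "esb"})
            count += 1
        return count


def build_movers(
    sources: Iterable[str], esb_ready: Set[str], target: List[Record]
) -> Dict[str, DataMover]:
    """Pick a mover per source so flows can be cut over one at a time."""
    movers: Dict[str, DataMover] = {}
    for source in sources:
        if source in esb_ready:
            movers[source] = EsbMover(target)
        else:
            movers[source] = ReplicationMover(target)
    return movers


if __name__ == "__main__":
    enterprise_db: List[Record] = []
    # Assumed state: only the hypothetical "crm" flow has moved to the ESB.
    movers = build_movers(["billing", "crm"], {"crm"}, enterprise_db)
    movers["billing"].move("billing", [{"id": 1}, {"id": 2}])
    movers["crm"].move("crm", [{"id": 3}])
    for row in enterprise_db:
        print(row)
```

Because the choice is made per source, cutting a flow over to the ESB (or back to replication) is a change to the `esb_ready` set rather than to the code that consumes the data.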
The reaction that the project manager and I received to these specs from the ESB stalwarts was unexpected, and a bit stunning. In so many words and e-mails, they accused the PM and me of "analysis paralysis," and of failing to realize that they would "do whatever it takes" to force the ESB solution to work within the project timelines. Neither the PM nor I believed that, not because of the skill and commitment level of these folks, but because of the circumstances that we had both seen in previous roles as architects and project managers. The PM asked me to continue to work down the Plan B path "just in case."
A consultant was brought in for initial training on the ESB and to advise on the architecture and implementation issues. When he reviewed my architecture artifacts, one of his initial comments to me was "Why aren't you using replication?" Good question, I told him...:)
Over the next 4-6 weeks, there was scant progress on making the ESB work as its proponents envisioned to satisfy the project requirements, and it became clear to the team and PM that the folks implementing the ESB were in way over their heads. These folks are not sub-par performers or inexperienced developers, but they overreached and ignored the advice that the implementation and production paths for complex subsystems like this are a project by themselves and take much longer to get functional and 'right' than simpler approaches do.
The PM came to me during this period and asked me to begin setting Plan B into motion, which I did. We had Plan B up and running in the test environment within a week, and the ESB team members, while they have made some progress towards the project goals, are still working on the basics. Plan B will, most likely, be placed into production later this fall, and can be turned off and separated with minimal disruption when and if the ESB-centric solution is ready.
So, what are the lessons learned here?
First, if you are a project manager with very aggressive schedules for a fairly straightforward project, the introduction of a complex technology that could amount to being overkill means that you now have a major risk item to deal with, and probably a second 'project' to manage as well. In this case, the PM involved now has the ESB as a "project" as well as the original one.
Architects faced with the situation I describe should always have a Plan B specification available and ready to go for portions of initiatives or projects that exhibit high risk with respect to timelines, reliability, and production after go-live. Plan Bs are rarely state-of-the-art, but they are implementable, and they work, which is what really matters in the end.
Plan B specifications and design should be segregated to the extent possible from the remainder of the system such that the "preferred" subsystem can be substituted with a minimum of disruption to the overall system.
Be wary of claims by experienced staff with no background in a brand new technology that "it can't be that difficult," and "I'll do everything possible to make it work," and other such reactions when issues and objections are raised with respect to project time-to-market and production issues. In this case, the staff involved are experienced and perform well, but they were in way over their heads. While they realize and acknowledge this now and are much more cognizant of these limitations, the negative effects on the project occurred anyway, and shouldn't have.
These scenarios make business and IT organizations pay at least twice for the same work and functionality. This presents a difficult situation for IT execs and CIOs with limited or contracting budgets.
I am not a "hero" for having a Plan B in situations like this - I always have a Plan B for risky parts of architecture specifications and designs. By risk, I am generally talking about implementation timelines, budgets, functionality, and production support issues after go-live. Architects who do not offer alternatives should strongly consider adding them to their specification and design repertoires.
Finally, be prepared to stick to your guns in situations like this, because the opposition is often politically or personality motivated. There is no universal advice I can give on this because these issues are mostly situational. However, the development of Plan Bs almost always keeps options, and doors, open.