Introduction
I was horrified: My beautiful tiered architecture was buckling under the stress tests—at one-tenth of the system's expected load. And this was quality hardware, not the regular junk that we scraped together in most test labs. After all, this was the company's flagship project. I began to see my once-bright career flash before my eyes.
The only redeeming thing about this situation was that I had learned the hard way to do performance tests early in the project; with that skill mastered, I thought that there might still be enough time to save the situation. A couple of months later, thankfully, things were back on track; but I'll never forget the feeling that I had back then: All the big vendors are pushing this exact architecture. How could this be happening?
Since then, I've learned that tiers and networks are not to be taken lightly, in any architecture. This article chronicles my journey, and how I got to a robust and scalable architecture.
Canned Answers
I got into architecture by the usual route: rising through the ranks from programmer to team lead; specializing in a field or two (transaction-processing, in my case); eventually, splitting my time among a number of projects. At the time, the title "architect" was not popular in my firm; I was a senior systems analyst. My boss, Ed, had no intention of restructuring or re-titling everyone whenever some new methodology or technology came out.
Anyway, object-orientation (OO) was the religion of the day, and the Unified Modeling Language (UML) was the language that we spoke. It turned out that one of the important lessons that I had not learned during my career was how central distribution was in any architecture. In my efforts to create clean and elegant OO models, I would let other concerns—such as network latency—fall through the cracks. It was not that I did not care about performance; it's just that networks were pretty fast, and getting faster all of the time, so I just assumed that everything would be okay.
Although I did make use of numerous patterns that were very much in vogue at the time, I did not roll up my sleeves and get my hands dirty with the code that they prescribed. That turned out to be a big mistake. You see, it's one thing to say, "Just use model-view-controller (MVC) for the presentation layer," and quite another to answer the slew of questions that comes on its heels—questions such as the following (a sketch after the list shows one possible wiring, and the kind of coupling decisions at stake):
· Should the client and server use the same model classes?
· Should View classes directly call Controller classes, or the other way around?
· Is the model a part of the presentation layer or the business-logic layer?
· Should or can we use the same controller classes for both Web and desktop clients?
· If there are Model objects on both the client and the server, how do we move those objects between these tiers?
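To make the second question concrete, here is a minimal sketch in Java of one defensible wiring: the View calls the Controller, the Controller updates the Model, and the View observes the Model. The class names are hypothetical, and other wirings are just as defensible; that is exactly why the questions matter.

    import java.util.ArrayList;
    import java.util.List;

    interface ModelListener {
        void modelChanged(OrderModel model);
    }

    class OrderModel {
        private final List<ModelListener> listeners = new ArrayList<>();
        private int quantity;

        void addListener(ModelListener listener) { listeners.add(listener); }

        void setQuantity(int quantity) {
            this.quantity = quantity;
            // The Model notifies its listeners, but never calls a View directly.
            for (ModelListener listener : listeners) {
                listener.modelChanged(this);
            }
        }

        int getQuantity() { return quantity; }
    }

    class OrderController {
        private final OrderModel model;

        OrderController(OrderModel model) { this.model = model; }

        // Views call into the Controller; the Controller never calls a View.
        void quantityEntered(String rawInput) {
            model.setQuantity(Integer.parseInt(rawInput));
        }
    }

    class OrderView implements ModelListener {
        private final OrderController controller;

        OrderView(OrderController controller, OrderModel model) {
            this.controller = controller;
            model.addListener(this);
        }

        void onUserTypedQuantity(String text) { controller.quantityEntered(text); }

        public void modelChanged(OrderModel model) {
            System.out.println("Redraw: quantity = " + model.getQuantity());
        }
    }

    public class MvcSketch {
        public static void main(String[] args) {
            OrderModel model = new OrderModel();
            OrderController controller = new OrderController(model);
            OrderView view = new OrderView(controller, model);
            view.onUserTypedQuantity("3");   // prints: Redraw: quantity = 3
        }
    }

Reverse any one of those arrows, and both the dependency structure of the presentation layer and what can be deployed where change with it.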
The (somewhat arbitrary) answers that I gave to some of those questions led the developers in directions that ended up causing systemic performance problems. By keeping myself apart from developers—and giving aloof answers like, "Of course, the model is part of the business-logic layer" and "Obviously, the business-logic layer will be deployed on the server"—I failed to communicate critical elements of the system's architecture. More importantly, I was not attentive to the feedback that came my way as to the various blind spots in the architecture. The developers quickly learned not to "bother" the self-important architect with the "details" of the system, and they did whatever seemed right to them. The only bit of feedback that did eventually have an impact on me was the result of the performance tests. It was clear, unequivocal proof that I had screwed up. The thing that hit home for me was, "This is not working, and I do not have a clue why. I really should know why."
Back to the Drawing Board
I then began a series of design and code reviews to find out where things stood. I can tell you that today I do these kinds of reviews all the time—usually, informally; just sitting down for a morning or afternoon with a developer, maybe even writing some unit tests myself. It's the most effective way that I have found to know what is really being developed (as opposed to what I thought the architecture dictated). It's also a great feedback mechanism for me, so that I can find out which parts of the architecture are not well-defined enough or, possibly, conflict with the chosen technology. Finally, because I am more accessible to the developers, when one of them has a hard time implementing some part of the architecture, they come to me, instead of circumventing my best intentions to "just make it work" under schedule pressure.
Janice, the Web development lead, used to just put up with my ivory-tower edicts—going around them, whenever they got in the way. After the first day that I sat down and actually wrote working code with her, code that complied with my designs, did she ever change her tone with her team! She eventually confessed to me that she used to tell her people, whenever my over-engineered designs did not work out, that they should just "do the simplest thing that could possibly work"—just as in eXtreme Programming. Excuses for subverting the design were ready, even before the code was.
After my change in attitude, I overheard her arguing one day with one of her programmers that just because his code worked did not mean that he was done; he had added a dependency that I had explicitly ruled out for that part of the system. I do not know if you have heard about architectural governance; I find that I have to practice it quite differently from how it sounds. "Architectural persuasion" is much closer to what I have found actually works.
Anyway, when the performance problems forced me to face all of the questions that I had pretty much ignored before, things did not seem so cut-and-dried anymore. I had originally been pushing for (what I thought was) an elegant callback mechanism between domain objects and the Controller classes. Logically, it seemed like a fine solution—until the performance analysis showed that the lion's share of time went toward fine-grained validation of the Model objects bouncing between client and server. The developers tried to improve the situation by performing the inter-tier calls asynchronously; but, under load, that solution just depleted the presentation tier's thread pool, to the point where the Web server was not servicing any more requests. I could have sworn that I heard someone snicker, "Just use MVC."
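In hindsight, the problem is easy to show in code. What follows is a reconstruction in Java—the names are hypothetical—of the kind of per-property validation chatter that the callback mechanism produced; every property change crossed the wire on its own.

    // Hypothetical reconstruction of the chatty pattern: each call on the proxy
    // below is a network round trip, so validating a 50-field form one property
    // at a time pays for 50 round trips before the server does any real work.
    interface RemoteValidationService {
        // Remote proxy: every invocation crosses the client/server boundary.
        boolean isValid(String propertyName, Object value);
    }

    class ChattyOrderForm {
        private final RemoteValidationService validation;

        ChattyOrderForm(RemoteValidationService validation) {
            this.validation = validation;
        }

        // Fired on every property change of a Model object: the anti-pattern.
        void onPropertyChanged(String propertyName, Object newValue) {
            if (!validation.isValid(propertyName, newValue)) {
                System.out.println("Invalid value for " + propertyName);
            }
        }
    }

Making those calls asynchronous did not remove a single round trip; it only parked each pending call on a thread from the presentation tier's pool, which is exactly why the pool ran dry under load.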
Apparently, my plight was a common one. The forums were full of developers and architects who sought guidance on these issues. Sifting through the masses of questions, explanations, code examples, and general guidance, I found references to the "Fallacies of Distributed Computing." The most surprising thing for me was that, when I looked them up, I noticed that they had been published before the rise of the Web. Even more shocking was the reference to earlier work from the 1970s. It was the first time that I had been blindsided by age-old wisdom in my cutting-edge work. Never in my four years of college had any of this come up. And I had done a double major in software and networking.
Distribution Trumps OO
The worst of it was the realization that object-oriented analysis and design were not the be-all and end-all of systems design. I was well versed in the practice of separation of concerns, but it took some time to accept that distribution trumped "pure" OO. In terms of technology, so many solutions were available that it looked as though the network was not even there: CORBA, RMI, and RPC, all on a multitude of platforms. There was no reason, I told myself, for an architect to sully his designs with such low-level details.
That was the turning point for me. I revisited all of my basic assumptions and reworked much of the design. When Ed looked over the new design, he told me that it looked like I was trying to do messaging between my client and server, but that I had not gone the whole way. He suggested that I go talk to Randy and take a look at one of his projects—an enterprise application integration (EAI) effort—to see what might be applicable.
At first, I was reluctant even to go look. I mean, none of the vendors, big or otherwise, was putting out any guidance in that direction. Even the competing technology camps were pretty much aligned, when it came to the architecture of scalable Web-based systems. A week or two later, I ran into Clem, an old mentor of mine, at a conference; and, after catching up, I asked him what he thought about my situation. Clem did not specifically answer my question (he never does; it drives me nuts sometimes), but he reminded me that breadth, as well as depth, is critical to the success of any architect. The very next day, I looked Randy up, and I asked him if he could explain the architectural and design foundations on which the project was based.
I will admit that I had a prior prejudice against EAI. As a developer, I had worked on a couple of large-scale integration projects, and the work was dull. I mean, I was out-of-my-skull bored. Translating between formats and doing simple look-ups was not my definition of career-advancing work. I would like to say that I went to that meeting with an open mind, but I was skeptical that anything would be relevant to what I was working on.
Randy was a small, quiet guy who was known for getting projects done on time and without any fuss. Although my six-foot frame towered over him, my reputation did quite the opposite. After we both got our coffees and got comfortable in one of the corner lounges, Randy began talking. He started by telling me about the nature of EAI projects and how the term came about. I was always interested in the history of computing, and Randy's knowledge was encyclopedic. After listening to the basic principles, I explained that I did not understand what any of this had to do with my Web application.
His eyes twinkled as he responded, "None of the vendors is talking about these kinds of things, right?" I nodded. He stood up and started pacing a bit.
"I had the same thing happen to me, when I was getting started. It's not that the vendors are at fault here, it's just that projects are not what they do. They build products, which they then have to sell—the most important thing being that most of today's products are enhancements on those of yesterday. You have to understand that the guidance that they put out is focused primarily on how to use their products. For some reason, a lot of us on the project side of the fence take that guidance out of that context and consider it to be the way to build projects. If there are other technologically agnostic solutions available, there really isn't any interest for the vendor to espouse them—especially not if these are superior to their own product."
I was getting a bit of a sinking feeling in my stomach, but everything that he said made sense.
"Don't get me wrong," Randy continued. "These alternate styles of working eventually make it into the products; but it usually takes two to three versions for that to happen. You think that EAI work has little to do with Web applications? Well, let me tell you that quite a lot of the work we do these days has to do with getting between Web servers and middle-tier servers, because of poor design choices that were made there. Sometimes, I think that these Web guys employ the MVC pattern without even thinking about the Fallacies of Distributed Computing—you know, the one about latency being zero."
I was lucky that I had read up on that before then! I tried to give a knowing smile, but I do not think that it came out that way, because Randy gave me this look that said, "Don't tell me you did that."
"You see, the layered architectures that everyone is bandying about say nothing about how those layers should be distributed. There's always the issue about just what the project really requires that often ends up in a mismatch between the design and deployment. I see relatively simple designs that end up being deployed piecewise on a bunch of servers, and other more intricate designs running on a single server."
Apparently, Randy was just warming up.
"This idea of design being independent of deployment just doesn't work in the field—either in terms of performance or scalability. You can't just take an in-process call that crosses two layers, distribute those layers to different servers, and get that same call to cross the wire in the same time as before. And that's even if the network isn't under load. I'll let you in on a little secret: We don't have half the bandwidth we need here. What with the new Web-based document-management system clogging up the network with 10 and 20 MB files, we can't afford to have applications cross the network for every little thing—especially not several times on the same item. Don't forget that if we're talking about a public-facing Internet system, that presentation tier will be in the DMZ, and you'll have to cross a firewall to get to your middle tier. I've even heard some cases where intranet Web servers got put in the DMZ. You really can't afford to cross the network, unless you absolutely have to."
Everything that Randy said made sense. I mean, I cannot say that there was anything particularly new there. But the questions that I had asked previously were still left unanswered. I tried to get Randy to pin down that guidance in terms of validation specifically—one of my main pain points. Although I did not expect to surprise him on that point, his broad grin spoke volumes as to the number of times that he had solved that exact problem.
"It's that whole layer-tiering thing that has you confused. There is absolutely nothing wrong with deploying the same validation logic to both the presentation and middle tiers. Although that only makes sense for simple validation. Once you get into validation logic that operates on a large number of connected objects, it might not make sense to put that on the presentation tier. Not so much because it can't run there; it can, but it's just that you want to limit the amount of data that crosses the network. The more data-intensive the logic that you're writing, the closer it should be run to the physical data store. That doesn't necessarily mean that you should always implement it in stored procedures within your database; but it should be an option that you consider."
I found myself nodding midway through that, and just kept on nodding until the end. Since then, database technologies have continued to advance—now, supporting object-oriented runtimes such as Java and the .NET Common Language Runtime (CLR), not to mention Web services—so that running complex, object-oriented validation logic within the data store is a more practical option today than it was then.
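In today's terms, Randy's advice might be sketched like this in Java (the names are hypothetical): simple, self-contained rules live in a class that is deployed to both tiers, while a rule that needs the data stays next to the data.

    import java.math.BigDecimal;

    // Simple, self-contained rules: the same class is deployed to both the
    // presentation tier and the middle tier, so each tier validates locally.
    class OrderFieldRules {
        static boolean isValidEmail(String email) {
            return email != null && email.contains("@");
        }
        static boolean isValidQuantity(int quantity) {
            return quantity > 0 && quantity <= 1000;
        }
    }

    // A data-intensive rule: it must aggregate over the customer's order
    // history, so it runs close to the data store, never on the client.
    interface OrderHistory {
        BigDecimal outstandingBalance(long customerId);
    }

    class CreditLimitRule {
        private final OrderHistory history;

        CreditLimitRule(OrderHistory history) { this.history = history; }

        boolean withinLimit(long customerId, BigDecimal newOrderTotal,
                            BigDecimal creditLimit) {
            BigDecimal projected =
                history.outstandingBalance(customerId).add(newOrderTotal);
            return projected.compareTo(creditLimit) <= 0;
        }
    }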
Conclusion
Anyway, that one-hour meeting quickly ate up the entire afternoon, and my previous cynicism about the relevance of other styles of working disappeared completely. Randy's job was to fix up the shortsighted architectural decisions that people like me had deployed in the enterprise. In that animated discussion, I had learned (among other things) that, while a single domain object model might be valid for most Web-based systems, trying to use that same object model for rich-client development was not always wise. Those mid-tier domain objects were designed to be used within the scope of a transaction, where traversing an object graph would just rehydrate those objects from the persistent store. Client-side development in an n-tier environment would not have that same behavior. Binding those objects to visual elements was the important technological piece of the puzzle there.
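Today, I would sketch the remedy in Java roughly as follows; the names and the snapshot approach are my illustration, not Randy's exact prescription. The point is that what crosses to the rich client is a flat, fully populated snapshot, rather than a domain object whose graph traversal assumes an open transaction.

    import java.io.Serializable;
    import java.util.List;

    // Hypothetical snapshot type: everything the client screen needs is
    // resolved on the server, inside the transaction, before anything
    // crosses the wire.
    class OrderSnapshot implements Serializable {
        final long orderId;
        final String customerName;          // already resolved; no lazy reference
        final List<String> lineSummaries;   // whole graph flattened server-side

        OrderSnapshot(long orderId, String customerName,
                      List<String> lineSummaries) {
            this.orderId = orderId;
            this.customerName = customerName;
            this.lineSummaries = lineSummaries;
        }
    }

The client binds its visual elements to that snapshot; nothing in it can silently trigger a trip back to the persistent store.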
When I asked about reusing Controller classes between Web and rich clients, I was reminded that the whole point of the rich client was to enable different, more interactive work than the Web model. If the views were logically different, and the tasks that the user performed differed, too, it just did not make sense to try to use the same classes to do both. It was just a reminder of the Single Responsibility Principle that was always at the heart of OO.
The experience was dizzying, I'll tell you that. On the one hand, OO principles were falling by the wayside in all areas that dealt with networks, tiers, and distribution. On the other hand, those same principles were as relevant as ever in making other, non-communication-related architectural decisions. If I had once thought that the field of solution architecture was deep, my appreciation for its breadth had greatly increased.
Since then, pragmatism and experience in the field have tempered my enthusiasm for creating "elegant" designs. Also, the knowledge around these areas has become better known, documented, and distilled in the form of patterns. While I do not think that any book could have replaced my session with Randy, one definitely would have started me off in the right direction. Some of the books that I find valuable—and to which I continue to refer in my day-to-day work—include [Fowler, et al., 2002] and [Hohpe & Woolf, 2003].
Critical-Thinking Questions
· In what parts of your architecture, if any, do you employ asynchronous communication? Why, or why not?
· Do you treat distribution boundaries differently from logical boundaries, in your architecture? If so, in what way are they different?
· How would you test the scalability of an architecture? When would be the earliest that you could perform such a test?
Further Study
· [Fowler, et al., 2002] Fowler, Martin, et al. Patterns of Enterprise Application Architecture. Reading, MA: Addison-Wesley Professional, 2002. (ISBN: 0321127420)
· [Hohpe & Woolf, 2003] Hohpe, Gregor, and Bobby Woolf. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Boston, MA: Addison-Wesley Professional, 2003. (ISBN: 0321200683)
Glossary
Architectural governance—The practice and orientation by which enterprise architectures and other architectures are managed and controlled at an enterprise-wide level.
CORBA—Common Object Request Broker Architecture.
DMZ—Demilitarized Zone, a perimeter network that sits between an organization's internal network and an external network.
EAI—Enterprise application integration.
eXtreme Programming—A deliberate, disciplined approach to software development.
Fallacies of Distributed Computing—A set of common but flawed assumptions that are made by programmers when they are first developing distributed applications.
Messaging—A style of communication between software elements, in which distinct messages are sent and received.
MVC—Model-view-controller. This is a common pattern for organizing presentation code.
N-tier—An architecture in which an application is executed by more than one distinct software agent, running on more than one physical machine.
RMI—Remote Method Invocation.
RPC—Remote Procedure Call.
UML—The Unified Modeling Language.
About the author
Udi Dahan is a Microsoft Solutions Architect MVP, a recognized .NET development expert, and the Chief IT architect of KorenTec. He is known as a primary authority on service-oriented architecture in Israel, and he consults on the architecture and design of large-scale, mission-critical systems that are developed all over the country. Udi's experience spans technologies that are related to command-and-control systems, real-time applications, and high-availability Internet services. For more information, please visit www.udidahan.com.