IBM DeveloperWorks: The Liar View bug pattern. In my research for a simple, cogent explanation of the model-view-controller architecture that would not confuse the hell out of everyone in the world without a CS degree, I came across this article on what they call the Liar View bug pattern, an article which is, I believe, specifically designed to confuse the hell out of everyone in the world without a CS degree. But here’s the plain English version:
“Model-view-controller” is just a fancy name for the sort of architecture that everybody says you ought to be building, but nobody actually builds (except Smalltalk freaks, but never mind that). It is not specific to web applications, but that’s the example I’ll use because that’s what everybody says they want to build these days.
In your standard multi-tier web application, your JSP/ASP/PHP/DTML/whatever layer is the “view”; this is what the end user actually sees and interacts with. When the user clicks ‘submit’ in an HTML form, the request (complete with all the form information) is sent to the “controller”. In ASP development, this is another ASP page; in J2EE development, it’s a JavaBean; in other environments, it could be a CGI script. To-may-to, to-mah-to.
The controller parses out the data that the user entered in the HTML form and passes it to the “model”, which contains actual business logic. And this is where everybody always skimps. In ASP development, the business logic is supposed to be handled in a COM object, deployed under MTS. In J2EE, it’s supposed to be in an EJB. Other architectures have similar lofty notions about where business logic should be, which nobody actually follows; they just stuff the code wherever it happens to be convenient, and bitch about it later when the application can’t be extended or code gets duplicated or things generally go downhill. Don’t blame me, I told you how to do it right in the first place. Does anybody listen to me? No. Read on.
Once the model is done doing whatever business-y things it does, it returns some result back to the controller, which, depending on the result, selects a new view to present to the end user. In a web application, this means redirecting to the appropriate ASP/JSP/whatever page. Depending on your needs, your controller may also pass along the result it got from the model; in a web application, this could mean concatenating values in the query string, or stuffing them somewhere in the Session variable.
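To make that round trip concrete, here is a minimal sketch in Python rather than any particular web framework. Every name in it (InventoryModel, order_controller, confirmation_view, error_view) is made up for illustration, and a plain dictionary stands in for the submitted HTML form.

```python
from html import escape


class InventoryModel:
    """Hypothetical model: business logic only, no knowledge of HTML or forms."""

    def __init__(self):
        self._stock = {"widget": 3}

    def order(self, item, quantity):
        available = self._stock.get(item, 0)
        if quantity <= 0 or quantity > available:
            return {"ok": False, "item": item, "available": available}
        self._stock[item] = available - quantity
        return {"ok": True, "item": item, "available": self._stock[item]}


def confirmation_view(result):
    # View: turns a model result into HTML for the end user.
    return f"<p>Ordered. {result['available']} {escape(result['item'])}(s) left.</p>"


def error_view(result):
    return f"<p>Sorry, only {result['available']} {escape(result['item'])}(s) available.</p>"


def order_controller(model, form):
    """Controller: parse the raw form data, call the model, select the next view."""
    try:
        quantity = int(form.get("quantity", ""))
    except ValueError:
        quantity = 0  # unparseable input is treated as an invalid order
    result = model.order(form.get("item", ""), quantity)
    view = confirmation_view if result["ok"] else error_view
    return view(result)
```

Calling order_controller(InventoryModel(), {"item": "widget", "quantity": "2"}) plays the part of one form submission. Note that the model never sees a string from the form that the controller has not already parsed.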
I would like to reiterate that no one actually does all this.
There is a point to architecting applications this way: flexibility. The beauty of model-view-controller separation is that new views and controllers can be created independently of the model. The model knows nothing of HTML forms in ASP or JSP pages or whatever. The model defines a set of business functions that only ever get called by controllers, and the controllers act as proxies between the end user (interacting with the view) and the business logic (encapsulated in the model). This means that you can add a new view (a wireless mobile phone interface is the standard example, and again, nobody ever does that either) and its associated controller, and your model doesn’t know or care that there are now two different ways for human beings to interact with the application. This is called “code reuse”, specifically “business logic reuse”.
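A rough sketch of what that reuse looks like, again in Python with invented names (Catalog, web_controller, phone_controller). To keep it short, each controller also renders its own output, so view and controller are collapsed into one function here; a real application would keep them apart.

```python
class Catalog:
    """Hypothetical model: prices live here, presentation does not."""

    def __init__(self):
        self._prices_in_cents = {"widget": 250}

    def price_of(self, item):
        # Raises KeyError for unknown items; error handling is omitted.
        return self._prices_in_cents[item]


def format_price(cents):
    return f"{cents // 100}.{cents % 100:02d}"


def web_controller(model, form):
    # First interface: an HTML form submission, answered with HTML.
    item = form["item"]
    return f"<p>{item}: ${format_price(model.price_of(item))}</p>"


def phone_controller(model, message_text):
    # Second interface, added later: a terse text message, answered in plain text.
    item = message_text.strip().lower()
    return f"{item} {format_price(model.price_of(item))}"
```

Adding phone_controller required no change to Catalog, which is the whole argument for the separation.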
Another great thing about the model-view-controller architecture is that you can create automated tests to test your business logic, since every conceivable business-y thing you’d want to do can be called directly without going through a user interface. In essence, you create automated controllers that take their input from views that don’t require human intervention (databases, files, or even just hard-coded values), call the model, and then check the results and make sure they’re correct. There are entire frameworks available for doing this sort of testing; I wrote about Python’s unit testing framework in my book.
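As a small illustration using Python’s standard unittest module, the test case below acts as one of those automated controllers: it feeds hard-coded values to the model and checks what comes back. ShippingModel and its pricing rule are invented for the example.

```python
import unittest


class ShippingModel:
    """Hypothetical model: 500 cents for the first kilogram, 200 for each one after.

    Assumes weights are given in whole kilograms.
    """

    def shipping_cost(self, weight_kg):
        if weight_kg <= 0:
            raise ValueError("weight must be positive")
        return 500 + 200 * (weight_kg - 1)


class ShippingModelTest(unittest.TestCase):
    """An 'automated controller': hard-coded input, no user interface involved."""

    def setUp(self):
        self.model = ShippingModel()

    def test_first_kilogram_is_flat_rate(self):
        self.assertEqual(self.model.shipping_cost(1), 500)

    def test_each_extra_kilogram_adds_200(self):
        self.assertEqual(self.model.shipping_cost(3), 900)

    def test_non_positive_weight_is_rejected(self):
        with self.assertRaises(ValueError):
            self.model.shipping_cost(0)
```

Saved in a file, this can be run with python -m unittest followed by the file name.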
Notice, however, that these automated tests are only testing one piece of your application: the business logic. That’s all well and good (tested business logic is better than untested business logic), but it’s only a small piece of the entire puzzle. There may well be bugs in your views, or in your controllers, which manifest themselves as “bugs” to the end user. Maybe the controller is keeping a local cache (in the Session variable) of some business data (returned from the model) and fails to update it properly; the user does something in the view, expecting to see the data change, but lo and behold, the data does not change, and the user becomes very very angry and takes it out on their keyboard/monitor/minitower/whatever. The bitch of it is that the data really did change — at least, as far as the model is concerned — but after it changed, the controller was doing other stuff (keeping a local cache) and failed in its duties (keeping the cache fresh). Automated testing that only tests the model will not catch this bug.
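Here is a toy version of that stale-cache bug, sketched in Python. The names (AccountModel, StaleController, FixedController) are hypothetical, and a plain dictionary stands in for the Session variable.

```python
class AccountModel:
    """Hypothetical model: holds the real balance."""

    def __init__(self):
        self._balance = 100

    def balance(self):
        return self._balance

    def deposit(self, amount):
        self._balance += amount


class StaleController:
    """Caches the balance in the session and never refreshes it."""

    def __init__(self, model, session):
        self.model = model
        self.session = session

    def show_balance(self):
        if "balance" not in self.session:
            self.session["balance"] = self.model.balance()
        return f"Balance: {self.session['balance']}"

    def deposit(self, amount):
        self.model.deposit(amount)  # the model changes...
        return self.show_balance()  # ...but the cached copy is what gets shown


class FixedController(StaleController):
    def deposit(self, amount):
        self.model.deposit(amount)
        self.session.pop("balance", None)  # invalidate the cache before redisplaying
        return self.show_balance()
```

With StaleController, viewing the balance and then depositing 50 still displays 100 even though model.balance() returns 150. A test suite that only calls AccountModel directly passes either way, which is exactly why it cannot see the bug.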
So that’s the entire point of IBM’s article that I linked above: if you test one piece of your application, it won’t catch bugs in other pieces. Duh. But it raises an important point (not really a problem, exactly, but a point nonetheless) about the model-view-controller architecture. We went to all this trouble to separate the different areas of our application (and believe me, it’s a lot of trouble), and that paid off in the sense that it allowed us to set up automated tests of our business logic. But don’t get seduced into thinking that you can fire all your QA staff because of this (or, in the case of my previous employer, not hire any to begin with). Your automated tests don’t even attempt to touch the parts of the application that end users actually care about: the interface (and associated proxies back to the business logic).
Depending on your environment, you may also be able to automate testing your interface too. For web applications, there is a framework called HttpUnit which will allow you to set up scripts for simulating actual user input in your HTML forms, and checking the return values. This is, of course, dependent on your interface, so it can be fragile if your generated HTML pages ever change significantly (like changing the name of a form field or something). Automated testing of your model is also fragile, in the sense that it depends on the public API of your model, but (in my experience) this changes much less quickly than the user interface.
Again, you should treat all of this more as an ideal than an actuality. No developers I know actually program like this, although the good ones at least try. The separation between business logic and other stuff may start out clean, but it never stays that way for long. Business logic always seeps into the controllers, and into the views themselves. And the API for models inevitably slants towards the first view that you implement, making it more difficult to implement radically different views that require data sliced in different ways, or can only offer a certain subset of the data that the model’s API functions require.
Often people cheat and implement interface-specific logic in the model, because it’s so much faster and easier and more direct to do it there than anywhere else. “Faster” is the key word here; strict separation between business logic and presentation can be time-consuming, as the controllers spend too much time marshalling data into the format the model expects (parameters to a function) and unmarshalling the results into data the view can use. So you’ll end up with an Item class that has nice business-y functions like getStockStatus (returns Integer) and getExtendedInfo (returns HashMap), and then an abomination like getExtendedInfoAndSomeOtherStuffAsHTMLForThatReportThatJoeSaidWasTooSlow (returns String, which is the actual HTML to present to the user). Oops. That’s the real world for you.
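For contrast, here is roughly what the un-cheated version might look like, as a Python sketch. The method names are Python-style stand-ins for getStockStatus and getExtendedInfo, and report_view is a hypothetical view function; none of this is taken from a real codebase.

```python
from html import escape


class Item:
    """Hypothetical model class: returns plain data, never markup."""

    def __init__(self, name, in_stock, info):
        self.name = name
        self._in_stock = in_stock
        self._info = dict(info)

    def get_stock_status(self):
        return self._in_stock

    def get_extended_info(self):
        return dict(self._info)  # a copy, so callers cannot mutate the model


def report_view(item):
    """The HTML lives here, in the view, not in Item."""
    rows = "".join(
        f"<tr><td>{escape(str(key))}</td><td>{escape(str(value))}</td></tr>"
        for key, value in sorted(item.get_extended_info().items())
    )
    return f"<table>{rows}</table>"
```

The extra copying and looping in report_view is the marshalling overhead described above, and it is the cost people are avoiding when they write the HTML-returning method on the model instead.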
Update: Hugh Pyle notes another cool thing about the model-view-controller architecture which I didn’t discuss above, namely that it can (in certain environments) operate asynchronously. Views register themselves with the model to say that they’d like to receive updates whenever certain things happen (new data added, data modified, whatever). This is usually accomplished with some sort of message passing, but it could also be a function callback; it depends on what your programming environment supports. But the point is that the model doesn’t know anything specifically about the views, only that they’ve registered to receive messages under certain circumstances.
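A minimal sketch of that registration idea, using the function-callback flavor in Python. ObservableModel and LogView are invented names, and the callbacks here run synchronously inside set(); a truly asynchronous version would hand the notification to a message queue or event loop instead.

```python
class ObservableModel:
    """Hypothetical model that notifies registered listeners of changes."""

    def __init__(self):
        self._data = {}
        self._listeners = []

    def register(self, callback):
        # The model stores a callable; it knows nothing else about the view.
        self._listeners.append(callback)

    def set(self, key, value):
        self._data[key] = value
        for callback in list(self._listeners):
            callback(key, value)


class LogView:
    """Hypothetical view that records every change it is told about."""

    def __init__(self):
        self.lines = []

    def on_change(self, key, value):
        self.lines.append(f"{key} is now {value}")
```

After model.register(view.on_change), every call to model.set() reaches the view without the model ever importing or naming a view class.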
This doesn’t work for standard multi-tier web applications, because the view is generated as HTML and sent to the client, where it sits dumbly until the end user does something to initiate an action. So the model can’t asynchronously tell the view that something interesting happened, because the view isn’t listening.
This, incidentally, is why web applications fundamentally suck. They’re like playing turn-based roleplaying games instead of first-person shooters. You get whatever you get in your HTML forms, and then you’re stuck with it until you click ‘submit’ and play the next turn.
Shameless plug: I teach this stuff for a living, you know. My next Java/J2EE class runs January 14th through 18th, 2002. For future classes, you can contact me directly or contact my employer.
© 2001-8 Mark Pilgrim