Conrad Bock
Reprinted from
Journal Of Object-Oriented Programming
Vol 12, No 5, September 1999
Copyright © 1999
SIG Publications, Inc, New York, NY
A unified behavior model is one that has the capabilities of all behavior
models. At first this might seem as desirable as it is difficult to achieve.
In fact, the reverse is true. Alan Turing invented the first unified
programming language in 1936 [3]. It is a simple theoretical machine that
stores data as a sequence of characters on a tape, and modifies it with
a head that can be moved around (see Figure 1). The Turing
machine is programmed in a language that only supports reading and
writing a single character and then moving the head one position to the right or
left. Other theoretical machines have been invented, such as the Post machine,
but these proved to be no more powerful than Turing's. It is now assumed that
all possible programs can be written with a Turing machine.
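The machine described above is small enough to sketch in a few lines. The following is an illustrative simulator only, not Turing's own formulation: the function name `run_turing_machine`, the rule-table layout, and the `INCREMENT` program (which adds one to a binary number) are assumptions made for this example.

```python
from collections import defaultdict

def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a one-tape machine until no rule applies (or max_steps is reached).

    rules maps (state, symbol_read) -> (symbol_to_write, move, next_state),
    where move is -1 (one cell left) or +1 (one cell right).
    """
    cells = defaultdict(lambda: blank, enumerate(tape))  # unbounded tape
    head = 0
    for _ in range(max_steps):
        key = (state, cells[head])
        if key not in rules:
            break  # halt: the program says nothing about this situation
        write, move, state = rules[key]
        cells[head] = write
        head += move
    used = [i for i, symbol in cells.items() if symbol != blank]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))

# Hypothetical program: walk to the right end, then carry a 1 leftwards.
INCREMENT = {
    ("start", "0"): ("0", +1, "start"),
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("_", -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", -1, "done"),
    ("carry", "_"): ("1", -1, "done"),
}

print(run_turing_machine(INCREMENT, "1011"))  # prints 1100
```

Even this one-line arithmetic task needs six rules and a trip across the whole tape, which gives a feel for how laborious programming at this level is.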
Turing's unification did not induce wide industrial adoption of his machine, needless to say. It's far too hard to write programs with it. What we want is a language as capable as Turing's, but easy to use. This is more difficult than simple unification. Easy-to-use languages are generally designed for particular applications. For example, APL is the most concise for array-based applications, C is the best for working at the memory location level, Lisp is the choice for list-based programs, and so on. What we really need is a language as capable as Turing's that's easy to use for all applications. This is even more difficult to achieve. We might first try to put all languages together into one large one. The user could program in APL for arrays, Lisp for lists, and so on. But there would be no point unless the various sublanguages are integrated, so that the programmer could, for example, use array operations on lists.
What we really, really need is a Turing-equivalent language that is easy to use and is well integrated. Now we have reached a most difficult problem. Such a language would need to resolve, for example, the meaning of applying a Lisp function like CAR (get the first element of a list) to a two-dimensional array.
All this is familiar to industrial language designers as the tradeoff between generality and application-specific power. Since most languages are Turing-equivalent, the only criterion that distinguishes them is how easily certain applications can be programmed. The designer chooses the proper balance between generality and specificity in each case, without being concerned about unification. The authors of UML, on the other hand, coming from a more academic background, attempted to create the equivalent of Turing machines at the modeling level. The rest of the article takes the UML as a case study in the difficulties of building unified behavior languages for industrial use.
The UML attempts to unify three traditional kinds of behavior model: control
flow, data/object flow, and state machines [2]. The details of these kinds of
model are given in the previous article of this series [1]. Here is a quick
recap. The term "step" is used below to mean any activity that a
behavior model coordinates.
1. Control flow emphasizes the sequence of steps by requiring one step to finish before another starts, without concern for the availability of inputs. For example, postal delivery people begin rounds after getting to work, regardless of whether mail needs delivery, because there may be mail to pick up.
2. Data flow emphasizes calculation of inputs for steps by requiring that they be explicitly provided by outputs from other steps. For example, in "just-in-time" manufacturing, each assembly step is carried out when the parts arrive from other assemblies, regardless of when that is.
3. State machines emphasize response to external stimuli by requiring that each step begin only when certain events happen in the environment of the system, with step inputs provided by the events. For example, in a vending machine, depositing money causes the amount to be displayed to the buyer.
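The difference between the three lies in what makes a step start. The sketch below is one possible rendering under simplifying assumptions, not a definition taken from UML or from the earlier article; the function names and the data layouts for steps and transitions are invented for illustration.

```python
def run_control_flow(steps):
    """Control flow: each step starts when the previous one finishes."""
    return [step() for step in steps]

def run_data_flow(steps, available):
    """Data flow: a step fires as soon as all of its inputs exist.

    steps maps a step name to (input_names, function, output_name).
    """
    values = dict(available)
    pending = dict(steps)
    order = []
    progressed = True
    while pending and progressed:
        progressed = False
        for name, (inputs, fn, output) in list(pending.items()):
            if all(i in values for i in inputs):
                values[output] = fn(*(values[i] for i in inputs))
                order.append(name)
                del pending[name]
                progressed = True
    return order, values

def run_state_machine(transitions, state, events):
    """State machine: a step runs only when an external event arrives,
    and the event supplies the step's input.

    transitions maps (state, event_name) to (action, next_state).
    """
    outputs = []
    for event_name, payload in events:
        if (state, event_name) in transitions:
            action, state = transitions[(state, event_name)]
            outputs.append(action(payload))
    return state, outputs

# Postal rounds: the order is fixed, whether or not there is mail.
rounds = run_control_flow([lambda: "arrive at work", lambda: "begin rounds"])

# Just-in-time assembly: listing order is irrelevant, part arrival decides.
assembly = {
    "assemble_car": (("body", "engine"), lambda b, e: f"car({b}, {e})", "car"),
    "build_engine": (("block", "pistons"), lambda b, p: f"engine({b}, {p})", "engine"),
}
order, parts = run_data_flow(
    assembly, {"body": "body", "block": "block", "pistons": "pistons"})

# Vending machine: depositing money causes the amount to be displayed.
vending = {("idle", "deposit"): (lambda amount: f"display {amount}", "has_money")}
final_state, shown = run_state_machine(vending, "idle", [("deposit", 50)])
```

In the data-flow case `assemble_car` is listed first but runs second, because its `engine` input does not exist until `build_engine` has produced it.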
The three kinds of behavior model arose in the particular applications to which they are most suited. Users of any particular type of model usually try to use their favorite modeling technique outside of its normal domain, leading to hybrids of the three models. For example, an object-flow practitioner in manufacturing might be assigned to work on an embedded system, and naturally tries to apply object flow to it. The modeler would find it necessary to introduce special steps to wait for external events, resulting in a larger model than in the corresponding state machine, but fitting better in the object-flow style. The modeler could continue adding functionality until all aspects of state machines were covered under object flow, though in a more cumbersome way. This process repeats when the modeler is assigned to a task suited to control flow, such as business modeling.
The authors of the UML went through the same hybridization process, but starting with state machines, because these are most familiar to object-oriented programmers. Control flow and data/object flow were added later to support business modeling. The next section traces the path from state machines to control and object flow in UML. The UML has an additional control-flow model called collaboration, which is not addressed here. It shows how objects interact when performing particular tasks. The collaboration model is generally meant to show only the part of a behavior that is relevant to a particular task, and in any case, only that part which involves interaction between objects. It is not integrated with the other UML behavior models.
Except for collaborations, the UML behavior models start with state machines
and extend them to cover control flow and data/object flow, which are
collectively called activity graphs. Unification in general often uses
the same language construct to do otherwise separate things. For example, state
machines normally describe the behavior of objects, but in UML, behavior can be
taken as an object itself, with its own state machine.
For example, suppose the control flow for a postal behavior has a step for sorting the mail followed by a step for delivering the mail. Each of these steps in UML would be modeled as a state (see Figure 2). The first state represents the state of sorting mail, that is, mail is being sorted when the postal behavior is in this state. Likewise for the second state of delivering mail. Transitions between these states also refer to the behavior itself. For example, when the mail is finished being sorted, a transition is taken to the next state for delivering it. This means the transition is being triggered by an event inside the state machine, namely the event for finishing a state's activity. Normally transitions are taken when events happen outside the state machine.
The UML employs a similar approach to model data/object flow. A special kind of state called an object-flow state is used to represent the state of a behavior when one step has output an object or datum that is ready to be input to another step. Using the previous example, the step for sorting mail yields the sorted mail as output, which is then input to the mail delivery step. In UML this is modeled with an object-flow state in between mail sorting and delivery that represents the fact that sorted mail is ready to be delivered (see Figure 3). The dashed arrows notate ordinary state transitions that happen to be used with object-flow states. An object-flow state is actually more like an event, because no activity happens in it, and transitions leaving it are triggered without waiting for any other event.
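The postal example can be written out as data to make the encoding concrete. This is a rough sketch of the idea only, not the UML metamodel: the `State` class, its `on_completion` field, and `run_activity_graph` are names assumed for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class State:
    name: str
    # The step performed while in this state; None marks an object-flow state.
    activity: Optional[Callable[[str], str]] = None
    # Target of the triggerless transition taken when the state completes.
    on_completion: Optional[str] = None

def run_activity_graph(states, start, value):
    """Walk the graph; no external event is ever consulted."""
    trace = []
    current = start
    while current is not None:
        state = states[current]
        trace.append(state.name)
        if state.activity is not None:
            value = state.activity(value)  # action state: perform the step
        # Object-flow state: nothing happens, the object is simply available.
        current = state.on_completion
    return trace, value

postal = {
    "sorting mail": State("sorting mail", lambda m: f"sorted({m})", "sorted mail"),
    "sorted mail": State("sorted mail", None, "delivering mail"),
    "delivering mail": State("delivering mail", lambda m: f"delivered({m})"),
}

trace, result = run_activity_graph(postal, "sorting mail", "mail")
```

The loop never waits for anything: both the action states and the object-flow state are left as soon as they are entered and finished, which is why the object-flow state behaves more like an event than a state.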
Given the above, one might think that there is a model in UML for describing the execution of a behavior, so that the state machine could react to it. There isn't. For example, transitions that are taken on completion of a step simply have no trigger event specified for them. Normally this would mean that the machine could never leave the state. The lack of an execution model may also be the reason that object flow is treated as a state rather than an event. With triggerless transitions already used for completion events, there was nothing else to do.
The UML behavior models inherit problems that characterize language hybrids
generally. The first one that people encounter is comprehensibility. For
example, state machine users expect events to come from outside the machine,
not inside, and expect states to be states of an object, not of a behavior.
For these same reasons, control and data/object flow users do not immediately
see how state machines apply to their models. These pedagogical issues are
exacerbated in UML by the names chosen for activity graph elements: action
state, object-flow state, and so on. The result is confusion even
among the UML authors themselves.
Even if issues of understanding are addressed with added expenditure on training, significant problems remain in UML's unification, due to its emphasis on state machines. For example, a particular event may only be processed once by a state machine, because state machines are reactive to their environment, not recorders of its history. This is not acceptable in business modeling, which treats an event as a persistent object that does not disappear simply because some activity reacts to it. The reader can imagine how badly a business would perform if events were forgotten just because one action was taken in response.
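The contrast can be seen in a toy comparison. Both classes below are hypothetical simplifications written for this illustration, not UML semantics and not any business modeling tool: one discards an event once something has reacted to it, the other keeps it on record.

```python
from collections import deque

class ConsumingMachine:
    """Reactive style: an event is gone once one activity has handled it."""
    def __init__(self):
        self.queue = deque()

    def signal(self, event):
        self.queue.append(event)

    def react(self, activity):
        if not self.queue:
            return None  # nothing left to react to
        return activity(self.queue.popleft())

class PersistentEvents:
    """Business style: events are recorded and remain available."""
    def __init__(self):
        self.events = []

    def record(self, event):
        self.events.append(event)

    def react(self, activity):
        return [activity(event) for event in self.events]

def bill(event):
    return f"bill for {event}"

def ship(event):
    return f"ship for {event}"

machine = ConsumingMachine()
machine.signal("order 17 received")
billed = machine.react(bill)    # handles the event
shipped = machine.react(ship)   # None: the event was already consumed

ledger = PersistentEvents()
ledger.record("order 17 received")
billed_all = ledger.react(bill)
shipped_all = ledger.react(ship)  # the same event is still there
```

In the first case the order is billed but never shipped, which is the kind of forgotten event the business modeler cannot accept.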
Another difficulty is in the notation for data/object flow. State transitions cannot be directed to or from particular inputs or outputs of states. This means, for example, that if two input parameters to a state's activity are of compatible types, it may be ambiguous which one takes a particular input from UML's object-flow state. There is nothing in the notation to resolve the problem, though fortunately the XML exchange models for UML record enough additional semantics to disambiguate the notation. Also data/object flow doesn't usually notate the flowing object as a node, because it is semantically only an event, as discussed earlier [5]. Tool vendors are forced to invent or adjust their own notation for these cases and will consequently fragment the standard.
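A programming analogy may make the ambiguity easier to see. In the sketch below, the `Account` class, the `transfer_funds` activity, and the `candidate_parameters` helper are all hypothetical; the point is only that matching a flowing object to an input by type, which is all the notation offers, can yield more than one answer.

```python
import inspect

class Account:
    pass

def transfer_funds(source: Account, target: Account) -> str:
    """An activity with two inputs of compatible types."""
    return "transferred"

def candidate_parameters(activity, flowing_object):
    """List the inputs that could receive the object, judging by type alone."""
    parameters = inspect.signature(activity).parameters.values()
    return [
        p.name
        for p in parameters
        if isinstance(p.annotation, type) and isinstance(flowing_object, p.annotation)
    ]

# An object-flow state holding an Account feeds transfer_funds, but which input?
print(candidate_parameters(transfer_funds, Account()))  # prints ['source', 'target']
```

A notation that let the transition end on a named input would remove the guess, in the same way that a keyword argument does in a function call.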
Another problem with UML behavior integration is that state machines do not have parameters. This means those users wanting to model functional decomposition cannot effectively reuse state machines. For example, a business function for issuing an expense check cannot take the amount as a parameter. The user is forced to assign the business function to an object so that an operation can parameterize it. This may be gospel to object-oriented practitioners, but business modelers are aware that the responsibility for an activity may change over time, and consequently prefer to focus their models initially on the result that is expected from a function rather than who performs it [8][9]. This is related to the concept of interfaces adopted in the object-oriented community, but more powerful. Business modelers are well in advance of object-oriented modelers in this respect.
Related to the above issue is that the UML uses object flow for modeling cause and effect. For example, fixing a roof is an activity with at least two facts true at the end, namely the roof being fixed and the workers being free for other jobs. Each result has a different effect, namely billing the customer and assigning the workers to new jobs. Normally business models will give the roof-fixing activity two resulting events, and link each to the effect that it has [8][9]. In UML, the resulting events are modeled with an object-flow state that has a signal as its object (see Figure 6). This means the roof-fixing activity outputs two signals, and the billing and worker-reassignment activities take these as inputs. But a different business process may need to bill the customer before the roof is fixed, or reassign workers that are in the middle of a job. So BILL CUSTOMER should take a job description, not a fixed roof. Likewise, fixing a roof should not be required to output the workers that are free, just because a later operation in one case needs this information. The UML forces business modelers into defining activities that are bound to their usage, thereby impairing reusability.
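The alternative that business models prefer can be sketched as follows. This is an illustration under assumed names (`fix_roof`, `bill_customer`, `reassign_workers`, and the two process functions), not a rendering of any particular business modeling notation: the activity only reports which facts became true, and each process decides separately what those facts cause.

```python
def fix_roof(job):
    """Reports the resulting events; knows nothing about billing or crews."""
    return ["roof fixed", "workers free"]

def bill_customer(job):
    """Takes a job description, not a fixed roof."""
    return f"invoice for job {job['id']}"

def reassign_workers(crew):
    return f"{crew} assigned to next job"

def pay_on_completion(job):
    """One process: each resulting event is linked to its effect here."""
    links = {
        "roof fixed": lambda: bill_customer(job),
        "workers free": lambda: reassign_workers(job["crew"]),
    }
    return [links[event]() for event in fix_roof(job) if event in links]

def pay_in_advance(job):
    """Another process: bill first, reusing both activities unchanged."""
    invoice = bill_customer(job)
    fix_roof(job)
    return [invoice]

job = {"id": 42, "crew": "crew A"}
after = pay_on_completion(job)
before = pay_in_advance(job)
```

Because the links live in the process rather than in the signatures of the activities, `bill_customer` and `fix_roof` serve both processes without modification.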
There is not space here to detail other difficulties with UML's behavior integration. Readers will discover these on their own, no doubt. The good news is that the authors of the UML, on seeing these problems, committed to separating control flow and data/object flow from state machines in the next major release. The bad news is they may simply attempt a grander unification, creating more confusion and dysfunction.
This article examines the tradeoff between generality and application-specific
power in behavior models, using UML as a case study. UML ignores this tradeoff
and attempts a full unification, similar to what Turing achieved in computation
theory. This article explains how the UML handles its behavioral unification
and the resulting difficulties. Users would have been much better served by
multiple behavior languages designed for specific purposes, especially for
business modeling.
[1] Bock, Conrad, "Three Kinds of Behavior Model," Journal of
Object-Oriented Programming, 12:4, July/August 1999.
[2] Rational Software, et al, UML Semantics, version 1.1, Rational Software Corporation, Santa Clara, CA, September 1997. Updated 1.3 revision will be available at http://www.omg.org.
[3] Turing, Alan, "On Computable Numbers, With An Application to the Entscheidungsproblem," Proc. London Math Soc., Ser. 2, 42 (1936), pp. 230-265.
[4] Booch, Grady, James Rumbaugh, and Ivar Jacobson, The Unified Modeling Language User Guide, Addison-Wesley, 1999.
[5] Shlaer, Sally, and Stephen J. Mellor, Object Lifecycles: Modeling the World in States, Prentice Hall, 1992.
[6] Bock, Conrad, et al, "Suggested Revisions to Activity Models for Business Process Modeling", Object Management Group document ad/98-06-13, ftp://ftp.omg.org/pub/docs/ad/98-06-13.ppt, or pdf, June 1998.
[7] Seidewitz, Ed, "The minutes for the UML RTF meeting on 6/11/98", Object Management Group document ad/98-06-12, ftp://ftp.omg.org/pub/docs/ad/98-06-12.htm, June 1998.
[8] Martin, James, and James J. Odell, Object-Oriented Methods: A Foundation (UML edition), Prentice Hall, Englewood Cliffs, NJ, 1998.
[9] Keller, Gerhard and Teufel, Thomas, SAP R/3 Process Oriented Implementation: Iterative Process Prototyping, Addison-Wesley, 1998.