BitRouter’s XML State Machine Technology For Configurable Graphical User Interfaces
The technology described in this white paper is covered by U.S. Patent No. 7,877,727
March 14, 2011
Using BitRouter’s patent pending XSM Technology, television and set-top box designers can rapidly prototype and deliver new functionality and graphical designs. The underlying software code base remains intact and does not require changes. As a result, design and testing become a much simpler process that can involve creative people who are not necessarily familiar with low-level programming languages.
This paper discusses the XML State Machine (XSM) technology and its use for rapid user interface (UI) development. XSM uses generalized hierarchical state machines for creating UIs, where the UIs are defined by an XML configuration file.
TVgui is an implementation of the XSM technology for digital TV devices. This paper covers the need for the TVgui program, the design approach taken, the XML syntax used to abstract the GUI, the steps in creating the graphical assets and XML source code for the GUI, the generalized state machine implementation, and the trade-offs of the approach.
|DTD||Document Type Definition|
|EMS||Electronic Manufacturing Service|
|GUI||Graphical User Interface|
|HSM||Hierarchical State Machine|
|ODM||Original Design Manufacturer|
|OEM||Original Equipment Manufacturer|
|TVgui||A digital TV application based on XSM technology|
|UML||Unified Modeling Language|
|XML||eXtensible Markup Language|
|XSM||XML State Machine, BitRouter’s patent pending technology|
Anyone who has been involved in developing user interfaces understands the complexities of that endeavor. First, a human interface development expert analyzes the UI requirements and conceptualizes a user-friendly look and feel. A graphic artist then refines the look of the GUI and creates its media assets. Finally, programmers design and implement the GUI code, integrating the media assets and implementing the feel (behavior) of the GUI. Each step is inherently error prone and time consuming.
In the consumer electronics (CE) industry today, design cycles are rapidly shrinking. CE device manufacturers and designers want turnkey solutions that can be customized, tested, manufactured, and delivered to market in a matter of weeks. Frequently, silicon companies and EMS providers deliver reference designs for which OEMs and ODMs build custom user interfaces. These user interfaces need rapid and frequent changes to differentiate products and add unique value to each model. The time required to design and implement the custom user interface is a key bottleneck in the “time to market.”
BitRouter wanted a way to significantly improve the process and eliminate this custom user interface bottleneck. Using BitRouter’s XSM technology, OEMs and ODMs rapidly create and customize GUIs for digital TV and embedded devices, such as TV sets, converter boxes, and set-top boxes. Using this technology, a GUI can be defined and configured using an XML file. An executable program parses and renders the defined GUI. This is generically called skinning. BitRouter’s approach to skinning also involves defining the behavior of the GUI, not just its looks.
The resulting program, called an XML State Machine (XSM) engine, is best described as a generalized hierarchical state machine, configurable by an XML file, and targeted at creating UIs. The XSM engine source code is completely reusable. The only difference between different UIs for the same device is the XML code. Coding is simplified to skinning and developing the XML file defining the GUI.
This paper discusses a particular type of XSM engine meant for digital TV receivers, appropriately called TVgui. XSM engines can be developed for any device that requires a user interface, such as gas pumps, credit card scanners, mobile phones, medical devices, etc.
XML Syntax Definition
The XML syntax is defined in a Document Type Definition (DTD) file. The DTD abstracts out the state machine and GUI concepts. See Listing 1 (DTD Syntax) for reference.
The feel of a GUI is defined by how it reacts to events. A GUI can behave differently to the same event depending on the state it is currently in. An important aspect of the syntax definition is the abstraction of the state machine nature of the GUI. This involves defining events and how the GUI reacts to them. The XML syntax is modeled after UML state machines (http://www.uml.org/). It has a TopState element as the XML file’s root element, a SubState element that defines a state that the GUI can be in, and a Reaction element defining how to react to an event. The TopState root element contains one or more Reaction elements and one or more SubState elements, and a SubState element contains zero or more Reaction elements and zero or more SubState elements, thus giving a hierarchical state machine skeletal structure. In UML parlance, TopState is a pseudostate, so it must have at least one Reaction element that causes the state machine to go to one of TopState’s nested SubStates when the TopState is entered.
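As a rough illustration of the skeletal structure just described, the following Python sketch parses a tiny skin and checks the TopState rule. The element names TopState, SubState, and Reaction come from this paper; the attribute names (name, event, action, target), the state names, and the helper functions are assumptions made for the example, since the actual DTD in Listing 1 is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical skin skeleton. Element names follow the paper; the attribute
# names ("name", "event", "action", "target") are assumed for illustration.
SKIN = """
<TopState>
  <Reaction event="onEntry" action="GoTo" target="WatchTV"/>
  <SubState name="WatchTV"/>
  <SubState name="MainMenu">
    <Reaction event="Exit" action="GoTo" target="WatchTV"/>
    <SubState name="Setup"/>
  </SubState>
  <SubState name="Standby"/>
</TopState>
"""

def initial_target(root):
    """Return the SubState name that TopState drops into when entered.

    Raises ValueError if TopState has no onEntry/GoTo Reaction of its own,
    or if that Reaction names a SubState that does not exist in the skin.
    """
    if root.tag != "TopState":
        raise ValueError("root element must be TopState")
    known = {s.get("name") for s in root.iter("SubState")}
    for reaction in root.findall("Reaction"):  # direct children only
        if reaction.get("event") == "onEntry" and reaction.get("action") == "GoTo":
            target = reaction.get("target")
            if target not in known:
                raise ValueError(f"GoTo target {target!r} is not a SubState")
            return target
    raise ValueError("TopState needs an onEntry Reaction with a GoTo action")

def state_paths(element, prefix=""):
    """List every SubState as a slash-separated path, in document order."""
    paths = []
    for sub in element.findall("SubState"):
        path = f"{prefix}/{sub.get('name')}"
        paths.append(path)
        paths.extend(state_paths(sub, path))
    return paths

root = ET.fromstring(SKIN)
```

Calling initial_target(root) on this skin yields the name of the state entered at startup, and state_paths(root) makes the nesting of SubStates visible as paths.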
The Reaction element has an enumerated event attribute that defines the event to be acted upon. In the TVgui digital TV application, the XSM engine uses certain middleware services, one of which fills the program’s event queue with input events. Defining the different values of the event attribute was simply a matter of modeling the input event enumerations (for front panel pushbuttons, a remote control, and a keyboard) in the appropriate C header file. To that set of enumerations we added other state machine and GUI-related events: for example, an onEntry event so that an action can be done when a SubState is entered, an onExit event so that an action can be done when a SubState is exited, an onTimeout event for when a timer defined in the XML file times out, and an onAny event used, for example, to wake the device from a standby state upon any event occurring.
The Reaction element has enumerated action and guard attributes. The action to an event can be to do nothing (i.e., swallow the event); transition to another SubState; start, stop, or restart a timer; display or erase an image or textbox; spawn a SubState in another window or destroy a SubState that was spawned; reread the XML file; post another event; and a host of digital TV-specific actions, like turn closed captioning on or off, set a specific audio or menuing language, and so on. The guard attribute qualifies reacting to an event—when the guard is true, the event is acted upon, and when it is false, the event is not acted upon.
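The interplay of event, guard, and action can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the guard names (noChannelMap, hasChannelMap), the action names, and the rule that the first matching Reaction with a true guard wins are hypothetical, not taken from the TVgui DTD.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Reaction:
    event: str
    action: str
    guard: Optional[str] = None  # None is treated as "always true"

# Hypothetical guards, each evaluated against a simple context dictionary.
GUARDS: Dict[str, Callable[[dict], bool]] = {
    "noChannelMap": lambda ctx: not ctx.get("channels"),
    "hasChannelMap": lambda ctx: bool(ctx.get("channels")),
}

def select_reaction(reactions: List[Reaction], event: str, ctx: dict) -> Optional[Reaction]:
    """Return the first Reaction for `event` whose guard is true, else None."""
    for reaction in reactions:
        if reaction.event != event:
            continue
        if reaction.guard is None or GUARDS[reaction.guard](ctx):
            return reaction
    return None  # no Reaction applies; the event is not acted upon here

# Assumed example: scan for channels on first start, otherwise go watch TV.
REACTIONS = [
    Reaction("onEntry", "StartChannelScan", guard="noChannelMap"),
    Reaction("onEntry", "GoToWatchTV", guard="hasChannelMap"),
    Reaction("Mute", "DoNothing"),  # swallow the event
]
```

With an empty channel list in the context, the onEntry event selects the scan action; once channels exist, the same event selects the transition instead.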
Graphically-related elements define the look of a GUI. There are elements representing fonts, textboxes, images, and backgrounds. One reaction to an event may be to display or erase an image or a textbox. For such a reaction the Reaction element can contain an Image or TextBox element.
For use in a menuing context—for example, where a Main-Menu SubState contains a number of other SubStates—a SubState can have one graphical representation for when the SubState is highlighted in its parent SubState (via a MenuEntryOn element) and another representation for when it is unhighlighted (via a MenuEntryOff element). This is how buttons are defined in the XML file. The implementation is such that when a SubState’s graphics are rendered, all of its immediate SubStates are examined for these MenuEntryOn and MenuEntryOff elements so that their buttons can be drawn. When a SubState is cursored over and selected, the default behavior of TVgui is to enter that state. This behavior can be overridden, though, by a Reaction element.
Coupling the state machine nature of a GUI with its graphical nature in the XML file as described leads to a natural way to develop a menuing system. For simple actions, such as setting the audio language or menuing language, SubStates can be used to define buttons which, when cursored over and selected, cause a Reaction element to perform the action. For more complex situations, such as changing a channel or displaying channel and program information, a ViewPort element is used.
A ViewPort is a GUI widget that either controls the middleware in some manner or extracts and displays information from the middleware. By default, a ViewPort becomes visible when the SubState it is in is entered. The ViewPort has events dispatched to it when it gains focus, such as when a highlight or cursor moves over it. Likewise, the ViewPort is erased when the SubState is exited. This default behavior can be changed by Reaction elements, which can turn a ViewPort on or off. A ViewPort can accept events once it is on and has focus. When it is off, its graphical elements are removed and it cannot accept events.
The ViewPorts within a SubState and the SubState’s contained SubStates are navigable items, or hotspots. TVgui has a default navigation algorithm that determines how these items are navigated using Up, Down, Left, and Right events. This default navigation can be overridden with the MenuFocusOverride element.
The implementation is such that if a ViewPort has input focus (i.e., it has been cursored over), it has the first shot at handling the event. If the ViewPort does not handle the event, a Reaction element that can handle the event is searched for in the current SubState. If one is not found, the Reaction elements of the current SubState’s parent SubState are searched to see if they can handle the event. If not, that SubState’s parent is searched, and so on, until the event is either handled or, if the TopState does not handle it, dropped. Miro Samek, the author of Practical Statecharts in C/C++, refers to this as behavioral inheritance.
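The search order described above can be sketched as follows. This is a minimal Python model, not the TVgui implementation: the SubState class, the event-to-action dictionary, and the convention that a focused ViewPort is a callable returning True when it consumes an event are all assumptions for illustration.

```python
class SubState:
    def __init__(self, name, parent=None, reactions=None):
        self.name = name
        self.parent = parent              # None for the TopState
        self.reactions = reactions or {}  # event name -> action name
        self.focused_viewport = None      # callable(event) -> bool, or None

def dispatch(current, event):
    """Return (handler name, action) for `event`, or None if it is dropped."""
    # 1. A ViewPort with input focus gets the first shot at the event.
    viewport = current.focused_viewport
    if viewport is not None and viewport(event):
        return (current.name, "ViewPort")
    # 2. Otherwise search the current SubState, then each parent in turn.
    state = current
    while state is not None:
        if event in state.reactions:
            return (state.name, state.reactions[event])
        state = state.parent
    # 3. Not even the TopState handled it, so the event is dropped.
    return None

# Hypothetical three-level hierarchy.
top = SubState("TopState", reactions={"Power": "GoToStandby"})
menu = SubState("MainMenu", parent=top, reactions={"Exit": "GoToWatchTV"})
setup = SubState("Setup", parent=menu)
setup.focused_viewport = lambda event: event in ("Up", "Down")
```

Here a Power event raised while in Setup bubbles all the way up to TopState, which is the sense in which nested SubStates inherit the behavior of their parents.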
Steps in Creating a GUI
To create a GUI with this technology, the recommended steps are, in order:
- Sketch the GUI by hand, identifying the SubStates in the GUI. For TV applications, a natural UI has three major SubStates immediately under TopState:
  - A Watching TV SubState, where all the ViewPorts that would be used when watching a program are placed. These are the ViewPorts that would be used outside of a menuing system, like channel changing, channel and program banner information, and closed caption service selection.
  - A Main Menu SubState, which is the top element of all functionality accessed through a menuing system.
  - A Standby SubState, where the device is in a “powered-off” mode.
- Create the XML file (or modify an existing one), first putting in the state machine skeletal structure. This means putting the appropriate SubStates under the TopState and the appropriate SubStates within the other SubStates.
- Put the ViewPorts in the appropriate SubStates.
- Put in the Reaction elements, fleshing out the full behavior of the system. This may mean overriding the default navigation, spawning SubStates in separate windows, and managing SubState transitions appropriately (e.g., when in the Watching TV SubState, a Remote Control Guide button keypress event can cause a transition to an EPG SubState nested within the Main Menu SubState).
- Create a binary representation of the XML using tools provided by BitRouter. Using a binary representation of the XML in the final target provides two advantages: a) the memory footprint of the XML State Machine engine is substantially reduced because it does not need to parse XML, and b) the time required to switch between skins is minimal because the XML is already in a binary format and does not need to be parsed.
Since TVgui does not include the XML parser, primarily to save memory, a PC-based utility called createbx is provided that creates the binary version of the XML file. The createbx utility uses the libxml2 library from xmlsoft (http://xmlsoft.org/) to validate and parse the XML file. The first thing createbx does is validate the XML file against the DTD. If that passes, it parses the XML file and checks that the file is valid from an application point of view. If the XML file is invalid for either a DTD or an application-specific reason, createbx logs an error message and exits. So between each of the steps above, or between any significant modifications of the XML, it is always good to run createbx and see if there are any problems that can be fixed sooner rather than later.
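To give a feel for what an application-level check might look like, here is a small Python sketch. It is not createbx: the paper does not list the checks createbx performs, so the two checks shown (unique SubState names and resolvable GoTo targets) and the attribute names name, action, and target are assumptions. The standard-library parser used here only tests well-formedness, so the DTD validation step that createbx performs through libxml2 is not reproduced.

```python
import xml.etree.ElementTree as ET

def check_skin(xml_text):
    """Return a list of error messages; an empty list means the skin passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    errors = []
    seen = set()
    # Assumed rule 1: SubState names must be unique so GoTo targets are unambiguous.
    for sub in root.iter("SubState"):
        name = sub.get("name")
        if name in seen:
            errors.append(f"duplicate SubState name: {name}")
        seen.add(name)
    # Assumed rule 2: every GoTo must name a SubState that exists in the skin.
    for reaction in root.iter("Reaction"):
        if reaction.get("action") == "GoTo" and reaction.get("target") not in seen:
            errors.append(f"GoTo target not found: {reaction.get('target')}")
    return errors

# Hypothetical skin with a GoTo that points at a missing SubState.
BROKEN_SKIN = """
<TopState>
  <Reaction event="onEntry" action="GoTo" target="WatchTV"/>
  <SubState name="MainMenu"/>
</TopState>
"""
ERRORS = check_skin(BROKEN_SKIN)
```

Collecting all messages before exiting, rather than stopping at the first one, is a design choice of this sketch; it suits the edit-and-recheck loop recommended above.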
Listing 2 is an example of a relatively simple skin. The TopState defines the fonts to be used in the skin and its Reaction elements define the overall behavior of the skin. TopState contains the three SubStates mentioned previously: Watching TV, Main Menu and Standby.
A note about the ViewPorts within the Watch-TV SubState: for TV applications, there will always be a Watch-TV SubState with the ChannelSelection, ChannelInfo, CaptionServiceSelection, and EPGDetailedProgramInfo ViewPorts within it, so it was convenient to implement these ViewPorts such that the ChannelSelection ViewPort receives all the events and dispatches them as appropriate to the other ViewPorts. The only other cooperating ViewPorts are the EPGGridSelection and EPGDetailedProgramInfo ViewPorts, where EPGGridSelection receives events and appropriately dispatches them to EPGDetailedProgramInfo.
Figure 1 shows the main menu when it is first entered. Figure 2 shows a channel scan in progress. When TVgui is started for the very first time, there is no channel map data and the skin automatically starts a channel scan to obtain the data. Figure 3 shows how the GUI looks when the main menu is entered and the UP or DOWN key is pressed, so that the Signal Meter ViewPort is shown.
Figure 1 – Main Menu
Figure 2 – Channel Scan
Figure 3 – Signal Meter
Generalized State Machine Implementation
This section discusses TVgui’s internal data structures. The basic data structure is the one representing a SubState. A linked-list of all the SubStates is created, with the head of the list representing the TopState. The SubState data structure contains a pointer to the SubState’s parent SubState (which for the TopState is NULL), and linked-lists of the SubState’s Reaction, ViewPort, textbox, and image data structures, as well as pointers to SubStates used for navigation (Up, Down, Left, Right) purposes. During application initialization, a pointer to the current SubState is set to the head of the SubState linked-list (which points to the TopState).
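The data structure described above might be modeled roughly as follows. This Python sketch is an assumption-laden stand-in for the engine's internal structures: a Python list plays the role of the linked list, the field names are invented, and the XML attribute name "name" is hypothetical. Only the element names and the ViewPort attribute Item come from this paper.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass(eq=False)
class SubStateRecord:
    name: str
    parent: Optional["SubStateRecord"] = None  # None for the TopState
    reactions: List[dict] = field(default_factory=list)
    viewports: List[str] = field(default_factory=list)
    # Navigation neighbours; left empty here, to be set by navigation logic.
    nav: Dict[str, Optional["SubStateRecord"]] = field(
        default_factory=lambda: {"Up": None, "Down": None, "Left": None, "Right": None}
    )

def build_records(element, parent=None, out=None):
    """Flatten the state hierarchy into one list whose head is the TopState."""
    if out is None:
        out = []
    record = SubStateRecord(name=element.get("name", element.tag), parent=parent)
    record.reactions = [dict(r.attrib) for r in element.findall("Reaction")]
    record.viewports = [v.get("Item") for v in element.findall("ViewPort")]
    out.append(record)
    for sub in element.findall("SubState"):
        build_records(sub, record, out)
    return out

# Hypothetical skin fragment.
SKIN = """
<TopState>
  <Reaction event="onEntry" action="GoTo" target="WatchTV"/>
  <SubState name="WatchTV">
    <ViewPort Item="ChannelSelection"/>
    <ViewPort Item="ChannelInfo"/>
  </SubState>
  <SubState name="MainMenu">
    <SubState name="Setup"/>
  </SubState>
  <SubState name="Standby"/>
</TopState>
"""

records = build_records(ET.fromstring(SKIN))
current = records[0]  # at initialization the current SubState is the TopState
```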
The XML State Machine engine, TVgui in this case, is started by handling an onEntry event. The parsing of the XML file has already verified that the TopState element (which at startup is the current SubState) contains a Reaction element having an event attribute with a value of onEntry and an action attribute with a value of GoTo.
When transitioning from the current SubState to a target SubState, following the UML, any SubState exit actions are done first from the current SubState up to, but not including, the least common ancestor SubState of the current and target SubStates. This means that Reaction elements are searched for onExit event attribute values. Next, entry actions (other than GoTo’s) are done from (but not including) the least common ancestor SubState to the target SubState, meaning that Reaction elements are searched for onEntry event attribute values with other than GoTo action attribute values. Finally, for the target SubState, any onEntry GoTo action is done to handle the case where dropping into one SubState causes a drop into a nested SubState (in UML, this would be referred to as handling an initial pseudostate).
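The exit and entry ordering can be made concrete with a short Python sketch. It models only the ordering rule described above; the SubState class, the initial field standing in for an onEntry GoTo Reaction, and the state names are assumptions, and the sketch presumes both states belong to the same tree with no cyclic GoTo chains.

```python
class SubState:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None for the TopState
        self.initial = None   # target of this state's onEntry GoTo, if any

def chain_to_top(state):
    """Return [state, its parent, ..., TopState]."""
    chain = []
    while state is not None:
        chain.append(state)
        state = state.parent
    return chain

def transition(current, target):
    """Return (ordered list of (event, state name) steps, final SubState)."""
    steps = []
    while True:
        up = chain_to_top(current)
        down = chain_to_top(target)
        # Least common ancestor: first state on the way up that also
        # appears on the target's path to the top.
        lca = next(s for s in up if s in down)
        for state in up:  # exit from current up to, not including, the LCA
            if state is lca:
                break
            steps.append(("onExit", state.name))
        entered = []
        for state in down:  # collect target's path below the LCA
            if state is lca:
                break
            entered.append(state)
        for state in reversed(entered):  # enter from outermost to target
            steps.append(("onEntry", state.name))
        if target.initial is None:
            return steps, target
        # The target's onEntry GoTo drops into a nested SubState.
        current, target = target, target.initial

# Hypothetical hierarchy based on the SubStates named in this paper.
top = SubState("TopState")
watch = SubState("WatchTV", top)
menu = SubState("MainMenu", top)
epg = SubState("EPG", menu)
standby = SubState("Standby", top)
top.initial = watch
```

For the Guide button example mentioned earlier, transition(watch, epg) exits WatchTV, then enters MainMenu and EPG in that order. Startup corresponds to transition(top, top), which follows the TopState's GoTo into WatchTV.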
Another middleware service TVgui uses is a window manager. Each graphical element in the XML file is put into its own window. A SubState can have a Background element to be used in a menuing context that all other graphical elements of a SubState are put on top of. Consequently, Background windows have the lowest window priority. Windows containing the SubStates’ MenuEntryOn and MenuEntryOff images or textboxes have the next highest priority. All other windows containing Images and TextBoxes have the next highest priority, and finally, windows containing ViewPorts have the highest priority.
Admittedly, this strategy can create a large number of windows, but so far this has not been a problem. Each port of the middleware has a predefined maximum number of windows, typically 255. The most complicated GUI built so far uses 180 windows.
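The window priority scheme amounts to a four-tier ordering, which the following Python sketch illustrates. The tier numbers, the tuple representation of a graphical element, and the function name are assumptions; the ordering of the tiers and the typical limit of 255 windows are taken from the text above.

```python
# Priority tiers from lowest (drawn first) to highest (drawn on top).
TIER = {
    "Background": 0,
    "MenuEntryOn": 1,
    "MenuEntryOff": 1,
    "Image": 2,
    "TextBox": 2,
    "ViewPort": 3,
}
MAX_WINDOWS = 255  # typical per-port maximum quoted in the text

def stacking_order(elements):
    """Sort (kind, label) pairs bottom-to-top, one window per element.

    Elements within the same tier keep their original relative order,
    because Python's sort is stable.
    """
    if len(elements) > MAX_WINDOWS:
        raise ValueError(f"{len(elements)} windows exceeds limit of {MAX_WINDOWS}")
    return sorted(elements, key=lambda element: TIER[element[0]])

# Hypothetical elements of a menu SubState, in document order.
ELEMENTS = [
    ("ViewPort", "SignalMeter"),
    ("Image", "Logo"),
    ("Background", "MenuBackground"),
    ("MenuEntryOn", "Setup"),
    ("TextBox", "Title"),
]
ORDERED = stacking_order(ELEMENTS)
```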
ViewPort Implementation and Other Considerations
As previously discussed, ViewPorts are GUI widgets that control and display information from the middleware. In essence, they are predefined state machines, having a fixed look and feel. The advantage to this approach is that ViewPorts allow for rapid creation of a GUI; the disadvantage is the limited customization they allow. It’s a trade-off between speed and flexibility. Currently, there is only a ViewPort XML element, with an enumerated attribute named Item identifying the specific ViewPort the element refers to. Customization is limited to setting a coloring scheme that all of the ViewPorts follow, and each ViewPort can use whatever font is defined in the XML file. In the next revision of TVgui, we plan to break out each ViewPort into its own XML element, with its own unique characteristics and options abstracted into its attributes and sub-elements, allowing a much greater degree of customization.
TVgui uses middleware services providing a kernel abstraction layer, event queue, and graphics library, among others. Porting TVgui requires porting these middleware components to the new target. This means that once BitRouter’s TVrefpak has been ported to a target, there is no further porting required for TVgui. Besides rapidly creating GUIs, TVgui has also proven to be a handy test application for the middleware components.
From a GUI design perspective, the main difference between the various boards TVgui has been ported to is their On Screen Display sizes. Typically, a skin made for one board will not look appropriate for another; it may look squashed, stretched, or too small, or the graphics components wind up in unexpected places. Graphics from one skin to another may be reusable, and one can play tricks using XML entities, but each skin is more or less specific to a particular board.
BitRouter’s graphics library supports 8-, 16-, and 32-bit bitmaps. Its media library supports JPEG, PNG, GIF, TIFF, and run-length encoded bitmap files. Using 8- or 16-bit bitmaps or compressed images helps reduce the footprint needed for a skin.
Using the same basic TVgui, a designer can implement multiple DTV devices, such as an ATSC converter box, an integrated TV, a connected home TV, etc. Once all the underlying hierarchical state machine infrastructure is in place within TVgui, its use for different devices is achieved by manipulating the XML skin. Optionally, the underlying state machines (ViewPorts) that are not required for a particular device may also be compiled out, rather than disabled using XML.
BitRouter has implemented a commercial ATSC converter box using the TVgui application. The complete skin for this product is defined by only 2,200 lines of XML code. This is a key strength of the XSM technology. OEMs and ODMs need only manipulate 2,200 lines of XML to personalize the product for market introduction. A person-year of GUI development and testing has been reduced to a week’s worth of XML editing. And no C/C++ level source code needs to be released to the OEM or ODM.
The CE industry faces shrinking design cycles. BitRouter’s patent pending XSM technology offers an order of magnitude improvement in time to market by providing the following benefits:
- Pre-packaged functionality and UIs
- UI customization by manipulating minimal lines of XML code
- An upward migration of designer skills from C/C++ programming to XML or GUI design tools
- IP protection by not requiring release of source code or API libraries
- Rapid UI prototyping for testing, demos, and products
- Reusable, generalized hierarchical state machine building blocks
- Devices with multiple user selectable and downloadable skins
- UI authoring and editing using third-party and BitRouter’s GUI editing tools
About the Author: Robert Sharp is a Principal Software Architect at BitRouter, where he develops applications for digital TV devices. He is the inventor and principal patent applicant for the XSM technology. He holds an MSEE from San Diego State University.