Data Models

We present the data models for a generic workflow editor catering for different workflow languages and for multi-lingual meta-workflow editing. We require this generic editor to have the versatility to facilitate existing working practices by retaining the same look-and-feel of the editors used by the parent workflow languages/systems...

Workflow enactment systems that support multi-lingual workflows are emerging, and good-quality web-based workflow editors are now feasible. This provides a unique opportunity to introduce multi-lingual graphical workflow editors, which would yield substantial benefits:

  • workflow users would find it easier to share and combine methods encoded in multiple workflow languages,
  • the common framework would stimulate conceptual convergence and increased workflow component sharing, and
  • the many workflow communities could share a substantial part of the effort of delivering good quality graphical workflow editors in browsers.

Workflow systems and user roles

The main challenge for a generic web-based workflow editor has been to design a single model that is comprehensive enough to integrate the diversity of workflow languages, concepts, components and representations without being overly complex. We identify three user roles and note that each role contributes information predominantly to particular parts of the data model that we introduce below.

The workflow language importer is an expert in that language who configures the basic entities. They are responsible for providing the language properties (the Workflow Language and Category levels of the entity relationship diagram), setting up the standard look and feel (i.e., mapping categories and classes to visual forms in the controller), and establishing the default list of registries, thereby pre-populating the Class level and establishing an initial set of mWorkflowInstances.

The workflow creator is an expert in a particular workflow language and in some domain where that language is used. They can carry out every step in the lifecycle of editing a workflow, including altering graphical representations, thereby producing, from scratch or from prior templates, new mWorkflowInstances with all their associated Instance-level information and, optionally, new composite classes that may also be submitted to a registry for reuse. They may also set some specific controller properties for their products.

The workflow editor is a typical user from a domain that applies the workflows. They view and parameterise mWorkflowInstances and then submit them. They may set viewing preferences in the controller and store versions of their submitted workflows and results, à la VisTrails.
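The way the three roles map onto parts of the data model can be summarised in a small lookup table. The following is only an illustrative sketch: the part labels are informal strings taken from the role descriptions above, and the names `Role`, `CONTRIBUTES_TO` and `roles_for` are hypothetical, not part of the editor.

```python
from enum import Enum


class Role(Enum):
    IMPORTER = "workflow language importer"
    CREATOR = "workflow creator"
    EDITOR = "workflow editor"


# Parts of the data model that each role mainly contributes to, as read from
# the role descriptions. The labels are informal, illustrative strings.
CONTRIBUTES_TO = {
    Role.IMPORTER: {"language level", "category level", "class level",
                    "instance level", "controller"},
    Role.CREATOR: {"class level", "instance level", "controller"},
    Role.EDITOR: {"instance level", "controller"},
}


def roles_for(part: str) -> list:
    """Return the names of the roles that contribute to the given part."""
    return sorted(role.name for role, parts in CONTRIBUTES_TO.items()
                  if part in parts)


category_roles = roles_for("category level")
class_roles = roles_for("class level")
```

In this reading, only the importer touches the language and category levels, whereas all three roles set some controller values.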

A data model for the 'model' perspective

Our development work on a generic web-based workflow editor adopts the model-view-controller (MVC) pattern for interactive systems.

Here the model captures the properties of each workflow language, each community's mechanisms for sharing, such as access to registries of services and data, and the details of each workflow instance. The properties of a language will be specified once per language by a specialist in that workflow language, and the sharing mechanisms will be shaped and pre-populated at that stage. The view provides a manipulable visualisation of the model, e.g., of a particular workflow instance that is being created, edited or submitted. The controller contains parameters that govern the transformations between the model and the view. In part, it is set by workflow language experts as they install their language in the framework, so that the familiar look-and-feel encourages users to adopt the web-based system. In part, it is set by user preferences, e.g., determining which aspects of a workflow are visible, how nesting, scale and complexity are managed, as well as conventional control of sharing, colour, authorship, etc.
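The split of controller settings between language experts and users can be pictured as two layers of parameters, with user preferences shadowing the language defaults. The sketch below assumes dictionary-based settings and invented keys (`process_shape`, `show_annotations`, `colour`); it illustrates the idea only and is not the editor's actual controller.

```python
from collections import ChainMap

# Hypothetical settings installed with a language by the importer.
language_defaults = {
    "process_shape": "rounded-box",
    "show_annotations": True,
    "colour": "steelblue",
}

# Hypothetical per-user preferences: only the keys the user has changed.
user_preferences = {"show_annotations": False}

# ChainMap searches its mappings left to right, so a user preference
# shadows the language default with the same key.
controller = ChainMap(user_preferences, language_defaults)


def render(process_instance: dict, settings) -> dict:
    """Derive a view description from a model entity and controller settings."""
    view = {
        "ref": process_instance["id"],      # link back to the model entity
        "label": process_instance["name"],
        "shape": settings["process_shape"],
        "colour": settings["colour"],
    }
    if settings["show_annotations"]:
        view["annotation"] = process_instance.get("annotation", "")
    return view


view = render({"id": "p1", "name": "Merge", "annotation": "joins two inputs"},
              controller)
```

Here the shape and colour come from the language defaults, while the annotation is hidden because the user switched it off.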

Entity relationship diagram for the 'model' aspect of the generic web-based workflow editor:

Registries & external resources

| Entity | Description |
| --- | --- |
| mRegistry | External descriptions of computational resources, data sources, libraries, workflow components, tools, and web services. |
| mExternalResource | Available compute and data resources. |
| mPackage | Collections of components. |

Workflows and languages

| Level | Entity | Description |
| --- | --- | --- |
| Workflow Language | mWorkflowLanguage | Each workflow language installed. |
| Category | mTextCategory | Major roles for text, e.g., plain, script, structured, XML. |
| Category | mConnectionCategory | How data are passed, e.g., as parameters, files, streams, and control flow, e.g., split, join pair, or condition. |
| Category | mConnectorCategory | Types of input and output to a process. |
| Category | mProcessCategory | Categories of process, e.g., application, inline function, web service or stream processor. |
| Class | mTextClass | A role for text, e.g., class name, instance identifier, input parameters, description, annotation. |
| Class | mConnectionClass | A specific pattern of data transport and flow control, e.g., deliver output file to destinations and start them. |
| Class | mConnectorClass | A specific form of connection termination on a process boundary, e.g., parameter input or data-stream output. |
| Class | mProcessClass | Behaviour and algorithm that this process applies, e.g., DBQuery, Merge. |
| Instance | mTextInstance | Acts in exactly one of approximately 20 roles, including: {naming, identifying, describing or annotating} a {process, connection, connector or workflow} or providing a parameter or script. |
| Instance | mConnectionInstance | An instance of a connection class, with a given connector or plain text as source and one or more destination connectors. |
| Instance | mConnectorInstance | A particular connector on a process instance at an end of a connection. |
| Instance | mWorkflowInstance | The whole workflow on which the editor is acting or a sub-workflow corresponding to an expansion of a composite or meta-node process. |
| Instance | mProcessInstance | An instance of a process class. |
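The mConnectionInstance description above carries a small structural constraint: the source is either a connector or plain text, and there must be at least one destination connector. A minimal sketch of how that constraint could be enforced is given below. The field names (`process_id`, `name`, `connection_class`) and the example class names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class MConnectorInstance:
    process_id: str  # hypothetical id of the process instance it sits on
    name: str        # hypothetical connector name, e.g. "output"


@dataclass
class MConnectionInstance:
    connection_class: str                   # name of an mConnectionClass
    source: Union[MConnectorInstance, str]  # a connector, or plain text
    destinations: List[MConnectorInstance]

    def __post_init__(self) -> None:
        # The description requires one or more destination connectors.
        if len(self.destinations) < 1:
            raise ValueError("a connection needs at least one destination")


# A connection from one connector to another.
piped = MConnectionInstance(
    "file-delivery",
    MConnectorInstance("p1", "output"),
    [MConnectorInstance("p2", "input")],
)

# A connection whose source is plain text, e.g. a literal parameter value.
literal = MConnectionInstance("parameter", "42",
                              [MConnectorInstance("p2", "threshold")])
```

Constructing a connection with an empty destination list raises a `ValueError` in this sketch, so an editor built this way could reject dangling connections early.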

To make explicit the roles of entities in the model, entities belonging to the model (M) perspective are prefixed with m. The entity relationship diagram represents the model (M) perspective of the logical information needed by the workflow editor. A number of relationships with mTextInstance have been suppressed for clarity. These allow names, annotations, comments, parameters and scripts to be specified and controlled.

There are two main parts to the model perspective: firstly, workflows and languages, and secondly, registries and external resources. Workflows and languages include the entities for defining the logical flow of a workflow and distinct workflow instances; registries and external resources provide information about available workflow components and constructs, and about accessible computation and data resources. Where applicable, workflow constructs may be bundled as packages that would be loaded as palettes of usable icons in the editor's view perspective.

The logical distinction between the two parts derives from the provenance of the data stored in the entities. Users compose workflows from the configured workflow languages using information provided in the workflow editor. In contrast, registries and external resources contain information that is typically set up by an expert with knowledge of the addresses and interfaces of systems outside the workflow editor. Information in these registries is accumulated over many sessions by many users working either via the web-based generic editor or via today's tools. The goal is to offer a generic interface for diverse registries; eventually this should be underpinned by a standard registry API. Users can thus conveniently connect to and use multiple registries.
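A generic interface over diverse registries could be as small as one shared search operation. The sketch below is hypothetical throughout: no standard registry API exists yet, so `Registry`, `InMemoryRegistry` and `search_all` are invented names, and an in-memory list stands in for a real remote registry.

```python
from typing import Dict, Iterable, List, Protocol


class Registry(Protocol):
    """Hypothetical minimal interface that each registry adapter would offer."""
    name: str

    def search(self, term: str) -> List[str]: ...


class InMemoryRegistry:
    """Stand-in for a real registry; holds component names in a list."""

    def __init__(self, name: str, components: Iterable[str]) -> None:
        self.name = name
        self._components = list(components)

    def search(self, term: str) -> List[str]:
        needle = term.lower()
        return [c for c in self._components if needle in c.lower()]


def search_all(registries: Iterable[Registry], term: str) -> Dict[str, List[str]]:
    """Query every connected registry and group the hits by registry name."""
    return {r.name: r.search(term) for r in registries}


registries = [
    InMemoryRegistry("community", ["DBQuery", "Merge"]),
    InMemoryRegistry("local", ["MergeSorted"]),
]
hits = search_all(registries, "merge")
```

Because the editor would depend only on the shared interface, a new kind of registry could be added by writing one adapter, without changing the editor itself.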

The conceptual model for the entities of workflows and languages has four layers. The first layer introduces a workflow language. The second layer holds, for each language, all categories of components that can be used in potential workflows. mProcessCategory, for example, defines the types of processes dependent on the workflow language, e.g., a Job in gUSE or a PrimitivePE, CompositePE or FunctionalPE (where PE denotes Processing Element) in Dispel. The third layer represents the classes of the components for each category, while the fourth layer contains the specific instances of all of the classes in any workflow instance that is being edited. Fortunately, as we are only editing the graph, not interpreting it, the model does not have to capture all of the semantics of the various workflow languages; it only needs to discriminate nodes and edges that must be treated differently in each installed language. That is, it needs to differentiate nodes or edges when they are rendered differently or when the available editing actions differ. For example, a processing node that is primitive cannot be opened to show its expansion, whereas a composite node (or meta node) can be opened to show its expansion in any of its available forms; this enables recursive composition of workflows to be viewed and edited.
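The four layers can be sketched as a chain of references from instance to class to category to language. In the sketch below the `expandable` flag on the category is an assumption about how primitive and composite nodes might be told apart, and the class names `ExampleFilter` and `ExamplePipeline` are made up; only the Dispel category names come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MWorkflowLanguage:          # layer 1: the language
    name: str


@dataclass(frozen=True)
class MProcessCategory:           # layer 2: categories per language
    language: MWorkflowLanguage
    name: str
    expandable: bool              # assumption: marks composite/meta categories


@dataclass(frozen=True)
class MProcessClass:              # layer 3: classes per category
    category: MProcessCategory
    name: str


@dataclass(frozen=True)
class MProcessInstance:           # layer 4: instances in an edited workflow
    process_class: MProcessClass
    identifier: str


def can_expand(instance: MProcessInstance) -> bool:
    """An editing action is offered or withheld based on the category alone."""
    return instance.process_class.category.expandable


dispel = MWorkflowLanguage("Dispel")
primitive = MProcessCategory(dispel, "PrimitivePE", expandable=False)
composite = MProcessCategory(dispel, "CompositePE", expandable=True)

# Hypothetical classes and instances, for illustration only.
leaf = MProcessInstance(MProcessClass(primitive, "ExampleFilter"), "n1")
nested = MProcessInstance(MProcessClass(composite, "ExamplePipeline"), "n2")
```

The editor never interprets what `ExampleFilter` computes; it only needs the category to decide whether the "open" action is available.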

The 'view' and 'controller' perspectives

The entities of the view perspective are required to represent only those model entities at the class and instance levels that have been or are being viewed. They are mostly homomorphic with the model perspective and carry the corresponding name with the prefix v instead of m. A good example of where their form is more complex is the representation of connections, as shown in the figure below. Here, a set of not-necessarily-connected line segments denotes an mConnectionInstance, so that curved paths, Manhattan paths, data-distribution trees, iteration and parallelisation may all be visualised. By default, the generation of the visual form is controlled by a controller entity higher in the conceptual tree, here a cConnectionCategory, that specifies the rules for all connections in that category. However, the actual values in the view perspective may be set during upload from another representation or by a user manipulating the auto-generated form. Each v entity needs to reference its corresponding m entity and retain user-set or uploaded viewing information, such as size, icons and instance positions, connection routes, colouring, shading, fonts and line styles.
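The interplay between an auto-generated route and a user-set route can be sketched as follows. This is a simplified illustration under stated assumptions: `VConnectionInstance`, `model_ref`, `user_segments` and `manhattan_route` are hypothetical names, and the default rule (one horizontal then one vertical segment) merely stands in for whatever a cConnectionCategory would specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def manhattan_route(start: Point, end: Point) -> List[Segment]:
    """Illustrative default rule: go horizontally, then vertically."""
    corner = (end[0], start[1])
    return [(start, corner), (corner, end)]


@dataclass
class VConnectionInstance:
    model_ref: str                                  # id of the m entity shown
    user_segments: Optional[List[Segment]] = None   # set by upload or editing

    def segments(self, start: Point, end: Point,
                 default_rule: Callable[[Point, Point], List[Segment]]
                 = manhattan_route) -> List[Segment]:
        # Retained user-set or uploaded routes take precedence over the
        # category-level default rule.
        if self.user_segments is not None:
            return self.user_segments
        return default_rule(start, end)


auto = VConnectionInstance("c1")
edited = VConnectionInstance("c1", user_segments=[((0, 0), (4, 3))])

auto_route = auto.segments((0, 0), (4, 3))
edited_route = edited.segments((0, 0), (4, 3))
```

Both view entities reference the same model connection `c1`; only the stored viewing information differs.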

Entity relationship diagram fragment illustrating 'model', 'view' and 'controller' perspectives:

The elements of the controller (C) perspective have the prefix c and, for the most part, the corresponding names. They are normally needed only at the category level, as they describe the mappings for all entities consistent with the category to which they are linked, but they may be created at a lower level to describe exceptional behaviour. They describe a two-way mapping between the model and view perspectives and what a user is permitted to change. For example, they specify the actions on hover, click and double-click, and the set of enabled operations shown on right click for their referenced entities. Their language-specific values are set by a workflow language importer, and their other values are set as user preferences. The handling of meta elements is a good example. They might highlight all of their input and output paths and immediately connected elements on click, show a succinct description on hover, expand to expose their internal implementation on double-click, and offer a full repertoire of available operations on right click. Such potential behaviour will be described in a corresponding controller entity.
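Category-level behaviour with lower-level exceptions amounts to a most-specific-first lookup. The sketch below assumes event and action names as plain strings; `category_actions`, `instance_overrides`, the instance id `p7` and the action labels are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical category-level controller entries: event name -> action name.
category_actions: Dict[str, Dict[str, str]] = {
    "CompositePE": {
        "hover": "show_description",
        "click": "highlight_paths",
        "double_click": "expand",
        "right_click": "show_menu",
    },
}

# Hypothetical lower-level entries describing exceptional behaviour.
instance_overrides: Dict[str, Dict[str, str]] = {
    "p7": {"double_click": "none"},   # e.g. an instance that must stay closed
}


def resolve_action(event: str, instance_id: str, category: str) -> Optional[str]:
    """Look for an instance-level exception first, then the category default."""
    override = instance_overrides.get(instance_id, {})
    if event in override:
        return override[event]
    return category_actions.get(category, {}).get(event)


default_action = resolve_action("double_click", "p1", "CompositePE")
exception_action = resolve_action("double_click", "p7", "CompositePE")
```

Only the overridden event changes for `p7`; every other event still falls back to the category entry.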

What next?

We shall provide database schemas on this page soon, and continue to populate the rest of the WBWFE webpages.

Please feel free to comment on this work and/or get in touch with us. Different views are welcome, and the only comments we'll censor are spam and other inappropriate material.