Data streaming is a strategy for scalable or continuous data processing. We have developed DISPEL, a high-level notation for describing distributed and heterogeneous data-streaming workflows, and have a substantial body of applications described in it. An implementation based on OGSA-DAI exists, and at least two other implementations are partially constructed. The open questions that need investigating via a series of experiments are:
1] Can these applications be moved onto a cloud platform simply by using virtual machines encapsulating the existing DISPEL/OGSA-DAI implementation?*
2] If this is feasible, what performance and cost are observed on cloud infrastructures, and how do these compare with expectations?
3] Are there obvious modifications to the implementation strategy that would make it better adapted to the cloud?
This could be investigated as part of the OSDC PIRE programme of student visits.
It could also form the basis of an MSc project, or be the starting point for a PhD that investigates in depth how best to implement data-streaming workflows in a distributed context.
*The ADMIRE project delivered ADMIRE, including DISPEL enactment, packaged in a VM.