The proposed Network is motivated by the potential for synergy between two fields of technology and technique, e-Science and Digital Repositories, and by the benefits that will be obtained from increased interaction and cooperation between researchers and practitioners in these fields. This potential was recognised by Tony Hey in his keynote speech at the Open Repositories 2007 conference. Hey, currently VP of External Research at Microsoft Research and former Director of the UK e-Science Core Programme, called for an integration of repositories into the new scholarly lifecycle envisioned in e-Research.
The digital material generated from and used by academic and other research is increasingly being held in formally managed digital repositories. In many cases, these systems are currently used to hold relatively simple objects, for example an institution’s pre-prints and publications, or e-theses. However, some institutions are beginning to use them to manage research data in a variety of disciplines, including the physical sciences, the social sciences, and the arts and humanities, in part as a result of various programmes funded by the JISC.
Repositories are changing not only in the type of content that they hold, but also in the ways they are used. A major motivation in setting up and populating digital repositories has been, and remains, to make the results of research available to a wider audience, by encouraging, or in some cases mandating, deposit and open access principles. Repository software is, however, becoming more sophisticated, allowing complex digital content to be stored in such a way that its internal structure and external context can be explicitly represented, managed and exposed.
Such systems allow us to move away from the model of a stand-alone repository, where objects are simply deposited for subsequent access and download. Instead, researchers are developing more sophisticated models in which a repository is an integrated component of a larger research infrastructure, incorporating advanced tools and workflows, and being used to model complex webs of information and to capture scholarly or scientific processes in their entirety, from raw data through to final publications. Repositories thus add value to the data-driven research lifecycle. Within the e-Science communities, much of the focus in data management has been on techniques for the efficient organisation of and access to large and distributed data sets, an issue that has been well addressed by various flavours of grid middleware. The particular challenge raised here, however, is not just the size of the data but its very nature, which can be highly diverse, complex, fuzzy and context-dependent, together with the highly interpretative character of research in many disciplines, such as the humanities.
Another issue to be addressed is the “silo” mentality. Even if data is held in formally managed digital repositories, these are often managed on an institutional basis, resulting in information that is widely dispersed and not easy for researchers to locate and access. Although the repository content is in principle accessible via the internet, it is often held at a “deep” level that is not amenable to traditional discovery techniques. If, as we expect, digital repositories take on a central and pivotal role in the research lifecycle, then there is a clear strategic need to develop methods and tools to enable collaborative research through the coordination and federation of such complex and dispersed resources.
The purpose of the Network will thus be to address a group of related issues where e-Science and Digital Repositories meet: the incorporation of e-Science methodologies and tools within digital repositories, the integration of repositories within wider e-infrastructures, the enhancement of digital repositories to manage complex scientific workflows, and the federation of repositories using advanced techniques such as grid and semantic technologies. We will address these issues within the broader context of data creation and curation in e-Science.
A particular feature of the proposed Network is that it will operate across disciplines, including the arts, humanities and social sciences as well as the sciences in the narrower sense, as the technologies addressed are trans-disciplinary. We will strive to encourage inter-disciplinary contacts and collaborations, and to facilitate the transfer of knowledge and expertise gained within one discipline to other fields. Moreover, given the range of potential applications of the technologies, we expect interest in the Network to extend beyond the academic world to non-academic institutions. Consequently, we will encourage participation from outside academia, including commercial companies, cultural heritage organisations (e.g. museums, art galleries, and historic libraries), the media, and public/government bodies.