High-throughput array-based screens are a popular experimental technique employed by molecular biologists in their endeavour to unravel highly complex gene interactions. Over the past decade a wealth of research has focused on the development of advanced statistical techniques with which to interrogate these large datasets. Simultaneously, a similar level of effort has been expended on the development of large-scale, open-access data repositories and accompanying standards for the reporting of experiment metadata. In disciplines such as mathematics, statistics, and computing these repositories have also become a rich source of data with which to develop and evaluate novel or improved methods of clustering and classification.
Despite these advances towards standardisation and accessibility, the detail of reported experiment metadata remains insufficient to allow confident re-analysis of the raw data. During the seminar I will discuss recent developments in the following two aspects of the reporting and publication process that promise to improve the quality, depth, and utility of experiment metadata:
1) the need for a greater appreciation of sources of bias in the experiment design and of the noise introduced during sample processing;
2) the need for more rigorous reporting of the statistical treatment of data in 'Methods' sections of journal articles.