Finished student projects

The student projects listed below are all finished.

Privacy Protection for a Brain Imaging Databank

Student: 
Jyothsna Vivekanand Shenoy

In recent years there has been an increasing trend towards releasing micro-data to the public. This can be very important for research, but in some cases (e.g. medical data) these releases are limited by privacy protection issues. Anonymisation is a limited solution that does not fully protect individuals. Even when all personal identifiers have been removed, it might be possible to identify an individual from an anonymous record by using quasi-identifiers and linking the data with some other external data source (see references).
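
As an illustration of the quasi-identifier problem described above, here is a minimal sketch of a k-anonymity check in the sense of the Sweeney reference: it reports the size of the smallest group of records that share the same quasi-identifier values. The records, the field names (age_band, postcode_area, sex, diagnosis) and the function name k_anonymity are made up for this example and are not taken from the project or from any databank.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    identical values for every quasi-identifier."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Hypothetical records with direct identifiers (name, ID number) already removed.
records = [
    {"age_band": "30-39", "postcode_area": "EH8", "sex": "F", "diagnosis": "A"},
    {"age_band": "30-39", "postcode_area": "EH8", "sex": "F", "diagnosis": "B"},
    {"age_band": "40-49", "postcode_area": "EH9", "sex": "M", "diagnosis": "A"},
]

k = k_anonymity(records, ["age_band", "postcode_area", "sex"])
print(k)
```

In this made-up data the third record is the only one with its combination of quasi-identifier values, so k is 1: anyone holding an external source that lists the same three attributes alongside names could link that record back to a single person.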

Project status: 
Finished
Degree level: 
MSc
Background: 
Knowledge of databases. Programming skills.
Supervisors @ NeSC: 
Student project type: 
References: 
Fung, Benjamin C. M., Wang, Ke, Chen, Rui and Yu, Philip S. "Privacy-preserving data publishing: A survey of recent developments". ACM Computing Surveys, Vol. 42, No. 4, Article 14.
B.-C. Chen, D. Kifer, K. LeFevre and A. Machanavajjhala. "Privacy-Preserving Data Publishing". Foundations and Trends in Databases, Vol. 2, Nos. 1–2 (2009), 1–167.
L. Sweeney. "k-Anonymity: a model for protecting privacy". International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5), pages 557–570, 2002.
Samarati P (2001). "Protecting respondents' identities in microdata release". IEEE Transactions on Knowledge and Data Engineering, 13(6):1010–1027.

Investigating the Rule Construction Mechanism in Ant-Miner

Student: 
Hariharan Anantharaman

This project will appeal to you if you are interested in Learning from Data and Nature-Inspired Computation.

Project status: 
Finished
Degree level: 
MSc
Supervisors @ NeSC: 
Student project type: 

Investigating Array Databases for Managing Climate Data

Student: 
Jian Qiang

This is a challenging project and will appeal to students keen to make a contribution in the areas of scientific databases and geoinformatics.

Project status: 
Finished
Degree level: 
MSc
Subject areas: 
Databases
Software Engineering
Student project type: 

Scientific applications: exploiting the data bonanza. The microscopy case.

The aim of the project is to perform some exploratory work on how to deal with the problem of I/O-bound processing, by implementing technology-specific components in a provided system. The goal is to distribute data and processing so that each CPU processes data locally, minimising data transfer. The assumption is that I/O is the major bottleneck in processing, and that computation could be done with less powerful (greener and cheaper) CPUs, rather than with a powerful CPU that wastes energy waiting for data. Different technologies for storing and processing the data can be explored.

Project status: 
Finished
Degree level: 
MSc
Supervisors @ NeSC: 
Subject areas: 
e-Science
Databases
Distributed Systems
Software Engineering
Student project type: 

Runoff prediction from a Hydrologic Spatio-Temporal Database

Student: 
Charalampos Sfyrakis
Grade: 
first

Present-day instrumentation networks in rivers provide huge quantities of multi-dimensional data. Although there are numerous machine learning tools that can extract trends, find patterns and predict future states given some data, it is crucial to optimise these techniques according to the semantic content of the data. Hydrology is a data-intensive science, which requires efficient mining of trajectories of measurements taken at different time points and positions.

Project status: 
Finished
Degree level: 
MSc
Background: 
data mining
Supervisors @ NeSC: 
Subject areas: 
e-Science
Machine Learning/Neural Networks/Connectionist Computing
Projects: 
Student project type: 

Connecting Rapid with the jclouds multi-cloud framework

Principal goal: to extend Rapid, a tool for developing web portals for scientific computing, to operate with jclouds.

This project is part of the Google Summer of Code 2010 (see http://www.omii.ac.uk/wiki/RapidJclouds)

Project status: 
Finished
Degree level: 
NR
Background: 
Java, XML
Supervisors @ NeSC: 
Subject areas: 
e-Science
Projects: 
Student project type: 

Extension of Rapid to the Hadoop Framework

Student: 
Harika Yasa

Principal goal: to extend Rapid, a tool for developing web portals for scientific computing, to operate with Apache Hadoop.

This project is part of the Google Summer of Code 2010 (see http://www.omii.ac.uk/wiki/RapidHadoop)

Project status: 
Finished
Degree level: 
NR
Background: 
Java, XML
Supervisors @ NeSC: 
Subject areas: 
e-Science
Projects: 
Student project type: 

Accelerating Genome-Wide Association Studies with Graphics Processors

Student: 
Jeff Poznanovic
Grade: 
first

Principal goal: to substantially improve the performance of the data-intensive analysis for genome-wide association studies (GWAS) by using graphics processing units (GPUs).

Project status: 
Finished
Degree level: 
MSc
Supervisors @ NeSC: 
Other supervisors: 
Dave Liewald, Centre for Cognitive Ageing and Cognitive Epidemiology
Gail Davies, Centre for Cognitive Ageing and Cognitive Epidemiology
Subject areas: 
e-Science
Bioinformatics
Computer Architecture
Distributed Systems
Parallel Programming
References: 
NIH National Human Genome Research Institute, "Genome-wide association studies," http://www.genome.gov/20019523
PLINK, http://pngu.mgh.harvard.edu/~purcell/plink
CUDA, http://www.nvidia.com/object/cuda_home.html
OpenCL, http://www.khronos.org/opencl/
Student project type: 

Parameter fitting of cosmological models using billions of galaxies

Student: 
Martha Axiak
Grade: 
first

Principal goal: to develop, test and make available to the cosmology community a parameter estimation method for models that explain our dark Universe.

Project status: 
Finished
Degree level: 
MSc
Background: 
Evolutionary computation, optimisation, machine learning and/or statistics are all desirable.
Supervisors @ NeSC: 
Other supervisors: 
Tom Kitching, Institute for Astronomy, Edinburgh; tdk@roe.ac.uk, tom.kitching@googlemail.com
Subject areas: 
Genetic Algorithms/Evolutionary Computing
Machine Learning/Neural Networks/Connectionist Computing
WWW Tools and Programming
References: 
There is a good review of statistical methods used in cosmology, with some further references suggested, here: http://xxx.lanl.gov/abs/0911.3105 (chapter 13 discusses the Monte Carlo methods we use).
The standard tool for cosmological parameter estimation is cosmomc, which is here: http://cosmologist.info/cosmomc/
The original paper for this is here: http://arxiv.org/abs/astro-ph/0205436 and the first application is here: http://arxiv.org/abs/astro-ph/0302306
A slightly more advanced nested sampling method is called multinest, which is described here: http://xxx.lanl.gov/abs/0809.3437
A general discussion of the current status of cosmology is here: http://xxx.lanl.gov/abs/astro-ph/0610906 (be warned that it contains some technical details and a lot of acronyms).
Student project type: 

Data mining to identify small molecules with bioactivity

Student: 
Gideon Jansen Van Vuuren
Grade: 
first

Principal goal: to apply machine learning to identify small molecules that are likely candidates to have relevant bioactivity for follow-up wet-lab experiments.

Project status: 
Finished
Degree level: 
MSc
Background: 
Machine learning essential, biology/bioinformatics desirable.
Supervisors @ NeSC: 
Other supervisors: 
Jan Wildenhain, Tyers Lab, School of Biological Sciences (http://tyerslab.bio.ed.ac.uk/lisa/indPage.php?id=jwil315)
Michaela Spitzer, Tyers Lab, School of Biological Sciences
Subject areas: 
Bioinformatics
Machine Learning/Neural Networks/Connectionist Computing
Student project type: 
