Hi everyone,
my lab is looking for a master's student willing to do their thesis with us, on a project related to the data management of results obtained from the 1000 Genomes Project.
For more details I am posting the original project proposal in English below, and in any case I am available to answer questions. We offer a scholarship of 900 euros per month for one year, an amount that is enough to live decently here in Barcelona, though not in luxury :-).
Our group's web page is: http://www.upf.edu/bioevo/
“Data Management for Human 1000 Genomes Project”
Group Leader: Jaume Bertranpetit,
Unitat de Biologia Evolutiva: http://www.upf.edu/bioevo/index.html
Pompeu Fabra University & Institut de Biologia Evolutiva (CSIC-UPF)
For the abstract, references, and expected skills, see below. Funding is available.
Preferred start: April/May/June 2010, running until the end of 2010; extension possible.
Abstract
Our laboratory is working on human evolution and genomics. Most projects use a combination of our
own wet-lab produced data and bioinformatic analysis in order to address scientific questions centered
on natural selection and adaptation within the human species or other primates.
Recently, we have started planning how to use and study the data from the 1000 Genomes
Project (www.1000genomes.org), which aims to release the full genomic sequences of 2,500 human
individuals within a few years and is already releasing pilot data on polymorphisms and sequences for a
smaller set of individuals.
Our group is interested in using the data from the 1000 Genomes Project, but we lack the time to properly
investigate certain technical details related to it, such as the file formats used, the alternatives for data
management, and the tools being developed for it. The role of the student will be to investigate these
aspects and test different solutions for managing the data and the results of different analyses,
such as relational, document-based, or other kinds of databases, binary formats like HDF5, key-value stores like Tokyo Cabinet,
flat files, etc. Moreover, the student will help apply different tests for positive selection
to the data, under our supervision. Flexible funding is available depending on the time commitment and
skills of the student.
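To give an idea of the kind of comparison involved, here is a minimal sketch in Python that answers the same region query once with a linear scan over a flat file and once with an indexed SQLite table. Everything in it is an assumption made for illustration: the column layout, the SNP identifiers, the positions, and the function names are invented and do not correspond to any actual 1000 Genomes file format or to code used in the lab.

```python
import csv
import io
import sqlite3

# Hypothetical tab-separated flat file of polymorphisms (made-up records,
# not real 1000 Genomes data or its actual file format).
FLAT_FILE = """chrom\tpos\tsnp_id\tref\talt
1\t10583\tsnp_a\tG\tA
1\t13302\tsnp_b\tC\tT
2\t10611\tsnp_c\tC\tG
"""


def read_flat(text):
    """Parse the flat file into a list of dicts, converting pos to int."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    for row in rows:
        row["pos"] = int(row["pos"])
    return rows


def scan_flat(rows, chrom, start, end):
    """Flat-file style query: a linear scan over every record."""
    return [r["snp_id"] for r in rows
            if r["chrom"] == chrom and start <= r["pos"] <= end]


def load_sqlite(rows):
    """Relational alternative: load the same records into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snp (chrom TEXT, pos INTEGER, snp_id TEXT, ref TEXT, alt TEXT)")
    conn.executemany(
        "INSERT INTO snp VALUES (:chrom, :pos, :snp_id, :ref, :alt)", rows)
    # An index on (chrom, pos) lets region queries avoid a full table scan.
    conn.execute("CREATE INDEX snp_region ON snp (chrom, pos)")
    return conn


def query_sqlite(conn, chrom, start, end):
    """Same region query as scan_flat, expressed in SQL."""
    cur = conn.execute(
        "SELECT snp_id FROM snp WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY pos",
        (chrom, start, end))
    return [r[0] for r in cur]


rows = read_flat(FLAT_FILE)
conn = load_sqlite(rows)
print(scan_flat(rows, "1", 10000, 14000))
print(query_sqlite(conn, "1", 10000, 14000))
```

Both calls return the same identifiers on this toy input. The interesting part of the project would be how such alternatives (and others like HDF5 or document-based stores) behave in load time, query time, and disk usage on data of realistic size.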
This project will give the student a background in current topics in human genomics as well as
computational skills that are likely to be very useful in the following years. Moreover, our
laboratory has a positive atmosphere where people from diverse disciplinary and international
backgrounds collaborate, and the project will be carried out under a philosophy inspired by Extreme
Programming, which will teach the student the value of teamwork and software design.
References:
Grossman et al. (2010). A Composite of Multiple Signals Distinguishes Causal Variants in Regions of
Positive Selection. Science. doi: 10.1126/science.1183863 (overview of the positive selection tests that will
be applied to the 1000 Genomes data)
Expected Skills:
- intermediate proficiency with the Python programming language
- knowledge of SQL or any kind of database (relational or not)
- Unix skills and knowledge of tools for handling flat files (grep, gawk)
- basics of software engineering will be a plus (e.g. if you at least know what a use case is)
References and recommended readings:
- http://www.1000genomes.org/ (1000 genomes home page)
- Software Carpentry for bioinformatics (http://software-carpentry.org/)
- http://extremeprogramming.org/rules
- http://www.bioinformaticszen.com/software/dealing-with-big-data-in-bioinformatics/
- HDF5: http://www.hdfgroup.org/HDF5, http://h5py.alfven.org/, http://www.pytables.org/
- Tokyo Cabinet http://1978th.net/tokyocabinet/
- Apache CouchDB: http://couchdb.apache.org/
You can contact me via PM or in this same thread. You can write to me in Italian or in English, but keep in mind that my colleagues do not speak Italian.
Don't miss the chance to work with an authentic molecularlab moderator :-)