ULYSSES - Project Outline


The elucidation of gene function on a whole-genome scale is the central objective of functional genomics.  Our focus is on human genes, both to understand the biology of our species and to identify candidate genes as potential drug targets.  In practice, complex experiments are difficult to carry out in human cells, for both technical and ethical reasons.  Diverse model organisms have played a central role in shedding light on gene function by circumventing many of the obstacles associated with research in human systems.  The vital assumption is that genes derived from a single ancestral gene through speciation (orthologs) typically occupy the same functional niche in different species.  Most of the data from experiments in organisms as diverse as yeast, worm, and fly have been collected in large data sources and are available to the public.  Here we propose an integrated bioinformatics platform that suggests potential functions for human genes based on data observed for orthologous genes in model organisms.

As a first step, orthologs of each human gene must be identified in a defined set of organisms by sequence comparison.  The National Center for Biotechnology Information (NCBI) developed a system, HomoloGene, for the automated detection of orthologs in 12 completely sequenced eukaryotes.  This evolutionary classification of genes based on homologous relationships is the core of our integrated annotation system.
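The lookup described above can be sketched as follows. This is a minimal illustration and not the project's implementation: the tab-delimited column layout (group ID, NCBI taxonomy ID, gene ID, gene symbol), the toy rows, the gene symbols, and the function names are all assumptions made for the example. Only the taxonomy IDs are real (9606 human, 7227 D. melanogaster, 6239 C. elegans, 4932 S. cerevisiae).

```python
import csv
import io
from collections import defaultdict

# Toy rows in an assumed HomoloGene-like layout:
# group ID <tab> taxonomy ID <tab> gene ID <tab> gene symbol.
# Gene IDs and symbols are invented for illustration.
TOY_ROWS = (
    "1\t9606\t101\tHSA_GENE_A\n"
    "1\t7227\t201\tDME_GENE_A\n"
    "1\t6239\t301\tCEL_GENE_A\n"
    "2\t9606\t102\tHSA_GENE_B\n"
    "2\t4932\t402\tSCE_GENE_B\n"
)

HUMAN_TAXID = "9606"


def load_groups(handle):
    """Return {group ID: {taxonomy ID: [gene symbols]}}."""
    groups = defaultdict(dict)
    for group_id, taxid, _gene_id, symbol in csv.reader(handle, delimiter="\t"):
        groups[group_id].setdefault(taxid, []).append(symbol)
    return groups


def orthologs_of_human_gene(groups, human_symbol):
    """Return the non-human members of the group containing human_symbol."""
    for members in groups.values():
        if human_symbol in members.get(HUMAN_TAXID, []):
            return {
                taxid: symbols
                for taxid, symbols in members.items()
                if taxid != HUMAN_TAXID
            }
    return {}  # gene not found in any group


groups = load_groups(io.StringIO(TOY_ROWS))
print(orthologs_of_human_gene(groups, "HSA_GENE_A"))
```

Keying everything on the group ID rather than on individual genes means that any annotation attached to one member can later be looked up for every other member of the same group.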


 
Figure. Orthologous genes are defined across several species (planes, panel A). Interactions are identified within each species (lines, panel A), and associations are correlated (panel B).

The next stage is the identification of protein interaction data for genes from the core organisms (C. elegans, D. melanogaster, S. cerevisiae).  Here, we distinguish between direct (yeast two-hybrid) and indirect (complex purification) protein interactions.  Our aim is to attach labels to sets of orthologous genes so that integrating data across species increases confidence in results from high-throughput experiments.  Many of the datasets amenable to integration are held in individual databases and are accessible to the public.
Our goal was to identify major data repositories and to link their content in a central platform.  We coordinated this project with UBiC (UBC Bioinformatics Centre) to capitalize fully on the ongoing Integrated Database Project (Atlas, Shah et al., submitted), an effort to integrate multiple forms of biological, publication, and ontological data under one query space for data mining.  What sets our approach apart from other efforts, and makes it challenging, is the comparison of gene networks based on orthologous relationships between individual genes from different species, the inclusion of multiple techniques and datasets, and the statistical evaluation of the significance of individual interactions.
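The cross-species comparison outlined above can be illustrated with a small sketch. It maps each observed interaction onto a pair of ortholog groups and records which species and which evidence type (direct or indirect) support that pair. Everything here is hypothetical: the gene names, group IDs, interaction records, and function name are invented for the example, and the count of supporting species is only a simple stand-in for the statistical evaluation mentioned in the text, not a significance test.

```python
from collections import defaultdict

# Hypothetical gene -> ortholog group assignments.
GENE_TO_GROUP = {
    "sce_a": "G1", "sce_b": "G2",
    "cel_a": "G1", "cel_b": "G2",
    "dme_a": "G1", "dme_c": "G3",
}

# Hypothetical interaction records: (species, gene 1, gene 2, evidence type).
INTERACTIONS = [
    ("S. cerevisiae", "sce_a", "sce_b", "direct"),      # e.g. yeast two-hybrid
    ("C. elegans", "cel_b", "cel_a", "indirect"),       # e.g. complex purification
    ("D. melanogaster", "dme_a", "dme_c", "direct"),
]


def conserved_associations(interactions, gene_to_group):
    """Map interactions onto unordered ortholog-group pairs.

    Returns {frozenset of group IDs: {(species, evidence type), ...}}.
    Interactions involving a gene without a group assignment are skipped.
    """
    support = defaultdict(set)
    for species, gene1, gene2, evidence in interactions:
        if gene1 not in gene_to_group or gene2 not in gene_to_group:
            continue
        pair = frozenset((gene_to_group[gene1], gene_to_group[gene2]))
        support[pair].add((species, evidence))
    return support


support = conserved_associations(INTERACTIONS, GENE_TO_GROUP)
for pair, evidence in support.items():
    n_species = len({species for species, _ in evidence})
    print(sorted(pair), n_species, sorted(evidence))
```

Using an unordered pair of group IDs as the key makes the same association found in two species (here G1 and G2 in yeast and worm) accumulate under one entry, regardless of the order in which the two genes were reported.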


Project Members:

Wyeth W. Wasserman, CMMT (PI, TFs)
B.F. Francis Ouellette, UBiC (PI, Atlas)
Danielle Kemmer, CMMT, Karolinska Institute (Graduate student, project leader, data selection)
Sohrab P. Shah, UBiC (Chief, high-throughput bioinformatics, database design, data integration)
Jochen Brumm, CMMT (Graduate student, statistical advice, pathways)
Jonathan Lim, CMMT (Software developer, software for interfacing)
Raf Podowski, Karolinska Institute (Graduate student, text analysis and literature)
Yong Huang, UBiC (Database administrator, Atlas, database integration)
John Ling, UBiC (Software developer, Atlas, data retrieval)

Team e-mail: ilg@cmmt.ubc.ca