Phase 5 of the Cancer Gene Index Project Begins


By Allison Proffitt

June 25, 2008 | Sophic has announced $1.3 million of funding from the National Cancer Institute to complete the Cancer Gene Index Project over the next 12 months. Sophic started the project in June 2004 with the goal of mining 8.8 million Medline abstracts to identify suspected cancer genes and manually annotate gene-disease and gene-compound relationships. So far, 4,658 cancer genes have been made publicly available on the NCI website.

“We’ve completed four years of work and this phase is the completion phase, which means we will have completely analyzed and annotated the 6,610 identified cancer genes with manual annotations for role codes and evidence codes,” Patrick Blake, Sophic’s CEO, told Bio-IT World. “I just attended the caBIG conference and they are identifying this dataset as the backbone for cancer research across the cancer community. We’re proud and we’re thrilled to be able to offer this asset to people who are fighting this terrible disease.”

The fifth phase of the project, announced on Monday, will bring the total number of cancer-related genes indexed to 6,610. Sophic has completed the work in conjunction with NCI and Biomax Informatics AG of Munich, Germany, in what Blake calls “a true collaboration” using Biomax’s BioLT literature mining tool. “We developed a ‘factory assembly line’ methodology that allows the automated text mining results to be fed into the scientific team who curate and annotate the information in an efficient, quality-controlled, work-flow process,” said Klaus Heumann, CEO of Biomax. The phase-based strategy has been designed so that “nothing is missed” and all cancer genes, cancer types, and compounds and treatments related to cancer genes are examined.
