Genetics Workshop

Structural and Functional Annotation Issues with Large Eukaryotic Genomes

Robin Buell
Dept. of Plant Biology, Michigan State University

A critical component of genomics is gene annotation. Structural annotation involves identifying genes, both protein-coding genes and non-coding RNAs, and relies on a combination of computational predictions and experimental evidence. Structural annotation is complicated by the limited sensitivity and specificity of ab initio gene finders, a lack of deep transcript support, and error propagation in automated annotation pipelines. To address these issues in annotating the rice genome, we have developed an iterative set of quality assessment steps and semi-automated the pipeline, thereby generating high-quality annotation of a 370 Mb genome that encodes ~42,000 protein-coding genes. Functional annotation of genomes depends heavily on transitive annotation and is therefore highly susceptible to the propagation of incorrect annotations. To address this in the rice genome, we have developed a functional annotation pipeline that reduces transitive annotation errors by assigning a higher-level annotation that, while less specific, is more accurate.

Michigan State University | Department of Statistics and Probability | Statistical Genetics Lab