Franklin & Marshall College

Coloring the Genome

Christian Castaneda '10

Mining the Vast Data Produced by SNP Microarrays

Advisors: Dr. Erik Puffenberger, F&M Department of Biology
Dr. Janardhan Iyengar, Department of Computer Science

The goal of my research project has been to develop single nucleotide polymorphism (SNP) analysis algorithms and software based specifically on the techniques employed by Dr. Puffenberger at The Clinic for Special Children in Strasburg, PA. Currently, the clinic uses Microsoft Excel to generate its Areas of Homozygosity Graphs, which help identify genes that may contain disease-causing mutations. Unfortunately, Excel is extremely inefficient for this task, because large data sets are difficult to input and manipulate. In addition, the clinic will soon be upgrading to new Affymetrix 6.0 Arrays, which have approximately 1.8 million genetic markers, as opposed to the current 10,000. Excel is incapable of handling this increase in the number of raw data points. During the fall 2009 semester, I developed new SNP processing algorithms in the Python programming language and used them to build an application that combines efficient data analysis and graph production with an easy-to-use graphical user interface. Now that the basic functionality is complete, I am continuing work on analyses of similarity.
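To give a rough sense of what an area-of-homozygosity scan involves, here is a minimal Python sketch. It is not the application described above or the clinic's actual method. The function name `homozygous_runs`, the genotype labels ("AA", "AB", "BB", "NoCall"), and the `min_snps` threshold are all assumptions made for illustration. The sketch walks the SNP calls for one chromosome in position order and reports each stretch of consecutive homozygous calls, treating any other call as the end of a stretch.

```python
# Hypothetical genotype labels; real array exports may use a different encoding.
HOMOZYGOUS = {"AA", "BB"}


def homozygous_runs(calls, min_snps=3):
    """Return (start_pos, end_pos, snp_count) for each homozygous stretch.

    calls: list of (position, genotype) pairs for one chromosome,
           already sorted by position.
    min_snps: shortest stretch worth reporting (an assumed threshold).
    """
    runs = []
    start = end = None
    count = 0
    for position, genotype in calls:
        if genotype in HOMOZYGOUS:
            if count == 0:
                start = position  # a new stretch begins here
            end = position
            count += 1
        else:
            # A heterozygous call or no-call ends the current stretch.
            if count >= min_snps:
                runs.append((start, end, count))
            count = 0
    # Report a stretch that reaches the end of the chromosome.
    if count >= min_snps:
        runs.append((start, end, count))
    return runs


example_calls = [
    (100, "AA"), (200, "BB"), (300, "AA"), (400, "AB"),
    (500, "BB"), (600, "AA"), (700, "AA"), (800, "BB"), (900, "AA"),
]
print(homozygous_runs(example_calls))
```

Because the scan makes a single pass over the calls, its running time grows linearly with the number of markers, which is the kind of scaling a move from 10,000 to roughly 1.8 million markers would call for.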
