PhD in Computer Science, Utah State University (2009)
BS in Computer Science, Xidian University (1999)
My research interests lie broadly in bioinformatics and computational biology. My primary goal is to develop and apply computationally intensive techniques, such as data mining and machine learning algorithms, to address challenges in understanding biological systems. This work draws on theories and techniques from computer science, statistics, biochemistry, and biophysics. My general workflow is to identify and study biological problems, construct computational models, and develop and apply efficient and effective machine learning techniques to systematically analyze and understand biological processes. My recent research focuses on predicting protein functions and functional sites, identifying deleterious SNPs and indels, and drug repositioning.
Computational Drug Repositioning, industry grant offered by GlaxoSmithKline LLC, $99,723, 8/1/2012 - 7/31/2013, Principal Investigator.
(*indicates student co-author)
Journal Publications
14. Yan, X., Gurtler, J.B., Fratamico, P.M., Hu, J., Juneja, V.K. Phylogenetic Analysis of Bacterial Toxin MazF Protein among Probiotic Strains and Food-borne Pathogens and Potential Implications of Engineered Probiotics Intervention in Food. Cell & Bioscience, 2012, 2:39.
13. Sim, N.L., Kumar, P., Hu, J., Henikoff, S., Schneider, G., and Ng, P. SIFT web server: predicting effects of amino acid substitutions on proteins. Nucleic Acids Research, 2012 Jun 11. [Epub ahead of print] (impact factor: 8.026). (PMID: 22689647)
12. Hu, J., and Ng, P. Predicting the Effect of Frameshifting indels. Genome Biology, 2012, 13:R9 (impact factor: 9.04). (PMID: 22322200) [Highly Accessed]
11. Hu, J., and Yan, X. BS-KNN: An Effective Algorithm for Predicting Protein Subchloroplast Localization. Evolutionary Bioinformatics 2012:8 79-87. (doi: 10.4137/EBO.S8681) (impact factor: 2.684)
10. Yan, X., Gurtler, J., Fratamico, P., Hu, J., Gunther, N. W. IV., Juneja, V., and Huang, L. Comprehensive Approaches for Molecular Biomarker Discovery for the Detection and Identification of Cronobacter spp. (Enterobacter sakazakii), Salmonella and Other Foodborne Pathogens. Applied and Environmental Microbiology, Vol 77, No. 5, pp 1833-1843, March 2011. (PMID: 21239552) (Impact factor: 3.778)
9. Hu, J. and Yan, C. A Comparative Analysis of Protein Interfaces. Protein Pept. Lett., vol 17, issue 11, 1450-1458, 2010. (Impact factor: 1.755)
8. Hu, J. and Yan, C. A tool for calculating binding-site residues on proteins from PDB structures. BMC Structural Biology, 2009, 9:52.3. (Impact factor: 2.79)
7. Hu, J. and Yan, C. A Method for Discovering Transmembrane Beta-barrel Proteins in Gram-negative Bacterial Proteomes. Computational Biology and Chemistry, 2008 32:298-301. (Impact factor: 1.37)
6. Hu, J. and Yan, C. Identification of Deleterious Non-synonymous Single Nucleotide Polymorphisms Using Only Sequence-derived Information, BMC Bioinformatics, 2008, 9:297. (Impact factor: 3.43)
5. Hu, J. and Yan, C. Protein Subcellular Localisation Prediction with Improved Performance. International Journal of Functional Informatics and Personalised Medicine. 2008, 1 (3), 321-328.
4. Hu, J., and Yan, C. HMM_RA: An Improved Method for Alpha-helical Transmembrane Protein Topology Prediction. Bioinformatics and Biology Insights, 2008 2: 67-74.
3. Yan, C., Hu, J., and Wang, Y. Discrimination of Outer Membrane Proteins Using a K-nearest Neighbor Method. Amino Acids, 2008 35(1):65-73. (Impact factor: 3.88)
2. Yan, C., Hu, J., and Wang, Y. Discrimination of Outer Membrane Proteins with Improved Performance. BMC Bioinformatics, 2008, 9:47. (Impact factor: 3.43)
1. Yan, C. and Hu, J. An Exploration to the Combining of Solvent Accessibility With Amino Acid Sequence in the Identification of Helix-Turn-Helix motifs, WSEAS Transactions on Biology and Biomedicine, 2006, 6(3): 477-484.
Conference Publications
11. Hu, J., Forcier, A.* and Yan, C. A Method For Predicting ATP-Binding Pockets Based On Amino Acid Microenvironment. Accepted by the 5th International Conference on Bioinformatics and Computational Biology (BICoB).
10. Hu, J. and Yan, C. Predicting DNA-binding Sites by Exploring the Distribution of Atom Groups Around the Surface. Accepted by the 2011 International Conference on Bioinformatics & Computational Biology (BIOCOMP' 11) (Acceptance rate: 21%)
9. Hu, J., Schilder, M.* and Yan, X. A Neighbor-Weighted K-Nearest Neighbor Method For Predicting Protein Subnuclear Localizations (poster paper). IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences (ICCABS'11), pp. 247, Feb 3-. 2011, Orlando, FL, USA. (ISBN: 978-1-61284-851-8) (doi: 10.1109/ICCABS.2011.5729900) (Acceptance rate: 42%)
8. Hu, J. Identification of Transmembrane β-Barrel Proteins Using a K-Nearest Neighbor Method Based On Weighted Manhattan Distance. In Proceedings of the 2010 International Conference on Bioinformatics & Computational Biology (BIOCOMP' 10), pp 29-34, Las Vegas, NV, July 12-15, 2010. (Acceptance rate: 27%)
7. Hu, J. Predicting Subcellular Localization of Gram-negative Proteins with Improved Performance. In Proceedings of the 2nd International Conference on Bioinformatics and Computational Biology (BICoB), Honolulu, Hawaii, USA, March 24-26, 2010. (Acceptance rate: 45%)
6. Hu, J. and Yan, C. Mining Sequence Features for DNA-binding Site Prediction. In Proceedings of IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2008.
5. Hu, J. and Yan, C. Predicting Protein Subcellular Localizations Using Weighted Euclidian Distance. In Proceedings of IEEE 7th International Symposium on BioInformatics and BioEngineering, 2007, 1370-1373.
4. Yan, C. and Hu, J. Identification of Helix-Turn-Helix Motifs From Amino Acid Sequence. In Proceedings of IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2006, 1-7.
3. Yan, C. and Hu, J. A Hidden Markov Model for the Identification of Helix-Turn-Helix Motifs. In Proceedings of WSEAS International Conference on Cellular and Molecular Biology-Biophysics and Bioengineering, 2006, 14-19.
2. de Garis, H., Liu, R., Huang, D. and Hu, J. Artificial Brains: An Inexpensive Method for Accelerating the Evolution of Neural Network Modules for Building Artificial Brains. In Proceedings of the AGI Workshop, 2006, 144-158.
1. Flann, N.S., Hu, J., Bansal, M., Patel, V. and Podgorski, G. Biological Development of Cell Patterns: Characterizing the Space of Cell Chemistry Genetic Regulatory Networks. In Proceedings of the Eighth European Conference on Artificial Life, 2005, 57-66.
1. Poster presentation, "Predicting the effects of 3n indels", The Pacific Symposium on Biocomputing (PSB) 2013.
2. Invited Talk: "Predicting the effects of frameshifting indels using a machine learning approach", Dickinson College, 2012.
3. Poster presentation, "SIFT Indel: an on-line tool for predicting the effects of frameshifting indels", 2012 Pacific Symposium on Biocomputing (PSB 2012).
4. Poster presentation, "BS-KNN: A K-Nearest Neighbor Method for Predicting Protein Subchloroplast Locations Based on Bit-Score Weighted Euclidian Distance", ACM-BCB 2011
5. Oral Presentation: "Identification of Transmembrane β-Barrel Proteins Using a K-Nearest Neighbor Method Based On Weighted Manhattan Distance", 2010 International Conference on Bioinformatics & Computational Biology (BIOCOMP' 10), Las Vegas, NV, July 12-15, 2010.
6. Oral Presentation: "Predicting Subcellular Localization of Gram-negative Proteins with Improved Performance", 2nd International Conference on Bioinformatics and Computational Biology (BICoB), Honolulu, Hawaii, USA, March 24-26, 2010.
7. Poster Presentation: "A Useful Tool for Calculating Binding-site Residues on Proteins from PDB Structures", 7th Annual Rocky Mountain Bioinformatics Conference, Aspen, CO, December 10-12, 2009.
Andrew Forcier '12
Marc Schilder '11
August, 2012 - August, 2013 Sabbatical Leave.
CPS112: Data Structures
MAT273 A & B: Discrete Mathematics
2012 (Spring)
CPS112: Data Structures
MAT273 A & B: Discrete Mathematics
2011 (Fall)
CPS111 A: Introduction to Computational Thinking Using Python
2011 (Spring)
CPS112: Data Structures and Algorithms, Lab
CPS373: Bioinformatics I
2010 (Fall)
CPS170B: Introduction to Computational Thinking Using Python
CPS261: Data Structures and Algorithms II
2010 (Spring)
CPS 270: Data Structures and Algorithms I
CPS 373: Introduction to Bioinformatics I
2009 (Fall)
CPS 170: Introduction to Computational Thinking; Lab