Data Science in Health Sciences

Speakers and abstracts for the data science in health sciences session

These talks focus on data science applications in biology and medicine


Nara Chon

Speaker Affiliation: Professional Research Assistant, Department of Chemistry, CU Denver

Title: Using Protein Structure Prediction with Molecular Phylogenetic Analysis to Understand Membrane Interaction in Synaptotagmin-like Proteins

Abstract: Protein structure prediction is emerging as a core technology for understanding biomolecules and their interactions. Here, we have combined structure prediction with molecular phylogenetic analysis to provide information on the evolution of electrostatic membrane binding in synaptotagmin-like proteins (Slps). Slp family proteins play key roles in the membrane trafficking of large dense-core secretory vesicles in vertebrates. Our previous experimental and computational study found that the C2A domain of Slp-4 (also called granuphilin) binds with high affinity to phosphatidylinositol-(4,5)-bisphosphate (PIP2) and phosphatidylserine (PS) found in the cytoplasmic leaflet of the plasma membrane. A conserved site centered on three lysine residues binds selectively to PIP2, while ~10 other lysine and arginine residues on the surrounding protein surface provide a positive electrostatic potential that amplifies the membrane interaction via nonspecific electrostatic contacts. Because the polybasic surface contributes greatly to Slp-4 C2A domain membrane binding, we hypothesized that the net charge on the protein surface might be conserved across all C2A domains in the Slp family. To test this hypothesis, the known C2A sequences of Slp-2 and Slp-4 among vertebrates were sorted by class (from mammalia to pisces) using molecular phylogenetic analysis. Consensus sequences for each class were then identified and used to generate homology structures, from which Poisson–Boltzmann electrostatic potentials were calculated. The results demonstrate that the charge on the membrane-binding surface is highly conserved throughout the evolution of individual Slp proteins and across the mammalian Slp C2A domain family overall. Computational analysis driven by molecular phylogenetics in this way may help to describe the evolution of electrostatic interactions between proteins and membranes, which are critical for their function.
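
The idea of comparing net charge across consensus sequences can be illustrated with a minimal sketch. This is not the authors' pipeline, which used homology structures and Poisson–Boltzmann electrostatics. It is only a crude sequence-level proxy that counts lysine and arginine as +1 and aspartate and glutamate as -1, ignoring histidine, termini, and surface exposure. The function name `net_charge` and the toy sequences are hypothetical and are not real Slp C2A data.

```python
# Crude net-charge estimate at neutral pH (an assumption, not the
# Poisson-Boltzmann calculation described in the abstract):
# Lys/Arg count as +1, Asp/Glu count as -1, everything else as 0.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def net_charge(sequence: str) -> int:
    """Return (#K + #R) - (#D + #E) for a one-letter amino acid sequence."""
    seq = sequence.upper()
    positives = sum(residue in POSITIVE for residue in seq)
    negatives = sum(residue in NEGATIVE for residue in seq)
    return positives - negatives


# Hypothetical toy "consensus" sequences, one per class; illustrative only.
toy_consensus = {
    "class_A": "MKKRLSDEKKAR",
    "class_B": "MKRRLTDEKRAK",
}

# Compare the estimated net charge across classes.
charges = {name: net_charge(seq) for name, seq in toy_consensus.items()}
for name, charge in charges.items():
    print(f"{name}: net charge {charge:+d}")
```

A conserved value across classes in such a comparison would be consistent with the hypothesis, although only a structure-based calculation captures which charged residues actually sit on the membrane-binding surface.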

Yonghua Zhuang

Speaker Affiliation: Assistant Professor, Department of Pediatrics-Endocrinology, CU Anschutz

Title: Deep Learning on Graphs for Multi-Omics Classification of COPD

Abstract: Network approaches have successfully been used to help reveal complex mechanisms of diseases including COPD. We previously incorporated prior protein-protein interaction knowledge and high-throughput omics data to construct COPD disease-specific networks using the AhGlasso method. However, methods for incorporating protein-protein interaction (PPI) information with single or multiple omics data for disease prediction remain limited, even with recent advances. Recently, deep learning, including convolutional graph neural networks (ConvGNN), has shown great potential for disease classification using transcriptomics data and known PPI networks from existing databases. In this study, we first reconstructed the COPD-associated PPI network through the AhGlasso algorithm based on one independent transcriptomics dataset including COPD cases and controls. Then we extended the existing ConvGNN to incorporate COPD-associated PPI, proteomics, and transcriptomics data and developed a prediction model for COPD classification. This approach improves accuracy over several conventional classification methods and neural networks that do not incorporate network information. We also demonstrated that the updated COPD-associated network developed using AhGlasso further improves prediction accuracy. Although deep neural networks often achieve superior statistical power in classification compared to other methods, it can be very difficult to explain how a model, especially a graph neural network, makes decisions on the given features, and to identify the features that contribute the most to prediction, both overall and for individual samples. To better explain how the spectral-based graph neural network model works, we applied one unified explainable machine learning method, SHapley Additive exPlanations (SHAP). SHAP builds surrogate models for black-box machine learning models and makes them interpretable through both local and global explanations. We identified the top important genes/proteins and important subnetworks in the ConvGNN model for COPD prediction through SHAP value analysis. Gene Ontology (GO) enrichment analysis on the top 30 important genes/proteins further identified six significantly enriched molecular function pathways for COPD prediction.
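
The quantity that SHAP estimates, the Shapley value of each feature, can be sketched on a toy example. The code below is not the SHAP library and not the ConvGNN model from the abstract. It computes exact Shapley values by enumerating every feature subset, which is only feasible for a handful of features, and it treats "absent" features as taking a baseline value (an assumption). The names `shapley_values` and `toy_model` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline.

    Features outside a coalition are replaced by their baseline value
    (an illustrative assumption about how "missing" is modeled).
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their real values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical model: one additive feature plus an interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the two interacting features share their joint contribution equally. Ranking features by the magnitude of such values is the kind of analysis the abstract describes for identifying important genes and proteins.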

Kathleen R. Mullen

Speaker Affiliation: Postdoctoral Fellow, Department of Biomedical Informatics, CU Anschutz / Equine Internal Medicine Specialist, Littleton Equine Medical Center

Title: Corralling Veterinarians and Informaticists to Improve Health Outcomes Across Species

Abstract: Harmonizing human and veterinary clinical and medical research data has the potential to improve health outcomes across species. While many advancements have been made with human phenotyping and precision medicine, the majority of available veterinary data on domestic and exotic animals is not yet standardized. Performing analytics with veterinary medical data would enable us to translate our knowledge about animal health into knowledge about human health and vice versa. One goal is to harmonize cross-species electronic health records to allow for household-wide (human and companion animal) analyses. For example, animals can be bio-sentinels for environmental exposures associated with cancer and other health risks. Collaborative efforts are underway to create interoperability in human and veterinary health data across Clinical and Translational Science Award One Health Alliance institutions in order to advance innovations in human and animal health.

Evan Shapiro

Speaker Affiliation: PhD Student, Department of Mathematical and Statistical Sciences, CU Denver

Title: Feature space in biological data science

Abstract: Forthcoming

Eric Young

Speaker Affiliation: Senior Professional Research Assistant, Department of Mathematical and Statistical Sciences, CU Denver

Title: Leveraging multiple PRS and phenotype pairs to explore latent environmental factors in disease risk

Abstract: Forthcoming
