
Rolando Garcia-Milian's blog

Welcome Nur-Taz Rahman

August 1, 2019 - 10:20am by Rolando Garcia-Milian

Please welcome our newest team member, Dr. Nur-Taz Rahman, Simbonis Fellow in Bioinformatics, a position made possible through the generosity of the estate of Dr. Stanley Simbonis ’53, ’57 MD. In her new role, Nur will serve as an informationist in the established Bioinformatics Support Program at the Cushing/Whitney Medical Library. Nur is not new to Yale: she completed her PhD in the lab of Dr. Diane Krause, Yale Department of Cell Biology, where she worked extensively with bioinformatics for the study of stem cells. While working on her PhD, she developed and taught (in collaboration with the Medical Library, ITS, and the Center for Research Computing) two 2-hour sessions on RNA-seq data analysis, in addition to serving as a writing tutor, mentor, and teaching assistant for different courses. She has also participated in the prestigious National Institutes of Health Genomics Hackathon. As genomic medicine and the biomedical sciences become more data-intensive disciplines, her expertise in biomedical data sciences, combined with her passion for teaching and helping, will prove invaluable in supporting the work of Yale biomedical researchers.

Two New Bioinformatics Tools

July 1, 2019 - 1:46pm by Rolando Garcia-Milian

The Medical Library is providing free access for Yale affiliates to two new resources for drug discovery and research:

MetaDrug

MetaDrug contains curated information on the biological effects of small-molecule compounds. It is a systems pharmacology solution that combines pharmacogenomics and toxicogenomics with predictive capabilities. Every target in MetaDrug comes with protein interactions, so you can explore the biological pathways affected by your compounds and the network neighborhood of drug targets. OMICs data analysis capabilities provide an additional approach for working out a compound’s mechanisms of action, discovering drug efficacy biomarkers, and corroborating the hypotheses generated by classical structure-based methods. If you already have a MetaCore account, this module is already included. Otherwise, please request your MetaCore/MetaDrug account here.

Integrity

Integrity focuses exclusively on pharma and drug development intelligence, harmonizing and integrating essential biological, chemical, and pharmacological data from disparate sources into a single platform. Integrity provides easy access to pipeline data, granular target and MOA information, and manually curated data specific to drug development. Request an account by filling out this form.

(2 X 2) 2 High-throughput Data Analysis Workshops X 2 on NCBI Public Databases.

September 19, 2016 - 10:09am by Rolando Garcia-Milian

Cushing/Whitney Medical Library has organized four workshops: two on high-throughput data analysis tools and two on NCBI public databases. Although these are free and open to any Yale affiliate, registration is required due to limited seating.

Title: NGS Data Analysis in Partek® Software: Onsite Workshop

Description:

Morning Session (Overview; Hands-on: Analysis of RNA-Seq Data): 9:00AM – 12:00PM

Free access to Partek Flow is provided by the Yale Medical Library. This session will start with an overview of Partek software solutions, followed by a hands-on RNA-Seq data analysis in Partek Flow. Topics will include how to use statistical tests to identify differentially expressed transcripts and alternatively spliced genes among sample groups, how to generate a list of genes of interest, and how to identify high-level biological trends using Gene Ontology.

- Import data (fastq, bam, text format)
- Perform QA/QC (pre-alignment QA/QC, post-alignment QA/QC)
- Trim bases
- Alignment
- Gene/transcript abundance estimate (E/M)
- Differential expression detection (GSA, ANOVA)
- Filter gene list
- GO enrichment analysis
- Visualization: quality score distribution, base composition, PCA scatterplot, dotplot, volcano plot, hierarchical clustering, chromosome view

Afternoon Session (Open Lab; Q&A): 1:30PM – 4:00PM

We hope to see you there!

Date & Time: 9:00am - 12:00pm, Thursday, September 29, 2016
Location: C-103, SHM, 333 Cedar St
Campus: Medical School
Presenter: Eric Seiser, PhD, Field Application Scientist, Partek Inc.

Title: Broadcast. Navigating NCBI Molecular Data Using the Integrated Entrez System and BLAST

Description: This workshop will be broadcast from the Taubman Health Sciences Library, University of Michigan, and provides an introduction to the NCBI molecular databases and how to access the data using the Entrez text-based search system and the BLAST sequence similarity search tool. You will learn about the varied types of available molecular data and how to find and display sequence, variation, and genome information using organism sources (Taxonomy) and data sources (BioProject), with emphasis on the central role of the gene as an organizing concept for navigating across the integrated databases (Gene, Nucleotide, Protein, dbSNP, and other resources).

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St
Presenter: Peter Cooper, Ph.D., National Center for Biotechnology Information
Date/time: 9:00am - 12:00pm, Tuesday, October 4, 2016

Title: Broadcast. A Practical Guide to NCBI BLAST

Description: This workshop will be broadcast from the Taubman Health Sciences Library, University of Michigan. It highlights important features and demonstrates the practical aspects of using the NCBI BLAST service, the most popular sequence similarity service in the world. You will learn about useful but under-used features of the service. These include access from the Entrez sequence databases; the new genome BLAST service quick finder; the integration and expansion of Align-2-Sequences; organism limits and other filters; re-organized databases; formatting options and downloading options; and TreeView displays. You will also learn how to use other important sequence analysis services associated with BLAST, including Primer BLAST, an oligonucleotide primer designer and specificity checker; the multiple protein sequence alignment tool, COBALT; and MOLE-BLAST, a new tool for clustering and providing taxonomic context for targeted loci sequences (16S, ITS, 28S). These aspects of BLAST provide easier access and results that are more comprehensive and easier to interpret.

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St
Presenter: Peter Cooper, Ph.D., National Center for Biotechnology Information
Date/time: 9:00am - 12:00pm, Wednesday, October 5, 2016

Title: Make new discoveries with your OMICs data: Hypothesis testing and assumption-free exploration

Description:

10AM - Noon:

- Exemplary workflows for different experiment designs
- Hypothesis testing and assumption-free data exploration
- Working with annotations, dynamic and interactive plots
- Input data: any multivariate matrix data (RNAseq, Microarrays, Proteomics, miRNA, Metabolomics, Lipidomics, methDNA, Multiplex and FACS, Clinical data, Biomarkers, etc.), as well as publicly available GEO data, gene set files, and gene ontology. Complete list is available here.

1-2PM: Getting started session – take advantage of trial access for Yale! Have a look at the info uploaded to the Yale Library folder, including presentation, case studies, tutorials, etc.

Qlucore tools allow researchers to perform advanced visualization, exploration, and statistical analysis of omics data with the help of an intuitive GUI. Targets of interest can be further explored in terms of biological insight using GO and GSEA. Unmatched speed, immediate visual feedback, continuous visualization, and synchronized views significantly shorten both data-to-result and query-to-discovery times. By combining the right annotations with statistical methods, data selection tools, and the eliminated factors function, a very broad range of different experiment designs can be analyzed with exceptional productivity. This solution draws upon both innovative and classical approaches, fueled by best-in-class industrial and academic research.

Qlucore Omics Explorer helps you advance your research by:

- boosting the speed of your analysis by at least 50%
- generating new ideas and hypotheses, and giving you a new perspective on your data and the questions you ask of it
- helping you recognize significant insight that is specific to a biological process, disease, or function, as well as supporting assumption-free exploration
- keeping your projects on track with simple QC checks at every step
- providing publication-ready graphics and intermediate results for collaboration

Qlucore Omics Explorer is used by big commercial companies as well as major research organizations and universities across Europe and the US (e.g., Boehringer Ingelheim, Roche Diagnostics, AstraZeneca, DFCI, BWH, Harvard, MD Anderson, MSKCC, MedImmune, Novo Nordisk, etc.).

Date & Time: 10:00am - 12:00pm, Thursday, October 27, 2016
Location: C-103 - SHM 333 Cedar St
Campus: Medical School
Presenter: Yana Khalina-Stackpole, PhD, Business and Support manager, Qlucore

Fall Training Sessions on Bioinformatics at the Medical Library

September 6, 2016 - 10:40pm by Rolando Garcia-Milian

The Yale Medical Library is offering a number of bioinformatics training sessions this Fall. These sessions are free and open to any Yale affiliate, but registration is required due to limited seating. Please contact Rolando.milian@yale.edu with questions or comments.

Title: The VERY Basics of the Unix Command Line

A lot of biomedical software programs do not come with a graphical user interface (GUI), and a Unix command-line terminal environment is required to run such programs. In this 2-hour session, you will learn the basics of a Unix command-line terminal, such as how to navigate the file system, the permission and security structure, and how to run programs from the command line. No previous Unix or command-line experience is required to attend this session.

Date: Thursday, October 6, 2016
Time: 10:00am - 12:00pm
Location: Cushing/Whitney Medical Library, Room 103 TCC, 333 Cedar St
Campus: Medical School

Title: Introduction to Enrichment Analysis Tools

Bioinformatics enrichment tools play an important role in identifying, annotating, and functionally analyzing large lists of genes generated by high-throughput technologies (e.g., microarray, RNA-seq, ChIP-chip). This workshop will provide an overview of the principles, types of enrichment, and infrastructure of enrichment tools. Using concrete examples, it will also introduce free tools for enrichment analysis as well as those licensed by the Medical Library.

Date: Thursday, September 8, 2016
Time: 11:00am - 12:30pm
Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.
Campus: Medical School

Title: Making Sense of Genomic Variation: Part 1 SNP Annotation

The specific combination of genetic variation in an individual defines not only external appearance but also susceptibility to diseases, cancer, genetic disorders, drug response, etc. This explains the great interest in discovering and cataloging these variations and using them for disease association and functional studies, among others. In this session we will review the most popular databases and tools to annotate, analyze, and visualize genetic variations. Some of the databases and tools that will be discussed are:

- dbSNP
- Online Mendelian Inheritance in Man (OMIM), a comprehensive, authoritative compendium of human genes and genetic phenotypes
- GWAS Catalog/PheGenI
- EBI-Ensembl Variant Effect Predictor, to annotate and determine the effect of variants on genes, transcripts, and protein sequence, as well as regulatory regions
- And more…

Date: Thursday, September 22, 2016
Time: 11:00am - 12:30pm
Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.
Campus: Medical School

Title: Making Sense of Genomic Variation: Part 2 Structural Variants

Structural variation encompasses diverse types of genomic variants, including deletions, duplications, inversions, transpositions, and translocations, among others. In many cases, determining whether a particular genetic variant is pathogenic or benign, and how it correlates with a patient's disease phenotype, is challenging. In this session we will use online resources and tools to find, retrieve, annotate, and visualize structural variants:

- NCBI’s database of genomic structural variants, dbVar
- DatabasE of Chromosomal Imbalance and Phenotype in Humans
- Database of Genomic Variants DGVa
- UCSC and Ensembl genome browsers

Date: Thursday, October 6, 2016
Time: 11:00am - 12:30pm
Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.
Campus: Medical School

Title: Introduction to Genome Browsers. Part 1 Ensembl

Ensembl provides access to genomic information with a number of visualization tools. Using Ensembl, researchers can download data directly (e.g., genomic sequences) and visualize many types of data (e.g., structural, variation, regulatory) directly on a genome assembly. In this session we will review the basic functionalities and navigation of Ensembl using specific examples. We will also use the BioMart interface to answer questions and retrieve data and information from databases without the need for any programming expertise.

Date: Thursday, October 20, 2016
Time: 11:00am - 12:30pm
Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.
Campus: Medical School

Title: Understanding Research Impact

Nowadays, it is not uncommon for employers, academic institutions, and funding agencies to ask for evidence of research impact before making important decisions, such as tenure promotions, academic honors, or grant awards. It is therefore important for researchers to understand what research impact is and what they can do to document, enhance, measure, and present their research impact to those decision makers. This session introduces the core concepts of research impact, its deep roots and long tradition, the various quantitative metrics of impact, and an emerging practical framework for telling impact stories. It also introduces how to publish and disseminate research work in ways that improve discoverability and therefore enhance impact.

Date: Thursday, November 10, 2016
Time: 10:30am - 11:30am
Location: Cushing/Whitney Medical Library, Room 103 TCC, 333 Cedar St
Campus: Medical School

Title: My Bibliography and SciENcv: grant reporting, compliance and biosketch through MyNCBI

Although not required at this point, the NIH suggests the use of the Science Experts Network Curriculum Vitae (SciENcv), a MyNCBI online tool that serves as an interagency system designed to create biosketches for multiple federal agencies. This, along with the use of My Bibliography for grant activity reporting and NIH Public Access Policy compliance, increases the importance of using MyNCBI as a tool for managing NIH-sponsored research. This workshop introduces researchers, research assistants, and administrators to the effective use of these online tools and will cover the following, among other topics:

- How to create a MyNCBI account and how to link it to the eRA Commons account
- How to delegate your account
- How to populate and manage My Bibliography
- How to use My Bibliography for grant reporting/compliance
- How to use SciENcv to create different biosketches (from scratch, from an external source, etc.)
- How to create an ORCID ID* and how to link SciENcv to that ORCID ID

*ORCID stands for Open Researcher and Contributor ID. Some publishers and journals (Springer, Wiley, Journal of Neuroscience, The Journal of Immunology, etc.) are asking authors to submit their ORCID ID along with their manuscripts for publication.

Date: Thursday, December 1, 2016
Time: 11:00am - 12:30pm
Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St
Campus: Medical School

Title: The VERY Basics of the Unix Command Line

A lot of biomedical software programs do not come with a graphical user interface (GUI), and a Unix command-line terminal environment is required to run such programs. In this 2-hour session, you will learn the basics of a Unix command-line terminal, such as how to navigate the file system, the permission and security structure, and how to run programs from the command line. No previous Unix or command-line experience is required to attend this session.

Date: Thursday, December 8, 2016
Time: 10:00am - 12:00pm
Location: Cushing/Whitney Medical Library, Room 103 TCC, 333 Cedar St.
Campus: Medical School

Access to Partek Flow for the analysis of NGS data available to Yale biomedical researchers

July 8, 2016 - 2:11pm by Rolando Garcia-Milian

The Yale Medical Library is providing access to Partek Flow, user-friendly software with a graphical user interface for the analysis of RNA, SmallRNA, and DNA sequencing experiments. A webinar showing how to use this software will take place in SHM C-103 on August 4, 2016 (see details below).

Webinar: NGS Data Analysis in Partek Software

Description: Why have over 5,000 scientific articles cited Partek software for turning their data into discovery? Because it empowers scientists to perform sophisticated statistical analyses with intuitive point-and-click actions, no command-line knowledge needed. Join us for a complimentary webinar to see how Partek Flow software can be used to analyze your RNA, SmallRNA, and DNA sequencing experiments. Using an RNA-Seq data set, we’ll demonstrate how to check read quality, align reads against a reference genome, quantify RNA transcript levels, and identify differentially expressed genes. We’ll show you how to save your analysis steps and parameters in your own start-to-finish, repeatable and shareable pipeline. The webinar will conclude with a live Q&A session.

Figure: Flow that aligns RNA-Seq reads to a reference genome using the STAR aligner, followed by quantification of reads to a transcriptome (from https://documentation.partek.com/display/FLOWDOC/Pipelines)

Date & Time: 9:30am - 11:00am, Thursday, August 4, 2016
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520
Campus: Medical School
Presenter: Eric Seiser, PhD, Field Application Scientist, Partek Inc.

Day of Data 2016 Spring Discussion Series: Outcome Defined Organization of Patient Profiles

April 25, 2016 - 11:48am by Rolando Garcia-Milian

The Day of Data 2016 Spring Discussion Series will feature Dr. Alexander Cloninger. Dr. Cloninger has active applied collaborations with medical researchers at the Center for Outcomes Research and Evaluation at Yale and the National Institutes of Health. His current research deals with defining and analyzing patient similarity for the purposes of clustering and outcome prediction.

Thursday, May 5, 2016
1:30 – 3:00 pm
Sterling Memorial Library Lecture Hall

Alex Cloninger is a Gibbs Assistant Professor in the Applied Mathematics Program at Yale, where he has been since 2014. He completed his Ph.D. at the University of Maryland as a member of the Norbert Wiener Center for Harmonic Analysis, and his undergraduate degree at Washington University in St. Louis. His research interests lie in the areas of machine learning and diffusion geometry, ranging from theory to implementation and data processing, with a focus on developing novel algorithms to work with medical data.

Sponsored by:
Yale Center for Research Computing
Yale Institution for Social and Policy Studies
Yale University Library

Four On-site Workshops on Next-Generation Sequencing Data Analysis Tools

April 20, 2016 - 10:10am by Rolando Garcia-Milian

The End-user Bioinformatics Program at the Yale Cushing/Whitney Medical Library is hosting four workshops on tools for the analysis of NGS data. Besides the two trainings on tools for functional analysis of NGS data already supported by the Medical Library (Ingenuity Pathway Analysis and MetaCore), we will have a presentation on Partek Flow and another on CLC Bio (QIAGEN). The Medical Library will ask for feedback on these tools to inform future support and licensing. If you are interested in attending any of these presentations, please register to reserve your seat. Please contact Rolando Milian with questions or comments.

Title: Introductory Workshop to MetaCore and Key Pathway Advisor – Pathway Analysis of “Omics” Data

This hands-on training workshop will highlight basic functionalities as well as cover use cases to:

- Predict upstream regulators of gene expression using causal reasoning
- Use synergistic enrichment analysis of upstream regulators and observed gene expression changes to identify key pathways associated with your data
- Compare between experiments to uncover phenotypic differences using enrichment analyses
- Search and explore genes, proteins, diseases, and compounds

Date & Time: 9:00am - 11:00am, Thursday, April 28, 2016
Location: Beaumont room (2nd floor), SHM, 333 Cedar St, New Haven CT 06510
Presenter: Deborah Riley, PhD, Senior Solution Scientist – Thomson Reuters Life Sciences

Title: Start-to-finish Analysis Software for NGS & Microarray Data. (miRNA-Seq Analysis with Partek: Serum miRNA Study in Alcohol Use Disorder Subjects Suggests Alterations of CNS Structure and Function)

During this seminar, we will feature a successful miRNA-Seq based study of extracellular miRNAs in 20 individuals diagnosed with alcohol use disorder (AUD). We will demonstrate how to go from raw NGS data to biological interpretation using Partek software. Analysis of the sequencing data using Partek Flow will include:

- checking the quality of reads
- generating aligned reads
- quantifying miRNA levels
- determining differentially expressed miRNAs

By integrating miRNA-Seq results in Partek Genomics Suite, we will demonstrate how differentially expressed miRNAs impact CNS structure and function using Partek Pathway. Lastly, miRNA expression microarray data will be analyzed in Partek Genomics Suite to validate findings from the next-generation sequencing data.

Date & Time: 9:00am - 11:00am, Tuesday, May 3, 2016
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520
Presenter: Dr. Eric Seiser, Field Application Scientist, Partek Incorporated

Title: Ingenuity Pathway Analysis Hands On Training

If you have gene (including RNAseq), protein, or metabolic expression data, you should be using IPA to guide you with the biological interpretation of your data. Using IPA you will learn how to rapidly understand:

- Pathway involvement and change
- Affected biological processes
- Causal regulators and their directional effect on genes, functions, and diseases across multiple time points or doses

You will also learn to explore IPA’s knowledge and discovery tools, which allow you to relate the most recent literature findings to your research.

Requirement: Laptop and active IPA account (Request your account here)
Presenter: Devendra Mistry, PhD, Field Application Scientist, Ingenuity Products, QIAGEN
Date & Time: 9:00am - 11:30am, Tuesday, May 10, 2016
Location: C-103 - SHM 333 Cedar St, New Haven CT 06510

Title: CLC Genomics Workbench

- Overview of Application, Importing NGS read data, QC & Pre-processing
- De novo assembly – Genomes & Transcriptomes. Characterizing Contigs, Joining & Finishing
- Mapping/Alignment to Reference, Variant Calling, Annotation & Filtering
- RNA Seq Analysis Workflow & Tools
- Overview of Microbial Modules (Finishing & Microbial Genomics)

CLC Biomedical Workbench & Ingenuity Variant Analysis: Prebuilt intuitive pipeline for your human DNA-seq data that allows you to quickly go from reads or called variants to identifying and prioritizing the causal variants.

Date & Time: 1:00pm - 3:00pm, Tuesday, May 10, 2016
Location: C-103 - SHM 333 Cedar St, New Haven CT 06510
Presenter: Devendra Mistry, PhD, Field Application Scientist, Ingenuity Products, QIAGEN

On-site NCBI Bioinformatics Workshop at Yale School of Medicine

February 24, 2016 - 2:00pm by Rolando Garcia-Milian

On April 5 and 6, Dr. Peter Cooper*** will provide training at Yale School of Medicine in the form of four workshops (see below) on some of the most valuable National Center for Biotechnology Information bioinformatics resources and tools. This training is hosted by the Yale Cushing/Whitney Medical Library. Although it is free and open to any Yale affiliate, registration is recommended since seating is limited.

Please contact Rolando Milian with questions on these sessions: 203-785-6194

A Practical Guide to NCBI BLAST

This workshop highlights important features and demonstrates the practical aspects of using the NCBI BLAST service, the most popular sequence similarity service in the world. You will learn about useful but under-used features of the service. These include access from the Entrez sequence databases; the new genome BLAST service quick finder; the integration and expansion of Align-2-Sequences; organism limits and other filters; re-organized databases; formatting options and downloading options; and TreeView displays. You will also learn how to use other important sequence analysis services associated with BLAST, including Primer BLAST, an oligonucleotide primer designer and specificity checker; the multiple protein sequence alignment tool, COBALT; IgBLAST, a tool for analysis of antibody and T-cell receptor sequences; and MOLE-BLAST, a new tool for clustering and providing taxonomic context for targeted loci sequences (16S, ITS, 28S). These aspects of BLAST provide easier access and results that are more comprehensive and easier to interpret.

Date: Tuesday, April 5, 2016
Time: 9:00am - 12:00pm
Location: C-103 - SHM 333 Cedar St, New Haven CT 0652

Accessing Genomes, Assemblies and Annotation Products

You will learn how NCBI processes genome-level data and produces annotation through the prokaryotic and eukaryotic genome annotation pipelines. You will find, browse, and download genome-level data for your organism of interest and for environmental and organismal metagenomes using the Genome, BioProject, and Assembly resources. In addition to assembled and annotated data, you will retrieve and download draft whole genome shotgun and read-level next-gen sequencing data from the Nucleotide and Sequence Read Archive (SRA) databases. You will access results of precomputed analyses of genomes, as well as perform your own analyses of assembled and unassembled genomic data using NCBI's genome BLAST and SRA-BLAST services.

Date: Tuesday, April 5, 2016
Time: 1:30pm - 4:00pm
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520

Accessing NCBI Human Variation and Medical Genetics Resources

You will learn to access and use resources on human sequence variations and the phenotypes associated with specific human genes. The workshop will emphasize the Gene, MedGen, and ClinVar resources to search by gene, phenotype, and variant, respectively. You will learn how to map variation from dbSNP and dbVar onto genes, transcripts, proteins, and genomic regions, and how to find genetic tests in GTR. You will also gain experience using additional tools and viewers, including PheGenI, a browser for genotype associations, and the new Variation Viewer and the 1000 Genomes Browser, which provide useful ways to search for, map, and browse variants as well as upload and download data in genomic context.

Date: Wednesday, April 6, 2016
Time: 9:00am - 12:00pm
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520

Exploring Gene Expression Information at the NCBI

You will find, display, and analyze microarray and sequence-based expression data stored in the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA), UniGene, and Epigenomics databases to investigate the potential for expression of transcript splice variants and examine the levels of expression under varied experimental conditions as well as in different tissues and disease states. You will analyze microarray data with the on-demand GEO2R tool and will explore the precomputed transcript analyses that are displayed on the UniGene and GEO Profiles pages. You will explore genome-aligned RNA-Seq data through the Gene database's sequence viewer displays and analyze raw RNA-Seq reads in the SRA database using NCBI's SRA-BLAST service.

Date: Wednesday, April 6, 2016
Time: 1:30pm - 4:00pm
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520

***Dr. Peter Cooper, Staff Scientist, National Center for Biotechnology Information (NCBI), directs the scientific outreach and training program for the National Center for Biotechnology Information at the National Library of Medicine. Peter has developed and conducted training courses for biologists in the use of NCBI molecular databases and has provided scientific user support for the NCBI since 1998. Prior to joining the NCBI, Peter pursued diverse biological research interests, including peptide neurochemistry and marine environmental toxicology, and taught biology and chemistry. Peter earned a BS from Virginia Tech, an MA in chemistry from the Johns Hopkins University, and a Ph.D. in Marine Science from the College of William and Mary, School of Marine Science, in 1996.

DatabasE of genomiC varIation and Phenotype in Humans using Ensembl Resources (DECIPHER)

January 6, 2016 - 1:41pm by Rolando Garcia-Milian

Many genetic variants are novel or rare, which makes their clinical interpretation difficult. The DECIPHER Consortium was initiated in 2004 as a community of academic centers of Clinical Genetics that submit consented, anonymized genotype and phenotype data from patients with rare genomic disorders for sharing with other clinicians and researchers. Identifying patients who share variants in a given locus and have common phenotypic features leads to greater certainty in the clinical interpretation of these variants. As of January 6, 2015, there are 18 539 publicly available patient records, 51 496 phenotype observations in these patients, and 27 175 publicly available copy-number variants in this database.

DECIPHER can be searched by phenotype, genomic position, band, gene, pathogenicity, variant consequence, etc. Results are presented as a table or can be visualized in a browser. This browser contains different tracks where variants can be visualized in the context of other data.

Learn more about DECIPHER and how to use it to make sense of genetic variants at the workshop “Making Sense of Variation”. You can also contact Rolando Garcia-Milian with questions on this or any other variation tool.

References

DECIPHER: Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources. Firth, H.V. et al. (2009). Am. J. Hum. Genet. 84, 524-533 (DOI: dx.doi.org/10.1016/j.ajhg.2009.03.010)

Do not let Excel deplete your gene list

November 24, 2015 - 3:25pm by Rolando Garcia-Milian

Last night, while preparing an RNAseq dataset for functional analysis, I found this problem again. When opening high-throughput data results in Excel, be aware that this software will, by default, convert some gene symbols into a date format (see examples in the table below). These conversions are not reversible, so the original name cannot be recovered. Zeeberg et al. reported this problem back in 2004. If you are not aware of this and proceed with the functional analysis, the genes that were converted into dates will not be recognized and will not be included in the analysis. If you think this will never happen to you, note that this error has been found in a project as important as the Cancer Genome Atlas.

One way to avoid this, from the end-user bioinformatics perspective, is to define the column containing the gene symbols as “Text” under “Column data format”, as shown in the figure below. Whenever possible, it is recommended to use unique identifiers (Ensembl IDs, Gene IDs, Affymetrix IDs, etc.) rather than gene symbols. If you are not sure of the official symbol of a gene, you can always look it up in the Gene database (NCBI, NIH).

For questions, consultations, or help with your functional analysis, please do not hesitate to contact me.

Table: Example of some human gene symbols that will be converted into dates by Excel.
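For readers comfortable with a little scripting, a gene list can also be screened for at-risk symbols before it is ever opened in a spreadsheet. The sketch below is only an illustration: the pattern is a rough heuristic (a month name or abbreviation followed by one or two digits), not Excel's actual conversion rules, and the function name `flag_date_like` and the sample symbols are assumptions made for this example.

```python
import re

# Heuristic pattern (an assumption, NOT Excel's real parsing rules):
# a month name or abbreviation, an optional hyphen, then one or two digits.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)-?\d{1,2}$",
    re.IGNORECASE,
)


def flag_date_like(symbols):
    """Return, in input order, the symbols a spreadsheet might read as dates."""
    return [s for s in symbols if DATE_LIKE.match(s.strip())]


if __name__ == "__main__":
    # Hypothetical sample list, for illustration only.
    genes = ["SEPT2", "TP53", "MARCH1", "DEC1", "BRCA1"]
    print(flag_date_like(genes))  # ['SEPT2', 'MARCH1', 'DEC1']
```

Any symbol the check flags is a reason to import that column as “Text” or to switch to unique identifiers; an empty result does not guarantee that Excel will leave every symbol untouched.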