Rolando Garcia-Milian's blog

(2 X 2) 2 High-throughput Data Analysis Workshops X 2 on NCBI Public Databases.

19 September 2016 - 10:09am by Rolando Garcia-Milian

Cushing/Whitney Medical Library has organized these four workshops, two of them on high-throughput data analysis tools and two on NCBI public databases. Although these are free and open to any Yale affiliate, registration is required due to limited seating.  

Title:      NGS Data Analysis in Partek® Software: Onsite Workshop

Registration: http://schedule.yale.edu/event/2819565

Description:   

Morning Session (Overview, Hands On: Analysis on RNA-Seq Data): 9:00AM – 12:00PM

Free access to Partek Flow is provided by the Yale Medical Library. Register for an account here

This session will start with an overview of Partek Software solutions followed by a hands-on RNA-Seq data analysis in Partek Flow. Topics will include how to use statistical tests to identify differentially expressed transcripts and alternatively spliced genes among sample groups, how to generate a list of genes of interest, and how to identify high-level biological trends using Gene Ontology.

  • Import data (fastq, bam, text format)
  •  Perform QA/QC (Pre-alignment QA/QC, Post-alignment QA/QC)
  • Trim bases
  • Alignment
  • Gene/transcript abundance estimate (E/M)
  •  Differential expression detection (GSA, ANOVA)
  • Filter gene list
  • GO Enrichment Analysis
  • Visualization
    • Quality score distribution
    • Base composition
    • PCA scatterplot
    • Dotplot
    • Volcano plot
    • Hierarchical clustering
    • Chromosome view
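
As a rough illustration of the "Filter gene list" step above, the sketch below keeps genes that pass both a fold-change cutoff and a p-value cutoff. It is not Partek Flow's implementation; the `GeneResult` type, the `filter_genes` helper, the cutoffs, and the gene values are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a (hypothetical) differential expression result table."""
    symbol: str
    log2_fold_change: float
    p_value: float


def filter_genes(results, min_abs_log2fc=1.0, max_p=0.05):
    """Keep genes that pass both the fold-change and the p-value cutoff."""
    return [
        r for r in results
        if abs(r.log2_fold_change) >= min_abs_log2fc and r.p_value <= max_p
    ]


# Made-up values for illustration only.
example = [
    GeneResult("GENE_A", 2.3, 0.001),
    GeneResult("GENE_B", 0.4, 0.0005),
    GeneResult("GENE_C", -1.8, 0.02),
    GeneResult("GENE_D", 3.1, 0.30),
]

kept = filter_genes(example)
print([r.symbol for r in kept])  # ['GENE_A', 'GENE_C']
```

In practice the cutoffs depend on the experiment, and an adjusted p-value is often used in place of the raw one.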

Afternoon Session (Open Lab; Q&A): 1:30PM – 4:00PM

We hope to see you there!

 
Date & Time:     9:00am - 12:00pm, Thursday, September 29, 2016

Location:           C-103, SHM, 333 Cedar St

Campus:           Medical School

Presenter:          Eric Seiser, PhD, Field Application Scientist, Partek Inc.

 

Title: Broadcast. Navigating NCBI Molecular Data Using the Integrated Entrez System and BLAST

Registration: http://schedule.yale.edu/event/2832345

Description: This workshop will be broadcast from the Taubman Health Sciences Library, Univ Michigan, and provides an introduction to the NCBI molecular databases and how to access the data using the Entrez text-based search system and BLAST sequence similarity search tool. You will learn the varied types of available molecular data, and how to find and display sequence, variation, and genome information using organism sources (Taxonomy) and data sources (Bioproject), emphasizing the central role of the gene as an organizing concept to navigate across the integrated databases (Gene, Nucleotide, Protein, dbSNP and other resources).
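
For readers who later want to script the kind of Entrez text search described above, here is a minimal sketch that only builds a query URL for NCBI's E-utilities ESearch endpoint and does not send it. The `build_esearch_url` helper and the example query are assumptions made for illustration; the workshop itself covers the web interface.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for a text query against one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical query: search the Gene database for human BRCA1.
url = build_esearch_url("gene", "BRCA1[Gene Name] AND human[Organism]")
print(url)
```

Opening the printed URL in a browser should return the matching record identifiers; NCBI's usage guidelines for automated access apply to scripted requests.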

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St

Presenter: Peter Cooper, Ph.D. of the National Center for Biotechnology Information

Date/time: 9:00am - 12:00pm, Tuesday, October 4, 2016

 

Title: Broadcast. A Practical Guide to NCBI BLAST

Registration: http://schedule.yale.edu/event/2832347

Description: This workshop will be broadcast from the Taubman Health Sciences Library, Univ Michigan, and highlights important features and demonstrates the practical aspects of using the NCBI BLAST service, the most popular sequence similarity service in the world. You will learn about useful but under-used features of the service. These include access from the Entrez sequence databases; the new genome BLAST service quick finder; the integration and expansion of Align-2-Sequences; organism limits and other filters; re-organized databases; formatting options and downloading options; and TreeView displays. You will also learn how to use other important sequence analysis services associated with BLAST including Primer BLAST, an oligonucleotide primer designer and specificity checker; the multiple protein sequence alignment tool, COBALT; and MOLE-BLAST, a new tool for clustering and providing taxonomic context for targeted loci sequences (16S, ITS, 28S). These aspects of BLAST provide easier access and results that are more comprehensive and easier to interpret.

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St

Presenter: Peter Cooper, Ph.D. of the National Center for Biotechnology Information

Date/time: 9:00am - 12:00pm, Wednesday, October 5, 2016

 

Title:      Make new discoveries with your OMICs data: Hypothesis testing and assumption-free exploration

Registration: http://schedule.yale.edu/event/2832289

Description:       

10AM- Noon:

  • Exemplary workflows for different experiment designs
  • Hypothesis testing and Assumption-free data exploration
  • Working with annotations, dynamic and interactive plots

Input data: any matrix multivariate data (RNAseq, Microarrays, Proteomics, miRNA, Metabolomics, Lipidomics, methDNA, Multiplex and FACS, Clinical data, Biomarkers, etc.), as well as publicly available GEO data, gene set files, and gene ontology. A complete list is available here.

1-2PM:

Getting started session – take advantage of trial access for Yale! Have a look at the info uploaded to the Yale Library folder, including the presentation, case studies, tutorials, etc.

Qlucore tools allow researchers to perform advanced visualization, exploration and statistical analysis of omics data with the help of an intuitive GUI. Targets of interest can be further explored in terms of biological insight using GO and GSEA.  Unmatched speed, immediate visual feedback, continuous visualization, and synchronized views significantly shorten both data-to-result and query-to-discovery times.

By combining right annotations with statistical methods, data selection tools, and the eliminated factors function, a very broad range of different experiment designs can be analyzed with exceptional productivity. This solution draws upon both innovative and classical approaches, fueled by best-in-class industrial and academic research.

Qlucore Omics Explorer helps you advance your research by:

  • boosting the speed of your analysis by at least 50%
  • generating new ideas and hypotheses, and giving you a new perspective on your data and the questions you ask of it
  • helping recognize significant insight that is specific to biological process, disease, or function, as well as assumption-free exploration
  • keeping your projects on track with simple QC checks on every step
  • providing publication ready graphics, and intermediate results for collaboration.

Qlucore Omics Explorer is used by big commercial companies as well as major research organizations and universities across Europe and the US (e.g., Boehringer Ingelheim, Roche Diagnostics, AstraZeneca, DFCI, BWH, Harvard, MD Anderson, MSKCC, MedImmune, Novo Nordisk, etc.).

Date & Time:     10:00am - 12:00pm, Thursday, October 27, 2016

Location:           C-103 - SHM 333 Cedar St

Campus:            Medical School

Presenter:          Yana Khalina-Stackpole, PhD, Business and Support manager, Qlucore

Fall Training Sessions on Bioinformatics at the Medical Library

6 September 2016 - 10:40pm by Rolando Garcia-Milian

The Yale Medical Library is offering a number of bioinformatics training sessions this Fall. These sessions are free and open to any Yale affiliate, but registration is required due to limited seating.

Please contact Rolando.milian@yale.edu  for questions or comments.

 

Title: The VERY Basics of the Unix Command Line

Registration required: http://schedule.yale.edu/event/2800083

A lot of biomedical software programs do not come with a graphical user interface (GUI), and a Unix command-line terminal environment is required to run such programs. In this 2-hour session, you will learn the basics of a Unix command-line terminal, such as how to navigate the file system, the permission and security structure, and how to run programs from the command line. No previous Unix or command-line experience is required to attend this session.

Date: Thursday, October 6, 2016

Time: 10:00am - 12:00pm

Location: Cushing/Whitney Medical Library, Room 103 TCC, 333 Cedar St

Campus: Medical School

 

Title: Introduction to Enrichment Analysis Tools

Registration required: http://schedule.yale.edu/event/1154118

Bioinformatics enrichment tools play an important role in identifying, annotating, and functionally analyzing large lists of genes generated by high-throughput technologies (e.g. microarray, RNA-seq, ChIP-chip). This workshop will provide an overview of the principles, types of enrichment, and infrastructure of enrichment tools. By using concrete examples, it will also introduce free tools for enrichment analysis as well as those licensed by the Medical Library.
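
One common principle behind over-representation style enrichment is the hypergeometric test: given a gene list, how surprising is its overlap with a gene set? The sketch below is a minimal illustration under that assumption and does not reproduce the method of any particular tool covered in the workshop. The `enrichment_p_value` helper and all the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background
    annotated:  number of background genes that belong to the gene set
    sample:     size of the gene list being tested
    overlap:    number of genes in the list that belong to the gene set
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    hits = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return hits / total


# Made-up counts: 20,000 background genes, 200 in the gene set,
# a list of 500 genes, 15 of which fall in the gene set.
p = enrichment_p_value(20000, 200, 500, 15)
print(f"p = {p:.3g}")
```

Real tools usually test many gene sets at once and therefore apply a multiple-testing correction on top of a per-set p-value like this one.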

Date: Thursday, September 8, 2016

Time: 11:00am - 12:30pm

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St. Campus: Medical School

 

Title: Making Sense of Genomic Variation: Part 1 SNP Annotation

Registration required: http://schedule.yale.edu/event/2794996

The specific combination of genetic variation in an individual defines not only the external appearance but also susceptibility to diseases, cancer, genetic disorders, drug response, etc. This explains the great interest in discovering and cataloging these variations and using them for disease association and functional studies, among others. In this session we will review the most popular databases and tools to annotate, analyze and visualize genetic variations. Some of the databases and tools that will be discussed are:

  • dbSNP
  • Online Mendelian Inheritance in Man, a comprehensive, authoritative compendium of human genes and genetic phenotypes.
  • GWAS Catalog/PheGenI
  • EBI-Ensembl Variant Effect Predictor to annotate and determine the effect of variants on genes, transcripts, and protein sequence, as well as regulatory regions.
  • And more…

Date: Thursday, September 22, 2016

Time: 11:00am - 12:30pm

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.

Campus: Medical School

 

Title: Making Sense of Genomic Variation: Part 2 Structural Variants

Registration required: http://schedule.yale.edu/event/2795002

Structural variation encompasses diverse types of genomic variants including deletions, duplications, inversions, transpositions, translocations, among others. In many cases, determining whether a particular genetic variant is pathogenic or benign and its correlation with respect to a patient's disease phenotype is challenging.

In this session we will use online resources and tools to find, retrieve, annotate, and visualize structural variants:

  • NCBI’s database of genomic structural variants dbVar
  • DatabasE of Chromosomal Imbalance and Phenotype in Humans
  • Database of Genomic Variants DGVa
  • UCSC and Ensembl genome browsers

Date: Thursday, October 6, 2016

Time: 11:00am - 12:30pm

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.

Campus: Medical School

 

Title: Introduction to Genome Browsers. Part 1 Ensembl

Registration required: http://schedule.yale.edu/event/2795004

Ensembl provides access to genomic information with a number of visualization tools. By using Ensembl, researchers can download data directly (e.g., genomic sequences) and visualize many types of data (e.g., structural, variation, regulatory) directly on a genome assembly. In this session we will review the basic functionalities and navigation of Ensembl by using specific examples. We will also use the BioMart interface to answer questions and retrieve data and information from databases without needing any programming expertise.

Date: Thursday, October 20, 2016

Time: 11:00am - 12:30pm

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St.

Campus: Medical School

 

Title: Understanding Research Impact

Registration required: http://schedule.yale.edu/event/2803327

Nowadays, it is not uncommon for employers, academic institutions, and funding agencies to ask for evidence of research impact before making important decisions, such as tenure promotions, academic honors, or grant awards. Therefore, it is important for researchers to understand what research impact is and what they can do to document, enhance, measure and present their research impact to those decision makers. This session introduces the core concepts of research impact, its deep roots and long tradition, the various quantitative metrics of impact, and an emerging practical framework for telling impact stories. This session also introduces how to publish and disseminate research work in ways that improve discoverability and therefore enhance impact.

Date: Thursday, November 10, 2016

Time: 10:30am - 11:30am

Location: Cushing/Whitney Medical Library, Room 103 TCC, 333 Cedar St

Campus: Medical School

 

Title: My Bibliography and SciENcv: grant reporting, compliance and biosketch through MyNCBI

Registration required: http://schedule.yale.edu/event/2795006

Although not required at this point, the NIH suggests the use of the Science Experts Network Curriculum Vitae (SciENcv), a MyNCBI online tool that serves as an interagency system designed to create biosketches for multiple federal agencies. This, along with the use of My Bibliography for grant activity reporting and NIH Public Access Policy compliance, increases the importance of using MyNCBI as a tool for managing NIH-sponsored research. This workshop introduces researchers, research assistants and administrators to the effective use of these online tools and will cover the following, among other topics:

  • How to create a MyNCBI account and how to link it to the eRA Commons account
  • How to delegate your account
  • How to populate and manage My Bibliography
  • How to use My Bibliography for grant reporting/compliance
  • How to use SciENcv to create different biosketches (from scratch, from external source, etc)
  • How to create an ORCID ID* and how to link SciENcv to that ORCID ID

*ORCID stands for Open Researcher and Contributor ID. Some publishers and journals (Springer, Wiley, Journal of Neuroscience, The Journal of Immunology, etc.) are asking authors to submit their ORCID ID along with their manuscripts for publication.

Date: Thursday, December 1, 2016

Time: 11:00am - 12:30pm

Location: Cushing/Whitney Medical Library, Simbonis Conference Room 101A, 333 Cedar St

Campus: Medical School

 

Title: The VERY Basics of the Unix Command Line

Registration required: http://schedule.yale.edu/event/2803929

A lot of biomedical software programs do not come with a graphical user interface (GUI), and a Unix command-line terminal environment is required to run such programs. In this 2-hour session, you will learn the basics of a Unix command-line terminal, such as how to navigate the file system, the permission and security structure, and how to run programs from the command line. No previous Unix or command-line experience is required to attend this session.

Date: Thursday, December 8, 2016

Time: 10:00am - 12:00pm

Location: Cushing/Whitney Medical Library, Room 103 TCC, 333 Cedar St.

Campus: Medical School

 

For a full list of training sessions including PubMed, EndNote, etc., please visit the medical library calendar: http://library.medicine.yale.edu/classes

Access to Partek Flow for the analysis of NGS data available to Yale biomedical researchers

8 July 2016 - 2:11pm by Rolando Garcia-Milian

The Yale Medical Library is providing access to Partek Flow, a Graphical User Interface and user-friendly software for the analysis of RNA, SmallRNA, and DNA sequencing experiments. A webinar showing how to use this software will take place in SHM C-103 on August 4, 2016 (see details below).

Webinar: NGS Data Analysis in Partek Software

Registration here

Description: Why have over 5,000 scientific articles cited Partek software for turning their data into discovery? Because it empowers scientists to perform sophisticated statistical analyses with intuitive point-and-click actions, no command-line knowledge needed.

 Join us for a complimentary webinar to see how Partek Flow software can be used to analyze your RNA, SmallRNA, and DNA sequencing experiments. Using an RNA-Seq data set, we’ll demonstrate how to check read quality, align reads against a reference genome, quantify RNA transcript levels, and identify differentially expressed genes. We’ll show you how to save your analysis steps and parameters in your own start-to-finish, repeatable and shareable pipeline.

The webinar will conclude with a live Q&A session.

Flow that aligns RNA-Seq reads to a reference genome using the STAR aligner followed by quantification of reads to a transcriptome (from http://www.partek.com/pipelines)

Date & Time:      9:30am - 11:00am, Thursday, August 4, 2016

Location:              C-103 - SHM 333 Cedar St, New Haven CT 06520

Campus:              Medical School

Presenter:          Eric Seiser, PhD, Field Application Scientist, Partek Inc.

Day of Data 2016 Spring Discussion Series: Outcome Defined Organization of Patient Profiles

25 April 2016 - 11:48am by Rolando Garcia-Milian

The Day of Data 2016 Spring Discussion Series will feature Dr. Alexander Cloninger. Dr. Cloninger has active applied collaborations with medical researchers at the Center for Outcomes Research and Evaluation at Yale and the National Institutes of Health. His current research deals with defining and analyzing patient similarity for the purposes of clustering and outcome prediction.

  • Thursday, May 5, 2016
  • 1:30 – 3:00 pm
  • Sterling Memorial Library Lecture Hall

Alex Cloninger is a Gibbs Assistant Professor in the Applied Mathematics Program at Yale, where he has been since 2014. He completed his Ph.D. at University of Maryland as a member of the Norbert Wiener Center for Harmonic Analysis, and his undergrad at Washington University in St. Louis. His research interests lie in the areas of machine learning and diffusion geometry, ranging from theory to implementation and data processing, with a focus on developing novel algorithms to work with medical data.

SPONSORED BY
Yale Center for Research Computing
Yale Institution for Social and Policy Studies
Yale University Library
For more information, click here 

 

Four On-site Workshops on Next-Generation Sequencing Data Analysis Tools

20 April 2016 - 10:10am by Rolando Garcia-Milian

The End-user Bioinformatics Program at the Yale Cushing/Medical Library is hosting these four workshops on tools for the analysis of NGS data. Besides the two trainings on tools for functional analysis of NGS data already supported by the Medical Library (Ingenuity Pathway Analysis and MetaCore), we will have a presentation on Partek Flow, and another one on CLC Bio (QIAGEN). The medical library will ask for feedback on these tools for future support and licensing. If you are interested in attending any of these presentations, please register to reserve your seat. Please contact Rolando Milian for questions or comments.

Title:    Introductory Workshop to MetaCore and Key Pathway Advisor – Pathway Analysis of “Omics” Data
This hands-on training workshop will highlight basic functionalities as well as cover use cases to:

  • Predict upstream regulators of gene expression using causal reasoning
  • Use synergic enrichment analysis of upstream regulators and observed gene expression changes to identify key pathways associated with your data
  • Compare between experiments to uncover phenotypic differences using enrichment analyses
  • Search and explore genes, proteins, diseases and compounds

Registration required
Date & Time:    9:00am - 11:00am, Thursday, April 28, 2016
Location:    Beaumont room (2nd floor), SHM, 333 Cedar St, New Haven CT 06510
Presenter:    Deborah Riley, PhD, Senior Solution Scientist – Thomson Reuters Life Sciences


 
Title:    Start-to-finish Analysis Software for NGS & Microarray Data.
(miRNA-Seq Analysis with Partek: Serum miRNA Study in Alcohol Use Disorder Subjects Suggests Alterations of CNS Structure and Function)     
During this seminar, we will feature a successful miRNA-Seq based study of extracellular miRNAs in 20 individuals diagnosed with alcohol use disorder (AUD).  We will demonstrate how to go from raw NGS data to biological interpretation using Partek software.  Analysis of the sequencing data using Partek Flow will include:

  • checking the quality of reads
  • generating aligned reads
  • quantifying miRNA levels
  • determining differentially expressed miRNAs

By integrating miRNA-Seq results in Partek Genomics Suite, we will demonstrate how differentially expressed miRNAs impact CNS structure and function using Partek Pathway.  Lastly, miRNA expression microarray data will be analyzed in Partek Genomics Suite to validate findings from the next generation sequencing data.
Registration required
Date & Time:    9:00am - 11:00am, Tuesday, May 3, 2016
Location:    C-103 - SHM 333 Cedar St, New Haven CT 06520
Presenter:    Dr. Eric Seiser, Field Application Scientist, Partek Incorporated

 

Title:    Ingenuity Pathway Analysis Hands On Training
If you have gene (including RNAseq), protein and metabolic expression data, you should be using IPA to guide you with the biological interpretation of your data.  Using IPA you will learn how to rapidly understand:

  • Pathway involvement and change
  • Affected biological processes
  • Causal regulators and their directional effect on genes, functions and diseases across multiple time points or doses. You will also learn to explore IPA’s knowledge and discovery tools that allow you to relate the most recent literature findings to your research.  

Requirement:  Laptop and active IPA account (Request your account here)
Registration required
Presenter: Devendra Mistry, PhD, Field Application Scientist, Ingenuity Products, QIAGEN
Date & Time:    9:00am - 11:30am, Tuesday, May 10, 2016
Location:    C-103 - SHM 333 Cedar St, New Haven CT 06510

 

Title: CLC Genomics Workbench
Overview of Application, Importing NGS read data, QC & Pre-processing

  • De novo assembly – Genomes & Transcriptomes. Characterizing Contigs, Joining & Finishing
  • Mapping/Alignment to Reference, Variant Calling, Annotation & Filtering
  • RNA Seq Analysis Workflow & Tools
  • Overview of Microbial Modules (Finishing & Microbial Genomics)
  • CLC Biomedical Workbench & Ingenuity Variant Analysis
  • Prebuilt intuitive pipeline for your human DNA-seq data that allows you to quickly go from reads or called variants to identifying and prioritizing the causal variants.

Registration required
Date & Time:    1:00pm - 3:00pm, Tuesday, May 10, 2016
Location:    C-103 - SHM 333 Cedar St, New Haven CT 06510
Presenter:    Devendra Mistry, PhD, Field Application Scientist, Ingenuity Products, QIAGEN  
 

The Yale Medical Library is Providing Access to the Human Gene Mutation Database (HGMD®)

17 April 2016 - 9:45pm by Rolando Garcia-Milian

As part of its End-user Bioinformatics Program, the Cushing/Whitney Medical Library is providing access to the Human Gene Mutation Database (HGMD®) to all Yale affiliates. This database organizes all known genotypes responsible for causing human inherited disease along with disease-associated polymorphisms published in the peer-reviewed literature. HGMD mutation data are manually curated from the scientific literature.

HGMD is available in two versions: one public, one obtainable by subscription. The public version is only updated twice per annum and is permanently 3 years out of date. The professional version is available to both commercial and academic/non-profit users via subscription from BIOBASE (QIAGEN). To access HGMD, follow this link (https://portal.biobase-international.com/cgi-bin/portal/login.cgi) and click on HGMD (access to BIOBASE Proteome or TRANSFAC continues to be provided by the medical library). VPN is required if connecting off-campus.

HGMD does not cover either somatic or mitochondrial mutations. For these, please visit COSMIC and MitoMap. For pharmacological variants, see PharmGKB. The Medical Library offers regular training sessions on how to use these and other resources for variant annotation, e.g. Database of Genomic Variants, DECIPHER and Copy Number Variation in Disease.

Please check the class calendar or contact Rolando Milian if you are interested in learning how to use these resources.

On-site NCBI Bioinformatics Workshop at Yale School of Medicine

24 February 2016 - 2:00pm by Rolando Garcia-Milian

On April 5 and 6, Dr. Peter Cooper*** will provide training at Yale School of Medicine in the form of four workshops (see below) on some of the most valuable National Center for Biotechnology Information bioinformatics resources and tools. This training is hosted by the Yale Cushing/Whitney Medical Library. Although it is free and open to any Yale affiliate, registration is recommended since seating is limited.  

Please contact Rolando Milian for questions on these sessions: 203-785-6194

 

A Practical Guide to NCBI BLAST

Register here

This workshop highlights important features and demonstrates the practical aspects of using the NCBI BLAST service, the most popular sequence similarity service in the world. You will learn about useful but under-used features of the service. These include access from the Entrez sequence databases; the new genome BLAST service quick finder; the integration and expansion of Align-2-Sequences; organism limits and other filters; re-organized databases; formatting options and downloading options; and TreeView displays. You will also learn how to use other important sequence analysis services associated with BLAST including Primer BLAST, an oligonucleotide primer designer and specificity checker; the multiple protein sequence alignment tool, COBALT; IgBLAST, a tool for analysis of antibody and T-cell receptor sequences; and MOLE-BLAST, a new tool for clustering and providing taxonomic context for targeted loci sequences (16S, ITS, 28S). These aspects of BLAST provide easier access and results that are more comprehensive and easier to interpret.

Date:                     Tuesday, April 5, 2016

Time:                     9:00am - 12:00pm

Location:              C-103 - SHM 333 Cedar St, New Haven CT 0652

 

Accessing Genomes, Assemblies and Annotation Products

Register here

You will learn how NCBI processes genome-level data and produces annotation through the prokaryotic and eukaryotic genome annotation pipelines. You will find, browse, and download genome-level data for your organism of interest and for environmental and organismal metagenomes using the Genome, BioProject and Assembly resources. In addition to assembled and annotated data, you will retrieve and download draft whole genome shotgun and read-level next-gen sequencing data from the Nucleotide and Sequence Read Archive (SRA) databases. You will access results of precomputed analyses of genomes, as well as perform your own analyses of assembled and unassembled genomic data using NCBI's genome BLAST and SRA-BLAST services.

Date:                     Tuesday, April 5, 2016

Time:                     1:30pm - 4:00pm

Location:              C-103 - SHM 333 Cedar St, New Haven CT 06520

 

Accessing NCBI Human Variation and Medical Genetics Resources

Register here

You will learn to use and access resources associated with human sequence variations and phenotypes associated with specific human genes and phenotypes. The workshop will emphasize the Gene, MedGen and ClinVar resources to search by gene, phenotype and variant respectively. You will learn how to map variation from dbSNP and dbVAR onto genes, transcripts, proteins, and genomic regions and how to find genetic tests in GTR. You will also gain experience using additional tools and viewers including PheGenI, a browser for genotype associations, and the new Variation Viewer and the 1000 Genomes Browser, which provide useful ways to search for, map and browse variants as well as upload and download data in genomic context.

Date:                     Wednesday, April 6, 2016

Time:                     9:00am - 12:00pm

Location:              C-103 - SHM 333 Cedar St, New Haven CT 06520

 

Exploring Gene Expression Information at the NCBI

Register here

You will find, display and analyze microarray and sequence-based expression data that are stored in the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA), UniGene, and Epigenomics databases to investigate the potential for expression of transcript splice variants and examine the levels of expression under varied experimental conditions as well as in different tissues and disease states. You will analyze microarray data with the on-demand GEO2R tool and will explore the precomputed transcript analyses that are displayed on the UniGene and GEO Profiles pages. You will explore genome-aligned RNA-Seq data through the Gene database's sequence viewer displays and analyze raw RNA-Seq reads in the SRA database using NCBI's SRA-BLAST service.

Date:                     Wednesday, April 6, 2016

Time:                     1:30pm - 4:00pm

Location:              C-103 - SHM 333 Cedar St, New Haven CT 06520

***Dr. Peter Cooper, Staff Scientist, National Center for Biotechnology Information (NCBI), directs the scientific outreach and training program for the National Center for Biotechnology Information at the National Library of Medicine. Peter has conducted and developed training courses for biologists in the use of NCBI molecular databases and has provided scientific user support for the NCBI since 1998. Prior to joining the NCBI, Peter pursued diverse biological research interests including peptide neurochemistry and marine environmental toxicology, and taught biology and chemistry. Peter earned a BS from Virginia Tech, an MA in chemistry from the Johns Hopkins University and a Ph.D. in Marine Science from the College of William and Mary, School of Marine Science in 1996.

DatabasE of genomiC varIation and Phenotype in Humans using Ensembl Resources (DECIPHER)

6 January 2016 - 1:41pm by Rolando Garcia-Milian

Many genetic variants are novel or rare, which makes their clinical interpretation difficult. The DECIPHER Consortium was initiated in 2004 as a community of academic centers of Clinical Genetics who submit consented, anonymized genotype and phenotype data from patients with rare genomic disorders for sharing with other clinicians and researchers. The identification of patients sharing variants in a given locus with common phenotypic features leads to greater certainty in the clinical interpretation of these variants. As of January 6, 2015, there are 18 539 publicly available patient records, 51 496 phenotype observations in these patients, and 27 175 publicly available copy-number variants in this database.

DECIPHER can be searched by phenotype, genomic position, band, gene, pathogenicity, variant consequence, etc. Results are presented as a table or can be visualized in a browser. This browser contains different tracks where variants can be visualized in the context of other data.

Learn more on DECIPHER and how to use it to make sense of genetic variants at the workshop “Making Sense of Variation”. Please register here if you would like to attend.

You can also contact Rolando Garcia-Milian with questions on this or any other variation tool.

References

DECIPHER: Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources. Firth, H.V. et al. (2009). Am. J. Hum. Genet. 84, 524-533 (DOI: dx.doi.org/10.1016/j.ajhg.2009.03.010)

Do not let Excel deplete your gene list

24 November 2015 - 3:25pm by Rolando Garcia-Milian

Last night, while preparing an RNAseq dataset for functional analysis, I found this problem again. When opening high-throughput data results in Excel, be aware that this software will convert (by default) some gene symbols into a date format; see examples in the table below. These conversions are not reversible, so the original name cannot be recovered. Zeeberg et al. reported this problem back in 2004. If you are not aware of this and proceed with the functional analysis, those genes (converted into dates) will not be recognized and will not be computed. If you think that this will never happen to you, note that this error has been found in a project as important as the Cancer Genome Atlas.  

One way to avoid this, from the end-user bioinformatics perspective, is to define the column containing the gene symbols as “Text” under “Column data format”, as shown in the figure below. Whenever possible, it is recommended to use unique identifiers (Ensembl IDs, Gene IDs, Affymetrix IDs, etc.) rather than gene symbols. If you are not sure of the official symbol of a gene, you can always look it up in the Gene database (NCBI, NIH).
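If a gene list has already passed through Excel, a quick check can flag the entries that were probably converted. The following is a minimal sketch in Python, not a complete validator: the function name `find_date_like_symbols` is hypothetical, and the pattern assumes the dates appear in the "1-Sep" or "Sep-1" style that Excel typically displays. Other date formats (for example, fully written-out dates) would need additional patterns.

```python
import re

# Month abbreviations as Excel usually displays them in short dates.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Assumption: converted symbols look like "1-Sep" or "Sep-1" (case-insensitive).
DATE_LIKE = re.compile(
    rf"^(?:\d{{1,2}}-(?:{MONTHS})|(?:{MONTHS})-\d{{1,2}})$",
    re.IGNORECASE,
)


def find_date_like_symbols(symbols):
    """Return (row_index, value) pairs for entries that look like dates."""
    return [
        (index, value)
        for index, value in enumerate(symbols)
        if DATE_LIKE.match(value.strip())
    ]


if __name__ == "__main__":
    # Illustrative column: two entries have been turned into dates.
    column = ["TP53", "1-Sep", "BRCA1", "2-Mar", "DEC1"]
    for index, value in find_date_like_symbols(column):
        print(f"row {index}: {value!r} looks like a date, not a gene symbol")
```

Because the conversion is not reversible, a flagged entry can only be repaired by going back to the original, unconverted file.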

For questions, consultations, or help with your functional analysis, please do not hesitate to contact me.

Examples of human gene symbols that will be converted into dates by Excel.

Get your omics functional analysis done: upcoming trainings on Ingenuity Pathway Analysis and MetaCore

15 October 2015 - 4:45pm by Rolando Garcia-Milian

The Yale Medical Library is providing all Yale affiliates with free access to two of the most powerful commercial bioinformatics tools for the analysis of omics data: MetaCore and Ingenuity Pathway Analysis. This is part of a pilot project conducted by the medical library to find sustainable, long-term access to these tools. Please register for these upcoming trainings if you are interested in learning how to use these tools or if you need a refresher.

For comments or questions on how to register for an account, please contact Rolando Milian.

Title: Introduction to Ingenuity Pathway Analysis

Description:

  • What is IPA and what questions can it address?
  • Overview of key features in IPA
  • Ingenuity Knowledge Base
  • Search & Pathway Building - Gene/ Chemical, Functions, Drug Targets
  • Advanced Search: Limiting results to a molecule type, family or disease-association.
  • Building pathways: Creating a pathway, pathway navigating, Using Build and Overlay tools
  • Bioprofiler
  • Dataset Analysis: Interpretation of Gene, Transcript, Protein and Metabolite Data
  • Data Upload and Analysis:  Uploading and formatting a dataset, setting analysis parameters and running an analysis
  • Pathway Analysis and Canonical Pathways
  • Downstream Effects Analysis and identifying downstream functions and processes that are likely affected
  • Upstream regulators Analysis
  • Causal Network Analysis and identifying likely root regulators
  • Regulator Effects Analysis to link upstream regulators with downstream functions and processes that are affected
  • Comparison analysis and comparing multiple observations

Date & Time:      9:00am - 12:00pm, Tuesday, October 27, 2015

Location:              H-203, Jane Ellen Hope Building, 315 Cedar St, New Haven CT

Presenter:          Field Scientist, QIAGEN Informatics

Register here     

 

Title:      MetaCore: Getting the most from your "omics" analysis (Introductory session)

Description: The ability to generate massive amounts of data with "omics" analysis creates the need for a tool to analyze and prioritize the biological relevance of this information. GeneGo provides a solution for using "omics" gene lists to generate and prioritize hypotheses with MetaCore. This tutorial highlights how to work with different types of data (genomics, proteomics, metabolomics and interaction data), beginning with how to upload gene lists and expression data (if available). Here we demonstrate data manager capabilities, including how to upload, batch upload, store, share and check data properties and signal distribution. We then focus on how MetaCore uses your gene list to extract functional relevance by determining the most enriched processes across several ontologies. This entails a detailed lesson on how to prioritize your hypothesis using the statistically significant enrichment histograms and associated highly interactive GeneGo Maps and pre-built networks. We further emphasize the role of expression data in your analysis and the ability to visually predict experimental results, associated disease and possible drug targets. Lastly, we highlight the benefits of using MetaCore workflows to compare data sets and work with experiment intersections.

Date & Time:      10:00am - 12:00pm, Tuesday, November 3, 2015

Location:              C-103 - SHM 333 Cedar St, New Haven CT 06520

Presenter:          Dr. Matthew Wampole, Solution Scientist, IP & Science, Thomson Reuters

Register here      

Title:      MetaCore: Getting the most from your "omics" analysis (Advanced)

Description: In the advanced tutorial, we will explore uses of our network building algorithms and methods for hypothesizing key hubs based on data. We will begin this session with a discussion on using the Key Pathway Advisor to hypothesize key hubs regulating gene expression data. The session will then review ways of using the 11 network building algorithms in MetaCore. The first example will review how to build a network purely from the curated knowledge within MetaCore. Then we will go through an example of using omics data to build a network of interactions to better understand the relationships within our data.

Date & Time:      1:00pm - 3:00pm, Tuesday, November 3, 2015

Location:              C-103 - SHM 333 Cedar St, New Haven CT 06520

Presenter:          Dr. Matthew Wampole, Solution Scientist, IP & Science, Thomson Reuters

Register here     

 

Join the End-user Bioinformatics Group and become a member of a community that collaborates on end-user bioinformatics events, training sessions, resources, and tools that support biomedical research at Yale.
