
Bioinformatics

For bioinformatics posts

DatabasE of genomiC varIation and Phenotype in Humans using Ensembl Resources (DECIPHER)

January 6, 2016 - 1:41pm by Rolando Garcia-Milian

Many genetic variants are novel or rare, which makes their clinical interpretation difficult. The DECIPHER Consortium was initiated in 2004 as a community of academic clinical genetics centers that submit consented, anonymized genotype and phenotype data from patients with rare genomic disorders for sharing with other clinicians and researchers. Identifying patients who share variants at a given locus and have common phenotypic features leads to greater certainty in the clinical interpretation of these variants. As of January 6, 2015, there are 18 539 publicly available patient records, 51 496 phenotype observations in these patients, and 27 175 publicly available copy-number variants in this database. DECIPHER can be searched by phenotype, genomic position, band, gene, pathogenicity, variant consequence, etc. Results are presented as a table or can be visualized in a browser. This browser contains different tracks where variants can be visualized in the context of other data.

Learn more about DECIPHER and how to use it to make sense of genetic variants at the workshop “Making Sense of Variation”. You can also contact Rolando Garcia-Milian with questions on this or any other variation tool.

References

DECIPHER: Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources. Firth, H.V. et al. (2009). Am. J. Hum. Genet. 84, 524-533 (DOI: dx.doi.org/10.1016/j.ajhg.2009.03.010)

Do not let Excel deplete your gene list

November 24, 2015 - 3:25pm by Rolando Garcia-Milian

Last night, while preparing an RNA-seq dataset for functional analysis, I found this problem again. When opening high-throughput data results in Excel, be aware that this software will convert (by default) some gene symbols into a date format (see examples in the table below). These conversions are not reversible, so the original name cannot be recovered. Zeeberg et al. reported this problem back in 2004. If you are not aware of this and proceed with the functional analysis, those genes (converted into dates) will not be recognized and will not be included in the computation. If you think that this will never happen to you, note that this error has been found in a project as important as The Cancer Genome Atlas.

One way to avoid this, from the end-user bioinformatics perspective, is to define the column containing the gene symbols as “Text” under “Column data format”, as shown in the figure below. Whenever possible, it is recommended to use unique identifiers (Ensembl IDs, Gene IDs, Affymetrix IDs, etc.) rather than gene symbols. If you are not sure, you can always go to the Gene database (NCBI, NIH) to look up the official symbol of a gene.

For questions, consultations, or help with your functional analysis, please do not hesitate to contact me.

Example of some human gene symbols that will be converted into dates by Excel.
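As a complement to the “Text” column setting, a gene list can be screened before and after it passes through a spreadsheet. The following is only a minimal sketch: the function names are hypothetical, and the "at risk" pattern is a rough heuristic (a month name or abbreviation followed by one or two digits), not an exhaustive list of affected symbols.

```python
import re

MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Values such as "2-Sep" or "1-Mar": what a converted symbol can look like.
MANGLED = re.compile(rf"^\d{{1,2}}-(?:{MONTHS})$", re.IGNORECASE)

# Heuristic (assumption): month name or abbreviation followed by 1-2 digits,
# e.g. SEPT2, MARCH1, DEC1.
AT_RISK = re.compile(
    r"^(?:JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)\d{1,2}$",
    re.IGNORECASE,
)


def find_mangled(values):
    """Return entries that already look like dates (likely converted symbols)."""
    return [v for v in values if MANGLED.match(v.strip())]


def find_at_risk(symbols):
    """Return symbols that a spreadsheet may convert into dates on import."""
    return [s for s in symbols if AT_RISK.match(s.strip())]


if __name__ == "__main__":
    print(find_at_risk(["SEPT2", "MARCH1", "DEC1", "TP53"]))
    print(find_mangled(["TP53", "2-Sep", "BRCA1", "1-Mar"]))
```

Running `find_at_risk` on the original list shows which symbols need protection, and running `find_mangled` on an exported column flags entries that were probably converted already. Since the conversion cannot be undone, flagged entries have to be restored from the original data.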

Get your omics functional analysis done: upcoming trainings on Ingenuity Pathway Analysis and MetaCore

October 15, 2015 - 4:45pm by Rolando Garcia-Milian

The Yale Medical Library is providing all Yale affiliates with free access to two of the most powerful commercial bioinformatics tools for the analysis of omics data: MetaCore and Ingenuity Pathway Analysis. This is part of a pilot project conducted by the medical library to find sustainable, long-term access to these tools. Please register for these upcoming trainings if you are interested in learning how to use these tools or if you need a refresher. For questions on how to register for an account, or for comments, please contact Rolando Milian.

Title: Introduction to Ingenuity Pathway Analysis
Description:
What is IPA and what questions can it address?
Overview of key features in IPA
Ingenuity Knowledge Base
Search & Pathway Building - Gene/Chemical, Functions, Drug Targets
    Advanced Search: Limiting results to a molecule type, family or disease-association
    Building pathways: Creating a pathway, pathway navigating, using Build and Overlay tools
    Bioprofiler
Dataset Analysis: Interpretation of Gene, Transcript, Protein and Metabolite Data
    Data Upload and Analysis: Uploading and formatting a dataset, setting analysis parameters and running an analysis
    Pathway Analysis and Canonical Pathways
    Downstream Effects Analysis and identifying downstream functions and processes that are likely affected
    Upstream Regulators Analysis
    Causal Network Analysis and identifying likely root regulators
    Regulator Effects Analysis to link upstream regulators with downstream functions and processes that are affected
    Comparison analysis and comparing multiple observations
Date & Time: 9:00am - 12:00pm, Tuesday, October 27, 2015
Location: H-203, Jane Ellen Hope Building, 315 Cedar St, New Haven CT
Presenter: Field Scientist, QIAGEN Informatics

Title: MetaCore: Getting the most from your "omics" analysis (Introductory session)
Description: The ability to generate massive amounts of data with "omics" analysis creates the need for a tool to analyze and prioritize the biological relevance of this information. GeneGo provides a solution for using "omics" gene lists to generate and prioritize hypotheses with MetaCore. This tutorial highlights how to work with different types of data (genomics, proteomics, metabolomics and interaction data), beginning with how to upload gene lists and expression data (if available). Here we demonstrate data manager capabilities, including how to upload, batch upload, store, share and check data properties and signal distribution. We then focus on how MetaCore uses your gene list to extract functional relevance by determining the most enriched processes across several ontologies. This entails a detailed lesson on how to prioritize your hypothesis using the statistically significant enrichment histograms and the associated highly interactive GeneGo Maps and pre-built networks. We further emphasize the role of expression data in your analysis and the ability to visually predict experimental results, associated diseases and possible drug targets. Lastly, we highlight the benefits of using MetaCore workflows to compare data sets and work with experiment intersections.
Date & Time: 10:00am - 12:00pm, Tuesday, November 3, 2015
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520
Presenter: Dr. Matthew Wampole, Solution Scientist, IP & Science, Thomson Reuters

Title: MetaCore: Getting the most from your "omics" analysis (Advanced)
Description: In the advanced tutorial, we will explore uses of our network building algorithms and methods for hypothesizing key hubs based on data. We will begin this session with a discussion on using the Key Pathway Advisor to hypothesize key hubs regulating gene expression data. The session will then review ways of using the 11 network building algorithms in MetaCore. The first example will review how to build a network purely from the curated knowledge within MetaCore. Then we will go through an example of using omics data to build a network of interactions to better understand the relationships within our data.
Date & Time: 1:00pm - 3:00pm, Tuesday, November 3, 2015
Location: C-103 - SHM 333 Cedar St, New Haven CT 06520
Presenter: Dr. Matthew Wampole, Solution Scientist, IP & Science, Thomson Reuters

Join the End-user Bioinformatics Group and become a member of a community that collaborates on end-user bioinformatics events, training sessions, resources, and tools that support biomedical research at Yale.

NCBI's SmartBLAST

August 21, 2015 - 3:52pm by Rolando Garcia-Milian

The National Center for Biotechnology Information is developing a new type of BLAST called SmartBLAST. It processes the user query and presents the three best matches from the non-redundant protein sequence database along with the two best protein matches from well-studied reference species. In addition, it provides results that match the query from the Conserved Domain Database (CDD).

SmartBLAST accepts only one query at a time, either as a FASTA sequence or as a protein accession number/GI, and uses a combination of BLAST and a multiple sequence alignment to produce its results. It first uses the query to search the non-redundant (nr) protein database. Then, it searches the reference database with BLASTP, followed by a multiple sequence alignment of the six sequences (the query and five subject sequences) using the COBALT multiple sequence alignment program.

Screen capture showing the results of a SmartBLAST for TP53 (GI:187830777). Panel A shows the five matching sequences represented as a phylogenetic tree and a graphical overview. The matches are color-coded: matches from the reference species are green, matches from the non-redundant protein database are blue, and your query is yellow. Panel B represents the results from the multiple alignment.

Join the End-User Bioinformatics Network (EBNET) and become a member of a grassroots community that collaborates on end-user bioinformatics events, training sessions, resources, and tools that support biomedical research at Yale.
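The selection step described for SmartBLAST (three best nr matches plus two best reference-species matches, aligned together with the query) can be illustrated with a small sketch. This is not NCBI's implementation: the `Hit` type, the function names, the accessions, and the scores are all hypothetical, and it assumes each hit carries a single "higher is better" score.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    accession: str  # made-up identifiers, for illustration only
    score: float    # any "higher is better" alignment score


def top_hits(hits, n):
    """Return the n best hits, highest score first."""
    return sorted(hits, key=lambda h: h.score, reverse=True)[:n]


def pick_subjects(nr_hits, reference_hits):
    """Three best nr matches followed by two best reference-species matches."""
    return top_hits(nr_hits, 3) + top_hits(reference_hits, 2)


if __name__ == "__main__":
    nr = [Hit("nr_A", 810.0), Hit("nr_B", 640.0), Hit("nr_C", 905.0), Hit("nr_D", 120.0)]
    ref = [Hit("ref_X", 700.0), Hit("ref_Y", 300.0), Hit("ref_Z", 550.0)]
    subjects = pick_subjects(nr, ref)
    # The query plus these five subjects make up the six sequences to align.
    print([h.accession for h in subjects])
```

The five selected subjects plus the query would then be handed to a multiple sequence alignment program, which is the role COBALT plays in the description above.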

Day of Data 2015 Call for Posters

July 27, 2015 - 12:04pm by Rolando Garcia-Milian

Yale University undergraduates, graduate students, post-doctoral researchers, faculty, and staff are invited to submit posters for the 2015 Yale Day of Data, which will be held on September 18, 2015. Any researcher who uses data for research can submit a poster! The Day of Data is a university-wide event that will feature speakers from a number of disciplines discussing how they use data in their work. The presentations and posters from the 2013 & 2014 Day of Data events are available on the conference site: https://elischolar.library.yale.edu/dayofdata/

We are looking for posters that describe how you collect, store, manage and use data in the course of a research project, but will also accept posters that more generally describe research that depends on data. Data may be of any kind and on any scale, from small datasets collected during field work, to qualitative data, to big data projects using data from telescopes and other methods.

Training Sessions - Summer 2015 at the Cushing/Whitney Medical Library

July 8, 2015 - 2:40pm by Rolando Garcia-Milian

Introduction to Ingenuity Pathway Analysis
Description:
What is IPA and what questions can it address?
Overview of key features in IPA
Ingenuity Knowledge Base
Search & Pathway Building - Gene/Chemical, Functions, Drug Targets
    Advanced Search: Limiting results to a molecule type, family or disease-association
    Building pathways: Creating a pathway, pathway navigating, using Build and Overlay tools
    Bioprofiler
Dataset Analysis: Interpretation of Gene, Transcript, Protein and Metabolite Data
    Data Upload and Analysis: Uploading and formatting a dataset, setting analysis parameters and running an analysis
    Pathway Analysis and Canonical Pathways
    Downstream Effects Analysis and identifying downstream functions and processes that are likely affected
    Upstream Regulators Analysis
    Causal Network Analysis and identifying likely root regulators
    Regulator Effects Analysis to link upstream regulators with downstream functions and processes that are affected
    Comparison analysis and comparing multiple observations
Date & Time: 9:00am - 12:00pm, Wednesday, July 15, 2015
Location: C-103 333 Cedar St, New Haven CT 06520
Campus: Medical School
Presenter: Dr. Kate Wendelsdorf, Applied Advanced Genomics, QIAGEN Informatics

Advanced Ingenuity Pathway Training: Integrated Analysis and Interpretation of Variant and Gene Expression Data from Breast Cancer Subtypes
Methods that jointly interpret genome and transcriptome data from disease case samples may be able to identify disease-specific factors and pathogenicity mechanisms that may not be observable in a single data type. These insights can then be used to create more effective screenings or treatments. Here we show how jointly analyzing tumor-specific genotypes and gene expression can indicate medically important differences among tumor subtypes. Pairing two tools from Ingenuity® Systems, Variant Analysis (for interpreting human genome data) and IPA® (for transcriptome data), we trace differences between breast tumors that spread quickly (Claudin-low) versus slowly (luminal) to sequence variation that likely governs Epithelial-to-Mesenchymal Transition (EMT). Variant Analysis was used to filter genomic variants in RNA-seq data to a shortlist of those plausibly involved in driving tumor spread. IPA is then used to leverage gene expression patterns from the same dataset to identify molecular pathways involved in the metastatic phenotype of Claudin-low breast cancers. The seminar will demonstrate how using a combination of IPA features and QIAGEN tools can provide insight into phenotype-causing pathways for experimental follow-up and hypothesis testing.
Date & Time: 1:00pm - 4:00pm, Wednesday, July 15, 2015
Location: C-103 333 Cedar St, New Haven CT 06520
Campus: Medical School
Presenter: Dr. Wendelsdorf, QIAGEN
Category: Bioinformatics

Managing your References with EndNote
Description: EndNote is a citation-management software application that makes saving citations and then citing them within documents easy. EndNote's pre-formatted style templates, specific to journal instructions, make it easy to insert references into your papers as you write them. In this class you will learn how to easily add citations into your EndNote library, attach PDFs, and insert references into your research papers.
Date & Time: 2:00pm - 3:00pm, Wednesday, July 29, 2015
Location: Medical Library, Room 103 TCC, 333 Cedar St, New Haven CT 06520
Campus: Medical School
Presenter: Denise Hersey
Category: Reference Management Systems

Give your PubMed Skills a Tune Up
Description: PubMed is one of the most comprehensive resources for searching the biomedical literature. Most researchers have used it at one time or another, but it may be time to brush up on your search skills to ensure that you retrieve a relevant set of results. In this class, we will go over PubMed search techniques, including how to quickly limit a search and the role of Medical Subject Headings (MeSH) in creating more effective searches. Participants will also learn time-saving features such as saving searches and how to link out to full text.
Date & Time: 6:00pm - 7:00pm, Wednesday, July 29, 2015
Location: Medical Library, Room 103 TCC, 333 Cedar St, New Haven CT 06520
Campus: Medical School
Presenter: Melissa Funaro
Category: Reference Management Systems

Webinar: Introduction to Cytoscape: network visualization software
Description: Cytoscape, an open source molecular interactions visualization tool, allows the exploration of molecular interactions and biological pathways and integrates these networks with annotations, gene expression profiles, and other data. This webinar will provide an introduction to some of the core functionality of Cytoscape, including the loading and manipulation of experimental data. For example, you will learn how to change visual properties to easily distinguish biologically significant relationships. Many additional features and advanced analyses are available through Cytoscape’s extensive list of apps. Examples of apps are MetScape (allows for visualizing and interpreting metabolomic data), Reactome FI (Reactome Functional Interaction and pathway enrichment tool), and BiNGO (Gene Ontology Tool).
Date & Time: 11:30am - 12:30pm, Tuesday, August 11, 2015
Location: Medical Library, Large Conference Room 101A, 333 Cedar St, New Haven CT 06520
Campus: Medical School
Presenter: Marci Brandenburg, Bioinformationist, Taubman Health Sciences Library, University of Michigan
Category: Bioinformatics

QIAGEN Clinical Insight®: new tool for clinical labs interpreting and reporting on genomic variants

June 9, 2015 - 1:30pm by Rolando Garcia-Milian

QIAGEN has announced the launch of its new tool Clinical Insight® (QCI) for interpreting and reporting on genomic variants resulting from next-generation sequencing. According to the company, the new tool can classify variants, identify treatment options, and perform geographical clinical trial matching. QCI has been evaluated in collaboration with Emory University School of Medicine and Dartmouth-Hitchcock Medical Center, among 50 other groups. Clinical Insight® for Somatic Cancer provides clinical decision support for routine somatic cancer testing laboratories. QIAGEN’s knowledge base contains millions of expert-curated biomedical findings and mutations from the literature and public databases that are used to build up-to-date pathways and networks related to diseases and drugs.

The Cushing/Whitney Medical Library provides access to two concurrent seats of QIAGEN’s Ingenuity Pathway Analysis. For questions on accessing IPA or any other knowledge-based product (MetaCore, BIOBASE, etc.) through the medical library, please contact Rolando Milian.

Join the End-User Bioinformatics Network (EBNET) and become a member of a community that collaborates on end-user bioinformatics events, training sessions, resources, and tools that support biomedical research.

Discovering the Beauty of Science: Call for Entries

April 17, 2015 - 9:51am by Rolando Garcia-Milian

Scientists may not consider themselves artists; however, there are times when science and research experiments lead to incredibly beautiful visual results. We invite Yale biomedical researchers (undergrads, graduate students, postdocs, faculty, associate researchers, etc.) to “Discover the Beauty of Science” by submitting up to two images per individual. Share with us the visual results of your work where science crosses over to art. Your images will be reviewed by an interdisciplinary panel of artists, scientists and members of the medical community and selected for a YSM exhibition.

Contest Deadline
Friday, July 31, 2015, 11:59 pm (deadline extended!)
Winners will be notified Monday, August 31, 2015

Awards
Three 1st Honors awards and one Viewer’s Choice award will be given. The awards consist of a 1 TB USB 3.0 M3 Portable External Hard Drive.
The images will also be posted online, and a print exhibition will be on display in the foyer outside the Medical School Library in Fall 2015.

Eligibility
Yale affiliates, including students, postdocs, faculty, assistants, physicians, etc., working in scientific and biomedical research.

Rules of Submission
1.    Individuals may submit up to 2 images.
2.    There is no contest fee.
3.    The submitter must have been involved in the generation of the images and must obtain permission for their use in this contest from any colleagues who also participated. Collaborators can be credited in the written description.
4.    Images must be submitted electronically using this form.
5.    In awarding prizes, images will be judged on esthetics, originality, and composition.

If you have questions or need help, contact Rolando Garcia Milian or Terry Dagradi.

BIOBASE TRAINING WORKSHOP

April 16, 2015 - 2:18pm by Rolando Garcia-Milian

Sponsored by the Cushing/Whitney Medical Library
Date & Time: 9:00am - 12:30pm, Friday, June 5, 2015
Location: The Anlyan Center Auditorium (N 107), 300 Cedar Street, New Haven, CT 06520
Campus: Medical School
Presenter: Dr. Alex Kaplun, Field Applications Scientist, BIOBASE
Registration: Free and open to Yale affiliates (limited seating)

PROTEOME™’s powerful ontology search query system, with specialized tools for gene set analysis and pathway visualization, allows scientists to quickly find answers to questions relevant to their research. It works seamlessly with TRANSFAC®, an internationally unique knowledgebase containing data on eukaryotic transcription factors and miRNAs, their experimentally proven binding sites, and regulated genes, which supports research into gene regulation. Positional weight matrices are derived from TRANSFAC®'s broad compilation of binding sites and can be used with the included Match tool to search DNA sequences for predicted transcription factor binding sites. TRANSFAC enables you to identify transcription factors affecting gene expression in your microarray and RNA-Seq experiments, as well as predict how they, in combination, can induce observed gene expression patterns.

In the PROTEOME™ section, the attendees will learn to:
1.    Search for individual gene, disease, and drug reports by name.
2.    Browse for sets of genes, diseases, and drugs which share a desired set of characteristics.
3.    Upload a list of genes and identify those characteristics which are statistically over-represented (NEW).
4.    Export annotated characteristics for a gene list.
5.    Visualize protein-protein networks, overlaid with disease and drug assignments.
6.    Annotate custom sequences.
Network visualization using the BKL Pathfinder tool.

In the TRANSFAC section, the attendees will learn to:
1.    Search for individual transcription factors and miRNAs, their experimentally characterized binding sites and regulated genes, and ChIP experiments.
2.    Create positional weight matrices of transcription factor binding sites using a set of aligned experiment-derived sites.
3.    Predict transcription factor binding sites (single sites or combinations) within a promoter or DNA sequence.
4.    Analyze high-throughput data sets for models of transcription factor binding (NEW).
5.    Perform statistical analysis of your differential expression data to determine which transcription factors are responsible for the observed effect (NEW).
6.    Perform step-by-step comprehensive microarray and ChIP-seq data analysis in easy-to-use, guided workflows (NEW).
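For readers unfamiliar with positional weight matrices, the general idea of deriving a matrix from aligned binding sites and scanning a sequence with it can be sketched in a few lines. This is a generic textbook-style illustration, not TRANSFAC's or the Match tool's actual algorithm. The example sites, the pseudocount, and the uniform background frequency are assumptions, and the code expects sequences containing only A, C, G and T.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [s[i].upper() for s in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    sequence = sequence.upper()
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


if __name__ == "__main__":
    # Hypothetical aligned sites, loosely resembling a TATA-like motif.
    sites = ["TATAAA", "TATAAT", "TATATA", "TATAAA"]
    score, start = best_match(build_pwm(sites), "GGCGTATAAAGGC")
    print(start, round(score, 2))
```

Each window of the sequence is scored by summing the log-odds values of its bases, so windows resembling the aligned sites score highest. Real tools add refinements such as score cut-offs and background models estimated from data.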