Rolando Garcia-Milian's blog


21 August 2015 - 3:52pm by Rolando Garcia-Milian

The National Center for Biotechnology Information (NCBI) is developing a new type of BLAST called SmartBLAST. It processes the user's query and presents the three best matches from the non-redundant protein sequence database along with the two best protein matches from well-studied reference species. In addition, it reports matches to the query from the Conserved Domain Database (CDD).
SmartBLAST accepts only one query at a time, either as a FASTA sequence or as a protein accession number/GI, and uses a combination of BLAST and multiple sequence alignment to produce its results. It first uses the query to search the non-redundant (nr) protein database. It then searches the reference database with BLASTP and finally runs a multiple sequence alignment of the six sequences (the query and five subject sequences) with the COBALT multiple sequence alignment program.
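
The selection step described above can be illustrated with a small sketch. This is not NCBI's implementation: the Hit record, the function names, and the accessions are hypothetical, and ranking hits by bit score is an assumption made only for illustration. It shows how three nr hits and two reference-species hits, together with the query, add up to the six sequences passed to the alignment step.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one database match."""
    accession: str    # made-up identifier, not a real accession
    bit_score: float  # assumption: hits are ranked by bit score


def top_hits(hits: List[Hit], n: int) -> List[Hit]:
    """Return the n highest-scoring hits, best first."""
    return sorted(hits, key=lambda h: h.bit_score, reverse=True)[:n]


def alignment_input(query_id: str, nr_hits: List[Hit], ref_hits: List[Hit]) -> List[str]:
    """Collect the query plus 3 nr hits and 2 reference-species hits."""
    subjects = top_hits(nr_hits, 3) + top_hits(ref_hits, 2)
    return [query_id] + [h.accession for h in subjects]


# Made-up example data
nr = [Hit("nr_A", 410.0), Hit("nr_B", 395.5), Hit("nr_C", 820.1), Hit("nr_D", 120.0)]
ref = [Hit("ref_X", 300.2), Hit("ref_Y", 805.0), Hit("ref_Z", 90.4)]
sequences = alignment_input("my_query", nr, ref)
print(len(sequences), sequences)
```

In the real service the six sequences are then aligned with COBALT; the sketch stops at assembling the list.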

Screen capture showing the results of a SmartBLAST search for TP53 (GI:187830777). Panel A shows the five matching sequences represented as a phylogenetic tree and a graphical overview. The matches are color-coded: matches from the reference species are green, matches from the non-redundant protein database are blue, and your query is yellow. Panel B shows the results of the multiple sequence alignment.

If you would like to see how SmartBLAST works, please register here for this 15-minute webinar: “Introducing SmartBLAST a Rapid Protein Identification Tool”, September 2, 2015, from 12:00 to 12:15 PM at the Medical Library.

Join the End-User Bioinformatics Network (EBNET) and become a member of a grassroots community that collaborates on end-user bioinformatics events, training sessions, resources, and tools that support biomedical research at Yale.

Day of Data 2015 Call for Posters

27 July 2015 - 12:04pm by Rolando Garcia-Milian

Yale University undergraduates, graduate students, post-doctoral researchers, faculty, and staff are invited to submit posters for the 2015 Yale Day of Data, which will be held on September 18, 2015.  Any researcher who uses data for research can submit a poster!

The Day of Data is a university-wide event that will feature speakers from a number of disciplines discussing how they use data in their work. The presentations and posters from the 2013 and 2014 Day of Data events are available on the conference site.

We are looking for posters that describe how you collect, store, manage and use data in the course of a research project, but will also accept posters that more generally describe research that depends on data. Data may be of any kind and on any scale -- from small datasets collected during field work, to qualitative data, to big data projects using data from telescopes and other methods.

Please upload an abstract of the poster or the finished poster in digital format, along with a short description, to the conference platform by August 28, 2015.

Successful submitters will be informed by September 4, 2015, and will be responsible for printing their own posters (see the poster guidelines from ITS) and uploading the PDF version of their poster to the conference website. If you have any questions, please contact Carla Heister.

Training Sessions - Summer 2015 at the Cushing/Whitney Medical Library

8 July 2015 - 2:40pm by Rolando Garcia-Milian

Introduction to Ingenuity Pathway Analysis

Description:     What is IPA and what questions can it address?

  • Overview of key features in IPA
  • Ingenuity Knowledge Base
  • Search & Pathway Building: Gene/Chemical, Functions, Drug Targets
  • Advanced Search: limiting results to a molecule type, family, or disease association
  • Building pathways: creating a pathway, navigating pathways, using the Build and Overlay tools
  • Bioprofiler
  • Dataset Analysis: Interpretation of Gene, Transcript, Protein and Metabolite Data
  • Data Upload and Analysis:  Uploading and formatting a dataset, setting analysis parameters and running an analysis
  • Pathway Analysis and Canonical Pathways
  • Downstream Effects Analysis and identifying downstream functions and processes that are likely affected
  • Upstream Regulator Analysis
  • Causal Network Analysis and identifying likely root regulators
  • Regulator Effects Analysis to link upstream regulators with downstream functions and processes that are affected
  • Comparison analysis and comparing multiple observations

Date & Time:  9:00am - 12:00pm, Wednesday, July 15, 2015

Location:         C-103 333 Cedar St, New Haven CT 06520

Campus:          Medical School

Presenter:       Dr. Kate Wendelsdorf, Applied Advanced Genomics, QIAGEN Informatics

Registration required


Advanced Ingenuity Pathway Training: Integrated Analysis and Interpretation of Variant and Gene Expression Data from Breast Cancer Subtypes

Methods that jointly interpret genome and transcriptome data from disease case samples may be able to identify disease-specific factors and pathogenicity mechanisms that may not be observable in a single data type. These insights can then be used to create more effective screenings or treatments.

Here we show how jointly analyzing tumor-specific genotypes and gene expression can indicate medically important differences among tumor subtypes. Pairing two tools from Ingenuity® Systems – Variant Analysis  (for interpreting human genome data) and IPA® (for transcriptome data) – we trace differences between breast tumors that spread quickly (Claudin-low) versus slowly (luminal) to sequence variation that likely governs Epithelial-to-Mesenchymal Transition (EMT).

Variant Analysis is used to filter genomic variants in RNA-seq data to a shortlist of those plausibly involved in driving tumor spread. IPA is then used to leverage gene expression patterns from the same dataset to identify molecular pathways involved in the metastatic phenotype of Claudin-low breast cancers. The seminar will demonstrate how using a combination of IPA features and QIAGEN tools can provide insight into phenotype-causing pathways for experimental follow-up and hypothesis testing.

Date & Time:  1:00pm - 4:00pm, Wednesday, July 15, 2015

Location:         C-103 333 Cedar St, New Haven CT 06520

Campus:          Medical School

Presenter:       Dr. Kate Wendelsdorf, QIAGEN

Category:        Bioinformatics

Registration required  


Managing your References with EndNote

Description:     EndNote is a citation-management software application that makes saving citations and then citing them within documents easy. EndNote's pre-formatted style templates, specific to journal instructions, make it easy to insert references into your papers as you write them. In this class you will learn how to easily add citations into your EndNote library, attach PDFs, and insert references into your research papers.

Date & Time:  2:00pm - 3:00pm, Wednesday, July 29, 2015

Location:         Medical Library, Room 103 TCC, 333 Cedar St, New Haven CT 06520

Campus:          Medical School

Presenter:       Denise Hersey

Category:        Reference Management Systems

Registration required   


Give your PubMed Skills a Tune Up

Description:     PubMed is one of the most comprehensive resources for searching the biomedical literature. Most researchers have used it at one time or another, but it may be time to brush up on your search skills to ensure that you retrieve a relevant set of results. In this class, we will go over PubMed search techniques, including how to quickly limit a search and the role of Medical Subject Headings (MeSH) in creating more effective searches. Participants will also learn time-saving features such as saving searches and linking out to full text.

Date & Time:  6:00pm - 7:00pm, Wednesday, July 29, 2015

Location:         Medical Library, Room 103 TCC, 333 Cedar St, New Haven CT 06520

Campus:          Medical School

Presenter:       Melissa Funaro

Category:        Reference Management Systems

Registration required    


Webinar: Introduction to Cytoscape: network visualization software

Description:     Cytoscape, an open source molecular interactions visualization tool, allows the exploration of molecular interactions and biological pathways and integrates these networks with annotations, gene expression profiles, and other data.  This webinar will provide an introduction to some of the core functionality of Cytoscape, including the loading and manipulation of experimental data.  For example, you will learn how to change visual properties to easily distinguish biologically significant relationships.  Many additional features and advanced analyses are available through Cytoscape’s extensive list of apps. Examples of apps are MetScape (allows for visualizing and interpreting metabolomic data), Reactome FI (Reactome Functional Interaction and pathway enrichment tool), and BiNGO (Gene Ontology Tool).

Date & Time:  11:30am - 12:30pm, Tuesday, August 11, 2015

Location:         Medical Library, Large Conference Room 101A, 333 Cedar St, New Haven CT 06520

Campus:          Medical School

Presenter:       Marci Brandenburg, Bioinformationist, Taubman Health Sciences Library, University of Michigan

Category:        Bioinformatics

Registration required

QIAGEN Clinical Insight®: new tool for clinical labs interpreting and reporting on genomic variants

9 June 2015 - 1:30pm by Rolando Garcia-Milian

QIAGEN has announced the launch of its new tool, Clinical Insight® (QCI), for interpreting and reporting on genomic variants resulting from next-generation sequencing. According to the company, the new tool can classify variants, identify treatment options, and perform geographical clinical trial matching. QCI has been evaluated in collaboration with Emory University School of Medicine and Dartmouth-Hitchcock Medical Center, among 50 other groups. Clinical Insight® for Somatic Cancer provides clinical decision support for routine somatic cancer testing laboratories. QIAGEN’s knowledge base contains millions of expert-curated biomedical findings and mutations from the literature and public databases that are used to build up-to-date pathways and networks related to diseases and drugs. The Cushing/Whitney Medical Library provides access to two concurrent seats of QIAGEN’s Ingenuity Pathway Analysis.

For questions on accessing IPA or any other knowledge-based product (MetaCore, BIOBASE, etc.) through the Medical Library, please contact Rolando Milian.

Discovering the Beauty of Science: Call for Entries

17 April 2015 - 9:51am by Rolando Garcia-Milian

Scientists may not consider themselves artists; however, there are times when science and research experiments lead to incredibly beautiful visual results. We invite Yale biomedical researchers (undergrads, graduate students, postdocs, faculty, associate researchers, etc.) to “Discover the Beauty of Science” by submitting up to two images per individual. Share with us the visual results of your work where science crosses over to art. Your images will be reviewed by an interdisciplinary panel of artists, scientists, and members of the medical community and selected for a YSM exhibition.

Contest Deadline
Friday, July 31, 2015, 11:59 pm (deadline extended!)
Winners will be notified
Monday, August 31, 2015

Awards will be given to three First Honors winners and one Viewer’s Choice winner and will consist of a 1 TB USB 3.0 M3 Portable External Hard Drive.
The images will also be posted online, and a print exhibition will be on display in the foyer outside the Medical School Library in Fall 2015.

The contest is open to Yale affiliates, including students, postdocs, faculty, assistants, physicians, etc., working in scientific and biomedical research.

Rules of Submission
1.    Individuals may submit up to 2 images.
2.    There is no contest fee.
3.    The submitter must have been involved in the generation of the images and must obtain permission for their use in this contest from any colleagues who also participated. Collaborators can be acknowledged in the written description.
4.    Images must be submitted electronically using this form.
5.    In awarding prizes, images will be judged on esthetics, originality, and composition.

If you have questions or need help, contact Rolando Garcia-Milian or Terry Dagradi.


16 April 2015 - 2:18pm by Rolando Garcia-Milian

Sponsored by the Cushing/Whitney Medical Library

Date & Time:    9:00am - 12:30pm, Friday, June 5, 2015
Location:    The Anlyan Center Auditorium (N 107), 300 Cedar Street, New Haven, CT 06520
Campus:    Medical School
Presenter:    Dr. Alex Kaplun, Field Applications Scientist, BIOBASE
Registration:    Free and open to Yale affiliates; limited seating. REGISTER HERE

PROTEOME™’s powerful ontology search query system, with specialized tools for gene set analysis and pathway visualization, allows scientists to quickly find answers to questions relevant to their research. It works seamlessly with TRANSFAC®, an internationally unique knowledgebase containing data on eukaryotic transcription factors and miRNAs, their experimentally-proven binding sites, and regulated genes, which supports research into gene regulation. Based on TRANSFAC®'s broad compilation of binding sites, positional weight matrices are derived which can be used with the included Match tool to search DNA sequences for predicted transcription factor binding sites. TRANSFAC enables you to identify transcription factors affecting gene expression in your microarray and RNA-Seq experiments, as well as predict how they, in combination, can induce observed gene expression patterns.

In the PROTEOME™ section, the attendees will learn to:
1.    Search for individual gene, disease, and drug reports by name.
2.    Browse for sets of genes, diseases, and drugs which share a desired set of characteristics.
3.    Upload a list of genes and identify those characteristics which are statistically over-represented (NEW)
4.    Export annotated characteristics for a gene list.
5.    Visualize protein-protein networks, overlaid with disease and drug assignments
6.    Annotate custom sequences.

Network visualization using the BKL Pathfinder tool.

In the TRANSFAC section, the attendees will learn to:
1.    Search for individual transcription factors and miRNAs, their experimentally-characterized binding sites and regulated genes, and ChIP experiments.
2.    Create positional weight matrices of transcription factor binding sites using a set of aligned experiment-derived sites.
3.    Predict transcription factor binding sites (single sites or combinations) within a promoter or DNA sequence.
4.    Analyze high-throughput data sets for models of transcription factor binding (NEW).
5.    Perform statistical analysis of your differential expression data to determine which transcription factors are responsible for the observed effect (NEW).
6.    Perform step-by-step comprehensive microarray and ChIP-seq data analysis in easy-to-use, guided workflows (NEW).
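
The idea behind items 2 and 3 above, deriving a positional weight matrix from aligned binding sites and using it to scan a DNA sequence, can be sketched in a few lines. This is a generic log-odds matrix, not the scoring scheme of TRANSFAC's Match tool; the function names, the pseudocount, the uniform background, the threshold, and the example sites are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix from equal-length, aligned, uppercase sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        # Log2 ratio of the smoothed base frequency to a uniform background
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous characters such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Made-up aligned sites and sequence, for illustration only
example_sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
example_pwm = build_pwm(example_sites)
example_hits = scan("GGCTATAAAGGC", example_pwm, threshold=5.0)
for start, window, score in example_hits:
    print(start, window, round(score, 2))
```

Raising the threshold makes the scan stricter; production tools also handle both strands and calibrated cut-offs, which this sketch omits.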

National Center for Biotechnology Information workshops broadcast from the University of Michigan Medical Center

20 March 2015 - 10:53am by Rolando Garcia-Milian

The Yale Medical Library will be hosting a National Center for Biotechnology Information workshop series (broadcast from the University of Michigan Medical Center). Please register (next to each workshop title), since seating is limited.

Navigating NCBI Molecular Data through the Integrated Entrez System and BLAST (May 5, 9:00am - 11:30am EDT) REGISTER HERE

Gene Expression Resources at the NCBI (May 5, 1:00pm - 3:30pm EDT) REGISTER HERE

Human Genes, Variation, and Medical Genetics Resources (May 6, 9:00am - 11:30am EDT) REGISTER HERE

NCBI Genomes, Assemblies and Annotation Products: Microbiome to Human (May 6, 1:00pm - 3:30pm EDT) REGISTER HERE

The workshop series consists of four 2.5-hour hands-on sessions, each emphasizing a different set of NCBI resources. Each session uses specific examples to highlight important features of the resources and tools under study and to demonstrate how to accomplish common tasks. Among other things, attendees will learn:

  • The content of the sequence databases, used as exemplar Entrez molecular databases.
  • The importance of derivative data such as NCBI Reference Sequences (RefSeqs) and sequence-related Entrez information hubs such as Taxonomy, HomoloGene and Gene.
  • How to use the Entrez interface to collect and download a specific set of records, to narrow a search, and to use the pre-computed relationships available in the Entrez system to find related sequences, genomic regions, genomic maps, homologous genes and proteins, pathways and expression information.
  • The practical aspects of working with NCBI BLAST, the most popular sequence similarity service in the world.
  • How to use the features of the updated BLAST service, including direct access from the Entrez sequence databases.
  • How to use the integrated databases to find phenotypes, literature, sequences (genome, mRNA and protein), and variations.
  • How to map variations onto genes, transcripts, proteins, and genomic regions.
  • How to use additional tools and viewers associated with Entrez, including the Graphical Sequence Viewer, the Variation Viewer, Gene View in dbSNP, and the 1000 Genomes Browser.

NCBI's Entrez as a discovery system. Image courtesy of Dr. Peter Cooper, NCBI.

New Biosketch Format Required for NIH Applications Submitted on or After May 25, 2015

23 January 2015 - 4:24pm by Rolando Garcia-Milian

In a notice issued on December 5, 2014, the National Institutes of Health (NIH) and the Agency for Healthcare Research and Quality (AHRQ) announced the requirement of a new biosketch format for grant applications submitted for due dates on or after May 25, 2015.

The new format extends the page limit for the biosketch to five pages. It allows researchers to describe up to five of their most significant contributions to science. Each description can be supported by a list of up to four peer-reviewed publications or other research products, including A/V products, patents, databases, educational materials, instruments or equipment, models, protocols, etc. that are relevant to the described contribution.

Image courtesy of Dr. Trawick, National Library of Medicine, NIH

Although it is not required at this point, the NIH suggests using the Science Experts Network Curriculum Vitae (SciENcv), a MyNCBI online tool that serves as an interagency system designed to create biosketches for multiple federal agencies. This, along with the use of My Bibliography for grant activity reporting and NIH Public Access Policy compliance, increases the importance of MyNCBI as a tool for managing NIH-sponsored research.

In response, the Cushing/Whitney Medical Library will offer the workshop “My Bibliography and SciENcv:  grant reporting, compliance and biosketch through MyNCBI” to introduce researchers, research assistants and administrators to the effective use of these online tools.

Implementation of the Genomic Data Sharing Policy Begins January 25, 2015

9 December 2014 - 12:00pm by Rolando Garcia-Milian

Genomic data sharing repositories

The NIH Genomic Data Sharing Policy becomes effective with NIH grant applications submitted for the January 25, 2015, due date and thereafter. 

Investigators preparing grant applications for those due dates should prepare now if the proposed work involves the generation or use of large-scale genomic data (see the Supplemental Information to the NIH Genomic Data Sharing Policy).

Applicants preparing such grant applications are expected to:

  • state in the cover letter that the studies proposed will generate large-scale human and/or non-human genomic data
  • include a genomic data sharing plan in the application.
  • if sharing of human data is not possible, provide a justification explaining why they cannot share these data and provide an alternative data sharing plan.

Applicants who plan to use controlled-access human genomic data from NIH-designated data repositories as a secondary user to achieve the specific aims in the application should:

  • briefly address their plans for requesting access to the data
  • state, in the Research Plan of the application, their intention to abide by the NIH Genomic Data User Code of Conduct.

Applicants preparing applications that involve research funded prior to the Policy's effective date should:

  • make every effort to include a genomic data sharing plan in the application that outlines plans to comply with the expectations outlined in the Policy
  • if the studies involve human participants, were initiated before the Policy's effective date, and used consents that do not meet the expectations of the GDS Policy, plan to transition, if possible, to a consent for future research uses and broad sharing.

Additional questions:
Genomic Data Sharing Policy Team
NIH Office of Science Policy
Telephone: 301-496-9838

