NCBI Insights
New Upcoming NCBI Virtual Workshops!
Apply to attend October 2022 interactive, hands-on workshops
Want to learn more about NCBI resources and how to implement our cutting-edge tools in your research? NCBI offers a variety of educational opportunities, including workshops, webinars, codeathons, tutorials, and more!
We are excited to announce our upcoming virtual workshop series for October 2022. Our interactive, hands-on workshops are taught by experienced NCBI Education Faculty. Applications are open to the public; however, each workshop will accept a limited number of participants to facilitate the best possible educational experience. Continue reading
SEPTEMBER 29, 2022
NCBI Workshop at the ASM NGS 2022 Meeting
NCBI Microbial Pathogen and SARS-CoV-2 Resources in the Cloud
Get hands-on experience with NCBI Pathogen Detection and SARS-CoV-2 Surveillance data in the cloud. No prior cloud experience necessary!
NCBI staff are presenting a workshop at the American Society for Microbiology Next-Generation Sequencing (ASM NGS) 2022 Meeting on Sunday, October 16, 2022, from 10 am to 3 pm ET (with a 1-hour break) to help conference attendees learn about two NCBI cloud-hosted resources: Pathogen Detection and SARS-CoV-2 Genome Sequence datasets. Continue reading
SEPTEMBER 28, 2022
Stephen Sherry, PhD, is the new NCBI Director and NLM Associate Director for Scientific Data Resources
We are excited that our own Stephen Sherry, PhD, is now the new NCBI Director at the National Library of Medicine (NLM), and the NLM Associate Director for Scientific Data Resources. In these roles, Dr. Sherry will oversee the development and deployment of advanced computational solutions to meet life and health science information needs and facilitate open science and scholarship through a growing array of data, literature, and other information offerings and services from NLM.
Dr. Sherry brings a history of innovation and leadership to the NCBI Director position. Most recently, he served as Acting Director of NCBI, bringing a vision of customer engagement, and modular, interoperable, and cloud-based approaches to the technical platforms for NLM offerings and services. He is also recognized for his inventiveness in leveraging research for public health emergency response. Dr. Sherry has been central in making key innovations at NLM including the modernization effort and development of the NIH Comparative Genomics Resource, ensuring public input and technical innovation in the process. Dr. Sherry positioned NCBI as a strong collaborative force across the NIH and in supporting major NLM projects including the MEDLINE 2022 initiative, which resulted in 100% automated indexing of the biomedical literature available through NLM’s PubMed and PubMed Central (PMC).
“Dr. Sherry has the skills, knowledge, and insight to deliver creative, forward-thinking scientific and operational leadership for NLM and the communities we serve,” said NLM Director Patricia Flatley Brennan, RN, PhD. “His vast experience, expertise, and vision for NCBI is a great fit for NLM’s eye to the future and its commitment to drive innovation.”
Throughout his tenure at NCBI, Dr. Sherry has participated in many NIH efforts to characterize human genetic diversity and has served on numerous working groups across NIH to address a range of data science issues including the development of the genomic data sharing policy, privacy analysis for risk-sensitive data sets, and advances in scientific publications.
Dr. Sherry earned his PhD in Anthropology at the Pennsylvania State University in 1996 and completed a postdoctoral fellowship at the Louisiana State University Medical Center prior to joining NLM in 1998.
SEPTEMBER 27, 2022
Coming soon! Changes to NCBI Datasets command-line tool in version 14 (CLIv14.0.0)
In October 2022, NCBI Datasets will release version 14 of our datasets and dataformat command-line tools. This release will contain breaking changes to the command syntax, content of the data packages and data reports. Thank you for your feedback that inspired these new features. We hope they will improve your experience!
We will continue to support CLI v13.x, although new features and improvements will be exclusive to CLI v14.0.0 and later releases.
NCBI Datasets supports the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. Join our mailing list to keep up to date with NCBI Datasets and other CGR news.
More details
How is version 14 of the Datasets command-line tools (CLI v14.x) different from CLI v13.x and previous versions? Continue reading
SEPTEMBER 27, 2022
Conserved Domain Database version 3.20 is available!
A new version of the Conserved Domain Database (CDD) is now available. Version 3.20 contains 1,614 new or updated NCBI/CDD-curated domains and now mirrors Pfam version 34 as well as new models from the NCBIfam collection. Fine-grained classifications of the [(+)ssRNA] virus RNA-dependent RNA polymerase catalytic domain, RING-finger/U-box, dimerization/docking domains of the cAMP-dependent protein kinase regulatory subunit, and Galactose/rhamnose-binding lectin domain superfamily have been added, along with many other new models.
We have significantly increased the fraction of CD-Search and interactive BATCH CD-Search queries that yield results showing conserved domain architecture information and attributes that further characterize protein function through links to information-rich resources such as Enzyme Commission (EC) numbers, Gene Ontology (GO) terms, PubMed IDs, and identifiers from the CaZY, TCDB, and MEROPS databases. See our earlier post for additional details. You can access CDD and find updated content on the CDD FTP site at CDD version 3.20.
Database statistics for CDD version 3.20:
64,234 Total models from all Source Databases
Organized into 4,541 multi-model Superfamilies
18,882 NCBI CDD curation effort
1,009 SMART v6.0
19,178 PFAM v34
4,871 COGs v1.0
10,140 NCBI Protein Clusters
4,488 TIGRFAM v15
59,693 Total models form the default CD-Search database
CD Search is part of the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms.
Join our mailing list to keep up to date with CD Search and other CGR news.
SEPTEMBER 26, 2022
RefSeq release 214 is available!
RefSeq release 214 is now available online, from the FTP site, and through NCBI’s Entrez programming utilities, E-utilities.
This full release incorporates genomic, transcript, and protein data available as of September 12, 2022, and contains 328,588,569 records, including 239,609,016 proteins, 47,387,931 RNAs, and sequences from 123,394 organisms. The release is provided in several directories as a complete dataset and also as divided by logical groupings.
Foreign contamination screening
Introducing the new Foreign Contamination Screen (FCS) tool! If you produce assembled genomes, check out FCS, a tool you can run yourself to improve your genome assemblies and facilitate high-quality data submissions to GenBank. FCS is part of the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. See our previous blog post to learn how FCS enhances contaminant detection sensitivity. Continue reading
SEPTEMBER 21, 2022
Join NCBI virtually at the Biodiversity Genomics 2022 conference
Learn about the NIH Comparative Genomics Resource (CGR) Project
The Biodiversity Genomics conference will take place virtually, October 2-7, 2022. This event is hosted by the Earth BioGenome Project and is open and free for all to attend.
NCBI staff will present a variety of recorded talks and posters highlighting various elements of the NIH Comparative Genomics Resource (CGR), including NCBI Datasets and the Comparative Genome Viewer (CGV). CGR is a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources to biomedical research. NCBI is charged with leading CGR development and engaging genomics communities. The CGR project will facilitate reliable comparative genomics analyses for all eukaryotic organisms in collaboration with the genomics community.
Check out NCBI’s schedule of activities to learn more about CGR: Continue reading
SEPTEMBER 19, 2022
NCBI hidden Markov models (HMM) release 10.0 now available!
Release 10.0 of the NCBI hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.
The 10.0 release contains 15,360 models maintained by NCBI, including 228 that are new since 9.0, 99 that were modified significantly, and 205 that were assigned better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications. You can search and view the details for these in the Protein Family Model collection, which also includes conserved domain architectures and BlastRules, and find all RefSeq proteins they name.
GO terms associated with HMMs are now propagated to CDSs and proteins annotated with PGAP. In case you missed it, see our previous blog post on this topic.
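As a rough illustration of searching an HMM collection against a set of proteins with HMMER, the sketch below wraps the `hmmsearch` program from Python. It assumes HMMER is installed and on your PATH; the file names `models.hmm`, `proteins.faa`, and `hits.tbl` are hypothetical placeholders, not names used by the NCBI release.

```python
import shutil
import subprocess


def build_hmmsearch_command(hmm_file, protein_fasta, table_out):
    """Build an hmmsearch command line.

    hmmsearch takes the HMM file first and the sequence file second;
    --tblout writes a per-target tabular summary of the hits.
    """
    return [
        "hmmsearch",
        "--tblout", str(table_out),
        str(hmm_file),
        str(protein_fasta),
    ]


def run_hmmsearch(hmm_file, protein_fasta, table_out):
    """Run hmmsearch if it is installed; raise a clear error otherwise."""
    if shutil.which("hmmsearch") is None:
        raise RuntimeError("hmmsearch not found; install HMMER first")
    cmd = build_hmmsearch_command(hmm_file, protein_fasta, table_out)
    # The full text report goes to stdout; discard it and keep the table.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    print(" ".join(build_hmmsearch_command("models.hmm", "proteins.faa", "hits.tbl")))
```

Separating command construction from execution makes it easy to inspect the command line before running it on a large protein set.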
SEPTEMBER 15, 2022
Coming soon: Updated PubMed E-utilities!
PubMed will be moving to an updated version of the E-utilities API on November 15, 2022. As previously announced, this updated version of E-utilities will use the same technology as the web version of PubMed released in 2020. So, search results returned by the updated ESearch E-utility will now match those of the website.
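For readers unfamiliar with ESearch, here is a minimal sketch of how a PubMed ESearch request URL might be assembled. The helper name `build_pubmed_esearch_url` and the example search term are illustrative assumptions; the sketch only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public E-utilities endpoint for ESearch.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_esearch_url(term, retmax=20):
    """Return an ESearch URL that queries PubMed (db=pubmed)."""
    params = {
        "db": "pubmed",     # the database affected by the update
        "term": term,       # the search expression, URL-encoded below
        "retmax": retmax,   # maximum number of IDs to return
        "retmode": "json",  # ask for JSON instead of the default XML
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Example term chosen for illustration only.
    print(build_pubmed_esearch_url("comparative genomics"))
```

The resulting URL can be fetched with any HTTP client; a script that compares the returned IDs with a search on the PubMed website is one way to check the behavior described above.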
This update only affects E-utility calls with &db=pubmed. There are no changes to the E-utilities for other databases. You can refer to our previous post or watch our recorded webinar for more details on this update. Continue reading
SEPTEMBER 13, 2022
New ClinVar graphical display
Maps clinically significant variants by gene and position!
ClinVar is a freely accessible, public archive of reports of the relationships between human variations and phenotypes, with supporting evidence at NLM/NCBI. To help you access your variants of interest quickly, ClinVar is offering an experimental release of an all-new visualization tool in the search results. This graphical display provides an overview of variants when you search by gene or genomic region (Figures 1 and 2).
Currently the graphical display is implemented as an experiment and will appear for only 10 percent of searches by gene or genomic region, but the links in this post will show the display so you can try it out. Alternatively, if you would like to bring up the graphical display for your gene or genomic region search, you can edit the URL in the address bar to change the default gr=0 to gr=1. For example, the following URL will show the graphical display:[gene]
Note that you can only get the graphical display with gene or genomic region searches. For other types of searches, you will see the table only.
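The URL edit described above (changing gr=0 to gr=1) can also be done programmatically. The sketch below is a generic query-string rewrite; the function name `enable_graphical_display` and the `example.org` address are hypothetical stand-ins, not the actual ClinVar search URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def enable_graphical_display(url):
    """Return the URL with its gr query parameter set to 1.

    Any existing gr value (such as the default gr=0) is dropped first,
    so the result carries exactly one gr parameter.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "gr"
    ]
    query.append(("gr", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical search URL used only to demonstrate the rewrite.
    print(enable_graphical_display("https://example.org/search?term=DSG2%5Bgene%5D&gr=0"))
```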
Gene search display
The display for a gene search highlights small variants within the gene. Large structural variants are also marked as a single dot in the middle of the variation. The interactive display shows the placement of variants on the gene and their clinical significance and allows you to zoom in or pan right / left and limit results to variants in a chosen gene. Figure 1 shows the graphical display as it appears at the top of the search results for the desmoglein 2 (DSG2) gene and how to filter and navigate to variants of interest (Search ClinVar: DSG2[gene]).
Figure 1 (A-D). Graphical display views in ClinVar for variants in DSG2, a gene with many known pathogenic variants
A. Graphical view showing all variants for the DSG2 gene. Results default to the GRCh37 assembly. You can change to the GRCh38 assembly by clicking the arrow at the upper left (circled in red).
B. You can zoom in by mousing over the 8th exon in the gene diagram, which activates a pop-up menu that allows you to re-display only this region by following the link (red box).
C. Refreshed result for the 8th exon of DSG2 showing a number of variants including pathogenic, benign, and ones with conflicting interpretations of pathogenicity. You can select the filters on the left-hand side of the ClinVar result to limit to variants with characteristics of interest, for example Conflicting Interpretations of pathogenicity.
D. Variants in exon 8 of DSG2 filtered for conflicting interpretations of pathogenicity. You can retrieve individual variants by mousing over the graphic to activate the pop-up menu and following the link (red box).
Continue reading
AUGUST 30, 2022