COVID-19 datasets

COVID-19 datasets are public databases for sharing case data and medical information related to the COVID-19 pandemic.


Aggregate statistics


United States


Volunteer/non-government


U.S. Department of Health & Human Services


Global

* Johns Hopkins Coronavirus Resource Center: global aggregated data including cases, testing, contact tracing, and vaccine development.
* World Health Organization (WHO) Coronavirus Disease Dashboard: a database of confirmed cases and deaths reported globally and broken down by region. This database is part of the WHO Health Data Platform.
* COVID-19 Africa Open Data Project: a volunteer-run database and dashboard reporting region-, country-, and district-level case counts, deaths, healthcare worker infections, healthcare services, and urgent needs.


Data hubs


* Health Data Research UK provides a searchable registry of health data resources from the United Kingdom, including COVID-19 related datasets.
* NIH Open Access Datasets: The National Institutes of Health provides open-access data and computational resources related to COVID-19.
* COVID-19 Open Research Dataset (CORD-19): The Semantic Scholar project of the Allen Institute for AI hosts CORD-19, a public dataset of academic articles about COVID-19 and related research. The dataset is updated daily and includes both peer-reviewed articles and preprints. CORD-19 was originally released on March 16, 2020, by researchers and leaders from the Allen Institute for AI, Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft, and the National Library of Medicine. The dataset is created through text mining of the current research literature.


Topic-specific and special-interest resources


Genomics

* Open access gene sequencing data for SARS-CoV-2 is provided by GISAID and included in an interactive phylogenetic tree dashboard on Nextstrain, an open-source pathogen genome data project.


Imaging (Radiology)

* Characteristic imaging features on chest radiographs and computed tomography (CT) of people who are symptomatic include asymmetric peripheral ground-glass opacities without pleural effusions.
* The University of Montreal and Mila created the "COVID-19 Image Data Collection" in March 2020, a public data repository of chest imaging.
* The Medical Imaging Databank in the Valencian Region released a large dataset of chest imaging from Spain.
* The Italian Radiological Society is compiling an international online database of imaging findings for confirmed cases.
* Online radiology case sharing platforms such as Eurorad and Radiopaedia serve as platforms for sharing COVID-19 case data and imaging.

