COVID‑19 datasets

COVID-19 datasets are public databases for sharing case data and medical information related to the COVID-19 pandemic.


Aggregate statistics


United States


Volunteer/non-government


U.S. Department of Health & Human Services


Global

* Johns Hopkins Coronavirus Resource Center: global aggregated data including cases, testing, contact tracing, and vaccine development.
* World Health Organization (WHO) Coronavirus Disease Dashboard: a database of confirmed cases and deaths reported globally and broken down by region. This database is part of the WHO Health Data Platform.
* COVID-19 Africa Open Data Project: a volunteer-run database and dashboard reporting region, country and district level case counts, deaths, healthcare worker infections, healthcare services and urgent needs.


Data hubs


* Health Data Research UK provides a searchable registry of health data resources from the United Kingdom, including COVID-19 related datasets.
* NIH Open Access Datasets: The National Institutes of Health provides open-access data and computational resources related to COVID-19.
* COVID-19 Open Research Dataset (CORD-19): The Semantic Scholar project of the Allen Institute for AI hosts CORD-19, a public dataset of academic articles about COVID-19 and related research. The dataset is updated daily and includes both peer-reviewed articles and preprints. CORD-19 was originally released on March 16, 2020, by researchers and leaders from the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft, and the National Library of Medicine. The dataset is created by text mining the current research literature.


Topic-specific and special-interest resources


Genomics

* Consensus genome data for SARS-CoV-2 is available through GISAID for registered users and is included in an interactive phylogenetic tree dashboard on Nextstrain, an open-source pathogen genome data project.


Imaging (Radiology)

* Characteristic imaging features on chest radiographs and computed tomography (CT) of people who are symptomatic include asymmetric peripheral ground-glass opacities without pleural effusions. The University of Montreal and Mila created the "COVID-19 Image Data Collection" in March, a public data repository of chest imaging. The Medical Imaging Databank in Valencian Region released a large dataset of chest imaging from Spain. The Italian Radiological Society is compiling an international online database of imaging findings for confirmed cases. Online radiology case-sharing platforms such as Eurorad and Radiopaedia are also used to share COVID-19 case data and imaging.

