To unlock all published scientific knowledge on COVID-19, Causaly AI has performed a network analysis of the 40,000 COVID-19 full text papers recently made public as part of CORD-19, in addition to 30 million existing biomedical publications.
The following list of the top 250 compounds, identified as having the highest promise for further research as treatments for COVID-19, is available for immediate download to aid researchers and to initiate further research.
Download CoV Network Analysis - Top 250:
The full network analysis yielded approximately 6000 compounds with relevant cause-and-effect relationships. It is ready to explore live on Causaly’s knowledge graph, available to Causaly users and non-commercial researchers by requesting access.
(Open access ended in September 2020 and is no longer available. Please reach out to us for more information.)
About This Data
Data highlights: Top 30 COVID-19 Candidate Predictions
Of the 250 substances released today, here are the top 30 as sorted by highest relevance based on our network analysis. This list includes drugs as well as dietary supplements.
The Causaly team has manually annotated 135 of the 250 substances, just over half, with further information.
Of those 135 substances our team investigated:
- 8 predictions had no evidence for CoVs, either as a direct relationship in the literature machine-read by Causaly or in a manual search for additional evidence. They appear to have been uniquely identified by our AI.
- 22 predictions have been published as in silico hypotheses
- 48 have preclinical evidence
- 57 have clinical evidence
The remaining substances have not been manually checked, and are provided for further investigation.
Our approach had two parts:
- A network analysis to make drug candidate predictions based on the data sources listed below
- Manual annotation and quality check of predictions based on additional sources e.g. any literature available on the web.
The Coronavirus Network Analysis
What are the data sources?
The document corpus used for the network analysis was:
- Approx 30 million PubMed abstracts and 2 million full-text papers
- Approx 40K full-text papers from the CORD-19 corpus
- Clinical trial information from ClinicalTrials.gov
Using this corpus, Causaly AI extracted approximately 180 million interactions between Chemicals, Diseases, Viruses, and Pathways. All interactions in Causaly are directional and connect two Concepts via an Upregulate, Downregulate, Unidirectional, or Bidirectional relationship. More about our technology here.
How were substance relationships to coronavirus identified?
We used Causaly to search for substances that inhibit targets (proteins, enzymes, receptors, genes) which contribute to coronavirus (Coronaviridae and Coronavirus infections).
For our users, here’s what this search looks like in Causaly:
The results were aggregated by substance to form clusters of targets and clusters of coronavirus, scored and subsequently sorted. The first 250 of approx 6000 substances can be found in the spreadsheet released today.
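As a rough illustration of the search and aggregation described above, the sketch below runs a two-hop lookup over a toy edge list: first find targets that contribute to a coronavirus concept, then find substances that inhibit those targets, and group the hits by substance. Everything here is an assumption made for illustration: the edge format, the function name `find_candidates`, the placeholder node names, and the mapping of "inhibit" to a Downregulate edge and "contribute to" to an Upregulate edge. It is not Causaly's implementation or data.

```python
from collections import defaultdict

# Hypothetical toy edges (source, relation, destination); placeholder names only.
EDGES = [
    ("substance_A", "downregulate", "target_1"),
    ("substance_A", "downregulate", "target_2"),
    ("substance_B", "downregulate", "target_2"),
    ("substance_C", "upregulate", "target_1"),   # not an inhibitor, so ignored
    ("target_1", "upregulate", "Coronaviridae"),
    ("target_2", "upregulate", "Coronavirus infections"),
    ("target_3", "upregulate", "Coronavirus infections"),
]
COV_CONCEPTS = {"Coronaviridae", "Coronavirus infections"}


def find_candidates(edges, cov_concepts):
    """Return {substance: {"targets": set, "cov": set}} for the two-hop search."""
    # Hop 1: targets that contribute to (here: upregulate) a coronavirus concept.
    target_to_cov = defaultdict(set)
    for src, rel, dst in edges:
        if rel == "upregulate" and dst in cov_concepts:
            target_to_cov[src].add(dst)

    # Hop 2: substances that inhibit (here: downregulate) one of those targets,
    # aggregated into a cluster of targets and a cluster of coronavirus concepts.
    clusters = {}
    for src, rel, dst in edges:
        if rel == "downregulate" and dst in target_to_cov:
            entry = clusters.setdefault(src, {"targets": set(), "cov": set()})
            entry["targets"].add(dst)
            entry["cov"] |= target_to_cov[dst]
    return clusters


if __name__ == "__main__":
    for substance, cluster in sorted(find_candidates(EDGES, COV_CONCEPTS).items()):
        print(substance, sorted(cluster["targets"]), sorted(cluster["cov"]))
```

In this toy graph only `substance_A` and `substance_B` are returned, because `substance_C` upregulates rather than inhibits its target.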
How were substances ranked?
Results are scored based on three main factors:
- the number of targets inhibited by a substance
- the amount of direct evidence where a substance has already been documented to inhibit CoV replication in vitro or in vivo
- the availability of a Clinical Trial on any CoV.
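One simple way to combine three factors like these is a weighted sum followed by a sort. The sketch below shows that idea only. The weights, the `Candidate` fields, and the function names are hypothetical and chosen for illustration; the scoring actually used is documented in the spreadsheet and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    n_targets: int          # number of CoV-relevant targets the substance inhibits
    n_direct_evidence: int  # count of direct in vitro / in vivo inhibition findings
    has_cov_trial: bool     # whether a clinical trial on any CoV exists


# Hypothetical weights; the real weighting is not stated in this post.
W_TARGETS = 1.0
W_EVIDENCE = 2.0
W_TRIAL = 5.0


def score(c: Candidate) -> float:
    """Weighted sum of the three factors (illustrative only)."""
    return (
        W_TARGETS * c.n_targets
        + W_EVIDENCE * c.n_direct_evidence
        + W_TRIAL * int(c.has_cov_trial)
    )


def rank(candidates, top_n):
    """Sort candidates by descending score and keep the first top_n."""
    return sorted(candidates, key=score, reverse=True)[:top_n]


if __name__ == "__main__":
    pool = [
        Candidate("substance_X", n_targets=3, n_direct_evidence=0, has_cov_trial=False),
        Candidate("substance_Y", n_targets=1, n_direct_evidence=2, has_cov_trial=True),
        Candidate("substance_Z", n_targets=2, n_direct_evidence=1, has_cov_trial=False),
    ]
    for c in rank(pool, top_n=2):
        print(c.name, score(c))
```

With these made-up weights, a substance with a CoV clinical trial and direct evidence outranks one that only inhibits more targets; different weights would give a different order.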
Further details on the scoring and the field definitions can be found in the Excel sheet provided.
The scoring algorithm can be further expanded in the future to include factors such as:
- whether substances share a similar drug mechanism
- the linguistic strength of a statement (hypothetical vs definite statements)
- the PageRank of a node
All of the above are already part of the Causaly Knowledge Graph but have been omitted from the scoring to release these results sooner.
Finally, all substances that are still preclinical, i.e. have no clinical trial data for any indication, were removed from the list. More generic terms such as interferon type 1 were also removed.
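The final filtering step can be pictured as a simple predicate over each record. In the sketch below, the record fields (`name`, `n_trials_any_indication`) and the contents of `GENERIC_TERMS` beyond the one example given in the text are assumptions for illustration, not the actual fields or exclusion list.

```python
# Only "interferon type 1" comes from the text; the full exclusion list is not published here.
GENERIC_TERMS = {"interferon type 1"}


def keep(record):
    """Keep a substance only if it has trial data for some indication and is not a generic term."""
    has_trial = record["n_trials_any_indication"] > 0
    is_generic = record["name"].strip().lower() in GENERIC_TERMS
    return has_trial and not is_generic


def filter_candidates(records):
    """Apply the keep() predicate while preserving the existing ranking order."""
    return [r for r in records if keep(r)]


if __name__ == "__main__":
    ranked = [
        {"name": "substance_X", "n_trials_any_indication": 4},
        {"name": "Interferon type 1", "n_trials_any_indication": 12},  # generic term
        {"name": "substance_Y", "n_trials_any_indication": 0},         # still preclinical
    ]
    print([r["name"] for r in filter_candidates(ranked)])
```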
To aid researchers, the Causaly team has annotated approx 135 of the 250 substances with additional information, including:
Drug Status: Whether the substance is a drug as defined by the FDA Glossary.
Marketing Status: Whether the substance is currently being marketed (as either a drug or dietary supplement), in at least one country.
CoV Evidence Status: Whether any published evidence exists showing a relationship between the substance and any coronavirus, labelled as either Clinical, Pre-Clinical, In Silico/Hypothesis or None.
In the interest of speed and providing this data to the community, we have kept our methodology overview brief. See the ‘Remarks’ tab in the sheet for further points on methodology, and the full breakdown of annotations and their denominations. A further post will be published shortly with additional detail.
Please also note that this analysis can be done on the fly in Causaly, and results can be sorted by different criteria. For access to the full dataset on Causaly’s live knowledge graph please get in touch.
This data does not endorse drugs, diagnose patients, or recommend therapy. This information has been released for the purpose of initiating further research for identified compounds. Causaly is not responsible for any errors or omissions, or for the results obtained from the use of this information. All information in this site is provided "as is", with no guarantee of completeness, accuracy, timeliness or of the results obtained from the use of this information.
About Causaly AI
Causaly is a London-based company whose mission is to democratize access to the world’s biomedical knowledge. Causaly leverages machine-reading and AI to enable rapid exploration of biomedical literature, unlocking key hidden evidence. With Causaly, researchers are able to answer complex questions that otherwise take weeks in traditional literature research.
Causaly has partnered with University College London to accelerate research into the pandemic. If your organization is actively involved in COVID-19, drop us a message to find out how we can help accelerate your work.
Questions, comments or feedback about this data? Get in touch.