How to read 35,000 articles in 1 minute

Yiannis Kiachopoulos
published on August 13, 2018

Within pharmaceutical companies, departments such as Health Economics need to understand diseases and their related treatment options. In this context, questions such as "What are the existing treatments for disease X?" are common and are usually followed by a targeted literature review to collect evidence and create a study report. Depending on the depth of the review, such a report can take weeks to months to compile, which lengthens decision lead times and drives up labor costs.

Let's take Anxiety as an example and try to answer the question "What are the Pharmacologic Substances that reduce Anxiety?" The relevant corpus for Anxiety is ~35,000 academic publications. Clearly, no human can read this volume; it would take multiple years. Our best alternative today is to reduce the scope by using keywords and filters, similar to a Google search. Instead of reading 35,000 articles, we "only" read, e.g., 500 abstracts and dozens of full-text papers as a proxy for making sense of this volume of information.

What if somebody had read all 35,000 articles and you could just search for causal evidence?

At Causaly we are developing machine-reading algorithms that "read" academic publications and identify cause-and-effect relationships. Using our Rapid Search platform, we can retrieve all Causes that affect Anxiety at the click of a button. The yield is ~16,000 relationships (the equivalent of roughly 40 full-text papers) from 35,000 articles, a 99.9% reduction in reading volume.
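
The 99.9% figure can be checked with a line of arithmetic. The sketch below uses only the numbers quoted above and assumes the reduction is measured as 40 paper-equivalents against 35,000 articles; the function name is ours, chosen for illustration.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction when going from `before` units to `after` units."""
    return (1 - after / before) * 100


# Figures quoted in the article.
articles = 35_000
paper_equivalents = 40  # ~16,000 relationships, per the article

first_step = reduction_pct(articles, paper_equivalents)
print(f"{first_step:.1f}% reduction in reading volume")
```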

Information in the Causaly Knowledge Graph is hierarchical, ordering Concepts into Categories and Sub-Categories. For example, Serotonin is an Organic Chemical, which is a Chemical or Drug. This allows us to conveniently filter relationships for exactly what we need. With two more clicks we choose to show only Pharmacologic Substances that affect Anxiety and only evidence from Randomized Controlled Trials. The result reduces 16,000 relationships to 180, a further reduction of roughly 99% in reading volume.
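
To make the idea of hierarchical filtering concrete, here is a minimal sketch in Python. It is not Causaly's implementation: the data structures, function names, and placeholder entries ("Substance A", "Event B", "Life Event") are hypothetical, and placing Pharmacologic Substance directly under Chemical or Drug is an assumption. Only the Serotonin chain comes from the text above.

```python
from dataclasses import dataclass

# Toy hierarchy, child -> parent. Only the Serotonin chain is from the
# article; every other entry is a made-up placeholder.
PARENT = {
    "Serotonin": "Organic Chemical",
    "Organic Chemical": "Chemical or Drug",
    "Substance A": "Pharmacologic Substance",
    "Pharmacologic Substance": "Chemical or Drug",  # assumed placement
    "Event B": "Life Event",
}


@dataclass(frozen=True)
class Relationship:
    cause: str
    effect: str
    study_type: str


def ancestors(concept: str) -> list[str]:
    """Return the concept itself followed by every category above it."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def filter_relationships(rels, effect, category, study_type):
    """Keep relationships whose cause falls under `category`."""
    return [
        r for r in rels
        if r.effect == effect
        and category in ancestors(r.cause)
        and r.study_type == study_type
    ]


# Illustrative relationships, not real evidence.
RELATIONSHIPS = [
    Relationship("Substance A", "Anxiety", "Randomized Controlled Trial"),
    Relationship("Substance A", "Anxiety", "Case Report"),
    Relationship("Serotonin", "Anxiety", "Randomized Controlled Trial"),
    Relationship("Event B", "Anxiety", "Randomized Controlled Trial"),
]

hits = filter_relationships(
    RELATIONSHIPS, "Anxiety", "Pharmacologic Substance",
    "Randomized Controlled Trial",
)
print(hits)
```

Because each concept carries its full chain of categories, one category filter and one study-type filter are enough to narrow the result set, which mirrors the "two more clicks" described above.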

The approach of searching for relationships instead of articles is possible because the Causaly machine-reading platform has already read all the articles, connected all the dots and hierarchically categorized all relationships into our Knowledge Graph. The advantages for our users go beyond reading speed: we can focus-read 35,000 articles in less than one minute without compromising on reading scope, since the full detail of all articles was considered in arriving at just 180 relationships.

At this stage, the research journey is not over. In our example, the evidence underlying the 180 relationships is documented in ~500 papers. Selected deep dives to investigate the evidence and form our opinion are still necessary. We will look at this next time. Stay tuned!

Not all evidence is created equal: Machine-Reading in Biomedicine
technology

Teaching computers to read and understand biomedical publications for cause-and-effect relationships is a challenging task. This is especially true because it may not be intuitive what we mean by "read" and "understand".

Knowledge emergence - what we learn from 100K monthly publications
Point of View

Every month we process more than 100,000 scientific documents. New knowledge emerges every month across thousands of scientific disciplines.

How is Obesity related to Breast Cancer? Insights from 140,000 articles.
use case

The underlying query machine-reads 143,548 articles in under 2 seconds and returns 53 hormones as potential mediators for the relationship (Obesity)->(Breast Cancer).