How to read 35,000 articles in 1 minute

Yiannis Kiachopoulos
published on August 13, 2018

Within pharmaceutical companies, departments such as Health Economics need to understand diseases and their treatment options. Questions such as "What are the existing treatments for disease X?" are common, and they are usually followed by a targeted literature review to collect evidence and compile a study report. Depending on the depth of the review, such a report can take weeks to months to compile, which lengthens decision lead times and drives up labor costs.


Let's take Anxiety as an example and try to answer the question "What are the Pharmacologic Substances that reduce Anxiety?" The relevant corpus for Anxiety is ~35,000 academic publications. Clearly, no human can read this volume, as it would take multiple years. Our best alternative today is to narrow the scope with keywords and filters, much like a Google search. Instead of reading 35,000 articles, we "only" read, for example, 500 abstracts and dozens of full-text papers as a proxy for making sense of the whole body of information.

What if somebody had read all 35,000 articles and you could just search for causal evidence?

At Causaly we are developing machine-reading algorithms that "read" academic publications and identify cause-and-effect relationships. Using our Rapid Search platform, we can retrieve all Causes that affect Anxiety at the click of a button: the yield is ~16,000 relationships (the equivalent of roughly 40 full-text papers) from 35,000 articles - a 99.9% reduction in reading volume.
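
As a quick check on that figure, the arithmetic can be sketched in a few lines of Python. The helper name `reduction_pct` is made up for illustration, and the inputs are the approximate numbers quoted above, so the result is only as precise as they are.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction when going from `before` items to `after` items."""
    return (1 - after / before) * 100


# Approximate figures quoted in the article.
articles_in_corpus = 35_000
full_text_paper_equivalents = 40  # ~16,000 relationships, roughly 40 papers' worth of reading

reduction = reduction_pct(articles_in_corpus, full_text_paper_equivalents)
print(f"{reduction:.1f}% reduction in reading volume")
```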


Information in the Causaly Knowledge Graph is hierarchical, ordering Concepts into Categories and Sub-Categories. For example, Serotonin is an Organic Chemical, which in turn is a Chemical or Drug. This allows us to conveniently filter relationships for exactly what we need. With two more clicks we choose to show only Pharmacologic Substances that affect Anxiety and only evidence from Randomized Controlled Trials. The result further reduces 16,000 relationships to 180 - an additional 99% reduction in reading volume.
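
To make the idea of filtering on a hierarchy concrete, here is a minimal Python sketch. It is not Causaly's implementation or API. The `PARENT` map, the `Relationship` class, the two helper functions and the placeholder concepts ("Substance A", "Substance B", "Factor C") are all assumptions invented for illustration, as is the placement of Pharmacologic Substance under Chemical or Drug. Only the Serotonin chain comes from the text above, and the sample rows are toy data, not real findings.

```python
from dataclasses import dataclass

# Hypothetical parent links: each concept or category points to its parent.
# Only the Serotonin chain is taken from the article; the rest is invented.
PARENT = {
    "Serotonin": "Organic Chemical",
    "Organic Chemical": "Chemical or Drug",
    "Substance A": "Pharmacologic Substance",
    "Substance B": "Pharmacologic Substance",
    "Pharmacologic Substance": "Chemical or Drug",
    "Factor C": "Behavior",
}


@dataclass(frozen=True)
class Relationship:
    cause: str
    effect: str
    study_type: str


def ancestors(concept):
    """Yield the concept itself, then every category above it."""
    seen = set()
    while concept is not None and concept not in seen:
        seen.add(concept)  # guard against accidental cycles in the map
        yield concept
        concept = PARENT.get(concept)


def filter_relationships(relationships, effect, category, study_type):
    """Keep relationships whose cause sits under `category` in the hierarchy."""
    return [
        r
        for r in relationships
        if r.effect == effect
        and r.study_type == study_type
        and category in ancestors(r.cause)
    ]


# Toy rows for illustration only.
relationships = [
    Relationship("Substance A", "Anxiety", "Randomized Controlled Trial"),
    Relationship("Substance B", "Anxiety", "Case Report"),
    Relationship("Factor C", "Anxiety", "Randomized Controlled Trial"),
    Relationship("Serotonin", "Anxiety", "Randomized Controlled Trial"),
]

hits = filter_relationships(
    relationships,
    effect="Anxiety",
    category="Pharmacologic Substance",
    study_type="Randomized Controlled Trial",
)
print([r.cause for r in hits])
```

Because each concept carries its chain of categories, one category filter covers every concept beneath it, so no individual substance has to be listed by hand.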


Searching for relationships instead of articles is possible because the Causaly machine-reading platform has already read all the articles, connected the dots and hierarchically categorized all relationships in our Knowledge Graph. The advantages for our users go beyond reading speed: we can focus-read 35,000 articles in less than one minute without compromising on reading scope, since the full detail of every article was considered in arriving at just 180 relationships.

At this stage, the research journey is not over yet. In our example, the evidence underlying the 180 relationships is documented in ~500 papers. Selected deep dives into that evidence are still necessary before we can form an opinion. We will look at this next time. Stay tuned!
