Not all evidence is created equal: Machine-Reading in Biomedicine

Yiannis Kiachopoulos
published on November 19, 2018

Teaching computers to read and understand biomedical publications for cause-and-effect relationships is a challenging task, not least because it is not obvious what we mean by "read" and "understand".

By "Reading" we broadly mean extracting all the information from a sentence that is relevant to understanding an affect-relationship. This task concerns syntactic and semantic understanding, i.e. what the Subject-Predicate-Object structure is, what the event is, where the action takes place, whether it is hypothetical or not, etc. Let's look at an example sentence taken from an academic publication.

Even this relatively common sentence shows the immense complexity of natural language: things that feel easy for a human reader, such as the indicative nature of the statement that medical procedures can lead to stress, are difficult for machines to comprehend. At Causaly we are developing algorithms for precisely this task. From the example above we would extract the following three affect-relationships:

We can immediately see that a statement of evidence cannot be reduced to just a relationship (A)-->(B): It is not the hospital setting but the lack of control over it that leads to stress. Likewise, the statement (Stress) --> (Anxiety) refers to hospitalized children in this context and not to all population groups in general. In addition, linguistic statements can be expressed in hypothetical or definite terms. The number of different forms of expression and their combinations is staggeringly high.
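To make the point about context concrete, here is a minimal sketch of how a statement could carry more than a bare (A)-->(B) edge. It is not Causaly's data model: the class name `Statement` and all of its fields are hypothetical, chosen only to mirror the qualifiers discussed above (what exactly about the cause matters, which population is meant, and whether the wording is hypothetical).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One extracted statement; all field names here are hypothetical."""
    cause: str
    effect: str
    cause_qualifier: Optional[str] = None  # e.g. "lack of control over"
    population: Optional[str] = None       # None means no population was stated
    hypothetical: bool = False             # True for wording such as "may" or "might"

    def bare_edge(self) -> str:
        """The reduced (A)-->(B) form, which drops all context."""
        return f"({self.cause})-->({self.effect})"

    def describe(self) -> str:
        """The same relationship with its qualifiers kept."""
        cause = f"{self.cause_qualifier} {self.cause}" if self.cause_qualifier else self.cause
        text = f"({cause})-->({self.effect})"
        if self.population:
            text += f" in {self.population}"
        if self.hypothetical:
            text += " [hypothetical]"
        return text


# Two statements loosely based on the examples discussed in the text.
statements = [
    Statement("hospital setting", "stress", cause_qualifier="lack of control over"),
    Statement("stress", "anxiety", population="hospitalized children"),
]

for s in statements:
    print(s.bare_edge(), "|", s.describe())
```

The bare edge for the first statement reads (hospital setting)-->(stress), which is exactly the reduction the text warns against; the qualified form keeps the distinction.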

Our knowledge graph contains more than 110 million statements of affect-relationships in Biomedicine, obtained by "Reading" close to 20 million publications. But how can we make sense of the diversity of statements?

This is where the "Understanding" part of our platform comes into play. We have developed a hierarchical scheme for classifying evidence from very strong to weak, from hypothetical to definitive.


However, even with this classification we are only addressing the linguistic validity of evidence. A human reader, on the other hand, would (again intuitively) look for more context. In particular, it makes a difference whether the publication was a Randomized Controlled Trial or a case report, whether the statement comes from the Conclusion section of an article or from the Introduction, whether it was published in a peer-reviewed journal or not, and more. These and several more parameters are computed on our platform for each of the 110 million statements.
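As a rough illustration of how linguistic strength and publication context could be combined, the sketch below orders a few statements by both. Everything in it is an assumption made for illustration: the level names in `Strength`, the ordering in `StudyType`, and the priority order in `sort_key` are hypothetical and are not the classification or weighting used on the Causaly platform.

```python
from dataclasses import dataclass
from enum import IntEnum


class Strength(IntEnum):
    """Hypothetical linguistic levels, ordered from weak to strong."""
    HYPOTHETICAL = 1
    SUGGESTIVE = 2
    DEFINITIVE = 3


class StudyType(IntEnum):
    """Hypothetical ordering of the two study types named in the text."""
    CASE_REPORT = 1
    RANDOMIZED_CONTROLLED_TRIAL = 2


@dataclass(frozen=True)
class Evidence:
    text: str
    strength: Strength
    study_type: StudyType
    in_conclusion: bool  # True if the statement comes from the Conclusion section
    peer_reviewed: bool


def sort_key(e: Evidence):
    # Tuples compare element by element, so earlier fields dominate.
    # This priority order is an assumption made for illustration only.
    return (e.study_type, e.strength, e.in_conclusion, e.peer_reviewed)


def rank(items):
    """Return the items ordered from strongest to weakest under sort_key."""
    return sorted(items, key=sort_key, reverse=True)


evidence = [
    Evidence("case report, definitive, conclusion",
             Strength.DEFINITIVE, StudyType.CASE_REPORT, True, True),
    Evidence("RCT, suggestive, introduction",
             Strength.SUGGESTIVE, StudyType.RANDOMIZED_CONTROLLED_TRIAL, False, True),
    Evidence("RCT, definitive, conclusion",
             Strength.DEFINITIVE, StudyType.RANDOMIZED_CONTROLLED_TRIAL, True, True),
]

for e in rank(evidence):
    print(e.text)
```

A lexicographic sort is the simplest possible combination; a real system would more plausibly weigh the parameters against each other rather than let one strictly dominate.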

We intend to write a dedicated blog post on this in the future - stay tuned. In the meantime, here is a preview of what is possible when evidence from the whole of PubMed has been synthesized and evaluated:
The chart above shows the amount of evidence from academic publications over time for (smoking)-->(COPD). It is a big-picture view of what the scientific community has evidenced over the past 40 years.
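A view like this boils down to counting statements per publication year for one cause-effect pair. The sketch below shows that aggregation under stated assumptions: the record layout (cause, effect, year) and the function name `evidence_per_year` are hypothetical, and the records are invented toy data, not real counts from the literature.

```python
from collections import Counter

# Toy records of (cause, effect, publication_year); invented for illustration.
records = [
    ("smoking", "COPD", 1980),
    ("smoking", "COPD", 1980),
    ("smoking", "COPD", 1995),
    ("stress", "anxiety", 1995),
]


def evidence_per_year(records, cause, effect):
    """Count statements per year for one (cause)-->(effect) pair."""
    counts = Counter(year for c, e, year in records if (c, e) == (cause, effect))
    return dict(sorted(counts.items()))  # ordered by year, ready for plotting


print(evidence_per_year(records, "smoking", "COPD"))
```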

With Causaly, our goal is to give researchers and decision-makers the tools to drill down into each point of evidence at the level of detail they need, as discussed above.
