Not all evidence is created equal: Machine-Reading in Biomedicine

Yiannis Kiachopoulos
published on November 19, 2018

Teaching computers to read and understand biomedical publications for cause-and-effect relationships is a challenging task, not least because it is not obvious what we mean by "read" and "understand".

By "Reading" we broadly mean extracting all the information from a sentence that is relevant to understanding an affect-relationship. This task is concerned with syntactic and semantic understanding, i.e. what is the Subject-Predicate-Object agreement, what is the event, where is the action taking place, is the statement hypothetical or not, etc. Let's look at an example sentence taken from an academic publication.
[Figure: the example sentence, annotated]

Even this relatively common sentence shows the immense complexity of natural language: things that feel easy for a human reader, such as the indicative nature of the statement "medical procedures can lead to stress", are difficult for machines to comprehend. At Causaly we are developing algorithms for precisely this task. From the example above we would extract the following three affect-relationships:
[Figure: the affect-relationships extracted from the example sentence]

We can immediately see that a statement of evidence cannot be reduced to just a relationship (A)-->(B): it is not the hospital setting but the lack of control over it that leads to stress. Likewise, the statement (Stress)-->(Anxiety) refers to hospitalized children in this context and not to all population groups in general. In addition, statements can be expressed in hypothetical or definite terms. The number of different forms of expression and their combinations is staggeringly high.
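To make the point concrete, here is a minimal sketch of a record that carries the extra context alongside the bare (A)-->(B) pair. The class and field names (`AffectRelationship`, `cause_qualifier`, `population`, `hypothetical`) are assumptions made for illustration only and do not describe Causaly's actual data model; the two example records paraphrase the relationships discussed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AffectRelationship:
    """One extracted statement; all field names are illustrative assumptions."""
    cause: str
    effect: str
    cause_qualifier: Optional[str] = None  # e.g. "lack of control over"
    population: Optional[str] = None       # e.g. "hospitalized children"
    hypothetical: bool = False             # hypothetical vs. definite wording

    def as_arrow(self) -> str:
        """The bare (A)-->(B) view, which drops all context."""
        return f"({self.cause})-->({self.effect})"

    def describe(self) -> str:
        """A fuller rendering that keeps qualifier, population and modality."""
        cause = f"{self.cause_qualifier} {self.cause}" if self.cause_qualifier else self.cause
        text = f"({cause})-->({self.effect})"
        if self.population:
            text += f" in {self.population}"
        if self.hypothetical:
            text += " [hypothetical]"
        return text


statements = [
    AffectRelationship("hospital setting", "stress", cause_qualifier="lack of control over"),
    AffectRelationship("stress", "anxiety", population="hospitalized children"),
]
for s in statements:
    print(s.as_arrow(), "|", s.describe())
```

Comparing `as_arrow()` with `describe()` for the first record shows what is lost when a statement is flattened to a plain pair: the arrow alone would wrongly blame the hospital setting itself.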

Our knowledge graph contains more than 110 million statements of affect-relationships in biomedicine, which we obtained after "Reading" close to 20 million publications. But how can we make sense of the diversity of statements?

This is where the "Understanding" part of our platform comes into play. We have developed a hierarchical scheme for classifying evidence from very strong to weak, from hypothetical to definitive.

[Figure: Causaly knowledge hierarchy]
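As a rough illustration of what an ordered classification could look like in code, the sketch below ranks sentences by how definitively they are worded. The level names, the cue words and the keyword rule are all assumptions invented for this example; they are a toy stand-in and not the hierarchy or the algorithms used on the Causaly platform.

```python
from enum import IntEnum


class Certainty(IntEnum):
    # Hypothetical level names; a higher value means more definitive wording.
    HYPOTHETICAL = 1
    SUGGESTIVE = 2
    DEFINITIVE = 3


# Toy cue lists, chosen only for this example.
HEDGES = {"may", "might", "could", "possibly"}
SOFTENERS = {"can", "suggests", "associated"}


def classify(sentence: str) -> Certainty:
    """Keyword rule standing in for a real linguistic classifier."""
    words = set(sentence.lower().replace(",", " ").replace(".", " ").split())
    if words & HEDGES:
        return Certainty.HYPOTHETICAL
    if words & SOFTENERS:
        return Certainty.SUGGESTIVE
    return Certainty.DEFINITIVE


sentences = [
    "Stress might contribute to anxiety.",
    "Smoking causes COPD.",
    "Medical procedures can lead to stress.",
]
# Sort from most to least definitive wording.
ranked = sorted(sentences, key=classify, reverse=True)
for sentence in ranked:
    print(classify(sentence).name, "-", sentence)
```

Using an ordered enum means statements can be sorted and filtered by level directly, which is the property a hierarchy of evidence needs.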

However, even with this classification we are only addressing the linguistic validity of evidence. A human reader, on the other hand, would (again intuitively) look for more context. In particular, it makes a difference whether the publication was a randomized controlled trial or a case report, whether the statement comes from the Conclusion section of an article or from the Introduction, whether it was published in a peer-reviewed journal or not, and more. These and several more parameters are computed on our platform for each of the 110 million statements.
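One way to picture these context parameters is as metadata attached to every statement, which a reader can then filter on. The sketch below assumes a hypothetical `EvidenceContext` record and a `matches` helper; the field names and string values are illustrative and are not taken from the platform.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class EvidenceContext:
    """Context of one statement; field names are illustrative assumptions."""
    study_type: str      # e.g. "randomized controlled trial", "case report"
    section: str         # e.g. "conclusion", "introduction"
    peer_reviewed: bool


def matches(ctx: EvidenceContext,
            study_types: Optional[Set[str]] = None,
            sections: Optional[Set[str]] = None,
            peer_reviewed_only: bool = False) -> bool:
    """Return True if the context passes every filter that was supplied."""
    if study_types is not None and ctx.study_type not in study_types:
        return False
    if sections is not None and ctx.section not in sections:
        return False
    if peer_reviewed_only and not ctx.peer_reviewed:
        return False
    return True


contexts = [
    EvidenceContext("randomized controlled trial", "conclusion", True),
    EvidenceContext("case report", "introduction", True),
    EvidenceContext("randomized controlled trial", "introduction", False),
]
# Keep only peer-reviewed statements taken from a Conclusion section.
selected = [c for c in contexts
            if matches(c, sections={"conclusion"}, peer_reviewed_only=True)]
print(selected)
```

Filtering is deliberately kept separate from any notion of scoring here, since how much weight each parameter deserves is a judgment this sketch does not attempt to make.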

We intend to write a dedicated blog post on this in the future - stay tuned. In the meantime, here is a preview of what is possible when evidence from the whole of PubMed has been synthesized and evaluated:
[Figure: evidence over time for (smoking)-->(COPD)]
The chart above shows the amount of evidence in academic publications over time for (smoking)-->(COPD). It is a big-picture view of what the scientific community has evidenced over the past 40 years.
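The aggregation behind such a chart can be sketched as a count of statements per publication year for one cause-effect pair. The tuple layout and the function name `evidence_per_year` are assumptions for illustration, and the records below are made-up toy values, not real counts from PubMed.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (cause, effect, publication year) - a hypothetical, simplified layout.
Statement = Tuple[str, str, int]


def evidence_per_year(statements: Iterable[Statement],
                      cause: str, effect: str) -> Dict[int, int]:
    """Count statements for one cause-effect pair, keyed by year in ascending order."""
    counts = Counter(year for c, e, year in statements
                     if c == cause and e == effect)
    return dict(sorted(counts.items()))


# Toy records for illustration only.
toy = [
    ("smoking", "COPD", 1985),
    ("smoking", "COPD", 1985),
    ("smoking", "COPD", 2001),
    ("stress", "anxiety", 2001),
]
print(evidence_per_year(toy, "smoking", "COPD"))
```

Each year's count could then be split further by the classification and context parameters described earlier.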

With Causaly, our goal is to give researchers and decision-makers the tools to drill down into each point of evidence at the desired level of detail, as discussed above.
