How AI is redefining hypothesis generation in pharma research

Richard Harrison
published on June 29, 2022

It is widely recognized that the Scientific Method is fundamental for discovering new medicines. We’re all familiar with the process: a researcher poses a question, such as what causes a specific disease, then gathers all relevant information that is known about their question. Upon digesting this information, they formulate a hypothesis and design experiments to test it.

This process has served us well for hundreds of years. However, recent advances in artificial intelligence are completely transforming it.

Causaly Cloud is leading the charge in this field. It is a platform that utilizes machine learning to comprehend all biomedical literature in seconds, extracting precise evidence from millions of documents – allowing researchers to reimagine knowledge discovery and accelerate their research.

In this blog post, I will present some case studies illustrating what it’s capable of – and specifically look at how Causaly’s ‘Multi-Hop’ function can help with hypothesis generation. I believe this technology can truly transform pharma research.

Gathering knowledge

I will start by examining how knowledge gathering can be transformed with Causaly Cloud.

For any given disease, the rate of new information being published is rapidly outpacing the human ability to absorb it. As an example, from January 2021 until June 2022, over 8,600 peer-reviewed publications on Alzheimer’s Disease appeared. A researcher would have to read, on average, around 16 publications a day just to keep up. Assuming it takes an hour to read each paper, there is barely any time left for actual research.
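The reading-load estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the exact window boundaries (1 January 2021 to 30 June 2022) are an assumption, since the text only names the months, and the function name is purely illustrative.

```python
from datetime import date


def papers_per_day(total_papers: int, start: date, end: date) -> float:
    """Average number of papers published per calendar day, counting both end dates."""
    days = (end - start).days + 1
    return total_papers / days


# Assumed window: 1 January 2021 to 30 June 2022 (the article only gives the months).
rate = papers_per_day(8600, date(2021, 1, 1), date(2022, 6, 30))
print(f"{rate:.1f} papers per day")
```

With these assumed dates the window is 546 days long, which gives just under 16 papers per day, in line with the figure quoted above.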

This means many scientists run the risk of generating hypotheses on partial information – a situation that can easily lead to uninformed hypotheses, or worse, potential scientific bias. This is where Causaly Cloud can help. It uses single-sentence linguistic analysis on the entirety of published biomedical literature – extracting key pieces of evidence and visualizing them in an easy-to-understand knowledge graph.

As an example, we’ll use the Alzheimer’s Disease/amyloid hypothesis, which suggests a buildup of amyloid peptides leads to cognitive dysfunction.

By investigating this in Causaly Cloud in June 2022, I can find 21,775 pieces of evidence: 2,345 that state that amyloid peptides cause Alzheimer’s Disease, and 304 that say they decrease it (see image 1 below). Seeing both sides of a question before generating a hypothesis is important for reducing the bias that creeps in when there is too much information, since an overload can lead scientists to use only the literature that supports their views.


Image 1: Causaly Cloud shows all the evidence in biomedical literature. You get the full picture, including evidence you may otherwise accidentally miss, overlook, or dismiss. This image shows the total amount of evidence that agrees and disagrees with a hypothesis, along with arrows that indicate the direction of causality.

Generating new hypotheses

Once relevant information has been gathered, the difficult task is to find the hidden relationships in the data that can be the basis of a hypothesis. In this regard, Causaly Cloud can make a massive difference.

My research above found a recent publication (1) that used a GWAS analysis to identify new targets and risk genes for Alzheimer’s. Risk, or causal, genes have recently been shown to be good potential drug targets (2). One of the risk genes identified was PMAIP1, a pro-apoptotic member of the BCL-2 protein family.

I hypothesized that PMAIP1 is responsible for turning on genes that may be responsible for Alzheimer’s. As you can see from the images below, I used the Multi-Hop feature to explore any potential relationships. After a few moments, I was able to identify MCL-1, which is influenced by PMAIP1 and has recently been implicated in Alzheimer’s pathology by inhibiting mitophagy.



Images 2 and 3: Multi-Hop enables you to discover hidden mediators, allowing you to connect seemingly unrelated biomedical concepts. In this case, Causaly Cloud uncovered the MCL-1 gene, which appears to have an impact on Alzheimer’s Disease.
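To make the idea of a hidden mediator concrete, here is a minimal Python sketch of a two-hop search over a toy graph. It is not Causaly Cloud's implementation or API: the edge list, relation labels, and function names are all hypothetical, and the only relationships included are the two described in this post (PMAIP1 influences MCL-1, and MCL-1 is implicated in Alzheimer’s Disease).

```python
from collections import defaultdict

# Toy directed graph as (source, relation, target) triples.
# Only the two relationships mentioned in this post are included;
# the relation labels are illustrative, not Causaly's vocabulary.
EDGES = [
    ("PMAIP1", "influences", "MCL-1"),
    ("MCL-1", "implicated_in", "Alzheimer's Disease"),
]


def build_adjacency(edges):
    """Map each source node to a list of (relation, target) pairs."""
    adjacency = defaultdict(list)
    for source, relation, target in edges:
        adjacency[source].append((relation, target))
    return adjacency


def find_mediators(edges, start, goal):
    """Return every node M for which start -> M -> goal exists in the graph."""
    adjacency = build_adjacency(edges)
    mediators = []
    for _, middle in adjacency.get(start, []):
        for _, end in adjacency.get(middle, []):
            if end == goal and middle not in mediators:
                mediators.append(middle)
    return mediators


print(find_mediators(EDGES, "PMAIP1", "Alzheimer's Disease"))  # ['MCL-1']
```

On a real knowledge graph the same search would run over millions of evidence-backed edges and would need ranking by evidence strength, which this sketch leaves out.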

If I were a researcher developing treatments for Alzheimer’s, this would be a revelation. Ordinarily, MCL-1 would only be identified after much painstaking research. However, in just a few clicks, Causaly Cloud’s Multi-Hop feature has uncovered it straight away. I can join the dots in an instant, and form new hypotheses that could prove seriously helpful in the fight against this disease.

To show you the entire end-to-end process of how to use Multi-Hop, we've created a video where I demonstrate the feature in detail - please take a look below.

As you can see, employing AI in drug discovery has a transformational impact on the identification of better-defined, disease-linked targets and clinical candidates that have a greater chance of success. Multi-Hop in particular is a game changer - enabling researchers to uncover hidden connections and generate new hypotheses. As far as I’m aware, no other tool does this – making Causaly Cloud unique.

Request a demo to learn more about how to generate your hypotheses using Causaly Cloud.


  1. Z. Wang et al., Deep post-GWAS analysis identifies potential risk genes and risk variants for Alzheimer's disease, providing new insights into its disease mechanisms. Sci Rep 11, 20511 (2021).

  2. K. Sonehara, Y. Okada, Genomics-driven drug discovery based on disease-susceptibility genes. Inflamm Regen 41, 8 (2021).

Founded in 2018, Causaly’s mission is to transform how humans can find, visualize and interpret biomedical knowledge, to accelerate solutions for some of the greatest challenges we face in human health. Causaly acts as an operating system for biomedical and health data that empowers researchers to effortlessly identify new research avenues and innovative drug development opportunities. Its technology mimics human reading, and digests tens of millions of documents into an Enterprise Knowledge Graph allowing researchers and decision-makers to answer questions they can’t answer anywhere else. To learn more, visit
