Detect cancer earlier by interrogating medical and non-medical data sets using machine learning and deep learning

Artificial intelligence Grand Challenge

When cancer is detected early, treatment is more likely to be successful. But too often, cancers are diagnosed at a late stage when they’re much harder to treat.

There are many reasons for this, including that the symptoms people experience can often be vague and linked to much less serious conditions. But could there be hidden clues in people’s lives that point to cancer? And if so, is there a way to gather this information and help detect cancer earlier?

We’re generating, collecting and sharing more information than ever before, and this includes information that may be tied to our health. Security and privacy are a priority as technology and personal data are combined, but what if there was a way to link this information and improve cancer detection?

If we could use information in this way, who would do it? And how? Could doctors use hints from prescription records, online searches or social media activity to detect cancer earlier?

All of these questions need exploring, and this Grand Challenge aims to understand the possibility of examining medical and non-medical databases to spot patterns that could point to cancer. If it works, this could eventually lead to new ways to detect cancer earlier. 

In a sentence: Get clues to detect cancer earlier from medical and non-medical information 


We know that, for almost all cancer types, patient outcomes are improved if the disease can be diagnosed at an early stage. Although we have made significant advances in diagnosing certain cancers early, late diagnosis unfortunately remains an important problem; in England, for example, almost half of patients are diagnosed when their cancer is already advanced.

There are numerous barriers to the early detection of cancer. Some cancer types, such as pancreatic cancer, may display very few (or non-specific) symptoms until they are at an advanced stage. For other cancer types, delays may be due to patients not being aware of or not reporting cancer symptoms, or to health practitioners having difficulty identifying cancers in people presenting with vague or non-red-flag symptoms.

There may, however, be patterns of symptoms and behaviours within accessible data sets that could be used to indicate the presence of a cancer. These data sets may be medical (e.g. GP presentation patterns, prescription records, health insurance claims) or non-medical (e.g. social media activity, shopping history, online search history). There is an opportunity to employ deep-learning approaches to combine these data sets with other cancer risk factors, and devise methods to drive diagnostic investigation at an earlier stage and facilitate the early detection of cancer.


This Grand Challenge requires the collation and interrogation of both medical and non-medical data sets; this may include data sources that have not previously been explored for the purpose of early detection of cancer.

It is envisaged that a number of factors may need to be considered:

  • Collation of anonymised data sets from health records
  • Collation of anonymised data sets relating to online search or social media activity
  • Use of pattern recognition algorithms to decipher a set of patterns that can then be prospectively validated
  • Development of machine learning approaches to optimise algorithms over time as new data is collected
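
As a rough illustration of the last point, the sketch below shows one way a model could be updated incrementally as new records arrive, rather than retrained from scratch. It is a minimal sketch under stated assumptions, not a method proposed by the challenge: the data are purely synthetic, the two binary indicators are hypothetical placeholders, and the choice of online logistic regression is ours.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression updated one record at a time (stochastic gradient descent)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        # Gradient of the log-loss for a single record is (prediction - label) * x.
        error = self.predict_proba(features) - label
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error


def synthetic_record(rng):
    """Return a made-up record: two hypothetical binary indicators and a label.

    ASSUMPTION: these numbers are invented for illustration only and
    carry no clinical meaning.
    """
    label = 1 if rng.random() < 0.3 else 0
    indicator_a = 1.0 if rng.random() < (0.8 if label else 0.2) else 0.0
    indicator_b = 1.0 if rng.random() < (0.7 if label else 0.3) else 0.0
    return [indicator_a, indicator_b], label


def accuracy(model, records):
    correct = sum((model.predict_proba(x) >= 0.5) == bool(y) for x, y in records)
    return correct / len(records)


rng = random.Random(0)
holdout = [synthetic_record(rng) for _ in range(500)]  # held back for validation
model = OnlineLogisticRegression(n_features=2)

# Each "batch" stands in for newly collected data arriving over time.
for batch_number in range(1, 6):
    for _ in range(200):
        features, label = synthetic_record(rng)
        model.update(features, label)
    print(f"after batch {batch_number}: held-out accuracy = {accuracy(model, holdout):.2f}")
```

Evaluating on a held-back set after each batch loosely mirrors the idea of validating discovered patterns prospectively; a real study would need far more careful validation, along with the privacy and ethical safeguards discussed in this challenge.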

Teams taking on this challenge will need to address the ethical considerations surrounding the use of publicly available and patient data.


The goal of this Grand Challenge is to deliver interventions on a personal level that facilitate the early detection of cancer, particularly for those cancer types that currently present late and/or have poor survival rates. The collation of diverse data sets is also likely to result in more informed health information and more accurate cancer diagnoses.


  • Any country
  • Any discipline
  • Any academic institution or commercial entity
  • Any career stage

The panel

The Grand Challenge Advisory Panel are responsible not only for setting the challenges, but also for assessing all applications, shortlisting teams and ultimately deciding which teams will receive the Grand Challenge award.
