Why do we need to wait for symptom onset before getting to a diagnosis?

And even after symptom onset, diagnostic performance in primary care has plenty of room for improvement: 12 million adults experience diagnostic errors every year in the US alone, and half of these carry the potential for harm¹. But why so many errors in primary care?

High patient volumes do not allow primary care physicians (PCPs) to give each patient careful, dedicated attention².
Incomplete information requires PCPs to make decisions under uncertainty³: think of undifferentiated symptoms, uncommon diseases and limited specialist expertise.
People get lost in the long and laborious referral pathway⁴, where physicians try to balance the risk of missing a serious illness against the wise use of often scarce and costly referral and testing resources.

Thus, diagnostic errors and delays have emerged as a global safety priority⁵. And I believe technology has not yet reached its full potential in this challenge. Why? My answer is lack of data.

Machine learning has a big ‘right to play’ here, having already shown super-doctor capabilities in a few (specific) tasks⁶. However, if you read pretty much any recent paper demonstrating an application of machine learning to early diagnosis, you will encounter something like:

Indeed, data collection will go on to increase the number of patients and blood samples. This will give the possibility to apply deep learning-based classification methods, which might lead to further improvement of the classification performance⁷ (paper from Nature, 2021)

This means that in order to exploit the potential of the latest machine learning algorithms being developed in the computer science community (i.e. deep learning), you need larger datasets. In fact, machine learning algorithms identify hidden patterns in data and can make predictions with little prior knowledge of the problem. How? They look at vast numbers of observations and find relationships among factors that might explain a given outcome. Then they make predictions on new observations based on what they have learned. Simple as that. They just need data…
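To make the "learn from observations, then predict on new ones" idea concrete, here is a minimal sketch. It uses a nearest-centroid classifier, which is far simpler than the deep learning methods mentioned above, and the observations, marker names and labels are made up purely for illustration; they are not real medical data.

```python
from statistics import mean


def fit_centroids(observations, labels):
    """Learn one average point (centroid) per label from the training data."""
    grouped = {}
    for features, label in zip(observations, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign a new observation to the label with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical toy observations: (marker_a, marker_b) -> outcome. Not real data.
observations = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
                (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["healthy", "healthy", "healthy", "at_risk", "at_risk", "at_risk"]

# "Learning": summarise what each outcome looks like in the data.
centroids = fit_centroids(observations, labels)

# "Predicting": classify an observation the model has never seen.
new_prediction = predict(centroids, (2.9, 3.1))
print(new_prediction)
```

With six observations the "pattern" is trivial. The point is only the shape of the workflow: the more (and more varied) observations the model sees, the subtler the relationships it has a chance to pick up.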

However, medical data has been (and still is) extremely unstructured and scattered across medical centres and hospitals. People end up not ‘owning’ their electronic health records (EHR), while institutions can’t properly share them due to privacy concerns.

“To put this in perspective, your ATM card works in Outer Mongolia, but your EHR can’t be used in a different hospital across the street” ⁷

Result: an incredible amount of the data we do collect is simply lost somewhere, denying medical researchers the chance to develop disruptive models that could save our lives.

My solution: Meredict.

I realised that the data may be unstructured, but all those pieces of information share one common ground: the person. Each of us must have access to all our data stored across hospitals, doctors, smartphones and insurers, especially now with GDPR. It’s just that no one has made the effort to collect it in one place. Meredict helps you do that.

Once it is collected, you will have the chance to share it with whoever you want: doctors, family or medical researchers, to generate new life-changing discoveries. In return, you will get ongoing monitoring of your health data by top specialists in multiple fields of medicine.

Imagine this:

Creating a huge database by helping people take their health data back.
Offering it to medical researchers to find new patterns and develop new algorithms that predict diseases much earlier.
Running such algorithms on an ongoing basis on people’s health data and raising a flag as soon as something is wrong.

That didn’t work…

I worked on that for 2 months, getting great traction on the pharma/researchers side (they were willing to build studies on the data we collect), but users were not excited. The key reason was that people are not ready to share their health data with a private early-stage startup. In fact, they are not willing to share it with anyone, because if all the data sits in one place it is easier to steal. Anyone except Google, Facebook and Candy Crush… It seems that if Facebook infers your condition from what you post, that’s fine, but if they asked you ‘what’s your condition?’ that would be a red flag.

That makes sense. It was probably not the right time for such a solution, or we could not find the right formula to scale: maybe leaving the data on users’ devices and using federated learning to train a GAN that then generates synthetic data would work? We will figure it out!

Learnings

Validate all the assumptions on the user side asap, and do not underestimate users’ reasons not to do what you ask them to do. Unless you provide real value to users in the short term, it will be a real struggle to get things off the ground.
Marketing campaigns are freaking tough. I spent a lot of money on Facebook ads with very poor results. I’m sure a big part of that is that I could not design a catchy enough ad.
People (at least in Germany) don’t take free beers at the park! We organized an outdoor event to give beers away in exchange for a quick discussion and a no-spam signup to our project. We bought 200 beers for a full afternoon in a high-traffic park in Berlin but struggled (STRUGGLED) to distribute 50!