
DIGITAL TRANSFORMATION IN HEALTHCARE
A whitepaper of the HealthCare Futurists GmbH

2017
This whitepaper contains the current state of the art assessment on
Digital Transformation in Healthcare
as explored in November 2016 by leading experts in the field,
who have come together under the umbrella of the HealthCare
Futurists.

The HealthCare Futurists (HCFs) are an international and independent
network, think-tank, make-tank, incubator, catalyst and consulting
hub for innovation in healthcare, life sciences and medicine. Our
mission is to collaboratively challenge the status quo and passionately
push the limits of current thinking and practice in healthcare. We are
renowned experts and thriving professionals of diverse disciplines, all
related to healthcare. We share a persistent passion for patient-centric,
client-centric and customer-centric innovation and consider ourselves
to be Change Agents of Innovation and healthcare's custom shop.

http://www.healthcarefuturists.com

HealthCare Futurists GmbH


Düsseldorf Office
Plange Mühle 3
40221 Düsseldorf
Germany

info@healthcarefuturists.com
www.healthcarefuturists.com
Twitter: @hcfuturists
Legal Disclaimer and Copyright

The Digital Transformation in Healthcare white paper is published by The HealthCare Futurists GmbH.
All rights reserved. No part of this publication may be reproduced, copied or transmitted in any form
or by any means, or stored in a retrieval system of any nature, without the prior permission of the
HealthCare Futurists GmbH. Application for permission to reproduce all or part of the Copyright
material shall be made to the HealthCare Futurists GmbH, Plange Mühle 4, 40221 Düsseldorf or using
copyright@healthcarefuturists.com

Although the greatest care has been taken in the preparation and compilation of the Digital Transformation
in Healthcare white paper, no liability or responsibility of any kind (to the extent permitted by law),
including responsibility for negligence is accepted by the HealthCare Futurists GmbH, its servants or
agents. All information gathered is believed correct as of December 2016. All corrections should be sent
to the HealthCare Futurists GmbH for future editions.

Table of Contents
From Disruption to Transformation: It will not happen tomorrow...

1. Digital Health in General with a Special Focus on the Medical Affairs ...
1.1 Overview of the Current State of Digital Health
1.2 Current Streams of Engagement and Official Interest
1.2.2 Co-Creation in Healthcare
1.2.3 Predictive Analytics
1.3 Data Privacy, Data Security, Data Ownership
1.3.1 Data Ownership in German Case Law
1.3.2 Federal Supreme Court of Germany (Bundesgerichtshof)
1.3.3 Academic Discussion on Data Ownership
1.3.4 Data Privacy in General
1.3.5 Right to Data Portability
1.3.6 Data Privacy and Data Protection
1.4 Dedicated Section on Health Apps and Tracking Devices...
1.4.1 Market Penetration of Health Apps: An Overview
1.4.2 Description of Health App Segments
1.4.3 Diffusion of Health Apps in Germany
1.4.4 Clear Settings and Associated Certifications
1.4.5 Evidence of Clinical Efficacy and Economic Feasibility
1.4.6 Data Security Compliance
1.4.7 Conclusion
1.5 The Digitally Embedded Patient: How Does the Patient of the Future ....
1.5.1 The Changing Roles of Doctors and Patients
1.5.2 Digital Levers to Engage Patients in Health Care Processes
1.5.3 Digital Patient Deliberation and Support
1.5.4 Digital Solutions to Increase Patients' Self-Responsibility in Managing ...
1.5.5 Digital Solutions to Facilitate Patients' Interactions with the Healthcare ...
1.5.6 The Current Digital Patient Usage and Usage Barriers to Innovative...
1.6 Future Developments in Digital Health
1.6.1 Outlook on the Pharmaceutical Industry with an Emphasis...
1.6.2 The Issue of Data Ownership: What Kind of New Business Models...
1.6.3 The Issue of Data Security and Data Safety: Where are Data Being Stored...
2. Special focus on Big Data Potential Assessment and Exploitation in Healthcare
2.1 Current State-of-the-Art and Application Examples
2.1.1 Introduction
2.1.2 Big Data
2.2 Deep Learning
2.3 Description and Assessment of Tools Used to Work on Huge Data Sets
2.4 The Possible Futures of Big Data in Healthcare
2.4.1 How Will Big Data-Driven Healthcare eventually be able to change...
2.4.2 Guiding RCTs: Generating Promising Hypotheses / Quickly Testing...
2.4.3 Complementing RCTs
2.4.4 What Kind of Impact will Ubiquitous Computing and Wearables....
2.5 The Internet of Healthy Things (IOHT)
2.6 Augmented Reality: An Extraordinary Evolution of Technology Tools...
2.6.1 Introduction
2.6.2 Market and Pharmaceutical Marketing
2.6.3 Medical Education with Augmented Reality
2.6.4 Augmented Reality within the Hospital/Private Practice
2.6.5 Augmented Reality within the Surgical Theatre
2.6.6 Future Uses of AR
2.7 Competitor Analysis
2.7.1 What are Other Companies Doing and How Successful are They?
2.7.2 Roche
2.7.3 Pfizer
2.7.4 Novartis
2.7.5 Merck
2.7.6 Mylan and Allergan
Appendix
Concluding Remarks

From Disruption to Transformation: It will not happen
tomorrow if you expect others to do it

"Disruption" has recently replaced the word "innovation", as if disruption were some kind of overhauled
innovation on steroids. Things need to be super-new, not just new. They need to tilt and twist, not just
work. Innovation has become mainstream and hence is seen as a lame duck. It seems we do not trust new
things unless the old things are subject to almost profound destruction. Radical things are what we want in
terms of change. However, change is good as long as for me everything remains the same; and then again
we live in times of radical change and accompanying global insecurity. Language over time seems to acquire
a tendency to reach out to superlative words like disappointed voters to overpromisers. Getting noticed
in the age of attention span deficit has become a difficult challenge. In this context, an in-depth whitepaper
such as this one seems to resemble a futile assault of nostalgia. A digital disruption difficulty, in fact.

Disruption, however, has been excessively quoted in consulting, C-Level lingo and startup pitches alike. It
has therefore acquired a well-deserved place in the Olympic ranks of words whose unsolicited use obviously
tries to veil the fundamental cluelessness of the user. The frequency of use is directly proportional to the
meaninglessness of the content. It hence shares its fate with terms such as sustainability or optimization.
Since this paper's goal is not the semantic definition of words that refer to the field of further development of
technology, advancement in the humanities, or progress in ethics - even though intellectually this would be
a worthwhile endeavor - we decided not to dwell on the plethora of possible explanations seeking to make
innovation commonly accessible. To solve this quest and discover this uncharted land, others might set sail. We
have discovered notable serendipity in refraining from such great words and humbly gazing at what we have at
hand in terms of change. Thus, we decided to name what we see with the descriptive term of transformation.

Disruption is spot on, the one, mind-boggling intellectual supernova, the unicorn cantering by, leaving the
professional in shock and the crowd in awe. This usually does not endure but is rather an ephemeral event. At the
HealthCare Futurists, despite our denomination, we prefer to consider earthly aspects of business life and leave the
shining stars aside. We claim to look at what we think is going to stay for good and change the world into a better
place. These are the tiptoeing technologies which will soon come to a firm stand. It is not from a fashion perspective
or a trendy approach that we have selected our views. We are looking at things that sometimes circuitously
transgress from one vertical to the other and will shape our future and the way we practice and perceive healthcare.

It is this slow and yet persistent change of form, before anything else, that we currently see going on in
several industries. To this round dance of transformation, healthcare, medicine and life-sciences are quite
new aspiring bachelors. One might ask in bewilderment how this has come about. One reason, among others, is
that healthcare still is, and in the past even more so was, the domain of regulation. Changes, challenges and
chances were either funded by public money or governmental contribution, or they did not happen at all. The
doors were guarded by professions and professionals who made sure nothing would get between them and
the patient that was not to some extent based on evidence, which might be scientific or financial in nature.

The digital reformation, which is a part of the overall transformation process, is currently underway.
It is changing behavioral patterns and intellectual approaches. It is bringing light to the setup
of innovative processes in healthcare systems and empowerment to those who wish to learn more about
their medical conditions. It does this by using its digital tools of transparency to find access to large bodies
of knowledge and generate insights that were previously the right of the privileged few who possessed what
is now called health literacy, and from there is slowly evolving into what is rightly called citizen science.

Disruption, to revisit this word again, has seemingly always started out with ideas and people seizing power
over a realm they were previously secluded from. We know this as a reformatory process; the revolution has
been procrastinated and delayed until new structures will have gained momentum. What has happened
in times of reformation seems to have always been the alliance of technology, curiosity and necessity. In
healthcare, we see all three of them equally expressed. As necessity arises from medical conditions, it is a
constant trigger for people with acute but even more so with chronic conditions to tap into other sources of
information and be ahead of the curve for their own good - further ahead often than their own physician who
is tied down by fiscal, financial and federal constraints. Add common mistrust in expert systems and political
bodies to the equation and there is more potential yet for technology and services entwined to satisfy needs.

For all physical beings, being body owners by nature, the sheer number of people interested in the achievements
and infrastructure of modern medicine and what it can do is caused by innate human curiosity. This again
leaves us with technology: it is always a means to reach a goal, but has never been the goal itself, for at the
start of a new technology, there are no words for this matter, there is marveling only and an invention that
will eventually become an innovation, let alone a disruption. So how should there be anything else but
necessity and curiosity? Disruptions, on the other hand, if there are any, are not planned, predicted or pushed,
they just happen. In fact, often their future potential is not revealed to their own creators who function as
disruptors of their own ideas. It takes time and visionaries to lead these technology advancements to full market
success. This is another reason why we are all better off talking about a digital transformation rather than a
disruption. There might be a disruptive technology, but its potential needs to be envisioned by business explorers
and product pioneers alike. And again, these processes are rather transformative in pace but also in result.

We want to provide an overview of what we see happening in other markets outside of healthcare and try to
bridge the gaps between here and there, now and then, them and us by connecting the dots of what transformative
processes are setting in motion. All of this is imprinted on the narrow and yet so vast area of healthcare, medicine
and life-sciences. This whitepaper embarks on a journey together with the reader and we shall sail several very
different seas listening to expert guides in their respective fields who each have their own style in reasoning
and concluding. These different voices should get the reader oriented with the most current standards and
aspects of digital transformation in general and in healthcare in particular. It points the way into one or more
of our possible futures. It is, however, the reader's responsibility to connect the dots with lines that lead to
transformation into our next reasonable future. For the future always is what we allow ourselves to make of it.

Chapter 1: Digital Health in General with a Special Focus on the Medical Affairs Perspective (Descriptive)

1.1 Overview of the Current State of Digital Health with Focus on the Diverse Groups Affected (Patient, Payer, Physician, Pharma, Politics etc.)

An outsider's view of the stakeholders of most European healthcare systems, and especially the German one, will most likely lead to the conclusion that digital transformation takes place at a very slow pace, if at all. This contrasts with what is happening within the organizations of healthcare providers and administrators. Since the turn of the millennium, digitalization has successfully been integrated at considerable volume: internal production processes at the payer level, especially in terms of accounting, customer care and patient steering, have meanwhile been digitalized to the greatest possible extent. Fundamental to this success has been the well-established legal framework in individual countries. It explicitly defines the financial pathways for invoicing and payments by means of digital machinery. Furthermore, it is legally binding for all parties involved in service delivery and charging.

On the other hand, what is happening in terms of external affairs between the stakeholders is very heterogeneous. We see diversity in country-to-country comparisons as well as payer-to-payer and provider-to-provider relations. The reasons are discrepancies between payers and the lack of uniform legal frameworks. The latter are not yet well established, as opposed to digital invoicing, but they need to become the pillars in setting up consistent digital relationships in the future.

Data security regulations have come up a number of times now as the one impediment to a quicker expansion of digital transformation. They are often considered to be too strict, and then again there are age-specific pockets that harbor a deep mistrust of electronic communication using medical and personal data beyond the already existing and seasoned system of digital invoicing. Moreover, in many cases the cost-benefit ratio of new digital offerings is unclear. At the payer level, this leads to adherence to old behavioral patterns, most of which are pre-digital in nature. This approach is reinforced by systemic risk aversion and often rigid regulatory obedience at the public payer level. Even if this behavior might not be considered desirable from a transformative standpoint, it is very well understandable from the legal mandate of public bodies: their job is not to push innovations, let alone to develop them.

The basis for any activity in the field of digital health is electronic communication in a medical context. This ranges from classic and straightforward telemedicine applications, the acquisition of second opinions in dedicated second opinion portals, all the way to exploiting wearable data under the umbrella of connected health. In the not so distant future we will also see more and more surfacing around predictive analytics. We wanted to start you off in this whitepaper by showing some comprehensive charts and tables. A lot of thought has gone into the graphs, for they are trying to depict the actual situation in a very comprehensive way.


Fig. 1: Current State of Digital Health - Focus: Electronic Medical Communication

Figure 1: Electronic medical communication is the core function of digital health. The figure provides
an overview of the current state in the German health care system. The deeper the color, the farther the
implementation has been achieved at present. For details please refer to table 1a and the electronically
provided comprehensive appendix. (contact whitepaper@healthcarefuturists.com for details)


Fig. 2: Pharmaceutical industry: Relevant general categories of digital health applications.
The deeper the color, the farther the implementation has been achieved in the German public
health care system. For further details refer to table 1 (appendix).

For the pharmaceutical industry, several new opportunities have opened up. The industry can now directly
approach patients and prescribers. It might do that with integrative "beyond the pill" solutions
in mind or by using digital and social media as platforms for social feedback and health information.
This can be helpful in terms of the provision of neutral medical aspects as well as marketing tools.
As a matter of principle, in almost all digital communication relationships there is space for pharma
contribution. From a payer side this might also be quite interesting in terms of strategic aspects.

1.2 Current Streams of Engagement and Official Interest

1.2.1 Health Literacy and Patient-Centricity

Health Literacy: it is a reformation rather than a revolution, but in any event it is a revelation.

Overall literacy, and especially health literacy, is one of the major achievements of modern societies, which have made it a duty for children to acquire at least a certain degree of knowledge in understanding written texts. However, it is not only the writing that needs to be understood, it is also the thoughts (and often advice) in medical texts, contained within the alphanumerical codes that our brains are trained to decipher to make sense of the world, for better or worse.

Thus, according to the WHO, health literacy is defined as the cognitive and social skills which determine the motivation and ability of individuals to gain access to, understand, and use information in ways that promote and maintain good health [1]. It can also be seen as a constellation of skills, including the ability to perform basic reading and numerical tasks required to function in the healthcare environment [2], as the American Medical Association puts it. However it is academically put, health literacy, at its core, means the degree to which individuals have the capacity to obtain, process and understand basic health information and services needed to make appropriate health decisions [3].

Poor health literacy is likely to be associated with unfavorable health outcomes and a limited use of preventive care [4]. This also means that healthcare costs are, on average, higher where health literacy is lacking [5]. It is estimated that up to one half of the US population has limited health literacy standards [5], and it is probably not much different in the European countries. These mechanisms are very well understood; therefore, health education materials are being simplified in order to improve patient-to-provider communication and thus overall health literacy, which is considered to lead to more efficacious spending in healthcare [6].

However, it is not only the absence of skills and abilities that renders an individual incapable of comprehending complex healthcare-related content or of understanding the professional language of physicians and care providers. Active neglect and turning a blind eye to the obvious also seem to be a part of the challenge. So it is not only a question of socioeconomic status, but also of the will to break habits, to change, and to innovate on a personal and systemic level. In this light it does not come as a surprise that health literacy is believed to be a stronger predictor of health outcomes than social and economic status, education, gender, and age [7].

It seems though that we are now addressing a well-known phenomenon more and more under the novelty

1 Nutbeam, Don. (1998) Health Promotion Glossary of the World Health Organization.
2 American Medical Association Ad Hoc Committee. (1999). Health Literacy for the Council on Scientific Affairs.
3 Nielsen-Bohlman, Panzer, and Kindig. (2004) Health Literacy: A Prescription to End Confusion. Retrieved from https://www.nap.edu/read/10883/chapter/1
4 Berkman ND, et al. Health Literacy Interventions and Outcomes: An Updated Systematic Review. (2011)
5 Eichler, K., Wieser, S. & Brügger, U. Int J Public Health (2009) 54: 313. doi:10.1007/s00038-009-0058-2
6 US Department of Health and Human Services. Retrieved from https://health.gov/communication/literacy/quickguide/quickguide.pdf
7 American Medical Association Ad Hoc Committee. (1999). Health Literacy for the Council on Scientific Affairs.


aspect of contemporary technological achievements. In this, we are guided by the thought that technology has already fixed a number of issues concerning longevity, so why should health literacy (via health understanding and modern technology use) not be one of them? The root of this thought lies in the introduction of ubiquitous computing, with potentially 80% of adults carrying a supercomputer in their pocket by 2020. We may also see technological advances in connectivity affect health literacy habits. Technology and medicine seem to have been a matching pair in the West dating back as far as Hippocrates in ancient Greece.

Six out of ten respondents have used the internet to search for health-related information within the last year. At the top of the ranks are searches for general health information, nutritional information, and facts on lifestyle choices. In second place are queries for information on specific injuries, diseases, and illness conditions, as well as side effects of medications. It is no surprise that at the current stage, the early adopters in their twenties and thirties still lead the numbers, with the silver surfers catching up steadily [8].

There is a clear trend that is owed to what I would refer to as democratization in healthcare - knowing that this might not be the best term to describe what is happening when patients engage with physicians and actively decide on their therapy. It might as well be called enlightenment in healthcare, alluding to Immanuel Kant's famous words (quoted as such in 1784 in his text What is Enlightenment?): "Laziness and cowardice are the reasons why such a large part of mankind gladly remain minors all their lives [...]. They are the reasons why it is so easy for others to set themselves up as guardians. It is so comfortable to be a minor. If I have [...] a physician who prescribes my diet [...] - then I have no need to exert myself." And Kant continues: "I have no need to think, if only I can pay; others will take care of that disagreeable business for me." This goes along with the self-imposed nonage which does not lie in the lack of understanding but in indecision and lack of courage to use one's own mind without another's guidance.

As we can see happening in paternalistic medical approaches, these others are guardians who have kindly taken supervision upon themselves to see to it that the overwhelming majority of mankind [...] should consider the step to maturity not only hard, but as extremely dangerous. This also makes the guardians' lives easier and more predictable, because they can fully leverage any effects deriving from this nonage to their advantage by exploiting value claims that are lacking in medical evidence in the context of proposition-induced demands. This means that health literacy, its level of development, and its acceptance, becomes a deeply ethical question within several groups: politicians, payers, physicians, patients, and industrial suppliers.

The latest driver of democratization in healthcare or medical enlightenment, to pick up this phrase one more time, is the so-called digital transformation in healthcare. We have seen a number of other fields affected, probably even disrupted, by the introduction of modern technology, especially information technology systems and the resulting change in customer empowerment and business models. In many markets this has led to the disappearance of the middle man when customers actively engage with providers and vendors. What is possible in the eBay commerce markets or the direct insurance markets might not yet be feasible in healthcare; but we see an increase in companies who are trying just that - actively ignoring regulations that have been around and untouched for decades.

One of them is the patient-physician relationship, which is considered to be the physical basis for therapeutic success. Lichtenberg, a German poet of the 18th century, once claimed that it was the physician's duty to entertain the patient up until the moment that nature had cured the disease. However, doctors have lost their entertainment acumen since then, probably because they are too much in love with professional technology themselves, and thus have ceased to be the only source prospective patients turn to in their quest for understanding the cause of their current condition. Often, the first resource people consult has become an online search engine,

8 Flash Eurobarometer 404: European citizens' digital health literacy (2014). Retrieved from http://ec.europa.eu/public_opinion/flash/fl_404_en.pdf


followed by specific and dedicated websites including blogs and forums [8]. We also understand that it is a very private thing to search for medical information, so the vast majority who used the internet to look for health-related information did so for themselves [8].

The European Commission's report on European citizens' digital health literacy states that over three quarters of all respondents agree that the internet is a good tool for improving their knowledge of health-related topics. Almost nine out of ten people who looked for health information online say they were satisfied with the information they found. The biggest downsides are the reliability of textual content, its commercial orientation, and its lack of detail [8]. These are the major pitfalls of data acquisition from unreliable sources, and it is a pivotal illustration of the accuracy of what is often heard - that data has become the new oil. Just as in the refining process and fractional distillation, compounds are intellectually separated and can be used in different ways according to their compounding quality, which translates into medical reliability and accuracy. This will be the catalyst for business models that facilitate the search for sound healthcare-related information.

Especially because healthcare so often deals with uncertain decision-making based on a number of influential factors from various sources, it will become key in a connected society to declare the origin and quality of data and information available to the lay population wanting to grasp their medical conditions. In the classic patient-physician interaction, the principal-agent theory (framed by professional board exams on display at the doctor's sideboard) made sure healthcare was provided by a reliable and well-tested source under the conditions of trustworthiness and efficacy. Nowadays, things are not so easy. Not only do the borders between healthcare and self-care become blurry, but some prosumer electronic companies find themselves in a steady process of change towards understanding the patient's needs and detecting the underlying medical conditions at an early stage. What will be called disease interception in a couple of years' time started from humble beginnings in the trenches of historic epidemiology fights against cholera, which helped us understand the value of prevention. Prevention cannot work without an engaged individual, both on the healthcare provider's side and on that of the recipient.

This is where health literacy comes full circle. It is thus not about selling more medical interventions, it is about selling the right ones to the right individual at the right time and place. In times of information overload, trustworthiness issues, and the declining reliability of things that were taken for granted before, business models of the future that engage in the patient-centric arena need to be able to offer real value, not just for public reimbursement but also to convince a consumer of health goods of the value of a specific product or procedure. They must separate the wheat from the chaff by using the tools of digital transformation, such as self-learning algorithms that utilize a knowledge database linked to individual patient data. This will be a key asset in guiding patients through a maze of medical information in their search for more opinions and more security. The numbers of the EU report show that almost four out of ten people do not trust information from the internet when making health-related decisions. But then again, we already have indicators of the effect the internet has on well-being: people who have a poor health status use the internet less for health-related queries than healthy people [8].

There are currently a number of physicians and healthcare professionals engaging in the field of patient enlightenment or patient empowerment through their efforts to increase health literacy. They are either curating their own webpages or creating medical apps where they provide links to trusted sources; or they have started to take action and fill the knowledge gaps with health literacy tools that they have produced themselves. The privately run webpage www.orthopaedie-fuer-patienten.de (orthopedics for patients) is an example of a health care provider taking action and making specific information accessible to patients in a way they are able to grasp and comprehend. Interestingly enough, payers were not too enthusiastic about Dr. Klein's actions, so he turned his own conviction into a 3-kilo book project. Together with the initiative innovate.healthcare (http://innovate.healthcare), a healthcare hackathon event run by the HealthCare Futurists, this book will now also be made accessible to the digital patient.


Health literacy is a product of simplification and communication. It is the core discipline in becoming patient-centric rather than disease-centric. Consider that it is healthcare that we talk about, not disease-care. It also means getting the patient involved in the design and setup of healthcare, which is of course as cumbersome as a new traveler coming into a cozy train compartment. A tool of health literacy needs to be a process of co-creation in healthcare, to understand the patient's values and wishes as to how the product works.

It is the so-called P4 Medicine that will have a major influence on how we practice healthcare: predictive, preventive, personalized and participatory medicine, which quantifies individual wellness and takes the mystery out of disease [9]. Individual data clouds fueled by sensors we wear outside, on and inside ourselves will, to some extent, be able to predict future health statuses. They will also give us clues as to where prevention makes sense, given our genetic setup, and we will learn to delay the progress of diseases or avoid future pathologies. On a personal level, we need the acknowledgment that genetically we are all different from one another (always confined to the n=1 conundrum) [9]. The participatory aspect points to the education of patients and their exchange of experience, for example in social networks, which could become key in behavioral pattern change, because the interaction with a different peer group leads to a reframing on an individual basis. Health literacy then becomes the driving force for personal change in the digital healthcare age.

On a governmental level, however, there are different forces at work, such as rising costs that force the growth of a demographic wave of more engaged healthcare consumers. It is expected that online health literacy programs and mobile health per se will decrease the direct costs of healthcare in the US by 28% in 2020 compared to today [10].

[Figure: bar chart of spendings / savings in US$ per capita (axis from 0 to 10,000). Bars: projected healthcare costs in 2020 (without digital health implementation); savings from optimization of drug dispensing; improved management of high-cost patients; savings from optimization of administrative processes; health costs per capita with complete digital implementation (2020). Values shown: 9,400$; 6,800$; 1,600$; 700$; 200$.]
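As a quick plausibility check, the 28% reduction quoted in the text can be compared with the per-capita values shown in the chart. The sketch below assumes that 9,400$ is the projected cost without digital health implementation and 6,800$ the cost with complete implementation; that assignment is an assumption read off the chart values, not something the source spells out, and the variable names are illustrative only.

```python
# Per-capita values as read from the chart (assignment of values to bars is an assumption).
projected_cost = 9400          # projected 2020 cost without digital health, US$ per capita
cost_with_digital = 6800       # 2020 cost with complete digital implementation, US$ per capita
individual_savings = [1600, 700, 200]  # the three savings values shown in the chart

# Relative reduction implied by the first and last bar.
relative_reduction = (projected_cost - cost_with_digital) / projected_cost
print(f"Relative reduction: {relative_reduction:.1%}")

# The three individual savings values sum to slightly less than the difference
# between the two totals, which is presumably a rounding effect in the chart.
print(f"Sum of individual savings: {sum(individual_savings)} US$")
print(f"Difference between totals: {projected_cost - cost_with_digital} US$")
```

Under these assumptions the implied reduction is roughly 27.7%, which is consistent with the rounded 28% stated in the text.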

9 Hood, Leroy and Price, Nathan D. (2014) Science Translational Medicine


10 CMS; US Census Bureau, Bain & Company


In those healthcare systems where out-of-pocket payments do not constitute a large percentage of gross income, governments will likely push for more health literacy as a sustainable means of reducing costs. One of these cost-containment programs could be introducing prosumer tools to run diagnostic tests in a non-hospital setting, in order to deliver fast and accurate care at home. We will most likely see this happen in underserved and rural areas. Telehealth, mobile health and overall eHealth trends point in the direction of more patient engagement [11]. This is needed to comply with the political, financial and humanitarian challenges ahead. The concept of digital scorecards comprising blood pressure/heart rate, body mass index, cholesterol levels, immunizations, appropriate preventive measures, and self-reported status could become another tool in the domain of health literacy [12]. Thus, more and more responsibility is given to the individual who is able and willing to follow up.

The Commission has made it clear that activities aiming at increasing citizens' digital health literacy will be supported in the future [13]. This means that patients will be put in the driver's seat. We will see the development of new indicators of how to assess the actual value of eHealth services in cooperation with users. This also ties into the concept of user-driven research and innovation in the area of eHealth. In the future, patient engagement need not remain political jargon.

Combining the world of health workers who also need to develop their digital skills (important stakeholders in the digital transformation of healthcare) with the reality that patients need to reliably use the eHealth assets delivered to them, we hope to find ourselves in a world of wider acceptance of eHealth technologies. For doctors, this translates into more meaningful time with their patients and fewer unnecessary appointments, thanks to the use of ePrescriptions, medication plans and tele-monitoring, just to name a few examples.

Similarly, patients will be empowered to spend less time, effort and money on unnecessary GP and hospital visits. It is said that 80% of visits to the GP in the UK are from patients requesting repeat medication [14].

Contrary to what one might think, given all these insights and the obvious connections, health literacy is still in its infancy in Europe [13], and has its limitations in terms of personal dismay. Even professionals fall short when facing fatal conditions themselves. This indicates that it is not a question of education level, willingness to break old habits, or unwillingness to recognize health hazards; but that in every individual, questions of value and trust prevail in how states of disease and well-being are perceived, dealt with, and complied with. It drills down again to the level of trust expressed towards healthcare providers, media, and other information carriers. Health literacy makes the patient a partner. It assigns more responsibility to the individual, but does not absolve the health professional of the responsibility to still act as a patient's advocate, thus respecting that the patient has decided not to remain a minor in the Kantian sense as stated above.

The same holds for technological solutions that support opinion-forming in both the healthcare professional and the patient. Technology can be a means of support, but it will most likely not be the key to questions of noncompliance, ignorance, or intentionally hazardous lifestyles. By and large, in a society that makes healthcare more and more a public affair (because of the way it is funded through tax money or contributions), and with data generation sources such as wearables permeating our daily lives, we must not forget that individuals are still free to exercise their right to ignorance and to disregard the facts provided to them. Even though we have asserted above that it is preferable for the citizen to become a citoyen, an educated participant in all things pertaining to the preservation of the health status, we should think

11 Wong, Genius. (2016). The Foundation For Healthcare Democratization. Health IT Outcomes.
12 IOM (Institute of Medicine). 2013. Health literacy: Improving health, health systems, and health policy around the world: Workshop summary.
Washington, DC: The National Academies Press.
13 Quaglio, Gianluca, et al. (2016) Accelerating the health literacy agenda in Europe. Health Promotion International
14 European Commission Memo: eHealth Action Plan 2012-2020: Frequently Asked Questions (2012) Retrieved from http://europa.eu/rapid/press-release_MEMO-12-959_en.htm


about using technology not only to check whether something has been accomplished, but also to support the completion of goals. Health literacy and its ethical implications do not mean having the right to sacrifice self-responsibility on the altar of public surveillance, even when our assessment might differ from the individual's choices with regard to staying healthy.

Technological advances in the history of mankind have provided us with a number of tools that have changed the way we live and how we perceive our world. When Gutenberg invented book printing, and when Martin Luther translated the bible from Latin (the professional language of the clergy at that time) into the language of the people, the foundations for what we now call the Reformation were laid. People drew their own conclusions about questions that had been at the heart of a profession, and they did that by exercising their right to enlightenment. It is this enlightenment, also called democratization, that then leads to revolutions, be they political or technological in nature. It is also the core of innovation: to marvel at the extraordinary within the ordinary and to put common things into uncommon contexts.

Today we see a similar thing happening: the highly regulated healthcare systems that operate on certifications for medicines, machines, and medical practitioners are infiltrated and sometimes inundated by companies and entrepreneurial individuals. These entities make use of the digital era's printing plate - the internet - and, like modern-day reformers, initiate digital transformation by trying to bring literacy to healthcare, questioning information asymmetry, and jeopardizing the dearly held status quo of those in the system and thus in charge. Given the fact that our ancestors already fought this battle over eternal life, it is interesting to see how massive professional resistance is in an area that only, and by all means professionally, deals with disease and with sustaining life, which is undeniably one thing amongst all: finite.

1.2.2 Co-Creation in Healthcare

Co-creation in the area of fast-moving consumer goods has become standard in certain product areas. It has been widely understood and accepted that customer engagement can be increased by soliciting opinions on already existing products and those currently being developed, or by giving customers the chance to contribute to innovation and business development.

Healthcare poses a number of further challenges to the concept of co-creation and customer engagement which are legal and logistical in nature. Research and development have so far been quite remote from actual patient engagement. Rather, development seems oriented toward clinical demand, potential for reimbursement, and individual portfolio fit per company. In most cases we find product creation being driven by companies.

With the internet and the waves of digitalization pre- and post-internet, we have seen the possibilities of patient engagement change from individual members of patient organizations and official patient representatives being explicitly asked to contribute in advisory boards or political meetings, to patients taking their fate into their own hands and organizing themselves. This in turn has jeopardized compound marketing, because certain internet portals have become powerful opinion leaders, and internet services now serve as CRO support to recruit patients for their studies in considerably less time than would ever have been possible before.

This pull in co-creating healthcare has also already been seen in personal genetic services such as 23andMe and others. While there is still discussion about whether patients ought to have access to their sensitive genomic data on Alzheimer's disease and others without the guidance of a professional physician, patients already use these platforms to band together in pushing pharmaceutical companies and research labs to investigate remedies for ultra-orphan diseases. The advantage they clearly bring to the table is the fact that these individuals, besides having a profound understanding of their condition - usually unparalleled by any physician - also constitute the study population. So the cost of the otherwise expensive process of finding suitable patients for studies is approaching zero.

Novel technologies will enable us to rethink how we perceive and thus practice healthcare. Rethinking in this context means challenging common practices and putting them to the test. We anticipate that the way we take in


medicines will be subject to this kind of challenge. This includes the form, color, size, coating etc. of orally administered agents; and we foresee this becoming a main domain of mass customization in healthcare, to the tune of 80% individualized polypill medication.

The will to co-create healthcare on an individual level is also a prerequisite not only for patient empowerment but also for successful and sustainable disease prevention. While current efforts of prevention are primarily geared towards a holistic healthy living approach, individual factors such as genetic setup (e.g. FOXO 47 Gene for carbohydrate), epigenetic interactions, and personal preferences are widely neglected. It is hoped that predictive analytics will provide a more granular approach towards individual risk factors and thus a more sustainable and co-created healthcare.

1.2.3 Predictive Analytics

We care about the future. Especially our future. And despite knowing it won't end well in the end, we want to make sure it doesn't end too soon.

There are things we are not good at predicting. If you are about to die in a freak accident, you can at least take comfort in the fact that you wouldn't have seen it coming. The same reasoning applies to terrorist attacks. There's simply not enough data to predict these rare events.

We are much better at predicting whether you are about to click on an ad, or the likelihood that you will pay back your debt, simply because we have a lot of data about those events and it's therefore easier to build accurate models of your future; but what about developing diabetes or having a heart attack? We know the exact probability of an average person running into those issues, but your individual risk is probably far from average. So maybe seeing your physician is a good idea, and on close examination they would be able to assess your risks more accurately. That's what we do, and it works.

But is it the best we can do? Your physician is well trained and knows a lot of things about you. But I doubt they know about your browser history (unlike the ad network that predicted you would click on that mortgage ad). They don't even know about highly predictive factors in your genes. And even if we showed them your genetic information on a DVD, would they be able to make use of it?

Not really. Nobody could. The genome of a single person is more data than any human could read in a lifetime, let alone make sense of it. If we want to extract information from this data, we'll have to let the machines take a look. And not only at one individual genome. At all genomes.

Not long ago, we had no idea about germs and viruses. They were already there, but we couldn't see them until we had good enough microscopes. Nowadays we collect all kinds of data and won't be able to see much if we don't look at it with the right instruments.

"In God we trust, all others bring data." [15]

While mathematics has proved to be surprisingly useful in the field of physics as described in the paper "The Unreasonable Effectiveness of Mathematics in the Natural Sciences" by Eugene Wigner [16], its application has been less successful in the fields of medicine or social sciences. There's no elegant mathematical formula to predict whether someone will develop colon cancer next year; but being unable to formulate elegant equations describing the health status of humans sufficiently does not mean we can't do anything. Maybe we have to accept the complexity of human beings and their environment as a given and resort to the next best option we have: looking at data. [17]

With enough data, some things become pretty easy which otherwise would be really hard - spell checking, for example. The traditional method of spell-checking was to look up each word in a dictionary. When a word

15 Quote by W. Edwards Deming: "In God we trust; all others bring data." 2012. 13 Sep. 2016 http://www.goodreads.com/quotes/34849-in-god-we-trust-all-others-bring-data
16 Wigner, Eugene P. The unreasonable effectiveness of mathematics in the natural sciences. Richard courant lecture in mathematical sciences delivered
at New York University, May 11, 1959. Communications on pure and applied mathematics 13.1 (1960): 1-14.
17 Halevy, Alon, Peter Norvig, and Fernando Pereira. The unreasonable effectiveness of data. IEEE Intelligent Systems 24.2 (2009): 8-12.


was not found in the dictionary, we would assume it was misspelled and search the dictionary for the word that was most likely intended instead. This search was done via complex heuristics like soundex and metaphone to find words which sound similar but are spelled differently. But if you have enough data you won't have to deal with complex heuristics. Let's assume you have a lot of textual data; then you can build a dictionary automatically. You'll generate a list of edits for each word you want to check and filter out words that are not in your dictionary. After that, you calculate the probability for each generated word and take the one with the highest probability. Now it's possible to do that with less than two pages of computer code [18].

There are two ways to extract more knowledge from data:

1. Build better algorithms to get more insight from the data you have
2. Gather more data

In 2001 Michele Banko and Eric Brill published a paper comparing different learning algorithms on datasets of different sizes [19]. They showed that it's not possible to predict the relative performance of algorithms when you increase the order of magnitude of your dataset. So one algorithm might seem to be weak on a small dataset, but do much better on a big dataset compared to other algorithms, and vice versa. This makes sense, because any sufficiently complex model should reflect the complexity of the data it was trained on, not the complexity of the algorithm that was used to train it.
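The data-driven spell-checking approach described above (build a dictionary from text, generate edits, keep known words, pick the most probable one) can be sketched in a few lines of Python. This is a minimal illustration in the spirit of the cited essay, not a reproduction of it: the tiny `CORPUS`, the function names and the restriction to a single edit are assumptions made here to keep the example self-contained.

```python
import re
from collections import Counter

# Tiny stand-in corpus (an assumption for illustration only);
# a realistic corrector would count words in millions of words of text.
CORPUS = """
the patient reads the health record and the doctor reads the health record
the doctor explains the diagnosis to the patient
"""

# Step 1: build the dictionary automatically by counting words in the text.
WORDS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
TOTAL = sum(WORDS.values())
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def probability(word):
    """Relative frequency of a word in the corpus (0 for unseen words)."""
    return WORDS[word] / TOTAL


def edits1(word):
    """All strings that are one delete, transpose, replace or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right
                for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Step 2: filter out generated words that are not in the dictionary."""
    return {w for w in words if w in WORDS}


def correct(word):
    """Step 3: return the most probable known candidate, or the word itself."""
    word = word.lower()
    if word in WORDS:
        return word
    candidates = known(edits1(word)) or {word}
    return max(candidates, key=probability)


for misspelled in ["helth", "docter", "teh"]:
    print(misspelled, "->", correct(misspelled))
```

The quality of the corrections depends almost entirely on the size of the corpus rather than on the cleverness of the code, which is the point the text is making.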

18 How to Write a Spelling Corrector - Peter Norvig. 2010. 15 Sep. 2016 <http://norvig.com/spell-correct.html>
19 Banko, Michele, and Eric Brill. Scaling to very very large corpora for natural language disambiguation. Proceedings of the 39th annual meeting on
association for computational linguistics 6 Jul. 2001: 26-33.


But what does it mean if we accept the premise that we have to use complex models based on huge amounts of data to predict future events? In the past one of our best guidelines in science was the use of Ockham's Razor. It's the principle that "entities must not be multiplied beyond necessity" or that simplicity is a guide to truth. If we are given a set of observations and have to come up with a theory which would have predicted them, we tend to choose the simplest one.

But if we apply Ockham's Razor as a method of choosing which model is better at explaining data we try to learn from, it seems that it doesn't work so well [20]. Maybe it's a really good idea to use Ockham's Razor in a world where the only way to teach a natural law is to write it down on a blackboard or in a book. And maybe the reason for this idea being good lies not so much in the world being governed by simple rules, but in the fact that simple rules are the only ones you can write down and read again as a human being. Model simplicity is probably a really good inductive bias in a world where the lack of information technology is the bottleneck.

For a world where access to and transfer of information is no longer scarce, other inductive biases might be more appropriate. In other words: we don't have to be able to understand a model to be able to make use of it. This is good, because the number of useful models will increase without this constraint. But it will also be strange, because we will no longer be able to explain why our predictions work.

20 Ockham's Razor is Dull - Apperceptual. 2012. 15 Sep. 2016 <http://blog.apperceptual.com/ockham-s-razor-is-dull>


1.3 Data Privacy, Data Security, Data Ownership

Neither European nor German law recognizes one data law that covers all aspects of data as such. Therefore, data is neither singularly protected by data privacy law nor by any other existing legal provision. Depending on the quality of the respective data or its relation to an individual, the approaches to legal data protection vary substantially.

The question of whether data rights already exist and/or should be introduced is subject to intense academic discussions in Germany and across Europe (for a detailed survey see Osborne Clarke's Legal study on Ownership and Access to data, A Study for the European Commission DG Communications Network, Content & Technology). The majority of scholars tend toward the conclusion that a property right to data does not exist. Moreover, there seems to be a consensus that a right to data should currently not be established due to the unpredictable effects such a right may cause.

German scholars categorize rights in two groups: absolute rights (i.e., erga omnes rights) and relative rights. Absolute rights apply with respect to any third party. Such absolute rights grant the entitled person an exclusive authority with regard to a certain legal position (e.g., an item or a patent). Relative rights only grant legal claims towards particular individuals, e.g. contractual obligations which only apply vis-à-vis the contract partner. In fact, the discussion about data ownership in Germany is a dispute about whether there should be an absolute right in data (not necessarily ownership). Approaches to establish data ownership are as various as there are provisions in German law regarding or relating to the protection of data. They range from criminal law, copyright law, competition law, general civil law, tort law and data privacy law to telecommunication law.

1.3.1 Data Ownership in German Case Law

There is currently no judgment of the Federal Constitutional Court (Bundesverfassungsgericht - BVerfG) that addresses ownership in data as such. However, the BVerfG stated in 1983 that an individual does not have absolute and unlimited rights in data. The data about a person instead represents an image of social reality, which cannot be allocated exclusively to the data-generating person [21].

1.3.2 Federal Supreme Court of Germany (Bundesgerichtshof)

The Federal Supreme Court (Bundesgerichtshof - BGH) has issued various judgements concerning rights in data, but it has not yet acknowledged ownership in data as such.

Traditionally, ownership in German civil law depends on a physical object. In contrast, data is not physical as such and no longer depends on a physical carrier. The BGH has adhered to this fundamental principle of German civil law in its judgements so far. To constitute a physical object (a thing) in accordance with Sec. 90 BGB as a prerequisite for ownership rights, it is decisive for data to be stored on a data carrier [22].

21 BVerfG, judgment dated 15 December 1983 - 1 BvR 209/8 - Volkszählungsurteil = NJW 1984, 419
22 BGHZ 143, 307, 309; 109, 97, 100 f.; 102, 135, 144; BGH, judgments dated 4 March 1997 X ZR 141/95 MDR 1997, 913; 14. July 1993
VIII ZR 147/92 NJW 1993, 2436, 2437 f.; 7 March 1990 VIII ZR 56/89 NJW 1990, 3011; 6 June 1984 VIII ZR 83/83 ZIP 1984, 962, 963;
decision dated 2 May 1985 I ZB 8/84 NJW-RR 1986, 219


The BGH has dealt with further aspects of data - taking its commercial value into account - by acknowledging that a data subject can have commercial interests in its own personal data (in this case, a photo of the actress Marlene Dietrich) which might even include a licensing right [23]. Further, in two different judgements in 1999 and 2006, the BGH recognized that the use of customer data by a business can constitute the violation of trade and business secrets [24].

The higher regional courts (Oberlandesgericht - OLG) in Germany have supplemented the rulings of the BGH by focusing on additional aspects of data.

In 1995, the OLG Karlsruhe made a landmark decision on the destruction of data, stating that the deletion of data stored on a data carrier may violate the ownership in the data carrier pursuant to Sec 823 para 1 BGB [25]. Thus the decision extended the protection of ownership rights in regard to the data carrier onto the data itself.

A recent judgement by the OLG Naumburg addressed issues regarding the legal authority to read and change data collected in a radar control system [26]. The judgement examined whether the producer of electronics or the owner may use the data generated by such systems with the help of Sec 202a StGB. According to the OLG Naumburg, the data access should belong to the person generating the data.

The review of the available case law shows that the establishment of a veritable and dogmatically reliable concept of an erga omnes right ultimately fails because the provisions put forward to support such a right cover only certain aspects of data, are limited to certain situations or addressees, or may not be transferred to the specific dynamics of data.

1.3.3 Academic Discussion on Data Ownership

There are three main positions among German legal academics: (i) erga omnes rights in data already exist, (ii) erga omnes rights in data do not exist but should be created, and (iii) erga omnes rights in data do not exist and there currently is no need for additional laws.

Approaches to deriving data ownership from already existing principles or provisions in German law are various. They range from granting ownership in the traditional sense to intending to circumvent the necessity of a physical object by classifying data as the fruits (product) of a thing [27]. In the end, these approaches are not convincing, as the German principles of ownership require a corporeal quality.

Other legal scholars argue that German law already acknowledges a right in data because such a right would be a prerequisite for the protection of data under criminal law [28]. Since Sec 203a ff. German Criminal Code (Strafgesetzbuch - StGB) and Sec 303a StGB protect data, it is commonly understood that the legal asset which is protected by these laws is the authority to utilize the data. In conclusion, this authority to utilize data should be seen as an erga omnes right pursuant to Sec 823 para 1 German Civil Code (Bürgerliches Gesetzbuch - BGB).

However, the majority of legal scholars argue that German law currently does not acknowledge a right in data as such [29].

Some of these scholars support the establishment of an erga omnes right to set incentives for the data economy and to create legal certainty (Zech, CR 2015, 137 (144 et al.)). Others regard existing contractual solutions as

23 BGH, judgment dated 1.December 1999 I ZR 49/97 - Marlene Dietrich = GRUR 2000, 709
24 BGH, judgment dated 14. January 1999 I ZR 2/97; judgment dated 27.April 2006 I ZR 126/03.
25 OLG Karlsruhe, judgment dated 7 November 1995 - 3 U 15/95 - Haftung für Zerstörung von Computerdaten = NJW 1996, 200
26 OLG Naumburg, judgment dated 27 August 2014 6 U 3/14 = CR 2015, 83
27 Grosskopf, IPRB 2011, 259
28 Hilgendorf, JuS 1996, 509 (511); Hoeren, MMR 2013, 486
29 E.g., Dorner, CR 2014, 617; Zech, CR 2014, 138 (142); Schefzig (co-author of this study), K&R 2015, Beihefter zu Heft 9, 3 (6); Kraus, TB DSRI 2015, 537; Grützmacher, CR 2016, 485


sufficient to protect data effectively [30]. Therefore, these academics argue that it has not yet been shown that there is indeed an economic necessity to create a right in data. Furthermore, the artificial limitation of data might negatively affect innovation because large data applications in particular depend on large amounts of data.

Data privacy law regards data as a threat, not as an asset. Essentially, data privacy law is a personality right [31], designed to protect the individual from any infringement of their right to privacy resulting from the collection, processing, use and transfer of personal data, cf. Sec 1 para 1 Federal Data Protection Act (Bundesdatenschutzgesetz - BDSG). As a consequence, data privacy law does not protect data as such, but the information contained therein relating in various degrees to an individual.

Data protection laws protect the individual's right of informational self-determination. Therefore, the data itself should be protected by the general right of personality as well. Other scholars simply state that the extensive rights of data subjects regarding their personal data implicitly establish a general right of the data subjects to commercially exploit their data. This theory would lead to an erga omnes right of subjects in their data because the general right of personality is recognized as an erga omnes right. If third parties used personal data without the necessary justification, the data subject would have a claim for damages and basically have similar rights as if it owned such data. But the general right of personality as such cannot be transferred (or at least, only to a limited extent). Therefore, the data subjects could only trade their data to a limited extent.

The majority of legal scholars argue that there is no general right of data subjects in their data. Even though data protection laws grant the data subjects rather extensive rights, data protection law would only be a regulatory instrument of public law, which is supposed to regulate the interaction of data subjects and data controllers, but should not create private, commercially exploitable rights. This finding seems to be supported by the census judgement (Volkszählungsurteil) of the BVerfG, in which the court stated that information, including information on people, is a picture of social reality which cannot be allocated exclusively to the data subject.

Data privacy law defines the responsible body as the Controller and grants the individual extensive rights towards this Controller, including information, erasure, and correction rights. Hence, the BDSG confers on the data subject a position similar to actual ownership. But this position is limited to the specific individual, is directed against the data Controller, and is restricted to personal data. Data as such is not exclusively related to individuals. Technical data stripped of or initially compiled without any connection to identifiable individuals does not fall within the scope of data privacy law to begin with.

Particularly in connection with large amounts of data, Sec 4 para 2 sentence 1 Copyright Act (Gesetz über Urheberrecht und verwandte Schutzrechte - UrhG) comes into focus. Databases structuring data systematically or methodically qualify as personal intellectual property and are therefore protected by copyright law. A high level of creativity (Schöpfungshöhe) is required. In the case of mere data analysis this requirement is not met [32]. Further, Sec 87a to 87e UrhG, which govern the protection and use of databases, might be applicable. The creator of the database is exclusively entitled to reproduce, distribute or publicly display the database. In contrast to the protection of databases pursuant to Sec 4 para 2 sentence 1 UrhG, this protection does not originate from a certain level of creativity, but rather from the economic effort necessary to compile, verify and arrange the data [33].

Both concepts grant protection to the database as a whole but do not create ownership in the data itself.

The patient's right to inspect their medical records originates in their right of self-determination and the patient's personal dignity, as those records affect

30 Dorner, CR 2014, 617; Schefzig, DSRITB 2015, 551; Grützmacher, CR 2016, 485
31 Hoeren, in: Grützmacher, Recht der Daten und Datenbanken im Unternehmen, 1st edition 2014, 23 par. 4
32 Dorner, CR 2014, 617, 621
33 Dorner, CR 2014, 617, 622


them directly in their privacy[34]. In addition, this right of inspection is now legally standardized in the Medical Association's professional code of conduct and in Sec 630g BGB and Sec 810 BGB. First and foremost, this right aims to grant patients a right to inspect their medical records. This includes the right to obtain a transcript of the patient's medical file (Sec 630g para 2 BGB). Sec 630g para 2 also comprises the right to receive an electronic copy of the file, but only if and insofar as the file is compiled electronically[35]. The inspection right is a specific form of the patient's right of information. Therefore, the information has to be readable and uncoded. However, the right of inspection does not establish any right of ownership, nor does it limit the physician's right to process the data lawfully, e.g. to safeguard their own justified interests.

1.3.4 Data Privacy in General

Any collection, processing, and use of personal data is subject to a prohibition with reservation of permission pursuant to Sec 4 para 1 BDSG. Therefore, it is only admissible in cases of legal justification or the data subject's consent. Personal data means any information concerning the personal or material circumstances of an identified or identifiable individual, Sec 3 para 1 BDSG. Health data will often qualify as a special category of personal data pursuant to Sec 3 para 9 BDSG, which refers explicitly to an individual's state of health; therefore the collection, processing, and use of this data is even further restricted and limited in Sec 28 para 6 to para 9 BDSG. In accordance with these provisions, processing of special categories of data is only admissible in certain situations, e.g. if necessary for medical treatment or diagnostics by groups bound to confidentiality, such as physicians. Further processing is admissible if necessary to protect vital interests of the data subject in case the data subject is unable to provide consent, if the data is made public by the data subject, if the data is necessary in relation to legal claims, or for scientific research in observance of a strict principle of proportionality. Outside these limited purposes the controller has to revert to the data subject's consent. This will be the case with most mobile health applications and even with medical products specializing in data analysis.

Only data processing that does not include personal data as such, e.g. mere technical data or data stripped of any personal reference (anonymous data), is excluded from these strict prerequisites, since it is not subject to data privacy laws. As a result, the eligibility criterion for the protection of data under the BDSG correlates with the direct or indirect personal reference to a specific individual[36].

The question of whether data is personal data is subject to a heated debate, mainly because the answer to that question determines whether data privacy law is applicable (for a general overview see Brink/Eckardt, ZD 2015, 205; Bergt, ZD 2015, 365). This is essentially a question regarding the requirements necessary to establish a connection between an individual and the respective data. In other words: when can a person be identified by means of the available data? Most data privacy authorities have adopted an absolute approach: data qualifies as personal data if anyone might hypothetically identify the individual in question. Consequently, almost any data qualifies as personal data. Contrary to this absolute approach is the relative approach presented by most practitioners and moderate scholars: decisive for the qualification as personal is the question of whether the controller can realistically identify the data subject with reasonable effort. Following that approach, the controller's own means of identification determine whether the data in question is personal or not. Neither Directive 95/46/EC nor the recently adopted General Data Protection Regulation offers an unambiguous answer. The currently pending case before the European Court of Justice addresses this particular question and might bring some clarification[37].

34 BVerfG, decision dated 16 September 1998, 1 BvR 1130/98, NJW 1999, 1777
35 Weidenkaff, in: Palandt, 75th edition 2016, Sec 630g par. 4
36 cf. Sec 1 para 1 BDSG
37 European Court of Justice, Breyer v Germany, Case C-582/14


1.3.5 Right to Data Portability

The General Data Protection Regulation (GDPR) introduces the right to data portability, which has no current equivalent in European privacy law. Art. 20 of the GDPR enables data subjects to receive a copy of their personal data currently residing with the controller in a structured, commonly used and machine-readable format, and to have their data transferred directly from one controller to another (Art. 20 para 1 GDPR). This right only applies in limited circumstances, e.g. where the data was obtained directly from the data subject based on consent or on a contractual basis and the processing was carried out by automated means. Initially expected to pave the way for an equivalent to data ownership, the right to data portability is still not suitable to establish such an absolute right. However, it is true that the data subject gains additional disposition rights in regard to their data[38].

At the same time, there are substantial limitations to the right to data portability which separate this right from the concept of ownership. First of all, Art. 20 grants only the right to receive a copy of the respective data and explicitly excludes the data subject's sole disposition over the respective data by detaching the right to data portability from the right to erasure, Art. 20 para 3 sentence 1 GDPR. The controller might still use the data as long as there is a corresponding legal justification. Furthermore, this right refers only to data available at a specific time and, most importantly, solely to the data subject's own data. As most data today relates not to one individual alone, the amount of data the data subject might transfer is severely restricted by the rights of other data subjects[39].

1.3.6 Data Privacy and Data Protection

Data controllers are subject to various data protection requirements. As a central provision for data protection, Sec 9 BDSG and its annex require the implementation of adequate mandatory technical and organizational measures. This includes physical and electronic access control to data processing systems and to the data itself. The legislator does not define what it considers to be adequate measures. However, it can be derived from Sec 9 and the general principles of the BDSG that the requirements of these measures correlate with the sensitivity of the respective data[40]. As health data often qualifies as a special category of data pursuant to Sec 3 para 9 BDSG, this affects the level of data protection.

In July 2015 the new IT Security Act (IT-Sicherheitsgesetz, IT-SiG) came into effect. Operators of critical infrastructure are now obliged to cooperate more closely with the German Federal Office for Information Security (Bundesamt für Sicherheit in der Informationstechnik, BSI). The IT-SiG primarily addresses nuclear plant operators, gas and electricity providers, and telecommunications network operators. Nevertheless, critical infrastructures might also originate in the health sector, as Sec 2 para 10 No 1 BSIG explicitly mentions the health sector as a critical infrastructure. Which parts and players of the health sector are affected by the new requirements is as yet unclear. Specifics are subject to a legislative decree expected for 2017[41]. But even today, many data controllers or providers will be subject to the effects of the IT-SiG, since - almost unnoticed by the public - the changes also address providers of commercial telemedia services such as online shops, search engines, webmail services and websites[42]. The requirements of the IT-SiG were implemented by inserting the new Sec 13 para 7 into the German Telemedia Act (Telemediengesetz, TMG). Providers have to ensure the safety of their systems by means of technical and organisational measures, as far as is technically possible and economically reasonable. This reinforces technical and organisational compliance obligations. Infringements of these compliance obligations are now subject to severe fines and may also trigger warning letters from competitors under competition law.
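
As an illustration of the "structured, commonly used and machine-readable format" that Art. 20 para 1 GDPR calls for (section 1.3.5), the following minimal Python sketch serializes one data subject's records to JSON. The record layout, the field names and the function export_portable_copy are assumptions made for this example; the GDPR does not prescribe a particular format. The sketch also reflects the limitation described in section 1.3.5 that the right covers solely the data subject's own data: records belonging to other persons are left out.

```python
import json
from datetime import date

# Hypothetical in-memory store of a controller; the layout is an assumption.
RECORDS = [
    {"subject_id": "p-001", "recorded_on": date(2016, 11, 1), "type": "pulse", "value": 72},
    {"subject_id": "p-002", "recorded_on": date(2016, 11, 1), "type": "pulse", "value": 80},
    {"subject_id": "p-001", "recorded_on": date(2016, 11, 2), "type": "blood_glucose_mg_dl", "value": 95},
]


def export_portable_copy(records, subject_id):
    """Return a JSON document holding only the requesting subject's own records."""
    own = [
        # Dates are converted to ISO 8601 strings so that the result is plain JSON.
        {**r, "recorded_on": r["recorded_on"].isoformat()}
        for r in records
        if r["subject_id"] == subject_id
    ]
    return json.dumps({"subject_id": subject_id, "records": own}, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(export_portable_copy(RECORDS, "p-001"))
```

JSON is used here only because it is widely supported; CSV or XML would serve the same purpose. The export is a copy: the controller's own store is left unchanged, in line with the separation of portability from erasure.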

38 Jülicher/Röttgen/v. Schönfeld, ZD 2016, 358, 361
39 Kamlah, in: Plath, BDSG, 2nd edition, 2016, Art. 20 DSGVO par. 5
40 Ernestus, in: Simitis, BDSG, 8th edition 2014, Sec 9 par. 27
41 Ortner/Daubenbüchel, in: Medizinprodukte 4.0, NJW 2016, 2918, 2912
42 Störing, in: Von Apps und Atommeilern, c't 2015, 154

1.4 Dedicated Section on Health Apps and Tracking Devices and their Regulation

1.4.1 Market Penetration of Health Apps: An Overview

Nowadays, digital technologies shape the market environment of almost every industry. Huge players as well as young entrepreneurs are developing novel approaches to every kind of daily challenge and unsatisfied customer need. Healthcare, as one of the most attractive markets, is no exception. Digital solutions developed for patient treatment, diagnosis, disease management, communication needs, and patient data exchange, along with numerous other applications, are labeled as eHealth or digital health solutions.

One of the fastest growing segments of eHealth is the mobile health market, which includes all services and applications that may be carried out using mobile devices such as smartphones, phablets, tablets or wearables. According to Statista forecasts, worldwide mobile health revenue is expected to total as much as 23 billion U.S. dollars in 2017, up from 4.5 billion U.S. dollars in 2013[43]. From 2013 to 2020, the compound annual growth rate (CAGR) of the global mobile health market is projected to reach 36%[44]. In comparison, the worldwide market for IoT (Internet of Things) solutions will experience an annual growth of 20% from 2013 to 2020[45].

Health apps represent an opportunity to make mobile health available to patients, healthcare providers, and stakeholders. Among other things, they might be used to identify a patient's disease in its early stages, improve or manage its treatment, increase prevention, encourage patients to adopt a healthier lifestyle, and make them more informed and aware of their own health. There are various applications for health apps. While some of them may be used for wellness purposes, others may be applied for clinical or medical use. The differentiation between wellness and medical apps plays an important role because different regulations are associated with each. Health apps labeled as medical must undergo a certification process initiated by the supplier, which includes a demonstration of clinical efficacy and economic feasibility before reaching the final target population.

With a wide range of content and user interaction, wellness apps are developed to improve or support a healthy lifestyle. They might, for example, track the mobility of their users and provide recommendations such as daily activity targets or nutritional advice.

In contrast, medical apps are applied to prevent or diagnose diseases. Furthermore, they support the treatment and therapy of patients, for example by measuring a patient's vital signs and subsequently sending push alerts to physicians automatically in case of severe results. For the measurement of vital signs, the sensors of smartphones or wearables may be used. Furthermore, additional devices such as electrodes for ECG measurements or blood glucose meters may be connected to enhance the range of applications for health apps. Ongoing technological developments increase the opportunities for healthcare with every passing day.
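
The revenue and growth figures quoted in section 1.4.1 can be related to each other with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The short Python sketch below applies it to the revenue forecast cited there (4.5 billion U.S. dollars in 2013, 23 billion in 2017) and, conversely, projects a value forward at a given rate. The function names are chosen for this example only. Note that the growth rate implied by the 2013 and 2017 revenue figures need not equal the 36% projected for 2013 to 2020, since the two numbers cover different periods and are taken from separate forecasts (footnotes 43 and 44).

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values that lie `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value reached after compounding `rate` once per year for `years` years."""
    return start_value * (1.0 + rate) ** years


if __name__ == "__main__":
    # Growth rate implied by the revenue forecast (billion U.S. dollars).
    implied = cagr(4.5, 23.0, 2017 - 2013)
    print(f"Implied CAGR 2013-2017: {implied:.1%}")

    # Revenue that a constant 36% annual growth from 2013 would yield by 2020.
    print(f"4.5 bn USD at 36% for 7 years: {project(4.5, 0.36, 2020 - 2013):.1f} bn USD")
```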

43 Statista: Forecast of the worldwide mobile health revenue since 2013. Available at: http://www.statista.com/statistics/218843/forecast-of-the-world-
wide-mobile-health-revenue-since-2013/ (last accessed 05/09/2016)
44 Statista: Forecast CAGR of worldwide digital health market by segment. Available at: http://www.statista.com/statistics/387875/forecast-cagr-of-
worldwide-digital-health-market-by-segment/ (last accessed 05/09/2016)
45 IoT: Worldwide regional forecast 2014 2020. Available at: https://www.business.att.com/content/article/IoT-worldwide_regional_2014-2020-fore-
cast.pdf (last accessed 05/09/2016)


Figure 3: Differentiation of eHealth solutions (own illustration)

Consequently, the countless functions of both wellness and medical apps have led to new health apps being registered in the online stores of Apple, Google or Microsoft every day. Their developers aim to either attract a huge number of patients in the second healthcare market, which includes only privately purchased products or services, or to convince health insurance companies and other stakeholders to promote their medical apps for standard care, by promising frequent use of the application by healthcare practitioners or patients. These target consumers represent attractive revenue for developers. To enter standard care - and therefore the first healthcare market - app developers promise huge medical as well as economic benefits for patients, healthcare providers, and payers. Nevertheless, they often lack valid evidence, and as a consequence, the diffusion of medical apps into the first healthcare market remains low, at least in Germany.

A study published by the European Commission in 2014 estimates that by 2017, around 1.7 billion people will use health apps. According to several sources, there are currently more than 160,000 apps available that are tagged as health apps. Within app marketplaces (e.g. Apple's App Store, Google's Play Store) they are usually available in the categories Health and Fitness or Medical[46][47]. The category can be freely chosen by the developer, however. The number of available health apps may be reduced by excluding those that are published in several app marketplaces at the same time, which brings the total down to about 100,000 downloadable health apps[48]. Consequently, approximately every 22nd app is categorized as a health app[49]. However, there are

46 CHARISMHA-Study - Chancen und Risiken von Gesundheits-Apps. Available at: http://www.bmg.bund.de/fileadmin/dateien/Downloads/A/App-Studie/CHARISMHA_gesamt_V.01.3-20160424.pdf (last accessed 06/09/2016)
47 IMS health study: Patient options expand as mobile healthcare apps address wellness and chronic disease treatment needs. Available at: http://www.
imshealth.com/en/about-us/news/ims-health-study:-patient-options-expand-as-mobile-healthcare-apps-address-wellness-and-chronic-disease-treatment-
needs (last accessed 06/09/2016)
48 CHARISMHA-Study - Chancen und Risiken von Gesundheits-Apps. Available at: http://www.bmg.bund.de/fileadmin/dateien/Downloads/A/
App-Studie/CHARISMHA_gesamt_V.01.3-20160424.pdf (last accessed 06/09/2016)
49 Taking into account that there are approximately 2.2 million different apps available


only 8,500 health apps offering content in German[50].

A common segmentation of health apps relates to their individual purpose in healthcare. Accordingly, the following categories are established: monitoring, diagnosis, treatment, health practitioner support, wellness, administration and prevention. According to Statista forecasts, the 2017 market shares of these segments will be as shown in Figure 4[51].

[Figure 4 (chart): Worldwide mobile health market shares in 2017. Legend: Monitoring, Diagnosis, Treatment, Health Practitioner Support, Wellness, Administration, Prevention. Values shown: 0.01, 0.01, 0.03, 0.05, 0.1, 0.15, 0.65]

Figure 4: 2017 Mobile health market shares worldwide by segment depending on expected revenues

Nevertheless, the distinction between wellness and medical health apps depends highly on the intended user group, the app's provided functions, and its offered features. Regarding the segmentation displayed above, health apps that belong to the monitoring segment can be both wellness and medical apps.

The following two examples illustrate a possible use of a wellness and a medical app:

50 Gesundheits-App, Medizin-App, Medizinprodukt? Klassifizierung nach Gesundheitszielen & Nutzerzielgruppen. Available at: http://www.healthon.
de/sites/default/files/uploads/files/wp-content/uploads/2016/03/1603_Gesundheits_Medizin_Apps_Medizinprodukte_Healthon.jpg (last accessed
06/09/2016)
51 Statista: Global mobile health market share forecast by service category. Available at: http://www.statista.com/statistics/219262/global-mo-
bile-health-market-share-forecast-by-service-category/ (last accessed 06/09/2016)


1. Wellness app: Monitoring a user's activity for fitness reasons.
2. Medical app: Tracking a patient's pulse, providing information about the patient's risk of developing a chronic heart disease, and if required, displaying recommendations for subsequent courses of action to treat the condition.

In the case of the second example, such an app needs to be certified as a medical product (especially if it is remotely connected to other medical products such as a blood glucose meter). In addition, there are several risk levels of medical products in Germany, ranging from risk level I (low risk) to level III (high risk). Medical apps are mainly allocated to risk level I. Nevertheless, those apps that allow an active diagnosis or therapy for a certain disease, as well as apps used for birth control, are associated with risk level II[52]. Higher risk levels are also associated with stricter certification requirements. For medical products of risk level I, the developer proves the conformity of the product by providing product documentation and a risk assessment. All medical products that are classified into higher risk levels are assessed more closely and certified by an independent authority.

While certification by the FDA is required for access to the US market, products sold in the European Economic Area require a CE marking[53].

1.4.2 Description of Health App Segments

Prevention apps are applied in order to support primary, secondary or tertiary prevention. While primary prevention is used to minimize the risk of developing a certain disease, secondary and tertiary prevention apps aim to stop or decelerate the further progression of sickness. In the context of primary prevention, health apps should support a healthy lifestyle by providing information about healthy nutrition, mobility, or stress management, for example. Consequently, primary prevention apps are very likely to be categorized as wellness apps. Apps that are used in the context of secondary or tertiary prevention allow their users to document parameters such as blood glucose levels in a type of diary, which is then used to provide information about further activities or to synchronize the information with a physician in order to adjust treatment.

In Germany, prevention apps may represent a cheap opportunity to reduce future costs for patient treatment. Therefore, they pose a great alternative to special tariffs or programs provided by health insurance companies in order to increase a patient's awareness concerning risk factors related to certain diseases. Nevertheless, there has to be proven medical evidence as well as ease of use for both health practitioners and patients in order for the app to be accepted within the market.

Diagnostic apps, by contrast, may be used by health practitioners or patients to diagnose a patient's disease. There are several ways diagnostic apps can support their users. While some of them employ medical dictionaries displayed on a handheld device, providing all necessary symptoms for and possible differentiations between diseases, others use sensors built into smartphones or other mobile devices to measure vital parameters, and can even signal the risk of manifesting a certain disease. This last case especially requires medical evidence and certification as a medical product. Diagnostic apps are often used to detect heart disease or to measure the risk of diabetes and psychological disorders. Another prominent use case is the diagnosis of skin diseases. In contrast to the measurement of vital parameters, in this case patients exchange photographs through digital platforms in order to get an anonymous diagnosis.

At first glance, diagnostic apps offer great potential, especially for health insurance companies, to detect a patient's sickness at an early stage and therefore reduce the higher costs associated with the progression of a disease. Nevertheless, the application of diagnostic apps might increase the usage of healthcare resources by patients, leading not to the desired results of early detection and reduced costs, but to false-positive results (as measured by the diagnostic app and followed

52 Bundesinstitut für Arzneimittel und Medizinprodukte: Orientierungshilfe Medical Apps. Available at: http://www.bfarm.de/DE/Medizinprodukte/
Abgrenzung/medical_apps/_node.html (last accessed 13/09/2016)
53 FDA: US Food and Drug Administration / CE: Conformité Européenne


by unnecessary and costly medical consultations).

Monitoring apps exist in a huge variety. Like diagnostic apps, they often use different sensors built into mobile devices. Furthermore, they may analyze the signals sent from an implanted medical device such as a pacemaker. Monitoring apps often utilize complex algorithms to analyze the captured signals and, if required, generate alerts in case of severe results. Those alerts may be signaled directly to the patient or to healthcare practitioners, who would have to respond promptly if needed and set the course of treatment. Pacemakers, blood glucose meters and insulin pumps are frequently used medical products that are linked to monitoring apps. Needless to say, those apps must meet strict certification requirements.

By contrast, the sensors embedded in smartphones or attachable wearables are often used for wellness purposes. A major application field is fitness, in which live data is used to track the performance of a user and to prepare detailed analyses of sport activities. In addition, monitoring apps provide information that helps users to improve their performance in sport activities.

Chronic diseases represent an important field where treatment apps offer many benefits for patients, physicians, and payers. They empower patients to manage their diseases more effectively. A prominent application is the management of drug intake - for example, in chronic, multimorbid patients who have to take many different drugs. For this purpose, the app coordinates and schedules the proper intake of the medication at the right time, providing patients with more security with regard to their treatment.

Finally, health apps may also support the daily business of healthcare practitioners as well as healthcare administration. They can do so, for example, by offering information about the reimbursement of treatment services and medications, or by offering a platform to rate physicians, to make appointments, or to enable data exchange between healthcare practitioners.

Consequently, there are various case scenarios and segments for health apps. In many cases they fit not only one segment but offer features that support various needs. For the existing and upcoming technological developments in this type of app, there is still a lack of clear and understandable guidelines for developers. For example, they should know in which segment to best market their app, and which further certifications are necessary to publish and promote it to patients, healthcare practitioners and payers.

1.4.3 Diffusion of Health Apps in Germany

Although there are about 100,000 health apps available in the relevant app marketplaces, only a small number of suppliers have established a large customer base. Health insurance companies are one conduit to a vast number of potential customers. They may act as distribution partners of health app suppliers for various reasons.

First, their business model and purpose offer direct access to a huge number of healthcare practitioners and patients, who may both be users of health apps. Consequently, health insurance companies may push emerging health apps into the market by using their communication channels. Second, health insurance companies may enlarge the legal framework for reimbursement of healthcare services. Typically, healthcare practitioners may invoice only those healthcare services that are included in a Germany-wide health service catalog, which is defined by the Gemeinsamer Bundesausschuss (G-BA)[54]. The extension of this reimbursement catalog is quite difficult. The G-BA adds only those services and treatments that have been proven to be clinically and economically effective. Obviously, due to financial constraints, healthcare practitioners only offer treatments to their patients that can be invoiced afterwards.

Since patients' price sensitivity for healthcare services is high, free use for patients - and therefore reimbursement by health insurance companies - might be one of the most crucial criteria for establishing a huge customer base. Further studies comparing customer needs and related price sensitivity seem useful. Nevertheless, health

54 Gemeinsamer Bundesausschuss (G-BA): The G-BA is the Federal Joint Committee, which represents all German health insurance companies as well as healthcare practitioners.


insurance companies may set up contracts that enlarge the range of treatments that can be offered to patients and define additional reimbursement opportunities. Therefore, those contracts are effective instruments for introducing innovative eHealth solutions such as health apps to the first healthcare market.

Finally, health insurance companies have additional but rather small budgets for offering new services and products to their insured customers. Unfortunately, those budgets are often exhausted by already existing services. Furthermore, they are tied to very specific and unique use cases.

Consequently, health insurance companies represent an essential point of entry into the first healthcare market in Germany, which is associated with a large customer base and recurring revenues. Nevertheless, in order to act as distribution partners, health insurance companies need to be convinced of the business model of each respective health app. Therefore, the following requirements are crucial:

1. Clear settings and associated certifications
2. Evidence of clinical efficacy and economic feasibility
3. Data security compliance

1.4.4 Clear Settings and Associated Certifications

Contracts to extend the catalog of healthcare services and treatments are subject to special legal constraints concerning the service that may be provided as well as the parties that can engage in such contracts. In order to initiate such a contract, health insurance companies would expect a clear presentation showing in which setting a potential health app would be used. That is why health app suppliers should include intended patient journeys, associated healthcare practitioners, and the applications of the health app. Does the patient use the health app themselves or does a healthcare practitioner apply it for the treatment of patients? Will the health app function as a substitute for an existing service? Is it used within a hospital? What risks does it pose for patients or healthcare practitioners? Based on the answers to those questions, the health app may be categorized into a health app segment (wellness app, medical app) and risk class (none, I, IIa, IIb, III) with associated certification requirements, as referenced earlier in this chapter. Those requirements must be fulfilled in order to be part of a new contract set up by a health insurance company.

1.4.5 Evidence of Clinical Efficacy and Economic Feasibility

Health app suppliers need to prove the clinical efficacy and economic feasibility of their products. In order to provide medical evidence, a comprehensive systematic review of literature should present all epidemiological studies, both experimental and observational, performed on their health technology. While medical evidence is often provided by health app suppliers, those suppliers often fail to present business cases to health insurance companies. Although economic feasibility does not have a direct influence on a patient's health, it is certainly as important for health insurance companies as medical feasibility. Therefore, economic evaluations that are focused on the perspective of a health insurance company are necessary. Usually, the intended outcome of such evaluations is either the demonstration of cost reduction compared to standard care, or the extent (units) of incremental efficacy that justifies a higher price. Correct economic evaluations must contain long-term analyses concerning a patient's demand for healthcare services (e.g. inpatient and outpatient treatment, drugs) and other associated costs that would need to be reimbursed by health insurance companies. Furthermore, the potential for false-positive or false-negative results, and the app maker's plan for dealing with such results, should be considered. Therefore, a full data analysis must be performed.

Consequently, in order to enter the first healthcare market in Germany, a health app provider needs to show either that its product offers a cost reduction with better or at least the same health outcomes compared to standard care, or that the incremental efficacy justifies the higher price of the product (for these cases some limitations apply).
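
The two acceptable outcomes of an economic evaluation described in section 1.4.5 can be sketched in a few lines of Python. The example compares an app with standard care from the payer's perspective and reports either a cost reduction with at least equal outcomes, or the incremental cost per unit of incremental efficacy (commonly called the incremental cost-effectiveness ratio, ICER), which a payer would then weigh against the price it is willing to accept. The Option type, the function evaluate and all numbers are illustrative assumptions, not figures or methods taken from this whitepaper.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    cost_per_patient: float    # long-term cost reimbursed by the insurer (EUR)
    effect_per_patient: float  # health outcome in arbitrary units (higher is better)


def evaluate(app: Option, standard: Option) -> str:
    """Classify an app against standard care by incremental cost and effect."""
    d_cost = app.cost_per_patient - standard.cost_per_patient
    d_effect = app.effect_per_patient - standard.effect_per_patient

    if d_cost <= 0 and d_effect >= 0:
        # Case 1: not more expensive and at least the same outcome.
        return f"cost reduction of {-d_cost:.2f} EUR per patient with at least equal outcomes"
    if d_cost > 0 and d_effect > 0:
        # Case 2: higher price that has to be justified by incremental efficacy.
        icer = d_cost / d_effect
        return f"higher price: {icer:.2f} EUR per additional unit of effect"
    if d_cost > 0:
        return "more expensive without better outcomes"
    return "cheaper but with worse outcomes"


if __name__ == "__main__":
    standard = Option("standard care", 1000.0, 10.0)
    print(evaluate(Option("app A", 900.0, 10.0), standard))
    print(evaluate(Option("app B", 1500.0, 12.0), standard))
```

A real evaluation would derive the cost and effect inputs from the long-term analyses mentioned above, including the follow-up costs of false-positive and false-negative results.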


1.4.6 Data Security Compliance

Health apps may collect a huge amount and variety of data about their users. Although such data offer huge potential for improvements in healthcare, they also pose risks of misuse. For example, app users might be concerned that their data is being transferred to third parties. For this reason, data security is one of the key requirements for attracting customers and avoiding legal or ethical consequences. Legal frameworks for data security arise from the Data Protection Directive of the European Union (Directive 95/46/EC). In Germany, further legislation is in effect (BDSG, SGB X, TMG, TKG). Nevertheless, the legal framework depends heavily on the origin of the health app supplier. Consequently, further legislation may take precedence[55]. In order to avoid misuse of patient data, relevant measures include the anonymization of information as well as a sophisticated authorization for the use of a health app[56]. Customers may identify trusted health apps based on the existence of a data privacy statement, the access rights of an app, and whether it is up to date. Additional confidence may be assured with quality labels and certifications[55].

1.4.7 Conclusion

There are about 100,000 different health apps available in app marketplaces. Only a few of them are used by a large customer base. Health apps may be categorized as wellness or medical apps depending on their intended purpose in healthcare. They might support prevention, diagnosis or treatment of diseases, healthy lifestyle choices, or the administrative needs of healthcare practitioners. Especially those apps intended for diagnostics or those that are linked to medical products are categorized as medical apps, which require certain certifications as medical products.

Special contracts with health insurance companies represent an opportunity to acquire a large customer base and increased revenues. Therefore, health app suppliers need to demonstrate a clear understanding of the setting as well as the related legal frameworks which are relevant for their apps. Furthermore, they should provide valid evidence of an app's medical and economic benefit as well as its compliance with data security requirements.

Health apps offer great potential to make healthcare more effective and efficient. Nevertheless, at least in Germany, most of the currently available health apps do not seem to offer sufficient value to decrease the price sensitivity of patients. Furthermore, clearly understandable guidelines about reimbursement possibilities and associated legal requirements, as well as more medical and economic evidence, could support and hasten the diffusion of health apps into the first healthcare market in Germany.

55 CHARISMHA-Study - Chancen und Risiken von Gesundheits-Apps. Available at: http://www.bmg.bund.de/fileadmin/dateien/Downloads/A/App-Studie/CHARISMHA_gesamt_V.01.3-20160424.pdf (last accessed 06/09/2016)
56 Green Paper on mobile health (mHealth) of the European Commission (2014). Available at: https://ec.europa.eu/digital-single-market/en/news/green-paper-mobile-health-mhealth (last accessed 13/10/2016)


1.5 The Digitally Embedded Patient: How Does the Patient of the Future Behave in Comparison to the Current Patient?

1.5.1 The Changing Roles of Doctors and Patients

Traditionally, the patient role has been rather passive [57] and the physician-patient relationship was predominantly paternalistic. It has been generally accepted that physicians have the knowledge and medical skills to decide what treatment is best for the patient [58]. Lay people and patients have had sparse access to sources of health information outside of consultations with physicians and/or pharmacists. Moreover, the possibilities of interacting with other patients by sharing personal experiences and treatment options have been limited to local patient support groups [59].

In the past decades, a growing demand for a more substantial involvement of patients in their own health care has evolved on the part of the population, health care providers, and politicians. The former imbalance of power and information due to clinicians' wealth of knowledge and the social gap between professionals and their patients [58] is decreasing. Furthermore, physicians and pharmacists are no longer the prime counterpart for health-related issues [60]. These developments support the change from physician- to patient-centricity in medical care. To foster patient engagement and accomplish the sharing of decision making and responsibility, it is necessary to provide patients with adequate health literacy, support, and guidance, which is particularly enabled by the use of digital solutions [61].

1.5.2 Digital Levers to Engage Patients in Health Care Processes

Newly developed information and communication technologies, along with mobile digital devices, have gained an increasing influence in all areas of life [62]. This progressive digitalization of everyday life, including its implementation in the healthcare system, leads to an essential change in patients' roles and the physician-patient relationship [61].

The active engagement of patients in different health care processes is a core attribute of this patient-centricity. Innovative digital services mainly target:

• patient deliberation and support prior to medical encounters and/or during treatment
• increasing patients' responsibility in managing their health, symptoms, treatment, and personal health data and/or

57 Lupton, D. (1997). Consumerism, reflexivity and the medical encounter. Social Science & Medicine. Retrieved from http://www.sciencedirect.com/science/article/pii/S027795369600353X
58 Brody, D. S. (1980). The patient's role in clinical decision-making. Annals of Internal Medicine. http://doi.org/10.1059/0003-4819-93-5-718
59 Lupton, D. (2013). The digitally engaged patient: Self-monitoring and self-care in the digital health era. Social Theory & Health, 11(3), 256–270. http://doi.org/10.1057/sth.2013.10
60 Kulzer, B. (2015). Arzt-Patienten-Beziehung: Im digitalen Zeitalter grundlegend verändert. Dtsch Arztebl International, 112(43), [20]. Retrieved from http://www.aerzteblatt.de/int/article.asp?id=172722
61 Weinhold, I., Gastaldi, L., & Häckl, D. (2015). Challenges and Opportunities in Health Care Management. Challenges and Opportunities in Health Care Management, 307–318. http://doi.org/10.1007/978-3-319-12178-9
62 Fischer, S., & Soyez, K. (2015). Trick or Treat: Assessing Health 2.0 and Its Prospects for Patients, Providers and Society. In Challenges and Opportunities in Health Care Management (pp. 197–208). Cham: Springer International Publishing. http://doi.org/10.1007/978-3-319-12178-9_16


• facilitating patients' access to and interactions within the health care system [61]

Apart from their intrinsic purposes, digital solutions support different health-care-related activities, namely knowledge management, data management, and organization and time management [61]. The following subsections provide a short summary of different solutions and respective popular examples.

1.5.3 Digital Patient Deliberation and Support

The internet in particular offers an enormous amount of medical, healthcare, and treatment information, which is easily accessible for everyone at any time and provides great possibilities for patients to gain health-related knowledge [62]. Traditionally, digital content is offered unidirectionally by many stakeholders in the health system. The majority of these services are related to health education and to increasing patients' health literacy, i.e. the skills in understanding and applying information about health issues [63]. Health care providers as well as health insurers and federal institutions actively disseminate information and advice on healthy living and disease prevention via different web channels, e.g., their homepages or digital newsletters.

At the same time, patients actively search for and acquire relevant information via Web technologies such as search engines, social media, online support groups etc., becoming more empowered and confident in actively handling their health [64]. Prior to the medical encounter, many patients inform themselves about symptoms, potential diagnoses, and treatment options [65], supported by online health information providers and services such as the WebMD Symptom Checker. Decision aid services such as Option Grids (see www.optiongrid.org) summarize and explain treatment alternatives for patients with preference-sensitive conditions such as breast or prostate cancer and support them in revealing and weighing their preferences. To find and choose a health care provider, many people seek information and advice by visiting review and rating sites. By way of illustration, researchers estimated that in the US about 16% of approx. 700,000 physicians had been rated by patients via RateMDs (see www.ratemds.com) by 2010 [66].

Many of these more recent sources of health-related information have in common that users or patients actively contribute to the production of information and content. So-called Web 2.0 technologies facilitate participation, social networking and interaction within and between different stakeholders such as healthcare consumers, patients, health care providers, researchers etc. [67] Patients exchange their knowledge as well as their experience of symptoms, diagnoses, treatment, side effects, and health care providers in disease-specific or general social networks and online communities, forums and blogs, either mediated by medical professionals, in health chats, or among peers. This exchange among patients in particular often offers access to alternative therapy options and can influence patients' preferences in terms of treatment. The social networking platform PatientsLikeMe (see www.patientslikeme.com) is a successful example of this development.

1.5.4 Digital Solutions to Increase Patients' Self-Responsibility in Managing their Health, Symptoms, Treatment, and Health Data

In recent years, a huge variety of digital services for monitoring and managing health data has emerged, allowing patients more control over their own health. The range of application areas seems to be almost

63 Ishikawa, H., & Kiuchi, T. (2010). Health literacy and health communication. BioPsychoSocial Medicine, 4, 18. http://doi.org/10.1186/1751-0759-4-18
64 Lo, B., & Parham, L. (2010). The Impact of Web 2.0 on the Relationship.
65 Kivits, J. (2006). Informed patients and the internet: a mediated context for consultations with health professionals. Journal of Health Psychology, 11(2), 269–282. http://doi.org/10.1177/1359105306061186
66 Emmert, M., Sander, U., & Pisch, F. (2013). Eight questions about physician-rating websites: a systematic review. Journal of Medical Internet Research, 15(2), e24. http://doi.org/10.2196/jmir.2360
67 Van De Belt, T. H., Engelen, L. J. L. P. G., Berben, S. A. A., & Schoonhoven, L. (2010). Definition of health 2.0 and medicine 2.0: A systematic review. Journal of Medical Internet Research, 12(2), 1–14. http://doi.org/10.2196/jmir.1350


unlimited, covering the use of mobile devices and wearables or implanted monitoring sensors [61]. Patients are actively involved in managing their health by monitoring bodily functions and activities and recording medical data using apps and wearable devices. Most medical applications are developed and implemented for chronic conditions such as cardiac diseases, asthma, diabetes, and hypertension [68]. Depending on application and indication, the collected data is either interpreted by patients themselves or jointly evaluated in the medical encounter. The data is transferred to health care professionals (either automatically or by the patient), who receive a notification if critical values are exceeded. Examples are numerous, ranging from fitness apps and diet trackers for practically everybody to disease-specific applications such as digital diaries or prescription reminders; see Silva, Rodrigues, de la Torre Díez, López-Coronado, & Saleem, 2015 [69] for a list of the top applications in 2015.

An associated pillar of digitally supported self-management covers the field of health data management. While the traditional electronic health record is primarily used by health care providers in order to digitally process and save patient data, more recent solutions such as web-based personal health records, for example Microsoft HealthVault, can be used by patients themselves to access, store, manage, retrieve and exchange health data with their doctors and other relevant stakeholders within health care systems [62, 70].

1.5.5 Digital Solutions to Facilitate Patients' Interactions with the Healthcare System

Especially in larger health organizations like hospitals, progressive strategies focus on digital solutions that support interactions within the healthcare system, mainly targeting the efficient use of health care providers' resources [71]. In many organizations, the digitalization process started with preliminary investments in digital infrastructures [72], which laid the foundation for a more comprehensive digital internal integration [73]. Only recently have hospitals begun to plan or implement digital services for external integration [71], i.e. with direct applications for patients and options to interact. Digital service delivery, online booking systems for hospital visits, online payment for health care services, SMS reminders, or solutions providing information on clinical exams and on waiting times will increasingly support patients in their care coordination and increase the efficiency and effectiveness of their interactions within the health care system [71]. For example, ZocDoc is a popular digital service that helps to find health care providers within zip code areas and make an appointment online (see www.zocdoc.com).

ICT-supported treatment and counseling services are important facilitators of access to specialized health care, which often falls short outside major population centers. By now, remote medical consultation is available in various fields, for example in mental health or cardiac care. Routine visits or counseling support can be conducted by videoconference with health professionals or via messaging (e.g. using email or online chats) [71]. Online services such as DrEd (see www.dred.com) offer online access to qualified health care providers and prescriptions for selected treatments for almost 2 million [74] patients in the UK, Ireland, Germany, Austria, and Switzerland.

68 Wootton, R. (2012). Twenty years of telemedicine in chronic disease management - an evidence synthesis. Journal of Telemedicine and Telecare, 18(4), 211–220. http://doi.org/10.1258/jtt.2012.120219
69 Silva, B. M. C., Rodrigues, J. J. P. C., de la Torre Díez, I., López-Coronado, M., & Saleem, K. (2015). Mobile-health: A review of current state in 2015. Journal of Biomedical Informatics, 56, 265–272. http://doi.org/10.1016/j.jbi.2015.06.003
70 Steinbrook, R. (2008). Personally controlled online health data - The next big thing in medical care? New England Journal of Medicine, 358(16), 1653–1656. http://doi.org/10.1056/NEJMp0801736
71 Gastaldi, L., & Corso, M. (2012). Smart Healthcare Digitalization: Using ICT to Effectively Balance Exploration and Exploitation Within Hospitals. International Journal of Engineering Business Management, 1. http://doi.org/10.5772/51643
72 For example, solutions for the dematerialization of clinical documents
73 For example, EMRs or logistic process management solutions
74 According to the company's website information


1.5.6 The Current Digital Patient: Usage and Usage Barriers to Innovative Services

For years, digitalization in health care has focused mainly on processes and less on patient needs. An understanding of those needs, however, should be the basis on which digital products and services are built [75]. The need for and ability to process health-related information differ depending on sociodemographic characteristics, health status [76], and the urgency of a health problem. Moreover, patients' desire to engage in health care processes and to use supportive digital technology is moderated by personal and demographic characteristics: a large European study found that the overall use of digital health services is more frequent among the younger and more highly educated populations, among students and the employed, in urban areas, and among people who are in bad health or have chronic conditions [77, 78].

In general, the majority of patients, i.e. 75% as reported in the McKinsey Digital Patient Survey 2014 [79], already used digital health services, mainly websites, online portals, or email interactions. Smartphone apps were used only occasionally by about one third of the respondents, and social media hardly played a role in the German and UK samples [75]. Italian surveys found similar trends: collaborative solutions such as health chats or social media are only valued by the younger age groups [80]. Considering the services for health and data management and interaction, a different picture emerges: considerably more than half of the European respondents have never bought medicine or vitamins online, participated in health-related social networks or support groups, or disclosed medical information on websites. More than 75% have never participated in online consultations, never uploaded medical results to online repositories, never used health or wellness apps and never transferred vital signs or any medical data anywhere [78]. For about 50% of respondents in the European study, the main barriers to digital health service utilization are privacy and security. People further pointed to concerns with respect to the reliability of and trust in the services. About one third mentioned shortcomings in terms of liability, health literacy, access to services, and digital skills. Older people, women, and those with lower education are especially sensitive to the usage barriers [78].

It is, however, not accurate to conclude that people in general and older generations in particular do not want to use digital services; it seems to be the type of service, not the channel, that matters most for them [75]. Almost 40% of the respondents in the European study used digital services to search for health- or disease-related information [75]. Among the digital services presented in the previous sections, those that facilitate access and interaction within the system are perceived as most relevant to patients in general [80]. Quite in line with these results, gains in efficiency thanks to eliminating processes or receiving services online and the increased awareness of online services were found to be the top drivers of digital service adoption [75].

In the course of increasing access to digital technologies, there is a shift of responsibility towards patients regarding their health. On the one hand, patients' extended information-seeking can lead to a better understanding of the diagnosis, the therapy recommendations, or the mode of action of drugs, and can eventually help patients to accept more responsibility for their illness and become actively engaged in their health management [60]. On the

75 Biesdorf, S., & Niedermann, F. (2014). Healthcare's digital future. Retrieved from http://www.mckinsey.com/industries/healthcare-systems-and-services/our-insights/healthcares-digital-future
76 Sørensen, K., Pelikan, J. M., Röthlin, F., Ganahl, K., Slonska, Z., Doyle, G., ... Brand, H. (2015). Health literacy in Europe: Comparative results of the European health literacy survey (HLS-EU). European Journal of Public Health, 25(6), 1053–1058. http://doi.org/10.1093/eurpub/ckv043
77 Citizens of 14 European countries who have used the Internet in the last three months, stratified by country, gender and age group; n = 14,000
78 Lupiáñez, F., Maghiros, I., & Abadie, F. (2013). Citizens and ICT for Health in 14 European Countries: Results from an Online Panel. In European Commission / Joint Research Centre: Scientific and Policy Reports (pp. 1–166). http://doi.org/10.2791/84062
79 The survey was conducted in Germany, the UK and Singapore; the sample size was >1000 respondents of different age groups, genders, incomes, and levels of digital familiarity
80 Observatories ICT in Health. (2013). Digital Doesn't Have to Be Left Only on the Agenda. School of Management of Politecnico di Milano, www.osservatori.net, in Italian.


other hand, being left on their own to choose and assess the content of web-based health information, patients have difficulties in differentiating between information that is based on scientific evidence and incomplete, false or even manipulative content [60]. This can lead to misinterpretations and insecurity [81]. Furthermore, patients already suffering from health anxiety are at risk of intensifying their affliction by searching for symptoms of unlikely diseases that they mistakenly believe they have [60].

From the health professional's perspective, the medical encounter with the well-informed patient is an ambivalent one. Admittedly, engaged and empowered patients are valued, particularly if they suffer from complex symptoms, but the medical encounter sometimes becomes more exhausting for physicians [81]. Misinformed patients may require additional time due to longer discussions, for example. Some physicians even feel as if they are being examined and take personal offense at interactions with suspicious patients, while others regard pre-informed patients as a positive challenge [81]. Similar results were reported in a survey of 491 family and consulting physicians; about half of the participants considered medical encounters with pre-informed patients aggravating, mainly due to an increased time requirement [82].

Patients feel highly vulnerable when they need health care [57] and during times of illness. Being hospitalized or treated is associated with feelings of fear and lack of privacy or lack of control [83]. Under many circumstances, people who are incapable of acting as sovereign patients will hardly be able to question medical authorities, or may not wish to do so. Agreeing to self-surveillance and self-care can be overwhelming, and some patients do not want to be faced constantly with their disease [57] or turn their homes into clinics [84]. Reactions to self-monitoring one's vital functions or clinical parameters can be ambivalent: patients with type 2 diabetes or COPD, for example, reported feeling more secure and encouraged if their clinical parameters were acceptable, but felt ashamed, anxious, helpless and frustrated if they were not [85].

Understanding what patients want from digital healthcare as well as their fears and concerns is a prerequisite to the successful diffusion of digital services. Healthcare organizations and digital service providers are advised to implement and add new services step by step and in accordance with their customers' real needs, in order to keep patients' attention, avoid overburdening them, and ensure that they gradually become accustomed to digitalization.

81 Baumgart, J. (2010). Ärzte und informierte Patienten: Ambivalentes Verhältnis. Dtsch Arztebl International, 107(51–52), A-2554-A-2556. Retrieved from http://www.aerzteblatt.de/int/article.asp?id=79862
82 Gerlof, H. (2014). Sand im Praxisgetriebe durch das Internet. CardioVasc, 14(2), 34–35. http://doi.org/10.1007/s15027-014-0353-6
83 Berry, L. L., Bendapudi, N., & Berry, L. L. (2007). A Fertile Field for Service Research. http://doi.org/10.1177/1094670507306682
84 Oudshoorn, N. (2011). Telecare Technologies and the Transformation of Healthcare. London: Palgrave Macmillan UK. http://doi.org/10.1057/9780230348967
85 Hortensius, J., Kars, M. C., Wierenga, W. S., Kleefstra, N., Bilo, H. J., van der Bijl, J. J., Rutten, G. (2012). Perspectives of patients with type 1 or insulin-treated type 2 diabetes on self-monitoring of blood glucose: a qualitative study. BMC Public Health, 12(1), 167. http://doi.org/10.1186/1471-2458-12-167

1.6 Future Developments in Digital Health

Digital health is still subject to a number of misunderstandings. To physicians, digitalization might mean sending patients' reports via email rather than using a fax machine. To sick funds and health insurers, it might mean quicker reimbursements or data mining, and to the industry it might mean including wearable-generated data in phase III or phase IV studies, hoping to provide the right levels of evidence needed to satisfy the authorities' reimbursement requirements. The truth is that digitalization is none of this and all of it.

Digital health will live up to its full potential when it brings all the different stakeholders and their individually suited solutions together into one grid of data exchange, which we commonly refer to as the Internet of Healthy Things (IOHT). With legislators being ever clearer on the rights of patients to exercise ownership over their own data, as described in this chapter, and predictive analytics stepping down from hype status to usable technology, we will see the rise of business models that combine these sets. This means we will be looking at healthcare data brokers and data sharing cooperations that will act as trust centers or intermediaries in the sale of data to research organizations or industry with the data owner's full consent and a price attached.

1.6.1 Outlook on the Pharmaceutical Industry with an Emphasis on Generics and Patent-Protected Products

To the pharmaceutical industry, digitalization is friend and foe alike. Common business models are being put in jeopardy through the employment of data-based solutions such as prescription overrulings according to the contract status of the individual patient. Physicians are already guided by software as to what medication they should be prescribing to which patient. This changes sales models because it reduces the importance of the prescriber when decisions are taken centrally rather than at the doctor's desk.

Pharmaceutical companies are also looking for technical support and digital solutions to integrate parts of the IOHT into their phase III and IV studies. The challenge is to achieve evidence-based medicine status with these technical devices in order to be able to integrate the data harvested there into randomized controlled trials or studies that bear significance for regulatory and reimbursement decisions.

The generic drug industry (Gx) in particular is experiencing more and more pressure, with prices declining and competition on the rise. These medications have become commodities, and new KPIs for measuring success not only need to come in but need to bridge the way to an entirely new business model. This will consist of technological availability, patient access, and patient commodity, which poses the challenge to these companies to invest in new technology in markets that do not necessarily warrant economic risk-taking. We are looking at a person-centric therapeutic approach rather than maintaining the fragmented provision of medicines by individual companies, dispensed by the physician and paid for by sick funds or insurance companies.

Digitalization in healthcare opens the door to new business models that focus on the provision of care to communities or individuals with a technological follow-up that will monitor health-related behavior of individuals and generate longitudinal data on how medicines actually work. In the case of polypharmacy, this data will also include how medicines interact with one another.

But it would be short-sighted for package sales not


to adopt new technologies in old and financially unattractive markets. First of all, this technology and overall strategic approach can be used on other compounds, but also in other markets. Second, new markets can be developed, taking into account that old players also need to move. The current ruling of the European Court on fixed pharmacy prices for medication is just the beginning of achieving full market transparency. This will bring retail pharmacists to the negotiating table with sick funds. We will also see the rise of mail-order pharmacies and, for some products, ferocious price competition. The losers will be the wholesalers, who will need to give their margin to the pharmacists. This might result in the withdrawal of wholesalers from the market and might force digitally illiterate patients or those with emergency prescriptions to look for other options. Mail-order pharmacies with wholesaler licenses will be the winners in this model, but none of them will be able to do without digital solutions for logistics and patient administration.

In order to fully suit patients, the Gx industry will need to adopt a patient-centric approach even more than the Rx industry. Since major Gx manufacturers have a plethora of molecules on their shelves, they will easily be able to supply 80% of a chronic patient's medication. It is the proverbial last mile that needs to be on these suppliers' minds; literally, the space between the blister and the patient's mouth. Smart and easy-to-handle pill dispensing devices that do not require repackaging or reblistering (considered value chain obstacles) are in short supply. This is because pharmaceutical companies usually consider one of the patient's conditions a priority for which they have a fix; but the truth is that most chronic patients have diverse co-morbidities that need treating as well. So again we see inverse Taylorism, where the agent needs to adapt to the producer and not vice versa. This will be a sea change that the industry needs to undergo, from package provider to full care provider. We will see this happen in the pharmacies of Europe in the next couple of years: those pharmacies will not survive unless they make personal special care for patients their strategic center and legacy as the main KPI that distinguishes them from cheaper mail-order pharmacies. Those who will not negotiate with sick funds over individual deals for patients, or who do not have the size to do so, will have a hard time in a market that is mainly about saving money.

1.6.2 The Issue of Data Ownership: What Kind of New Business Models Will We See in the Future? What are the Legal Issues?

One of the first questions to tackle when constructing a data system is the central issue of who owns the data itself. This question has recently become much more important as the collection and aggregation of data about individuals has increased drastically [86].

As recently as 2016, companies have been in the news for collecting data that they probably shouldn't [87], aggregating data that was previously unconnected [88] and for using data to make individuals more transparent [89]. All of these issues have only become possible because it has become feasible to actually collect and collate these data about individuals, while the legal framework that should govern collection and use of personal data has remained unchanged.

For medical data, that framework is pretty clear and uncompromising: every person has the right to decide who can collect, use and even know of the existence of

86 Ackerman, Spencer, and MacAskill, Ewen. "Privacy experts fear Donald Trump running global surveillance network", The Guardian, 11 November 2016, https://www.theguardian.com/world/2016/nov/11/trump-surveillance-network-nsa-privacy
87 Mastroianni, Brian. "Pokemon Go is catching personal data from your smartphone", CbsNews.com, 11 July 2016, http://www.cbsnews.com/news/pokemon-go-is-catching-personal-data-from-your-smartphone/
88 Martin, Taylor. "How to stop WhatsApp from sharing your data with Facebook", CNet, 26 August 2016, https://www.cnet.com/how-to/how-to-stop-whatsapp-from-sharing-your-information-with-facebook/
89 Solon, Olivia. "Google's ad tracking is as creepy as Facebook's. Here's how to disable it", The Guardian, 21 October 2016, https://www.theguardian.com/technology/2016/oct/21/how-to-disable-google-ad-tracking-gmail-youtube-browser-history


data about them90. Naturally, there are situations where consent is assumed, but even these specific
exceptions are strictly regulated91. As such, from a legal standpoint, it is clear that data can only
be collected with explicit and specific consent by the respective person92.

With these provisions, Europe and specifically Germany have one of the strongest individual protection
frameworks in the world. In fact, it has already become something of a trope to mock German data
protectionism93. While this makes some business models harder or impossible, it is a good thing, both
for individuals and businesses. It is good for individuals because they retain sovereignty over their
own data, which is an important personal right. It is, however, also good for business because all
data carries an inherent risk with it, a fact that is often overlooked by collecting companies. All
data is, sometimes quite literally, also a liability, especially so in case of loss, misuse, or
breaches. While this may sound like science fiction, identity thieves are already stealing real
property using data collected from multiple sources94. It is unclear which parties are liable in each
scenario of data loss or breach. Loss and misuse of data can also carry severe risks in regard to
public relations, as the image of a company that is careless with users' personal data can be
damaged seriously.

In any case, protection of data against unauthorized access on a technical level is paramount.

However, voices from the industry are now calling for changes in data protection laws, even going so
far as to ponder whether data protection should be abolished completely95, even if at this stage only
for pseudonymous data. Naturally, there is a heated and ongoing discussion about this96.

So, at this point the situation is both clear and unclear. From a legal perspective, all data about a
person belongs to that person; copies must be made available to the owner and any use can be reviewed
and vetoed by that person. What is unclear is how the legal situation will change in the future and
how individuals will use their rights to control their own data.

Still, there are plenty of business models that are viable under these very restrictive laws, or even
because of them. One obvious business angle could be to assist the user in managing their own data
and to act as a trusted informant to the user. In a wider sense, this is the strategic angle for
companies like Facebook, which help the user in managing their personal data in exchange for access
to that data and thereby become a gatekeeper to the user's data.

Users can also usually be persuaded to give up their rights voluntarily in exchange for other
benefits, such as tracking and aggregation or entertainment. Fitness trackers, which will be
discussed in depth in a later chapter, are a prime example of the benefit for users (being able to
review and track their own fitness progress) apparently outweighing the risk of external surveillance
and data collection. This data, while not directly medical, is at least medically relevant; in fact,
German insurance companies have begun offering to incorporate fitness data into their insurance
plans, offering rebates to active people97. At the same time,

90 https://www.gesetze-im-internet.de/sgb_1/__35.html
91 https://www.gesetze-im-internet.de/sgb_10/__67.html
92 Note that this does not only apply to medical data but to all personally identifiable data [REF: datenschutzgesetz]. There are specific laws for medical
data because of their sensitive nature as well as the long-standing practice of actually keeping records about patients.
93 Hucal, Sarah. Germany's Cryptic Debate on Data and Privacy, US News, 5 April 2016, http://www.usnews.com/news/best-countries/articles/2016-04-05/germanys-cryptic-debate-on-data-and-privacy
94 Remo, Jessica. Trio stole $1M in identity theft, mortgage fraud scheme, AG says, nj.com, 1 September 2016. http://www.nj.com/union/index.ssf/2016/09/trio_stole_1m_in_identity_theft_mortgage_fraud_sch.html
95 Etgeton, Stefan. Wem gehören die Daten? Digitale Patientensouveränität im Stadium der Morgenröte, Der Digitale Patient (blog), 27 July 2016. http://blog.der-digitale-patient.de/digitale-patientensouveraenitaet/
96 Wem gehören die Patientendaten? Eine Auseinandersetzung mit den Thesen von Dr. Stefan Etgeton von der Bertelsmann Stiftung, dieDatenschützer Rhein Main, 31 July 2016. https://ddrm.de/wem-gehoeren-die-patientendaten/
97 Schneider, Rainer. Rabatte für Gesundheitsdaten: Was die deutschen Krankenversicherer planen, ZDNet.com, 18 December 2014, http://www.zdnet.de/88214397/gesundheitsdaten-per-fitness-tracker-die-deutschen-krankenversicherer-planen/


fitness and outdoor activity data has been successfully introduced in mapping applications, allowing
more accurate and widespread mapping of walkways than would be possible with traditional methods98.
This pattern is indicative of the business models that are made possible by collecting data, even if
the data might only be tangential to the actual business. In fact, it has been shown time and again
that at the point in time when data is collected, it is often unknown what business use the data can
be put to.

As such, it is hard to point to specific business models that will become available with more data.
On the one hand, big data, data collection, and data analysis will continue to make current business
models more effective and efficient. On the other hand, once data is present, connections can be
drawn that allow novel business models and structures; however, these depend highly on the attributes
and quality of the data.

1.6.3 The Issue of Data Security and Data Safety: Where Are Data Being Stored and Secured? An IT Point of View

As we have indicated in previous chapters, data, once collected, is not just valuable but also a
liability. Stories of data breaches and loss abound, including discussions of the reverberations for
both the compromised businesses and the affected individuals. One thing is clear, however: loss of
data due to breaches or negligence is going to be very costly99. Not only is stored data susceptible
to hacking attempts and insider access, but also, quite banally, to simple loss of the devices100. It
is estimated that over 13 million people were affected by identity theft in 2015 in the United States
alone, at a total cost of over $15 billion101,102.

It is thus of outstanding importance to secure data against unauthorized access and loss. The
approach most widely used is called "defense in depth". It is a layering approach in which redundant
levels of security are built around a central system, in this case the stored data. It ensures that
there is no single point of failure. Even if security fails in a single layer, an attacker usually
does not gain access to stored data right away, but instead has to get past the other layers as well.

The first and most obvious step is to encrypt all data with strong keys and passwords while it is at
rest103. There are full-disk encryption options for all operating systems today, and there is
absolutely no reason why they shouldn't be used on all computers in an organization handling data,
including all employees' computers. Since this is basic IT security nowadays, it's barely worth
mentioning; however, many companies still neither publish such guidelines nor adhere to them.
Requiring encryption on all devices minimizes risk from the simple loss of a device.

The second obvious point of protection against unauthorized access is to bolster user security with
two-factor authentication. This minimizes the risks posed by weak passwords and password reuse by
adding a second factor, usually a hardware token or a piece of software on the user's smartphone,
that is very hard to attack with traditional password-cracking methods. There are commercial
off-the-shelf solutions available at very low cost, so there is very little reason to forgo the use
of two-factor authentication104. But these are just the basic IT security

98 Purdy, Kevin. What Google is getting out of Ingress, IT World, 18 April 2014. http://www.itworld.com/article/2698381/mobile/what-google-is-getting-out-of-ingress.html
99 Watson, Willis Towers. Cyber Claims Landscape: Companies Face Increasing Data Breach Liability, Willis Towers Watson Wire (blog), 23 July 2015. http://blog.willis.com/2015/07/cyber-claims-landscape-companies-face-increasing-data-breach-liability/
100 Ferris, Nancy. NIH researcher loses laptop with data on 2,500 patients, Healthcare IT News, 28 March 2004. http://www.healthcareitnews.com/news/nih-researcher-loses-laptop-data-2500-patients
101 Identity Theft and Cybercrime, Insurance Information Institute, http://www.iii.org/fact-statistic/identity-theft-and-cybercrime
102 Identity Theft Resource Center Breach Report Hits Near Record High in 2015, Identity Theft Resource Center. http://www.idtheftcenter.org/ITRC-Surveys-Studies/2015databreaches.html
103 Data at rest means data that is not currently loaded into a program for analysis, such as while stored on a laptop computer, in cold storage, or in
backups.
104 Specifications Overview, FIDO Alliance, https://fidoalliance.org/specifications/overview/


mechanisms that any company should follow.

Regarding data security, in the specific context of patient and medical data, the first step that
must be taken when storing data is to anonymize the data. This means that any association between
stored data and an actual person should go through a high-security context and be stored in a
physically different place or be encrypted with a different key, and only be combined at the last
possible moment. This is to minimize the risk of exposure in the actual case of a breach, but also to
minimize the chances of inadvertent disclosure during regular work routines. After all, if a data set
cannot be readily tied to a person, then a data breach is much less serious. Naturally, this has been
standard operating procedure in scientific data collection for a long time now. Stronger
anonymization is necessary when publishing data, such that no single person can be identified;
however, it is very hard to completely secure a data set against de-anonymization, as examples have
shown in the past, especially when it is possible to combine different data sources105.

Finally, moving unencrypted data away from central locations is becoming more common and adds further
layers of protection to each individual data row, as well as to the data set as a whole. This is
because in the case of distributed encryption, for example on the user's own device or on an
authenticated edge server, the data is only readable in a specific context and for a short time,
minimizing exposure risks. Such a situation is known as "zero-knowledge"; that is, the physical
storage system of the data does not know what it actually stores. Data is only decrypted when it is
used, and in a specific usage context. These contexts then also allow fine-grained control over what
is allowed and what isn't with each particular data row, simply by denying an access key to
non-authorized contexts. Since in this case there is also no central registry of such keys, there is
no single point of attack or failure, adding much greater resilience to such a system.

Unfortunately, while a few commercial operators already use this technology, it is not yet available
as an off-the-shelf solution. Most prominently, Apple implements a zero-knowledge system for the data
stored on iOS devices, where the encryption key never leaves the device and all operations are
performed only at the user's request.

However, the security of an implementation has to be evaluated on a case-by-case basis, with respect
to the needs of a project. In many cases, security that goes as deep as zero-knowledge systems will
not be possible, or will only be possible with restrictions of the service.
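
The separation of identity and medical data described in this section can be illustrated with a small sketch. It is not a vetted implementation: the field names, the in-memory "stores" and the helper names are hypothetical, and the keyed hash (HMAC-SHA256) is just one common way to derive a stable pseudonym. Strictly speaking this is pseudonymization rather than full anonymization, because the link to the person still exists in the separately kept identity store.

```python
import hashlib
import hmac
import secrets


def make_pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using a keyed hash."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def split_record(record: dict, secret_key: bytes, identifying_fields: set):
    """Split one record into an identity part and a pseudonymous medical part."""
    pseudonym = make_pseudonym(record["patient_id"], secret_key)
    identity = {k: v for k, v in record.items() if k in identifying_fields}
    medical = {k: v for k, v in record.items() if k not in identifying_fields}
    medical["pseudonym"] = pseudonym
    return pseudonym, identity, medical


# Assumption: in a real system the key and the identity store would live in a
# separate, more strongly protected location than the medical data.
secret_key = secrets.token_bytes(32)
identity_store = {}   # pseudonym -> identifying fields (high-security context)
medical_store = []    # pseudonymous records used in day-to-day work

record = {"patient_id": "P-0001", "name": "Jane Doe", "diagnosis": "J45.9", "heart_rate": 72}
pseudonym, identity, medical = split_record(record, secret_key, {"patient_id", "name"})
identity_store[pseudonym] = identity
medical_store.append(medical)
```

With this split, someone who obtains only `medical_store` sees a diagnosis and a random-looking pseudonym; re-identification needs the identity store, which is exactly the join that should only happen "at the last possible moment". As the text notes, this alone does not protect against de-anonymization by combining data sources.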

105 Schneier, Bruce. Why "Anonymous" Data Sometimes Isn't, Wired Magazine, 13 December 2007. http://archive.wired.com/politics/security/commentary/securitymatters/2007/12/securitymatters_1213
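
To make the "software on the user's smartphone" variant of two-factor authentication from section 1.6.3 more concrete, here is a minimal sketch of time-based one-time passwords (TOTP, RFC 6238, built on HOTP from RFC 4226). This is an assumption chosen for illustration: the whitepaper itself points to the FIDO specifications, which work differently (public-key challenge and response). The function names are illustrative, and a production system should rely on a maintained library instead.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for one counter value."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP over the current 30-second window."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_code(secret: bytes, submitted: str, at=None, step: int = 30, window: int = 1) -> bool:
    """Accept the code for the current window and `window` neighbours (clock drift)."""
    now = time.time() if at is None else at
    for drift in range(-window, window + 1):
        candidate = totp(secret, now + drift * step, step)
        if hmac.compare_digest(candidate, submitted):
            return True
    return False


# Shared secret from the RFC test vectors; a real secret would be random and
# provisioned once to the user's device.
shared_secret = b"12345678901234567890"
current_code = totp(shared_secret)
```

The server and the phone both hold `shared_secret`; a stolen password alone is then not enough to log in, because the attacker also needs the code for the current time window.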

Chapter 2: Special focus on Big Data Potential Assessment and Exploitation in Healthcare


2.1 Current State-of-the-Art and Application Examples

"All models are wrong, but some are useful." - George Box106

2.1.1 Introduction

For a long time, artificial intelligence research was not taken seriously, because it had failed to
produce the results that were expected in the 50s and 60s. It made bold statements about what it
would be able to do, but delivered next to nothing. What followed was a period called the "AI
winter", in which funding of research projects was discontinued and the whole field went from being
mainstream and interesting to exotic and suspicious.

Ironically, the methods used in the eighties were strikingly similar to the ones which are enormously
successful nowadays. Geoffrey Hinton, who is a pioneer in the field of artificial neural networks and
co-published a seminal paper on the backpropagation algorithm for training multilayer perceptron
networks, answered the question of why they didn't work back then (1986):

"We all drew the wrong conclusions about why it failed. The real reasons were:

1. Our labeled datasets were thousands of times too small
2. Our computers were a million times too slow
3. We initialized the weights in a stupid way
4. We used the wrong type of nonlinearity"

In the mid nineties, interest started to rise again, with the first success stories in the field of
data mining and with new models like support vector machines. It became fashionable to call the field
"machine learning", because on the one hand the term "artificial intelligence" was burnt out and on
the other hand the goals had to be set much lower to avoid getting trapped by inflated expectations
again. The goal switched from "we construct human-like robots" to "how about software which gets
better with experience". In the good old days of artificial intelligence optimism, it wasn't quite
clear that software had to have the ability to learn at all. If you can deduce all the things you
need to know from a few principles using first-order logic, then you don't need to learn.
Unfortunately, it turned out that we live in a universe where you can't.

This rise of machine learning continued far into the 2000s. Then something else happened: the
internet became more and more widespread. While surfing the internet, people generated lots of data
that wasn't available before. For example, in the late nineties the problem of speech recognition
seemed to be nearly solved. I remember being told that we would all talk to our computers, because in
three to five years the recognition error would be low enough to replace the standard keyboard
interface for text editing. This didn't happen. But today speaking into your phone really is faster
than typing107 (interestingly, most people are not yet aware of this). What happened? It turned out
that in order to be useful, the error rate has to be lower than a threshold, 2% for example. While it
was relatively easy to bring this error rate down from 10% to 5%, which was done in the nineties with
machine learning models like hidden Markov and Gaussian mixture models, it was exponentially harder
to bring the error rate down from 5% to 2%. To be able to do so you would need a lot of training
data, which was not available in the nineties. But today there really are a lot of people talking
into their phones, and we have plenty of training data.

106 All models are wrong - Wikipedia, the free encyclopedia. 2014. 13 Sep. 2016 https://en.wikipedia.org/wiki/All_models_are_wrong
107 Speech Is 3x Faster than Typing for English and ... - Stanford HCI Group. 2016. 15 Sep. 2016 http://hci.stanford.edu/research/speech/


This change in the availability of data has pushed many problems from the state of "promising but not
useful yet" to "it works". Having lots of data has another advantage: if you only have a small amount
of data, you often need to extract handcrafted features to get machine learning to work. In fact,
most of the work you have to do in the fields of image or speech recognition is feature engineering.
This has changed with more data, because now it's possible to do representation learning on that
data, which means learning the feature extraction from the data itself. This new possibility,
together with new models which can make use of interactions between those features, is termed deep
learning, and it is a huge success.

With deep learning we have already seen significant improvements of the state of the art in image and
object recognition, speech recognition and natural language processing. Even things like learning to
play video games from the video signal or beating humans in Go have only been possible by using deep
reinforcement learning. Encouraged by those achievements, some people are courageous enough to speak
of artificial intelligence again.

2.1.2 Big Data

Big data is often sold as a silver bullet. It's not. For example, if someone tries to sell you a big
data system because it will overcome the data silos in your company, they are playing a game. There
have been countless attempts to solve social problems with technology, but did this ever work out
well? For the technology salesman it probably did. For the others, probably not.

The job of marketing is sometimes to find new and fancy words for old concepts, and since the term
big data is really hot, all kinds of stuff are now called big data. But it's not all marketing fluff;
there really are new use cases where the data you have is too large or complex for traditional data
processing applications, and then you have to do something new. But how do you know when you have
arrived at that point? In the following sections I want to elaborate a little bit on this question.

Different people have different views on what big data means. Some people call all data which is too
big to fit in an Excel sheet big data. From a software engineering perspective it was never a good
idea to implement serious logic in a spreadsheet, simply because you can't apply the best practices
that we've developed over the last decades in the software engineering field. You can't write unit
tests for your Excel code, for example. It's very hard to do proper version control, too, and
therefore to work as a team on a spreadsheet. Nevertheless, people have used Excel for all kinds of
strange things and got away with it far too often. Sometimes it hasn't ended well108.

For us software engineers it's therefore kind of a late gratification to see a new trend in finance:
many banks are replacing Excel with IPython notebooks. The New York Python user group has about
10,000 members and they organize weekly events with hundreds of attendees. And although it's
certainly easier to process bigger data in IPython notebooks than in Excel, this won't be the
greatest advantage they'll get from this change. The biggest advantage, unglamorous as it may be, is
that they will have access to the abundance of boring but working libraries and will be able to fully
exercise their programming craft in a professional manner.

Using a proper database to store data or applying time-tested best practices for software development
is nothing new. This is traditional data processing, but of course, if you don't do it yet you should
start doing it now. And you're absolutely allowed to call traditional boring data processing big data
if it helps you to get it funded.

Scenario 1: Your Data Fits in RAM on a Single Machine

"256GB ought to be enough for anybody (for machine learning)" - Andreas Mueller, scikit-learn core-dev109

This is the sweet spot you are looking for. Computers can have lots of RAM these days and if you are
working with biggish data, they should. Running a machine learning system is not that different from
running a database.

108 The Reinhart-Rogoff error or how not to Excel at economics. 2013. 15 Sep. 2016 http://theconversation.com/the-reinhart-rogoff-error-or-how-not-to-excel-at-economics-13646
109 Large Scale Non-Linear Learning (Pygotham 2015) - SSSSLIDE. 2015. 13 Sep. 2016 http://sssslide.com/speakerdeck.com/amueller/large-scale-non-linear-learning-pygotham-2015


Both are easy to scale up but hard to scale out. I once attended a database performance class held by
Kristian Köhntopp (working as a principal consultant for MySQL at that time) in which he told us,
half jokingly, the three secret rules of database performance optimization:

1. Ok, you've got performance issues - let's do the obvious first thing and add more RAM to the
   database server.
2. Well, things are getting harder, but maybe it's possible to overcome even this by adding some more
   RAM to the database server.
3. This is tough, you've tried everything and it didn't work and you're running out of options, but
   there's a last thing you could try before giving up: add more RAM to the database server.

If you don't want to run a database but rather a machine learning system, the same advice still holds
true. If your data doesn't fit in RAM on a single machine, hard drive access will inevitably become
your performance bottleneck, or even worse, network latency if you distribute the data over multiple
machines. With SSDs things are getting better, but the fundamental problem remains the same. It's
often possible to stay below the "fits in RAM" boundary via subsampling. You should only move beyond
it if you really know your problem well and know that it would be beneficial to do so.

Scenario 2: Your Data Does Not Fit in RAM but on Disk in a Single Machine

There's a special class of "out of core" or "online learning" algorithms that don't need to hold the
complete training set in RAM. They work by incrementally looking at a part of the data and training
just on that part. Being able to retrain quickly when there's new training data, without having to
batch retrain on the complete training set, is another benefit of those algorithms.

One of the most widely used software packages for fast online learning is Vowpal Wabbit110. If you
have data where a linear model works well - textual data for example - you should definitely give
Vowpal Wabbit a try before distributing your data across a cluster of machines.

Scenario 3: Your Data Doesn't Fit on a Single Machine

"Nobody ever got fired for using Hadoop on a cluster"111

In my opinion, this is the point where it starts to make sense to speak of big data, because now
you'll have to substantially change the way you work with data.

Usually you move the data you want to process from the place it's stored to another place where you
actually process it. For example, you might run a SQL query on a database and analyze the returned
data in Excel. If the data you want to analyze becomes big enough, you can't do that anymore, because
it won't fit on your computer. And maybe it's not a SQL database server you query for the data, but a
storage system where the data is distributed over multiple continents, and you can't move the data
because you have neither the bandwidth nor a central place with enough storage capacity to do so.

A possible solution for this problem is to invert the usual workflow and move your algorithms to the
data. Google published the most popular paradigm in this domain, called Map/Reduce, in 2004. There
were similar concepts in the high performance computing community before, but nothing applicable to a
cluster of commodity machines. The most popular method of scaling out data processing today is to use
Apache Spark. You'll write your code in Python, R (the two dominating languages in the data science
field today), Scala, or Java and it'll get automatically distributed and executed on your data. But
this sounds easier than it actually is. Changing the way data is processed is a difficult thing to
do. For a lot of algorithms there exist no efficient distributed versions. You'll have to rewrite
your code in a way that makes it distributable, and the constraints you'll face will be very
different from the ones you optimized for on a single machine. There's a lot less infrastructure you
can build upon, so you'll have to pay a premium for

110 Vowpal Wabbit (Fast Learning) - Machine Learning (Theory). 2007. 15 Sep. 2016 http://hunch.net/~vw/
111 Rowstron, Antony et al. Nobody ever got fired for using Hadoop on a cluster. Proceedings of the 1st International Workshop on Hot Topics in
Cloud Data Processing 10 Apr. 2012: 2.


commercial vendors or have to roll your own. It's easy to predict that this will be expensive and
really hard to do.

Different Kinds of Data

"Big Data is a vague term, used loosely, if often, these days. But put simply, the catchall phrase
means three things. First, it is a bundle of technologies. Second, it is a potential revolution in
measurement. And third, it is a point of view, or philosophy, about how decisions will be - and
perhaps should be - made in the future." -- Steve Lohr, The New York Times112

Data that is too big to process on a single machine is not the only case where traditional data
processing breaks down. One of the largest publicly available datasets is image-net113. This dataset
consists of about 15 million hand-labeled images. The subset that is used for the Large Scale Visual
Recognition Challenge 2016114 has 1000 categories and a size of about 140 GB, so it's not really
necessary to distribute it over a cluster of machines. But you have to use special hardware to build
the deep network models which are now state of the art in image recognition. Usually that would be a
normal computer packed with GPU extension cards. And yes, you can bring down the training time even
more if you distribute your training run over more than one of those machines. But your standard
off-the-shelf scale-out Hadoop cluster, which is typically sold as a big data system, will not be of
much use for this kind of problem.

The Square Kilometre Array telescope115 is predicted to generate one exabyte of data per day116. This
is a thousand petabytes, which are a thousand terabytes each. You'd need a nuclear power plant to
have enough power to process this kind of data in a traditional way. Therefore you probably don't
want to store all the data, but filter out all the noise before you store the signal. The detectors
of the big particle accelerators do the same and have a stack of hardware ASICs that try to reduce
the data stream those detectors create. Processing this kind of data is quite different from most
other use cases.

It's also very important to know whether your big data is tall (billions of rows/observations) or
wide (billions of columns/predictors). Click logs are an example of tall data where you have just a
few predictors per click, but a lot of clicks in total. Since most of the ads which are shown to users
are not clicked on, but the interesting observations happen when users click on an ad, it didn't hurt
the performance to only take every 15th non-clicked event. So subsampling was a very effective
strategy in this case117. Wide data like brain activity scans or genome data usually consists of only
a few observations (brains scanned, genomes sequenced), but contains a lot of data per observation.
Wide data requires a completely different approach (dimension reduction, strong regularization,
lasso)117.

Conclusion

There are valid cases for non-traditional data processing, but there's no silver bullet and therefore
no "one size fits all" big data system. Handling data correctly depends heavily on the use case, and
the hardware requirements are very different for different use cases.
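
Two of the ideas from this section, online learning (Scenario 2) and subsampling of tall click-log data, can be sketched together in a few lines. This is only an illustration under stated assumptions: the click stream is synthetic, the model is a plain logistic regression trained by stochastic gradient descent rather than Vowpal Wabbit, and all names and parameters (`click_stream`, `keep_every`, the learning rate) are hypothetical.

```python
import math
import random


def click_stream(n_events, seed=0):
    """Yield synthetic (features, clicked) pairs one at a time; nothing is kept in RAM."""
    rng = random.Random(seed)
    for _ in range(n_events):
        features = [rng.random(), rng.random()]
        # Assumption: clicks are rarer than non-clicks and depend on the first feature only.
        click_probability = 0.02 + 0.3 * features[0]
        yield features, 1 if rng.random() < click_probability else 0


def subsample_negatives(stream, keep_every=15):
    """Pass every click through, but keep only every `keep_every`-th non-click."""
    seen_negatives = 0
    for features, clicked in stream:
        if clicked:
            yield features, clicked
        else:
            seen_negatives += 1
            if seen_negatives % keep_every == 0:
                yield features, clicked


class OnlineLogisticRegression:
    """Logistic regression updated one example at a time (stochastic gradient descent)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x, y):
        error = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
        for i, xi in enumerate(x):
            self.w[i] -= self.learning_rate * error * xi
        self.b -= self.learning_rate * error


model = OnlineLogisticRegression(n_features=2)
for features, clicked in subsample_negatives(click_stream(20000)):
    model.partial_fit(features, clicked)
```

Because the model only ever sees one example at a time, the same loop works for a file that is far larger than RAM, and new data can be fed in later without retraining from scratch. Note that dropping non-clicks shifts the predicted probabilities upward, so they would need recalibration before being read as real click rates.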

112 Sizing Up Big Data, Broadening Beyond the Internet - The New York ... 2013. 13 Sep. 2016 http://bits.blogs.nytimes.com/2013/06/19/sizing-up-big-data-broadening-beyond-the-internet/
113 ImageNet. 2007. 15 Sep. 2016 http://image-net.org/
114 Large Scale Visual Recognition Challenge 2016 - ImageNet. 2016. 15 Sep. 2016 http://image-net.org/challenges/LSVRC/2016/
115 The Square Kilometre Array SKA Home. 2011. 15 Sep. 2016 https://www.skatelescope.org/
116 The Informatics Institute at the University of Amsterdam invites ... 2014. 15 Sep. 2016 https://plus.google.com/106192624796075357023/posts/BBfCqMfjxWd
117 Statistical Learning with Big Data - Stanford University. 2015. 13 Sep. 2016 http://web.stanford.edu/~hastie/TALKS/SLBD_new.pdf
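
The "move your algorithms to the data" idea behind Map/Reduce (Scenario 3) can be illustrated with the classic word count. The sketch below runs in a single process and only mimics the structure; the partitions stand in for data held on different machines, and all function names are illustrative rather than part of Hadoop or Spark.

```python
from collections import defaultdict
from functools import reduce


def map_words(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def run_map_on_partition(partition):
    """Run the map step where the data lives; only the emitted pairs leave the node."""
    pairs = []
    for document in partition:
        pairs.extend(map_words(document))
    return pairs


def shuffle(all_pairs):
    """Group emitted values by key so that each key can be reduced independently."""
    groups = defaultdict(list)
    for key, value in all_pairs:
        groups[key].append(value)
    return groups


def reduce_counts(values):
    """Reduce step: fold all values for one key into a single count."""
    return reduce(lambda left, right: left + right, values)


def word_count(partitions):
    emitted = []
    for partition in partitions:  # on a real cluster these run in parallel
        emitted.extend(run_map_on_partition(partition))
    return {word: reduce_counts(values) for word, values in shuffle(emitted).items()}


# Each inner list stands for the documents stored on one (hypothetical) node.
partitions = [["big data is big"], ["data about data"]]
counts = word_count(partitions)
```

The point of the structure is that `map_words` and `reduce_counts` never need to see the whole dataset, which is what allows a framework to distribute them. It also hints at the difficulty mentioned in the text: an algorithm has to be rewritten into this shape before it can be scaled out.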

2.2 Deep Learning

Deep Learning is much easier to define. It's about "learning representations of data with multiple
levels of abstraction"118. This is usually done by training a neural network with multiple layers of
neurons using the backpropagation algorithm.

Traditionally, much of the effort in building machine learning systems went into finding the right
representation of the data. This process is called feature engineering, and transforming the raw data
into the right representation is called feature extraction. For example, this is a (greatly
simplified) process to transform a text into a feature vector that can be used later as a training
example in a dataset:

1. Split the text into words
2. Assign a unique number to each word and interpret this number as the dimension of this word
3. Initialize a vector containing all possible words or dimensions with all entries set to 0
4. For each word, set the value of the previously initialized vector at the respective dimension to 1

Now you have a vector representing the text. If you like fancy names you can call this representation
a unigram bag of words model with binary weights. It works surprisingly well, because much of the
meaning of a text is defined by the words it consists of. For other kinds of data, feature
engineering is much more difficult. How do you split up an image into words?

Deep learning systems aim to learn the right representation from the raw data itself in an
unsupervised manner. It gets even better: previously, one had to use the same representation for all
kinds of images, because you couldn't invent a new clever feature extraction mechanism for each new
dataset. But with deep learning, the representation is automatically tailored to the dataset it was
learned on. And the more data you have, the better the representation gets.

This chart by Andrew Ng sums it up neatly119:

118 LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature 521.7553 (2015): 436-444
119 Extract Data Conference | SlideShare. 2015. 13 Sep. 2016 http://www.slideshare.net/ExtractConf
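
The four-step text-to-vector process described in this section (a unigram bag of words with binary weights) can be sketched as follows. The splitting rule (lowercasing and splitting on whitespace) and the example sentences are assumptions made for illustration; real tokenization is more involved.

```python
def build_vocabulary(texts):
    """Steps 1 and 2: split texts into words and give each new word the next free index."""
    vocabulary = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary


def to_binary_vector(text, vocabulary):
    """Steps 3 and 4: start from all zeros and set a 1 for every word that occurs."""
    vector = [0] * len(vocabulary)
    for word in text.lower().split():
        index = vocabulary.get(word)
        if index is not None:  # words outside the vocabulary are ignored
            vector[index] = 1
    return vector


corpus = ["the patient has a fever", "the patient has no fever"]
vocabulary = build_vocabulary(corpus)
vectors = [to_binary_vector(text, vocabulary) for text in corpus]
```

The two example sentences differ in a single dimension each ("a" versus "no"), which shows both why the representation works and where it is limited: word order and context are lost entirely.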


Deep Learning works best in domains where it's difficult to come up with good feature extraction
methods, like image or speech recognition. However, if a particular neural net architecture works
great in one domain, like convolutional neural nets for image recognition, that doesn't mean that
this architecture would work as well in a different domain without modification. For sequence-based
data like text, other architectures like LSTMs or RNNs seem to work better. And even though learned
representations like word2vec or GloVe play an important role in natural language processing, those
feature learning systems don't use deep learning, but instead very shallow models.

There is another drawback to deep learning. Deep learning models sometimes have hundreds of millions
of parameters that have to be fitted. Though it's now possible to make use of more data than before,
deep learning won't work if the amount of data is too small.

Surprisingly, George Dahl, who worked on deep learning models for speech recognition in the same
group as Geoffrey Hinton, tried to use deep learning in the Merck Molecular Activity Challenge120
especially because the data was so small. He was interested in whether it would work despite not
having enough data. It worked and his team won the challenge121.

How to Start

It seems that every approach has its advantages and disadvantages. You could buy a Hadoop cluster
from your friendly big data salesman, but later it turns out that you could have done all the
predictive analytics you need on a single laptop. Or that a GPU cluster would have been better. As
long as you don't know your use case and the kind of data you'll have to deal with, you will know
neither what hardware to buy nor what kind of models you should develop. But how to start then?

Exploratory Data Analysis

"Exploratory data analysis is an attitude, a state of flexibility, a willingness to look for those
things that we believe are not there, as well as those we believe to be there" - John Tukey122

This is usually the first step. You take a look at the data you have and try to find some interesting
properties. Visualization will probably play an important role, and the exploratory aspect means that
the problem you are trying to solve might change as you go. Usually you won't look at complete
datasets, but only at representative samples, to avoid running into the engineering difficulties
caused by data size.

Eventually you'll find some promising patterns in your data and proceed to the next step.

Predictive Analysis

Start simple. First try to find out how hard the problem actually is. What "simple" means depends on
the structure of your inputs and desired outputs. For example, if you find out that logistic
regression (an off-the-shelf classification algorithm) does not work because there are too many
nonlinearities in your data, you could try to employ nonlinear models or get better at feature
engineering. But at first it's important to know if your problem requires creativity.

Scaling

If your approach then needs to be scaled, the advice is to rent before you buy. This is especially
true in an area that is known to have undergone major leaps in technology, such as the availability
of storage space and the speed of CPUs. Rental contracts also often include maintenance for servers
and software, which ensures that you always have the current security patches installed and that
server uptime is within an acceptable range.

120 Merck Molecular Activity Challenge | Kaggle. 2012. 15 Sep. 2016 https://www.kaggle.com/c/MerckActivity
121 Dahl, George E, Navdeep Jaitly, and Ruslan Salakhutdinov. Multi-task neural networks for QSAR predictions. arXiv preprint arXiv:1406.1231 (2014).
122 Exploratory data analysis. 2014. 13 Sep. 2016 https://www.stat.berkeley.edu/~brill/Papers/EDA11.doc


Data Annotation / Cleaning

The importance of good data annotation and cleaning tools is often overlooked. In practice it's often preferable to have good editorial frontends over having a slightly better performing model, because it is relatively easy to improve the performance of your system by adding annotated data, but difficult to improve it by building a better model. It's also difficult to handle changes in your data over time by adapting the model. Simply adding new annotations to the changed data is much easier. It's also easier to find more people to operate your editorial tools than to find new data scientists to work on the models.

Data scientists spend up to 80% of their time cleaning up messy data, usually because there are no editorial tools and no people to operate them. In the end, somebody has to do it. If you only have the budget to hire two people, the second one after the data scientist should be someone good at building frontends.

Performance Measures

You need to be able to assess how well your methods are working. Ideally, the performance measure you choose is aligned with some business value. From a technical perspective, let's consider the simplest possible example: binary classification.

Binary Classification

We understand this type of problem thanks to an annoyance we are all too familiar with: spam. In the early 2000s, the spam problem became significant enough that someone brilliant began to think about a solution: Paul Graham [123]. Graham suggested using a very simple machine learning model called naive Bayes to extract just one bit of information from every received email. Is this spam? True or false? Despite its simplicity, this model worked well enough to be an effective measure against spam.

But how are we able to measure how well it works?

Maybe we can measure its accuracy and claim "our spam filter recognizes spam with an accuracy of 90%!". We pat ourselves on the back and consider the job done - until someone comes along claiming that his spam filter has an accuracy of 95%. And it's a lot faster, too, because all it does is always return True, regardless of whether the mail considered is really spam or not. Surprisingly, he would be right, because more than 95% of all mail is spam. But since an obviously useless classifier which always returns True has a better score, our performance measure seems to be useless, too. In other cases accuracy might be useful, but in cases with imbalanced class counts it doesn't do well.

So, measuring the quality of a model really depends on your data and your objective. For the spam filter problem, precision and recall might be more useful. Precision is the number of true positives divided by the sum of true positives and false positives. Recall is the number of true positives divided by the sum of true positives and false negatives. Here the positive class is the mail we are interested in: all mail which is not spam. In practice the output of our classifier is not a binary true or false - it would instead be a score or a probability. Then we can establish a threshold and say that we consider all mail having a score greater than this threshold as not spam. From there, we can calculate the values of precision and recall for each threshold. If we sampled enough of these recall/precision pairs, they would form a line - a so-called precision-recall curve. This curve is a much better indicator of the performance of our classifier.
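
The accuracy problem and the precision/recall definitions above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the 95/5 class split mirrors the figure quoted in the text, the toy scores are made up, and the function names (accuracy, precision_recall, pr_curve) are hypothetical helpers rather than part of any library. A score counts as positive when it is greater than or equal to the threshold, and precision is set to 0 when nothing is predicted positive.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (True)."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    # Convention (an assumption): report 0.0 when the denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pr_curve(y_true, scores):
    """One (threshold, precision, recall) point per distinct score."""
    points = []
    for threshold in sorted(set(scores)):
        y_pred = [s >= threshold for s in scores]
        precision, recall = precision_recall(y_true, y_pred)
        points.append((threshold, precision, recall))
    return points


# 100 mails, 95 of them spam: the "always True" classifier looks good on accuracy.
is_spam = [True] * 95 + [False] * 5
always_spam = [True] * 100
acc_always_spam = accuracy(is_spam, always_spam)

# Seen from the class we care about (not spam), the same classifier is useless.
is_ham = [not s for s in is_spam]
predicted_ham = [not p for p in always_spam]
ham_precision, ham_recall = precision_recall(is_ham, predicted_ham)

# Toy scores (made up): higher score means "more likely not spam".
toy_truth = [True, True, False, True, False]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
curve = pr_curve(toy_truth, toy_scores)
for threshold, precision, recall in curve:
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

Raising the threshold in the toy data trades recall for precision, which is exactly the trade-off the precision-recall curve makes visible.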

123 A Plan for Spam - Paul Graham. 2005. 15 Sep. 2016 http://www.paulgraham.com/spam.html


Cross Validation

Being able to calculate appropriate performance measures for data where we know how the output of our model should look is all fine and well, but what we are really interested in is how our model will perform on data we haven't seen before. The reason we do predictive analytics is precisely because we are interested in predicting things we don't know yet. The good news is that most of the time we are able to find a good approximation of how well our model will perform on unknown data by using the data we already have. We just have to replace "unknown" with "unseen".

Let's assume we have a dataset consisting of emails which were manually annotated to be spam or not spam. We could then split this data set randomly into three parts. First we train our model on two of those three parts and calculate the value of our performance measure on the last, unseen part. Then we train on the second and the last part and test on the first part. On the last run, we train on the first and last part and test on the second part. Now we have three performance measures that we can combine by simply averaging them. This gives us a single performance measure depending on the classification result of each email in the dataset, but classified by a model which has never seen the emails it was tested on before.

In practice, it's often not that easy to find a valid cross-validation strategy, and trying to find one is probably one of the first things you want to do. If the observations in your data set have a timestamp and you are interested in predicting future observations, then you can't simply split the data randomly into train and test sets. It's often more helpful to select a holdout set which has the same properties as the unknown data you are interested in - the last two weeks of a time series data set for example - and then compare your cross validation score against the score you get on your holdout set. If changes in your cross validation score lead to similar changes in the holdout score, you know your cross validation strategy is probably valid. If your holdout scores go down while your cross validation scores go up or vice versa, you are probably overfitting.

Machine Learning Competitions

If you are interested in the current state of the art, take a look at Kaggle (https://www.kaggle.com/). It's a startup that hosts machine learning competitions for its customers. Running a competition on Kaggle with your data is a great way to see if it's possible to improve on your baseline. In almost all competitions, significant improvements over the current state of the art were achieved.

Text Categorization Example from the Ecommerce Industry (Use Case 1)

In the sections about binary classification and performance evaluation, I used spam filtering as an example. Let's proceed to a more complex example.

Let's say you run an online shop selling summer houses. You can't sell hundreds of different types of summer houses, because the storage costs would kill you, so you are stuck with just a handful of different summer houses to sell. How do you go about marketing them? Well, nowadays you might just bid on Google Adwords like anybody else. At first this seems great: bidding on "summer house" provides you with a lot of traffic from Google. But after some time you notice that the users you bought from Google are not buying as many summer houses per thousand visits as your usual customers. You take a look into your weblogs and notice that most of the users from Google look at the selection of your summer houses and then go away. Well, maybe they weren't satisfied with the choices they saw on your site. They searched for "summer house" and then had to take one of the five similar looking summer houses on your site. Ok, maybe in the end it's not a good idea to bid on the keyword "summer house" if you can't offer a wide range of varieties that cover most of the requirements of customers searching for "summer house".

But you still want to be able to market your products, right? This is exactly where shopping aggregator
sites come into play. Price comparison engines are a prominent example of this kind of website. They aggregate all the product feeds from different shops and import them into their own taxonomy of products. Price comparison sites often list millions of products on their site and probably thousands of summer houses. If those aggregators manage to correctly identify all summer houses in the product feeds and have a category landing page listing all the summer houses, they can effectively bid on generic keywords like "summer house" because the probability of showing a suitable summer house after a user clicks on the ad is much higher.

But by having millions of products and thousands of categories, the integration of a new shop with a few thousand products becomes difficult. The aggregator site could pay people to annotate all the products in this new feed with the right category, but this would be expensive manual labor. And what if the shop later makes changes to the feed? Then this manual work has to be repeated.

It's clear that you have to automate the integration of new or changed products into the taxonomy. The first idea we had was to build a mapping between the taxonomy the shops used and our taxonomy, and then use this mapping to categorize the products in the product feed of the shop. But this didn't work. The problem was that this mapping was brittle, and small mistakes could lead to many misclassified products, which had direct financial consequences. Additionally, this mapping made our own taxonomy hard to change, and the people in marketing who wanted to introduce a new "wading pool" category in summer had no idea how to change the mapping rules since they weren't programmers. In short, solving this problem with a set of fixed rules or code was not possible.

What follows is our solution: at first we built editorial frontends in order to annotate products with categories from our taxonomy. You could do a fulltext search on all products in our databases or pick products by the category they were given by their shop or by the shop itself. You could formulate queries like "all products from shop X having the shop-category Y and having a price between $30 and $100". This is quite similar to the interface customers see when they search for a product on a price comparison site. Having got a result for your query, you could do things like "add category summer houses from our taxonomy to all products in this result set". This seems to be quite similar to the rule-based approach we discussed before, but it's actually very different. There's no rule stored saying "if a product is in shop X and has shop-category Y and has a price in the following range, add it to category Z in our taxonomy". It's just a bunch of products tagged with the information that they belong to category Z in our taxonomy.

If you collected a few hundred or thousand products for your category Z, you could press a button in the editorial frontend and it would build a model for category Z with the new training data you created. This works exactly as if you had been creating a spam filter - the only difference is that it doesn't filter spam, it filters category Z. Having a model that knows the difference between products of the category Z and all other products, you now could hit another button in the frontend that says "classify all the products in our database as being elements of category Z or not". Then you could choose those products where the model was unsure and classify them manually, or refine further and annotate more products as belonging to category Z.

After some iterations, a technical editor would check the precision-recall curve for category Z and decide if it looked good enough to generate a threshold for the model score, at which point products which scored above this threshold would be automatically classified as belonging to category Z. If the curve didn't look good enough, the editor would have to gather more training data. After a little bit of practice an editor should be able to train a few categories per day to be handled automatically. The number of generated training samples was about 20-30K per day per editor.

We now had a system where people without coding skills could add, merge or change categories and ensure the percentage of false positives per category would stay below a tolerable threshold (5% for example). If a new shop had to be integrated, nothing had to be done; it just worked automatically. If an old shop changed its product feed, the changes in category assignment would also be done automatically. Despite this whole system being based on simple binary classification, it could handle 60 million products in over 2.5K categories. We had about
11 million products labeled, and one product could have an arbitrary number of categories it belonged to.

To classify a product, we needed to calculate one score per category and model - i.e. 2500 scores. That seems to be a lot, but since the classification operation is just a scalar product, this is just 2500 scalar products, which is not that much on a modern CPU. If you used the categorization service by hand and typed in a few words to describe something you were seeing at that moment, then the system would return a list of probable categories which all seemed to be a good fit. It really felt like a kind of magic. But it wasn't magic - if you learned from 11 million products, you would have seen almost every word a few times and would know its probability of belonging to a category.

These factors were critical for the success of the project:

- Good editorial frontends led to more training data being incorporated faster, which was better than complicated models
- Fast model training time - it's much easier to collect training data if you can try out the knowledge you've generated in an interactive manner. On adding or removing some products from a category we did an immediate retrain/reclassify cycle to show the effects of the last editorial action. This was possible because we were able to use online learning. Batch retraining time with all products from a category was about a few minutes.
- We did some tricks to speed up training time (memory mapping the feature matrix)
- It's not a problem to use Python if you use it for the parts where speed doesn't matter and implement the hot inner loops in C
- We had our own mapreduce framework to create the feature matrix

One possible application in the field of healthcare is to extract information from textual data - for example, a binary classification based on doctors' notes showing whether a patient has had a heart attack.

Clustering of Products for Price Comparison (Use Case 2)

Any price comparison site's main purpose is to show its users the best price for the product they are interested in. Now you might think this is a trivial task: just group your database of products by EAN/UPC or any other unique identifier, and you are done. Well, that's easier said than done. The use of identifiers in your product feed is not free of charge. Therefore, shops which are competing hard on price can't afford to use them. For our database the percentage of products for which we had unique identifiers was about 30%.

If the problem is finding an unknown unique identifier for a product, maybe we can solve this in the same way as the categorization problem. We group the products for which we have unique identifiers by them and train a model for each group of products with the same identifier. Then we classify a product for which the identifier is unknown with each of the identifier models and find out which identifier this product has.

This sounds great, but it won't work. We are not talking about thousands of categories anymore. There are over a million different unique identifiers in our database. So, we would have to build a million models. This would be impractical, but even worse, we don't have that much training data for each identifier - just a few examples per EAN. That's far too few to train any kind of model on, and those groups of products are not as stable as categories. New products appear and old ones disappear into oblivion all the time.

So what can we do if we have no annotated training data from which we could learn? We could do clustering, or unsupervised learning, as it is often called. Conceptually, clustering seems to be simple. All you need to have are some observations you want to cluster and a distance function to measure how close each pair of observations is. In practice, clustering is far from being simple, and making effective use of unlabeled data is one of the big unsolved problems in the field of data science.


What happens if we apply clustering to our problem of finding clusters of products that belong to the same identifier? The parameters we can choose are the distance function and the clustering algorithm. Let's take tf-idf cosine similarity as the distance function. This is the same function used by full-text search engines to determine the similarity between a query given by a user and the documents in the result set. So what we are doing is taking each product as a query and looking at the distance from all other products as we would see it in a full-text search engine. Then we cluster the set of products using k-means, which is one of the most popular clustering methods due to its speed and simplicity. If you try this method on the set of all products you'll see that indeed the products would be clustered in groups which are somewhat similar, but those clusters would have no semantic meaning. If you use k-means, you have to set the number of clusters you want beforehand. If you choose a low number, you get clusters that look like broad categories of products, whereas if you choose a high number of clusters you get a lot of smaller clusters which are really similar textually, but are often products from the same shop, because the name and description texts of the products of one shop are often generated by some kind of template. Neither result is what we are looking for. The problem with clustering is that there are a lot of possible clusters of products which are completely valid, but in which we are not interested. We need a highly accurate method of finding products that belong to the same identifier, because comparing prices of products having different identifiers (for example, an iPhone and an iPhone screen protector) does not make sense. A website user looking for the best price for an iPhone would feel tricked, and the shop offering the screen protector would have to pay for a lot of clicks to its product, despite selling nothing, because those users were looking for a different product.

It seems that we have to incorporate our knowledge of what a cluster of products belonging to the same identifier should look like into the distance function. It would be great to have a clustering method where we don't have to know the number of clusters in advance, because we don't know how many different products exist.
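
To make the dependence on the preset number of clusters concrete, here is a minimal k-means sketch on one-dimensional toy values. It is not the setup described above (which clustered tf-idf text vectors); the data, the function name kmeans_1d and the deterministic choice of starting centroids are all assumptions made so that the example stays short and reproducible.

```python
def kmeans_1d(points, k, iterations=20):
    """Cluster numbers into k groups with plain k-means (Lloyd's algorithm)."""
    pts = sorted(points)
    # Assumption: start with centroids spread evenly over the sorted values
    # instead of the usual random initialisation, to keep the run deterministic.
    centroids = [pts[i * (len(pts) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: every point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters


values = [1, 2, 3, 10, 11, 12, 100, 101, 102]
broad = kmeans_1d(values, 2)  # low k: broad groups
finer = kmeans_1d(values, 3)  # higher k: smaller, tighter groups
print(broad)
print(finer)
```

The same data yields different, equally "valid" groupings depending on k, which is the difficulty described above: the algorithm cannot tell which grouping corresponds to real product identities.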


Combining Distance Functions

We know that different products tend to have different prices. Sure, there is some variance, but you wouldn't find an online shop selling an iPhone for half of the list price, for example. Therefore we could try to make use of this knowledge by looking at the distribution of price distances for products which belong together and the distribution of price distances between products which don't belong together. We can visualize those distributions using a simple histogram.

In this graph the green histogram shows the distribution of relative price distances between products which belong together and the blue histogram the same distribution for products that don't belong together. Those distributions are clearly different, which is good, because we want to know if we can use them to distinguish between related and unrelated products. For example, we can see that if we have a pair of products with a relative price distance of greater than 0.9, it is highly improbable that those products belong together.

In my experience, it is always useful to look at histograms of this kind if you want to estimate how hard it would be to distinguish between classes of items. In dealing with supervised learning problems I would usually look at histograms of all the features where this is possible. When you see distributions that are similar, you know this feature won't be useful. If you see shifted distributions like in this case, you know that this feature contains some information about the thing you want to predict.
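
The histogram comparison described above can be sketched as follows. The text does not define the relative price distance, so the formula used here, |a - b| / max(a, b), is an assumption, as are the toy products, their identifiers and the helper names.

```python
from itertools import combinations


def relative_price_distance(a, b):
    """Assumed definition: price difference relative to the higher price (0 to 1)."""
    return abs(a - b) / max(a, b)


def histogram(values, bins=10):
    """Count values from the range 0..1 in equally wide bins."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts


# Toy products as (identifier, price); identifiers and prices are made up.
products = [
    ("ean1", 499.0), ("ean1", 519.0), ("ean1", 479.0),
    ("ean2", 20.0), ("ean2", 22.0),
]

matching, non_matching = [], []
for (id_a, price_a), (id_b, price_b) in combinations(products, 2):
    distance = relative_price_distance(price_a, price_b)
    (matching if id_a == id_b else non_matching).append(distance)

match_hist = histogram(matching)
non_match_hist = histogram(non_matching)

# Text rendering of the two histograms, one row per bin.
for i, (m, n) in enumerate(zip(match_hist, non_match_hist)):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f}  match {'#' * m:<6} non-match {'#' * n}")
```

In this toy data the two distributions do not overlap at all; real feeds would show overlapping but shifted distributions, as in the graph discussed above.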


Let's take an example of another distance function that seems to be useful: tf-idf cosine distance. This distance is calculated between the texts of the products. At first, the text is transformed into a vector representation. For each unique word in the text there's a dimension in the vector. If we wanted to keep things really simple we could just use boolean values indicating whether a word occurs in a text or not. The problem with this representation is that some words are clearly more important than others. So we use a weighting scheme called tf-idf, which has proven really useful in the field of information retrieval. Then, we normalize the vectors representing our product texts to unit length and calculate the scalar product between all of them. This scalar product represents the cosine of the angle between the two text vectors and is near one for small values of this angle and near zero for big angles. We then transform this cosine similarity to a distance by subtracting it from 1, so that products which are close to each other in vector space have a small distance and products which are different have a larger distance.

The distributions of cosine distances are clearly different between products that belong to each other and the products which do not. It is also interesting that the difference in this distribution looks different from the difference in the price distance distributions. We could expect to be able to extract a different kind of information from this distance.

We now could think of a lot of useful distance functions that reveal information about whether a pair of products belongs together. But for our clustering algorithm we need a single measure of distance. So we have to combine those different distance functions into a single distance. The simplest possible thing we could do is to concatenate all distance function results into a vector of distances for each product pair and then compute its overall length. The higher the combined distance, the lower the probability that a pair of products belongs together. Indeed this method alone does not work too badly.
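
The steps above (tf-idf weighting, normalising to unit length, scalar product, 1 minus cosine, and the length of the distance vector) can be sketched in plain Python. Several details are assumptions, since the text does not specify them: whitespace tokenisation, raw term counts as tf, idf = log(N / df), and the example texts and price distance. The helper names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Unit-length tf-idf vectors as {word: weight} dicts (assumed weighting)."""
    docs = [Counter(text.lower().split()) for text in texts]
    n = len(docs)
    # Document frequency: in how many texts does each word occur?
    df = Counter(word for doc in docs for word in doc)
    vectors = []
    for doc in docs:
        vec = {w: tf * math.log(n / df[w]) for w, tf in doc.items()}
        norm = math.sqrt(sum(x * x for x in vec.values()))
        vectors.append({w: x / norm for w, x in vec.items()} if norm else vec)
    return vectors


def cosine_distance(u, v):
    """1 minus the scalar product of two unit-length vectors."""
    return 1.0 - sum(x * v.get(w, 0.0) for w, x in u.items())


def combined_distance(distances):
    """Length of the vector of individual distances (Euclidean norm)."""
    return math.sqrt(sum(d * d for d in distances))


texts = [
    "apple iphone 7 32gb black",
    "iphone 7 32gb black smartphone apple",
    "screen protector for apple iphone 7",
]
phone_a, phone_b, protector = tfidf_vectors(texts)

d_same = cosine_distance(phone_a, phone_b)      # two offers of the same phone
d_other = cosine_distance(phone_a, protector)   # phone vs. accessory

# Combine the text distance with an (assumed) relative price distance of 0.04.
total = combined_distance([d_same, 0.04])
print(round(d_same, 3), round(d_other, 3), round(total, 3))
```

Words that occur in every text ("apple", "iphone", "7") receive a weight of zero under this idf variant, which is why the phone and the screen protector end up maximally distant even though they share those words.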

However, we can do much better. For all products with an identifier, we know all the pairs that belong together and all pairs that do not. We could use this information as training data to train a binary classifier to distinguish between matches and non-matches given a vector of distance functions. This model should be able to learn a lot of information about whether two products belong together or not just by looking at the vector of distances.

Final Clustering

Now that we have a good measure of the likelihood of two products belonging together, learned from examples of products which do belong together, we still have to apply a clustering algorithm to the distance graph. To get a better idea of what we have to do, consider this 3d rendering of the data structure we try to separate into clusters. Products belonging to the same identifier were assigned the same color.

In practice, all nodes would have the same color and we have to decide which product belongs to which cluster only by looking at the graph structure. Here's another 2d visualization:

Applying k-means clustering with an estimated number of clusters leads to an ok but not great result. We evaluated different clustering strategies, and the one we found worked best for our problem was MajorClust [125]. With proper parameterization, our system was able to reach a precision of over 97% at reasonable recall.

Things that were critical for the success of the project:

- Good editorial frontends
- Clever tricks to reduce the number of product pairs the system had to look at - as the set of potential matches is O(n^2), for n = 60,000,000 this amounts to roughly 1.8 x 10^15 pairs

Possible Applications in the Domain of Healthcare

- Record linkage to determine which textual descriptions of a person belong to the same actual person [126]
- Discover whether a person fits into a cluster of conditions, but isn't labelled as such yet (i.e. AI DOCTOR)
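
The Final Clustering step names MajorClust without describing it. As a rough illustration of the general idea (every node starts in its own cluster and repeatedly adopts the cluster that attracts it most strongly through its weighted edges), here is a simplified sketch. It is not the authors' implementation: the deterministic tie-breaking, the toy graph, the edge weights and the function name are all assumptions.

```python
from collections import defaultdict


def majorclust(nodes, weighted_edges, max_rounds=100):
    """Simplified MajorClust-style clustering on a weighted similarity graph."""
    neighbours = defaultdict(dict)
    for a, b, weight in weighted_edges:
        neighbours[a][b] = weight
        neighbours[b][a] = weight

    label = {node: node for node in nodes}  # every node starts as its own cluster
    for _ in range(max_rounds):
        changed = False
        for node in nodes:
            if not neighbours[node]:
                continue  # isolated nodes keep their own cluster
            # Sum the edge weights towards each neighbouring cluster.
            attraction = defaultdict(float)
            for other, weight in neighbours[node].items():
                attraction[label[other]] += weight
            best = max(attraction.values())
            candidates = sorted(l for l, w in attraction.items() if w == best)
            # Assumption: on ties keep the current label if possible,
            # otherwise take the alphabetically first candidate.
            new_label = label[node] if label[node] in candidates else candidates[0]
            if new_label != label[node]:
                label[node] = new_label
                changed = True
        if not changed:
            break

    clusters = defaultdict(list)
    for node in nodes:
        clusters[label[node]].append(node)
    return sorted(clusters.values())


# Toy graph: two tight groups of offers joined by one weak edge (weights made up).
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [
    ("a", "b", 1.0), ("a", "c", 1.0), ("b", "c", 1.0),
    ("d", "e", 1.0), ("d", "f", 1.0), ("e", "f", 1.0),
    ("c", "d", 0.1),
]
result = majorclust(nodes, edges)
print(result)
```

Unlike k-means, this kind of procedure does not need the number of clusters in advance; in the setting above the edge weights would come from the learned match likelihood between product pairs.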

125 Stein, Benno, and Sven Meyer zu Eißen. Document categorization with MajorClust. Proc. 12th Workshop on Information Technology and Systems, Dec. 2002.
126 Christen, Peter, Tim Churches, and Markus Hegland. Febrl - a parallel open source data linkage system. Pacific-Asia Conference on Knowledge Discovery and Data Mining, 26 May 2004: 638-647.

2.3 Description and Assessment of Tools Used to Work on Huge Data Sets

Everything gets more complex when the size of data sets grows. Things that seem easy or normal for small data sets become complex and difficult when data size grows. For example, the quickest way to upload or download large data sets to different cloud providers is simply to ship hard drives [127] directly to the data center. The simple act of moving data from one place to another becomes a logistical task when the data is large enough. Naturally, there are tools and programs to help manage large or huge data sets [128], so in the following we will list some of them and discuss the data size and purpose they are intended for.

While in the past, large-scale computing was often run on specialized hardware, all solutions we present here run on commodity computers that can easily be bought or rented anywhere, bringing down cost massively.

The typical first choice for data processing is Microsoft Excel. Almost by definition, data processed in Excel is not big data. The maximum size of data that can be processed in Microsoft Excel is determined by the workstation it is run on, but can usually not be larger than 1 GB. Analysis is limited, and there are also well-known flaws in Excel's data processing that can

Data size | Processing options | Problems and risks
0 - 1 GB | Microsoft Excel, standard desktop tools | Known for inaccurate results at large sizes; limited by desktop machine capabilities
0 - 1 TB | Custom scripting, for example using Python or R | Data storage must be chosen; analysis speed limited by a single machine; might be exploratory, so the investment needs to be checked
0 - 100 TB | SQL database | Might not offer the specific query capabilities needed, in which case it is combined with custom scripts
10 TB - 1000 TB | NoSQL databases | Many different options with different offerings must be considered; custom scripting is usually also necessary; distributed systems offer different cost/speed trade-offs
10 TB and up | Apache Hadoop or other Map/Reduce system | Distributed system must be administered; usually a big investment; only necessary for large data sizes but often misused for smaller data sets

An overview of data size categories and the options for handling storage and processing in each of them.

127 Bright, Peter. Need to get a bunch of data onto Windows Azure? FedEx your hard drives, Ars Technica, 5 November 2013 http://arstechnica.com/information-technology/2013/11/windows-azure-cloud-services-now-accepting-data-uploads-by-fedex/
128 It is hard to define what large or huge data sets mean, and care has to be taken when working with a specific application to determine whether the size needs are fulfilled by it. Also, the actual sizes of data sets are changing rapidly. What counted as a huge data set some years ago may now be viewed as small. While the largest accepted definitions of big data start between five and ten terabytes, already in 2012 Facebook announced that their graph data alone was measured in petabytes, while their total storage was in the high hundreds of petabytes. http://www-conf.slac.stanford.edu/xldb2012/talks/xldb2012_wed_1105_DhrubaBorthakur.pdf


lead to flawed results[129], so using Excel for large-scale or important data analysis is strongly discouraged.

Custom scripts for data analysis are best for data of the next step up in size and accessibility. These scripts are usually written in a scripting language like Python or R, intended to be run on single machines. These are useful in situations where storage is not likely to be an issue, namely for data sets less than 1 TB in size. Some storage mechanism has to be provided, so either raw files, CSV files, JSON data sets or some other storage backend must be present. The major advantage of a custom script is its flexibility, since anything that can be programmed can be tried out. However, since speed of analysis is then dependent on the operations, custom scripting is often used in the initial, exploratory phases of a project to determine the final needs of data analysis tooling. Investment in such an exploratory phase is rarely wasted, since the exact analysis mechanisms are usually not known beforehand to a fine-enough degree, so that precise determination of needs in advance can prevent costly corrections later on. Additionally, once exploration is finished, the custom scripts can be adapted into other analytical tools or re-worked to fit larger data sets.

To meet the storage needs of larger data sets, and also to offer somewhat specialized query engines, the next step up is a regular SQL database engine like MySQL, PostgreSQL, Microsoft SQL Server, or Oracle SQL. Due to prohibitive licensing costs at larger scales, open source SQL systems are usually preferred. Even though most installations of such systems are used to handle data sizes in the megabytes, these systems scale surprisingly well even into large terabyte sizes. As an example, both Google and Facebook have used MySQL installations well into the 1000s of terabytes, albeit quite customized ones[130, 131] (note: both have now reached scales where custom systems have been implemented). Since SQL systems offer both storage and processing tools, using such a common off-the-shelf database can be an excellent choice for sizes up to 100 TB.

When crossing the threshold of 10 TB, now-common NoSQL databases start to become meaningful alternatives. These databases do not rely on the complicated SQL standard, but instead store documents in a key-value format or in custom formats. The advantage is that querying these databases is very fast because there is very little query logic involved. The downside is that processing is not usually available. As such, these databases can be used as a storage backend that feeds into a query processor or analysis engine, such as one produced by custom scripting as outlined above. There are a variety of common open source NoSQL systems available, offering vertical and horizontal scaling, essentially without size limits. The most widely-used of these databases are MongoDB[132], Apache Cassandra[133], Redis[134], and Elasticsearch[135]. Each NoSQL database has its own advantages and disadvantages, which must be evaluated before using any such database. Some come with integrated query capabilities to different extents, others simply offer storage.

Finally, at the largest sizes, there exist specialized storage/analysis systems built specifically for big data that are able to handle sizes well into the hundreds of terabytes without problems. The most well-known system is Apache Hadoop[136], which consists of a distributed file system called HDFS (which can be used as a NoSQL database) and an attached Map/Reduce system. The Map/Reduce paradigm is a method of treating very large datasets in a structured way on a large number of machines and was first published by Google. Most big data systems

129 Helsel, Dennis. Is Microsoft Excel an Adequate Statistics Package?, Practical Stats. http://www.practicalstats.com/xlsstats/excelstats.html
130 Maitland, Jo. Google moves AdWords off MySQL to F1, 30 May 2012, GigaOm. https://gigaom.com/2012/05/30/google-moves-adwords-off-mysql-to-f1/
131 Borthakur, Dhruba. Petabyte Scale Data at Facebook, September 2012. http://www-conf.slac.stanford.edu/xldb2012/talks/xldb2012_wed_1105_DhrubaBorthakur.pdf
132 https://www.mongodb.com/
133 http://cassandra.apache.org/
134 http://redis.io/
135 https://www.elastic.co/
136 http://hadoop.apache.org/
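
The Map/Reduce paradigm described on this page can be illustrated on a single machine. The following is a minimal sketch in plain Python, not Hadoop's API: the record layout, the diagnosis codes, and the function names (`map_record`, `shuffle`, `reduce_counts`, `run_map_reduce`) are assumptions made purely for illustration. On a real cluster, the map and reduce phases would be distributed across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_record(record):
    """Map phase: emit one (key, 1) pair per diagnosis code in a record."""
    return [(code, 1) for code in record["diagnoses"]]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: combine all values for one key into a single result."""
    return key, sum(values)


def run_map_reduce(records):
    """Run map, shuffle and reduce sequentially on one machine."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in grouped.items())


# Hypothetical toy records, invented for this sketch.
records = [
    {"patient": 1, "diagnoses": ["J10", "E11"]},
    {"patient": 2, "diagnoses": ["J10"]},
    {"patient": 3, "diagnoses": ["I10", "E11", "J10"]},
]
counts = run_map_reduce(records)
print(counts)
```

Because each record is mapped independently and each key is reduced independently, both phases can be spread over many machines, which is what makes the paradigm suitable for very large data sets.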


will eventually use a Map/Reduce mechanism, and as such this is the current gold standard for big data systems over the critical sizes mentioned above.

While there are several proprietary systems offered by vendors such as Oracle or SAP, these are usually quite cost-prohibitive and are often built on top of Apache Hadoop besides.

Since any big data project necessarily operates on large data sizes, the necessary computing infrastructure always plays a part in implementing such a system. The usual vendors each have their own offering of such systems that include automatic scaling and pay-per-use. Examples are Google's BigTable[137] or Cloud Dataflow[138], Amazon's DynamoDB[139] and Redshift[140], and Microsoft's DocumentDB[141] and Table[142] storage. However, all of these solutions rely on cloud offerings, which are also hosted outside of the European Union and are thus usually not legally usable for sensitive data.

In conclusion, there is a large selection of storage and analytics engines for all sizes and needs, both in cloud solutions and on-premises. The smallest solutions can be implemented quickly and used for exploration and specification, while the gold standard is represented by a cluster running Map/Reduce programs on Apache Hadoop. Many solutions are open source and can be used without cost, or reinforced by commercial support offerings, and all solutions can be run on widely available commodity hardware.

State of Technology Development With the Goal of Understanding how Big Data in Healthcare can be Exploited in a Pharma Setting

On a very general level, big data techniques can be grouped into a handful of categories, of which these three appear most interesting: analytics, clustering, and feature extraction. Each of these categories has different properties and different application scenarios.

The first category is analytics, and is probably the most traditional statistical technique. In general it means determining some quality of a data set according to a specific standard. In fact, key methods of modern statistics were developed at a brewing company interested in analyzing their products and finding the factors that led to a stable product quality[143]. Naturally, analytics are used by almost every company nowadays, especially so in healthcare and pharma settings. Usually, well-known database systems are used to store moderate amounts of data for statistical analysis regarding specific questions and hypotheses. However, once the size of these data sets crosses into the unwieldy (by traditional standards), the emphasis shifts from pure hypothesis testing into hypothesis generation and exploration. However, this is still led by users and is a rather more manual way of dealing with big data.

Naturally, analytics can help the pharmaceutical industry in the same ways it can help other industries, namely by streamlining processes. However, it can also be used to evaluate other kinds of data and can be useful in interpreting many different kinds of measurements.

The second category is clustering. Here, varied, multi-dimensional data are treated by specialized algorithms to create a clustering, or labeling, of each individual data point. Instead of finding qualities of a data set, groups of data points are found that somehow belong together, i.e., a subset or cluster is built into the data set. This is sometimes called labeling because each group can also be called a label, so putting one data point into a group is the same as attaching a specific label to it. Also, once

137 https://cloud.google.com/bigtable/
138 https://cloud.google.com/dataflow/
139 https://aws.amazon.com/dynamodb/
140 https://aws.amazon.com/redshift/
141 https://azure.microsoft.com/en-us/services/documentdb/
142 https://azure.microsoft.com/en-us/services/storage/tables/
143 Kopf, Dan. The Guinness Brewer Who Revolutionized Statistics, Priceonomics, 11 December 2015. https://priceonomics.com/the-guinness-brewer-who-revolutionized-statistics/
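
To make the idea of clustering as labeling more concrete, the following is a minimal sketch of k-means, one well-known clustering algorithm. The text does not name this algorithm, so its choice here is an assumption, as are the function name `kmeans`, the fixed starting centroids, and the invented two-dimensional measurements. Every data point receives the index of its nearest cluster centre as its label.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between labeling points and moving centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep a centroid that attracted no points
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical measurements (two values per data point), invented for this sketch.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, [points[0], points[3]])
print(labels)
```

The algorithm only produces group indices. Deciding what each group means, for example which group corresponds to which species or patient population, remains an interpretation step for the analyst.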


the items are grouped it is usually easy to attach labels to the groups, for example to differentiate species of plant specimens[144]. One of the major breakthroughs in big data was the design of automatic clustering algorithms, such as support vector machines. There exist several well-understood clustering algorithms now that need very little human input and can be used to extensively label data quickly and accurately. However, guidance and interpretation of results is still needed in many cases.

It is hard to predict how such clustering methods can be used specifically in pharma and healthcare, simply because there are many cases where a grouping of data points can be useful. Do patients with a specific outcome constitute a cluster? Are there unknown clusters of symptoms or side effects reported? Many different cases and questions can be approached with clustering algorithms, so an exhaustive list is impossible to generate.

The third and most recent category in big data can be called feature extraction. This is a technique where varied, unlabeled and unstructured data is associated with specific features that are not obvious from the data itself. Most recent breakthroughs come from this field, specifically in image and sound classification. One example is detection of dog breed from simple images[145]. While this task might be simple for human viewers, it was traditionally very hard for computers to answer these questions with any accuracy, especially without overfitting to other factors of training data such as the average lightness of an image[146]. However, recent advances in this field have yielded robust results even on very unstructured data, such as user-generated images or continuous measurement data. Instead of relying on labeling and directed training phases, these new algorithms rely on sheer volume of data for their efficacy. Since this field is quite young and advances are frequent, it is hard to estimate the impact it will have on data science as a whole, as well as the use cases it will cover. However, problems have already been solved that seemed intractable only a few years ago. As such, it seems plausible that feature extraction might be useful in medical and pharmaceutical settings as well, given that enough source data is available.

Even though the most recent and most surprising use cases for big data have been in fields as diverse as retail[147], automotive[148], fashion[149], and education[150], there have been several forays into the usage of big data in a healthcare and pharma setting.

The most famous example of this is naturally the Google Flu Trends debacle, in which Google offered to mine their vast database of search queries and find search terms that correlate to outbreaks of flu in certain areas. This was then taken as an indication that it should be possible to predict outbreaks of flu with these search terms, even though it later turned out that this was not possible[151]. In fact, projections based on 3-week-old case data as collected by the CDC yielded better projection accuracy than did Google's data mining. Furthermore, it looks as though Google's selected search terms correlated much better to the media's presentation of flu trends than to actual cases. As such, the fanfare created by the publication of Google Flu

144 Iris Flower Data Set. Wikipedia, Wikipedia.org, np. https://en.wikipedia.org/wiki/Iris_flower_data_set
145 https://www.what-dog.net/
146 Machine learning and unintended consequences, LessWrong.com, 23 September 2011. http://lesswrong.com/lw/7qz/machine_learning_and_unintended_consequences/
147 Hill, Kashmir. How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did, Forbes, 16 February 2012. http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/#3351b6a734c6
148 Big data and analytics in the automotive industry; Automotive analytics thought piece. Deloitte. 2015. https://www2.deloitte.com/content/dam/Deloitte/uk/Documents/manufacturing/deloitte-uk-automotive-analytics.pdf
149 Noyes, Katherine. What's on trend this season for the fashion industry? Big data, Fortune.com, 22 September 2014. http://fortune.com/2014/09/22/fashion-industry-big-data-analytics/
150 Dede, Chris; and Ho, Andrew. Big Data Analysis in Higher Education: Promises and Pitfalls, Educause Review, 22 August 2016. http://er.educause.edu/articles/2016/8/big-data-analysis-in-higher-education-promises-and-pitfalls
151 Lohr, Steve. Google Flu Trends: The Limits of Big Data, Bits, 28 March 2014. http://bits.blogs.nytimes.com/2014/03/28/google-flu-trends-the-limits-of-big-data/?_r=0
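
What it means for search terms to "correlate" with flu outbreaks can be sketched with the Pearson correlation coefficient. The weekly numbers below are invented purely for illustration and are not taken from Google Flu Trends or the CDC; the function name `pearson` is likewise an assumption. A coefficient close to 1 says that two series move together, not that one causes the other, which is why such a finding is only a starting point for further investigation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented weekly figures, for illustration only.
search_volume = [10, 20, 30, 40, 50]   # searches for a flu-related term
reported_cases = [12, 24, 33, 41, 55]  # reported flu cases in the same weeks

r = pearson(search_volume, reported_cases)
print(round(r, 3))
```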


Trends might very well have impacted its own accuracy.

It is important to keep in mind that big data and data mining very rarely turn up causal links. Rather, previously undiscovered correlations and clusters are found in existing data sets. There is tremendous value in finding such correlations since they point the way towards further, causal investigation. Similarly, clusters or patterns in data are enormously useful, especially in a healthcare context; what else are illnesses but clusters of symptoms, and how better to find new ones or more clearly specify known ones than to look at these clusters and their patterns?

There have been other uses of big data specific to health care, although not as widely publicized as Google Flu Trends.

We find that each of the three categories outlined above (analytics, clustering, and feature extraction) shows great promise to be used productively in pharma and healthcare settings, where many different use cases are possible.

Learn from Practical Examples that Have Been Used Successfully in Healthcare, Medicine, Life Sciences, and Outside the Healthcare Industry

Judging by the plethora of applications that harness the power of big data across industries, one cannot deny that big data analytics has already evolved from a high-potential to a high-impact technology. A machine learning algorithm outperformed general practitioners in predicting depression in individuals, solely by analyzing the composition of their pictures on Instagram[152]. Rapid progress in deep learning has enabled cognitive systems like Google's DeepMind to become the world's best in the game of Go[153], and the gap to human-level performance is closing in other domains[154]. Clearly, big-data applications are not exclusive to specific industries. Take for example the long list of projects that are making use of IBM's big-data analytics capabilities. IBM presents use cases in the banking, government, education, insurance, and healthcare industries, in business areas ranging from finance to risk management, to name just a few[155]. Despite the diversity of applications, common threads can be detected. Whether it is about understanding the patterns of prospective criminals or predicting readmission risks of post-surgery patients, big-data analytics often yields actionable information about potential scenarios in the future. This section gives an overview of exemplary big data analytics projects that are making an impact in the real world today.

Modern law enforcement practices have been strongly shaped by advances in information technology and scientific research. The increasing stream of behavioral data from numerous sources is now opening up new possibilities in this area. A data-driven approach called predictive policing allows law enforcement agencies to focus their efforts on at-risk locations and suspects who are most likely to be involved in upcoming crimes. The underlying process enabling this practice involves pattern detection in big data databases to gain a better understanding of criminal activity, ultimately allowing for a more effective deployment of police officers. Such predictive policing systems are no futuristic scenario: 70-90% of US police departments surveyed in 2013 were using or intended to use predictive methods by 2016[156], and Germany and Switzerland have implemented a predictive system called Precobs in multiple police departments[157]. A few years ago, the city of Chicago, using data-driven law enforcement measures, generated and publicly released a "heat list" of 400 people who were deemed to be most prone to violence[158]. In 2016, more than 70% of people shot and 80% arrested in connection with shootings by

152 Reece, Andrew G; and Danforth, Christopher M. Instagram photos reveal predictive markers of depression, https://arxiv.org/ftp/arxiv/papers/1608/1608.03282.pdf
153 http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html
154 https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf
155 http://www.ibm.com/analytics/us/en/case-studies.html
156 https://netzpolitik.org/wp-upload/LKA_NRW_Predictive_Policing.pdf
157 http://www.faz.net/aktuell/gesellschaft/kriminalitaet/software-programm-precobs-berechnet-ort-von-einbruechen-13966153-p2.html
158 http://articles.chicagotribune.com/2013-08-21/news/ct-met-heat-list-20130821_1_chicago-police-commander-andrew-papachristos-heat-list


mid-year were on this now 1,400-person list, according to the Chicago police as cited in the NY Times[159]. On the business side, Microsoft (Cortana Analytics)[160], Hitachi (Predictive Crime Analytics)[161] and IBM[162] all offer crime-management solutions and present successful use cases thereof, such as cutting violent gun crime by 46% over a 7-year period in Durham.

US startups like BLOCKpeek seek to carry this technology into the private-consumer market. BLOCKpeek is working on a mobile app that is powered by predictive analytics that warns users of nearby potential hazards like demonstrations, shootings or severe weather[163].

Further illustrating the role of big data in this context, a large-scale study by A. J. Rosellini et al. aimed at the development of an actuarial model to predict future violent crimes among US Army soldiers. A machine learning algorithm was trained on data relating to 975,057 soldiers who served during a six-year period. The developed algorithm found certain key predictors of future crimes, such as a disadvantaged social/socioeconomic status, mental disorder treatment and prior crime. Of all major physical crimes (e.g. murder-manslaughter, kidnapping, robbery), 36.2% were committed by the group of soldiers with the highest predicted risk, who made up 5% of the total population of male soldiers. These results suggest that such models could be used for crime prediction purposes, driving decisions about preventive measures such as routine checkups, early interventions, or increased support[164].

In healthcare, the emergence of EHRs, quantified self-tracking, and other data-amassing trends are enabling a big data revolution. Furthermore, the average researcher who reads 250 to 300 articles annually cannot keep up with most of the novel scientific achievements in a world where scientific output doubles about every 9 years. Connections between many data sources might remain undiscovered as a result, leaving a huge potential untapped. Big data analytics tools present opportunities across all of these touchpoints: either creating new knowledge by detecting patterns in vast amounts of medical data, or gaining actionable insights from existing knowledge much faster than humanly possible. As the following examples suggest, big data analytics supercharges progress in Precision Medicine as well as in pharmaceutical R&D and offers exceptional potential for the optimization of diagnosis and improvement of prevention through predictive measures. This development is not only of interest to patients but also to businesses, since big-data strategies could generate additional revenues of up to US$100 billion annually in the US healthcare system alone[165].

Precision Medicine promises to usher in the era of personalized medicine, where "diagnostic, prognostic, and therapeutic strategies [are] precisely tailored to each patient's requirements"[166]. One must distinguish between therapies that are exclusively created for one individual patient and those that would fall in the category of mass customization. As a first step, the trial-and-error approach can be abandoned in many cases in light of stratified medicine and companion diagnostics. For instance, targeted oncological therapies have already been developed and are in clinical use today, owing to a better understanding of cancers and their distinct

159 http://www.nytimes.com/2016/05/24/us/armed-with-data-chicago-police-try-to-predict-who-may-shoot-or-be-shot.html
160 https://enterprise.microsoft.com/en-us/industries/government/fighting-crime-with-big-data-analytics/
161 https://www.hds.com/en-us/pdf/case-study/hitachi-success-story-austin-police-department.pdf
162 http://www-03.ibm.com/software/businesscasestudies/hk/en?synkey=E906175Y75689V95
163 https://www.deutsche-startups.de/2015/10/14/blockpeek-moechte-die-welt-zu-einem-besseren-ort-machenwarte-auf-mail/
164 Rosellini, AJ, et al. Predicting non-familial major physical violent crime perpetration in the US Army from administrative data. The National Center for Biotechnology Information, January 2016. http://www.ncbi.nlm.nih.gov/pubmed/26436603
165 Cattell, Jamie, et al. How big data can revolutionize pharmaceutical R&D, April 2013, McKinsey & Company. http://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-big-data-can-revolutionize-pharmaceutical-r-and-d
166 Mirnezami, Reza, et al. Preparing for Precision Medicine, New England Journal of Medicine, 9 February 2012. http://www.nejm.org/doi/full/10.1056/NEJMp1114866#t=article


genomic signature[167]. In certain types of lung cancers with specific mutations, for example, therapeutic selection based on genomics has become the standard of care. Cost-lowering next-generation sequencing and the analysis of this big data now promise to explore the molecular landscape of tumors further to ultimately optimize therapy strategies for individual patients[168].

Another example of mass customization is a companion diagnostics test, a pre-treatment test performed in order to determine whether or not a patient is likely to respond to a given therapy[169]. In Germany, more than forty drugs accompanied by such tests are currently on the market[170]. The ultimate treatment in personalized medicine, however, would go beyond selecting the most suitable therapy from a broad arsenal of available drugs. One example of such a therapy is individualized immunotherapy in cancer patients. Clinicians would produce medicine tailored to the individual patient by identifying the exact composition of peptides in the patient's tumor cells and training the immune system to fight these cells[171].

In addition to better understanding cancer genomes, human genome sequencing on a massive scale is garnering increasing attention. With privately held genome-sequencing companies like Human Longevity Inc. and genotyping-focused 23andMe entering the market, DNA databases are growing in size. Companies like Deep Genomics specialize in using deep learning to predict the consequences of genomic alteration on cell mechanisms[172]. The academic community too is showing increasing interest in accessing assets like 23andMe's data pool of 1.2 million genotyped customers[173]. Studies that failed at finding links between genes and certain diseases might be more likely to succeed if the number of individuals studied were raised[174]: whereas a study incorporating 9,000 subjects did not yield any insights about genes causing depression[175], a study involving data from over 300,000 23andMe customers found 17 single nucleotide polymorphisms in 15 genetic loci associated with depression[176].

Being aware of the importance of big data in the omics field, governments are initiating large-scale projects. The US NIH Precision Medicine Initiative Cohort is planning a project involving a cohort of 1 million participants, with the goal of amplifying precision medicine's successes in oncology as well as extending them to a broad range of diseases. Large amounts of diverse data sources will join molecular, genomic, cellular, clinical, behavioral, physiological, and environmental parameters to form an immensely diverse database[177].

Cognitive systems like IBM's Watson have attained levels of diagnostic performance that rival and sometimes even surpass human experts. In the case of a female leukemia patient in Japan, Watson for Oncology refined the original diagnosis to a specific rare type of leukemia and suggested a different treatment[178]. Watson's ability to design such evidence-based treatment regimens stems from the analysis of scientific literature, including

167 Targeted Cancer Therapies. National Institute of Health, www.cancer.gov. April 2014. https://www.cancer.gov/about-cancer/treatment/types/targeted-therapies/targeted-therapies-fact-sheet
168 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4397718/
169 http://onlinelibrary.wiley.com/doi/10.1002/ddr.21029/abstract
170 http://www.vfa.de/de/arzneimittel-forschung/datenbanken-zu-arzneimitteln/individualisierte-medizin.html
171 https://www.jung-stiftung.de/de/presse-downloads/pressemeldungen/2016/personalisierte-immuntherapie-gegen-krebs
172 Deep Genomics launches, uniting deep learning and genome biology. Kurzweilai.net, 22 July 2015. http://www.kurzweilai.net/deep-genomics-launches-uniting-deep-learning-and-genome-biology
173 https://blog.enlightenbio.com/2016/02/15/at-agbt-2016-the-winners-are-long-reads-and-whole-solutions/
174 23andme blog
175 https://www.ncbi.nlm.nih.gov/pubmed/22472876
176 www.nature.com/ng/journal/vaop/ncurrent/full/ng.3623.html
177 https://www.nih.gov/precision-medicine-initiative-cohort-program
178 http://www.japantimes.co.jp/news/2016/08/11/national/science-health/ibm-big-data-used-for-rapid-diagnosis-of-rare-leukemia-case-in-japan/%252523.V7B-6JN97ow


"MSK curated literature and rationales, [...] over 290 medical journals, over 200 textbooks, and 12 million pages of text"[179]. Mapping the knowledge gained hereby to a patient's medical record allows Watson to provide ranked treatment options for individual patients in record time. Consequently, the identification of the specific mutated genes of diagnostic importance in the case of the Japanese patient took 10 minutes, a drastic improvement compared to the 2-week period required for human scientists to perform the same task.

Taking into account that cognitive systems already reach accuracies of over 97% in facial recognition tasks[180], the high potential of such systems to perform accurate medical image analysis does not seem far-fetched. Indeed, IBM is unleashing this potential in the form of an algorithm that distinguishes melanoma from 12 benign skin diseases. To date, the algorithm reaches accuracies of 83% to 91%[181], depending on image quality, and is in use in the US hospital Memorial Sloan Kettering[182]. Two projects were

179 http://www.ibm.com/watson/health/oncology/
180 https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf
181 http://www-03.ibm.com/press/us/en/pressrelease/50057.wss
182 http://www.popularmechanics.com/science/health/a13391/ibm-skin-cancer-detection-system-memorial-sloan-kettering-17545836/


announced in 2016 to further improve the accuracy of the aforementioned machine learning algorithm, aiming at mining more than one million images.

The technique by which cognitive computing systems like Watson successfully interpret vast amounts of structured and unstructured data and thereby generate diagnostically relevant insights is also highly relevant to the pharmaceutical research process. IBM Watson Discovery Advisor for Life Sciences promises to accelerate the drug discovery process by rapidly interpreting data sources like scientific publications, patients, and genomics databases, and presenting the information in a condensed form[183]. Watson is able to accomplish this feat using deep learning natural language processing, understanding not only the meaning of single words but also the relationship between them. This way, Watson optimizes the researcher's ability to work with existing evidence by enabling them to quickly make out the most important areas to focus on. For instance, if the relationship between a given disease and a certain gene is of interest, Watson processes all scientific publications available within hours and uses visualization techniques to create a holistic network map containing all connections mentioned in the studied data (see figure 4).

Not only does this procedure solve the impossible task of reading millions of articles and prevent human bias from contaminating the results of the search for information, but it also allows for connections across domains to be made. Whereas serendipity has previously been the reason for many medical breakthroughs in which seemingly dissimilar domains coincidentally revealed commonalities, the integration of data from disparate domains (across therapeutic areas, study types, and steps in the drug development process) can replace the serendipity-based approach with a systematic one. In the case of the cancer researcher, they would most probably

183 http://www.ncbi.nlm.nih.gov/pubmed/27130797


limit their search to cancer literature. In contrast, Watson would draw on all information available, regardless of disease, journal or even species of study.

After the interpretation and visualization of the data, Watson continues mimicking the human decision-making process by engaging in the evaluation of the information. Machine learning and predictive analytics are used in this evaluation phase to generate new hypotheses about relationships for which no evidence is yet available. Two examples illustrate the practical significance and impact this procedure will have on the identification of new drug targets and the repurposing of existing drugs. In a 2013 retrospective study, Watson was used to detect high-potential cancer kinases that phosphorylate the P53 protein. Having learned from a training dataset, which included evidence of all kinases that had been observed to phosphorylate P53 through the year 2002, Watson identified 9 novel potential kinases most likely to have the desired effect. Indeed, 7 out of the 9 kinases predicted by Watson had been discovered and validated between 2002 and 2013 by human researchers.

The second example suggests that Watson's usefulness is not limited to the accelerated identification of new potential drug targets, but also reaches into the area of drug repurposing. The bringing together of cross-domain data, as explained above, was the key in identifying potential compounds in the existing portfolio of a pharmaceutical company to treat malaria. For this purpose, Watson can draw on information from sources like preclinical study results, clinical trial data, ADE reports in drug safety databases and databases comprising all approved therapies and existing drugs. In the case of the malaria project, Watson first analyzed MEDLINE literature to identify drugs that suggested efficacy against the malaria parasite. Second, Watson filtered the portfolio of the pharmaceutical company for any compounds of similar structure. This 1-month endeavor left Watson with 15 drug candidates. In contrast, it took 10 research scientists 14 months to perform the same task, with similar results. Approximately half of the drugs identified by Watson matched the ones found by the scientists. Whether the other half turned out to be of value was not disclosed by the pharmaceutical company.

Deploying predictive health systems based on big data will yield substantial benefits even in the short term, according to Bates et al.[184]. Adverse effects, high-cost patients, triage, and hospital readmissions are among the many use cases where data analytics can deliver value to both patients and healthcare organizations. Simple predictive analytical tools for identifying high-risk patients and predicting patient readmission risks and death are already in clinical use internationally. However, the predictive power of industry-standard models like the LACE model varies, and proves to be ineffective in certain patient groups[185] or diseases[186]. Predictive analytics solutions based on big data analytics involve higher costs, since one-size-fits-all approaches do not work and solutions must be tailored to specific institutions in order to be effective.

Every provider's process and data are different, which is one of the reasons why big data predictive tools are hard to implement[187]. However, if institution-specific big data prediction models are applied, they outperform the LACE model, sometimes being twice or three times more effective[188]. Such insights can improve the quality of care by making effective use of resources to influence future outcomes. Furthermore, other studies show that most risk-prediction models (almost all of those tested being logistic regressions) perform poorly in predicting hospital readmission risk[189] while more complicated models using big data, particularly deep learning, can substantially raise

184 http://content.healthaffairs.org/content/33/7/1123.abstract
185 http://www.ncbi.nlm.nih.gov/pubmed/22644078
186 http://www.ncbi.nlm.nih.gov/pubmed/25099997
187 http://www3.gehealthcare.com/en/insights/forward_thinking/forward_thinking/why_has_not_big_data_transformed_healthcare
188 http://www.ncbi.nlm.nih.gov/pubmed/26363683
189 http://jama.jamanetwork.com/article.aspx?articleid=1104511

Digital Transformation in Healthcare 63


A Whitepaper of the Healthcare Futurists GmbH
2.3 Description and Assessment of Tools Used to Work On Huge Data Sets

the predictive accuracy[190]. Intel with Cloudera[191], GE-owned Caradigm[192] and IBM[193] are some of the companies currently in the market for predictive analytics solutions aimed at reducing hospital readmissions.

Working With Registry Data According to Policy Makers and Actual Examples

Handwritten medical data is increasingly giving way to its digital counterpart[194]. The increasing adoption of electronic health records (EHR) and electronic medical records (EMR) not only facilitates day-to-day medical practice, but is also tremendously relevant to the success of registry-based studies. Such studies can become more powerful, more feasible, and more easily conducted as a result of having greater amounts of healthcare data at their disposal.[195] As indicated in the previous section, large amounts of data from EHRs can be used to support clinical decision-making in areas such as predicting readmission risks or empowering diagnoses. This section will explore the applicability of registry studies in medical research and point to potential regulatory hurdles.

Registries can be used for a wide variety of purposes, such as determining clinical effectiveness of treatments in real-world conditions (more in section Future of EBM); assessing the safety of pharmaceutical drugs; and measuring overall quality of care. With electronic medical data available in sources such as EMRs, EHRs, hospital records, and payer claims databases, registry studies can become a much more powerful tool. However, the adoption of such electronic medical systems varies greatly by country. While the US adoption rates of EHRs in hospitals[196] and physician offices[197] have already surpassed the 75% mark, Austria's stepwise introduction of its EHR system ELGA was only initiated at the end of 2015[198], and Germany's goal is to have a working EHR system with information about medication, medical reports and self-measured data by 2018[199]. Despite these differences, large-scale registry studies are feasible today, as shown by a recent study that examined treatment pathways. In this study, data from EHRs and administrative claims databases from a total of 250 million patients was used.[200]

The feasibility of exploiting EHRs and registries for research purposes is strongly linked to compliance with legal standards. The studies mentioned in this section that used EHR data generally do not include a detailed description of the process by which regulations were met. The following paragraph gives a first look at the requirements for conducting registry studies by taking the US regulatory landscape as an example; there are different ways of getting permission to use registry data for research purposes[201].

First, existing research cohorts could be merged to form a larger registry, requiring only an update in the terms of patients' consent[202]. Second, clinical data could be acquired through organizational partnerships with community-based hospitals, clinics, pharmacies, health systems like Kaiser Permanente, and many more potential candidates. In the case of comparative effectiveness research with participating institutions

190 http://www.sciencedirect.com/science/article/pii/S1532046415000969
191 http://www.intel.com/content/www/us/en/healthcare-it/solutions/documents/predictive-analytics-reduce-hospital-readmission-rates-white-paper.html
192 https://www.caradigm.com/en-us/solutions-for-population-health/healthcare-analytics/
193 https://www-01.ibm.com/software/sg/industry/healthcare/pdf/setonCaseStudy.pdf
194 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4341817/
195 Registries for Evaluating Patient Outcomes: A User's Guide Volume 2, p. 59
196 https://www.healthit.gov/sites/default/files/data-brief/2014HospitalAdoptionDataBrief.pdf
197 http://content.healthaffairs.org/content/early/2014/08/05/hlthaff.2014.0445
198 https://www.elga.gv.at/faq/wissenswertes-zu-elga/index.html
199 http://digitalpresent.tagesspiegel.de/patientendaten-sollen-europaweit-vernetzt-werden
200 http://www.ncbi.nlm.nih.gov/pubmed/27274072
201 https://www.nih.gov/sites/default/files/research-training/initiatives/pmi/opportunities-challenges-electronic-health-records.pdf
202 Registries for Evaluating Patient Outcomes: A User's Guide Volume 1, p. 19


located in multiple states, for example, one must comply with state- and institution-specific regulations and policies[203]. Third, patients themselves could become active agents through the right to access and share personal health information, called the Blue Button.

HIPAA compliance and research laws play an important role in using electronic medical data for research. Pharmaceutical companies that fall into the category of providers that transmit any health information in electronic form in connection with a transaction or business associate[s] of another covered entity can be classified as a covered entity subject to the Health Insurance Portability and Accountability Act (HIPAA) and the HITECH Act[204]. As such, they are subject to regulations like the Privacy Rule and the Security Rule, regulations establishing national standards to protect individuals' medical records and other personal health information as well as individuals' electronic personal health information, respectively[205].

In addition to the Privacy Rule, the Common Rule, one of the U.S. Department of Health & Human Services (HHS) regulations and the uniform set of regulations on the ethical conduct of human subjects research, is an important regulation to consider[206]. Institutional Review Board approval is required when an institution is engaged in research with human subjects. The Common Rule applies to all research involving human subjects conducted, supported or otherwise subject to regulation by any federal department or agency. However, an institution is exempt from the Common Rule if (1) the information of interest is already publicly available or (2) the information is recorded in such a manner that subjects cannot be identified[207].

The applicability of regulatory requirements for registry studies in particular can be evaluated using the official guide of the Agency for Healthcare Research and Quality (AHRQ), Registries for Evaluating Patient Outcomes. It outlines in great detail which aspects must be considered to successfully conduct registry studies for the purposes of public health activities, governmental health program oversight, quality improvement/assurance (I/A) and research. The applicability of the Privacy Rule and the Common Rule depends on many factors: the purpose of a registry, the type of entity that creates or maintains the registry, the types of entities that contribute data to the registry, the extent to which registry data are individually identifiable [...], the consent process, and the inclusion of genetic information[208]. To conclude, the applicability of regulations must be investigated in detail for the specific registry study of interest and varies with regard to nation- and state-specific laws and regulatory standards.

Ensuring the safety of a drug is a core objective of the pharmaceutical R&D process, but the low number of clinical trial participants relative to the much larger and more diverse populations who end up using the drug after approval poses clear limitations to guaranteeing such safety[209]. On the other hand, registries can include patients who differ considerably from, and suffer from more complex diseases or comorbidities than, the participants studied in clinical trials[210]. Postapproval pharmacovigilance is therefore an important topic for all healthcare stakeholders, yet adverse drug events (ADEs) suffer from high underreporting rates owing to the voluntary nature of reporting such incidents[211]. Analyzing data stored in EHRs can remedy the shortcomings of systems that rely on nonsystematic recognition of ADEs, as shown in various research projects: large registry studies which involved multiple millions of subjects combined healthcare

203 http://www.ncbi.nlm.nih.gov/pubmed/23774516
204 https://www.law.cornell.edu/cfr/text/45/160.103
205 http://www.hhs.gov/hipaa/for-professionals/privacy/index.html
206 Registries for Evaluating Patient Outcomes: A User's Guide Volume 1, p. 19
207 http://www.hhs.gov/ohrp/regulations-and-policy/regulations/45-cfr-46/index.html#46.101
208 Registries for Evaluating Patient Outcomes: A User's Guide Volume 1, pp. 19, 171
209 Registries for Evaluating Patient Outcomes: A User's Guide Volume 2, p. 103
210 Registries for Evaluating Patient Outcomes: A User's Guide Volume 2, p. 106
211 http://www.ncbi.nlm.nih.gov/pubmed/19132802


records from several EU countries and demonstrated the feasibility of detecting safety signals and ADE associations, if databases were sufficient in size[212, 213].

In order to uphold the data holders' control over their protected data and ensure patient privacy, the data from the different databases were aggregated. The subject of analysis in this case was structured data, which is much easier for data-analytics systems to process accurately. However, mining of unstructured data such as free-text clinical documents via natural language processing techniques[214, 215] or simpler methods[216] can also be used to detect ADEs and might reveal events that would go undetected if only structured data were analyzed. Data mining can thus be beneficial for rapid notifications of ADEs ahead of official alerts, as well as for hypothesis generation.

All in all, the combination of large amounts of both structured and unstructured data in EHRs has the potential to support pharmacovigilance research and pharmaceutical decision making. For a more specific example of ADEs detected using data mining methods, take the analysis of comedication for two of the most prescribed drugs worldwide. Data analysis of EMR data revealed a strong signal for glucose homeostasis affected by the combination of the cholesterol-lowering drug pravastatin and the antidepressant paroxetine (Paxil). The study found that the administration of both drugs together had a drastic effect in raising glucose levels[217].

Insights from studies of this kind are already affecting the real world. Findings from the pharmacovigilance pilot project Mini-Sentinel set up by the FDA are part of the pool of information sources the FDA draws on in weighing pharmacovigilance measures. For the blood pressure drug Olmesartan, a warning was added to the drug's label after a link to intestinal problems was discovered[218], and Mini-Sentinel played a role as well in refuting the hypothesized increased risk of bleeding events in the usage of Dabigatran[219]. The high potential of data-mining projects of this kind is evident, but it is important to keep the limitations of Mini-Sentinel in mind, such as its observational design, the usage of claims data and the fact that it is just one of many sources in the assessment of the two drugs mentioned above, next to the FDA Adverse Event Reporting System, published case series and information from the CMS Medicare database.
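
To make the notion of a "safety signal" in this section more concrete, the following minimal Python sketch computes a proportional reporting ratio (PRR), one widely used disproportionality measure in pharmacovigilance. It is an illustration only: the cited registry studies and Mini-Sentinel may rely on different or more elaborate methods, and the function name and all counts below are hypothetical.

```python
import math


def proportional_reporting_ratio(a, b, c, d):
    """Return (PRR, lower, upper) with an approximate 95% confidence interval.

    2x2 table of record counts (hypothetical layout):
        a: drug of interest, event observed
        b: drug of interest, event not observed
        c: all other drugs, event observed
        d: all other drugs, event not observed
    """
    if a <= 0 or c <= 0 or b < 0 or d < 0:
        raise ValueError("a and c must be positive; b and d must be non-negative")
    rate_drug = a / (a + b)    # share of drug records showing the event
    rate_other = c / (c + d)   # share of comparator records showing the event
    prr = rate_drug / rate_other
    # Standard error of ln(PRR), same form as for a relative risk
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - 1.96 * se)
    upper = math.exp(math.log(prr) + 1.96 * se)
    return prr, lower, upper


# Invented counts, for illustration only (not taken from any cited study)
prr, lower, upper = proportional_reporting_ratio(a=40, b=960, c=200, d=18800)
print(f"PRR = {prr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

A PRR whose lower confidence bound lies clearly above 1 is usually read as a signal worth following up, not as proof of harm. This matches the role described above: data mining generates hypotheses that still need confirmation from other sources.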

212 http://www.ncbi.nlm.nih.gov/pubmed/22315152
213 http://www.ncbi.nlm.nih.gov/pubmed/21182150
214 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2732239/
215 http://onlinelibrary.wiley.com/doi/10.1038/clpt.2012.54/full
216 http://onlinelibrary.wiley.com/doi/10.1038/clpt.2013.47/full
217 https://www.nigms.nih.gov/news/meetings/documents/russ_altman_article.pdf
218 http://www.fda.gov/drugs/drugsafety/ucm359477.htm
219 http://www.nejm.org/doi/full/10.1056/NEJMp1302834


2.4 The Possible Futures of Big Data in Healthcare

2.4.1 How Will Big Data-Driven Healthcare Eventually Be Able to Change Evidence-Based Medicine and the Way Studies Are Conducted in the Future?

Science strives for a true understanding of nature and the rules by which it operates. Through the scientific method, the continuous process of making observations, generating hypotheses and testing predictions based on these hypotheses, scientific knowledge can be generated and thus hidden processes uncovered and understood. Fortunately for patients, modern medicine can be included in the list of practices which rest upon a foundation of scientifically tested hypotheses. From having been based on religious beliefs, subjective judgement, and individual experiences to being based on scientific evidence and the formal analysis thereof, medical reasoning and decision making have undergone an impressive transformation throughout the past centuries and decades to arrive at the point where they are today.

The evidence-based medicine (EBM) approach ensures that decisions concerning treatment, diagnosis, prognosis and safety stand on a bedrock of best available external clinical evidence derived from experimental as well as observational studies.[220] Randomized controlled trials (RCTs) have been labeled the gold standard for producing and evaluating evidence relating to the effect of interventions because of their strong internal validity and the potential to control biases. While EBM is indeed the best approach modern medicine has to peel back the layers of medicine's secrets, the best evidence available often does not represent the much desired scientific truth, and RCTs as EBM's favorite methods are far from infallible. Shortcomings owing to the inherent nature of RCT study design as well as suboptimal policy and incentive systems leave much room for improvement. Now, with big data on the rise, EBM methods can be enriched in ways that promise to improve quality and efficiency of care. This section will shine light on ways big data might remedy some of the shortcomings of contemporary EBM.

EBM is equipped with a toolkit of methods and study designs aimed at generating clinical evidence. Depending on the type of clinical question, either experimental or non-experimental approaches are best suited to serve this purpose. The primary questions in EBM can be divided into questions regarding diagnosis, prognosis, harm, and treatment.[221]

The examples in the previous sections have already touched on the applicability and usefulness of big data in the areas of diagnosis, prognosis, and harm. Cognitive systems can maximize the potential of the best diagnostic evidence available by evaluating all peer-reviewed medical evidence ever created, thereby contributing to a more effective practice of EBM. The machine learning algorithm that detected melanoma is a prime example of how big data is at the core of the development of cognitive computing-based diagnostic tests that are both fast and accurate. With regard to prognosis, predictive analytics systems using big data were shown to be superior to simpler methods in some cases. To add another example positioned at the intersection between diagnosis and prognosis, a 2016 study showed that machine learning models performed fairly well in diagnosing and predicting

220 Sackett 1996, http://www.bmj.com/content/312/7023/71
221 Sackett 1996, http://www.bmj.com/content/312/7023/71


risks for acute kidney injury.[222] Concerning harm and safety, large registry-based pharmacovigilance studies proved to be effective and efficient in spotting ADEs. To complete the picture of big data's influence on the primary areas of EBM, it is necessary to take a look at how questions of treatment can be addressed.

The value of the evidence generated in the respective area of interest is largely determined by the underlying design of the study in question. It is commonly recommended that experimental approaches, as described above, be used to assess evidence when it comes to questions of treatment in order to avoid false positive conclusions about efficacy.[223] In addition to RCTs, other study designs are proposed as methods of EBM and are being used to measure treatment success, namely observational studies like cohort studies and case reports. Cohort studies in particular are seen as valuable tools for determining diagnostic test accuracy and prognostic factors or answering questions concerning safety and harm.[224] However, such studies are seen as inferior for producing evidence and are thus located beneath RCTs in the hierarchy of evidence.[225] The Oxford Centre for EBM[226] ranks RCTs at level one, cohort studies at level two, and the German IQWiG[227] institute (which is at the limit for non-randomized studies producing reliable qualitative results) at low. The relatively low significance of observational studies in clinical research must be questioned in light of the massive amounts of medical data and data mining tools becoming available.

To begin, a 2005 study reveals many shortcomings of modern EBM-based research. It found that a considerable amount of published study results may be wrong for reasons of overconfidence in studies that rely on statistical p-values of 0.05, publication bias, lack of replication studies, a suboptimal environment of economic incentives that encourages quantity over quality, and many more.[228] The weaknesses of RCTs can be added to this list. A causal link between the treatment in question and an observed outcome can be determined with a high degree of certainty if randomization mechanisms and rigorous inclusion/exclusion criteria in RCTs are applied. However, the high internal validity of RCTs stands in stark contrast to their low external validity, or the inability to transfer results from the outcomes in study participants to the population at large. One cannot assume that clinical efficacy proven in RCTs is generalizable to subpopulations not studied in these trials.[229] Thus, a different approach is needed to permit claims about efficacy in the real world to be made. Moreover, RCTs are costly and time-intensive in nature, which (1) renders it infeasible to conduct research on all topics of genuine significance to medicine (all combinations of treatments/individuals) and (2) delays the point at which innovations are actually implemented in day-to-day medical practice.[230] Comparative effectiveness research might also be problematic if conducted in an RCT setting, if for instance the effectiveness of a drug should be compared to that of a device.[231] Also, potential ethical concerns associated with the withholding of possibly valuable medication from control groups can hardly be mitigated. Only through observational studies can the effects of smoking on lung cancer be investigated while at the same time respecting modern ethical standards.[232]

With these aspects in mind, big data can shape the workings of the evidence-creating mechanisms of EBM in the following ways: (1) improving RCTs through

222 Kate 2016, http://www.ncbi.nlm.nih.gov/pubmed/27025458
223 Sackett 1996, http://www.bmj.com/content/312/7023/71
224 http://guides.dml.georgetown.edu/ebm/ebmclinicalquestions
225 Kovesdy 2012, http://www.ncbi.nlm.nih.gov/pubmed/22364796
226 http://www.cebm.net/oxford-centre-evidence-based-medicine-levels-evidence-march-2009/
227 p.86, https://www.iqwig.de/download/IQWiG_Methoden_Version_4-2.pdf
228 Ioannidis 2005, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/
229 Angus 2015, http://jama.jamanetwork.com/article.aspx?articleid=2429723
230 Bothwell 2016, http://www.nejm.org/doi/full/10.1056/NEJMms1604593
231 http://clinicaldevice.typepad.com/cdg_whitepapers/2011/07/registry-studies-why-and-how.html
232 Bothwell 2016, http://www.nejm.org/doi/full/10.1056/NEJMms1604593


digital means, (2) guiding RCTs by delivering promising hypotheses or (3) complementing RCTs with large-scale observational studies. These aspects might bridge the effectiveness/efficacy gap and help in the quest for controlling costs and stopping the decline in ROI.

The clinical trial process presents many touchpoints where digital solutions can be introduced to improve the way candidates are managed and thereby increase overall efficiency and effectiveness[233].

First, the recruitment process can be accelerated and more suitable candidates can be selected. Based on factors like age, disease severity, genetic composition and several other novel criteria, the right people can be chosen. Parting ways with the manual process of matching patients to clinical trials, IBM Watson for Clinical Trial Matching automatically provides a list of suitable candidates while maintaining transparency and disclosing the criteria used in the evaluation process.[234] Other studies show that structured data in EHRs is an important source for determining a patient's eligibility for clinical trial enrollment.[235] In later stages of the development process, filtering out patients with non-responsive genes can prove valuable[236], and insights derived from data analysis might cause entry criteria to be changed in an adaptive trial in order to maximize the suitability of patients responding to the therapy being examined.[237] Additionally, reaching potential candidates via communication channels like social media contributes to a reduction in time and presents a new way of marketing.[238]

Second, real-time monitoring and the employment of connected devices allow for quick reactions to issues arising during trials. For instance, drug-safety signals can be processed and reacted to much faster.[239] With wearable technology becoming more affordable and delivering more accurate data, there has been a movement towards integrating connected devices into the traditional clinical trial setting. This allows researchers to design innovative study protocols that do not exclusively rely on self-reported data, which could be corrupted by human biases and other measurement errors.[240] In 2015, wearables were being employed in at least 299 clinical trials.[241] The roll-out of the Apple ResearchKit, for instance, has empowered researchers and developers to build many health monitoring applications for the iPhone and the Apple Watch. SleepHealth, powered by IBM Watson Health Cloud, monitors sleep quality and its effects on health, productivity and alertness.[242] EpiWatch, built by Johns Hopkins researchers, is an app for Apple Watch that helps epilepsy patients manage their seizures and medications as well as corresponding triggers and side effects.[243] Users are given the option of sharing their anonymized data with researchers.

One could argue that one thousand chronic patients with smartphones are the cheapest real-world medical research source, making such projects immensely interesting.[244] However, the validity and reliability of the data collected is still an important issue when it comes

233 Cattell 2013, http://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-big-data-can-revolutionize-pharmaceutical-r-and-d
234 http://www.ibm.com/watson/clinical-trial-matching.html
235 Ateya 2015, http://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-016-0239-x
236 (Malik 2010, 882)
237 Angus 2015, http://jama.jamanetwork.com/article.aspx?articleid=2429723
238 Cattell 2013, http://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-big-data-can-revolutionize-pharmaceutical-r-and-d
239 Cattell 2013, http://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-big-data-can-revolutionize-pharmaceutical-r-and-d
240 https://investor.fitbit.com/press/press-releases/press-release-details/2016/Fitbit-and-Fitabase-Innovate-Health-Research-Practices-to-Enable-Real-Time-Continuous-Measurement-Better-Participant-Engagement-and-Innovative-Study-Design/default.aspx
241 http://www.bloomberg.com/news/articles/2015-09-14/big-pharma-hands-out-fitbits-to-collect-better-personal-data
242 http://www.wareable.com/saves-the-day/what-is-apple-researchkit-iphone-watch-everything-you-need-to-know-931
243 http://www.hopkinsmedicine.org/epiwatch
244 http://www.euroforum.de/healthcare/review-2014/


to wearable technology. Biases and human errors might be mitigated, but sleep monitoring devices that mistake non-trivial awakenings for asthma-induced awakenings can be problematic, to name one example.[245] In addition, Fitbit proclaims that their wearable may be able to improve compliance rates among research participants, referring to a small-scale study which showed that the device was worn on nearly all intervention days.[246]

Lastly, suboptimal adherence in outpatient clinical trials could be addressed with promising technology that promotes medication adherence.

It is important to note that RCTs have never monopolized knowledge production in the sense that they have pushed observational studies completely out of the picture[247]. In contrast to RCTs, non-experimental approaches like observational studies present strong external validity and can succeed in proving clinical effectiveness under conditions outside the confines of controlled trials. However, they generally fail to generate evidence capable of warranting a causal link between studied variables, owing to their non-randomized nature. Despite these obvious drawbacks, observational studies are generally seen as complementary to RCTs. The official NICE guide proclaims that data from non-randomised studies may be required to supplement RCT data.[248] With data accumulating and controlled registry studies with hundreds of millions of patients becoming feasible, different positions exist on how the relationship between RCTs and observational studies might change. The following paragraphs examine two possible roles for big data studies to change research performed under the principles of EBM: serving as a guide to steer RCTs into promising directions, or serving as standalone methods of investigating and proving what actually works in the real world.

2.4.2 Guiding RCTs: Generating Promising Hypotheses / Quickly Testing Hypotheses

Apart from the shared goal of generating evidence, the traditional EBM approach and big data approaches differ greatly in the mechanisms by which they produce this evidence: in EBM, a hypothesis lays the groundwork for an RCT in which data is acquired to prove or disprove the hypothesis, the result being internally valid evidence for a causal link present under test conditions.

In contrast, a purely data-driven approach values raw observations over a priori hypothesized relations. The strength of data-mining and deep learning algorithms lies in the recognition of patterns, patterns that might go undetected by human eyes. It follows the principle that more data are better than better data. The result is externally valid, precise evidence for correlations present under real-world conditions. Whether the result is true or affected by bias cannot be concluded.[249]

A practical example puts the two approaches into context: In the case of a young systemic lupus erythematosus (SLE) patient with nephrotic-range proteinuria and pancreatitis, clinicians based their decision to give anticoagulation medicines on the results of an institution-wide search in the EMR database which showed a correlation between the complications and an increased risk for thrombosis. With a lack of RCTs in the area of pediatrics to base their decision on, EMR data was used to guide real-time clinical decisions.[250] Thus, a hypothesis derived from experience was quickly tested and indeed a correlation was found. Statistical methods like resampling can help validate a relation in such a case and check that it is not a chance finding.

This is not to say that a causal relation between (1) nephrotic-range proteinuria and pancreatitis and (2) an increased risk for thrombosis in SLE patients

245 http://www.appliedclinicaltrialsonline.com/wearables-clinical-trials-active-interest?pageID=2
246 https://investor.fitbit.com/press/press-releases/press-release-details/2016/Fitbit-and-Fitabase-Innovate-Health-Research-Practices-to-Enable-Real-Time-Continuous-Measurement-Better-Participant-Engagement-and-Innovative-Study-Design/default.aspx
247 Bothwell 2016, http://www.nejm.org/doi/full/10.1056/NEJMms1604593
248 https://www.nice.org.uk/process/pmg9/chapter/the-reference-case
249 Sim 2016, http://www.ncbi.nlm.nih.gov/pubmed/26809201
250 Frankovich 2011, http://www.nejm.org/doi/full/10.1056/NEJMp1108726


was proven, but it suggests the role that big data can take: rapidly assessing the (external) validity of a hypothesis in real life scenarios, in case a first assumption exists, or simply discovering correlations that could make powerful new hypotheses, in case no prior assumptions exist. Applying such filters in the process of hypotheses-choosing can allow RCTs to focus on the most promising hypotheses and thereby increase the probability of success. In case of the SLE patient, the hypothesis based on the existing correlation between (1) and (2) could later be subjected to rigorous testing under controlled conditions in a clinical trial.

Pfizer is taking a similar approach with its Precision Medicine Analytics Ecosystem by first scanning for patterns in data from hundreds of millions of EMRs, guiding the researchers to a new hypothesis. A clinical trial database as well as a genomic database are then consulted to design a more focused clinical trial. The open source data management system tranSMART is used to connect the three types of data. Following this regimen in search of a lung cancer drug specifically targeted at patients with an ALK mutation, Pfizer successfully developed the drug Xalkori in 2011. In the words of Pfizer CIO Jeff Keisling: "Had this compound been tested against a broad spectrum of lung cancer patients, it likely would not have been found to be effective. With this analytics-based approach, it was found to be very effective, but we had to be able to identify a subset of cancer patients with a specific gene mutation who previously did not have this treatment option."251 Big data can also assume this guiding role outside of the treatment-related area and involve a great variety of sources. For instance, aggregated search data from large numbers of Internet users can deliver insight into previously unknown ADEs. If queries for a given drug correlate with searches for information about certain symptoms, a potential ADE has been found. Clinical trials could then validate the correlations found using this low-cost method.252

2.4.3 Complementing RCTs

The role of big data in EBM as a complement to RCTs has been proposed by many. They either proclaim that all should be considered as pieces of the same puzzle, with all of them being able to provide useful information and increase our knowledge about a subject, but without any of them being infallible253, that big data's strength lies in its potential to see what RCTs won't, thereby improving care in ways RCTs can't254 or that evidence-based medicine needs the computational power of big data, and big data need the epistemological rigor of EBM255.

Strategy& proposes a tangible example in which RCTs and big data are fused together. They argue that a new kind of R&D model with real-world evidence and RCTs working hand in hand could shorten launch cycles by approximately five years and reduce R&D investments per product by approximately 60 percent. The model focuses on mitigating late-stage product failure by collecting real-world data in earlier stages. As real-world evidence (RWE) provides a means to prove clinical effectiveness and the major reason for late-stage attrition is a lack thereof, potential failures can be filtered out early on. In this scenario, the accumulation of RWE begins after the proof of concept in phase IIa and is accompanied by RCTs before and after. The model is based on the critical assumption of a sufficiently proven safety profile so that evaluations at such early stages in real-world environments are possible.256

Any scenario that incorporates big data as the main tool to demonstrate treatment effectiveness must inevitably deal with the disparity between correlation and causation. Some argue that causation might indeed be found without clinical trials through the careful selection of observational research designs and the deployment of statistical methods to minimize confounding effects. A comparative effectiveness study that investigated the outcomes in 80,000 patients (e.g. mortality) of

251 http://www.informationweek.com/strategic-cio/executive-insights-and-innovation/pfizer-connects-dots-to-deliver-better-treatments/d/d-id/1141527
252 Yom-Tov 2013, http://www.jmir.org/2013/6/e124/
253 Kovesdy 2012, http://www.ncbi.nlm.nih.gov/pubmed/22364796
254 Frakt 2016, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4758464/
255 Sim 2016, http://www.ncbi.nlm.nih.gov/pubmed/26809201
256 Rnicke 2015, http://www.strategyand.pwc.com/reports/revitalizing-pharmaceutical-rd


medications for type 2 diabetes increased the likelihood that the found associations were in fact causal through the use of falsification tests. Other statistical methods that could be applied to minimize the effect of confounders are random effects models, bootstrap estimation of coefficients, or an instrumental variables approach. However, finding causation in these study designs is only possible in certain cases and one should not view big data studies as a general purpose tool for inferring causation. Nonetheless, it is an important area to investigate since techniques that could successfully separate the signal from the noise would be invaluable.

2.4.4 What Kind of Impact will Ubiquitous Computing and Wearables (Quantified Self) Have on the Way Medical Data are Collected in the Future?

When the smartphone revolution started in 2007 (and even slightly beforehand with powerful MP3 players), something interesting happened to the electronics market: suddenly, there was a market for mass production of sensors. Not only such sensors as are already found in modern smartphones, like accelerometers, compasses, magnetic sensors or air pressure sensors, but also add-ons made for a specific purpose which use a smartphone as their processing platform. So, instead of creating a costly sensor product that has its own CPU, connectivity and network platform, a manufacturer can simply add Bluetooth, create an app and be done with it.

The first large manufacturer to use this was Nike with their Nike+ sensor and app that allows users to track their running efforts. There was a small sensor pod that would be attached to an iPod or an iPhone and put into a small compartment in a Nike running shoe. Then, when you finished your morning jog, you could download your movement data and see exactly how many miles you ran and how many calories you burned. Naturally, in later years these functionalities have moved into the smartphones themselves, so that many of them now offer native step tracking, with some adding external options like weight tracking or heart rate monitoring.

Fitness tracking has become a mature market, with over a quarter of internet users wearing a device during their sports routine or even during regular walks257. There are, however, other areas where personal monitoring has become more common. There are internet-enabled personal scales which have been for sale in Apple stores for a long time. There are myriads of different sleep tracking devices which promise better sleep and better wake-up timing. There are insulin and cortisol trackers, wearable ECG monitors to track your body fitness, portable breathalyzers to measure alcohol intake and processing, and smartphone-connected blood pressure monitors.

Furthermore, services like 23andMe or uBiome allow you to analyze your genome and that of your microbiome, and other such external services exist. And then of course there are database apps that allow you to enter measured data yourself, complementing the sensory data with varied other data points about calorie intake, sleeping times, activity, arousal and mood levels, and general mental states.

Given the speed of technological change and the frequency with which consumer devices arrive on the market, it is safe to assume that more such devices and services will appear in the future.

Data protection already plays an important role in the public discussion about these quantified-self data. On the one hand, insurance companies would love to incorporate fitness levels into their health insurance and life insurance policies, or even make them conditional on certain fitness goals. On the other hand, customers have to be protected from unwillingly or unwittingly giving out these data, most of which are stored on American companies' servers.

All of these data are of course medically relevant, both in a clinical setting and in a research setting. Imagine a doctor getting access to the fitness data of a patient and being able to examine their general condition very quickly. Or a heart patient who can prove a condition they have simply by bringing long-term ECG data with them. Or a patient with a previously unknown allergy that could be detected from their cortisol levels. Or even as a diagnostic tool, where

257 https://www.heise.de/newsticker/meldung/Umfrage-Jeder-Vierte-nutzt-Gesundheits-Apps-und-Fitness-Armband-3339518.html


in the future conditions might be detected simply because they are visible in the data that is present.

At the same time, scientists doing a study on a particular condition might look for patients with a specific marker or a specific pattern of data, simply to find suitable and relevant participants. Tying in these previous measurements with in-study measurements could complement the picture we have of a participant's development far better than was possible just a few years ago.

And what about giving study participants the necessary sensors to take part in a study? Instead of taking cortisol measurements at the beginning and ending of a fitness study, why not have one each day, or each hour even? With the availability of low-cost, easy-to-use sensory equipment, the cost can be decreased while at the same time increasing the robustness of measurements, leading to much more robust outcomes.

In the last paragraphs, we have only talked about singular data lines. However, many people who practice quantified self do not simply measure one aspect of their lives, they collect data on many different aspects. Correlating events in their life with data series on calorie intake or mood levels allows them to form a very accurate picture of themselves.

The same could be true for medical analysis. Instead of just collecting data that is relevant to a study at hand, why not collect many more different data series, even though they might not be immediately usable for a specific study? Since many devices offer multiple tracking options, from a cost perspective, nothing changes. For the participant and the examiner, nothing changes. Yet the collected data might become relevant when cross-examined against other participants' data or even other study data, allowing post-hoc analysis of previously unknown aspects of a patient's status. As we have seen in other chapters of this paper, more data can be used to find out more things, so it seems prudent to collect more, rather than less. In fact, given the trajectory of adoption of personal tracking devices, many more people will do so anyway.

Personal data tracking and the quantified-self movement have led to a multitude of available data about individuals. This is aided by the wide availability of consumer devices that allow tracking of various data, including movement and body activity, as well as chemical and genetic data. More people will use those devices in the future, and more such devices will be available, possibly even converged into regular smartphones. It would be wasteful to not use these data in medical and scientific contexts. At the same time, the availability of simple-to-use measuring devices will make scientific data collection easier, less costly, and give it much wider coverage than was possible before.
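
As a minimal sketch of the idea of correlating self-tracked data series described in this section, the following Python example computes the Pearson correlation between two series. The diary values and the names `pearson`, `sleep_hours` and `mood_score` are invented for illustration and are not taken from any study. A high coefficient only indicates an association, not a causal relation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical one-week diary: hours slept and self-rated mood (1-10).
sleep_hours = [6.0, 7.5, 5.0, 8.0, 6.5, 7.0, 4.5]
mood_score = [5, 7, 4, 8, 6, 7, 3]

r = pearson(sleep_hours, mood_score)
print(f"correlation between sleep and mood: {r:.2f}")
```

The same function could be applied to any pair of tracked series, for example step counts and resting heart rate, provided both are sampled at the same points in time.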

2.5 The Internet of Healthy Things (IOHT)

"It is very possible that ... one machine would suffice to solve all the problems that are demanded of it from the whole country." - Sir Charles Galton Darwin, 1946

Computing technology has made phenomenal advances in its seventy years of existence258. From the humble beginnings of the mainframe era, where each computer was weighed in tons, through the mini- and microcomputer and the personal computer up to the world of smartphones and smart devices we live in today, the number of computers in the world has grown exponentially. While in 1996 there were an estimated 275 million personal computers in the world259, only 20 years later more than 400 million new smartphones are shipped each quarter260. At the same time, computers have gotten smaller, more energy efficient and more powerful. This, of course, contributed to their proliferation. It is thus not hard to extend that trend into the future to predict that soon, there will be enough computers in the world that we can put one into each device that we own. This is what is known as "The Internet of Things" (IoT), and while it is not clearly defined what exactly will be part of it, we can expect that such smart, connected devices will make our lives easier and more comfortable, allow better customization of every aspect of private and work life and make many tasks more efficient, accurate and economical. They will also allow insights into many aspects of life that are hidden from analysis, simply because they cannot be accurately measured right now.

Imagine that there will be computers in everything, from hairdryers to light bulbs, from books to toys, from clothes to accessories, from furniture to kitchen devices, from cars to bikes to shoes. Not only are there a myriad of applications, there is also a wealth of data that can be collected. How long did you use your electronic devices? How long did you read and play, and what did you do? What clothes did you wear, and where did you go with them?

258 We use the date of dedication of the first electronic computer, ENIAC, on February 15th, 1946, as a reference. Note that single-chip silicon CPUs
appeared on the market only in 1971.
259 http://stats.areppim.com/stats/stats_pcxfcst.htm
260 https://en.wikipedia.org/wiki/Mobile_operating_system#World-Wide_Share_or_Shipments


What, then, is the Internet of Healthy Things? It is the application of the concepts in IoT to health applications, with the basic question: what devices could be made more efficient, accurate and safe by adding connection and computing to them? "Devices" in this context does not simply mean electrical devices, but all things we interact with in regards to health questions.

This is not science fiction, but has already started. Some manufacturers build smart pill boxes that dispense medication only on specific dates and monitor whether the medication was taken out of the box. Fitness tracking has been added to all recent smartphones, so each user can track how far they walked or cycled on a given day261. Blood sugar testing devices have been fitted with Bluetooth connections for easier display and tracking of blood sugar levels on smartphones.
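
To illustrate how such a connected pill box might work in principle, the following Python sketch models a box that releases medication only on scheduled dates and records whether it was taken out. The class `SmartPillBox`, its methods and the dates are hypothetical and do not describe any manufacturer's product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SmartPillBox:
    """Hypothetical model of a connected pill box (not a real product API)."""
    schedule: set                               # dates on which a dose may be dispensed
    taken: set = field(default_factory=set)     # dates on which a dose was removed

    def dispense(self, today: date) -> bool:
        """Release a dose only on a scheduled date, and only once per date."""
        if today not in self.schedule or today in self.taken:
            return False
        self.taken.add(today)
        return True

    def missed_doses(self, today: date) -> list:
        """Scheduled dates before today on which no dose was taken out."""
        return sorted(d for d in self.schedule if d < today and d not in self.taken)


box = SmartPillBox(schedule={date(2017, 1, 2), date(2017, 1, 4), date(2017, 1, 6)})
print(box.dispense(date(2017, 1, 2)))      # a scheduled day
print(box.dispense(date(2017, 1, 3)))      # not a scheduled day
print(box.missed_doses(date(2017, 1, 7)))  # doses that were never taken out
```

In a real device the list of missed doses would presumably be transmitted to a companion app or a care provider. That step is left out here because it depends entirely on the product.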

From the graph above, it is clear that this trend will continue. But what applications will we see?

261 https://www.heise.de/newsticker/meldung/Umfrage-Jeder-Vierte-nutzt-Gesundheits-Apps-und-Fitness-Armband-3339518.html

2.6 Augmented Reality: An Extraordinary Evolution of Technology Tools with Limitless Applications

2.6.1 Introduction

Augmented reality (AR) has become a hot topic over the last few years. In this review we look at the principle and uses of AR in the medical world, with some examples of companies and research groups using and investigating AR today.

In the ever-increasing world of technology there has been a push to integrate what is seen virtually with our reality - that is, virtually rendered262 images placed within our everyday view of our environment. This is the principle of augmented reality (AR). Unlike virtual reality (VR), which is a totally immersive artificial environment, AR allows the user to see his/her environment and focuses on supplementing it with images and information.

Of late, since around 2015, with the introduction of Microsoft's HoloLens, there has been a vast amount of interest in, and many new applications of, augmented reality. Although the buzz has come from Microsoft's media approach, there are companies that started before Microsoft did. One company of note is Meta. In 2013 they released their Meta1 AR glasses and thus were one of the first to have a fully functional development AR glass system. Although when hearing the term "augmented reality" one thinks of HoloLens and/or Meta and thus AR glasses, the truth is that AR has been around for many years and does not only include projection on glasses. In reality, as has been stated, AR is the adding of rendered images into our environment with the use of any display or projecting device. This means that AR can be, and is being, done with computers, digital projectors, mobile devices such as tablets and smartphones, and yes, AR glasses.

As has already been said, AR is not a new idea; it is only now that technology has caught up with the concept. The first mention of AR was in 1901 by L. Frank Baum, who proposed the idea of wearable displays or spectacles that overlaid information onto real life. It was only in 1992 that Steven Feiner et al.263 released the first major publication on AR, known at that time as KARMA (Knowledge-based Augmented Reality for Maintenance Assistance). Since then, the uses of AR have multiplied into the many applications we have today. The purpose of this white paper is to provide an outline of the current and future direction AR will take in the medical world and its future market.

2.6.2 Market and Pharmaceutical Marketing

Market research released by LexInnova264 in 2015 predicts that the virtual/augmented reality market will rise from $93.21 billion in 2013 to $279.27 billion in 2018 and further to $470.86 billion by 2020. This large increase in the market is due to hardware and software companies entering the field of 3D and 4D technology. These companies include Microsoft, Magic Leap, Facebook, Meta and Unity3D, to name a few. With respect to the medical field, there is a desire to strengthen and introduce new fields in medical education, surgical simulators, telepresence surgery, multi-image and complex data visualization, rehabilitation, and surgical navigation. As of 2015

262 To render is the process of creating an image from either a 2D or 3D model


263 Steven Feiner, Blair MacIntyre, Dore Seligmann, Knowledge-based Augmented reality for Maintenance Assistance (KARMA) Columbia University
Computer Graphics and User Interfaces Lab, http://monet.cs.columbia.edu/projects/karma/karma.html
264 Virtual Reality: Patent Landscape Analysis, LexInnova 2015, http://www.lex-innova.com/resources-reports/?id=39.


there were 630 and 369 filed patents for the use of VR/AR for medical application in the categories of Medical Devices and Identification, respectively.

Pharmaceutical companies have also taken to AR applications on tablet devices to enhance their marketing strategies. Take Merck's Spectroquant Prove 600 Augmented Reality App265, for example. With the use of a tablet device, the sales person can show a 3D representation of the product popping out of a 2D brochure. The customer gets a more in-depth view of the product and the apparent feeling of being able to physically interact with the product. This type of marketing application will become the standard for different pharmaceutical companies. By developing an app that reads a QR code on a brochure and displays a 3D model of the product, the sales process is enhanced on a mental and physical level for the customer.

2.6.3 Medical Education with Augmented Reality

As the demand for AR and VR techniques has increased, so has the number of software apps used to build these desired techniques. These apps have enabled different groups in research and industry to change the way in which medicine, anatomy, and even surgery can be taught. Some companies leading the way are Medical Realities, which uses Google Glass to record and live stream surgical procedures to students in 32 different countries; ARnatomy, which uses QR codes as markers to flash medical names of different anatomical parts; and Vipaar, a group that allows a student surgeon to wear Google Glass and see the hands of an experienced surgeon projected onto the patient or medical image. Since the prototype release of Microsoft HoloLens, Case Western Reserve University, Cleveland, started working on an AR means of visualising the human body and different anatomical structures. Furthermore, it allows the wearer of the HoloLens to walk around the projected hologram and interact with it while learning.

Medical education is also being brought to the training of surgeons. In the past, a student would train on silicone-based phantoms without anatomical detail. This becomes more difficult in today's world as the surgical approaches are less invasive due to the higher demand for endoscopic procedures. This led to the training of surgeons using haptic AR. In such an approach the trainee holds a device that provides the same haptic feedback as he/she would have while undertaking a surgical procedure. At the same time, the trainee is seeing an image on a computer monitor of an artificially rendered body part acting as the area being operated on. In this way a silicone phantom can give the visual and physical impression of operating on a human and thus enhance the training of the student. Examples of such devices and companies include Virtual Botox by Allergan, the hapTEL virtual system by King's College London, and the Tempo surgical simulator by Voxel-Man.

2.6.4 Augmented Reality within the Hospital/Private Practice

As wearable AR glasses decrease in size and their range of possible applications increases, their use in hospitals and private practices will become an everyday event. Today, there are a few groups investigating the use of these applications. One company to note is Streye. Based in both the United States of America and Spain, the company offers a solution to view patient data on the go. By connecting to databases holding patient information, the doctor can call up data anywhere within the hospital. Furthermore, this system offers the ability to live stream video information from medical devices such as endoscopes directly to the AR glasses the doctor is wearing at the time. Further applications have been investigated and are being developed. One of these is the ability, with the aid of patient data and simulated images, to explain the intended procedure to the patient. This way the patient will have a better understanding of how the anatomical structure should look, the patient's problem, and the medical/surgical approach that will be taken to correct it.

265 Merck, Spectroquant Prove 600 Augmented Reality App, http://www.merckmillipore.com/DE/de/support/mobile-apps/spectroquant-prove-600-augmented-reality/f92b.qB.T6YAAAFT7OUR91.D,nav


2.6.5 Augmented Reality within the Surgical Theatre

Augmented reality has already been playing a vital role within our surgical room, unbeknownst to users. In previous years, before technology caught up with the application, AR would have been undertaken by using a digital projector to project a medical image onto the surgical field. Today, with the use of image guided techniques and computer monitors, surgeons are able to render surgical targets on top of the patient's previously acquired images for better surgical navigation. The work of Meola et al.266 (2016) shows that in the field of neurosurgery AR is a reliable and versatile tool when performing minimally invasive approaches in a wide range of neurosurgical diseases. That is, the doctor has a better understanding of where the trauma or tumor is located.

Today, technology has aided in using the true power of augmented reality. Some companies have started developing especially for this field. One company, Mbits, uses an iPad to video the surgical site while the application renders and models the internal organs. By doing so the surgeon gains more information regarding the anatomical structures, the relation between them and the surgical instrument, and the intended surgical target. Augmedics is another company using AR for surgical localization. Unlike Mbits, Augmedics uses AR glasses which give the surgeon the ability to overlay segmented anatomical structures over the surgical site. This overlay allows the doctor to see what lies under the skin and to navigate to the surgical target more accurately and less invasively. Another reason for AR's favorability in surgical interventions is the ability to show and analyze complex data within the surgical room, and to overlay this on the patient while he/she is on the surgical table. Other applications that are being developed include interaction with computer software via hand gestures and the Microsoft Kinect infrared camera267.

2.6.6 Future Uses of AR

As can be seen, although AR was a conceptual idea almost 115 years ago, it is only now, as technology has caught up, that it is pushing the medical technology field forward. The available applications, although limited, are well spread out, and include fields of medical marketing, medical education, day-to-day clinical routine, surgical training and surgical intervention. It goes without saying that because of the power of AR, as the technology matures, so will the number of AR applications and companies increase. One major technology that is in the focus for future application is smart glass or AR glass. This technology will allow a hands-free solution for visualizing and interacting with rendered objects supplemented within our everyday environment. This technology is being focused on by many companies including Microsoft, Meta, Epson, Sony, Samsung, Google, Magic Leap, Atheer Labs and more. Smart glass devices will be the commonly used hardware in a clinical environment.
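
The overlay principle described in section 2.6.5, in which a segmented structure is rendered on top of an acquired image, can be illustrated with a simple alpha blend. This is only a sketch under simplifying assumptions: the image is a small grayscale grid, the mask is already registered to it, and the function `overlay_mask` and its parameters are hypothetical rather than part of any product mentioned above.

```python
def overlay_mask(image, mask, color=200, alpha=0.5):
    """Blend a highlight value into a grayscale image wherever mask is 1.

    image: rows of gray values (0-255); mask: rows of 0/1 with the same shape.
    alpha: weight of the highlight (0.0 = invisible, 1.0 = fully opaque).
    """
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return [
        [round((1 - alpha) * px + alpha * color) if m else px
         for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# Hypothetical 2x2 image and a mask marking two "segmented" pixels.
image = [[0, 100], [200, 50]]
mask = [[1, 0], [0, 1]]
print(overlay_mask(image, mask))
```

Real systems additionally have to register the rendered structure to the patient in three dimensions and update it as the viewpoint moves. The blend above only shows the final compositing step.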

266 Meola A, Cutolo F, Carbone M, Cagnazzo F, Ferrari M, Ferrari V. Augmented reality in neurosurgery: a systematic review. Neurosurgical
review. 2016:1-12.
267 Jacob MG, Wachs JP, Packer RA. Hand-gesture-based sterile interface for the operating room using contextual cues for the navigation of
radiological images. Journal of the American Medical Informatics Association: JAMIA. 2013;20(e1):e183-e186. doi:10.1136/amiajnl-2012-001212.

2.7 Competitor Analysis

2.7.1 What are Other Companies Doing and How Successful are They?

"[Google, IBM, SAP etc.] have a lot of capabilities around [...] analyzing big data [...]. But what they miss is the medical knowledge, the understanding of biology. They can't ask the right questions. They can program but they don't know what to program."
- Severin Schwan, CEO Roche Group268

Given the unequal distribution in the market when it comes to medical and data analytics knowledge, the growing number of Joint Ventures (JV) between the pharmaceutical industry and technology companies seems unsurprising. Google and Qualcomm have become partners of J&J, GSK, Sanofi and Novartis in efforts to create virtual coaches for post-surgery patients269, develop comprehensive IT solutions to empower diabetes type 2 patients270, and create smart devices like internet-connected inhalers for COPD patients271. They all seek to compensate for missing digital capabilities and expertise in data collection and analysis. These capabilities are of great significance for big data projects as well. Looking at the past years, there is no shortage of pharmaceutical companies collaborating with companies outside and inside the Life Science industry to build up analytics capabilities, establish access to new data sources or conduct big data projects. This section gives a description of exemplary projects of several pharmaceutical companies with big data being the key ingredient. The analysis is based on openly disclosed information. Although it is not possible to conclude from this information how highly big data ranks in each individual company's business, it indicates that big data is an important topic for many top companies in the industry, with large investments flowing and success stories emerging.

2.7.2 Roche

"We will accelerate R&D, new drugs will be introduced to market earlier and cancer patients will live much longer."
- Daniel O'Day, CEO Roche Pharmaceuticals272

Roche has formed several strategic alliances in order to access more patient data than was internally available. The leader in oncology research has invested $1 bn to acquire a majority stake in Foundation Medicine, a molecular information company specializing in the sequencing of cancer tissue273. A deep understanding of the molecular structure of specific cancers can help Roche identify novel drug targets, match patients with suitable clinical trials and enable precision medicine. A similar project feeding Roche's pool of data with more data points is the collaboration with Flatiron. Roche plans to use the real-world patient data on cancer treatments that Flatiron collects to accelerate clinical trials and advance personalized medicine. As CEO of Roche Pharmaceuticals, Daniel O'Day stated that the project aims at understanding how drugs react in patients in the real world and can

268 http://asia.nikkei.com/Business/Executive-Lounge/Here-s-where-drug-makers-are-heading
269 https://www-03.ibm.com/press/us/en/pressrelease/46582.wss
270 http://www.heise.de/newsticker/meldung/Google-Unternehmen-und-Sanofi-wollen-Diabetikern-helfen-3318656.html
271 https://www.novartis.com/news/media-releases/novartis-pharmaceuticals-collaborates-qualcomm-digital-innovation-breezhalertm
272 Translated from German quote, available at http://tablet.fuw.ch/article/roche-ist-vorreiter-bei-big-data/
273 http://investors.foundationmedicine.com/releasedetail.cfm?releaseid=905240


be classified as a long-term strategic investment274.

In order to build capabilities to analyze big data, the acquisition of Bina played an important role in the past year275. Bina's Genomic Management Solution presented a valuable tool for Roche in managing and processing next generation sequencing data.

In 2015, Roche's subsidiary Genentech initiated a big data project in collaboration with 23andMe276. Data from 12,000 Parkinson's patients is being used for a study to find genetic links to the disease to assist in drug development.

Roche considers Big Data as a huge opportunity, is currently working on new projects related to Big Data and is in the process of launching pilot projects, according to Isabelle Vitali, Head of Innovation and Alliances Development.277 Roche's growing interest in strategic alliances with technology companies, of which "[w]e will probably see more [...] in the future"278 as stated by Roche's CEO, clearly reflects this view. In general, Roche is pursuing big data-related work in four different areas:

1. Social analysis, analyzing data from online patient platforms
2. Data mining, analyzing unstructured data using statistical methods (e.g. for predictive health purposes)
3. Data warehouses
4. Processing data from connected objects

2.7.3 Pfizer

The previous section (see section 2.2.1 Future of EBM) already included two cases of Pfizer's efforts in the area of big data: Pfizer's Precision Medicine Analytics Ecosystem, which serves as a prime example for quick hypothesis testing and the integration and analysis of diverse and large amounts of data, and the large-scale 23andMe depression study. In addition, Pfizer has initiated two more projects based on customer data of 23andMe users: a study using the data of 5,000 lupus patients279 and another study involving 10,000 23andMe users with inflammatory bowel disease280. Their goal is to find new associations between genetic markers.

2.7.4 Novartis

Just like Roche, Novartis has set foot into the world of tech-collaborations under the auspices of big data. Foundation Medicine has been a partner of Novartis since 2011, providing valuable molecular information and genomic profiling analytics which have found their way into Novartis' oncology clinical trials281. The result of a joint project with MIT and Harvard, also initiated with the goal of growing Novartis' access to genetic data in mind, the Cancer Cell Line Encyclopedia comprises detailed genetic characterization of 1,000 human cancer cell lines282. In order to better handle data integration and analysis for both preclinical and clinical research, a collaboration with Covance was initiated in 2014 to develop a clinical data warehouse283. For an improved integration and analysis of diverse next-generation-sequencing data from external organizations, Novartis uses a solution

274 http://bits.blogs.nytimes.com/2016/01/06/roche-leads-a-175-million-investment-in-flatiron-health/?_r=0
275 http://sequencing.roche.com/news---media/press-releases/roche-acquires-bina-technologies-and-enters-the-genomics-informa.html
276 http://www.forbes.com/sites/matthewherper/2015/01/06/surprise-with-60-million-genentech-deal-23andme-has-a-business-plan/#323700af7927
277 http://healthcaredatainstitute.com/2016/07/19/at-roche-big-data-is-about-revealing-the-invisible/
278 http://asia.nikkei.com/Business/Executive-Lounge/Here-s-where-drug-makers-are-heading
279 http://venturebeat.com/2015/01/14/23andme-has-signed-12-other-genetic-data-partnerships-beyond-pfizer-and-genentech/
280 https://www.genomeweb.com/clinical-genomics/23andme-pfizer-launch-inflammatory-bowel-disease-genetics-study
281 http://files.shareholder.com/downloads/AMDA-23Y63R/2971088260x0x716636/a441869c-74f1-4229-9dad-4d7eb48b0b34/FMI_News_2014_1_6_General_Releases.pdf
282 https://www.nibr.com/our-research/disease-areas/oncology
283 http://www.covance.com/content/dam/covance/pdf/data-warehouse.pdf


called MapR [284]. Both the data warehouse and the MapR solution contribute to the acceleration of R&D.

Novartis is increasingly building cross-disciplinary teams that include biologists, chemists, clinicians, and data scientists. Since 2013, Novartis has been presenting a success story of its big data efforts on the official Novartis website [285]. The Novartis Institutes for BioMedical Research (NIBR) uncovered the cause of a rare kidney disease from large amounts of genomic data: previously undetected mutations in the gene LMX1B caused focal segmental glomerulosclerosis, which in turn affected the kidneys' filtering system. Big data was the game changer, according to Joseph Szustakowski, head of Bioinformatics in Biomarker Development at NIBR.

2.7.5 Merck

Merck's project to speed up the process of vaccine manufacturing makes a case for the usefulness of big data analytics in areas other than R&D [286]. Particularly low yield rates, resulting in high costs in the production of certain vaccines, prompted Merck to investigate the underlying causes. The Hortonworks Data Platform was used to meet the previously insurmountable challenges of siloed data and high costs involved in testing hypotheses. Data from disparate sources, namely (1) a process system that tags and tracks each vaccine batch, (2) a maintenance system that presents calibration settings, and (3) a building management system that measures air pressure, temperature, humidity levels and flow rates, were aligned to create a fruitful data environment for root cause analysis. This procedure turned out to be much faster and more effective than traditional spreadsheet-based analysis: 5.5 million batch-to-batch comparisons revealed that certain characteristics in the fermentation phase were strongly linked to low yield rates. Using this method, Merck can test different hypotheses in a quick and low-cost way [287]. Instead of directly testing all hypotheses in the costly lab environment, correlations in the data can be applied as a filter so that only promising hypotheses are passed on to the lab. For instance, the scientists' intuition that ingredients might have been diluted at a certain process stage was disproved by the data. Thus, Merck was able to avoid an unnecessary lab test that would have demanded high investments of both money and time.

2.7.6 Mylan and Allergan

Not much information can be found on Mylan's involvement in big data projects. However, the existence of a Global Director of Enterprise Business Analytics at Mylan Inc., who manages data scientists and is responsible for planning and executing big data strategy, suggests that Mylan is at the very least building big data capabilities [288].

Allergan formed a research alliance with NuMedii at the end of 2015 [289]. NuMedii offers predictive big data intelligence platform technology that provides Allergan with information on existing compounds that could be used in the treatment of psoriasis. Hundreds of millions of clinical data points are the basis of NuMedii's predictive analytics engine, which tries to identify new uses for existing drug compounds.
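The Merck case in section 2.7.5 (aligning records from separate systems by batch, then using correlations as a filter before lab testing) can be illustrated with a minimal sketch. This is not Merck's actual method or the Hortonworks tooling: the field names (`yield_pct`, `calibration_offset`, `fermentation_temp_c`), the numbers, and the threshold are all invented for illustration, and a plain Pearson correlation stands in for whatever statistics were really used.

```python
from math import sqrt


def join_by_batch(*sources):
    """Inner-join {batch_id: {field: value}} mappings on batch_id."""
    shared = set(sources[0])
    for source in sources[1:]:
        shared &= set(source)
    joined = {}
    for batch_id in sorted(shared):
        row = {}
        for source in sources:
            row.update(source[batch_id])
        joined[batch_id] = row
    return joined


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_hypotheses(joined, target, threshold):
    """Keep only fields whose absolute correlation with target meets threshold."""
    rows = list(joined.values())
    ys = [row[target] for row in rows]
    fields = [name for name in rows[0] if name != target]
    scores = {name: pearson([row[name] for row in rows], ys) for name in fields}
    return {name: score for name, score in scores.items() if abs(score) >= threshold}


# Hypothetical records from three siloed systems (all values invented).
process_system = {
    "B1": {"yield_pct": 62.0}, "B2": {"yield_pct": 81.0},
    "B3": {"yield_pct": 70.0}, "B4": {"yield_pct": 90.0},
    "B5": {"yield_pct": 75.0},  # no matching records elsewhere, so dropped
}
maintenance_system = {
    "B1": {"calibration_offset": 0.10}, "B2": {"calibration_offset": 0.12},
    "B3": {"calibration_offset": 0.12}, "B4": {"calibration_offset": 0.10},
}
building_system = {
    "B1": {"fermentation_temp_c": 38.0}, "B2": {"fermentation_temp_c": 37.0},
    "B3": {"fermentation_temp_c": 37.6}, "B4": {"fermentation_temp_c": 36.5},
}

joined = join_by_batch(process_system, maintenance_system, building_system)
promising = screen_hypotheses(joined, target="yield_pct", threshold=0.8)
print(promising)
```

In this toy data only the fermentation temperature passes the filter, so only that hypothesis would be handed to the lab. A correlation of this kind is a screening signal, not proof of a root cause.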

284 https://www.mapr.com/resources/novartis-relies-mapr-flexible-big-data-solutions-drug-discovery
285 https://www.novartis.com/stories/discovery/surfing-wave-big-data-analytics
286 http://www.informationweek.com/strategic-cio/executive-insights-and-innovation/merck-optimizes-manufacturing-with-big-data-analytics/d/d-id/1127901
287 http://hortonworks.com/blog/hdp-for-manufacturing-yield-optimization-in-pharma/
288 https://ieondemand.com/presentations/building-big-data-capabilities-at-mylan
289 http://numedii.com/numedii-allergan-collaboration-psoriasis/

Appendix

Table 1a: Summary of categories of digital health applications and their current status of implementation in Germany. Further details can be found in the
comprehensive table in the electronic appendix (continued on next pages).


Table 1b: Summary of categories of digital health applications and their current status of implementation in Germany. Further details can be found in the
comprehensive table in the electronic appendix.

Concluding remarks

Digital transformation is in full swing, whether we like it or not. It has already changed some industries for good and is now on the verge of percolating into healthcare, medicine, and the life sciences.

In this whitepaper, we have made clear that this is a market unlike any other. We have also shown the triggers and technology levers of business success and the layers of change where we can expect transformation to take place.

Often, technology alienates people and causes fear of being made redundant. We think that digital transformation in healthcare will, on the contrary, not deprive medicine of its workforce but will make this domain even more accessible. But the jobs will have changed in the end.

The core of this belief is the fact that in health, there is no diminishing marginal utility. This means that for every unit invested, we still gain utility. In most markets, this is not the case, because at some point they become saturated. Even if you have an infinite number of sports cars, you can only drive one at a time. Not so in healthcare, where public systems might experience more and more financial pressure, but where we also see a huge self-pay potential that can be tapped using digital methods. The literate and conscious consumer of healthcare services, the citizen scientist, will be able to make more choices as to how to live longer, how to feel better, how to sleep better, how to prevent diseases, etc.

As always, when there are big changes ahead, there is also a feeling of significant insecurity. With this whitepaper, we want to help our readers look somewhat more optimistically into the future of healthcare in terms of digital transformation. The move will be slow but steady in established markets, and of course it will be international in nature. Company business models might already be challenged or jeopardized, and if companies cannot adapt, they might be adapted.

It is not a question of whether individuals, companies, institutions etc. want to join this transformation; it is rather a question of how they will do it. To us it is beyond doubt that healthcare is and will continue to be one of the most relevant global industries in terms of revenue and sheer breadth of goods and services offered to even more customers. However, we also have a humanitarian obligation not to leave behind those who are not at the cutting edge of technology due to socioeconomic status or location. In the long run, no one can be excluded and thereby condemned to window shopping on the internet without being able to participate: personalized genome testing will at some point become the norm. Pharmaceutical companies will try more and more to become solution providers for affordable, accessible and accountable care. Sick funds and insurance companies are already in search of digital USPs that can be sold to customers. Prevention will have an impact on lifestyle in terms of functional food, 3D-printed mass-customized medication, and personal dietary and health bots. Wearables will eventually evolve into imprintables transmitting data on personal levels of fitness and wellbeing from different body compartments. Augmented reality will help us target tumors and practice surgical procedures better. Old study data will be subject to data archeology, with data scientists at the forefront of reverse drug discovery from hard drives rather than test tubes.

Professions will change and entire industry sectors must reinvent themselves, because value chains break down and take with them all those who rest on their laurels. In this sense, digital transformation in healthcare can also be looked at as a necessary evolutionary process, a process of creative destruction where old domain knowledge forms only one part of the equation and agile, unpretentious transdisciplinary approaches will make the difference.

This whitepaper should enable you to get oriented towards where you want to go tomorrow; for the future is not something that is done, it is imagined and made by us. Today for tomorrow, day by day.

Tobias D. Gantner, MD, PhD, MBA, LL.M.
Founder & CEO, HealthCare Futurists GmbH

Digital Transformation in Healthcare is a sensitive and comprehensive enterprise; it does not mean switching from fax to e-mail. It is an irreversible process of change that has at its core the intelligent and secure connection of data-producing devices, data-weighing algorithms, immersed and educated healthcare consumers, and well-trained healthcare professionals who know how to act on this intelligence responsibly and with participatory transparency.

This is a transformation we need to be observant of but not afraid of. In fact, the technology employed will put more time on our hands where no machine can ever replace us: being human to other humans when they need it most.
