
PROJECT FINAL REPORT

Grant Agreement number: 600932

Project acronym: MD-PAEDIGREE

Project title: Model-Driven Paediatric European Digital Repository

Thematic Priority: ICT – ICT-2011.5.2: Virtual Physiological Human

Funding Scheme: IP

Date of latest version of Annex I against which the assessment will be made:
Periodic report: FINAL

Period covered: from March 1st 2016 to May 31st 2017

Name, title and organisation of the scientific representative of the project's coordinator1:

Prof. Bruno Dallapiccola, Scientific Director, OPBG


Tel:
Fax:
E-mail:

Project website2 address: www.md-paedigree.eu

1 Usually the contact person of the coordinator as specified in Art. 8.1. of the Grant Agreement
2 The home page of the website should contain the generic European flag and the FP7 logo, which are available in electronic format
at the Europa website (logo of the European flag: http://europa.eu/abc/symbols/emblem/index_en.htm; logo of the 7th FP:
http://ec.europa.eu/research/fp7/index_en.cfm?pg=logos). The area of activity of the project should also be mentioned.

Declaration by the scientific representative of the project coordinator
I, as scientific representative of the coordinator of this project and in line with the obligations as stated in Article
II.2.3 of the Grant Agreement declare that:

The attached periodic report represents an accurate description of the work carried out
in this project for this reporting period;
The project (tick as appropriate)3:

x has fully achieved its objectives and technical goals for the period;
□ has achieved most of its objectives and technical goals for the period with
relatively minor deviations.
□ has failed to achieve critical objectives and/or is not at all on schedule.

The public website, if applicable

x is up to date
□ is not up to date

▪ To my best knowledge, the financial statements which are being submitted as part of this report are in line
with the actual work carried out and are consistent with the report on the resources used for the project
(section 3.4) and if applicable with the certificate on financial statement.

▪ All beneficiaries, in particular non-profit public bodies, secondary and higher education establishments,
research organisations and SMEs, have declared to have verified their legal status. Any changes have been
reported under section 3.2.3 (Project Management) in accordance with Article II.3.f of the Grant Agreement.

Name of scientific representative of the Coordinator: BRUNO DALLAPICCOLA

Date:

For most of the projects, the signature of this declaration could be done directly via the IT reporting tool
through an adapted IT mechanism and in that case, no signed paper form needs to be sent

3 If either of these boxes below is ticked, the report should reflect these and any remedial actions taken

Contents

Declaration by the scientific representative of the project coordinator
Executive Summary
  MD-Paedigree – from algorithms to clinical decisions
  Project context and objectives
    Analytics and case-based reasoning
    Data and knowledge infrastructure
    Cardiomyopathies
    Obesity
    NND
    JIA
    Project’s outcomes analysis
Description of the main S&T results/foregrounds
Potential impact
Dissemination materials
Use and Dissemination of foreground
FINAL REPORT ON THE DISTRIBUTION OF THE European Union FINANCIAL CONTRIBUTION
Appendix 1: Outcome of the Fourth Internal Review
  MD-Paedigree – Fourth internal Review meeting (Rome, 22nd-23rd May 2017) - Governing Board’s Final Statement
    Patients data collection
    Limits in the socio-economic impact analysis
    Lack of involvement from patient associations
    Other minor shortcoming
    Review by Prof. Tammo Delhaas, assessing the progress of the Cardiomyopathies study WPs 3 – 4 - 8 - 9
    Review by Prof. Rolando Cimaz, assessing the progress of the JIA Study – WP5 – WP10
    Review Dr. Adam Shortland - Report on the progress of the Neurological and Neuromuscular Disease Working Group
    Review by Prof. Maria Krestyaninova, assessing the progress of the Microbiome Study – WP7
    Review by Prof. Alberto Sanna, assessing the progress of the Infostructure WPs 14 – 15 - 16 - 17
  External Advisor comments
  ETHICAL EVALUATION CRITERIA APPLIED TO MD-PAEDIGREE

Executive Summary
MD-Paedigree – from algorithms to clinical decisions
The paradigm shift fostered by the VPH (Virtual Physiological Human) initiative, of which the Model-Driven
European Paediatric Digital Repository (MD-Paedigree) project is a direct implementation, is to provide
physicians with new tools for predicting the evolution of clinical conditions and treatment outcomes to
support individualized clinical decisions. Following this vision, MD-Paedigree was launched under the
guidance of healthcare centres of excellence, which led advanced computer-science labs, small and large
enterprises, and economics and technology-transfer experts across Europe. The goal was not only to develop
new in-silico medicine solutions but, more importantly, to address the challenges of introducing them into clinical
practice. A multifaceted operational model was therefore applied that integrated core technology development,
translational medicine initiatives, clinical validation and impact analyses, all deployed under the leadership
of physicians in Rome, Amsterdam, Berlin, Genoa, Leuven, London, and Utrecht.

Following this blueprint, four core sets of technological assets were implemented:
1. A federated clinical data repository, fed directly by clinical centres, storing and making available
highly diverse classes of medical data and equipped with information curation tools.
2. A set of mechanistic models of organs, diseases and physiological functions, able to simulate clinical
outcomes at baseline and under different care regimens.
3. A set of advanced analytics tools for the identification of clinical, biomolecular and imaging predictors
gleaned from the highly heterogeneous data set collected during the study.
4. Various advanced search tools leveraging different, but integrated forms of statistical similarity
across clinical cases to provide doctors with actionable clinical insights.

The clinical validation of predictive systems was brought to the forefront as a key step in bridging technology labs
and clinical environments. A novel Validation Framework for in-silico systems was designed to that end,
extending traditional clinical trial practices into a three-phase protocol: technical verification
on computational prototypes, then internal validation on test data sets, and ultimately external
validation on clinical cases, at baseline and follow-up. More than 630 patients were recruited to test the
predictive accuracy of the analytics and simulation systems, and records from around 50 thousand more were
collected from electronically documented routine care.

At the end of this process all systems demonstrated their ability to support clinically relevant, individualised
care decisions in one or more clinical scenarios of choice in four paediatric areas: cardiomyopathies, obesity,
neuromuscular diseases, and juvenile idiopathic arthritis (JIA), a still ill-defined, highly multifactorial
paediatric syndrome. In all these areas the project made available the following sets of capabilities to extend traditional
Electronic Health Records and hospital documentation systems:

• An extensive clinical data repository containing information-rich clinical records, including
metagenomic profiles, 3D MRI studies, lab data, clinical histories and more, all available under
documented access policies to academic centers across Europe. The system includes data curation
and validation tools to foster data accuracy and normalization.
• Integrated advanced search and visualization capabilities based on clinical similarity. These enable
clinicians to identify and review cases relevant to the one at hand (“patients like mine”) for
comparative outcome analysis and more effective decision making.
• Validated models of diseases and treatments in the four clinical areas, to perform patient-specific
simulations and predictions.
• A set of statistically significant, novel predictors for JIA, obesity and cardiometabolic syndrome, and
for severe cardiomyopathies.
• Patient-specific clinical workflows for paediatric cardiomyopathies with related cost assessments,
showing potentially vast benefits in terms of life expectancy and resource consumption.

Major technical challenges were faced, somewhat ironically, not so much in developing in-silico
models or validating them as at the operational level, mostly in integrating disparate data
sources into a semantically coherent and secure data infrastructure.

Despite these challenges, the flexibility and determination of the MD-Paedigree consortium led this project
to achieve the vast majority of its objectives and in many relevant ways even to exceed them.

Project context and objectives


The work was divided into four broad clinical areas offering a range of patient care scenarios, targets for
knowledge discovery and improved decision making, and for assessing potential impact.

Mechanistic models were developed and then personalized on actual clinical cases in the Neurological and
Neuromuscular Disease (NND) area, to simulate real-life gait and postural patterns and surgical outcomes.

Similar principles were applied in the Cardiomyopathies area, where models of anatomical structures,
electrophysiological and hemodynamic functions were developed, using data from more than 140 patients
and integrated in a Virtual Heart model capable of predicting a variety of risk profiles with direct impact on
patient management.

The effort was focused on the multiscale virtual heart as a tool for the stratification of patients and their
optimal allocation in more or less aggressive care paths.

After the model had been formally tested, socio-economic impact analyses indicated very high potential
savings under this new decision model.

Obesity was another major target of the study. Advanced analytics, including deep learning, were applied
to the development of a screening protocol for cardiometabolic risk. Leveraging a ‘low cost data first’
principle, a sequential diagnostic strategy was developed based on a chain of statistically significant, early
predictors of obesity and cardiometabolic syndrome.

Cross-sectional meta-genomics studies were executed to uncover gut microbiome profiles in relation to both
JIA and Obesity, demonstrating how some of these profiles have significant predictive power in both diseases.
To provide such insights, a novel reporting method, linking microbiome profiles to metabolites and clinical
phenotypes, was developed. In total, 169 patients (with a total of 567 follow-up visits) were included in the
multidimensional analyses in the JIA area and roughly 160 for Obesity.

Analytics and case-based reasoning


The project aimed at realizing two decision support use cases: one to allow physicians to identify and review
patients similar to the one at hand, to glean clinical insights and improve decision making (patients like mine),
and one for patients, to look up people with similar conditions (patients very close to me).

While the first scenario was fully implemented and validated by physicians, the second, while relying on the
same core technology, presented systemic challenges that could not be addressed within the scope of the
project, such as the lack of public data sharing infrastructures for citizens and still unresolved, but critical,
issues around privacy and data security. During the third year of the project a parallel initiative was launched,
MyHealth-MyData, which was funded by the EU in 2016 and is directly addressing these issues.

In turn, the ‘patients like mine’ use case was one of the areas in which the project’s goals were
exceeded. After a thorough redesign of the first Case-Reasoner prototype, the underlying machine learning
engine was updated with a new deep-learning component allowing for much higher dimensionality in
knowledge discovery and similarity search operations. The new approach also made it possible to create and use
compact, task-specific clinical case representations that can be processed in a variety of analytics
pipelines. The result was a multifunctional knowledge discovery platform which, in the
course of MD-Paedigree, identified multiple novel disease factors while providing decision support on routine
clinical cases.
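
As a rough illustration of similarity search over compact case representations, the sketch below ranks stored cases by cosine similarity to a query case. It is not the Case-Reasoner implementation: the choice of cosine similarity, the function names, the case identifiers and the three-dimensional vectors are all assumptions made for the example, and the deep-learning component that would produce such vectors is not reproduced.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def patients_like_mine(
    query: Sequence[float],
    cases: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored cases most similar to the query, best first."""
    scored = [(case_id, cosine_similarity(query, vec)) for case_id, vec in cases.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical compact representations (in practice these would be learned).
cases = {
    "case_a": [1.0, 0.0, 0.0],
    "case_b": [0.9, 0.1, 0.0],
    "case_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]
top = patients_like_mine(query, cases, k=2)
print([case_id for case_id, _ in top])
```

With these made-up vectors the two closest cases are `case_a` and `case_b`; a clinician-facing tool would then display the corresponding records for comparative review.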

Behind an integrated user workflow, seamless access was also provided to another knowledge discovery tool
specialized in text mining and applied to both physicians’ notes and medical literature, adding further richness
to the overall search capabilities. This tool was tested by physicians for its capacity to identify cases that
contribute to current patient management, which was indeed demonstrated.

Data and knowledge infrastructure


Both modelling and analytics activities imposed highly demanding requirements in terms of the volume and
variety of data sets to be collected, curated and processed. The project leveraged a federated
database infrastructure, partially developed in previous EU-funded projects, and
markedly extended it with regard to anonymization, data transfer, advanced querying and visualization
capabilities. The 637 rich and multifaceted records from patients recruited during the study were all
collected in and shared through the infrastructure, in addition to more than 48 thousand routine cases
loaded through direct connections with hospital repositories.

An online data profiling and curation system was developed and made available to the consortium to control
for inconsistencies and errors in data, to guarantee accuracy and normalization. The system applies
descriptive and advanced statistics and facilitates manual corrections through an intuitive user interface. As
an extension to this tool, an analytics and knowledge discovery platform was added during the second half
of the project. The integrated solution was applied to data sets in all clinical areas, supporting the production
of analytics insights and mechanistic models’ development.

Cardiomyopathies
Work in this area started by extending key instruments, such as MRI image segmentation, and adapting them
to cardiac anatomy. This was a prerequisite for subsequent R&D and validation tasks, ensuring sufficient processing
efficiency and data throughput.

Comprehensive circulation models were integrated to account for ventricular dynamics and large-vessel
flows. Because input parameters can be set directly, the models could be personalized: left
ventricular pressure-volume measurements were taken on the simulated scenarios and compared with actual values measured
invasively in clinical cases, showing close agreement for this and other parameters, such as aortic pressure
and ventricular volumes. These outcomes were formally validated on 50 cases at baseline and 21 cases at 2
follow-up time points, supporting the statistical soundness of the approach and results. Such parameters have
also been applied in the Obesity area to the study of cardiac risk in obese children. The clinical relevance of
this model as a tool for risk stratification and outcome prediction was therefore well established.
Computational requirements remained high, suggesting non-real-time workflows and direct integration with
radiology information systems, rather than EMRs. The leading partner in this area is indeed actively working
toward that type of commercial solution.

Obesity
In this area as well, the work started by setting up and validating key parts of the technology infrastructure,
such as automated assessment of body fat distribution from MRI images and processing pipelines
for organ and subcutaneous fat segmentation. This work led to a major result: it not only provided an accurate
way to estimate fat percentage in the liver, but subsequently made it possible to uncover how this parameter is
directly linked to the risk of obesity and cardiometabolic syndrome.

This and other results from analytics studies led to the main accomplishment in this area, i.e. the proposal
for a novel patient screening methodology aimed at identifying young people with early metabolic, genetic,
metagenomic and/or cardiovascular signs that indicate increased cardiovascular risk but cannot be
easily detected by standard diagnostic work-ups.

The new method utilizes a sequential process where risk is assessed acquiring initially easily accessible,
patient-generated data, such as questionnaires, subsequently other data from increasingly demanding and
costly measures (stool and blood samples), and finally clinical assessments and advanced medical imaging.
At each step, deep-learning based predictions determine intermediate risk and therefore the need, or not,
to engage in further assessment.
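
The sequential process described above can be sketched as a loop over stages ordered by increasing cost, stopping as soon as the intermediate risk is low enough. This is a minimal illustration only: the stage names follow the text, but the thresholds, the score fields and the placeholder predictors standing in for the deep-learning models are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical stage definition: predictors and thresholds are illustrative
# assumptions, not values from the MD-Paedigree study.
@dataclass
class Stage:
    name: str
    predict_risk: Callable[[Dict[str, float]], float]  # risk estimate in [0, 1]
    escalate_above: float  # continue to the next, costlier stage above this risk


def sequential_screening(
    patient: Dict[str, float], stages: List[Stage]
) -> Tuple[str, float]:
    """Run stages in order of increasing cost; stop once the risk is low."""
    last_name, risk = "none", 0.0
    for stage in stages:
        risk = stage.predict_risk(patient)
        last_name = stage.name
        if risk <= stage.escalate_above:
            break  # intermediate risk is low: no further assessment needed
    return last_name, risk


# Placeholder predictors standing in for the trained models.
stages = [
    Stage("questionnaire", lambda p: p["questionnaire_score"], 0.3),
    Stage("stool_and_blood", lambda p: p["lab_score"], 0.5),
    Stage("clinical_and_imaging", lambda p: p["imaging_score"], 1.0),
]

low = {"questionnaire_score": 0.1, "lab_score": 0.9, "imaging_score": 0.9}
mid = {"questionnaire_score": 0.6, "lab_score": 0.4, "imaging_score": 0.9}
high = {"questionnaire_score": 0.6, "lab_score": 0.7, "imaging_score": 0.8}

print(sequential_screening(low, stages))   # stops after the questionnaire
print(sequential_screening(mid, stages))   # stops after stool and blood samples
print(sequential_screening(high, stages))  # reaches clinical assessment and imaging
```

The design point is that costlier data are acquired only for patients whose risk remains above the threshold of the previous stage, which is what keeps the low-cost data first.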

The diagnostic power of complex imaging techniques has been extended and validated through applications
developed in this area that allow the direct diagnosis of early cardiometabolic disorders, leveraging
parameters such as vascular stiffness, ventricular hypertrophy or ectopic fat deposition, all of which establish
risk of disease but may not be detected by simpler risk models. To validate this approach, a pilot multi-centric
study was conducted over roughly 160 cross-sectional records of healthy and obese children recruited for
this purpose. Statistical cross-validation tests evaluated the different relationships among risk factors.
Multiple statistically significant parameters were generated to instantiate the screening model.
Overall, these factors make it possible to go beyond standard screening systems, primarily BMI measurements,
allowing, if used in clinical practice, much finer and more timely diagnosis.

NND
This area focused initially on the development of new techniques to extract computational representations
of anatomical structures directly from MRI images of the lower limbs using advanced image segmentation and
mapping methods. These were extended to extract individual muscle and bone shapes and skin contours
in the pelvis and leg regions. A second line of work produced a system to complete the anatomical models with
ligaments and tendon insertion points, setting and then fine-tuning the geometrical and biomechanical
underpinnings of gait simulations. These functions were then applied to the target neuromuscular diseases:
Charcot–Marie–Tooth disease (CMT), Duchenne muscular dystrophy (DMD) and cerebral palsy (CP).

Gait and postures generated by alternative sets of musculoskeletal parameters were simulated in this way,
accounting for modified tendon insertion points and slack, muscular strength, length and maximum fiber
extension. A dedicated method was also developed to incorporate the anatomical alterations of long bones
produced by surgical treatments, such as those affecting tibial torsion and femoral anteversion, based on measures taken from a
physical exam. An end-to-end computational pipeline was then built and optimized to assemble individualized
musculoskeletal models and visualize them in an animated user application. The tool makes it possible to modify the
above-mentioned parameters, as well as the walking surface and the patient’s visual feedback, for the evaluation of
resulting gaits.

The accuracy of these simulations, validated by expert opinion and on 57 patients, demonstrated the ability
to provide key insights in treatment planning for the target conditions, opening a wide range of innovations.
Their integration, for instance, in rehabilitation devices and protocols will increase the precision and
effectiveness of rehab protocols. The effects of surgical and prosthetic treatments can be extended to a
variety of areas, providing crucial insights to guide treatments. With just these two sets of applications,
patients’ outcomes could be vastly improved. From an industrial standpoint, simulated biomechanical effects
can drive the customization of prostheses and devices, improving clinical outcomes and establishing this area as
one in which the original scope and goals were exceeded.

JIA
The JIA area posed specific and multifaceted challenges, as this condition is still poorly
understood to this day, its etiology and evolution remaining unclear and probably related to a multitude of causes. Two
parallel approaches were therefore applied to study it: a mechanistic modelling of the ankle and foot was
developed, to identify clinical (gait and posture) markers of the disease, while analytics studies incorporating
highly diverse data sources (metagenomic, clinical, metabolic, imaging) were targeted at uncovering
biomolecular predictors of the disease. These parallel lines of work ultimately achieved what can be deemed
to be the most comprehensive, multilevel study of this condition to date.

High data dimensionality was indeed one of the challenges in this area, and to that end low-level data fusion
protocols were applied, correlating analytics outcomes from different platforms, ultimately leading in all
cases to more accurate predictive models compared to segregated studies. This work in turn provided
statistically significant separations between disease classes (baseline, inactive and persistent) based on
microbiome and metabolic phenotypes. Specifically, the presence of the disease was correctly classified in
more than 82 % of cases, independently of disease activity, inflammation markers, patient geography or
ongoing therapy. This innovative methodology led to a successful patent application while a second one is
being submitted at the time of writing, indicating another example of above-expectations results.

Mechanistic simulations led at the same time to highly accurate, subject-specific models, providing new
levels of information to assess these cases. Integrated analyses of images, clinical data, gait and
biomechanical simulations yielded a set of 265 variables potentially characterising the disease and treatment
outcomes, which now provide a framework for objective quantification of the progression of the pathology. While
this study pointed to a non-biomechanical etiology of the disease, it also uncovered a set of
complex patterns of ankle protection mechanisms that may, as such, contribute to determining the severity of
the disease.

Project’s outcomes analysis


Focus on validation

The intense work to validate simulations and analytical findings was framed within the standard context of
biomedical research and clinical trials in particular. While it was brought to successful completion, its last
phase opened unforeseen methodological and epistemological questions, especially in the area of long term,
patient specific predictions.

Outcomes, in fact, are subject to external dynamics that cannot be included in the predictions now, nor will it be
possible to include them in the near future. Actual likelihoods are therefore bound to pathophysiological evolutions outside
the model. This understanding highlighted the need for new validation and design approaches that the MD-
Paedigree consortium analyzed in depth, not only framing the project’s validation results in their proper
context, but also investigating possible solutions. A novel approach to validation of in-silico tools, using virtual
cohorts and uncertainty quantification methods, was captured in a follow-up proposal, submitted in response
to the SC1-PM-16-2017 call.

Physicians and healthcare systems engagement

Clinical leadership was one of the key strategic themes of the project, which purposely put scientific
coordination, system design and testing, validation and, finally, early adoption in the hands of physicians. While
this approach did indeed deliver most of the intended value, it did not entirely mitigate the major challenges
that still exist in the end-user implementation of decision support systems.

In the later stages of the process, as clinical validation and socio-economic analyses kept delivering positive
results, clinical centres still found it difficult to facilitate systems’ implementation in care settings and end-user
adoption in the face of cultural and organizational hurdles. In this view, while the project completed all tasks
required to introduce advanced technologies in clinical practice, the broader healthcare environment proved
to be the major obstacle to translating research into direct societal value.

The Exploitation plan developed in previous years drove intense activity in the submission of a number of
research proposals and business initiatives. The successful bid for the MyHealth-MyData project has made it possible
to further support the development of the Infostructure, and will lead in particular to its extension into the
realm of personal data accounts over distributed data sharing architectures, also extending analytics
applications and semantic integration. In parallel, a proposal aimed at the exploitation of all main MD-Paedigree
components was submitted for an SME Phase II vehicle and, although positively evaluated, was not
finally accepted. Of note is the acquisition of the Infostructure and anonymization development partner,
G-Nubila, by one of the leading telecommunication and IT groups in Europe, Almerys, in light of the market
potential and strategic value of the systems developed during MD-Paedigree and other EU-funded projects.
A report on patents, commercial prototypes and related industrial plans for all areas is included below, in the section
Use and Dissemination of foreground.

In the end, MD-Paedigree’s final conference was held in May 2017 in Rome and attended by more than 120
participants from 10 countries. It featured keynote speakers from academic centres of excellence, industry
and research institutions.

Description of the main S&T results/foregrounds

WP 2 Clinical and Technical user requirements for disease modeling


The main objective of WP2 was to guarantee that the features and capabilities of the disease models
meet the clinical needs and provide a significant impact on current clinical practice pathways, improving
patient care and supporting medical practice in everyday activity. In D2.1 (m 12 of the project) the
most pressing requests of the clinical partners for the MD-Paedigree-derived models were defined. In the
revised document released as D2.2 (m 24), we started the process of defining the prospective clinical
impact of the use of disease modelling in clinical practice for each of the studied disease areas. To
reach this target, D2.2 defined the current patient pathways for the four disease areas, in order to start
identifying the potential effect of modelling on the change/improvement of the clinical pathways for the
studied diseases. In the current document we provide a comprehensive analysis of the potential impact of
personalised models on current pathways, with the final aim of providing solid groundwork for the
definition of:
1. Potentially revised clinical pathways (which will be delivered by WP12 at month 51), and the
consequent
2. Potential impact on health costs (which will be explored by WP19 at month 51).

D2.3 (submitted at m 36) helped to ensure that the personalised models delivered by the technical
partners have the potential to provide a significant and innovative impact on the existing clinical
pathways for patient care. The ultimate goal is to ensure that the computational models obtained from the
project can improve the current knowledge and understanding of the disease by simulating different
aspects, namely:
i. Supporting the definition of complex pathophysiological interactions
ii. Supporting the identification of relevant and new markers of disease evolution
iii. Predicting the potential effect of a specific therapeutic intervention (whether
pharmacological, behavioral or surgical).
We gathered feedback from the technical partners to define the current status and applicability of
the personalised models in all four clinical areas. Accordingly, we refined the definition of the
prospective potential clinical impact of the use of disease modelling in clinical practice for each of the
studied disease areas.

D2.3 provided specific indications and insights to complement the work of D12.3 “Improved clinical
pathways and outcome analysis”, in order to clearly predict the potential role and impact of the
MD-Paedigree models in routine clinical practice after completion of the validation process
described in the WP12 deliverables.

WP 3 Data acquisition and processing for Cardiomyopathies

In WP3, the aim was to carry out a 33-month observational longitudinal cohort study in the three clinical
centres, enrolling 180 cardiomyopathy patients (60 per centre), with clinical, laboratory, bio-humoral,
genetic and imaging data (Echocardiography and MRI). The follow-up interval for all patients was
scheduled at 6 months to 18 months after the first visit, collecting the same data as at the baseline.
During the first year of the project the work focused mainly on clinical protocols. The forms for informed
consent, for data collection and the study protocol have been discussed, implemented and approved by
the Coordinator’s local Ethical Committee (OPBG).
OPBG started enrolment in February 2014. At UCL, approval by the local Ethics Committee was delayed
due to an initial misunderstanding: at the beginning, it was thought not only that the patients enrolled in
the study would anyway be scanned routinely at Great Ormond Street Hospital (i.e. within routine clinical
scanning), but also that this was to be performed within a former protocol for heart failure and DCM
already in place (and approved by the local Ethics Committee) at GOSH, including imaging modalities and
blood tests. However, the Ethics Committee later indicated that a specific MD-Paedigree protocol
for DCM patients would be better submitted separately.
The initial third clinical partner, Johns Hopkins University (JHU), was unable to join the project and,
in agreement with the WP4 partners, it was replaced by Deutsches Herzzentrum Berlin (DHZB) in both WPs.
DHZB finalised the Informed Consent and established the workflow and study protocol for all
mandatory parameters, in line with the DoW. The approval was granted.
However, this meant a slightly reduced number of patients enrolled for the obesity study (20), with a
corresponding increase in the number of those to be enrolled by the other two clinical centres, while
maintaining an equal share of the cardiomyopathy patients. A uniform multicentre MRI protocol was
approved and shared among the three centres.
One of the main objectives was to clearly establish the Data Collection Needs for Modelling, with a complete
clinical evaluation (as per protocol). The technical partners decided that 4D flow data from MRI would be
better suited to building the model. Electrical axis parameters were needed by modelling for personalisation, and
some EKG values were also necessary for validation. The following solution was agreed with the clinical
partners: clinicians were asked to send the scanned EKG, with times already calculated, to the technical
partners; furthermore, some intermediate scans (echo and/or MRI) were sent between the baseline and
the follow-up control in order to improve the building of the model.
During examinations clinicians realised that CPX was difficult to perform in young patients, and therefore
only a small subgroup performed it in the course of the study, while it was decided that, for modelling, the
following activities would be methodically performed on all patients:
• Scanned EKG
• Systemic BP (max, mean, min) taken at the time of Echo and MRI scan
• Echocardiography:
➢ 3D Scan
• MRI:
➢ Flow
▪ 2D Flow and 4D Flow
• Cath data, when taken for clinical reasons (TX listing)
• Karto data, if collected for clinical reasons

During the project it was important to reach an agreement concerning the standard to export 3D data. In
terms of providing retrospective datasets, issues were raised regarding the sharing (non-disclosure)
agreement with the technical partners. Maat G, in collaboration with VPH-Share, set up an uploading and
anonymisation system. Both OPBG and UCL initially provided examples of short axis stacks and flow
sequences allowing their comparison with each other, leaving to SAG the decision on how to parameterise
the data. In order to achieve higher accuracy for data with breathing artefacts, a method to automatically
align short axis stacks, making use of existing mesh models in combination with a slice registration
algorithm was implemented.
The enrolment period for WP3 ended at month 36. Patient enrolment was completed by all three centres.
OPBG enrolled 69 patients; UCL enrolled 69 patients; DHZB enrolled 60 patients. In total, 198 patients were
enrolled (exceeding the expected target number of 180).
From month 30 to month 40 the WP3 activity focused on completion of the data acquisition, including
follow-up studies. In addition, data (such as genetic tests and other imaging modalities) from some
patients who had been included in the study but could not undergo an MRI were also analysed.
During the data collection it became clear that the number of available MRI studies would be lower than
the initial target due to a number of unexpected clinical and ethical reasons:
• The number of patients requiring anaesthesia was above the number expected at the beginning of
the study.
• Follow-up MRI might not be clinically justifiable, especially if performed under general anaesthesia (i.e. the
risk associated with the procedure outweighs the benefit provided by the examination).
• Parents’ low compliance with performing MRI under general anaesthesia (i.e. not signing consent)
• Patients unable to undergo the MRI study due to claustrophobia
Nevertheless, after thorough discussions with the technical partners, it was agreed that the minimum
number of complete datasets needed to build an adequate modelling tool is around 30 to 40 baseline
patients, which is significantly lower than initially expected.
Finally, the following examinations were performed:

• Echo:
Considering the total number of examinations (first + second), we performed 234
echocardiograms overall (114 OPBG, 72 UCL, 48 DHZB). Including follow-ups lost for clinical reasons (death,
transplant list, PMK implantation), this number should be increased to 272. Moreover, 16 Echo studies were
missing. We stated in our self-assessment plan that by May we should have had a minimum of 96
echocardiographic studies per centre, with an overall target of 288 studies.
Follow-up Echo examinations were performed in 71 patients (45 OPBG, 12 UCL, 14 DHZB).

• MRI:
We performed 195 MRI studies overall (85 OPBG, 75 UCL, 35 DHZB). Including follow-ups lost for clinical reasons
(death, transplant list, PMK implantation), this number should be increased to 233. We stated in our
self-assessment plan that by May we should have had a minimum of 78 MRI studies per centre, with an
overall target of 234 studies. On that basis we reached 233, which is very close to the target: OPBG was on target
and UCL was almost there, while DHZB (Berlin) was behind schedule.
Follow-up MRI examinations were performed in 40 patients (25 OPBG, 12 UCL, 3 DHZB).

Moreover, 36 Cath data and 122 4D Flow data were available.
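The bookkeeping behind these figures can be checked with a short script. The sketch below simply re-adds the per-centre counts quoted above and compares them with the self-assessment minimums; the function and variable names are illustrative and not part of any project tooling.

```python
# Per-centre study counts, copied from the text above.
ECHO = {"OPBG": 114, "UCL": 72, "DHZB": 48}
MRI = {"OPBG": 85, "UCL": 75, "DHZB": 35}

# Self-assessment minimum number of studies per centre, as stated in the text.
ECHO_MIN_PER_CENTRE = 96
MRI_MIN_PER_CENTRE = 78

# Totals quoted in the text once follow-ups lost for clinical reasons are added.
ECHO_WITH_LOST = 272
MRI_WITH_LOST = 233


def summarise(counts, minimum_per_centre):
    """Return the overall total, the overall target and the per-centre shortfall."""
    total = sum(counts.values())
    target = minimum_per_centre * len(counts)
    shortfall = {c: max(0, minimum_per_centre - n) for c, n in counts.items()}
    return {"total": total, "target": target, "shortfall": shortfall}


echo_summary = summarise(ECHO, ECHO_MIN_PER_CENTRE)
mri_summary = summarise(MRI, MRI_MIN_PER_CENTRE)

# Number of lost follow-up studies implied by the quoted totals.
echo_lost = ECHO_WITH_LOST - echo_summary["total"]
mri_lost = MRI_WITH_LOST - mri_summary["total"]

print(echo_summary, echo_lost)
print(mri_summary, mri_lost)
```

Both modalities imply the same number of lost follow-up studies, and adding the 16 missing Echo studies to 272 gives the Echo target of 288.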


Analysis has been carried out on the whole multicentre database, with defined primary outcomes including
heart transplant and/or cardiac death. Secondary outcomes have been defined as: hospitalization event
and/or worsening of cardiac functional class (by either Ross or NYHA classification according to age, as
suggested by current guidelines).

WP 4 Data acquisition and processing for the estimation of CVD risk in obese children
There are 1.4 billion people in the world who are overweight or obese. This is estimated to account for
44% of diabetes, 23% of ischaemic heart disease and between 7 and 41% of cancers. This significant health
burden has been known about for decades but there has been very little progress in the development of
effective prevention measures. There is also little understanding of how obesity causes disease. Together
with smoking, obesity and two obesity-associated conditions, hypercholesterolaemia and hypertension,
make up the top four leading risk factors for death. Therefore, effective prevention of obesity and its
associated risk factors would have a significant impact on human health. However, efforts to date have
focused on behaviour modification e.g. diet and exercise advice with limited effect. It is not certain why
such public health interventions are unsuccessful but there is growing evidence that individuals vary widely
in their response to certain interventions, that innate homeostatic mechanisms that evolved to preserve
body weight combat efforts to reduce it and that these mechanisms may have powerful biological
underpinnings that have not been fully delineated. For example, experiments in animals have shown that
restriction of maternal nutrition in pregnant animals results in offspring who prefer a hypercaloric diet and
have lifelong hyperphagia and limited interest in physical activity, compared to controls whose mothers
were fed normally [Vickers, Am J Phys RICP, 2003]. Thus, to develop effective prevention, a greater
understanding of the biological underpinnings of obesity-related disease risk is needed. These risk factors
emerge early and have significant deleterious effects on health in childhood. One in six adolescents has
coronary atheroma, and the effects of BMI on cardiometabolic risk factors such as blood pressure and
cholesterol in children mirror its effects in adulthood. This has led expert groups such as the American
Heart Association to call for “primordial prevention” in their publication, “Defining and Setting National
Goals for Cardiovascular Health Promotion and Disease Reduction: The American Heart Association’s
Strategic Impact Goal Through 2020 and Beyond”. In essence, primordial prevention aims to address
primary risk factors before cardiovascular disease develops. Such factors include hypertension,
hypercholesterolaemia and insulin resistance, which are all more common in the obese, and obesity itself.
Given the early emergence of cardiovascular disease risk factors, such primordial prevention necessitates
a better understanding of the cardiometabolic abnormalities present in young obese individuals than is
available to date.
To develop a better understanding of cardiometabolic risk processes in the young, MD-Paedigree WP4 has
focused on the complex interplay between multiple cardiovascular and metabolic systems in obese and
non-obese adolescents. The deliberate attention to complex relationships between variables
acknowledges the growing understanding that obesity does not have a single cause and there is no single
mechanism that links obesity with subsequent disease development. The known cardiometabolic
abnormalities associated with obesity vary widely in their presence and severity in both obese and non-
obese individuals. Therefore, other factors beyond obesity confer vulnerability to these conditions. The
earliest signs of such negative consequences are also difficult to detect as they may not be reflected in
simple, traditional measures. For example, early vascular damage may be associated with increased
arterial stiffness but not yet manifest as significant hypertension in the young. MD-Paedigree has
developed an expanded multimodal assessment of the at-risk phenotype, together with genotype and
microbiome analysis to identify early disease processes. The aim has been to leverage this complex
assessment to develop a screening process for early disease development.
There has been close collaboration between UCL and Siemens to develop the analytical approach needed
for these complex data. To set the direction for this, the WP4 leader developed a model of a multi-staged
sequential screening process to identify high-risk populations. The concept underpinning this was to
determine how well low-cost, easily obtainable data predict high-cost, complex data that might only be
obtainable in a clinical setting such as a hospital. This would allow staged reductions of population size
until the high-risk population is found, ensuring that only those with the greatest risk are examined with
the costliest medical assessments. Although a final, commercial tool for this is not a realistic goal of the
present project, the aim was to demonstrate feasibility and develop the methods needed for such a tool.
This could underpin further work in which a larger dataset is collected, focusing on the key
variables found in the present study to be promising predictors of disease
development.
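The staged screening concept can be illustrated with a minimal sketch. It is not the project's tool: the stage names, costs, variable names and thresholds below are all hypothetical and have no clinical validity. They only show how each cheap stage shrinks the population before a costlier stage is applied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Person = Dict[str, float]


@dataclass
class Stage:
    name: str
    cost_per_person: float                  # hypothetical cost units
    is_high_risk: Callable[[Person], bool]  # hypothetical decision rule


def run_screening(population: List[Person], stages: List[Stage]):
    """Apply stages in order; only people flagged at one stage reach the next."""
    remaining = list(population)
    total_cost = 0.0
    sizes = []
    for stage in stages:
        # Everyone still in the pool is assessed (and paid for) at this stage.
        total_cost += stage.cost_per_person * len(remaining)
        remaining = [p for p in remaining if stage.is_high_risk(p)]
        sizes.append((stage.name, len(remaining)))
    return remaining, sizes, total_cost


# Illustrative stages, ordered from cheapest to costliest (all values invented).
stages = [
    Stage("questionnaire", 1.0, lambda p: p["bmi_z"] >= 2.0),
    Stage("blood tests", 20.0, lambda p: p["alt"] >= 40.0),
    Stage("MRI", 500.0, lambda p: p["liver_fat_pct"] >= 5.0),
]

# Invented population of four people.
population = [
    {"bmi_z": 0.1, "alt": 18.0, "liver_fat_pct": 1.5},
    {"bmi_z": 2.4, "alt": 22.0, "liver_fat_pct": 2.0},
    {"bmi_z": 2.8, "alt": 55.0, "liver_fat_pct": 9.0},
    {"bmi_z": 2.1, "alt": 48.0, "liver_fat_pct": 3.0},
]

high_risk, sizes, total_cost = run_screening(population, stages)
print(sizes, total_cost)
```

In this toy example only two of the four people reach the costliest stage, so the total cost is lower than it would be if everyone were examined with the most expensive assessment.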
Two principal study populations were collected by OPBG and UCL. The DHZB element of the study did
not recruit the intended number of participants and DHZB were unable to process their data into a
usable format due to a lack of resource. Therefore, their data were not included in the final project.
OPBG collected 106 participants at baseline, with 50 follow-ups. These were obese patients with no
normal-weight controls. Patients underwent routine clinical weight-loss management for a period of up
to 12 months. Six patients had intragastric balloons, two had sleeve gastrectomies and the remainder
were given dietary advice and support. Their resting (unchallenged) physiological phenotype was
determined using the same baseline protocol as that used in UCL. Thus, a common set of data were
available for the two centres. The advantage of the OPBG dataset is in the routine weight-loss program
that participants entered, allowing us to determine which baseline characteristics predict susceptibility
to weight loss in the adolescent obese populations.
UCL collected 82 participants, with 82 meal-challenge responses. There was an approximately even split
between overweight/obese participants and normal-weight volunteers. There was also an even sex
distribution. No weight loss intervention was available at UCL but advanced imaging techniques and
other tests were exploited to determine the dynamic cardiovascular, metabolic and endocrine
responses to high calorie ingestion and advanced MRI was used to determine accurate organ fat
quantification. The advantage of the UCL dataset, therefore, was the ability to compare the effects of
obesity with parameters in normal controls and the deep phenotype obtained.
KEY RESULTS
OPBG data demonstrated a variable response to intervention. The following figure shows the change in
BMI z-score over the follow-up period for each participant in the follow-up group (N=50).

35/50 participants (70%) lost 0.1 to 1.5 SD of BMI (mean 0.46 SD) and the remainder (15/50; 30%)
gained 0.01 to 0.63 SD of BMI. In the whole group, the mean change in BMI was a loss of 0.26 SD
(P=0.0002). Thus, the group-wise effect was to lose weight, demonstrating effectiveness of the
weight-loss management program.

UCL began by developing a metabolic stimulus (meal challenge) consisting of 350 mL of cream and
maltose syrup providing 1635 kcal. UCL showed previously that this meal stimulated a strong cardiovascular
response and that the cardiovascular changes were not due to a volume effect on the circulation
because an equivalent volume of water had no effect (see figure below):

This was published in the American Journal of Physiology as detailed in the previous annual report.
Subsequently, UCL showed that the meal induces an abnormal response in the adolescent obese
participants that differed from that of leaner controls. This response, including hyper-insulinism,
hypertriglyceridaemia and lymphocytosis suggested the obese were more vulnerable than lean
individuals to a pro-atherosclerotic milieu when ingesting high-energy foods. These data were presented
by the WP4 leader at the AHA conference in Orlando, Florida in 2015.
UCL have now shown that adolescents usually experience a post-prandial reduction of sympathetic
nervous system activity following the meal, which concords with evidence in animals that food ingestion
activates a sympatho-inhibitory reflex, the primary effect of which is to inhibit sympathetic outflow to
gut vasculature, increasing blood supply to this organ.
Most recently, UCL have shown blood flow to the gut is blunted in the obese only if they have failure of
sympathoinhibition (i.e. their circulating adrenaline does not reduce or increases after the meal). This
accords with data in animals but has never been demonstrated in humans before. Interestingly, it is only
in the obese with such a failure of sympathoinhibition that a blood pressure rise after the meal ingestion
was observed. Thus, it may be that hypertension develops in such individuals due to an inability to
reduce gut vascular resistance following each meal. Given that humans spend approximately 16 hours in
the post-prandial state in the Western World, this effect could have significant impact on our risk of
developing hypertension. Further support for this possibility was demonstrated by examining the
relationship between blood flow responses to the meal and vascular outcome measures.

This shows that liver fat percentage, a crucial metabolic risk factor known to be associated with
cardiometabolic disease, can be substantially predicted (76% of the variance) by age, body weight, ALT,
HDL, GGT and Tanner score. These predictor variables are all relatively cheap to obtain. Thus, individuals
most likely to benefit from assessment of their liver by MRI could largely be determined by a simple
questionnaire and some blood tests.
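The report does not state which statistical model produced the 76% figure. Assuming it is the coefficient of determination (R²) of a multiple linear regression, the following sketch shows how such a value is computed. The data are synthetic and the relationship between the predictors and the outcome is invented purely to exercise the code.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit with intercept (normal equations); returns (coefficients, R^2)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic cohort: age, weight, ALT, HDL, GGT, Tanner score (all values invented).
rng = random.Random(0)
rows, liver_fat = [], []
for _ in range(80):
    age = rng.uniform(12, 18)
    weight = rng.uniform(45, 110)
    alt = rng.uniform(10, 80)
    hdl = rng.uniform(0.8, 1.8)
    ggt = rng.uniform(8, 60)
    tanner = float(rng.randint(1, 5))
    rows.append((age, weight, alt, hdl, ggt, tanner))
    # Invented relationship plus noise; not derived from project data.
    liver_fat.append(0.05 * weight + 0.1 * alt - 2.0 * hdl + rng.gauss(0, 1.5))

beta, r2 = fit_ols(rows, liver_fat)
print(f"R^2 on synthetic data: {r2:.2f}")
```

An R² of 0.76 would mean that the fitted combination of predictors accounts for 76% of the between-subject variance in liver fat percentage.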

WP 5 Data acquisition and processing for Juvenile Idiopathic Arthritis

Juvenile idiopathic arthritis (JIA) is a broad term that describes a clinically heterogeneous group of arthritides
with onset before the age of 16 years, lasting more than 6 weeks and of unknown origin. The cause
and pathogenesis of JIA are still poorly understood, but they likely include both genetic and environmental
components. Moreover, disease heterogeneity implies that different factors probably contribute to its
pathogenesis and causes. Affected joints develop synovial proliferation and infiltration by inflammatory
cells which may ultimately lead to destructive lesions of joint structures, disability and high disease-related
costs. Indeed, JIA, which affects approximately one in 1,000 children, represents the leading cause of
childhood disability from a musculoskeletal disorder.
The present ability to predict the disease course and outcome is limited. Within the FP6 Health-e-Child
project, ICT tools for diagnosis and scoring of JIA, based on image data of the wrist, have been developed.
This framework is the basis for the developments planned for MD-Paedigree.
During the 4 years of the MD-Paedigree project, the three clinical centres involved (IGG, OPBG and
Utrecht) have continuously and consistently collected data of the participants, according to the protocols
established at the start of the study. Of the 169 participants, seven were lost to follow-up for
various reasons (mainly transfer to other clinical centres, with no response to telephone or email
contact). The remainder attended at least one clinical follow-up visit, at which clinical data, biological
samples and imaging data were collected, as appropriate. Together, these patients contributed 567
follow-up visits (161 at 6 months, 149 at 12 months, 136 at 18 months, 94 at 24 months and 27 flare visits).
All patients underwent routine laboratory tests at baseline and at regular intervals during treatment. Patients who
were in remission and not on treatment generally did not have laboratory tests performed.
Blood for cytokine profiling was collected from 137 patients at baseline. A subset of cytokines, mainly the
matrix metalloproteinases 1 and 3 as well as YKL-40, correlate with disease activity at baseline (number of
active joints, juvenile arthritis disease activity score). Many cytokines show differences between the
various age groups in the cohort.
Microbiota profiling has been performed for all patients contributing a faecal sample. Clear differences in
gut microbiota composition were observed between patients and healthy controls. Additional differences
were found between Dutch and Italian patients and among the various age groups in the cohort. These
observations are in line with previous studies in patients (mainly adults) with autoimmune diseases and
open up a line of exciting new research to unravel the pathways that link gut microbiota to autoimmune
disease and JIA.
Ultrasounds and MRIs have been performed according to protocol. Regarding the MRI and CGA protocols,
despite their complexity and length, our juvenile patients tolerated the protocols well and we had a very
good retention rate, with data missing at the various time-points for only 13 out of the expected
138 datasets. However, in about 30% of the cases part of the data was either missing due to technical
problems or deemed not of good enough quality for post-processing. Nonetheless, this still left us with
the largest modelling database of this kind ever collected for children with JIA, with a total of 124 usable
datasets.
Using clinical, imaging, microbiota and Luminex data, prediction models were developed for disease
evolution in JIA. Specifically, the occurrence of inactive disease, defined using internationally validated
criteria, was predicted at the various follow-up visits. Even though the models for all patients together showed
moderate performance in test data, improved models could be fitted when considering oligoarticular
patients, polyarticular rheumatoid factor negative patients and antinuclear antibody positive patients
separately. In these subgroups, a set of baseline predictors of disease activity was found, including the duration of
morning stiffness, the haemoglobin level, the gut microbiota operational taxonomic unit
Mogibacteriaceae and the chemokine CXCL-9. The performance
of the models in test data was moderate. As usual, the models require validation in independent cohorts.
Furthermore, the association of Mogibacteriaceae and CXCL-9 with disease activity merits further study to
elucidate their potential role in JIA disease pathogenesis.
For WP10 an automated segmentation framework was created for bone structures in both feet and the
lower limbs. Using this framework, the bones in more than 90 datasets were automatically segmented.
Post-processing was applied to these segmentations to achieve the high level of accuracy needed for
personalised models.
Based on these high accuracy segmentations, we have now built all the personalised models and run 80%
of the simulations. A preliminary statistical analysis of the results has confirmed the excellent
discrimination power of this extraordinary data collection and offered 50 anatomo-functional biomarkers,
out of the 235 we tested, as possible candidate prognostic predictors. While it will take months to
complete the in-depth analyses, the use of subject-specific modelling may finally shed light on this
complex paediatric syndrome.
Gait analysis provided evidence of kinematic and kinetic alterations of the locomotor pattern. These
alterations were evident during the acute state, showing how the patients kept the inflamed joint at relative
rest. The integration with MRI and the development of a more accurate biomechanical ankle model
allowed a better understanding of the involvement of muscle and soft tissue in the observed patterns. These
results enable clinicians to prescribe specific motor training, in addition to pharmacological
treatment, in order to avoid incorrect pattern recovery. Future in-depth analysis of the personal style of
movement could contribute to understanding its role, especially in case of relapses.
Clinical, laboratory, immunological, microbiota and imaging data have been uploaded to the online
repository. An automated file transfer algorithm has been developed to extract the variables from the
local databases and insert them in the online platform, taking account of the hierarchical structure of the
data (patients → visits → variables).
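As an illustration of what respecting this hierarchy involves, the sketch below flattens a nested patient, visit, variable structure into one row per stored value. The record layout, identifiers and variable names are hypothetical; the actual transfer algorithm and the repository's upload format are not described in this report.

```python
import csv
import io

# Hypothetical local database layout: patient -> visits -> variables.
local_db = {
    "P001": {
        "baseline": {"active_joints": 4, "haemoglobin_g_dl": 12.1},
        "6_months": {"active_joints": 1, "haemoglobin_g_dl": 12.8},
    },
    "P002": {
        "baseline": {"active_joints": 2},
    },
}


def flatten(db):
    """Yield one (patient, visit, variable, value) row per stored value."""
    for patient_id in sorted(db):
        for visit, variables in db[patient_id].items():
            for name, value in variables.items():
                yield (patient_id, visit, name, value)


def to_csv(db):
    """Serialise the flattened rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["patient_id", "visit", "variable", "value"])
    writer.writerows(flatten(db))
    return buf.getvalue()


rows = list(flatten(local_db))
csv_text = to_csv(local_db)
print(csv_text)
```

Keeping the patient and visit identifiers on every row means the hierarchy can be rebuilt on the receiving side, and visits with missing variables simply produce fewer rows.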

Key results
1. Significant differences have been found in the composition of gut microbiota between patients
and healthy controls, meriting further exploration in future research projects.
2. Correlations have been found between cytokines and disease activity at baseline.
3. Prediction of disease evolution has proven to be challenging in this cohort. Some relevant variables
have been found in the clinical, microbiota and Luminex data set, but collectively these could not
predict disease evolution with satisfying accuracy. Accuracy improved upon analysing
oligoarticular and polyarticular patients only. Future efforts should be directed at unravelling the
heterogeneity underlying JIA.
4. We are the first to collect such a large anatomical and functional database for biomechanical
modelling in JIA. Preliminary analysis identified a set of promising biomarkers.
5. An automated segmentation framework was developed to facilitate the biomechanical modelling
of the datasets.
6. Gait analysis provided evidence of changes of locomotor pattern mainly aimed at the reduction of
the loads on the joint. The history of the motor pattern recovery could be influenced by
personalized training.

WP 6 Data acquisition and processing for NND

The aim of WP6 was to collect data from patients affected by Neurological and Neuromuscular diseases,
in order to provide the basis for the modelling partners to build patient-specific models as part of
WP11, as well as to provide a large dataset of both retrospective and prospective data for probabilistic
modelling in WP14-16. All the data collected within this WP were to be stored in the digital repository.

In this work package, first standardized protocols for technical quality assurance as well as marker
placement and operational procedures were developed; next a large data set was collected according to
these protocols; and standardized output formats were developed and implemented to ensure
systematic upload to the repository. These steps are described below in more detail.

Technical Quality Assurance


Two levels of protocols were considered: the technical quality assurance of the performance of the
equipment in the three laboratories (also called “low level”), as well as the overall performance of the
repeatability of measurements in the lab on actual subjects (“high level”).

For both levels, URLS, which was responsible for the Technical Quality Assurance, developed the
protocols and performed measurements to assess the quality of the measurements conducted in the
involved labs. The CGA centers involved in the experimental protocol were:
i. KU Leuven (KUL)
ii. VU Medisch Centrum (VUMC)
iii. Children’s Hospital ‘Bambino Gesù’ (OPBG)

The experimental protocol of the low-level quality assurance consisted of the validation, across the three
labs, of: (i) the optoelectronic system (OS-validation), (ii) the force platform (FP-validation), and (iii) the
signal synchronization (S-synchro). As regards the OS-validation, the performance of the optoelectronic
system was comparable between KUL and OPBG, while lower accuracy was found at VUA. For the
FP-validation, the platforms behaved differently at each center. Finally, for the signal
synchronization (S-synchro) between force platform and EMG system, the largest time
delay between EMG and FP was found at KUL.

For the high-level protocol, two healthy children were enrolled and performed five walking trials at the KUL,
OPBG and VUA laboratories. Two therapists per center performed the marker placement according to
the protocol adopted in each lab. Kinematics, kinetics, and timing of EMG activation were considered for
each subject, each operator and each laboratory. Then, the inter-operator and the inter-laboratory
reproducibility were evaluated. The inter-operator reproducibility was excellent for the kinematic and
kinetic variables in the sagittal plane at each center, while lower reproducibility was found for the
out-of-sagittal-plane variables. The inter-laboratory reproducibility was lower than the inter-operator one, but
always in the range of good reproducibility. As regards the EMG activation, good inter-operator
and inter-laboratory reproducibility was found.

Standardized Clinical Gait Analysis protocol


First, consensus was reached between all partners on all operational procedures. This was done through
a comparison of the protocols of participating centers; an inventory among 11 gait labs world-wide; consensus
meetings; and practice meetings.

The final protocol included detailed descriptions of:


1. Preparation of the lab
2. General patient history
3. Gait specific information
4. Gait analysis measurements
5. Physical examination
6. Energy expenditure
7. 6-min walk test
8. MRI
The protocol was made freely available at
http://www.md-paedigree.eu/clinical-scenario-nnd/gait-analysis-protocol/

Comprehensive clinical data set of gait analysis data for CP, DMD, CMT
A large comprehensive clinical dataset of gait analysis data for CP, DMD and CMT has been collected
(Table 1), as well as MRI data sets of these patient groups. The data set comprises a total of 863 patient
visits (33 more than anticipated), each consisting of a 3D clinical gait analysis and a set of clinical data
(physical exam, patient history, gait specific information, etc). Some specific data sets are complemented
with additional data such as MRI images, hand-held dynamometry and energy expenditure. Many of those
are pre- and post-treatment combinations. All extended data sets that included MRI images (66 total)
have been shared with the NND biophysical modeling partners in WP11. A large data set of 426
retrospective gait analyses (213 patients) from KUL has been shared with the probabilistic modeling
partners in WP15 and WP16.

TABLE 1: Total numbers of data collected


Patient Reference                Complete Acquired   GOAL
TOTAL OVERALL PATIENT DATA       863                 830
Total CP prospective extended    28*                 30
Total CP prospective clinical    136                 120
Total CP retrospective           626                 600
Total DMD T0                     26*                 20
Total DMD T1                     16                  20
Total CMT T0                     18*                 20
Total CMT T1                     13                  20
*For these patients also MRI images were collected, if possible

Standard output formats


Specific tools have been developed to upload the acquired data into the repository. These include a large
Excel spreadsheet for data entry of clinical data per patient, and an adapted version of the Data
Processing Suite (DPS) to convert these clinical data to the infostructure file formats. For gait data, a list
of clinically relevant outcome parameters (CROPs) has been defined, as well as a standardized format
(.csv) with standardized names. Custom-made MATLAB programs have been developed to export this list
of CROP parameters in the agreed format. Furthermore, the time-normalized curves for all relevant
joint angles, moments and powers are also stored in a similar standardized .csv format. These .csv files are
then uploaded to the repository.
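The project's export programs are written in MATLAB and are not reproduced here. The Python sketch below only illustrates the general idea of time-normalising a curve and writing it to a .csv file. It assumes linear resampling of one gait cycle to 101 points (0 to 100% of the cycle), which is a common convention but is not confirmed by this report. The column names and sample values are hypothetical.

```python
import csv
import io


def time_normalise(samples, n_points=101):
    """Linearly resample one gait cycle to n_points spanning 0..100% of the cycle."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    last = len(samples) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)  # fractional index into the raw samples
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def curves_to_csv(curves):
    """Write equally long curves (name -> values) as columns of CSV text."""
    names = sorted(curves)
    n = len(curves[names[0]])
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["percent_gait_cycle"] + names)
    for i in range(n):
        pct = 100.0 * i / (n - 1)
        writer.writerow([f"{pct:.0f}"] + [f"{curves[k][i]:.3f}" for k in names])
    return buf.getvalue()


# Made-up knee flexion samples (degrees) covering one gait cycle.
raw_knee = [5.0, 20.0, 10.0, 60.0, 5.0]
curves = {"knee_flexion_deg": time_normalise(raw_knee)}
csv_text = curves_to_csv(curves)
print(csv_text.splitlines()[:3])
```

Resampling every curve to the same number of points is what allows trials of different duration, from different laboratories, to be stored and compared column by column.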

WP 7 Genetic and Metagenomic analytics
In the context of clinical data science, the MD-PAEDIGREE project has provided one of the most
advanced approaches to integrating data for disease prediction modelling. Within this framework,
the clinical and diagnostic approach to understanding the gut microbiota in relation to
obesity and juvenile idiopathic arthritis (JIA) is completely new. Indeed, the most advanced theory addressing the
“which came first, the chicken or the egg” question in terms of microbiota enterophenotypes, disease phenotypes and
treatment is the “common ground hypothesis”, recently discussed in NEJM 375;24, December 15, 2016.
The “common ground” hypothesis has been proposed to explore the question of whether imbalances of
gut microbial communities are a consequence or a cause of chronic polygenic diseases. This hypothesis
posits that:
i) various endogenous or exogenous factors, or combinations of such factors, trigger an increase
in gut permeability (“leaky mucosa”) or mucosal inflammation either directly or through
selective pressure on the gut microbiota;
ii) in persons who are genetically susceptible to one or more chronic disorders, the subclinical
intestinal abnormalities favor the expansion of opportunistic microbes and the transition to
pathobionts;
iii) microbial gene products from the dysbiotic pathobiont gut communities promote local or
systemic morphologic and functional changes that are pathogenic;
iv) once disease-associated gut microbiota have been expressed in a genetically susceptible
person, they can be transferred from that person to a genetically sensitive recipient, acting
as a continual and contributing pathogenic mechanism.
Our group has studied, and continues to study in depth, disease prediction models that employ big data
(multidimensional data) from metagenomic and metabolomic platforms, providing both descriptive
(metataxonomy) and functional predictive models of the microbiota associated with diseases. The approach
moves from multivariate to univariate prediction models through successive reduction of the data
dimensions. The models have produced diagnostic predictive models of disease, but also prognostic
models, as recently demonstrated for hepatic steatosis (Del Chierico F, Nobili V,
Vernocchi P, Russo A, Stefanis C, Gnani D, Furlanello C, Zandonà A, Paci P, Capuani G, Dallapiccola B,
Miccheli A, Alisi A, Putignani L. Gut microbiota profiling of pediatric nonalcoholic fatty liver disease
and obese patients unveiled by an integrated meta-omics-based approach. Hepatology. 2017 Feb;65(2):451-464.
doi: 10.1002/hep.28572. Epub 2016 Jun 2). The paper was discussed in an Editorial in Hepatology on account
of the novelty of the approach (HEPATOLOGY 2017;65:401). The integrated model describes the predictive
prognostic model of microbiota enterophenotypes associated with the disease stages NAFL and NASH, identifying
operational taxonomic units and metabolites as markers of disease (please see Figure below). For the obesity
model, 100 Italian obese patients (age 9-19 years), 27 Italian obese paediatric patients with Non-Alcoholic
Fatty Liver Disease (NAFLD), 26 Italian obese paediatric patients with Non-Alcoholic SteatoHepatitis
(NASH) and 79 Italian normal-weight healthy controls (CTRL; age 7-18 years) have been studied.

To stratify microbiota profiles according to disease phenotype, targeted microbiome
pyrosequencing was performed and the profiles were compared with those of healthy subjects, as described above, to
describe the ecological diversity, richness and composition of the gut microbiome. We described
and classified gut microbiota components using ecological (diversity) indexes. Gut microbiota
components were classified as transients or resilients, and described at the L2, L5 and L6 levels of taxonomy.
A microbiome-based diagnostics method was also developed and/or applied to infer
specific microbial biomarkers of disease and to identify bacteria lacking in the microbiota profile,
which could potentially be exploited as probiotics for therapeutic use. All available parameters for the
classification of microbiota enterophenotypes were considered, such as age, geographical origin of
patients, stage of disease, treatments, laboratory and clinical features, and metabolomics, when available.
Within the obesity-related enterophenotype, the passage from childhood to adolescence has been
inferred by exploiting microbiota profiling, CDA and ROC-based computations. The model is
reported below (Del Chierico et al., 2017, Obesity, Manuscript Submitted, ID 17-0576).
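
As an illustration of the ecological indexes mentioned above, richness and the Shannon diversity index can be computed from a vector of per-taxon read counts. This is a minimal sketch; the report does not state which diversity indexes were used, so the choice of the Shannon index and the example counts are assumptions.

```python
import math

def richness(counts):
    """Number of taxa observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical OTU counts for two samples (not project data).
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
print(richness(even_sample), shannon_index(even_sample))
print(richness(skewed_sample), shannon_index(skewed_sample))
```

Both samples have the same richness, but the sample dominated by one taxon has a lower Shannon index, which is why richness and diversity are usually reported together.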

Microbiome profiles from 52 English obese paediatric patients have also been generated, but further
computational analyses are necessary to generate disease-related microbiota enterophenotypes.
JIA-related microbiota profiles were assessed in 99 Italian and Dutch treatment-naïve JIA patients at
baseline, 44 patients with Inactive Disease (ID), 25 patients with Persistent Activity (PA) and
107 matched healthy Italian and Dutch controls (CTRLs) in a large prospective study. Random forest and
univariate analyses provided microbial biomarkers for both Dutch and Italian patients that depended on
geographical origin more than on disease profile. Dysbiosis was a disease marker of JIA patients at different
disease stages, irrespective of their geographical origin. Differences were found between Italian and Dutch
samples. JIA samples at baseline showed reduced richness compared with matched controls (p=0.005). JIA
and CTRL samples could be distinguished based on changes in the microbiota composition, most notably
in Erysipelotrichaceae, Faecalibacterium prausnitzii and Allobaculum.
For metabolomics analyses, samples were collected from 75 Italian JIA patients (25 each at baseline, with
persistent activity, and with inactive disease) and compared with samples from 75 matched CTRLs.

Metagenomic and metabolomic multidimensional profiles were processed and reduced to low-level fused data
in order to develop a predictive model of disease based on the microbiota enterophenotypes. The model
was validated by different statistical computations, in line with the use of chemometrics in omics applications.
Diagnostic statistics based on NMC, AUROC and DQ2 in a double cross-validation procedure of PLS-DA were
used to select the optimal number of latent variables in CV1.
The diagnostic statistics assessed the overall quality of the PLS-DA model after the double cross-validation
procedure (CV2). The microbiota enterophenotypes appeared to be independent of disease stage. The integration of
metagenomic data with GC/MS- and NMR-based metabolomics data made it possible to identify a specific and stable
fecal microbiome profile of children with JIA. The PLS-DA models on low-level fused data were highly
predictive for JIA, with a correct classification rate above 82%. This integrated JIA gut microbiome
profile is independent of disease activity or inflammation, as well as of methotrexate therapy.
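
Two of the ingredients named above can be sketched in a few lines: low-level data fusion (scaling each data block and concatenating the variables sample by sample) and two of the diagnostic statistics, the number of misclassifications (NMC) and the area under the ROC curve (AUROC). The PLS-DA model and the double cross-validation loop are not reproduced here; the function names and the toy numbers are assumptions.

```python
import statistics

def autoscale_columns(block):
    """Mean-centre each column of a block and scale it to unit variance."""
    scaled_cols = []
    for col in zip(*block):
        mean = statistics.fmean(col)
        sd = statistics.stdev(col)
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]

def low_level_fusion(*blocks):
    """Concatenate the autoscaled blocks sample by sample (rows = samples)."""
    scaled = [autoscale_columns(b) for b in blocks]
    return [sum(rows, []) for rows in zip(*scaled)]

def nmc(predicted, labels):
    """Number of misclassified samples."""
    return sum(1 for p, l in zip(predicted, labels) if p != l)

def auroc(scores, labels):
    """AUROC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy blocks with three samples each (not project data).
metagenomic = [[1, 2], [3, 4], [5, 6]]
metabolomic = [[10], [20], [30]]
fused = low_level_fusion(metagenomic, metabolomic)
print(fused[0])
print(auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]), nmc([1, 0, 1], [1, 1, 1]))
```

Autoscaling before concatenation keeps a block with large numeric values from dominating the fused matrix, which matters when sequencing counts and metabolite intensities are combined.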

WP 8 Modelling and simulation for Cardiomyopathies

Segmentation of cardiac anatomy is a pre-requisite for the cardiac personalization workflow. During the
course of this project, a prototype with a consolidated MRI segmentation pipeline was developed in task
T8.1, with minimal requirement for parameter tuning and enhanced interactive editing capabilities. This
prototype has resulted in efficient processing of cases across all participant centres, with a high
throughput. In order to provide comprehensive support for haemodynamics modelling and fluid-structure
interaction, valve models were integrated with the MRI-based chamber models. For the purpose of
modelling valves from paediatric transthoracic echo (TTE) data, automation pipelines were adapted to deal
with the small anatomy of children. Given the limited data collection, we favoured a semi-automatic
approach to streamline the processing, while alleviating the challenging conditions in TTE imaging of
paediatric anatomy. Subsequent automated fusion with chamber models allows for interactive rigid
refinement, particularly to compensate for differences in imaging conditions between MRI and
echocardiography.

To be useful in clinical practice, computational models need to be precisely personalized
to the patient. Previously, model personalization was a highly inefficient, time-consuming and
manual task. One of the main achievements of task T8.2 was the development of a user-friendly, highly
automated whole-heart model personalization pipeline that requires only routinely acquired clinical data
as input. With this, we managed to take a huge step towards “industrialization” of computational
modelling. With the suite of segmentation and personalization apps developed within MD-Paedigree, it is
now possible with only 15-30 minutes of user interaction to create an electro-mechanical virtual heart of
any patient, including the time for semi-automatic heart segmentation from MRI.
The main achievement on the Inria side for biomechanical modelling is the development of a generic
optimization strategy for the very fast personalization of cardiac models. We developed a "multi-fidelity"
method that speeds up the personalization of a 3D model (patient-specific parameter estimation) by orders
of magnitude. We built a reduced version of our 3D model using simplifying assumptions in the spatial
domain (spherical symmetry). The resulting 0D model is extremely fast compared to the 3D model: 15
beats per second vs 15 minutes per heartbeat. Learning the mapping between the 0D and 3D model
parameters gives a very fast approximation of the outputs of 3D model simulations from 0D model
simulations with which we can estimate patient-specific parameters. We also added tailored prior
probabilities on the parameters in order to increase the parameter consistency. On the MD-Paedigree
database, these parameters improve the classification of healthy versus cardiomyopathy cases compared
to using clinical parameters only.
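
As a rough check on the speed-up quoted above, the two timings can be put on the same scale. This is only back-of-the-envelope arithmetic on the figures given in the text; the variable names are illustrative.

```python
import math

# Figures quoted in the text.
SECONDS_PER_BEAT_3D = 15 * 60   # 3D model: 15 minutes per simulated heartbeat
BEATS_PER_SECOND_0D = 15        # 0D model: 15 simulated heartbeats per second

# Ratio of the wall-clock time per simulated beat (3D over 0D).
speedup = SECONDS_PER_BEAT_3D * BEATS_PER_SECOND_0D
orders_of_magnitude = math.log10(speedup)
print(speedup, round(orders_of_magnitude, 1))
```

On these figures the 0D model is about four orders of magnitude faster per simulated beat, which is what makes it practical to run many 0D simulations inside an optimization loop.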

For task T8.3, two haemodynamics computation engines were developed and validated. They include a
tuneable and lumped-parameter closed-loop whole-body circulation (WBC) solver, as well as a full 3D
cardiac flow engine, controlled by a hybrid cardiac model based on both MRI and 3D-echo data.
Personalization of both engines is based on the full cardiac cycle kinematics of cardiac meshes that
integrate in a novel fashion multiple imaging technologies: MRI data for myocardial segmentation, and 3D
echo data for valve segmentation.
The WBC model has been developed together with a personalization framework which enables the
computation of patient-specific hemodynamic quantities of interest, like systemic circulation properties
(resistance, compliance), stroke work, arterial elastance, etc. The personalization is based on a set of
standard objectives (formulated based on measured systolic / diastolic pressure / volumes, valve timings,
etc.) and a set of advanced objectives (formulated based on volume / pressure derivatives and / or values
at certain time points). The parameters of the WBC model to be personalized were determined from a
local sensitivity analysis.
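
To illustrate what resistance and compliance mean in a lumped-parameter circulation model, here is a minimal sketch of a two-element Windkessel compartment integrated with forward Euler. It is a textbook simplification chosen for illustration, not the closed-loop WBC solver developed in the project; the function name, parameter values and units are assumptions.

```python
def simulate_windkessel(inflow, resistance, compliance, p0=0.0, dt=1e-3, duration=1.0):
    """Integrate C * dP/dt = Q(t) - P / R with forward Euler.

    inflow:      function t -> flow into the compartment (mL/s)
    resistance:  peripheral resistance R (mmHg*s/mL)
    compliance:  arterial compliance C (mL/mmHg)
    Returns the list of pressures (mmHg) at each time step.
    """
    steps = int(round(duration / dt))
    p = p0
    pressures = [p]
    for k in range(steps):
        t = k * dt
        dp_dt = (inflow(t) - p / resistance) / compliance
        p += dt * dp_dt
        pressures.append(p)
    return pressures

# With a constant inflow the pressure settles at R * Q (illustrative values only).
trace = simulate_windkessel(lambda t: 80.0, resistance=1.0, compliance=1.5, duration=30.0)
print(round(trace[-1], 3))
```

Personalization in this simplified setting would mean adjusting R and C until the simulated pressure matches the measured systolic and diastolic values, which is the role the objectives described above play for the full model.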
The model not only computes quantities of interest for the patient state for which measurements
are provided, but has also demonstrated predictive capability on a dataset of 12 patients.
Specifically, the model was personalized for the baseline state of each patient, and the heart rate
was then changed to that of the follow-up exam: the model was able to predict the ejection fraction at
follow-up with a correlation of 0.87 and a mean absolute difference of only 4.58%.
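
The two agreement measures quoted above, correlation and mean absolute difference between predicted and measured ejection fraction, can be computed as in the following sketch. The values in the example are invented for illustration and are not the project's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_absolute_difference(x, y):
    """Average of |x_i - y_i|, in the same units as the inputs."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Hypothetical ejection fractions in percent (not project data).
measured = [55.0, 62.0, 48.0, 35.0, 60.0]
predicted = [52.0, 65.0, 50.0, 40.0, 57.0]
print(round(pearson_r(measured, predicted), 3))
print(mean_absolute_difference(measured, predicted))
```

Reporting both measures is useful because a high correlation alone does not rule out a systematic offset between prediction and measurement.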
The full 3D cardiac haemodynamics computation engine uses boundary conditions from the valve-
endowed cardiac model and generates 3D+time velocity fields and relative pressure fields. A novel 7-
region analysis model has been developed for enhanced analysis of the global blood flow pattern.
Validation against measured velocities from PC-MRI data was performed for 6 patients. Qualitative patterns such as
aortic jet shape and intra-ventricular vortex were matched, while quantitative measures showed a good
correlation between the 7-region averaged velocities, with an average error of less than 7.9 cm/s at peak
systole and 6.7 cm/s at peak diastole.

For task T8.4, a novel haemodynamics engine was developed that combines the most important features
of the engines developed in T8.3: it is tuneable to specific patient characteristics, while also
allowing for variations in the systemic or intrinsic cardiac properties that can enable the clinician to test
hypotheses and develop personalized treatment plans.
The model integrates several components: the personalized electro-mechanical model developed in T8.2,
the lumped WBC and the full 3D haemodynamics models from T8.3, and a novel 3D+time geometric valve
model, which is personalized from the 3D echo data and responds to haemodynamic pressure gradients.
The model includes bi-directional coupling of the blood flow with the deformation of the cardiac walls.
The new algorithm produces haemodynamics results that are qualitatively similar to the computations
in T8.3, e.g. they show similar slingshot dynamics. However, a better match with the PC-MRI
region-averaged velocities was obtained, with an average error of less than 6 cm/s at both peak systole and
diastole. Another advantage of the new model is that the haemodynamics results provide absolute
3D+time pressure information, which can be used e.g. to train machine learning models or for clinical
procedure planning.

The main achievement in the statistical analysis of spatially distributed parameters (T8.5) is the
development of two methods to reduce the dimension of the cardiac motion estimated from sequences
of cine MR images: polyaffine models and barycentric subspaces projections. Many existing methods
extract features from motion fields without being able to reconstruct a motion from given features. For
instance, it is very difficult to estimate a motion that has a prescribed volume curve or strain parameters.
In contrast, polyaffine parameters allow parameterizing a time-varying diffeomorphism and they can be
used directly as parameters in registration algorithms used for motion tracking (e.g. polyaffine log-
demons). In this project, we have further pushed the interpretability of these parameters so that their
mean value can be used to describe the mean motion in an intelligible way. Likewise, barycentric
coordinates and their reference images allow the reconstruction of an image sequence approximating the
original cine MRI. A key property of our methods is that the statistical analysis of our motion parameters
on a population of subjects can now be used not only to discriminate between pathologies, but also to
reconstruct the mean (or typical) cardiac motion of each class and to interpolate between different cases
or classes.

WP 9 Modelling cardiovascular risk in the obese child and adolescent
T9.1: Heart model adaptation to the obese heart
Extraction of heart anatomy: (task shared with WP8)
Segmentation of the cardiac anatomy is a pre-requisite for many tasks in the cardiac personalization
workflow. During the course of this project, a prototype with a consolidated MRI segmentation pipeline
was developed, with enhanced interactive editing capabilities and minimal requirement for parameter
tuning. This prototype has resulted in efficient processing of cases across all participant centers, with a
high throughput.

Figure 1: New prototype of heart anatomy extraction

Personalization of whole-body circulation model:


A model-based approach was introduced for the non-invasive estimation of patient-specific left
ventricular pressure-volume (PV) loops. A lumped parameter circulation model was used, composed of the pulmonary venous
circulation, left atrium, left ventricle and the systemic circulation. A fully automated parameter estimation
framework was introduced for model personalization. We computed the PV loop for a cohort of patients
and compared the results against the invasively determined quantities: there was a close agreement
between the time-varying LV and aortic pressures, time-varying LV volumes, and PV loops. The
personalized circulation models are used to provide input features for characterizing cardiac function of
obese children.
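
One example of a feature that can be derived from a PV loop is stroke work, the area enclosed by the loop. The sketch below computes it with the shoelace formula; the choice of stroke work as the example feature, the function names and the idealized rectangular loop are assumptions made for illustration.

```python
MMHG_ML_TO_JOULE = 133.322e-6   # 1 mmHg = 133.322 Pa, 1 mL = 1e-6 m^3

def stroke_work_mmhg_ml(volumes_ml, pressures_mmhg):
    """Area enclosed by a closed PV loop (shoelace formula), in mmHg*mL.

    The samples must be ordered around the loop; the last point is joined
    back to the first one.
    """
    n = len(volumes_ml)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n
        twice_area += volumes_ml[i] * pressures_mmhg[j] - volumes_ml[j] * pressures_mmhg[i]
    return abs(twice_area) / 2.0

# Idealized rectangular loop: 70 mL stroke volume, 90 mmHg pressure swing.
volumes = [50.0, 120.0, 120.0, 50.0]
pressures = [10.0, 10.0, 100.0, 100.0]
work = stroke_work_mmhg_ml(volumes, pressures)
print(work, round(work * MMHG_ML_TO_JOULE, 3))
```

For a measured or simulated loop the same function applies unchanged, provided the samples cover exactly one cardiac cycle.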

T9.2: Automated assessment of body fat distribution from MRI and ultrasound data
The basis for quantification of different fat distributions is the acquisition of fat- and water-separated MR
images. Within MD-Paedigree, we used two different protocols: T2* IDEAL and DIXON. Using two
different protocols certainly meant more work from a technical point of view, but it also put us in a
good position to develop algorithms that are both more robust and more generic.
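
Given fat- and water-separated images, a common per-voxel quantity is the signal fat fraction F / (F + W). The sketch below shows this calculation on plain lists; it is an illustration of the general principle, not the project's quantification pipeline, and the function names and intensities are assumptions.

```python
def fat_fraction(fat_signal, water_signal):
    """Signal fat fraction F / (F + W) for one voxel; 0.0 where there is no signal."""
    total = fat_signal + water_signal
    return fat_signal / total if total > 0 else 0.0

def mean_fat_fraction(fat_image, water_image, mask):
    """Average fat fraction over the voxels selected by a binary mask (flat lists)."""
    values = [fat_fraction(f, w) for f, w, m in zip(fat_image, water_image, mask) if m]
    return sum(values) / len(values) if values else 0.0

# Hypothetical intensities for four voxels, the last one outside the organ mask.
fat = [30.0, 10.0, 20.0, 90.0]
water = [70.0, 90.0, 80.0, 10.0]
organ_mask = [1, 1, 1, 0]
print(mean_fat_fraction(fat, water, organ_mask))
```

Because the fat fraction is a ratio of the two images, it is less sensitive to overall intensity scaling than the fat image alone, which helps when data from different protocols are combined.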

During the project, two processing pipelines were developed: one for organ segmentation, and one for
subcutaneous adipose tissue (SAT) and visceral fat (VAT) segmentation. Organ segmentation is performed using
a model-based segmentation approach. The processing steps are depicted in Figure 2.

Figure 2: Processing pipeline for model-based organ segmentation

The proposed method consists of two main steps. First, the image is preprocessed to compensate
for image inhomogeneities, to normalize the intensity range and prepare the image for the segmentation
step by removing noise and structures inside the liver. Second, an initial model of the liver is placed into
the image and deformed according to an Appearance Model. The deformation is further constrained by
minimizing a cost function according to a shape model. The algorithm expects the preprocessed MR
dataset, the liver shape model and Appearance Model as input and returns a binary mask of the segmented
liver.
SAT consists of the fat deposits under the skin and is clearly visible in the fat images provided by our
clinical partners. SAT and VAT segmentation is performed using a model-based segmentation approach.
The processing steps are depicted in Figure 3.

Figure 3: Processing pipeline for SAT and VAT segmentation

The proposed method consists of several steps. First, the image is pre-processed to remove image
inhomogeneities, which are almost always present in the image. This greatly improves the
efficiency of the following steps. Second, a body mask is created using a region-based segmentation
method. Third, the body mask is further refined by removing the arms using a connected component
analysis. This is necessary to obtain comparable images, because the arms are not always completely
imaged and/or are distorted. Fourth, SAT and VAT are segmented and separated from each other. Simple
thresholding would include VAT that is connected to SAT at several points. Our approach uses a ray-based
method that scans the preprocessed MR image from different directions to detect the inside of the body
in order to separate VAT and SAT.
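
The idea of scanning along rays to separate SAT from VAT can be sketched on a small binary fat mask. This is a deliberately simplified interpretation: rays run only along rows and columns, and each ray labels the first run of fat it meets as SAT and stops at the first non-fat voxel behind it. The actual method is not specified in this report beyond the description above, so the function name, the labelling scheme and the toy image are assumptions.

```python
def separate_sat_vat(fat):
    """Label a 2D binary fat mask: 0 = not fat, 1 = SAT, 2 = VAT.

    Each ray walks from the image border inwards. The first contiguous run
    of fat voxels it meets is taken to be subcutaneous fat; the ray stops at
    the first non-fat voxel after that run (the tissue below the fat layer).
    Fat voxels never reached by any ray are labelled as visceral fat.
    """
    rows, cols = len(fat), len(fat[0])
    sat = [[False] * cols for _ in range(rows)]

    def scan(cells):
        seen_fat = False
        for r, c in cells:
            if fat[r][c]:
                seen_fat = True
                sat[r][c] = True
            elif seen_fat:
                break

    for r in range(rows):
        line = [(r, c) for c in range(cols)]
        scan(line)          # left to right
        scan(line[::-1])    # right to left
    for c in range(cols):
        line = [(r, c) for r in range(rows)]
        scan(line)          # top to bottom
        scan(line[::-1])    # bottom to top

    return [[(1 if sat[r][c] else 2) if fat[r][c] else 0 for c in range(cols)]
            for r in range(rows)]

# Toy slice: a ring of subcutaneous fat, non-fat tissue, and one inner fat voxel.
toy_fat = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
labels = separate_sat_vat(toy_fat)
for row in labels:
    print(row)
```

A real implementation would use more ray directions and would have to handle places where VAT touches SAT directly, which this axis-aligned sketch does not attempt.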

Bias correction is based on an established method [1]. However, it has 7 parameters that need to be
optimized before the results are satisfactory.

Figure 4: Evaluation of different parameters for bias correction

Figure 4 shows the setup for the evaluation of different parameters. First, we use an exhaustive search
approach on a single slice to determine the Top-5 parameter sets, comparing segmentation results from
bias-corrected and non-corrected images against a manual ground-truth segmentation. Second, these
parameter sets are then used to correct and compare all remaining slices.
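
The exhaustive search for the Top-5 parameter sets can be sketched as a grid search scored against the ground truth. The Dice overlap is used here as the comparison score, which is an assumption, since the report does not name the metric. The toy "bias correction" (subtracting a linear drift) merely stands in for the real method [1], and all names and values are hypothetical.

```python
import itertools

def dice(a, b):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    return 2.0 * inter / size if size else 1.0

def top_k_parameter_sets(grid, segment, reference, k=5):
    """Try every combination in the grid and keep the k best-scoring sets."""
    names = sorted(grid)
    scored = []
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        scored.append((dice(segment(params), reference), params))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

# Toy 1D "slice" with an intensity drift, and its manual ground truth.
signal = [10, 13, 32, 36, 40, 19]
reference = [0, 0, 1, 1, 1, 0]

def toy_segment(params):
    """Stand-in for bias correction followed by threshold segmentation."""
    corrected = [v - params["slope"] * i for i, v in enumerate(signal)]
    return [1 if v > params["threshold"] else 0 for v in corrected]

grid = {"slope": [0, 1, 2], "threshold": [15, 20, 25]}
best = top_k_parameter_sets(grid, toy_segment, reference, k=5)
for score, params in best:
    print(round(score, 3), params)
```

With 7 parameters the number of combinations grows quickly, which is one reason for running the exhaustive search on a single slice and reusing only the best sets on the remaining slices.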

Figure 5: Results of Top-5 parameter sets on one dataset

As can be seen in Figure 5, the Top-5 parameter sets show a big improvement compared to segmentations
obtained from the original dataset (red). It can be concluded that bias correction greatly improved
segmentation quality.
The whole automated detection and quantification process for liver fat, SAT and VAT takes around 5
minutes. Additionally, we spent roughly 30 to 90 minutes per dataset for manual refinements. In future
work, we plan to investigate the trade-off between the effort spent for manual refinement and fat
quantification results.
[1] Sled et al. “A Nonparametric Method for Automatic Correction of Intensity Nonuniformity in MRI Data”. 1998

T9.3: Multi-scale data integration and virtual phenotype generation


Within the scope of the project, Siemens Healthcare achieved a complete refoundation of its case-based
reasoning technology, now rebranded as DeepReasoner. The prototype itself was improved, with both its
front-end and its back-end completely re-designed and re-implemented, and so was the machine
learning technology at the core of the case-based reasoning engine. Our former web
prototype, implemented in Java and served by a Jetty server, supported similarity searches that use classical
distance functions, such as Euclidean distance, computed over the selected clinical variables. While
classical distances are well studied and therefore easy to interpret, they do not cope well with high
dimensionality, the presence of uninformative features, or large differences in dimensionality and
scale between different sources of information. All of these characteristics are found in the heterogeneous and
multi-modal data acquired in this project. Moreover, classical distances are by design not
aware of the context and thus cannot weight the importance of the parameters according to the current
use case. Within the MD-Paedigree project, we proposed a significant enhancement of the underlying
knowledge-discovery engine that has the potential to overcome these limitations of similarity search
based solely on classical distance functions. This new development uses deep learning methods to derive
compact task-specific patient representations that can then be analyzed in a number of ways, including
through similarity estimations. The front-end and the back-end were redesigned
using modern web technologies such as HTML5, JavaScript and server-side scripting with Node.js. The retrieval
engine itself is implemented as an Azure Machine Learning web service using the Python scripting language
and the Keras deep learning library. On the front-end side, results analysis and visualization, such as
computing prediction histograms and graph clustering, are implemented in JavaScript.
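
The scale problem of classical distances can be made concrete with a small retrieval sketch: a nearest-neighbour search with Euclidean distance returns a different "most similar patient" depending on whether the features are standardized first. This only illustrates the limitation described above; it is not the DeepReasoner engine, and the feature values and function names are assumptions.

```python
import math
import statistics

def zscore_columns(rows):
    """Standardize every feature (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, sds)]
            for row in rows]

def nearest_patients(query_index, rows, k=3):
    """Indices of the k rows closest to the query row in Euclidean distance."""
    query = rows[query_index]
    dists = [(math.dist(query, row), i) for i, row in enumerate(rows) if i != query_index]
    dists.sort()
    return [i for _, i in dists[:k]]

# Two hypothetical features on very different scales (not project data).
patients = [
    [1.00, 1000.0],   # query patient
    [1.05, 2000.0],
    [2.00, 1100.0],
    [1.90, 2900.0],
]
print(nearest_patients(0, patients, k=1))                  # raw features
print(nearest_patients(0, zscore_columns(patients), k=1))  # standardized features
```

On the raw data the second feature dominates the distance simply because its numbers are larger. Standardization removes that effect, but it still treats every feature as equally relevant, which is the gap that learned, task-specific representations aim to close.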

T9.4: Cardiovascular risk stratification and predictive disease and therapy modelling
The clinical and technical teams from WP4 and WP9 proposed a novel screening approach with the
overarching goal of identifying young people with early physiological derangements that signify high
cardiovascular risk and that cannot be reliably detected by traditional approaches. As depicted in Figure
6, this approach relies on a sequential strategy in which patients at risk are identified through the acquisition
of increasingly complex phenotypic measures drawn from a range of different sources, such as
questionnaires, stool and blood samples, clinical assessments and advanced medical imaging. At each
stage, predictive models based on deep learning facilitate a risk assessment that determines whether to
recall a given patient for the next stage of more advanced examination. The most advanced, but also most
resource-intensive, stage depends on complex imaging techniques, where the approaches described above
make it possible to quantify body fat distribution and comprehensively measure cardiovascular function. This
ultimately allows the diagnosis of early cardiometabolic disorders, such as vascular stiffness, ventricular
hypertrophy, ectopic fat deposition and insulin resistance. These intermediate outcomes are known to be
associated with frank pathology in later life, but may not be detected by simpler risk models.

To combine complex multimodal data sources with varying scale and distribution and to build predictive
models on them, we proposed a multi-task deep learning approach: it performs
inference for a given intermediate outcome associated with elevated risk, and it also encodes the original
multi-modal patient data into a compact signature that has better properties in terms of scale, distribution
and sparseness than the raw dataset. While such a model can be used for inference only, it can
also be plugged into a similarity search engine, thereby enabling the retrieval of similar patients, which provides
important evidence for understanding the system’s reasoning, assessing prediction confidence and, finally,
supporting clinical decisions.

A pilot multi-centric study was conducted to collect a cross-sectional dataset of approximately 160 children
(including more than 100 obese children), consisting of questionnaire, anthropometric, genetic, clinical and
imaging data. Using their MRI data, we extracted parameters characterizing different aspects of their
cardiac function as well as their liver, subcutaneous and visceral fat distribution. Extensive
cross-validation experiments were conducted to evaluate the different hypotheses linking the parameter
domains at different levels of screening for identifying patients at risk. Different predictive models,
such as logistic regression, dense neural networks and multi-task neural networks, were compared
on classification metrics such as sensitivity, specificity, positive predictive value and F-score. Furthermore,
to assess the significance (p-value) of our results, each predictive model was compared with a simple
model based on BMI z-score only, using Welch’s t-test. From these experiments, we could identify a few
promising results for assessing the risk with respect to a particular intermediate outcome such as liver fat ratio
or aortic compliance.
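
The classification metrics and the Welch statistic used in this comparison can be written down compactly. The sketch below computes sensitivity, specificity, positive predictive value and F-score from a confusion matrix, and Welch's t statistic with its Welch-Satterthwaite degrees of freedom from two sets of cross-validation scores. Converting t to a p-value needs the t distribution, which is left to a statistics package; the numbers and function names are illustrative assumptions.

```python
import math
import statistics

def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and F-score (F1) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "f_score": f_score}

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a) / na, statistics.variance(b) / nb
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical per-fold F-scores of a candidate model and a BMI-only baseline.
candidate = [0.78, 0.82, 0.80, 0.85, 0.79]
baseline = [0.70, 0.74, 0.69, 0.73, 0.72]
print(classification_metrics(tp=8, fp=2, tn=18, fn=2))
print(welch_t(candidate, baseline))
```

Welch's test does not assume equal variances in the two groups, which suits comparisons between models whose fold-to-fold variability may differ.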

Figure 6: Screening for early risk assessment

WP 10 Modelling and simulation for JIA

The main S&T outcome of this work package (WP10) is the workflow that was developed for generating
patient-specific musculoskeletal models of children affected by juvenile idiopathic arthritis (JIA) starting
from magnetic resonance images, and for using these models to run biomechanical simulations based on conventional
gait analysis data. The results of these simulations, together with additional anatomical, clinical and gait
analysis biomarkers, were then used in statistical analyses aiming to identify variables able to quantify and
predict the disease outcome. Specific results and foregrounds have been achieved at each stage of this
workflow and can be summarised as follows:
• Development of a magnetic resonance imaging (MRI) protocol to acquire images of the foot and
ankle joint, and of the full lower limb. The protocol consisted of multiple MRI sequences required for
clinical assessment, bone segmentation and musculoskeletal modelling purposes. The total scan
duration was minimized in order to be used with a paediatric population, i.e. semi-collaborative
patients. The protocol included positioning some MRI-visible markers on well-defined bony landmarks
of the patients to allow registration of MRI and clinical gait analysis data.
• Development of a gait analysis protocol compatible with the protocols used in routine clinical practice in
the two laboratories (based at IGG and OPBG) involved in the project, ensuring the
best possible data quality while still producing the required traditional clinical outcomes and allowing
for a better description and characterisation of the foot movements. The protocol was also suitable
for musculoskeletal modelling applications.
• Development of a semi-automated segmentation procedure to process the images
collected during the project and generate 3D anatomical models of the patients’ bones. Since the
procedure is based on statistical shape modelling, training data were initially obtained by manual
processing and, once the system reached an adequate level of training, the required operations became
almost fully automated. The techniques developed here were robust enough to deal with partially
corrupted or incomplete images, reduced field-of-view and intensity inhomogeneity, and to produce the
required 3D geometries of the lower limb bones. The potential of the Sheffield Image Registration
Toolkit, a previously developed registration-based approach, as a tool to streamline and increase the
reliability of the following features within the creation of the personalised musculo-skeletal models
was also investigated.
• Development of a procedure to generate full lower limb models: a procedure to systematically and
reliably process the anatomical model (bone geometries) and create patient-specific musculoskeletal
models was developed. Establishing and consolidating this procedure (Figure 7) involved the
definition of ad hoc procedures for:
o Estimating patient-specific joint parameters from bone geometries
o Estimating patient-specific inertial parameters for body segments of the lower limb
o Determining muscle attachment locations on the bone geometries and defining muscle paths
consistent with the MRI images
o Determining the position of the skin surface markers with respect to the underlying bones
o Assessing the inter- and intra-operator repeatability of the developed procedure
• Development of a procedure to generate foot and ankle musculoskeletal models: the procedure was
conceptually similar to the one developed for the full lower limb. Due to the longitudinal design of the
study, which included full lower limb MRI scans at the month 6 visit, and foot and ankle regional MRI at
the baseline and month 12 visits, a method to fuse the ankle and foot models of each patient with the
corresponding full lower limb models was also implemented. In addition, starting from 3D bone
segmentations, an innovative morphological fitting technique was implemented, which made it possible to
identify the functional axes of the tibiotalar and subtalar joints and to discriminate their individual contributions
to the kinematics of the ankle-foot complex.
• The entire experimental database was processed to create patient-specific anatomical and
musculoskeletal models: the final database included 23 patients and 124 datasets (MRI and gait data),
corresponding to 62 patient controls. All of the experimental data (100% of datasets)
collected by the clinical partners were used to produce geometrical models; 92% of those models were
suitable and used to produce biomechanical models; 100% of the musculoskeletal models were then
used in biomechanical simulations of walking.
• Development of a multi-dimensional model of the disease to estimate ankle pressure: a contact
model was implemented based on Hertzian contact theory and used to estimate the maximum
cartilage pressure occurring at the ankle joint as a result of the internal forces estimated through the
biomechanical simulations.

Figure 7: Example images representing (A) a full lower limb MRI acquisition, (B) the gait analysis
protocol with highlighted MRI-visible markers, (C) the anatomical model generated through
semi-automated processing and (D) the final lower limb musculoskeletal model

• Development of a template for reporting biomechanical results to clinicians: the most significant
results obtained from the biomechanical models were collected and organized in clinical reports
allowing interaction and discussion between the clinician and the technical partners.
• Creation and statistical analysis of a large dataset of candidate biomarkers from children affected
by juvenile idiopathic arthritis: from medical image examination, clinical examination, clinical gait
analysis, biomechanical simulations and the multi-dimensional model of the disease we defined and
extracted, for each patient control, a set of 265 variables considered as potential candidates for
characterising the disease and/or predicting disease outcome. We tested these variables to
uncover features that discriminate the activity and laterality of the disease (unpaired t-test),
followed by a correlation analysis to identify variables able to discriminate response to treatment and
to act as potential predictors of disease outcome.
• Conclusions: Current results show that we were able to produce excellent subject-specific models,
that reproducibly provided information that would be otherwise impossible to detect or measure.
Several candidate biomechanical determinants appeared to have some correlation with the response
patterns observed in the small cohort analysed. The statistical analyses also lead to the identification
of complex ankle protection mechanisms, however, no pathology-specific biomechanical alteration
was found, due to the high subjectivity of children response to the pathology. We concluded that:
o An objective quantification of the pathology progression, based on both imaging and
functional data, should always be pursued;
o No joint should be assessed in isolation to drive the treatment;
o The hypothesis that seems more likely from the results of this project is that JIA is initiated for
causes other than an anatomo-functional determinant; however, after the first onset, the
heterogeneity in response to mild treatment may be correlated to how much the child
protects the affected ankle when the disease is active. This hypothesis could be tested
with a simpler protocol in a larger cohort.
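
The unpaired t-test screening of candidate variables mentioned above can be illustrated with a minimal sketch. It assumes Welch's unequal-variance form of the test (the report does not state which variant was used), and the function name and the sample values for a single variable are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two groups."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors of the means
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical values of one candidate variable in two groups (not project data).
active = [12.1, 13.4, 11.8, 14.0, 12.9]
inactive = [10.2, 10.9, 11.1, 9.8, 10.5]

t, df = welch_t(active, inactive)
print(f"t = {t:.2f}, df = {df:.1f}")
```

In a screening over many variables, the same function would be applied to each variable in turn, and the resulting statistics compared against the t distribution with the returned degrees of freedom.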

WP 11 Modelling and simulation for NND

Objectives

(SAG):
To develop an automated method for the extraction of subject-specific muscles, bones and skin of the
pelvis and legs, first from MRI images of healthy children aged 8 to 15 years, and later from
MRI images of paediatric patients suffering from CP, CMT or DMD.

(USFD):
To develop an automated method to obtain a patient-specific complete anatomical model of the lower
limbs from the MRI-based geometries. This requires the extrapolation of muscle, ligament and tendon
attachment points and lines of action, and joint centres and axes, which are required for the functional
gait simulations but are not visible in MRI.

TUD
To develop an OpenSim musculoskeletal model based on the anatomical data from USFD. This requires
converting data to specific OpenSim parameters, estimating required muscle path via points, and
estimating maximum isometric muscle force based on muscle volume.
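
The last step, estimating maximum isometric force from muscle volume, can be sketched as follows. This is only an illustration of one common approach, in which force is taken as specific tension times physiological cross-sectional area (volume divided by optimal fibre length). The report does not state the formula TUD used; the function name and the default specific tension of 60 N/cm² are assumptions.

```python
def max_isometric_force(volume_cm3, optimal_fibre_length_cm,
                        specific_tension_n_per_cm2=60.0):
    """Estimate maximum isometric force (N) from muscle volume.

    Assumed relation: F_max = specific_tension * PCSA, with
    PCSA = volume / optimal fibre length.
    """
    if volume_cm3 <= 0 or optimal_fibre_length_cm <= 0:
        raise ValueError("volume and fibre length must be positive")
    pcsa_cm2 = volume_cm3 / optimal_fibre_length_cm
    return specific_tension_n_per_cm2 * pcsa_cm2

# Hypothetical muscle: 100 cm^3 volume, 10 cm optimal fibre length.
force_n = max_isometric_force(100.0, 10.0)
print(force_n)
```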

Motek
To develop and personalize an HBM model, either based on MRI data sets or scaled to patient-specific
characteristics using functional calibration. To develop commercial products which can be used in
routine clinical services for gait analysis. To develop applications using mechanical and visual
perturbation to assess impaired gait patterns.

Outcomes

(SAG):
Through this project, Siemens Healthcare (SHC) has developed a novel method to extract anatomical
structures from MRI images of both healthy and ill children’s legs. The method was adapted to extract
subject-specific muscles, bones and skin of the pelvis and legs, from MRI images of pediatric patients
from 3 neuromuscular disease groups, namely CMT, DMD and CP.

Extensive tests were performed to assess the viability of the original atlas used with healthy children’s
images in pathological cases.
The tests revealed that an atlas used for these purposes must be flexible enough to accommodate
disease-specific deviations, such as femoral version.
The method was adapted to each of the 3 disease groups (DMD, CMT and CP): we created
disease-specific atlases with detailed corresponding structures and adapted the structure extraction methods
according to each disease group’s specificities.

A total of 57 MRI images acquired for 15 healthy subjects and 42 patients have been processed. For each
subject, three-dimensional (3D) meshes for 54 individual muscles, 12 bones, the forefoot, midfoot and hindfoot,
as well as the whole skin were extracted.

(USFD):
During this project, the University of Sheffield (TUoS) has developed a new method which completes the
anatomical model with the muscle, ligament and tendon attachment points and lines of action, and joint
centres and axes, required for the functional simulations but not visible in MRI.

For this purpose, USFD generated a complete anatomical template by integrating geometries from the
reference MRI segmentation and from the publicly available TLEM2.0 model. A mesh morphing
technique developed for the European project MySpine was adapted to propagate the additional
structures from the template to each of the patient MRI-extracted geometries, resulting in
patient-specific complete anatomical models.

The complete anatomical template was reviewed for possible anatomical errors, and manually corrected
by USFD with the clinical advice of VUMC. After two iterations of this correction and feedback process,
the final template was accepted as anatomically correct.

USFD has processed the MRI-based geometries of one healthy subject and the 42 patients, providing the
corresponding patient-specific complete anatomical models, including 73 lower limb muscles and bones,
185 ligaments and muscle-tendon lines of action, and 9 joint centres and axes.

The accuracy of the patient-specific complete anatomical models has been evaluated by the
surface-to-surface distance with respect to the MRI-segmented geometries. The results were analysed for each
muscle and bone and for each disease group (CP, CMT and DMD). No important differences were
observed. In all cases the errors are smaller than 1 mm, which is negligible in comparison with the
segmentation errors.
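
As an illustration of the kind of metric described above, the sketch below computes a symmetric mean nearest-vertex distance between two meshes. It is a simplification under stated assumptions: it uses mesh vertices only (a true surface-to-surface distance would measure point-to-triangle distances), it is a brute-force search, and the function name and toy coordinates are hypothetical rather than taken from the project.

```python
import math

def mean_surface_distance(verts_a, verts_b):
    """Symmetric mean nearest-vertex distance between two vertex sets (same units as input)."""
    def one_way(src, dst):
        # For every source vertex, distance to its closest destination vertex.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (one_way(verts_a, verts_b) + one_way(verts_b, verts_a))

# Toy example in millimetres: a unit square and a copy shifted by 0.5 mm along z.
segmented = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
morphed = [(x, y, z + 0.5) for (x, y, z) in segmented]

distance_mm = mean_surface_distance(segmented, morphed)
print(distance_mm)
```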

TUD
As part of this project, TUD has produced a tool that can automatically generate an OpenSim
musculoskeletal model based on anatomical data. For this purpose, TUD has developed a unified file
format to specify subject-specific anatomical data: the MPF format. For the tool, TUD has developed
different methods to convert via points and muscle properties from the TLEM 2 model to the Delp
model.

In addition, TUD has performed several sensitivity studies to quantify the effect of changing specific
musculoskeletal parameters. The parameters include muscle attachment, muscle strength, tendon slack
length and optimal fibre length. TUD also developed a method to incorporate tibial torsion and
femoral anteversion based on measures taken from a physical exam. The outcome measures that have
been compared include muscle moment arm and muscle activation during inverse dynamics simulation.

Motek
During MD-Paedigree, Motek has developed a pipeline to construct personalized musculoskeletal
models, which can be used in medical devices. The outcome of the model can be analysed using the
off-line analysis tool for clinical gait data developed specifically for this project. The tool incorporates
various visualization options as well as a preliminary clinical reasoning interface, in which different
combinations of outcome parameters can be analysed to evaluate a specific condition.

Secondly, Motek has focussed on perturbations of gait in order to obtain a more sensitive measure of
the level of impairment of a patient. This was done in two ways: 1. Mechanical perturbations: gait was
perturbed by applying rapid disturbances of the walking surface, i.e. anterior-posterior belt
perturbations and medio-lateral perturbations of the walking surface. Because these perturbations are
applied at high speed, they can be used to study dynamically the effect on increased reflex activity and
thus spasms. 2. Visual perturbations: by using the outcomes of the personalized HBM model to drive
interactive and dynamic Virtual Reality environments and avatars, the selective control of a patient can
be studied. In addition, such applications can be used for training interventions.

WP12 - Models validation, outcome analysis and clinical workflows
The aim of Work Package 12 was (i) to validate the computational models, to ensure that they can be
personalised by adapting the parameters to the integrated data of a specific patient and to improve the
current knowledge and understanding of the disease by simulating different aspects of the evolution of a
disease; (ii) to verify the accuracy of the insights into the effect of a specific therapeutic intervention,
whether pharmacological, behavioural or surgical.

In particular WP12 started at month 13 of the project and its main objectives were the following:

1. To clinically validate derived models
2. To improve prediction of outcome and risk stratification
3. To establish integrated clinical workflows and personalised treatment models

In order to provide evidence that the processes used were capable of consistently producing finished
products of the required quality, the validation process followed a well-defined logical and chronological
protocol. The clinical assessment and validation were based on a comprehensive analysis of
available clinical data for the four disease areas, clinical performance data, and safety data. With four
different disease areas, there was significant heterogeneity in the construction and aim of the different
models, and validation followed a multi-layer approach:

1. The first level of validation will be the initial testing and debugging, which will be performed in close
collaboration with the technical partners.
2. The second level will be the internal validation (both prospective and retrospective) that will ensure
that the different mechanistic models reproduce results of the clinical studies used to build the
single model for the specific disease area.
3. The third level of validation will be the external validation which will define the ability of the model
to accurately predict the results of studies acquired through the digital repository and thus to derive
a statistical model from data that were not used to build the model.

Initially, intensive interaction and discussion between clinicians and technical partners were necessary, and
this led to a new and better understanding of the validation process.
During the second year of the project, model testing and debugging were performed in all disease areas, as
well as an extensive genetic analysis, while gut microbiota was investigated in obese and JIA children. At
that point, even though the initial results were satisfactory, the small number of cases tested with a retrograde
internal prediction prevented a complete statistical analysis to ascertain the level of accuracy of the models.
In particular, at the end of year 2 the following results were achieved in the four different diseases:

- CMD: the model personalization was obtained in two patients and a retrograde internal prediction
has been performed on them with good results.
- Obesity: the clinical and technical user requirements were revised; a mechanistic model of the left
ventricle has been created and tested; personalization of the model began; organ fat fraction
estimation through adaptive fitting of statistical organ shape models was successful and a method
for estimation of hepatic fat fraction was developed by Fraunhofer and UCL in a retrospective
dataset to validate the approach.
- JIA and NND: an ankle model that incorporates a full set of variables was created, tested, debugged
and it was considered effective at a preliminary examination.
Personalized leg structure extraction was obtained in 14 healthy subjects with good results.
The musculoskeletal biomechanical Human Body Model for Clinical Gait Analysis was revised
according to the clinical specifications. New regression and functional calibration methods to
calculate joint rotation centres and axes were implemented, tested and debugged.

During the third year the work focused on: (i) the final definition of the validation protocol for all disease
areas and (ii) the internal validation on patient-specific models. A successful internal validation of both
disease definition and prediction was performed on a significant number of patients included in the study
sample. In particular, at the end of year 3 the following results were achieved in the four different diseases:

- CMD: the model was validated on both baseline and follow-up data. The fully automatic
electro-mechanical simulation and personalization pipeline was finalized and tested in 35 patients, including
patients from all clinical centres.
- Obesity: A fully automatic active-shape-based liver method was developed by Fraunhofer and
evaluated on all available prospective UCL datasets. A prototype called CaseReasoner was
developed, comprising a learned representation for patient data, an associated similarity measure to compare
patient representations and a database of reference patients associated with relevant
information for risk assessment.
- JIA: processed data were available for a single patient in the JIA group at all three time
points. The biomechanical models were produced, gait simulations generated and contact forces
acting at the ankle joint estimated at all time steps.
- NND: the adapted method for extracting subject-specific muscles, bones
and skin of the pelvis and legs was validated on MRI images of paediatric patients from the 3
neuromuscular disease groups, namely CMT, DMD and CP.

During the fourth year of the project the internal and external validation process was carried out for the
four diseases. Moreover, integrated clinical workflows and personalised treatment models were defined.

- CMD: the personalization algorithms of all models were completed, including the lumped parameter
whole body circulation model, the fully automatic electro-mechanical simulation and the
hemodynamic model. Clinical validation was performed in order to define the additional predictive
value of the models for clinical outcome and therapy optimization. Analysis was carried out on the whole
multicentre database with definite primary outcomes including heart transplant and/or cardiac
death. The secondary outcome was defined as a hospitalization event and/or worsening of cardiac
functional class (by either Ross or NYHA classification according to age, as suggested by current
guidelines).
- Obesity: Three different studies, all based on multi-modal data collected at OPBG and UCL, were
conducted in order to assess whether a novel screening approach could be developed. The approach
relies on a specific sequential strategy: patients at risk are identified through the acquisition of
increasingly complex phenotypic measures drawn from very different sources, such as
questionnaires, stool and blood samples, clinical assessments and advanced medical imaging. To
assess
- JIA: The work focused on the performance of the model on test data. To this end, the complete data
set (N = 152 patients eligible for prediction, contributing m = 508 visits) was split at random into 2/3
(N = 101 patients) for model training and 1/3 for model validation.
- NND: extensive tests were performed to assess the viability of the original atlas used with healthy
children’s images in pathological cases. The tests revealed that an atlas used for these purposes
must be flexible enough to accommodate disease-specific deviations, such as femoral version.
The method that SHC presented for leg structure segmentation in healthy children's images was
finely adapted to each of the 3 disease groups (DMD, CMT and CP). SHC created disease-specific
atlases with detailed corresponding structures and adapted the structure extraction methods
according to each disease group’s specificities.
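
The random 2/3 to 1/3 split reported for the JIA data can be sketched as below. The sketch splits at patient level, so that all visits of one patient fall on the same side of the split. The function name, the seed, the rounding rule (truncation, which yields 101 of 152 patients) and the placeholder patient identifiers are assumptions made for illustration.

```python
import random

def split_patients(patient_ids, train_fraction=2 / 3, seed=0):
    """Randomly split unique patient IDs into (training set, validation set)."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)           # reproducible shuffle
    n_train = int(train_fraction * len(ids))   # truncation: 152 patients -> 101
    return set(ids[:n_train]), set(ids[n_train:])

# Placeholder cohort: 152 patient IDs, with visits stored as (patient_id, visit_no).
patients = list(range(152))
visits = [(pid, visit_no) for pid in patients for visit_no in (1, 2, 3)]

train_ids, valid_ids = split_patients(patients)
train_visits = [v for v in visits if v[0] in train_ids]
valid_visits = [v for v in visits if v[0] in valid_ids]
print(len(train_ids), len(valid_ids))
```

Splitting by patient rather than by visit avoids having visits of the same child in both the training and the validation data.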

WP13 Requirements and compliance for Mdpd infostructure

WP13 dealt with a requirements analysis for the technical aspects of the MD-Paedigree project and with
compliance with international rules and regulations. Particular attention in terms of compliance was given
to the VPH (Virtual Physiological Human) framework, as it can add value to data produced in the project and make
them accessible, and to OpenAIRE, as it is an important European Union initiative assuring open access to research
results for European stakeholders, similar to international initiatives such as in the US, where the NIH (National
Institutes of Health) requires all publications to be open access and also data from funded research
projects to be released for research at the end of the project.
In terms of compliance with VPH-Share, it was ensured that all metadata are available in the
right format and that the basic rules of VPH are respected. Availability of data for other research groups
is limited, as MD-Paedigree deals with personal medical data, which are particularly sensitive because children are
concerned. The exact sharing of the data thus depends mainly on constraints set by the ethics committees of
the participating institutions.
For OpenAIRE another set of rules applies, and WP13 defined several points for the project to
implement, in this case also concerning publications, which should be openly accessible with a limited delay.
All required aspects for being compliant with OpenAIRE as much as possible were analysed and feedback
was given to the project administration.
Partners strongly involved in VPH and in OpenAIRE performed these analyses.
The largest part of the WP was dedicated to a detailed technical requirements analysis among all stakeholders
in the project. Several surveys and face-to-face sessions were organized to gather information on all
technical requirements of the stakeholders. These requirements were then prioritized, based on the
feedback and working with the disease areas, to make sure work started with the highest priority items. Regular
meetings were held with the infostructure group to analyse the requirements and their impact on
the software being developed. As the initial requirements analysis and priorities were taken into account
from the project start, only a few adaptations had to be made in the course of the project. The requirements
documents were then updated to correspond to the requirements and priorities at the end of
year 3 of the project. The WP finished in month 32.
It became clear in the course of the project that, in particular, the coordination between technical and
clinical requirements was essential and thus several meetings were held where mutual outcomes were
discussed and then implemented. The objective was to assure that the most relevant technical aspects
were addressed first and then the parts with lower priorities.

WP 14 Grid cloud services provision and GPU services integration


Since February 2015, MD-Paedigree’s integration platform has been released as an alpha, then a beta, then a final
version, implementing the planned functionalities following an agile process.
With regard to data integration, the FedEHR Repository is currently hosting more than 500 shared
patients. The import has been a rather time-consuming task, taking two to three weeks of work for
each new import from the file sharing system (each import including the sourcing, preparation and
curation of data, plus the importer’s execution runtime). This data integration effectively
took place over an 8-month time span. The curation of data has proven to be an
extremely time-consuming task, which was not foreseen in the initial project workplan. The issue comes
from legal and administrative constraints that prevented access to original data in the centres and made it
impossible to work on non-anonymised data. The process was modified to perform a first anonymization
on site before importing data to the repository. As anonymization tools were already registered at
some centres, it has not been possible to unify the anonymization process, even though specific MD-Paedigree
anonymization profiles were created and provided for free by Gnúbila to the clinical centres. More
than 1 500 000 image files and more than 7 500 000 data
records from the different clinical centres were processed.
Integration of all gateways is finalised and all data provider partners have been provided with a functional
gateway on site or at OPBG in collaboration with the CARDIOPROOF project. All applicable legislation has been
respected through the signature of several contracts between the parties concerned.
The whole infrastructure is extensible by adding new permanent or cloud elements. Gnúbila is currently
ready to manage the dynamic provisioning of servers on demand on private or public clouds.
Service consumers from WP16 interact with the WP14 technology providers to consume their services in order
to provide GUIs and statistical models.
Integration with VPH-Share in all aspects of data provisioning is completed. This has involved the
development of a dedicated gateway server which allows the seamless migration of data sets from within
the VPH-Share system into the MD-Paedigree platform. Due to the nature of the security models employed
in each of these infrastructures, transfers in the opposite direction are not permitted, as data should not
flow from a more secure environment to a less secure one. Technically this could be achieved
as well, but no use case requiring it has been identified in the project. This integration also allows
the use of the VPH-Share data extraction and transformation tools (the Data Publication Suite) to be
integrated into the processing pipeline between the clinical data providers and the MD-Paedigree
platform.

Data integration has taken place for all the needed modalities: data are integrated and technical partners
are able to consume web services or other means of connection to get data and use them in their own tools. Access
rights and their management in the repository have been put in place and configured. Integration APIs are
used for different protocols and at different levels of connection by the technical partners’ applications.
Native repository web services are used by different partners in order to acquire and process data from the
repository. Benefitting from the CARDIOPROOF project, the standard download tool is used by partners
who have difficulty connecting to the web services, to get repository data with a standard web browser.

We developed a methodology for performing one-way Fluid-Structure interaction (FSI), i.e. where the
motion of the wall boundaries is imposed. A Graphics Processing Unit (GPU) accelerated Lattice-Boltzmann
Method (LBM) implementation was used and an efficient workflow for embedding the moving geometry
was developed. This approach leads to an average execution time of approx. one hour per
computation, of which 50% is required for the geometry update operations.
Starting from a regular CPU implementation of the Bloom filter algorithm, we employed different GPU
based optimization techniques on the two basic Bloom filter operations: mapping and querying. An
important speed-up was achieved for both operations: over 300x for mapping, and over 20x for querying.
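
For readers unfamiliar with the two operations, the sketch below shows a plain CPU Bloom filter in which `add` plays the role of mapping and the `in` test the role of querying. It is a minimal illustration, not the project's CPU or GPU implementation; the class name, filter size, number of hash functions and the SHA-256 based hashing are all assumptions.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, false positives are possible."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive n_hashes bit positions by salting one hash function with an index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        """Mapping: set every bit position derived from the item."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        """Querying: the item may be present only if all its bits are set."""
        return all(self.bits[pos] for pos in self._positions(item))

bloom = BloomFilter()
bloom.add("ACGTACGT")
print("ACGTACGT" in bloom)
```

Both operations are independent per item, which is what makes them natural candidates for the GPU parallelisation described above.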
Texture analysis based on steerable Riesz wavelets is a powerful pattern analysis tool, but requires
computing pixel-wise operations, resulting in a run time in the order of days when large volumes of data
are processed. To overcome this limitation we developed a GPU based solution. To further increase the
performance, and to overcome compute and memory limitations, we applied a series of optimization
techniques, leading to five versions in total. The best performing GPU solution ensured a speed-up of 93x
for the parallelized section of the application and of 29.6x for the entire application.
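
The gap between the two speed-up figures can be read through Amdahl's law, S = 1 / ((1 - p) + p / s), where s is the speed-up of the parallelized section and p the fraction of the original run time spent in it. The sketch below solves for p under the assumption that Amdahl's law applies to these two reported numbers; this is an interpretation, not an analysis taken from the report.

```python
def parallel_fraction(section_speedup, overall_speedup):
    """Solve Amdahl's law S = 1 / ((1 - p) + p / s) for the parallel fraction p."""
    return (1 - 1 / overall_speedup) / (1 - 1 / section_speedup)

def overall_speedup(p, section_speedup):
    """Amdahl's law: overall speed-up for parallel fraction p."""
    return 1 / ((1 - p) + p / section_speedup)

p = parallel_fraction(93.0, 29.6)
print(f"parallel fraction: {p:.3f}")
```

Under this assumption roughly 0.977 of the original run time was in the parallelized section, so the remaining serial part is what limits the speed-up of the whole application.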
We introduced a novel approach for the voxelization of solid objects, designed for GPU. The method is
based on a heuristic approach that computes an approximate distance field instead of using mesh surface
normals or exact point-to-triangle distances. Two main steps are required: voxel marking and distance field
computation. The proposed method was found to be exceptionally robust, as it is able to handle meshes
with severe defects such as self-intersections and holes. The GPU based implementation was on average
20 times faster than the multi-core CPU based implementation.

ATHENA: EXAREME (http://www.exareme.org/), which was developed by integrating the Athena Distributed
Processing Engine (ADP) with the madIS extensible relational data analysis system
(https://github.com/madgik/madis), is now an open source project supported by the MaDgIK group at
ATHENA.
EXAREME offers a declarative language which is based on SQL with user-defined functions (UDFs) extended
with parallelism and data pipeline primitives. It is separated into the following components: The Master is
elected from the worker pool and is the main entry point, through the gateway, to the system. The Master
is responsible for the orchestration of all the components. The Execution Engine communicates with the
resource manager and schedules the operators of the query respecting their dependencies in the dataflow
graph and the available resources. It also monitors the dataflow execution and handles failures. All the
information related to the data and the allocated resources is stored in the Registry. The Resource
Manager is responsible for the allocation and deallocation of resources on each node. The
Optimizer/Scheduler engine translates a high level query into the distributed machine code of the system
and creates the final execution plan by assigning operators to workers. Finally, the Worker executes
operators (relational operators and UDFs) and transfers intermediate results to the master. MadIS is the
core engine of the Worker. MadIS is a wrapper of SQLite based on the Python APSW library. It processes the data
in a streaming fashion and performs pipelining when possible, even for UDFs. The UDFs are executed inside
the database along with the relational operators to push them as close to the data as possible.
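
The idea of running a user-defined function inside the database engine, next to the relational operators, can be illustrated with Python's standard `sqlite3` module. This is a sketch only: it does not use madIS, APSW or EXAREME, and the table, column and function names are hypothetical.

```python
import sqlite3

def bmi(weight_kg, height_m):
    """Python UDF that SQLite will call once per row."""
    return weight_kg / (height_m ** 2)

conn = sqlite3.connect(":memory:")
conn.create_function("bmi", 2, bmi)  # register the UDF under the SQL name "bmi"

conn.execute("CREATE TABLE visits (patient_id TEXT, weight_kg REAL, height_m REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [("p1", 40.0, 1.5), ("p2", 64.0, 1.6)],
)

# The UDF is evaluated inside the query, together with filtering and ordering.
rows = conn.execute(
    "SELECT patient_id, bmi(weight_kg, height_m) AS bmi "
    "FROM visits WHERE bmi(weight_kg, height_m) > 20 ORDER BY patient_id"
).fetchall()
print(rows)
conn.close()
```

Because the function is applied where the rows are read, only the filtered result leaves the engine, which is the same motivation given above for pushing UDFs close to the data.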
EXAREME offers a relational processing engine able to support scalable distributed execution of complex,
resource-intensive and time-consuming data processing flows, mainly related to data mining and decision support.
In addition (and if this is ever required), data mining algorithms can be implemented with EXAREME in a
privacy-preserving way, transmitting only aggregated hospital data (sufficient statistics). It currently
supports the following functionalities for distributed data mining algorithms in a
privacy-preserving way:
1. Get list of the available algorithms such as ID3 Decision Trees, K-Means, Linear Regression, Covariance
Matrix, PCA, Standard Deviation, Summary Statistics.
2. Submit any of the available algorithms for execution.
3. Get the execution status of a submitted algorithm.
4. Get the execution results of a completed algorithm
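
The principle of transmitting only sufficient statistics can be sketched for the simplest case, a pooled mean and variance. Each site reports three aggregates (count, sum and sum of squares) and the coordinator combines them without seeing any patient-level value. This is an illustration of the principle, not EXAREME code; the function names and numbers are hypothetical.

```python
def local_stats(values):
    """Run inside each hospital: only (n, sum, sum of squares) leaves the site."""
    return len(values), sum(values), sum(v * v for v in values)

def pooled_mean_variance(site_stats):
    """Combine per-site aggregates into the overall mean and sample variance."""
    n = sum(s[0] for s in site_stats)
    total = sum(s[1] for s in site_stats)
    total_sq = sum(s[2] for s in site_stats)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance

# Hypothetical measurements held at two hospitals.
hospital_a = [1.0, 2.0, 3.0]
hospital_b = [4.0, 5.0]

mean, var = pooled_mean_variance([local_stats(hospital_a), local_stats(hospital_b)])
print(mean, var)
```

The result is identical to computing the statistics on the pooled raw data, which is why aggregates of this kind are sufficient for algorithms such as linear regression or covariance estimation.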

The VPH-DPS tool has been enriched with several import processes, allowing the same sources to be
redirected to VPH with no additional coding. In addition, whether or not data are imported via DPS, the
web-service-based standard followed by the repository for data retrieval allows easy automated transfer
to a trusted platform.

Agreements have been signed among partners and between partners and the
consortium in order to manage legal concerns. Technical access-rights management has been added to the
system. A governance policy has been proposed to the partners and the system has been configured to
follow this policy.

WP 15 Semantic data representation and information access

From the usability point of view, the ability to search patient records using precise descriptors and
search equations (controlled vocabularies such as the Medical Subject Headings) is highly appreciated by
healthcare professionals. The HES-SO case-based retrieval system (“patient like mine”) is an effective tool for
clinical decision support, as it is today able to retrieve relevant cases with a precision of 66%, i.e. up to 2
relevant cases out of 3 at the top of the search list. HES-SO also developed a service which allows searching
the MD-Paedigree repository through images that the user has input into the system. This service works
in tandem with text-based search to provide the most relevant hits to the user. Positive and negative
feedback is possible for the visual search.
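
The "2 relevant cases out of 3 at the top of the list" reading of the precision figure corresponds to a precision-at-k measure. The sketch below shows how such a value is computed; the report does not specify the exact evaluation protocol, so the choice of k, the function name and the case identifiers are assumptions.

```python
def precision_at_k(ranked_case_ids, relevant_ids, k):
    """Fraction of the top-k retrieved cases that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_case_ids[:k]
    return sum(1 for case_id in top_k if case_id in relevant_ids) / k

# Hypothetical ranked search result and relevance judgements.
ranked = ["case7", "case2", "case9", "case4"]
relevant = {"case7", "case9", "case4"}

p_at_3 = precision_at_k(ranked, relevant, k=3)
print(f"{p_at_3:.2f}")
```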

During the project, ATHENA developed a web-based, end-to-end data profiling, curation/cleaning, pre-
processing, analytics and knowledge discovery platform for big data healthcare by extending and
integrating the Data Curation and Validation tool (DCV) developed in T15.1 with the KDD tools developed
in WP16. The platform runs on top of ATHENA’s EXAREME (T14.3, ex-ADP) dataflow processing system,
having only one point of integration with the MD-Paedigree platform. The data cleaning, curation and
validation part of this platform (DCV) is a web application offering an advanced (semi-)automatic data
cleaning process, with mechanisms that facilitate the detection of numeric outliers,
missing values and alphanumeric typographical errors. DCV also offers a user-friendly interface for
defining and running data cleaning rules over a relation, such as functional dependencies, conditional
functional dependencies and denial constraints. An additional powerful functionality of DCV is
the computation of new derived columns, either through discretisation criteria or by defining and
executing arithmetic operations (e.g. for computing medical scores). Furthermore, DCV provides
visualisation of data through interactive bar charts and pie charts, which help users to identify the distinct
values of a column’s data, as well as scatter plots and line charts, which give a graphical representation of
correlations between two attributes. Users can also export data regions of interest through interactive
visualisations. Finally, DCV keeps a history of all actions that affect the values of data. The user
can undo/redo history or save workflows and re-run them in other projects or with other data.
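
As an illustration of one of the cleaning rules mentioned above, the sketch below checks a functional dependency over a small relation: if the dependency `patient_id -> birth_year` holds, every patient identifier must map to exactly one birth year. This is not DCV code; the function name, column names and rows are hypothetical.

```python
from collections import defaultdict

def fd_violations(rows, lhs, rhs):
    """Return the LHS values that map to more than one RHS value (lhs -> rhs is violated)."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[col] for col in lhs)
        seen[key].add(tuple(row[col] for col in rhs))
    return {key: values for key, values in seen.items() if len(values) > 1}

# Hypothetical relation with one inconsistent birth year for patient p2.
records = [
    {"patient_id": "p1", "birth_year": 2005, "visit": 1},
    {"patient_id": "p1", "birth_year": 2005, "visit": 2},
    {"patient_id": "p2", "birth_year": 2007, "visit": 1},
    {"patient_id": "p2", "birth_year": 2008, "visit": 2},
]

violations = fd_violations(records, lhs=["patient_id"], rhs=["birth_year"])
print(violations)
```

A conditional functional dependency would apply the same check only to the rows that satisfy a given condition.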

Figure 8: The Data Curation and Validation (DCV) tool.

WP 16 Biomedical knowledge discovery and simulation model-guided personalised medicine

ATHENA developed a web-based, end-to-end data profiling, curation/cleaning, pre-processing, analytics
and knowledge discovery (KDD) platform for big data healthcare extending and integrating existing DCV
(T15.1) & AITION (T16.1) tools. The platform runs on top of ATHENA’s EXAREME (T14.3) dataflow
processing system, having only one point of integration with the MD-Paedigree platform. Several well-
established machine learning techniques targeting clustering, classification, dimensionality reduction and
similarity search have been incorporated.
During the project, the platform has been used for data cleaning, curation and validation of MD-Paedigree
data (e.g. Obesity eCRFs) and for JIA and NND classification use-cases. In particular, an automated
classification system was developed and assessed to classify joint movement patterns for NND patients.
All models achieve high prediction accuracy (above 80%) for each pattern.

SAG: Within the scope of this project, Siemens Healthcare achieved a complete refoundation of its
case-based reasoning technology, now rebranded as DeepReasoner. Both the prototype itself, whose
front-end and back-end have been completely redesigned and reimplemented, and the
machine learning technology at the core of the case-based reasoning engine have been improved. Our
former web prototype, implemented in Java and served by a Jetty server, supports similarity searches that
use classical distance functions, such as Euclidean distance, computed over the selected clinical variables.
While classical distances are well-studied and therefore easy to interpret, they do not cope well with
high dimensionality, the presence of uninformative features, or large differences in dimensionality and
scales between different sources of information. All of these features are found in the heterogeneous and
multi-modal data that are being acquired in this project. Within the MD-Paedigree project, we proposed a
significant enhancement of the underlying knowledge-discovery engine that has the potential to overcome
these limitations of similarity search based solely on classical distance functions. This new development
uses ‘deep learning’ methods to derive compact task-specific patient representations that can then be
analyzed in a number of ways, including through similarity estimations. Moreover, we completely
redesigned the front end as well as the back end using modern web technologies such as HTML5, JavaScript
and server-side scripting with Node.js. The retrieval engine itself is implemented as an Azure Machine Learning
web service using the Python scripting language and the Keras deep learning library.
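
The sensitivity of Euclidean distance to feature scales, one of the limitations named above, can be seen in a small example. The sketch compares distances on raw values with distances after per-column standardisation. It only illustrates the limitation and a classical remedy; it is not the deep-learning representation used by DeepReasoner, and the two features and their values are hypothetical.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]

# Hypothetical patients described by (age in years, a lab value on a much larger scale).
patients = [(10.0, 3000.0), (11.0, 3000.0), (10.0, 3050.0)]

raw_01 = math.dist(patients[0], patients[1])  # differs only in age
raw_02 = math.dist(patients[0], patients[2])  # differs only in the lab value

scaled = zscore_columns(patients)
z_01 = math.dist(scaled[0], scaled[1])
z_02 = math.dist(scaled[0], scaled[2])

print(raw_01, raw_02)
print(round(z_01, 3), round(z_02, 3))
```

On the raw values the large-scale feature dominates the distance, whereas after standardisation both differences carry the same weight.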

UTVB developed a methodology for separating the total stiffness of arteries, determined in vivo, into
stiffness of the arterial wall and stiffness of the surrounding tissue. By employing a reduced-order
multiscale model, the methodology was used for studying the global effects of surrounding tissue support
on arterial hemodynamics. The main effects were: higher wave speed, earlier arriving backward travelling
pressure and flow rate waves, lower total compliance, higher pressure pulse, and reduced arterial cross-
sectional areas.
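
The link between added tissue support, lower compliance and higher wave speed can be sketched with the Bramwell-Hill relation, c = sqrt(A / (rho * C_A)), where C_A = dA/dP is the area compliance. This is an illustrative simplification, not the reduced-order multiscale model used by UTVB: treating wall and surrounding tissue as parallel springs whose stiffnesses add is an assumption, and all numerical values are hypothetical.

```python
import math

BLOOD_DENSITY = 1060.0  # kg/m^3, a typical textbook value

def wave_speed(area_m2, area_compliance):
    """Bramwell-Hill pulse wave speed, with area compliance C_A = dA/dP."""
    return math.sqrt(area_m2 / (BLOOD_DENSITY * area_compliance))

def total_compliance(wall_stiffness, tissue_stiffness):
    """Assumption: wall and tissue act as parallel springs, so stiffnesses add."""
    return 1.0 / (wall_stiffness + tissue_stiffness)

area = 3.0e-4     # m^2, hypothetical lumen cross-sectional area
wall_k = 4.0e7    # Pa/m^2, hypothetical wall stiffness dP/dA
tissue_k = 1.0e7  # Pa/m^2, hypothetical surrounding-tissue stiffness

c_wall_only = wave_speed(area, total_compliance(wall_k, 0.0))
c_with_tissue = wave_speed(area, total_compliance(wall_k, tissue_k))
```

Under these assumptions, adding tissue stiffness lowers the total compliance and therefore raises the wave speed, in line with the first two effects listed above.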
We introduced a model-based approach for the non-invasive estimation of patient specific, left ventricular
PV loops. A lumped parameter circulation model was used, composed of the pulmonary venous
circulation, left atrium, left ventricle and the systemic circulation. A fully automated parameter estimation
framework was introduced for model personalization. We computed the PV loop for a cohort of patients
and compared the results against the invasively determined quantities: there was a close agreement
between the time-varying LV and aortic pressures, time-varying LV volumes, and PV loops.
The personalized circulation models are used to provide input features for the DeepReasoner.

HES-SO developed a hypothesis generation tool, also used for clinical trial feasibility assessment. Such
systems capitalize on narrative data and text mining techniques. Source data originate from two sources:
clinical data from EHRs or the medical literature. The data, mostly unstructured text, are first normalized
using unambiguous clinical ontologies. Second, they are indexed, and a search engine based on this index
is designed to explore associations across the different concepts. Associations among 13 data categories
of clinical concepts are available (e.g. chemicals and drugs).
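
The index-then-explore idea can be sketched as a small inverted index over normalised concepts, ranked by co-occurrence. This is only a sketch of the general technique, not the HES-SO implementation: the documents, the concept labels and the function names are hypothetical, and raw co-occurrence counts stand in for whatever association measure the tool actually uses.

```python
from collections import defaultdict

def build_index(documents):
    """Map each normalised concept to the set of document ids mentioning it."""
    index = defaultdict(set)
    for doc_id, concepts in documents.items():
        for concept in concepts:
            index[concept].add(doc_id)
    return index

def associations(index, concept):
    """Rank other concepts by the number of documents shared with `concept`."""
    docs = index.get(concept, set())
    counts = {other: len(docs & ids)
              for other, ids in index.items() if other != concept}
    ranked = [(c, n) for c, n in counts.items() if n > 0]
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

# Hypothetical documents, already normalised to ontology concept labels.
documents = {
    "d1": {"methotrexate", "juvenile idiopathic arthritis"},
    "d2": {"methotrexate", "juvenile idiopathic arthritis", "nausea"},
    "d3": {"obesity", "metformin"},
}

index = build_index(documents)
top = associations(index, "methotrexate")
```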

WP 17 testing and validation

ATHENA: In T17.3 Beta Prototype of KDD & Simulation Platform testing and validation [M36-51], we
proceeded with a number of development & testing iterations (similar to agile sprints), which were driven
by the requirements of the NND, JIA and Obesity use-cases of T16.1 and their end-users (clinicians, data
analysts, researchers, etc.), following the quick production process already adopted for making the
functionalities usable (and so testable) as soon as possible. In addition, we prioritized and began
implementing the suggestions/feedback provided by clinicians during the beta prototype training session
in Rome in mid-February 2016 (the feedback was reported in Deliverable D17.5 “Test on Beta Prototype
of KDD and Simulation Platform”). Bugs were corrected and the user-interface and interactive
visualizations were improved based on this feedback.
Furthermore, as the new web-based KDD platform was integrated with the Data Curation and Validation
(DCV) tool, all integrated algorithms were tested again on synthetic and real data. After the platform was
tested internally by the infrastructure team, we proceeded with validation steps in cooperation with
clinicians. They started to use the platform during the 4th bi-annual MDP meeting (Leuven, mid-September
2016) applying real data from clinical cases. The clinicians managed to load, curate and process their data
successfully on the platform. Prediction models were also developed by the NND group in order to
compare with their own manual classification system. Results are presented in the WP16 4th year periodic
report in T16.1.

HES-SO: A methodology to develop and monitor the progress of the Case-Based Retrieval service,
developed within WP15, has been proposed. Qualitative and quantitative evaluations have been used for
testing and validation. The qualitative results were sufficient to improve the usability of the
application. The quantitative evaluation shows the effectiveness of the tool for clinical decision support, as it is
now able to retrieve relevant cases with a precision of 66%, i.e. up to 2 relevant cases out of 3 at the top of
the search list.
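
The reported figure corresponds to precision measured over the top of the ranked list. A minimal sketch of that metric follows; the case identifiers are hypothetical and the cut-off of 3 is assumed from the "2 out of 3" wording above.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved cases that are relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for case in top if case in relevant) / len(top)

# Hypothetical ranked result list and relevance judgements.
retrieved_cases = ["case_07", "case_02", "case_09", "case_14"]
relevant_cases = {"case_07", "case_09"}

p_at_3 = precision_at_k(retrieved_cases, relevant_cases, k=3)
```

Here two of the top three retrieved cases are relevant, giving a precision of about 0.67.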

GNUBILA: MD-Paedigree Infrastructure testing and validation (T17.1) was carried out repeatedly,
with testing based on agile methodology: each elementary functionality carried its own tests
and was put into production as soon as possible so that it could be tested by the final users. Real-use testing of the
applications took place throughout the project, driven by the need to carry out the actual work. This
ensured usability and revealed many bugs that only appear under real conditions. Integration with other
applications followed the same approach, with half of the tests on each side, followed by the full
process of real use from end to end.

WP18 Dissemination and Training

The dissemination and training WP started its work at the very beginning of the project. First, the
preliminary dissemination materials were provided, together with a strategy for effective dissemination
of the project results. The website and the first infographic were provided less than one month after the
project start.

The strategy indicated the key messages to be conveyed, the different stakeholders to be addressed,
and the outline of the training activities.

During the following four years, the strategy was executed through the constant production of new
materials, newsletters, posters and videos, and through participation in relevant international events and
conferences.

Participation in the ICT events was particularly successful: during the project lifetime, MD-Paedigree
participated in ICT2013, which took place in Vilnius at the end of 2013, and in ICT2015, organised
in Lisbon at the end of 2015.

During ICT2013, only 8 months after the project start, MD-Paedigree managed to obtain a large
exhibition space, in which the various tools were showcased through live demos and where dissemination
materials were distributed. The unexpected outcome of this effort was the award as Best Exhibition
at ICT2013.

During ICT2015, MD-Paedigree organised a joint booth with its cognate project CARDIOPROOF and with
the GÉANT network, obtaining a large space near the EC space, where video interviews were displayed
and live demos were given.

It is worth noting that networking sessions were organised on both occasions: during ICT2013, the
networking session focused on Big Data Healthcare, and a discussion paper was released for the occasion.
During ICT2015, a networking session on Enhanced Consent was organised, and here too a relevant
discussion paper was released. Both discussion papers served not only as effective
means of disseminating the project's key concepts and approach, but also laid the basis for new
research and cooperation initiatives. In particular, the discussion paper for ICT2015 served as the starting point
for the new project MyHealthMyData.

Various videos were produced and published during the project: the first videos were
interviews with the developers and clinical partners, explaining the clinical needs addressed by the
developed tools and their key features.

The second series provided demo videos of the analytics tools, as step-by-step screen captures showcasing
the various features available.

Annual newsletters were released regularly on the website and in print, for distribution at
various events. The newsletters provide an overview of the progress in each year, and the last issue
provided a general overview of all the activities and results in each disease area and in the Infostructure
development.

Finally, a final conference was organised in Rome on 22-23 May 2017. The conference was
attended by more than 100 participants from ten different countries across Europe. Decision-makers,
industry representatives, EC officers, researchers and clinicians attended, over two days packed with
plenary presentations, demonstration sessions, and disease-specific sessions.

Besides the presentation of the project results, key issues such as validation, business opportunities and the future
perspectives of the VPH field were discussed with the key stakeholders. For the conference, specific
dissemination materials and a dedicated web page were created.

Regarding the training activities, it should first be noted that these activities changed
significantly in nature in the implementation phase, compared with the initial provisions included in the
Description of Work.

While training activities were originally envisioned mainly as a further tool for dissemination, they became
something different: through training activities, which were organised as hands-on sessions with the
involvement of the clinicians, it was possible to gather early feedback on the implemented tools, to
show the key features of the tools to the intended end-users, and to understand the most suitable clinical
usage of the tools. In this sense, training activities and reports also worked as debugging sessions. Three
training events were eventually organised: the first one, very early in the project, during the second
biannual meeting in Utrecht; the second one during the third yearly meeting in Rome;
and the last one during the final biannual meeting, held in Leuven at the end of 2016. For each of these sessions,
specific materials, guidelines and scripts were prepared by the training team, in close cooperation with the
partners in charge of the development of the various tools.

WP19 Exploitation, HTA and medical device conformity

WP19 was aimed at providing four main outcomes: 1) an HTA model for the tools implemented in the
project; 2) an impact assessment of the usage of the developed tools in a concrete clinical environment,
on the basis of the HTA framework provided in the first phase of the project; 3) an exploitation plan; and
4) information on the relevant regulatory framework and issues.

It can be said that these goals have been overall achieved.

As far as the HTA evaluation framework and subsequent impact assessment are concerned, a general health
economic model structure was developed, to be applied as part of the MD-Paedigree clinical impact
assessment. We showed how such a model can be used to assess health technology impact in MD-Paedigree
using decision analytic modelling. This generic clinical impact modelling framework is closely related to
and drawn from the European Innovation Partnership on Active and Healthy Ageing's MAFEIP Tool
(Monitoring and Assessment Framework for the EIP on AHA), by the EU Institute for Prospective
Technological Studies, Joint Research Centre.
Our HTA model contributes, from a socio-economic and commercial perspective, towards making VPH
models and data-driven simulations readily available both to researchers and to health professionals as
decision support at the point of care. We prepared an appropriate analytical evaluation framework and
undertook groundwork for exploring market access, including meeting the regulatory requirements for medical
products. This also includes exploring health system and business opportunities. The following steps,
performed as the central output of WP19 to MD-Paedigree, describe the main S&T results from a technology
impact assessment and exploitation planning perspective:
• Review, develop and specify a high-level, generic benefit-cost scenario for clinical impact
assessment
• Identify, analyse and summarise knowledge on the disease types of cardiomyopathy/heart
failure, the presently available treatment options, on data of incidence and prevalence,
prognoses for different outcomes to be expected, as a base for the assessment
• Based on this knowledge, develop a detailed clinical pathway model for the identified
disease, and validate it against the clinical workflow in the Ospedale Pediatrico Bambino Gesù
in Rome
• Draft an initial framework/matrix relating pathway steps and clinical interventions to their
respective costs, as well as benefits and costs expected to relate to various outcomes of the
pathways, including their impact on patients/parents
• Populate this template with concrete cost data and outcome estimates from the hospital as
well as from the literature where available
• Undertake a high-level calculation and assessment of the benefits and costs to be expected
from the application of the models under development for this disease once they are indeed
ready for clinical usage.
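
The last three steps (relating pathway branches to costs and benefits, populating them with data, and computing a high-level result) can be sketched as an expected-value comparison of two pathways. This is an illustration of decision analytic modelling in general, not the WP19 model or the MAFEIP tool: the branch probabilities, costs and monetised benefits below are entirely hypothetical.

```python
def expected_cost_benefit(branches):
    """Expected cost and benefit of a pathway.

    `branches` is a list of (probability, cost, benefit) tuples whose
    probabilities must sum to 1.
    """
    total_p = sum(p for p, _, _ in branches)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    cost = sum(p * c for p, c, _ in branches)
    benefit = sum(p * b for p, _, b in branches)
    return cost, benefit

def net_benefit(branches):
    """Expected benefit minus expected cost for one pathway."""
    cost, benefit = expected_cost_benefit(branches)
    return benefit - cost

# Hypothetical pathways: (probability, cost in EUR, monetised benefit in EUR).
standard_care = [(0.7, 1000.0, 5000.0), (0.3, 4000.0, 1000.0)]
model_supported = [(0.8, 1200.0, 5000.0), (0.2, 4200.0, 1000.0)]

incremental = net_benefit(model_supported) - net_benefit(standard_care)
```

With these illustrative numbers, the model-supported pathway has a higher cost per branch but shifts probability towards the favourable outcome, so its expected net benefit is higher.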

With regard to exploitation, several activities were carried out: at the beginning of the project, a
preliminary strategic exploitation seminar was organised, and a discussion paper on possible
approaches to joint exploitation was provided. The seminar was the first occasion to discuss with
both clinical and technical partners the development perspectives of the relevant market, how to
effectively introduce innovative model-based CDSSs and the relevant data platform for management and
analytics into current clinical workflows, and what changes were to be made in these workflows to better
adopt the innovative tools provided by the project.

On the basis of the exploitation seminar, the first exploitation plan was submitted. It mainly explored the
joint exploitation perspectives, while the individual exploitation routes were only briefly indicated (also
in view of the lack of concrete business cases for the tools, which were still immature).

Subsequently, at M36, the Updated Exploitation Plan was provided. This document provided a much more
fine-tuned market analysis and business model for the joint exploitation plan, while also adding some
details (through partner-specific exploitation tables) on the individual exploitable results and the relevant
go-to-market strategy. The updated exploitation plan also presented the perspective of a joint
exploitation initiative through the submission of an SME Instrument proposal. Finally, this document also reported
on the discussion, and the relevant draft licensing documents, regarding the maintenance of the basic
technology, namely the PCDR, as a way of ensuring the future sustainability of the key tools implemented in
the project, as well as of the datasets collected. Both these elements have been considered key assets for
the development of a compelling business case.

The final year of the project saw the submission of two SME Instrument proposals, with a great
effort in preparing the proposals and in seeking the support of all the project partners in making
tools and datasets available for the new venture. Unfortunately, as reported in the Final Exploitation Plan,
both submissions were unsuccessful.

Still, the sustainability of the project has ultimately been ensured through: 1) the Project Cooperation Agreement
between MD-Paedigree and CARDIOPROOF, indicating the mutual interest in maintaining the technical
Infostructure and keeping the datasets available; 2) subsequently, the participation of both projects, and
of some of the MD-Paedigree partners, in the new EU-funded project MyHealthMyData, which builds upon
the MD-Paedigree/CARDIOPROOF platform and collected datasets; 3) a letter of intent between gnúbila and
OPBG for the maintenance of the PCDR, as the basic technology at the core of the project platform, hosting
both the data management and analytics tools, and the data themselves.

Potential impact
WP5 Data acquisition and processing for Juvenile Idiopathic Arthritis

The results obtained in MD-Paedigree benefit patients and provide the basis for future research in a
variety of ways. First of all, the developed prediction models could be used in clinical practice to identify
patients at risk of failing to achieve disease inactivity within a short period of time. Those patients could
then be prescribed more aggressive drugs (i.e. biological agents) earlier in the course of disease in order
to increase their probability of attaining inactive disease. The models have furthermore highlighted the
importance of the duration of morning stiffness in this evaluation, a symptom which is currently not
sufficiently taken into account.

Furthermore, these models provide a basis for future investigations into the role of Mogibacteriaceae
and CXCL-9 in the pathogenesis of JIA. The differences in gut microbiota composition between patients
and healthy children pave the way towards follow-up studies to find out whether gut microbiota are involved in
the pathogenesis of JIA or whether they render individuals more susceptible to developing autoimmune diseases. If
these hypotheses are confirmed, these findings might lead to novel therapeutic targets in the treatment
of JIA.

The data sets acquired using MRI and US provide useful information for the ongoing discussion about the
use of imaging in the evaluation of JIA disease activity and outcomes. Protocols and the derived imaging-
based disease scores will be shared with imaging experts and used in the establishment of an
internationally agreed and validated US and MRI scoring tool. These tools, in turn, can be used in clinical
trials to assess the efficacy of (novel) drugs, thus leading to a more comprehensive assessment of
treatment for JIA. Ultimately, these implementations will lead to a therapeutic strategy which is aimed at
the prevention of the occurrence of joint damage and the preservation of function.

The results of the biomechanical ankle model will provide insights into the mechanical component of the
pathogenesis of JIA and the development of long-term sequelae, such as bone and cartilage erosions.
These findings have the potential to impact the treatment of JIA patients substantially, mainly by
drawing attention to non-inflammatory aspects of the pathogenesis of JIA, requiring non-
pharmacological interventions, such as physiotherapy.

The rich dataset collected during this study, including clinical, imaging, gut microbiota, immunological
(inflammatory compounds in plasma and synovial fluid as well as isolated peripheral blood mononuclear
cells and synovial fluid mononuclear cells) and functional (gait cycle analysis) data of a large, multicentre
cohort of treatment-naïve JIA patients early in the course of disease will be used in ongoing research to
better characterize the group of JIA patients. It is an excellent source to generate hypotheses regarding
the pathogenesis and evolution of JIA, as well as the evaluation of disease outcomes. Similarly, the
dataset can be used as a validation cohort to test hypotheses generated using other cohorts.

WP 6 Data acquisition and processing for NND

The developed standardized protocols and data sets contain a wealth of information that may impact the
field of clinical gait analysis.

The developed protocols are a first step towards more standardized data collection in Europe, which will
facilitate better comparison of data between centres.

The developed data set can now be used for all kinds of scientific studies, focusing on classification of
patient characteristics and/or gait patterns, or prediction of treatment outcome.
WP 7 Genetic and metagenomic analytics
Providing models of obesity and JIA microbiota enterophenotypes has an important impact on
the modulation and amelioration of gut microbiota dysbiosis for the therapeutic treatment of these
diseases, including patient-tailored probiotics and FMT. Currently, the derived models can assist in the
generation of diagnostic and clinical tools, summarised below. Moreover, the next frontiers in clinical
applications can be represented as follows.

WP 8 Modelling and simulation for Cardiomyopathies
Computational modelling has attracted significant attention in cardiac research over the last decades. We
strongly believe that computational models will improve patient stratification and therapy planning in the
future. They will become the enabling tool for predicting disease course and therapy outcome, ultimately
leading to improved clinical management of patients suffering from cardiomyopathies or other types of
cardiovascular disease.

T8.1: The prototype developed during the course of this project provides a robust and user-friendly means
for segmentation and tracking of cardiac chambers over the whole heart cycle. As shown, it can be
deployed in the clinic and be used directly by physicians after a brief training. Together with the efficient
processing that allows finishing one case within 10 to 20 minutes, it opens up anatomical modelling for
usage in clinical routine. A direct, clinically relevant output of the prototype is an exact blood volume curve
over the entire heart cycle. However, as the anatomical model is a pre-requisite for all following modelling
stages with much more interesting outputs, enabling the physician to get to these stages so fast can indeed
be regarded as the highest impact of this task.
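
As an illustration of what can be read directly off such a blood volume curve, the standard definitions of end-diastolic volume, end-systolic volume, stroke volume and ejection fraction can be sketched as follows. The sampled volumes are hypothetical and the function name is illustrative; this is not the prototype's own code.

```python
def ejection_fraction(volumes_ml):
    """Derive EDV, ESV, stroke volume and ejection fraction from a volume curve.

    `volumes_ml` holds ventricular blood volumes sampled over one heart cycle.
    """
    edv = max(volumes_ml)  # end-diastolic volume: largest volume in the cycle
    esv = min(volumes_ml)  # end-systolic volume: smallest volume in the cycle
    stroke_volume = edv - esv
    return edv, esv, stroke_volume, stroke_volume / edv

# Hypothetical left-ventricular volume curve over one cycle, in millilitres.
lv_volume_curve = [120.0, 110.0, 85.0, 60.0, 50.0, 55.0, 80.0, 105.0, 118.0]

edv, esv, sv, ef = ejection_fraction(lv_volume_curve)
```

For this curve the stroke volume is 70 ml out of an end-diastolic volume of 120 ml, an ejection fraction of roughly 58%.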

T8.2: The availability of fast and reproducible personalization pipelines, which are scalable both in terms
of fitted data and estimated model parameters, opens the way to the online processing of large databases.
The work performed in MD-Paedigree has shown that the parameter values of personalized 3D cardiac
models can capture some intrinsic properties of the heart and can predict the possible behaviour of the
heart under specific changing conditions (such as exercise or drug treatment). In particular, the evolution
of parameters in some follow-up CMP data suggests an improvement of the heart condition under therapy.
Furthermore, the ability to automatically and robustly personalize models is one of the key enablers for
applying computational modelling tools in future clinical routine, with huge potential implications for more
precise patient stratification and advanced therapy planning.

T8.3: The outcome of the clinical validation of the WBC haemodynamics model suggests that the
personalized model might be suitable to support the physician in planning patient-specific beta blocker
therapies. By providing the heart rate for optimal cardiac function, the model sets a target which the
physician can use to select an appropriate dose of the pharmaceuticals.
Moreover, the quantification of the 3D flow with region-averaged velocities might be used as strong
features in machine-learning approaches to predict patient outcome or optimal therapy. Due to the low
number of cases available for the full 3D haemodynamics model, this could not be evaluated during MD-
Paedigree, but would need to be investigated further in a follow-up study.

T8.4: The FSI technology developed in this project was able to improve the accuracy of the standard
haemodynamics model for the evaluated cases, which in itself is a big success. As the technology is now
ready-to-use, it can also be applied to other applications beyond cardiomyopathies, hopefully improving
results there as well and helping to establish haemodynamics as a standard component in cardiovascular
modelling.

T8.5: Using statistical techniques such as partial least squares (PLS), we can model the relationship
between the parameters of our reduced motion models and the clinical variables. More recent non-linear
methods could be used when larger databases are available. This allows building statistical models of the
diseases from cross-sectional and longitudinal data. For instance, subtle motion modifications can be seen
with increasing body-mass indices. Thanks to the generative nature of our motion models, we can simulate
image sequences illustrating the different cases or their evolution over time or across pathological cases.
The developed methods could be used e.g. for better disease understanding or clinical decision support
systems.

WP 9 Modelling cardiovascular risk in the obese child and adolescent

T9.1
The tools developed for extracting the cardiac anatomy and characterizing the cardiac function have been
greatly improved during the project to increase their robustness and ease of use. This has significantly
accelerated patient data processing to a level where the tools can handle the large number of cases
acquired by clinical partners in a timely manner. They have been successfully applied to challenging cases
presenting cardiomyopathies (within WP8) and cases from obese children (within WP9). These
tools enable the extraction of highly relevant parameters for assessing cardiac function in an objective and
reproducible manner.

T9.2
To date, the body mass index (BMI) is still the primary measure to assess the degree of obesity for clinical
diagnostics and studies. A major drawback of this simple measure is that it does not estimate the individual
fat distribution within the body, but only the general adiposity of the subject.
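
The limitation is visible in the formula itself: BMI is body mass in kilograms divided by the square of height in metres, so it contains no term for where the fat is located. The subject values below are hypothetical.

```python
def body_mass_index(mass_kg, height_m):
    """BMI = mass (kg) / height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return mass_kg / height_m ** 2

# Two hypothetical subjects with identical mass and height get the same BMI,
# regardless of how much of their fat is visceral.
bmi_subject_a = body_mass_index(64.0, 1.60)
bmi_subject_b = body_mass_index(64.0, 1.60)
```

Both subjects obtain a BMI of 25, which says nothing about differences in their visceral adipose tissue.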
Clinical studies have shown that Visceral Adipose Tissue (VAT) correlates highly with cardiovascular
diseases (CVD) [1]. However, subjects with a normal BMI might also suffer from CVD due to fat accumulation in other
parts of the body [2].
Manual delineation of adipose tissues in 3D data is very time-consuming, especially for large amounts of
data, and it is therefore highly desirable to automate this task as much as possible.
The developments carried out by Fraunhofer are an important contribution for clinical research and
studies, where the BMI is not accurate enough to answer clinical questions in the context of cardiovascular
diseases.

[1] L. F. Van Gaal, I. L. Mertens and C. E. De Block. “Mechanisms linking obesity with cardiovascular disease”. Nature, vol. 444,
pp. 875-880, 2006.
[2] A. Romero-Corral et al. “Normal weight obesity: a risk factor for cardiometabolic dysregulation and
cardiovascular mortality”. Eur Heart J., vol. 31, no. 6, pp. 737-746, 2010.

T9.3
The novel case-based reasoning approach DeepReasoner developed within MD-Paedigree is a very
generic tool that can be applied to a wide variety of clinical use cases. It has the potential to become a
new standard for supporting clinical decisions, as it not only performs inference but also
provides evidence for understanding the reasoning behind the decision.

T9.4
Obesity is a complex disorder and a known major risk factor for the development of cardiovascular disease
in both children and adults. It has become clear that many of the detrimental physiological processes
associated with adult disorders such as heart disease and stroke begin in childhood and are worsened by
obesity. However, these processes remain effectively occult because of a reliance upon simplistic
measures that do not sufficiently summarise cardiometabolic risk, particularly in the young. For example,
many children, even those with significant obesity, may have normal resting physiological parameters such
as blood pressure or fasting glucose, despite evidence from more comprehensive assessment that their
cardiometabolic health is compromised. Such assessment is costly and unsuited for use in large population
screening programs but may have value if reserved for cases at the greatest risk. For this reason, the
development of a screening approach such as the one designed within WP9 might make it possible to identify those
challenging cases that are at higher risk of developing cardiovascular diseases and even to support the
choice of the most suitable treatment.

WP 10 Modelling and simulation for JIA
The potential impact of the tools and results of WP10 is scientific and technological.

Scientifically, the project increased knowledge about juvenile idiopathic arthritis and collected a
comprehensive dataset for the characterization of this clinical population. We uniquely showed how the
patient specificity of the disease progression, highlighted by the information collected within this project,
requires a patient specific approach to the pathology quantification. In particular, we showed the
importance of adopting quantitative imaging (MRI and Ultrasound based) and functional (gait analysis
based) tools to assess the pathology progression and drive the intervention. In fact, clear alterations were
still detected by these tools when the patients were clinically judged as being in remission and treatment
was erroneously suspended. This is an exciting result, aligned with and reinforcing the current efforts in clinical
backgrounds to establish juvenile idiopathic arthritis as a syndrome, i.e. a stereotyped response of the body
to a spectrum of conditions.

During this project 57 patient-specific musculoskeletal models were created and used in 534
biomechanical simulations of walking. Such a large and complete database of data and models is
unmatched in the previous musculoskeletal modelling literature, and as soon as the results and data are
published and shared they will represent a unique resource, with an expected high impact on the
biomechanics scientific community.

The clinical protocol for the collection of medical images and gait analysis, compatible with current clinical
practices, was conceived and successfully implemented across institutions, thus setting the state of the art
for future studies interested in deep phenotyping of patients affected by rare diseases. This protocol can also
be considered a concrete step towards the adoption of computational modelling in routine diagnostics, as
data collection for modelling purposes becomes a feasible task. In this respect, the project also offered a
previously unseen opportunity for modellers and medics to work together and experience, for the first
time, the advantages of combining standard clinical evaluations with innovative musculoskeletal modelling
and biomechanical analysis in real clinical cases.

From a technological point of view, the highly efficient semi-automated techniques and procedures
developed here will be part of JointForce, an on-line in silico medicine service that emerged from the work that
USFD has done in various projects, including MD-Paedigree. The on-line service will provide, initially only
for experimental medicine research purposes, the possibility to examine selected subjects using defined
imaging and gait analysis protocols, upload the data collected, and receive back a non-invasive estimate
of the forces being transmitted through selected joints of that subject during level walking.
We expect to release JointForce by Q2/2018 as an on-line service accessible to every clinical research
group in the world. We are discussing with some in silico medicine service providers (such as
http://insilicotrials.com) the possibility to expose JointForce through their portals and we think the
JointForce technology has a clear case for exploitation as a clinical research tool.

WP 11 Modelling and simulation for NND

(SAG, USFD):
Medical treatment to possibly regenerate or halt nerve and muscle degeneration in combination with
rehabilitation and surgical procedures will hopefully revolutionize the way patients are treated and
improve their quality of life. Accurate modeling of the legs and their function is an important step in this
direction.

The structure extraction and estimation approach developed by SHC and USFD can dramatically reduce
the time required for defining 73 lower limb muscles and bones, 185 ligaments and muscle-tendon lines
of action, and 9 joint centres and axes. Using the proposed method, defining MR-based musculoskeletal
models can become a more time-efficient and more accurate alternative to rescaling generic models.

The tool developed by TUD allows automatic generation of OpenSim musculoskeletal models based on
anatomical data presented in a standardized format. This opens up several possibilities for research
and clinical decision-making, as well as easy conversion to the Motek HBM format.

TUD sensitivity studies help identify scenarios where MRI-based modelling is beneficial, and where other
less invasive methods are sufficient.

The impact of the personalized models is still not clear. More research is necessary to determine whether the
outcomes are truly beneficial for clinical decision making.

The inertia compensation algorithm has a large impact, first for any research site and later for any clinical site,
that uses treadmill perturbation to assess gait impairments.

Real-time feedback on gait parameters to assess and improve impaired walking patterns can also have a
great impact. This method promises to be more effective than conventional treatment options, and
therefore offers possibilities for patients with e.g. Cerebral Palsy, who otherwise could not receive
conventional treatment.

WP12 - Models validation, outcome analysis and clinical workflows

At the end of the project the validation protocol was defined for all four disease areas. Internal
validation of patient-specific models was performed on most clinical cases, while patient-predictive
models were validated for groups of patients (external validation). Unfortunately, for NMD the validation
of the complete anatomical model was challenging and thus only a limited number of patients were validated
for this purpose in the three disease areas.

Also, innovative model-driven workflows were implemented, indicating the most suitable ways to
translate innovative CDSSs into clinical practice.

WP13 Requirements and compliance for Mdpd infostructure

The results influenced the development of all technical tools for the MD-Paedigree infostructure in the
course of the project. Tool development was prioritised based on end-user feedback: the focus was put
on the highest-priority items, while lower-priority tools were regarded as secondary and potentially
less impactful, and were thus started later.
WP 13 also assured compliance with standards and communicated this compliance to the project
partners. The main areas of compliance were identified at VPH, where metadata on available data sets can
be made available. Availability of the actual raw data depends on the exact requirements of the ethics
committees. OpenAIRE is a second major standard against which compliance was checked. Publications
and data are made available in this way, and so also comply with future project regulations. Thus, WP13
ensured that the project was aware of all factors that can influence compliance, so that these can be
taken into account.

WP 14 Grid-Cloud Services Provision and GPU Services Integration


ATHENA: EXAREME offers a relational processing engine able to support scalable distributed execution of
complex, resource- and time-consuming data processing flows, mainly related to data mining and
decision support.

WP 15 Semantic Data Representation and Information access

Case-based retrieval systems in healthcare environments are likely to provide an alternative to existing
Electronic Patient Records if integrated into vendor solutions. Image context-based retrieval, alone and in
combination with text-based search, will facilitate the process of finding past patients similar to current
ones and thereby provide effective clinical decision support for diagnosis and treatment planning.
It also allows alternative forms of browsing the visual context in archives.

DCV is a user-friendly web-based tool for data cleaning, curation and validation of tabular datasets,
developed for clinicians and biomedical personnel, whose data are often manually collected and contain
many errors or inconsistencies. DCV’s functionality for computing new derived columns, either through
discretisation criteria or by defining and executing arithmetic operations (e.g. for computing medical
scores), is also very useful for clinicians and will likely save time and minimise errors.
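
The two kinds of derived column described above can be illustrated with a minimal sketch in plain
Python. This is not DCV's implementation or interface: the function names, the column names, the
sample records and the cut-off value are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def add_arithmetic_column(rows, name, formula):
    """Return copies of rows with a derived column computed by formula(row).

    Rows with missing or invalid inputs get None instead of raising.
    """
    derived = []
    for row in rows:
        new_row = dict(row)
        try:
            new_row[name] = formula(row)
        except (KeyError, TypeError, ZeroDivisionError):
            new_row[name] = None
        derived.append(new_row)
    return derived


def add_discretised_column(rows, source, name, edges, labels):
    """Return copies of rows with `source` binned into labelled intervals.

    `edges` must be sorted ascending; labels[i] covers values from edges[i-1]
    (inclusive) up to edges[i] (exclusive), so len(labels) == len(edges) + 1.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    derived = []
    for row in rows:
        new_row = dict(row)
        value = row.get(source)
        new_row[name] = None if value is None else labels[bisect_right(edges, value)]
        derived.append(new_row)
    return derived


# Hypothetical, manually collected records (one value is missing).
patients = [
    {"id": "p1", "weight_kg": 30.0, "height_m": 1.25},
    {"id": "p2", "weight_kg": 52.0, "height_m": 1.50},
    {"id": "p3", "weight_kg": None, "height_m": 1.40},
]

# Arithmetic column: body mass index = weight / height squared.
result = add_arithmetic_column(
    patients, "bmi", lambda r: r["weight_kg"] / r["height_m"] ** 2
)
# Discretised column: the cut-off of 20.0 is illustrative only, not a clinical threshold.
result = add_discretised_column(
    result, "bmi", "bmi_band", edges=[20.0], labels=["below_20", "20_or_above"]
)

for row in result:
    print(row["id"], row["bmi"], row["bmi_band"])
```

Missing inputs propagate as empty values rather than stopping the computation, which is one plausible
way to handle manually collected data with gaps.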

WP 16 Biomedical Knowledge Discovery and Simulation for Model-guided Personalised Medicine


ATHENA’s integrated DCV and KDD platform provides an end-to-end, user-friendly, web-based analytics
platform which is ready to tackle the most challenging biomedical tasks. From data curation to knowledge
discovery, clinicians can explore and identify statistical profiles of their datasets. In cooperation with
clinicians, we are already working on incorporating the platform and the models it produces into the
regular clinical validation process. The automated classification system that was developed and assessed
to classify joint movement patterns for NND patients can be used to assist clinicians in classifying their
patients.

HES-SO: The hypothesis generation tool has been tested for clinical trial feasibility using clinical protocols
provided by ClinicalTrials.gov. Two approaches have been investigated: top-down (from trial to
reformulated queries) and bottom-up (from queries to protocol). A mix of the two seems to be the
most efficient approach.

WP18 Dissemination and Training


The impact of WP18 can be summarised as follows:
1) the dissemination materials produced provide a rather complete overview of the work performed
during the project, showcasing the various tools, their features and their potential clinical impact. As the
website will be maintained in the future, the produced information will remain available, hopefully also
making it possible to ignite new research initiatives, or to attract the attention of industrial stakeholders
and investors.
2) the exploitation initiatives conducted during the project led to opportunities for starting new ways to
cooperate with external partners, who were provided not only with enough information to obtain a good
understanding of the project, but also with an outlook regarding future research initiatives and potential
commercial developments. This was particularly apparent when preparing the new proposal for
MyHealthMyData, from the Consortium building to the content of the project.
3) the training activities demonstrated the actual usability of the tools by non-technical individuals, at the
same time highlighting the key requirements and the most urgent shortcomings to be addressed.
Furthermore, the scripts produced for the training can easily be transformed into user manuals for new
users to test these tools, contributing to their fine-tuning and subsequent market outreach.

WP19 Exploitation, HTA and medical device conformity


The “impact of the impact assessment” as developed and conducted in WP19 illustrated how
the transformation of bio-computational modelling and VPH technologies into a future patient-flow will
supplement and improve the current management of the specific diseases targeted by MD-Paedigree. Its
impact lies, moreover, in realising the goal behind the clinical and socio-economic assessment
perspective: to facilitate the testing of clinical application scenarios for bio-computational models and to
deliver support tools as well as empirical evidence for health system actors and decision makers,
exploitation planning and business modelling.

With regard to exploitation, as mentioned in the first section of this report on WP19, the
key goal of ensuring the basic sustainability of the project’s Infostructure has been achieved,
through the MD-Paedigree/CARDIOPROOF PCA, the joint participation in the new project
MyHealthMyData, and the OPBG-gNubila agreement for the PCDR maintenance. These results will make
it possible to keep both the technical tools and the collected datasets available for future research
initiatives and possible business development.

From a strategic standpoint, the various exploitation plans submitted in the course of MD-Paedigree have
provided an articulated outline of the business framework, market characteristics, business cases, relevant
stakeholders, potential users and customers. In particular, the multi-sided platform approach is a fairly
specific and compelling business model for a joint market outreach of the project’s results, which can be
conducive to operational convergence of various stakeholders (clinical institutions, research centres,
industries in the biomedical field), providing each of them with a specific value proposition and allowing
for a mutual value exchange. This approach, envisioned in particular within the Final exploitation plan and
in the two SME Instrument proposals, deserves to be further explored and fine-tuned, while waiting for
the relevant market to become mature enough to enable the implementation of such a vision.

Dissemination materials
Project logo
The figure below shows MD-Paedigree’s logo, which was agreed by the project partners at the project
kick-off meeting.

Posters

A poster summarising the MD-Paedigree effort was produced and is shown in the following
figure:

The MD-Paedigree infographic

Workshop and conferences posters


These posters were prepared to illustrate the project scope and results at international scientific
meetings, such as conferences, workshops and seminars, and were recently updated on the occasion of the
project Final Conference (Rome, 22-23 May 2017). (see D18.8)

Besides the disease-specific posters, two new posters were created for the Final Conference, displaying
photos taken during the whole project lifetime.

Figure 9. Posters realised for the project Final Conference showing events and important moments of the project lifetime.

Newsletters

Issue 1
The first issue was produced after the first six months of activity (November
2013). It was divided into various sections: a programmatic interview with the Project
Coordinator, a Project Overview, Highlight & Objectives section, and a third one
dedicated to the potential clinical applications in the various disease areas.
Another section was dedicated to the Infostructure Challenges. The issue
reported the most recent project achievements (Latest News), and the final
Special Feature section included a position paper on “Big Data Healthcare” (an
overview of the challenges in data intensive healthcare) by the Project
Manager.

Figure 10. The Issue 1 newsletter frontpage.

Issue 2
The second newsletter was released in May 2014. After an editorial by the
Project Coordinator (Prof. Bruno Dallapiccola), it focused on the outcomes of
the Internal Review, also including the comments received from the project
internal reviewers (Marco Bonvicini, Rolando Cimaz, Adam Shortland and
Alberto Sanna). Besides, it included the Action Leaders’ statements regarding
the assessment of the first year of activity (Giacomo Pongiglione, Clinical
Background Activities, User Requirements, Validation, Outcome Analysis,
Workflows; Olivier Ecabert, Modelling and Simulation; David Manset, MD-
Paedigree Infostructure). It also presented the new partner Deutsches
Herzzentrum Berlin (DHZB), which formally entered the consortium in 2014.
Additionally, a specific slot was devoted to highlighting some early results in
the metagenomic study area, as well as some other articles produced by a
number of consortium members on the themes of big data and personalised
medicine.

Figure 11. The Issue 2 newsletter frontpage.

Issue 3
The third issue of the newsletter was released in May 2015. After an editorial by
the Modelling and Simulation Action Leader and an overview of project and
VPH-related Events & News, the issue highlighted, in two dedicated sections, the
most relevant achievements of Disease Modelling in the four disease areas and the
updates in the Infostructure development and relevant tools implementation, as
well as the project alpha prototype. Besides displaying the
concertation efforts and activities carried out with other ICT funded projects
and VPH partners, a special Data Sharing Focus was dedicated to the topical
issues of health data, particularly the increasing concern about
personal data protection and, at the same time, the enormous potential in the
field of personalized medicine.

Figure 12. Issue 3 newsletter frontpage.

Issue 4-5
The double 4-5 issue of the newsletter was released in May 2016. The contents
included: an overview of the project’s latest dissemination materials and
initiatives, internal meetings and attended public events (Project News); a
clinical focus illustrating the Cardiomyopathies Clinical Scenario and a Clinical
impact assessment of the project’s personalised workflows; the latest
developments of MD-Paedigree models in the different disease areas (Modelling
section); an overview of the Machine Learning and Similarity Search tools
developed within the project Infostructure (Infostructure section); a Genetics
and Metagenomics focus introducing the role and importance of the gut
microbiota and illustrating some of the latest important results in the field; a special
insert on patient data management, exploring how new technologies can ensure
personal data protection.

Figure 13. Issue 4-5 newsletter frontpage.

Issue 6-7
The final double 6-7 issue of the newsletter, prepared and distributed at the
project Final Conference, was conceived as a concluding overview of the
project activity in the different scientific areas and of the major project
accomplishments, including scientific evidence, computer-based modelling
tools and relevant prototypes achieved throughout the project timeline,
their clinical validation and relevant application scenarios, together
with the Infostructure big data analytics. To this end, the newsletter included
five large sections dedicated to the four disease areas (cardiomyopathies,
cardiovascular risk in obese children and adolescents, juvenile idiopathic
arthritis, and neurological and neuromuscular diseases) and to the
infostructure-related big data analytics. Each one contained an introduction to the
clinical problem, research goals and possible application scenario (or the
experimental context, for the latter), and a preview of the section contents.
Individual articles were then dedicated to each of the main models or analytics
developed throughout the project, covering the baseline principle, implementation
workflow, validation in a clinical setting and potential impact for clinical decision
making and patient care. The issue also included an overview of the new research
initiatives fuelled by the MD-Paedigree Infostructure, and an exploitation
focus dedicated to the project exploitation perspectives in the healthcare industry.

Figure 14. Issue 6-7 newsletter frontpage.

Multimedia
Besides print-based materials, the Consortium also considered it important to produce video-based
dissemination materials (“dissemination objects”) to illustrate the project scope and
outcomes in a more immediate and direct way. To this end, the Consortium produced two different
series of videos: (i) an earlier one, made of interviews with researchers from the Consortium partners,
explaining the rationale, mission and state of the art of the project in the different research areas;
(ii) a more recent one, in which the tools that make up the Infostructure are explained in the form of
demo sessions. Both series have been made available on the YouTube channel and on the website as
YouTube plugins. Finally, (iii) a video of the Final Conference key moments and themes was produced
using conference interviews and shots.

Use and Dissemination of foreground

TEMPLATE A1: LIST OF SCIENTIFIC (PEER REVIEWED) PUBLICATIONS, STARTING WITH THE MOST IMPORTANT ONES

No. | Title | Main author | Title of the periodical or the series | Number, date or frequency | Publisher | Place of publication | Year of publication | Relevant pages | Permanent identifiers4 (if available) | Is/Will open access5 provided to this publication?
1 | A patient-specific foot model for the estimate of ankle joint forces in patients with juvenile idiopathic arthritis | USFD J.A.I. Prinold | Annals of Biomedical Engineering | Volume 44, number 1 | | | 2016 | 247-257 | https://link.springer.com/article/10.1007%2Fs10439-015-1451-z | Yes
2 | Gut microbiota profiling of pediatric nonalcoholic fatty liver disease and obese patients unveiled by an integrated meta-omics-based approach. | OPBG Del Chierico F | Hepatology. 2017 | Feb;65(2):451-464. doi: 10.1002/hep.28572. Epub 2016 Jun 2. | | | 2017 | 451-464. | | NO
3 | Bifidobacteria and lactobacilli in the gut microbiome of children with non-alcoholic fatty liver disease: which strains act as health players? | OPBG V Nobili | Arch Med Sci | DOI: 10.5114/aoms.2016.62150, | Termedia | | September 2016 | NA | file:///C:/Users/Utente/Downloads/AOMS_Art_28294-10%20(7).pdf | YES
4 | A comparison of two marker protocols for gait analysis in children with cerebral palsy | VUMC Jaap Harlaar | Gait & Posture | 49S | Elsevier | Amsterdam, The Netherlands | 2016 | 235 | |

4 A permanent identifier should be a persistent link to the published version (full text if open access or abstract if article is pay per view) or to the final manuscript accepted for publication (link to
article in repository).
5 Open Access is defined as free of charge access for anyone via Internet. Please answer "yes" if the open access to the publication is already established and also if the embargo period for open
access is not yet over but you intend to establish open access afterwards.

 | Children with Spastic Cerebral Palsy Experience Difficulties Adjusting Their Gait Pattern to Weight Added to the Waist, While Typically Developing Children Do Not | VUMC Kaat Desloovere | | Published in December 2016 | Frontiers in human neuroscience | | | | |
10 | Literature Review and Comparison of Two Statistical Methods to Evaluate the Effect of Botulinum Toxin Treatment on Gait in Children with Cerebral Palsy | VUMC Kaat Desloovere | PlosOne | Published in March 2016 | PlosOne | | | | |
11 | Statistical Parametric Mapping to Identify Differences between Consensus-Based Joint Patterns during Gait in Children with Cerebral Palsy | VUMC Angela Nieuwenhuys | PlosOne | Published in January 2017 | PlosOne | | | | |
12 | Inter- and intrarater clinician agreement on joint motion patterns during gait in children with cerebral palsy | VUMC Kaat Desloovere | Developmental medicine and child neurology | Published in February 2017 | Mac Keith Press | | | | |
13 | Prevalence of Joint Gait Patterns Defined by a Delphi Consensus Study Is Related to Gross Motor Function, Topographical Classification, Weakness, and Spasticity, in Children with Cerebral Palsy | VUMC Kaat Desloovere | Frontiers in Human Neuroscience | Published in April 2017 | Frontiers in Human Neuroscience | | | | |
14 | Longitudinal Parameter Estimation in 3D Electromechanical Models: Application to Cardiovascular Changes in Digestion | Roch Mollero, | A FIMH 2017 - 9th international conference on Functional Imaging and Modeling of the Heart. | | | Toronto, CA | 2017 | | |
15 | Longitudinal Analysis using Personalised 3D Cardiac Models with Population-Based Priors: Application to Paediatric Cardiomyopathies. | Roch Mollero, | Proc. Of MICCAI 2017, September 2017, Quebec, CA. | | | Quebec, CA | 2017 | | |
16 | Sensitivity of a juvenile subject-specific musculoskeletal model of the ankle joint to the variability of operator-dependent input. | USFD Hannah, I. | Journal of Engineering in Medicine | Volume 231 | SAGE Publications | | 2017 | 415-422 | Doi: 10.1177/0954411917701167 | Yes
17 | To what extent is joint and muscle mechanics predicted by musculoskeletal models sensitive to soft tissue artefacts? | USFD Lamberto G. | Journal of Biomechanics | NA – in press | Elsevier | | Available online since 2016 | NA – in press | Doi:10.1016/j.jbiomech.2016.07.042 | Yes
18 | Concurrent repeatability and reproducibility analyses of four marker placement protocols for the foot-ankle complex | USFD Di Marco R. | Journal of Biomechanics | Volume 49 | Elsevier | | 2016 | 3168-3176. | Doi: 10.1016/j.jbiomech.2016.07.041 | Yes
19 | GPU-accelerated model for fast, three-dimensional fluid-structure interaction computations | Nita C. | Annual Inter. Conf. of the IEEE Engineering in Medicine & Biology Society - EMBC 2015 | Yearly | IEEE | IEEE Xplore | 2015 | 965-968 | http://ieeexplore.ieee.org/document/7318524/ | No
20 | Optimized Three-Dimensional Stencil Computation on Fermi and Kepler GPUs | Vizitiu C. | 18th IEEE High Performance Extreme Computing Conference | Yearly | IEEE | IEEE Xplore | 2014 | 78-83 | http://ieeexplore.ieee.org/document/7040968/ | No
21 | GPU Accelerated Geometric Multigrid Method: Comparison with Preconditioned Conjugate Gradient | Stroia I. | IEEE High Performance Extreme Computing Conference | Yearly | IEEE | IEEE Xplore | 2015 | 1-6 | http://ieeexplore.ieee.org/document/7322480/ | No
22 | GPU Accelerated Information Retrieval Using Bloom Filters | Iacob A. | International Conference on System Theory, Control and Computing | Yearly | IEEE | IEEE Xplore | 2015 | 872-876 | http://ieeexplore.ieee.org/document/7321404/ | No
23 | GPU–Accelerated Texture Analysis Using Steerable Riesz Wavelets | Vizitiu A. | IEEE High Performance Extreme Computing Conference | Yearly | IEEE | IEEE Xplore | 2016 | 56-61 | http://ieeexplore.ieee.org/document/7445372/ | No
24 | GPU accelerated, robust method for voxelization of solid objects | Nita C. | IEEE High Performance Extreme Computing Conference | Yearly | IEEE | IEEE Xplore | 2016 | 50-55 | http://ieeexplore.ieee.org/document/7761582/ | No
25 | GPU Accelerated Geometric Multigrid Method: Performance Comparison on Different Architectures | Stroia I. | Inter. Conf. on System Theory, Control and Computing | Yearly | IEEE | IEEE Xplore | 2015 | 175-179 | http://ieeexplore.ieee.org/document/7321289/ | No
26 | Double precision stencil computations on Kepler GPUs | Vizitu A. | Inter. Conf. on System Theory, Control and Computing | Yearly | IEEE | IEEE Xplore | 2014 | 25-29 | http://ieeexplore.ieee.org/document/6982402/ | No
27 | GPU Accelerated Semantic Search Using Latent Semantic Analysis | Iacob A. | IEEE High Performance Extreme Computing Conference | Yearly | IEEE | IEEE Xplore | 2016 | - | - | No
28 | A Relational Approach to Complex Dataflows | Chronis I. | MEDAL 2016 (EDBT/ICDT Workshop) | Yearly | EDBT | CEUR-WS.org | 2016 | | http://ceur-ws.org/Vol-1558/paper45.pdf | Yes
29 | GPU-accelerated model for fast, three-dimensional fluid-structure interaction computations | Nita C. | Annual Inter. Conf. of the IEEE Engineering in Medicine & Biology Society - EMBC 2015 | Yearly | IEEE | IEEE Xplore | 2015 | 965-968 | http://ieeexplore.ieee.org/document/7318524/ | No
30 | Applying machine learning to gait analysis data for disease identification | Joyseeree Ranveer | MIE2015 | 2015 | Stud Health Technol Inform. | | 2015 | 5 | PMID: 25991275 | yes
31 | GPU-Accelerated Texture Analysis Using Steerable Riesz Wavelets | Vizitiu Anamaria | PDP2016 | 2016 | IEEE | Piscataway, NJ | 2016 | 4 | | yes
32 | Informatics for Health / Development and Evaluation of a Case-Based Retrieval Service | Pasche Emilie | IFH2017 | 2017 | Stud Health Technol Inform. | | 2017 | 5 | | No
33 | A Demo of Multimodal Medical Retrieval | Ranveer Joyseeree | CBMI 2016 | 2016 | CBMI | | 2016 | 5 | | yes
34 | Data Exploration: A Roll Call of All User-Data Interaction Functionality | Anna Gogolou | ExploreDB’16 | 2016 | ACM | San Francisco, CA, USA | 2016 | 31-33 | DOI:10.1145/2948674.2955105 |
35 | Applying machine learning to gait analysis data for disease identification | Joyseeree Ranveer | MIE2015 | 2015 | Stud Health Technol Inform. | | 2015 | 5 | PMID: 25991275 | yes
36 | A method for modeling surrounding tissue support and its global effects on arterial hemodynamics | Itu L.M. | IEEE International Conference on Biomedical and Health Informatics | Yearly | IEEE | IEEE Xplore | 2014 | 1-4 | http://ieeexplore.ieee.org/document/6864433/ | No
37 | Model Based Non-invasive Estimation of PV Loop from Echocardiography | Itu L.M. | Annual Inter. Conf. of the IEEE Engineering in Medicine & Biology Society | Yearly | IEEE | IEEE Xplore | 2014 | 26-30 | https://www.ncbi.nlm.nih.gov/pubmed/25571551 | No
38 | A method for modeling surrounding tissue support and its global effects on arterial hemodynamics | Itu L.M. | IEEE International Conference on Biomedical and Health Informatics | Yearly | IEEE | IEEE Xplore | 2014 | 1-4 | http://ieeexplore.ieee.org/document/6864433/ | No
39 | Model Based Non-invasive Estimation of PV Loop from Echocardiography | Itu L.M. | Annual Inter. Conf. of the IEEE Engineering in Medicine & Biology Society | Yearly | IEEE | IEEE Xplore | 2014 | 26-30 | https://www.ncbi.nlm.nih.gov/pubmed/25571551 | No
40 | Informatics for Health / Development and Evaluation of a Case-Based Retrieval Service | Pasche Emilie | IFH2017 | 2017 | Stud Health Technol Inform. | | 2017 | 5 | | No
41 | Model-driven paediatric cardiomyopathy pathways – a clinical impact assessment | Karl A. Stroetmann, Rainer Thiel | Building Capacity for Health Informatics in the Future - Studies in health technology and informatics | 2017 | F. Lau et al. (Eds.). | | | 309-314 | | Yes
42 | Prediction of inactive disease in juvenile idiopathic arthritis: a multicentre observational cohort study | E.H.P. van Dijkhuizen | | [In preparation] | | | | | | No
43 | Microbiome analytics in juvenile idiopathic arthritis patients: an observational, longitudinal cohort study | E.H.P. van Dijkhuizen | | [In preparation] | | | | | | No
44 | Ultrasound changes in synovial abnormalities induced by treatment in juvenile idiopathic arthritis | S. Lanni | Clinical and Experimental Rheumatology | [In press] | | | | | | No
45 | Does Expert Knowledge Improve Automatic Probabilistic Classification of Gait Joint Motion Patterns for Children with Cerebral Palsy? | Kaat Desloovere | PlosOne | Under review | | | | | |
46 | Gait pattern changes in children with cerebral palsy after the administration of botulinum toxin injections | Eirini Papageorgiou | Gait & Posture | Published (abstract; paper in preparation) | | | | | |
47 | Are baseline joint patterns in the sagittal plane indicative for the success of botulinum toxin injections in children with cerebral palsy? | Eirini Papageorgiou | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
48 | Functional power training in children with CP: Does increased sprinting capacity lead to improvement in gait kinematics? | Marjolein van der Krogt | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
49 | Reducing knee joint crosstalk using PCA correction | Jaap Harlaar | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
50 | A literature overview and novel strength assessment to evaluate the association between muscle weakness and gait pathology in children with Cerebral Palsy. | Marije Goudriaan | PlosOne | Under review | | | | | |
51 | Differences in synergy complexity during gait between children with Cerebral Palsy and Duchenne Muscular Dystrophy | Marije Goudriaan | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
52 | Differences in co-contraction level between CP and TD children during a functional and isometric strength assessment. | Marije Goudriaan | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
53 | Gait deviations in children with Duchenne Muscular Dystrophy can be directly attributed to muscle weakness in two lower limb muscle groups. | Marije Goudriaan | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
54 | Is ultrasound characterization of tissue composition related to rate of force development in children with Duchenne Muscular Dystrophy? | Marije Goudriaan | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
55 | Functional power training in children with CP: Does increased sprinting capacity lead to improvement in gait kinematics? | Marjolein van der Krogt | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
56 | Reducing knee joint crosstalk using PCA correction | Jaap Harlaar | Gait & Posture | Submitted (abstract; paper in preparation) | | | | | |
57 | Global Longitudinal Strain is More Strongly Related to Cardiovascular Outcome than NT-proBNP or Ejection Fraction in Children with Chronic Heart Failure: the MD-Pedigree study | M Chinali | Abstract presentation | | | | | | |
58 | Echocardiographic Accuracy in Evaluating Cardiac Geometry, Function and Ventricular Strain in Children with Chronic Heart Failure from the MD-Pedigree study: Comparison with Cardiac MRI. | A Ricotta | Abstract presentation | | | | | | |
59 | Patient-Specific Computer Heart Model in Children with Dilated Cardiomyopathy: a Useful Tool to Guide Beta-Blocker Therapy in Children with Heart Failure | M Chinali | Abstract presentation | | | | | | |

TEMPLATE A2: LIST OF DISSEMINATION ACTIVITIES

No. | Type of activities6 | Main leader | Title | Date/Period | Place | Type of audience7 | Size of audience | Countries addressed
1 | Conference | WP4 | "Health Promotion in young age" Practical prevention of cardiometabolic abnormalities. Lesson from the Origin study to the MD Paedigree project. | June 2016 | Budapest, Hungary | Talk | |
2 | Conference | WP5 | 23rd Paediatric Rheumatology European Society conference. The composition of the gut microbiota differs between children with JIA and healthy controls | September 28 – October 1, 2016 | Genoa, Italy | Speaker | |
3 | Conference | WP 6 | European Society of Movement Analysis in Adults and Children (ESMAC) | 26/9/16-1/10/16 | Seville, Spain | Speaker | |
4 | Conference | WP6 | Dutch Biomedical Engineering conference (Dutch-BME) | 26/1/17-27/1/17 | Egmond aan Zee, The Netherlands | Speaker | |
5 | Conference | WP6 | European Academy of Childhood Disability (EACD) | 17/5/17-20/5/17 | Amsterdam, The Netherlands | Speaker | |
6 | Conference | WP11 | FIMH 2017. Maxime Sermesant. When Cardiac Biophysics meets Group-wise Statistics | June 2017 | Toronto, CA | Keynote Speaker | |
7 | Workshop | WP11 | Advanced rehabilitation technology to assess pathological | 042016 | Egmond aan Zee | Speaker and organizer | |
8 | Conference | WP15 | SIB Days / Quantitative Evaluation of a Case-Based Retrieval Service (E Pasche, J Gobeill, L Mottin, P Ruch) | 7-8 June 2016 | Biel, CH | Poster presenter | |
9 | Conference | WP15 | Informatics for Health / Development and Evaluation of a Case-Based Retrieval Service (E Pasche, M Chinali, J Gobeill, P Ruch) | 24-26 April 2017 | Manchester, UK | Speaker | |
10 | workshop | WP16 | ExploreDB '16 (Third International Workshop on Exploratory Search in Databases and the Web) at the 2016 ACM SIGMOD/PODS Conference @ San Francisco, USA. | 1/7/2016 | San Francisco, California, USA | Speaker | |
11 | Conference | WP14 | 2016 IEEE High Performance Extreme Computing Conference (HPEC). Nita, C., Stroia, I., Itu, L.M., Suciu, C., Mihalef, V., Datar, M., Rapaka, S., Sharma, P. GPU accelerated, robust method for voxelization of solid objects, 20th IEEE High Performance Extreme Computing Conference, Waltham, MA, USA, Sept. 13-15, 2016. | 13 – 15 Sept. 2016 | Waltham, MA, USA | Speaker | |
12 | Conference | WP14 | 2016 IEEE High Performance Extreme Computing Conference (HPEC). Iacob, A., Itu, L., Sasu, L., Moldoveanu, F., Nita, C., Foerster, U., Suciu, C. GPU Accelerated Semantic Search Using Latent Semantic Analysis, 20th IEEE High Performance Extreme Computing Conference, Waltham, MA, USA, Sept. 13-15, 2016. | 13 – 15 Sept. 2016 | Waltham, MA, USA | Speaker | |
13 | Conference presentation | WP19 | Model-driven paediatric cardiomyopathy pathways – a clinical impact assessment, Presentation given at the ITCH 2017, BUILDING CAPACITY FOR HEALTH INFORMATICS IN THE FUTURE, February 16 – 19, 2017 at Inn at Laurel Point, Victoria, BC, Canada | Feb 2017 | Global | Speaker | |
14 | Abstract presentation | WP12 | 51st Annual Meeting of the Association for European Paediatric and Congenital Cardiology (AEPC). | March, 2017 | Lyon, France | Speaker | |

6 A drop down list allows choosing the dissemination activity: publications, conferences, workshops, web, press releases, flyers, articles published in the popular press, videos, media briefings,
presentations, exhibitions, thesis, interviews, films, TV clips, posters, Other.
7 A drop down list allows choosing the type of public: Scientific Community (higher education, Research), Industry, Civil Society, Policy makers, Medias, Other ('multiple choices' is possible).

TEMPLATE B1: LIST OF APPLICATIONS FOR PATENTS, TRADEMARKS, REGISTERED DESIGNS, ETC.
Type of IP Rights8 | Confidential (Click on YES/NO) | Foreseen embargo date dd/mm/yyyy | Application reference(s) (e.g. EP123456) | Subject or title of application | Applicant(s) (as on the application)
Submitted as patent | YES | | US 15/469,310, EP 17162746.6 | DeepReasoner: Case-based reasoning in the cloud using deep learning | Olivier Pauly and Martin Kramer (Siemens Healthcare GmbH)
Submitted as patent | YES | | US 14/973,345 | Personalized whole-body circulation in medical imaging | Tommaso Mansi, Lucian Itu, Viorel Mihalef, Dominik Neumann, Tiziano Passerini, Puneet Sharma, Dorin Comaniciu (Siemens AG)
Patents | NO | | US20170116748, WO2015191414A3, EP3152736A2, CN106605257A | Landmark detection with spatial and temporal constraints in medical imaging | I Voigt, M Scutaru, T Mansi, R Ionasec, HC Houle, AV Tatpati, D Comaniciu, B Georgescu, NY El-Zehiry
Patents | NO | | WO 2015/031576 | Systems and methods for estimating physiological heart measurements from medical images and clinical data | D Neumann, T Mansi, S Grbic, B Georgescu, A Kamen, D Comaniciu, I Voigt
Patents | NO | | US 20150242589 A1 | Method and System for Image-Based Estimation of Multi-Physics Parameters and Their Uncertainty for Patient-Specific Simulation of Organ Function | D Neumann, T Mansi, B Georgescu, A Kamen, D Comaniciu
Patents | NO | | EP 3043276 A2 | Personalized whole-body circulation in medical imaging | D Comaniciu, LM Itu, T Mansi, V Mihalef, D Neumann, T Passerini, P Sharma
Patents | NO | | US 14/973,345 | Personalized whole-body circulation in medical imaging | T Mansi, L Itu, V Mihalef, D Neumann, T Passerini, P Sharma, D Comaniciu
Patents | NO | | US 2016E24528 | Integration of 3D Cardiac Valve Kinematics with a Dynamic Heart Model | V Mihalef, S Grbic, T Mansi
Patents | NO | | US 62/301,901 | Flow-aware intra-cardiac segmentation | V Mihalef, P Sharma, T Heimann
Patents | NO | | US 62/337,948 | System and methods for robust voxelization of solid objects | L Itu, V Mihalef, C Nita, P Sharma, S Rapaka, I Stroia, C Suciu, M Datar
Patents | NO | | US 62/218,160 | Pressure-Driven Computational Heart Model for Comprehensive Physiology Assessment and Therapy Planning | L Itu, V Mihalef, T Passerini, T Mansi
Patents | NO | | US 14/973,345 | System and methods for personalized computation of the whole-body circulation from medical images and signals | L Itu, D Neumann, V Mihalef, P Sharma, T Mansi, D Comaniciu, T Passerini

8 A drop down list allows choosing the type of IP rights: Patents, Trademarks, Registered designs, Utility models, Others.

Part B2
Please complete the table hereafter:

| Type of Exploitable Foreground9 | Description of exploitable foreground | Confidential (YES/NO) | Foreseen embargo date (dd/mm/yyyy) | Exploitable product(s) or measure(s) | Sector(s) of application10 | Timetable, commercial or any other use | Patents or other IPR exploitation (licences) | Owner & Other Beneficiary(s) involved |
|---|---|---|---|---|---|---|---|---|
| Diagnostic tools | Patented diagnostic record | YES | All 2018 | Records, laboratory exams | 1. Medical and diagnostic; 2. Industrial | 2018-2019 | A full patent material is planned for 2018 | OPBG; OPBG-GEMELLI |
| Clinical tools | Fecal microbiota transplantation, FMT | no | no | Experimental FMTs | Clinical applications and treatments | from now | NO | - |
| Diagnostic and Clinical tools | Databases of omics data for diagnostic and clinical purposes | YES | NO | Records, laboratory exams | | in progress | NO | OPBG |
| | Consensus gait analysis protocol | No | | Protocol | | | | VUmc, KUL, OPBG |
| | Comprehensive data set of 864 measurements for CP, DMD and CMT | Yes | | Data set | | | | VUmc, KUL, OPBG |
| | Instrument for decision support in healthcare | NO | NO | Software | Electronic health records, Hypothesis Generation in Clinical R&D | 2018: Implementation at large Swiss Hospitals | No patents are planned. Licenses are being discussed with Swiss Hospitals | HES-SO Geneva |
| | Service for facilitating the retrieval of past similar patients based on image-related properties of the new patient | NO | NO | Software | Clinical Decision-Making | 2017: Implementation within MD-PAEDIGREE | No patents are planned | HES-SO Valais |
| Commercial exploitation of R&D results | Improved 3D cardiac segmentation for MRI images (MRIBuildR) | YES | N/A | Application for cardiac segmentation in cine MRI images | Medical | N/A | N/A | Owner SHC |
| General advancement of knowledge | Heart valve modeling improvements by streamlining semi-automation in challenging cases | NO | - | Quantification for diagnostics, monitoring, intervention planning in structural heart disease | Medical | N/A | US20170116748, WO2015191414A3, EP3152736A2, CN106605257A | Owner SHC |
| Commercial exploitation of R&D results / General advancement of knowledge | Personalized electrophysiological and biomechanical modelling of the heart | NO | - | Advanced patient stratification, clinical decision support and therapy planning for patients suffering from cardiomyopathies | Medical | N/A | WO 2015/031576, US 20150242589 A1 | Owner SHC |
| General advancement of knowledge | Tool for planning beta-blocker therapy | YES | N/A | MRI, CT | Medical | N/A | N/A | Owner SHC |
| General advancement of knowledge | Fast mechanical parameter estimation | NO | - | Diagnosis and prognosis tools | Medical | 2020 | Inria Software | Owner Inria |
| General advancement of knowledge | Reduced model of cardiac motion | NO | - | Diagnosis and prognosis tools | Medical | 2020 | Inria Software | Owner Inria |

9 A drop down list allows choosing the type of foreground: General advancement of knowledge, Commercial exploitation of R&D results, Exploitation of R&D results via standards, exploitation of results through EU policies, exploitation of results through (social) innovation.
10 A drop down list allows choosing the type sector (NACE nomenclature) : http://ec.europa.eu/competition/mergers/cases/index/nace_all.html

In addition to the table, please provide a text to explain the exploitable foreground, in particular:

• Its purpose
• How the foreground might be exploited, when and by whom
• IPR exploitable measures taken or intended
• Further research necessary, if any

WP7
Implementation of biobanks of microbiota samples;
Extension of microbiota profile databases;
Design of further patents for diagnostic applications of microbiota profiles

WP8
Improved 3D cardiac segmentation for MRI images (MRIBuildR)
Developed as part of T8.1, MRIBuildR is a fast and robust tool to segment and track cardiac chambers over
the entire heart cycle. Enhanced editing capabilities and easy deployment make it user-friendly and
compatible with the clinical environment, and the resulting cardiac anatomical models can be passed seamlessly to
advanced modelling tools (e.g. EP personalization). Current MRI product code may be combined with the 3D
modelling and detectors for LV/RV segmentation developed for MRIBuildR and made available as part of a
modular framework (e.g. browser application) or as a sub-application within a workflow (e.g. Virtual CRT).
During the course of this project, MRIBuildR was deployed and tested at one clinical site; a more
comprehensive test strategy will be required prior to commercial exploitation.

Heart valve modelling improvements by streamlining semi-automation in challenging cases


• Purpose: extend cardiac modelling capabilities, particularly heart valve modelling, towards paediatric
and growth-restricted populations (e.g. ASEAN)
• Foreground exploitation: Streamlined semi-automatic extensions to existing Siemens products for
cardiology research and clinical routine
• IPR exploitable measures taken or intended: patent applications
• Further research: to be extended towards full automation using novel machine learning techniques
and data augmentation
• Potential/expected impact: TBD

Personalized electrophysiological and biomechanical modelling of the heart


Advanced biophysical parameters (e.g. electrical conductivity of the myocardium, myocardial stiffness,
contraction force, etc.) cannot be measured or extracted directly from routinely-acquired clinical data (e.g.
MR images, ECG, etc.), but they can be crucial for precise patient stratification, disease course prediction,
clinical decision-making, and therapy planning. The purpose of the developed computational model
personalization methods is to estimate such parameters by integrating various types of patient data into
patient-specific whole-heart models. The developed methods have been patented by SHC, or are in the
process of being patented, and may be used in SHC’s future products and services (e.g. advanced diagnosis tools or
clinical decision support systems). The main potential impact is reduced overall expenditures of healthcare
systems through 1) improved patient stratification (more accurate, more efficient), and 2) model-based
therapy planning and therapy outcome prediction, which may help to reduce unnecessary interventions.

Tool for planning Beta-Blocker Therapy


The purpose of the tool is to employ personalized reduced-order modelling for supporting physicians in
planning patient-specific beta-blocker therapies. Further research is required to validate the model and the
predictions at larger scales, in rigorous clinical studies.

Fast Mechanical Parameter Estimation


• Enables estimation of intrinsic parameters of cardiac tissue (stiffness, contractility, compliance)
• Such parameters can help in characterising a disease and in predicting its evolution
• The in-house software developed will be protected by the French agency of software protection
• Further research needed is a clinical study
• Potential impact in reduced number of exams needed to follow up patients and in anticipating
important cardiac events

Reduced model of cardiac motion


• Enables representation of cardiac contraction and relaxation in a compact way
• Enables building statistics on motion and comparing them against a group of patients / controls, for diagnosis
and prognosis
• The in-house software developed will be protected by the French agency of software protection
• Further research needed is a clinical study
• Potential impact in diagnostic tools and longitudinal evolution of patients

WP 15
HES-SO Valais provide a service that allows searching the MD-Paedigree repository through images
input into the system by the user. This service works in tandem with text-based search to provide the most
relevant hits to the user. HES-SO Valais contend that image-based retrieval alone and in combination with
text-based search will facilitate the process of finding similar past patients to current ones and, thereby,
provide effective clinical decision support in terms of diagnosis and treatment planning.

FINAL REPORT ON THE DISTRIBUTION OF THE EUROPEAN UNION FINANCIAL CONTRIBUTION

Appendix 1: Outcome of the Fourth Internal Review
The fourth and final Internal Review took place in Rome, on May 23rd 2017, during the Final Conference,
kindly hosted by P1 OPBG. The Conference offered a concrete opportunity to assess the final status of
deliverables and progress of tasks in each disease area.

MD-Paedigree – Fourth internal Review meeting (Rome, 22nd-23rd May 2017) - Governing Board’s Final Statement
MD-Paedigree has been focusing on developing and validating patient-specific computer-based predictive
models of various paediatric diseases, for new personalised predictive medicine workflows at the point of
care. As a clinically-driven and strongly VPH-rooted project, it has improved interoperability of paediatric
biomedical information, data and knowledge by developing together a set of reusable and adaptable multi-
scale models for more predictive, individualised, effective and safer paediatric healthcare.

After four and a half years the project has come to an end with the vast majority of its ambitious goals being
met, in some relevant cases exceeded. With more than 630 patients recruited, assessed and followed up, the
clinical validity and applicability of both analytics and simulation technologies have been extensively proven,
even though delays in the collection and management of patient data on some occasions threatened to delay
research and development results.

One of the major achievements has been the collection, curation and harmonized aggregation of both
research and routine clinical data, for more than 7 million episodes of care, ranging from 3D MRI scans to
metagenomic profiles. The Infostructure so populated remains one of the largest curated data repositories
for clinical and in-silico medicine research in Europe and is already being used by other consortia.

One of the major challenges in the project has indeed been managing disparate data sources, especially the
integration of multimodal datasets. This crucial step presented unforeseen issues in terms of real-time,
automated anonymization and data format reconciliation, which required increased coordination and
dedicated efforts from both clinical and technical partners, all of whom contributed very generously to their
resolution. For instance, technical tools were created to facilitate data transfer and integration and, in
some cases, manual anonymization was carried out in clinical centres while anonymizing tools were perfected.

After the Third Biannual Meeting held in Crete in September 2015, clinical data quality managers at each
clinical and technical centre were appointed to manage this complex process. This allowed modelling and
analytics activities to make substantial progress. These developments were showcased at the Final
Conference in Rome, attended by 121 participants from 11 countries. All mechanistic models in the
respective clinical areas reached the intended level of technological maturity, being able to predict relevant
and actionable clinical parameters. Their design has been directly informed by physicians’ feedback and
refined iteratively on the basis of hands-on user testing and clinical validation in prospective clinical studies.
The feedback and training sessions for cardiac models, data curation tools and analytics systems were
supported by printed training booklets for each tool, forming the basis of more comprehensive “User
Manuals” that have been completed and published.

Validation activities occupied most of the final year. The leader of WP12 (Models validation, outcome
analysis and clinical workflows) facilitated this crucial process by defining upfront, for clinical and technical
partners, a 3-level validation framework (D12.1), from technological to properly clinical tests. On this basis,
a complete validation protocol was developed for each modelling and analytics module, including clinical use
case and end points, target parameters, sample population and statistical methods. In parallel, detailed
workflow analyses were conducted to assess how mechanistic models, specifically in the area of
cardiomyopathies, could assist in allocating patients to the most appropriate treatment and monitoring
protocol, based not on their traditional clinical categories but on the more accurately predictive indications
generated by the model (see below). These economic analyses uncovered large potential savings, in relation
to reduced consumption of hospital resources, improved quality of life and extended life expectancy.

The area of metagenomics was one of remarkable success, having analyzed more than [--] in three European
regions, which were profiled through both microbiome genetic sequences and metabolites quantifications.
Advanced analytics revealed in these populations direct and statistically significant relationships with
cardiometabolic syndrome in obese children at risk of obesity, with specific radiology markers, such as liver
fat, while corroborating other previous findings. The JIA portion of this study has provided unprecedented
information pointing to meaningful changes in species richness and evenness at baseline, flare and remission.
One major aspect addressed by these studies was the integration of microbiome analysis into clinical routine
diagnostics, which is still an open issue. This goal was reached by using an innovative way of reporting
OTUs (phyla, genera and species) and metabolites (volatilome) through the fold-change rank ordering
analysis (FCROA), expressing the different OTUs/metabolites on a log2 scale. An original microbiota-based
predictive model was developed for JIA children by processing multidimensional metagenomic and
metabolomic profiles and reducing them to a low-dimensional fused data set. Partial least squares discriminant analysis
identified several OTUs and metabolites that were integrated, as variables important for prediction, to
produce the correct classification rate model, which yielded significant predictive figures for JIA vs
controls (CTRLs), and for patients in different stages of disease with respect to CTRLs.
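
The idea of reporting OTUs or metabolites by fold change on a log2 scale and then rank-ordering them can be illustrated with a minimal sketch. This is not the project's FCROA implementation, whose details are not given in this report. The feature names, abundance values, the use of group means and the pseudocount are all assumptions made for illustration only.

```python
import math


def log2_fold_changes(patients, controls, pseudocount=1e-6):
    """Return {feature: log2(mean in patients / mean in controls)}.

    `patients` and `controls` map a feature name (OTU or metabolite) to a
    list of relative abundances. The pseudocount (an assumption) avoids
    division by zero when a feature is absent from one group.
    """
    fold_changes = {}
    for feature, patient_values in patients.items():
        control_values = controls[feature]
        mean_patients = sum(patient_values) / len(patient_values)
        mean_controls = sum(control_values) / len(control_values)
        ratio = (mean_patients + pseudocount) / (mean_controls + pseudocount)
        fold_changes[feature] = math.log2(ratio)
    return fold_changes


def rank_by_fold_change(fold_changes):
    """Order features from most enriched to most depleted in patients."""
    return sorted(fold_changes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical relative abundances, for illustration only.
patients = {"OTU_A": [0.40, 0.40], "OTU_B": [0.10, 0.10], "OTU_C": [0.05, 0.05]}
controls = {"OTU_A": [0.10, 0.10], "OTU_B": [0.10, 0.10], "OTU_C": [0.20, 0.20]}

ranked = rank_by_fold_change(log2_fold_changes(patients, controls))
for feature, value in ranked:
    print(f"{feature}: {value:+.2f}")
```

On a log2 scale a value of +1 corresponds to a doubling and -1 to a halving relative to controls, so enrichment and depletion of the same magnitude appear symmetric around zero in the ranked report.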

The integration of similarity search tools implemented by SIEMENS, HES-SO, and Athena led to the
creation of a multifunction, single-user-interface tool running DeepReasoner for multidimensional queries on
diverse structured data and Case-based Retrieval to query unstructured texts (discharge summaries). This
tool has been tested by clinicians for its capacity to identify and display cases that contribute relevant information to
current patients’ management, which was indeed demonstrated in the vast majority of cases. The tool thus
proved to provide clinicians with precise, relevant and clinically actionable information to support patient
management.

Key to this effort was the updated Data Curation and Validation (DCV) tool that includes Knowledge Discovery
(KDD) components. Similarity-based classifiers were trained on seven NND classification joint-movement
patterns. Prediction accuracies above 80-85% were achieved for all joints. DCV has been published
(http://dl.acm.org/citation.cfm?id=2955105) and can be used via the MD-Paedigree platform but also online
at http://shovel.madgik.di.uoa.gr:9090.

Finally, exploitation has been a major focus of MD-Paedigree’s meetings at this advanced stage of the project.
With the assignment of the MyHealth-MyData project grant, many of the analytics tools developed during
the project will be exposed to a large population of physicians and patients in two decision support use cases
(patients like mine, patients like me) over a blockchain-enabled data sharing infrastructure. The Infostructure
will be extended further to support advanced privacy and security models in healthcare.

Discussion on potential markets, possible product and services features, and steps required for a commercial
launch took place. Along these lines, two EU-funded SME Instrument proposals exploring the concept of
multisided platforms were submitted, but were unsuccessful.

The obesity WP leader, in conjunction with SHC, developed a screening and risk management protocol for
cardiometabolic disease. In a longitudinal diagnostic process, the protocol leverages ‘low cost’ data first, such as patient
questionnaires, on the basis of which increasingly invasive and costly exams can be performed. This probabilistic
model could be used as the underlying logic of an end-user tool with appropriate application-level functionality,
in a mobile or web-based implementation, to assist with early identification of cases at risk with minimal
screening costs. To this end, a pilot multi-centric study was conducted to collect a cross-sectional dataset of
approximately 160 children (including more than 100 obese), consisting of questionnaire, anthropometric,
genetic, clinical and imaging data. Using their MRI data, parameters characterizing different aspects of
their cardiac function as well as their liver, subcutaneous and visceral fat distribution were extracted.
Extensive cross-validation experiments were conducted to evaluate the different hypotheses linking the
parameter domains at different levels of screening for identifying patients at risk. These experiments
yielded many interesting results for inferring the risk with respect to a particular intermediate
outcome. Notably, early information gained from questionnaires or postal tests provided good results: for
instance, questionnaire data for predicting compliance, microbiome and BMI information for predicting
liver fat ratio, or microbiome and clinical information for predicting change in BMI. This suggests that the
proposed screening is a viable strategy to identify young patients at risk.

A major application of cardiac mechanistic models has been developed by the cardiomyopathies working
group, in which a lumped parameter hemodynamic model of the heart has been used to predict how changes
in heart rate and blood pressure impact patient-specific cardiac function; this was validated on 18
patients. Moreover, anatomical, electro-physiological and biomechanical models have been run for more
than 100 cases. The systems’ personalization parameters are currently being analysed statistically for correlation
with clinical outcomes.

The Neurological and Neuromuscular Disease area has also achieved remarkable success by developing
patient-specific gait simulations that accurately predict pathological and post-surgical postures and dynamics
from tagged CT scans and clinical exams.

The JIA area was subject to both analytics and modelling work. Regarding the former, while deep and extensive
statistical analysis highlighted some individual factors associated with disease evolution, these variables
taken together could not predict disease activity with satisfactory accuracy across all patients.
Nevertheless, upon reducing the heterogeneity underlying JIA by analysing specific categories separately,
models with adequate predictive ability in test data could be developed. Thus, these results point towards
the importance of a better definition of the pathologies underlying JIA, thereby reducing the heterogeneity
among patients. Our results provide the basis to explore the prediction of disease evolution more fully in
such better-defined subgroups of the disease. On the modelling side, the JIA group was highly successful in
collecting and processing a unique dataset of clinical, imaging and biomechanical variables and, by leveraging
these longitudinal data, proved the effectiveness of the developed models in predicting patient-specific
walking patterns and relevant changes in light of varying symptom severity and response to treatment.

While MD-Paedigree achieved the vast majority of its intended outcomes and in some cases exceeded them,
for the sake of scientific openness and proactive collaboration with the European Commission, the
Consortium deems it beneficial to also list and analyse some relevant project shortcomings and their causes.
Patient data collection
Thanks to the project’s extension, delays in data collection did not, in themselves, impact modeling
operations, but issues in data integration and sharing, driven by complexity in anonymization and data
formatting, were nevertheless significant. The configuration of anonymization tools at clinical centres was in
retrospect too complex for clinical users and, after anonymization defects were noted on test data, alternative
solutions were attempted. None of these was straightforward or free of privacy risks, so ultimately manual
anonymization and upload, although highly time-consuming, was adopted as the only viable method for the
data subsets at stake.

Formatting issues, mostly related to MRI images, while less severe, also impaired data sharing. In this case,
problems were related to acquisition protocols and special DICOM configurations used at different clinical
centres. This problem also required manual data clean-up.

These issues made roughly 20 cardiology records inaccessible. While these were in the end not used, other
ongoing projects are set to make use of them. In addition, the statistical validity of simulations and analytics
was not impacted.

Limits in the socio-economic impact analysis


The scope of the socio-economic impact analysis was reduced during the intermediate phases of the project
to the investigation of only one cardiology use case. This decision stemmed from two considerations. First, the
maturity of other models, partially impacted by the problems listed above, did not offer enough clinical
perspective to investigate their real-world impact. Second, focusing in depth on one use
case and fully addressing it, rather than distributing efforts to describe multiple scenarios, was considered
more appropriate. In the Consortium’s view, the detailed study of one clinical use case was better suited to
provide valuable evidence, to be then extrapolated to other contexts, than a more superficial
treatment of different areas.

Lack of involvement from patient associations


Patient associations were included in the original plan among other external stakeholders to provide
feedback and strategic insights. Ultimately this was not accomplished due to multiple factors. Resource
constraints arose as the final conference’s budget exceeded its initial allocation, leading the organizers to
restrict funding. In addition, the clinical scenarios more relevant to these groups, ‘patients like me’, did not
achieve full maturity due to structural limits (personal data sharing outside of research contexts) in the scope
of the project and the whole field. The ongoing MHMD project is addressing these issues and prioritizing the
involvement of such groups accordingly.

Other minor shortcomings


NND: the automated mapping of muscle insertion points presented technical challenges that made the
development unfeasible. The clinical value of the procedure was also in question due to the fact that clinical
palpation can provide similar information at much lower cost and finally the decision was made to forgo
further development.

Review by Prof. Tammo Delhaas, assessing the progress of the Cardiomyopathies study WPs 3, 4, 8 and 9

Faculty of Health Medicine and Life Sciences


MD-Paedigree final conference May 22nd -23rd, 2017
held in Rome – Ospedale Pediatrico Bambino Gesù
Review of Workshop on:
‘The importance of modifying cardiologic risk factors in children and adolescents’
Review by: Tammo Delhaas, MD, PhD - Consultant Pediatric Cardiologist - Professor and Chair of Biomedical
Engineering Maastricht University, Maastricht, The Netherlands

Overall impression: All participants were highly motivated and eager to discuss their own results as
well as the results of other participants. The atmosphere was open and people were willing to
explore the strengths as well as the weaknesses of their work.
@Andrew Taylor: He gave an excellent overview of the pandemic problem of increased incidence
of (morbid) obesity in childhood and pointed out the cardiovascular burden imposed by obesity.
@Franziska Degener: She introduced laymen in the audience to the pathophysiology and clinical
picture of dilated cardiomyopathy in childhood.
@Marcello Chinali: He showed how the patient-specific heart models were finally used in the
prediction of outcome (and therapy) in pediatric dilated cardiomyopathy. Although it
became apparent that OPBG lacked the ICT infrastructure to store and recall clinical data, thus
hampering the possibilities for input to the models, the work showed that especially the actual level of force
development in both ventricles at the moment of presentation predicted outcome. A drawback of the
methods used was that, in the end, the outcomes of multimodality models were used as input for a linear
regression model. This reviewer finds it a missed opportunity that the individual models were not
used to test effects of interventions. This reviewer also strongly advises being very specific when
using the word ‘model’: whether it refers, e.g., to a Finite Element Model, to a Fluid-Structure
Interaction model, or to a lumped parameter model of the heart and circulation.
@Alex Jones: The most interesting result shown by this speaker was the fact that the beta-adrenergic
response of the gut in obese people might be responsible for sustained hypertension. This might
explain why renal denervation doesn’t work to combat hypertension.
@Olivier Pauly: He showed how machine learning could help to identify the individual course of
BMI development before/after an intervention. Both Jones and Pauly were modest about their
results, because the size of the population on which they had data to feed the model was not large
enough. Hence, they called the results preliminary; it was about developing a concept and testing
whether it could be right.
@Tobias Heimann: He introduced us to multiscale simulation, in which a modeling pipeline from
anatomy to electrophysiology, biomechanics and finally circulation was used. He also introduced us
to the various models used in the work packages on pediatric cardiomyopathies and on
cardiovascular disease risk in obese children and adolescents (these models were subsequently
demonstrated by the persons in charge of the specific model). It seems to this reviewer that a leader
with a helicopter-view, skilled in both modeling and pediatric cardiology, was missing in this work
package as evidenced by the fact that all models used are stand-alone models that were not truly
integrated.
@Karl Stroetmann: He shared his results on the possible financial impact of implementing modeling
in the treatment of pediatric dilated cardiomyopathy. His choices for the probabilities used in his
stochastic prediction model were well thought out, but this reviewer finds it a missed opportunity
that they neither tested multiple settings nor performed a sensitivity analysis in order to identify the
factors that are most rewarding to chase.
@Panel discussion: This discussion mainly focused on (the ownership of) data: how can data be
shared between institutes, semi-anonymity, the patient owns his/her own data.
Concluding remarks:
Approaching now its conclusion, these two work packages of MD-Paedigree have confirmed the
high level of scientific competence of its Partners and the excellent quality of most of its work. The major
challenge will be to truly integrate the patient-specific models with each other and within daily
clinical practice.

Maastricht, June 24th, 2017



Review by Prof. Rolando Cimaz, assessing the progress of the JIA Study – WP5 – WP10

Florence, May 25th, 2017

I have been involved as internal reviewer in the final evaluation of the JIA part of MD-Paedigree
project, and asked to submit this report. Having taken part in all the previous annual meetings, I have seen
the project grow and can say that it has substantially met the initial requirements.
With regard to the total number of inclusions, it has reached its expected target of about 170 patients
with Juvenile Idiopathic Arthritis. The project has been very ambitious and consisted of different parts: 1.
imaging (ultrasound and MRI), mostly performed in Genova (G Gaslini); 2. cytokine profile (Utrecht);
3. microbiome (Bambino Gesù); and 4. biomechanics (Sheffield).
All these components have achieved important goals, with clinically relevant scientific
advancements. What has been more challenging has been the integration of the different parts.
Indeed, several models and algorithms have been tested in order to harmonize the results of the
different project components. The final result has included a model of prediction of inactive disease by
looking at some baseline variables combined. Although the model is not yet perfect, it is a substantial
advancement in the field.
I think that the researchers and participants in the JIA section of MD-Paedigree have worked very
well and that the initial proposed goals have been satisfied.
I am obviously available for any further information that you may need, and send my best regards.

Rolando Cimaz, M.D.

Prof Rolando Cimaz


Associate Professor of Pediatrics
Head, Pediatric Rheumatology Unit
Anna Meyer Children Hospital
University of Florence
Italy
r.cimaz@meyer.it

Review Dr. Adam Shortland - Report on the progress of the Neurological and Neuromuscular
Disease Working Group
Rome 22/23 May 2017
Authored by: Adam Shortland PhD, Consultant Clinical Scientist, Guy’s & St Thomas’ NHS Foundation
Trust, London.
Date: 24th May 2017 - Modified: 27th June 2017.
Overview
The members of this Working Group were charged with developing technologies and protocols to
inform clinical decision- making in three distinct disease groups (cerebral palsy, CP; Charcot Marie
Tooth (CMT); Duchenne muscular dystrophy, DMD). Clinical decision support was anticipated
through two methodologies i) a statistical approach in which many data sets mostly from different
clinical gait analysis units (VuMC, Amsterdam; KULeuven, Leuven; OPBG, Rome) would be uploaded
to a common data repository to be analysed; ii) a deterministic approach in which musculoskeletal
data from MRI data would be coupled with mathematical musculoskeletal modelling to estimate
the forceful contributions made by musculoskeletal components in walking, in health and disease.
While progress has been made using both these approaches, some of the outputs from the project
that could be used by the clinical community may not have been fully produced, even though a
consensus protocol was developed by the end of year two and a new HBM model, usable not only
in steady-state gait analysis but also for real-time feedback regarding gait perturbations, was
disseminated through the ESMAC network.
For example, I understand that more than 800 datasets (the greater proportion of them
retrospective) have been uploaded to the repository, though evidence of significant statistical
analyses being performed on these data was not presented in the reporting sessions at the final
review meeting. My understanding is that these analyses may have been reported under a different
work package (WP12). Further, data was available to perform comprehensive bespoke analyses, but
it was unclear whether bony morphology and technical axis systems had been imported into the clinical
software package directly from the MRI data.
One criticism of the work plan is the dependence of WP11 on WP6, particularly on the delivery of an
adequate number of MRI datasets to the teams at Siemens and the University of Sheffield during
the early stages of the project. This dependence resulted in delays in the inclusion of bespoke
musculoskeletal data in the gait analysis package of the commercial partner.
Despite some shortcomings, the teams should be congratulated on the work performed, and on
some of the technical challenges that they had to overcome. The problem remains that tangible,
useable outcomes from the project are still some way off and it remains hard to see how/when
these are to come to completion.
Work Package 6.
The deliverables here were met in the end, even if some of them were delayed.
In particular, the provision of an adequate number of MRI datasets was delayed, which held back
progress in WP11. Some ground was made up by the delivery of 42 MRI datasets later in
the project.
It was disappointing that while some longitudinal retrospective data was included in the repository,
the reviewer was not able to witness the analyses of these data. I understand that these may have
been reported under WP12.
Work Package 11
I believe that it was always going to be difficult to meet the deliverables for this hugely ambitious
work package.
D11.1 At the time of the internal review SAG (Siemens) delivered a semi-automatic segmentation
of muscles, and UoS (University of Sheffield) made some progress extracting bony morphological
data from the same data. However, I did not see a comprehensive analysis of the accuracy and
reliability of the segmentation or of the parameter extraction. Development of the approaches used
by SAG and UoS was reliant on other team members supplying comprehensive MRI datasets. I
believe that these may not have arrived on schedule but were included eventually under a modified
description of work (that was not made available to this reviewer).
D11.2 I did not receive evidence that significant analyses of the accuracy of the scalable models
informed by regression equations were made.
D11.3 It was not clear to the reviewer how data from the clinical parameters was included in the
optimization model, besides muscle volume. Some very good work from the University of Delft on
model optimization had been produced but I was not witness to much data describing the validity
of the model – say from electromyography data.
D11.4 There was some dependence of this deliverable on D11.3. I know that some methodologies
developed by the participants may be available to help develop a disease specific muscle model and
I look forward to seeing this in the future.

Adam Shortland 27/06/17

Review by Prof. Maria Krestyaninova, assessing the progress of the Microbiome Study – WP7

Review by Prof. Alberto Sanna, assessing the progress of the Infostructure WPs 14, 15, 16 and 17

The MD-Paedigree Infostructure has achieved a coherent and comprehensive scientific,
technological and organizational framework. It has enabled multiple medical domains with
different missions, from clinical to research, distributed across leading research hospitals all
over Europe and dealing with a variety of organizational procedures and trust domains, to improve and
harmonize their best practice, increase the overall quality of their medical data, secure shared
access among them, and integrate into the end-to-end workflow a variety of data value-enhancing tools
such as Data Curation, Knowledge Discovery and High Performance Computing.

The MD-Paedigree Infostructure expertise, tools and platform position SMEs and academic partners
with a competitive advantage to transfer individually and collectively their knowledge assets and
software prototypes and products in the European Healthcare research and market arena. At the
same time, the experience and the collaboration gained by the individual clinical/research
departments of the hospitals involved position them as champions in their own organization with
the role to lead the extension of the MD-Paedigree Infostructure vertically within their own
institution and horizontally with other collaborating institutions.

All partners involved in the MD-Paedigree Infostructure are strongly encouraged to exploit their
leading position and their project results to stimulate and accelerate, as a reference project
in the domain, more efficient and effective use of clinical and research healthcare data in Europe.

Alberto Sanna
Director e-Services for Life and Health
San Raffaele Hospital
Milano, Italy

Alberto Sanna
e-Services for Life & Health Director
Scientific Institute San Raffaele
Via Olgettina, 60
20132 Milano
email: alberto.sanna@hsr.it
tel.: +39.02 2643 2019

ETHICAL EVALUATION CRITERIA APPLIED TO MD-PAEDIGREE

1. Committee Composition

Profs.: Laura Palazzani (chair), Siobhan O’Sullivan, Linda Nielsen, Herman Nys, Marco Trabucchi

2. The requirement for an ethical and legal committee in the project

Paedigree is a challenging and strongly innovative project on a scientific level, within the
context of the dynamic acceleration of techno-science in medicine and beyond medicine, because of
the transformation of medicine (the so-called ‘4Ps Medicine’: prevention, prediction,
personalization, participation), the complexity and convergence of disciplines (medicine, biology,
informatics, bioengineering etc.), and the interaction and integration of different methodologies.

The ‘new wave’ of health technologies includes the rise of digital health technologies. In this area,
Paedigree is a ‘pioneering scientific method’ of predictive, personalized disease diagnosis in the era of 'big
health data', which combines ‘traditional’ clinical research with ‘innovation’ of digital repository data. These
technological tools and innovations raise ‘big promises’ and, at the same time, ‘big challenges’ at the ethical
and legal levels.

The task of the Paedigree ethical and legal committee has been to raise critical awareness of
researchers in the project with regard to some ethically and legally problematic areas.

3. The role/task of the committee

The role of the Paedigree ethical and legal committee was two-fold, firstly on a general level,
raising awareness of the ethical and legal framework in which the project was situated (the state of the
art of the bioethical and biojuridical debate on the contents of the project, reference to Opinions of
ethical committees at European and international levels, legal norms at national and international
levels) and, secondly, on a specific level, carrying out evaluation of the project itself, monitoring step-
by-step its compliance with ethical and legal requirements, in order to guarantee the highest ethical
standards in the research project.

The Committee (also through meetings with the investigators of the project, as well as internal
meetings) has maintained a regular ethical and legal overview of the project: at the beginning, by formulating
general and specific questions to researchers/investigators and all the other research units of the
consortium, and processing/evaluating the answers; afterwards, by finalizing a report each year which
included the ethical-legal evaluation of the development of the project, as well as guidelines and
recommendations in order to raise ethical awareness in researchers involved in the project. The
committee greatly appreciated the active engagement and responsive dialogue with the
investigators.
It is important to underline that ethical and legal monitoring has been a continuous process
during the project. The activity was planned from the outset, in order to run in parallel with the research
itself.

4. Ethical and legal problematic areas in the project

In the ethical and legal analysis, it is possible to differentiate the project into two phases: the
clinical phase and the technical-digital phase.

The first phase raised ‘traditional’ ethical problems (ethical requirements for involvement in
research, management of biological samples, counselling for genetic tests); the second phase required
the evaluation of the ethical and legal conditions for the establishment and use of the digital repository.

1. In the clinical phase of research

The Committee identified the ethical conditions for the recruitment/enrolment/involvement of
the patients11 and the elaboration of informed assent of the children (taking into account their age,
using age-specific materials) and informed consent of their parents. The issue of minors reaching the
age of majority during the course of the project, as well as the use of previously collected data from
Health-e-Child and Sim-e-Child (and the scope of the original consent for those studies), were also
considered.

- In drafting the consent forms, the committee advised that certain elements be included:
information about the project, a brief and clear description of the research and, specifically, an explicit and
detailed evaluation of risks/benefits of participation (articulated in the different parts of the
project)12; thus, while the specific aspects of the research could not always be detailed, it was clear that
the morally relevant features of the research would be elaborated for participants and their parents.

- in the case of genetic tests: the committee recommended that the issue of possible feedback of
‘incidental findings’ uncovered during genetic testing should be included in the consent form; this
included information about the right to know or not to know, and the possible duty of the parents to know
when information would be relevant for the health of the children (on a diagnostic, preventive or
therapeutic level, or for reproductive choices); in these cases adequate genetic counselling should be
provided13;

- the management of biological samples was also covered in some detail in the consent forms. In accordance
with existing ethical standards and regulations, information about the storage of biological material was
provided to the patients/parents (place of storage, length of time of conservation and the possibility to

11 The main international documents on the ethical and legal requirements in this topic: Council of
Europe, Convention on Biomedicine and Human Rights, 1997, art. 21; Additional Protocol concerning
Biomedical Research, 2005; European Union, Charter of Fundamental Rights, 2000, art. 3; UNESCO,
Universal Declaration on Bioethics and Human Rights, 2005.
12 See also International Bioethics Committee of Unesco, Informed Consent.
13 See Italian Committee for Bioethics, Managing “Incidental Findings” in genomic investigations
with new technology platforms, 2016.


withdraw consent with the consequent destruction of the sample, possible future use for research purposes
directly/indirectly connected with the research, modality of anonymization applied, total or partial
codification); specific caution is needed for children who reach the age of majority during the research, as to whether
their consent should be sought to continue to retain/use collected biological material and/or data14;

- information was also provided to participants on data collection, data access, retention and security
measures to be adopted; anonymization of data (when possible, or limits of anonymization);

- a description of the communication of the results of research (most notably in genetics) to participants
was also discussed.

A specific ethical aspect raised by the committee was the use of MRI in healthy children.
According to the committee, the ethical approval for the use of MRI on healthy children should be
granted by the specific ethics committee at each of the research units, in order to ensure that MRI as
utilised in the project would guarantee a minimal-risk standard from the point of view of physical harm
and burden (for example, the use of contrast medium appears ethically problematic). Moreover, it is
vital that parents (and the children in line with their degree of maturity) be aware of possible risks,
both neurological and psychological. MRI affords researchers exceptional views inside the human
body, but it also poses risks, including physical injury from the strong magnetic forces and
psychological harm such as anxiety.

A report by an external evaluator of the project, in the first year, underlined ‘delays’ in the
recruitment process, which were identified as being related to meeting the ethical requirements of the
study. The committee acknowledged the difficulty in recruiting participants who were willing to
commit to the significant investment of time required by the project; however, this fact could not justify
the payment of volunteers. The committee saw no ethical issue with offering small tokens of
appreciation, which the researchers were already doing.

The Ethical and Legal Committee confirms that the ethical conditions for patient recruitment
(established since the beginning of the project) are complete voluntariness, without any direct
compensation or indirect/undue incentives. The ethical principle of voluntariness is part of an
internationally shared consensus and also a consolidated normative framework. The Committee
underlines that the ethical principle of the respect of the body and its integrity, alongside the
prohibition to make a profit or gain from the body, was placed at the centre of the ethical and legal
consideration of the project. Even if abiding by this ethical requirement could be considered as a cause
of empirical delay to the research, the Committee reminded researchers of the importance of this
ethical principle in order to balance progress of scientific knowledge with the respect of dignity of the
participants, while ensuring non-exploitation of the participants.

14 See Italian Committee for Bioethics and Italian Committee for Biosecurity, Biotechnology and Life
Sciences, Collection of biological samples for research purposes: informed consent, 2009; Council of
Europe, Recommendation 2016/6 on research on biological materials of human origin.
2. In the digital phase of the research project

The enormous amount of data (in terms of volume or quantity, variety or heterogeneity coming
from different sources and velocity or speed of collection) poses a significant challenge to the
traditional way of guaranteeing accuracy, objectivity, security and privacy of data.

Big data could become a ‘big opportunity’ for personal and social benefits. The goals of big
data include, as in this project, the elaboration of model-driven patient-specific predictions and
simulations and personalised diagnoses and treatments. From an ethical point of view, the goal is
unquestionably good and opens up promising future possibilities in medicine.

However, these relevant goals for the advancement of techno-science and public health
also pose ‘big challenges’ to individuals and society. Challenges are not a sufficient reason to limit
or even stop the development of techno-science, but it is evident that new forms of ethical evaluation,
legal regulation or ‘governance’ are required, which could balance human fundamental rights and
techno-scientific advancements.

Some challenges identified in this project may come to the fore during the establishment of the digital
repository and its use by researchers and technicians. In this regard, consideration of Opinion 7/2015 from
the European Data Protection Supervisor on meeting the challenge of big data proved beneficial, as did
EGE, New health technologies and citizen participation, 2015. Forthcoming documents from the Italian Committee
for Bioethics (ICT, big data and health, 2017) and from the UNESCO International Bioethics Committee (Big data
and health, publication anticipated in September 2017) will be useful for the work going forward.

As the SOKU infrastructure will be directly accessible and can be employed by users and
client applications, and is therefore the most sensitive section of the project, the anonymization processes
and the access rights have been further examined and validated. Moreover, ethics guidelines have
been shared between collaborating projects (namely MD-Paedigree and Cardioproof, which use the same
repository), in order to ensure a standard ethical framework for users in accessing the digital
repository.

The committee stressed the need to implement and adopt, in the data collection, selection and
data sharing, the highest possible ethical standards to ensure quality of data (thereby preserving the
utility of the research) and establish appropriate levels of security regarding access and use of data,
while maintaining interoperability.

The possible main ethical issues identified are the following:

- possible inaccuracy/ doubtful veracity in the collection of data (the challenge of the so-called ‘big bad data’),
due to the quantity, complexity, heterogeneity of sources of data, and possible non-authenticity; in particular
any difficulty that could arise from the quantitative translation of psycho-sociological and biographical data
into numbers. It is important to realise that the validity of the models is contingent upon the choice of criteria;
- possible difficulties in finding objective criteria with regard to the selection of data, the elaboration of
criteria, categories, typologies, clusters (the selection of data is not an automatic mechanical tool, but requires
a subjective and discretional intervention of researchers in the creation of algorithms; the so-called ‘ethics of
algorithms’);
- possible difficulties in ensuring security and an effective anonymization of data (gender, age, date of birth
are sufficient to identify participants); the so-called phenomenon of the ‘evaporation of privacy’ or even ‘end
of privacy’ reveals the difficulty or even impossibility in an age of big data to completely defend the
confidentiality of the patient (or the data owner).
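
The observation that gender, age and date of birth can be sufficient to identify participants can be illustrated with a simple k-anonymity check. The Python sketch below is illustrative only: the records, the field names and the generalisation step are hypothetical and are not taken from the MD-Paedigree repository or its actual anonymization procedures.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of indistinguishable records."""
    return min(group_sizes(records, quasi_identifiers).values())


def generalise_birth_date(record):
    """Coarsen a full ISO birth date (YYYY-MM-DD) to the birth year only."""
    coarse = dict(record)
    coarse["birth_date"] = record["birth_date"][:4]
    return coarse


# Hypothetical toy records, invented purely for illustration.
records = [
    {"gender": "F", "birth_date": "2008-03-14", "diagnosis": "JIA"},
    {"gender": "F", "birth_date": "2008-03-14", "diagnosis": "CP"},
    {"gender": "M", "birth_date": "2010-11-02", "diagnosis": "DMD"},
    {"gender": "M", "birth_date": "2010-01-20", "diagnosis": "CMT"},
]
QUASI_IDENTIFIERS = ["gender", "birth_date"]

# With full birth dates, two records are unique on (gender, birth_date).
k_raw = k_anonymity(records, QUASI_IDENTIFIERS)

# After coarsening the birth date to a year, every record shares its
# quasi-identifiers with at least one other record.
coarse_records = [generalise_birth_date(r) for r in records]
k_coarse = k_anonymity(coarse_records, QUASI_IDENTIFIERS)

print("k with full birth dates:", k_raw)
print("k with birth year only:", k_coarse)
```

A value of k equal to 1 means that at least one participant is unique on the chosen fields and could be singled out by anyone who knows those fields from another source. Coarsening a field raises k at the cost of some analytical precision, which is the proportionality trade-off the committee describes.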

Interdisciplinarity between clinicians/researchers and engineers/bioinformaticians is also essential: they must work
together to extend and translate their existing and advanced data analysis technology into targeted big data
analytical approaches, in order to achieve clinically useful outputs. Although engineers and clinicians have
long collaborated successfully, development work on "Big Data Healthcare" will require particularly intimate
reciprocal understanding, which is evident in this project.

5. Final Recommendations and Guidelines:


In the framework of the discussion surrounding the Paedigree project, the Committee elaborated
some general ethical and legal reflections on this kind of research, which is currently experiencing a growing
expansion in Europe:
1. clarity and transparency in all communications with participants involved in the research from the
beginning (patients/parents) about the ways in which their data is used, and may be used in the future, within a
dynamic, unforeseeable scenario, along with a realistic acknowledgement that no system can completely
guarantee a priori security, privacy and confidentiality in all circumstances;

2. implementation of a policy of open, transparent communication with the participants involved in
the research (patients/parents), including keeping them up to date, relying on ICT, with the progress
of the study and informing them of the way their data has been used and is now being used, and what kind of
results the project is achieving;

3. minimization - whenever possible - of the risks of intrusion into private life, guaranteeing a
proportionate approach between the risks of privacy and the potential benefits of the research.
Information should be supplied about which data will be stored by whom, where, in which manner,
for how long, who will have access to stored data, and the fate of the data after the study has been
concluded;

4. the importance of ongoing interactions between ethicists and clinicians/researchers-engineers/bioinformaticians,
in order to collaboratively identify the emerging ethical issues in data-driven
research projects. The clinical and digital efficiency of the research should be compatible with
ethical standards and a normative framework;

5. specific attention, after the elaboration of the model for clinical practice, to the education of
physicians in the interpretation and application of the elaborated model. The research project should
encompass guidelines for possible curriculum changes necessary in medical education, focusing on
the skills required for the interpretation of predictive models and application to the real condition of
the patients in the context of interpersonal relations (face-to-face with the patient), in order to
maximise the impact of the predictive models and their applications; medical education relating to
risk-management, minimizing risks and maximizing results for the patients. Training in the use of the
predictive model should include awareness building of the strengths and limitations of the model,
alongside the fact that it should not replace the real concrete clinical evaluation case by case (being a
sort of ‘automated’ medical care), but only aid, support and integrate the personal care/cure
relationship with the patients.

THE ETHICAL AND LEGAL FRAMEWORK FOR A MODEL-DRIVEN HEALTH DATA
REPOSITORY

1. Introduction: model-driven health data and big data

It is not an easy task to analyse the ethical and legal framework of a ‘model-driven health data
repository’, both generally and specifically of Paedigree, as a model-driven European paediatric digital
repository.

It concerns a strongly innovative project on a scientific level, part of the ‘new wave’ of health
technologies, characterized by the transformation of medicine (going towards the ‘4Ps medicine’ or
prevention, prediction, personalization/precision, participation) and the convergence/confluence of
traditionally different disciplines (medicine, biology, informatics, engineering, computer science).
The advancement of ICT (with the increase and acceleration in the collection, storage and processing
of information) and the development of "data science" (i.e. the use of computing and mathematics
with statistical techniques and algorithms) are opening up new possibilities of knowledge, use and
application in various fields, including healthcare.

In this context, the ‘model-driven health data repository’ is a form of ‘health data-driven
medicine’ (still an object of study and research, and currently not extended to all aspects of biology and
medicine), which offers the possibility to make predictions and simulations of diagnosis and
treatments for patients in specific contexts or for stratified groups of patients (so-called
personalized/stratified medicine or precision medicine) on the basis of the data collected
and transferred from the clinical arena to digitalization.

The collection and analysis of a huge amount of data, in the era of ICT, has health care
applications, as model data-driven medicine: ‘big data’ is an exponentially growing and evolving
phenomenon, transforming medicine and health, both as a concept and as a practice.

Big data refers to massive digital data, which is characterized by the ‘5 Vs’: volume (huge
amount of data); variety (heterogeneity of sources); velocity (speed of collection, processing and
application); veracity (quality of data); value (meaning of data). The novelty of the phenomenon,
combined with its complex manifestation and dynamic development, requires new approaches also in
ethics and law.

Traditional ethical categories (dignity/integrity, autonomy, privacy, equality, justice) need to
be reinterpreted in light of the new issues emerging from the fast developing technologies; existing
legal norms at international, regional and national levels often prove outdated, inadequate and not
applicable to the new and emerging problems, which require new forms of governance.

The European context of the project cannot confine the framework, given the global and
ubiquitous horizon of ICT; and dealing with a paediatric population makes the problems worse, given how
difficult, if not impossible, it is to detect age in the digital sphere.

2. Ethical challenges

Big data and large digital repositories could be considered a ‘big opportunity’ for personal and
social benefits as regards health. The management of large amounts of diverse types of
information, the conversion of these data into hypotheses about health and disease, as well as
their transformation into usable knowledge, offer a ‘big promise’. The goal of this project, the
elaboration of model-driven patient-specific predictions and simulations and personalised diagnoses
and treatments, is unquestionably good from an ethical point of view; it also envisages promising
future possibilities in medicine, both for individuals and for society, including present and future
generations.

But ‘big challenges’ to ethics are equally likely. Challenges are not a sufficient reason to limit
or even stop the development of techno-science in medicine, but they should be taken into account in
the ethical evaluation and legal governance of these emerging technologies, in order to balance the
human fundamental rights (dignity, integrity, autonomy, privacy, justice) and the advancement of
progress.

There is no consolidated literature on the topic. There are documents and opinions of international
and national ethics committees, which may contribute to the development of an ethical framework in this
analysis. Among the main documents, it is worthwhile recalling: The European Group on Ethics in Science and
New Technologies (European Commission), New health technologies and citizen participation, 2015 and
Opinion 7/2015 from the European Data Protection Supervisor on Meeting the challenge of big data; UNESCO
International Bioethics Committee is currently working on Big data and health (likely to be published in
September 2017); Italian Committee for Bioethics, ICT and Big Data: Bioethical Issues, 2016; Nuffield Council
on Bioethics (UK), The collection, linking and use of data in biomedical research and healthcare: ethical issues,
2015.

Inaccuracy: Data ‘quality’ in collection

Possible inaccuracy/doubtful veracity in data collection (the challenge of the so-called ‘big bad data’),
due to the quantity, complexity and heterogeneity of data sources, as well as their possible non-authenticity or
untruthfulness, represents a scientific and ethical challenge. The curation of data, including cleaning and
quarantining of data, as happened in MD-Paedigree, serves to bolster the veracity of data.
Large amounts of data are continuously and rapidly accumulating. This explosion of information
requires reliable tools for accurate collection and for evaluating changes in clinical parameters, in order to preserve
high standards of healthcare. In this regard, artificial intelligence may be helpful in achieving this goal:
computerized patient records may be analysed by appropriate programs developed to yield important clinical
information or construct algorithms helpful in diagnosis, prognosis and monitoring therapy. Biases should be
identified and eliminated, as far as possible.
However, the quality of datasets used may be at stake: it is necessary to make a clear distinction
between quality big data and big data of poor or dubious quality. This also raises the question of robustness
and the need for research purposes to access raw data aimed at checking, meta-analysis and reinterpretation.
If bias is detected in the data collected, its analysis and use becomes irrelevant and even damaging
for scientific progress, as well as dangerous for individual and public health, entailing a needless and
erroneous exposure to risks (to their integrity). Incomplete collections of data or miscoded data are not
uncommon, for various reasons (e.g. because of the increase in physicians' documentation burden, lack of
education and accuracy and/or knowledge of the correct methodology in registering data, lack of updated
international classification of diseases or knowledge of them, difficulties in selecting essential and specific
information of a patient's history, incorrect registering of the patients unaware of the possible/future use for
analysis purposes, lack of proper oversight and selection for authenticity, lack of standardisation,
interoperability and data harmonization as regards terminology etc.). There can be biases in the automated
processes used for collecting and assessing the data, often due to the algorithms (and their designers)
including a lack of human checks in analysis, lack of education and the competency of analysts. If data quality
in the collection and analysis process is not checked, monitored and guaranteed, invalid conclusions may be
drawn, on a scientific and clinical level, with possible negative consequences both for individuals and society15.

Transparency: the possible ‘opacity’ of algorithms in data selection/classification

Using these technologies for data collection makes it possible to combine biological, social and/or
environmental information: big data may show correlations and interactions of complexities in health and
disease, that could not be identified before. The key challenge hinges upon transforming biological-social-
environmental information into predictive abstract models.
However, some ethical challenges come to the fore: what are the selection criteria for the most
relevant data (criteria of data exclusion or inclusion)? What criteria establish correlations between data
(are they objective or subjective)? To what extent is it possible to transform bio-socio-environmental data into
numbers? There are also possible difficulties in finding objective criteria with regard to data selection and in
elaborating criteria, categories, typologies, clusters (data selection is not an automatic mechanical tool, but
requires a subjective and discretional intervention of researchers in the arbitrary creation of algorithms; the
so-called ‘ethics of algorithms’).
We should be conscious of the fact that algorithms construct correlations and predictions in terms of
(mathematically calculated) "probability", which should not be confused with causality. Profiling identifies
the higher or lower probability or propensity of certain stratified groups of individuals. What criteria form
the basis on which algorithms are constructed? An important aspect, therefore, concerns the full publication
of all the factors which research algorithms take into account. The opacity of the factors upon which search
engines are based does not allow the monitoring of information quality.
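
The distinction between correlation and causality can be made concrete with a small numerical sketch. The Python example below uses invented values in which a hidden grouping factor (labelled here, hypothetically, as a stratum such as an age band) drives two measurements at once: the pooled correlation is strong, yet within each stratum the two measurements are unrelated. The data and variable names are assumptions for illustration and do not come from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements: (stratum, x, y). Both x and y are shifted
# upwards in stratum 1, but inside each stratum they vary independently.
observations = [
    (0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1),
    (1, 10, 10), (1, 11, 10), (1, 10, 11), (1, 11, 11),
]

# Correlation when the stratum is ignored.
pooled = pearson([x for _, x, _ in observations],
                 [y for _, _, y in observations])

# Correlation computed separately inside each stratum.
within = {}
for stratum in (0, 1):
    xs = [x for s, x, _ in observations if s == stratum]
    ys = [y for s, _, y in observations if s == stratum]
    within[stratum] = pearson(xs, ys)

print("pooled correlation:", round(pooled, 3))
print("within-stratum correlations:", within)
```

An algorithm trained only on the pooled values would report a strong association between the two measurements, although neither influences the other. This is one reason why publishing the factors an algorithm takes into account matters for assessing its outputs.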

Privacy and confidentiality: the ‘end’ of privacy?


The protection of privacy and confidentiality is facing several challenges in the era of big data. Full
anonymisation of personal data no longer provides sufficient guarantees. Partial anonymisation, also called
pseudonymisation, that is the replacement of identifiers with a code in order to allow re-identification
when necessary or desirable (for example, to give information to individuals when serious illness or specific
risks are discovered), entails risks and possible vulnerabilities, including potential access by third parties
(employers, insurers). Re-identification should be considered not only as a theoretical possibility, but also
as a practical and real one in the context of big data. For this reason, effective anonymisation that prevents
all parties from identifying individuals becomes a real challenge. By integrating large amounts of data from
different kinds of sources, it is often possible to re-identify an individual.
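
The replacement of identifiers with a code can be sketched as follows. This is a minimal illustration under
stated assumptions, not the procedure used in the project: the class name Pseudonymiser, the field name
patient_id and the choice of a keyed hash (HMAC-SHA256) are all hypothetical. The sketch also shows where the
vulnerability lies: whoever holds the secret key or the lookup table can re-identify the data subject.

```python
import hashlib
import hmac
import secrets


def make_pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable code from an identifier using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


class Pseudonymiser:
    """Hypothetical helper: replaces the identifier and keeps a private lookup table."""

    def __init__(self, secret_key=None):
        # Assumption: the key is held only by a trusted party, never by data users.
        self._key = secret_key or secrets.token_bytes(32)
        self._lookup = {}  # code -> original identifier, needed for re-identification

    def pseudonymise(self, record: dict) -> dict:
        code = make_pseudonym(record["patient_id"], self._key)
        self._lookup[code] = record["patient_id"]
        shared = {k: v for k, v in record.items() if k != "patient_id"}
        shared["code"] = code
        return shared

    def reidentify(self, code: str) -> str:
        # Possible only for whoever holds the lookup table (or the key).
        return self._lookup[code]


pseudonymiser = Pseudonymiser(secret_key=b"k" * 32)  # fixed key for the example only
shared_record = pseudonymiser.pseudonymise(
    {"patient_id": "P-001", "diagnosis": "example diagnosis"}
)
original_id = pseudonymiser.reidentify(shared_record["code"])
```

The shared record no longer carries the identifier, but the remaining fields are untouched, so
re-identification by combining them with other sources remains possible.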
“Privacy” in the sense of a right to respect for private life (which is more than personal data), in
relation to those areas of life or the data that individuals want to keep confidential, is at stake. Problems are
likely to emerge in ensuring an effective anonymization of data (gender, age, date of birth are sufficient to
identify participants); the so-called phenomenon of the ‘evaporation of privacy’ or even ‘end of privacy’

15 W. Raghupathi, V. Raghupathi, Big data analytics in healthcare: promise and potential, Health Information Science
and Systems 2014, 2, 3. There is a discussion on what forms of regulation could provide for data quality monitoring
(collection, storage, analysis), as a requirement of data use, with flexible and updatable tools (such as codes of
conduct), ensuring the competence and correctness of operators (clinicians and analysts such as engineers,
statisticians, bioinformaticians) and correct interactions among them. Proposals include ‘soft regulation’ such as an
updated code of practice for clinicians or other professionals involved in collecting health-related data. Others aim
to foster inter-disciplinarity between clinicians/researchers and engineers working together to translate and extend
their existing and advanced data analysis technology (including the clinically trained human mind) into targeted big
data analytical approaches that will achieve clinically effective outputs. Although engineers and clinicians have long
collaborated successfully, development work on "Big Data Healthcare" will particularly require mutual understanding by
each disciplinary culture of the other. This will require further cultural development in both areas.
reveals the difficulty, or even impossibility, in an age of big data, of completely defending the confidentiality of
the patient (or the data owner). The risk for the patients is the possible/probable loss of control over their
private information in this virtual space, through the expansion of data16. The challenge lies in informing
patients so that they gain critical awareness of the problem, without losing their trust in the governance of
health data.
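
The remark that gender, age and date of birth may be enough to identify participants can be checked on a
dataset by counting how many records share each combination of these attributes. The sketch below is an
illustration only: the records are invented, and the function names and the field names gender, age and
date_of_birth are assumptions. A record whose combination occurs only once is a candidate for
re-identification when linked with another source.

```python
from collections import Counter

# Attributes named in the text; treated here as quasi-identifiers (an assumption).
QUASI_IDENTIFIERS = ("gender", "age", "date_of_birth")


def group_sizes(records, keys=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(record[k] for k in keys) for record in records)


def smallest_group_size(records, keys=QUASI_IDENTIFIERS):
    """Size of the smallest group; 1 means at least one record is unique."""
    sizes = group_sizes(records, keys)
    return min(sizes.values()) if sizes else 0


def unique_records(records, keys=QUASI_IDENTIFIERS):
    """Records whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(records, keys)
    return [r for r in records if sizes[tuple(r[k] for k in keys)] == 1]


# Invented records for illustration; no names or identifiers are present.
records = [
    {"gender": "F", "age": 9, "date_of_birth": "2008-03-14", "finding": "A"},
    {"gender": "F", "age": 9, "date_of_birth": "2008-03-14", "finding": "B"},
    {"gender": "M", "age": 11, "date_of_birth": "2006-01-02", "finding": "C"},
]

at_risk = unique_records(records)
k = smallest_group_size(records)
```

Here the third record is unique on the three attributes, so anyone who knows those attributes from another
source could attach the finding to a person, even though no identifier was stored.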
There is a loss of trust in confidentiality caused by the awareness of large-scale data disclosures, as
well as revelations regarding intrusions by government agencies and commercial companies (including
inappropriate data sharing by social media organisations). In this sense, there is a need to design new
forms of governance in data collection systems. Technical measures should be improved - as
concretely as possible - in order to prevent the identification of subjects and reduce the risk of privacy
infringements whenever possible (privacy-by-design). On this point, discussion should be taken forward, at a
normative level, on how to guarantee transparency at the moment of data collection and find additional
measures to prevent identification of individuals, as standardised anonymisation protocols are insufficient in
specific contexts.

Informed consent: broad and dynamic

In the age of big data, informed consent is undergoing a radical digital transformation. We witness a
"challenge" to informed consent as it has been traditionally understood. It is sometimes almost impossible
to specify who is collecting the data and who will use it, which data is involved, how the data is collected,
where it will be stored, for how long and for what reason and purpose. It becomes difficult to guarantee the
right to access, modify and delete personal data, to revoke consent or to dissent.
Informed consent, in the field of big data, cannot be specific; it inevitably calls for
"broadness", given the impossibility of accurately anticipating the paths of research or application in
healthcare (similar to what occurs within the framework of biobanks). It is therefore a "dynamic" and "flexible"
consent, which identifies similar areas of research directly or indirectly related to the original path. Broadness,
dynamism and flexibility do not mean "blind" or ‘blanket’ consent to any research whatsoever. Broad consent
implies asking individuals transparently to consent not only to the immediate purpose for which their data
has been collected, but also to unforeseen uses of their data. Dynamism and flexibility mean engaging the
active participation of the data subjects, allowing individuals constant control of data access through
consent portals.
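
What a consent portal has to record can be sketched with a small data structure. This is a hypothetical
illustration, not a description of any existing portal: the class ConsentRecord, its methods and the purpose
labels are assumptions. The point is that consent is kept per purpose, can be granted or revoked at any time
by the data subject, and every change is logged so that access decisions can be audited.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsentRecord:
    """Hypothetical per-subject consent state, as a consent portal might keep it."""

    subject_id: str
    granted: set = field(default_factory=set)    # purposes currently consented to
    history: list = field(default_factory=list)  # audit trail of every change

    def _log(self, action: str, purpose: str) -> None:
        self.history.append((datetime.now(timezone.utc), action, purpose))

    def grant(self, purpose: str) -> None:
        self.granted.add(purpose)
        self._log("grant", purpose)

    def revoke(self, purpose: str) -> None:
        self.granted.discard(purpose)
        self._log("revoke", purpose)

    def allows(self, purpose: str) -> bool:
        # Access is checked against the current state, not the state at collection time.
        return purpose in self.granted


consent = ConsentRecord(subject_id="S-001")
consent.grant("cardiology research")
allowed_before = consent.allows("cardiology research")
allowed_other = consent.allows("commercial reuse")
consent.revoke("cardiology research")
allowed_after = consent.allows("cardiology research")
```

Checking access against the current state rather than against the consent given at collection time is what
makes the consent "dynamic" in this sketch.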
In a context of greater participation and citizen involvement in science/medicine, concerns centre
around the risk that individuals may also be willing to actively participate in research and express their consent
to donate and share data on ‘open source’ platforms. This may be explained as a form of altruistic behaviour,
expressed in the willingness to give unlimited permission regarding the use of data in a collaborative or
cooperative context (the so-called phenomenon of ‘citizen science’). Some criticise this kind of consent to
data donation, defining it as misinformed naivety, and suspect that the exaltation of an unselfish logic may be
inspired by a hidden desire to stimulate above all the market.
One could perhaps rethink the very expression "informed consent" in the digital world, limiting it to
an "awareness" or "acknowledgement" that the data will be collected, with a critical consciousness of the
difficulty/impossibility of anonymity, of the impossibility of precisely determining a priori the method of use,
storage, and analysis of data, and of the impossibility of guaranteeing security and confidentiality in all
circumstances.

Equality: non-stigmatization and non-discrimination

16 The information requested is of a heterogeneous nature. With specific regard to health-related data, consideration
should also be given to the fact that boundaries between the strictly medical and non-medical spheres are becoming
increasingly blurred, like those between health and society; information on lifestyles and behaviours tends to become
increasingly relevant to health, even from the perspective of prevention. In this sense, health information is not
only deemed to be the outcome of laboratory tests or epidemiological data, but also the general news that comes from
social networks.
The protection of personal data (confidentiality, discretion, privacy) should be combined with the
protection of personal freedom17, in order to avoid or at least mitigate the "social risks" of new technologies,
namely the abuse and misuse of data for discriminatory purposes (for example by insurance companies or in the
workplace).

3. Legal framework and governance

There are no specific regulations on the phenomenon of big data in national and international legal
frameworks. Regulation of data protection is provided for in many legal systems, and numerous rules thereof
could be applicable to the area of big data, even though big data is a new reality and may require new
regulation. While there is a broad consensus on the core data protection principles at the heart of most
national laws and international norms, the main challenge is the divergence in the implementation of these
principles, as well as in the detailed data protection laws of the world.
The legal framework applicable to data use in biomedical research and health care recognises,
broadly, two sorts of measures that may protect the interests of citizens against potentially injurious misuse
of data. First, it recognises operations that alter the data in order to de-identify it, so that its use would no
longer pose a direct risk to data subjects through person identification. Second, it sets out controls over access
to data in a way that the data is only made available to authorised users, under circumstances in which it is
expected not to be misused or otherwise result in harm to data subjects. These measures are often used in
combination.
In the framework of the UN Declaration of Human Rights (1948) and the European Convention on
Human Rights (1950), there are two recent European documents: Directive 95/46/EC of the European
Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the
processing of personal data and on the free movement of such data, and Regulation (EU) 2016/679 of the
European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to
the processing of personal data and on the free movement of such data. These are the main binding legal
instruments at international level that address privacy. The European Region has a more developed legal
protection of health-related data than other regions18.
Regulation 2016/679 is of great importance for information and communication technologies,
because it provides the basis for the exercise of new rights and defines limits with regard to the automatic
processing of personal data. The Regulation states that the persons whose personal data are processed
should be informed of their right to revoke consent to certain processing, and of the right "to be forgotten",
that is the erasure of their personal data. Furthermore, the Regulation introduces the "right to portability",
that is the right to transfer one's personal data from one data controller to another. The Regulation confirms
the ban on transferring personal data to countries located outside the European Union or to international
organizations that do not meet the required standards on data protection, in relation to which the Regulation
introduces more

17 C. Bock, Preserve Personal Freedom in Networked Societies. Broad Anti-Discrimination Laws and
Practices could Compensate for Failing Data Protection and Technology-Linked Loss of Privacy,
"Nature", 2016, 537 (7618), p. 9.
18 WHO, Legal frameworks for eHealth, Based on the findings of the second global survey on eHealth, Global Observatory
for eHealth series, Volume 5, 2012, p. 27. Among the most significant documents on the subject adopted by the Council
of Europe, one should recall: Convention for the protection of individuals with regard to automatic processing of
personal data, Council of Europe, 1981 and additional protocol (Additional Protocol to Convention ETS No.108 on
Supervisory Authorities and Transborder Data Flows); Recommendation CM/Rec (2010)13 of the Committee of Ministers
to member states on the protection of individuals with regard to automatic processing of personal data in the context
of profiling (23 November 2010); Recommendation CM/Rec (2012)4
on the protection of human rights with regard to social networking services; Recommendation CM/Rec (2014)6 on
human rights for Internet users; Recommendation CM/Rec (2016)1 on protecting and promoting the right to freedom
of expression and the right to private life with regard to network neutrality.

stringent evaluation criteria. Also significant for information and communication technologies is the principle
of "privacy by design", whereby it is necessary to ensure the right to data protection from the initial stage of
conception and design of a process or of a system.
There is a discussion in biolaw, at the international level, on a new form of governance of big data.
Given the global scope of big data and health, as well as the fast pace of technological development, it is
difficult to elaborate comprehensive and balanced regulations. Governance systems for big data should protect the
fundamental rights of the persons from whom the data originates: dignity/integrity, autonomy, privacy and
data protection, transparency, equality. Data governance should guarantee that citizen involvement,
engagement, participation and sharing of data will not be subject to any form of exploitation, stigmatization
or discrimination.
This goal could be reached through: transparency of database purposes; arrangements for the control
of accuracy and transparency of procedures (collection, use of algorithms, consent); arrangements for the
protection of privacy, at least declaring the limits of privacy protection; arrangements for the duration of data
storage; and arrangements concerning data ownership, data sharing and criteria for access, including the
prioritization of research and data users.

Conclusion
It is important to be cautious and to avoid exaggerating the current state of scientific knowledge
and the potential benefits of big data and precision medicine for healthcare. The bio-optimistic ‘hype of big
data’ can lead to overstatements and unrealistic estimations. On the other hand, the bio-pessimistic emphasis
on threats, potential risks and damages may lead to neglect of the potential of big health data.
A balanced way of dealing with hopes and promises, opportunities and challenges, is very important
in order to protect human values and rights in the context of the advancement of techno-science in medicine.
