More and more data is being produced from various sources (e.g., medical devices, smart devices, public records) and in different geographies, and it is often owned by different parties such as academia, hospitals, industry and governments.
Data sharing can improve AI analysis: the richer the data collection, the more robust and reliable the model will be. In practice, a considerable amount of data is required for the training process to succeed in finding the relevant patterns.
Since the General Data Protection Regulation (GDPR) became applicable in May 2018, the European Union has set strict rules on data security and privacy protection, emphasizing that the collection of participant data must be open and transparent. In the context of health-related data, GDPR has also entailed a number of challenges. One of the most relevant is that health care organizations have become data silos, which makes it very difficult to train high-quality mathematical models.
The breakout session on “Data sharing in the Healthcare Sector” brought together relevant stakeholders (mainly data owners, data scientists, researchers and lawyers) to discuss and brainstorm on the challenges and benefits of data sharing, as well as on the key technologies and enablers currently emerging.
Key findings identified during the session included:
- GDPR is unintentionally blocking research. Hopefully, there will be more discussion about how to adapt GDPR for research purposes. This does not mean working to elude GDPR, but collaborating and creating “lobby groups” to push for changes in GDPR specifically for research purposes, and for a redesign of data transparency, cross-border data exchange, data as a utility and individual consent.
- Data silos are partially motivated by a lack of trust in data privacy and security mechanisms. It is a fundamental right to know what others are doing with your data. Thus, ownership of health data needs (global/EU) privacy-by-design governance guidance for all stakeholders involved, based on the new data economy principles (everything “by design”: commercial, legal, security and AI). In addition, this guidance needs to be supported by legislators and authorities.
- Data governance should go one step beyond current regulation. The basic building blocks for the next steps in data governance and collaboration synchronization should include: transparency; legitimate and meaningful collection; responsible and sustainable processing; security, integrity and quality; and minimal retention.
- The AI community claims a number of benefits from gaining access to data to train models. Nevertheless, data owners are skeptical to some extent: they do not yet see the real benefits of adopting AI in healthcare settings in return for providing access to data.
- Real-world data and synthetic data are great approaches for gaining access to larger datasets, and they help to train AI. However, it is essential that we have access to the real data.
- In addition, there are many similar documents and guidelines, but unfortunately no effective, data-rich EU medical data repositories. Usually there are only some limited datasets, and access must be requested and is not simple to obtain. Specific incentives should be promoted for data owners, along with pilots to generate medical data for AI in Europe and more European funding and programs to obtain European datasets that allow more effective algorithms to be developed.
- COVID-19 has been a lever and a motivation to innovate and bring new solutions for data sharing. Federated learning and split learning approaches are becoming popular because they make it possible to preserve data privacy and security, two of the top demands posed by healthcare organizations acting as data providers. Nevertheless, these tools and technologies, which may help to encourage data sharing, still need to incorporate Privacy Enhancing Technologies (PET) and security measures for wider adoption[1].
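
To make the federated learning idea from the findings above more concrete, the following is a minimal sketch of federated averaging and not something presented in the session. It assumes a toy linear model and three hypothetical sites with small synthetic datasets. The names `local_update`, `federated_average` and `train_federated` are illustrative only. Each site trains on its own data, and only model parameters, never the records themselves, are sent for aggregation.

```python
from typing import List, Tuple

# One site's private data: (x, y) pairs that never leave the site.
Dataset = List[Tuple[float, float]]


def local_update(w: float, b: float, data: Dataset,
                 lr: float = 0.05, epochs: int = 20) -> Tuple[float, float]:
    """Run a few gradient descent steps for y ~ w*x + b on one site's data."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Tuple[float, float]],
                      sizes: List[int]) -> Tuple[float, float]:
    """Combine site parameters, weighted by each site's number of samples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def train_federated(sites: List[Dataset], rounds: int = 50) -> Tuple[float, float]:
    """Coordinator loop: broadcast the model, collect updates, average them."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        updates = [local_update(w, b, data) for data in sites]
        w, b = federated_average(updates, [len(d) for d in sites])
    return w, b


# Synthetic example (assumption): three sites whose data follow y = 2x + 1.
sites = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
    [(1.5, 4.0), (-0.5, 0.0), (3.0, 7.0), (2.5, 6.0)],
]
w, b = train_federated(sites)
print(f"global model: w={w:.3f}, b={b:.3f}")
```

In this toy setting the shared model should approach w = 2 and b = 1 without any site exposing its records. The sketch deliberately omits the additional safeguards the finding calls for, such as secure aggregation or differential privacy, since exchanged parameters can still leak information on their own.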
One thing is clear: “sharing is caring”. All stakeholders should contribute to addressing the aforementioned challenges and collaborate to make health data truly actionable, so that it benefits research, healthcare professionals and carers and, above all, patients, the real owners of the health data.