I don’t think it would be news to anyone reading this blog to hear we are living in a digital economy. Whether at home, work, or play, the data we provide about ourselves are being collected and used to understand our preferences and shape our behaviors. A lot of the time, we freely give away this information so that we can get access to the latest app or service aimed at making life easier. For example, providing information on your location so that a charming British voice can provide you with step-by-step directions to brunch on your GPS can be a real timesaver. However, how does your calculus change when you are asked to provide data related to your health and well-being? Would you think differently about giving access to your personal health data so readily? How would you consider your family’s privacy or the potential benefit to public health?
For NIH, the Nation’s leading biomedical research agency, understanding how people think about sharing their personal health data is key to informing policy development. It is clear that health-related apps, wearable devices, social media, and other personalized technologies can move a research study from the lab to the real world. But how this transformation takes place, especially in terms of how NIH can ensure responsible data collection, analysis, and use, requires careful consideration.
To help think through these issues, NIH charged a working group of the Novel and Exceptional Technology and Research Advisory Committee (NExTRAC) with forecasting areas of research in which emerging technologies might yield novel data types and sources.
The first step in tackling this problem is to identify potential studies and why they might be undertaken. This is a vital step as any future policy must clearly balance the benefits and risks for individuals and the public. The working group has been hard at work over the last several months meeting with a variety of experts across sectors to develop a draft list of types of research questions. These questions were presented and discussed during the working group’s progress update at the meeting of the NExTRAC on July 14, 2022.
In general, key aspects of the discussion focused on:
- Emerging Data Sources: Personal health data collected from outside of the traditional healthcare system are increasingly being used to study health-related questions and predict health risks. The collection and sharing of these data have enormous potential to help people. However, careful consideration is needed of how this research affects not only individuals but also families and communities (who might share genetic makeup or walk around the same rooms), and of how we effectively communicate the broader risks and benefits.
- Use of Models and Algorithms: Computer-based technologies, such as artificial intelligence (AI), machine learning (ML), and automated image analysis, have the potential to revolutionize diagnostic and treatment decisions. Given that the accuracy of these technologies depends on the data that were used in their development, how do we responsibly deploy them for early adoption or in underserved populations who may not be represented in those underlying data? Lack of representation in datasets remains a pervasive challenge, and it can be detrimental when those datasets are used to inform healthcare decisions. These biases can be further exacerbated when datasets are combined, as the biases existing in one dataset can be reflected in all the others.
- Linkage and Aggregation: Researchers have an increasing capability to link diverse datasets, such as electronic health records with genomic information, creating new opportunities and challenges. Can dataset formats be standardized so that data from different countries and healthcare systems can be aggregated, linked, and shared across populations? How can personal health libraries be used to combine individuals’ health information across multiple data streams to inform health outcomes?
Importantly, much of the discussion emphasized that how we think about these technologies, and the opportunities and challenges they raise, is highly personal and depends on context. This is why we need robust and representative input. Particular attention should be paid to those not traditionally engaged in the conversation, including people who may be more skeptical of sharing data for research. Accordingly, the working group discussed its plans to engage the public in the conversation to understand more about how to balance the benefits and risks in these types of research. These engagements represent a critical step in the process to ensure that participants remain partners at the center of our research efforts.
While this will not be an easy task, I’m very excited about the NExTRAC’s next steps as they will not only help us think through policy issues on the horizon, but also help chart the course for new ways of engaging the public in the policy process. Whether or not you were able to attend the meeting (archived here), expect to see announcements of the working group’s engagement activities in the near future and its final recommendations to NIH sometime in 2023. Emerging technology in data science continues to shape the way we conduct research, and we look forward to working with the public we serve to ensure that we balance this change for the greatest good.