Author: Lyric Jorgenson, PhD
Acting NIH Associate Director for Science Policy
A Fun Way to Learn More About OSP
Harnessing the Power of Our Digital Future
I don’t think it would be news to anyone reading this blog to hear we are living in a digital economy. Whether at home, work, or play, the data we provide about ourselves are being collected and used to understand our preferences and shape our behaviors. A lot of the time, we freely give away this information so that we can get access to the latest app or service aimed at making life easier. For example, providing information on your location so that a charming British voice can provide you with step-by-step directions to brunch on your GPS can be a real timesaver. However, how does your calculus change when you are asked to provide data related to your health and well-being? Would you think differently about giving access to your personal health data so readily? How would you consider your family’s privacy or the potential benefit to public health?
For NIH, the Nation’s leading biomedical research agency, how people think about sharing their personal health data is key to informing policy development. It is clear that health-related apps, wearable devices, social media, and other personalized technologies can move a research study from the lab to the real world. But how this transformation takes place, especially in terms of how NIH can ensure responsible data collection, analysis, and use, requires careful consideration.
To assist NIH in thinking through these issues, NIH charged a working group of the Novel and Exceptional Technology and Research Advisory Committee (NExTRAC) to forecast areas of research in which emerging technologies might yield novel data types and sources.
The first step in tackling this problem is to identify potential studies and why they might be undertaken. This is a vital step as any future policy must clearly balance the benefits and risks for individuals and the public. The working group has been hard at work over the last several months meeting with a variety of experts across sectors to develop a draft list of types of research questions. These questions were presented and discussed during the working group’s progress update at the meeting of the NExTRAC on July 14, 2022.
In general, key aspects of the discussion focused on:
- Emerging Data Sources: Personal health data collected from outside of the traditional healthcare system are increasingly being used to study health-related questions and predict health risks. Collection and sharing of these data have enormous potential to help people. However, how these research aims affect not only individuals but also families and communities (who might share genetic makeup or walk around the same rooms), and how we effectively communicate the broader risks and benefits, require careful consideration.
- Use of Models and Algorithms: Computer-based technologies, such as artificial intelligence (AI), machine learning (ML), and automated image analysis, have the potential to revolutionize diagnostic and treatment decisions. Knowing that the accuracy of these technologies depends on the data that were used in their development, how do we responsibly deploy them for early adoption or in underserved populations who may not be represented in those underlying data? Lack of representation in datasets remains a pervasive challenge, which can be detrimental when those datasets are used to inform healthcare decisions. These biases can be further exacerbated when datasets are combined, as the biases existing in one dataset can be reflected in all the others.
- Linkage and Aggregation: Researchers have an increasing capability to link diverse datasets, such as electronic health records with genomic information, creating new opportunities and challenges. Can dataset formats be standardized so that data from different countries and healthcare systems could be aggregated, linked, and shared across populations? How can personal health libraries be used to combine individuals’ health information across multiple different data streams to inform health outcomes?
Importantly, much of the discussion emphasized that how we think about these technologies – and the opportunities and challenges they raise – is highly personal, and that context matters. This is why we need robust and representative input. Particular attention should be placed on those not traditionally engaged in the conversation, including people who may be more skeptical of sharing data for research. Accordingly, the working group discussed its plans to engage the public in the conversation to understand more about how to balance the benefits and risks in these types of research. These engagements represent a critical step in the process to ensure that participants remain partners at the center of our research efforts.
While this will not be an easy task, I’m very excited about the NExTRAC’s next steps as they will not only help us think through policy issues on the horizon, but also help chart the course for new ways of engaging the public in the policy process. If you were unable to attend the meeting, it is archived here. Expect to see some announcements for the working group’s engagement activities in the near future and their final recommendations to NIH sometime in 2023. Emerging technology in data science continues to shape the way we conduct research, and we look forward to working with the public we serve to ensure that we balance this change for the greatest good.
DataWorks! Prize – Incentives for building a culture of data sharing and reuse
This is a guest blog from Susan K. Gregurick, Ph.D. Dr. Gregurick is the Associate Director for Data Science and Director of the Office of Data Science Strategy (ODSS). More information about ODSS can be found at: https://datascience.nih.gov/
A $500,000 prize purse, rewarding data sharing and reuse in biomedical research, is a new, innovative strategy for supporting the research community. The DataWorks! Prize highlights the role of data sharing and reuse in scientific discovery while recognizing and rewarding researchers who engage in these practices. This prize, which launched on May 11, 2022, is a partnership between the NIH Office of Data Science Strategy and the Federation of American Societies for Experimental Biology (FASEB).
The future of biological and biomedical research hinges on researchers’ ability to share and reuse data. Sharing and reuse had a sizable, catalytic impact on the development of COVID-19 vaccines and treatment protocols. The DataWorks! Prize is an opportunity for the research community to share their stories about the practices, big and small, that lead to scientific discovery.
To participate, research teams share their stories through a simple two-stage application. Through narrative prompts, teams share details of the practices they used, the scientific impact of their achievements, and the potential for replicating their practices for further scientific research. This year, the DataWorks! Prize purse is up to $500,000 across 12 monetary awards including two $100,000 grand prize awards.
Beyond monetary awards, the DataWorks! Prize is an opportunity for the research community to learn from peers and apply those lessons to their research practices. The innovative approaches and tools from prize winners will be highlighted in a symposium in 2023 and made available to support community learning.
As implementation of the NIH Data Management and Sharing Policy draws near, consider the broader intent of this policy: building a culture of data sharing and reuse in the biomedical research community. Incentives are a major part of culture change and we are excited to provide a space for the community to share their achievements and learn together. Through initiatives like the prize and the launch of the new sharing.nih.gov website, we are taking new steps to support the future of biological and biomedical research at the center of the NIH’s Data Management and Sharing Policy.
The DataWorks! Prize is currently open for submissions. Participants must register by June 28, 2022 – visit Challenge.gov for more information and to apply.
Gearing Up for 2023 Part II: Implementing the NIH Data Management and Sharing Policy
Sequels are all the rage these days. I figure if Marvel can make endless “Avengers” movies, I could start making blog sequels. Back in the beginning of the year, I wrote Part I of this blog series about how NIH is working to implement the new NIH Data Management and Sharing Policy (DMS Policy). I mentioned at that time that additional resources were forthcoming.
I should note that when we started to receive comments on what was to become the NIH DMS Policy, one thing in particular stood out to us. Many commenters told us it would be helpful to have clear information on how to protect the privacy and respect the autonomy of participants when sharing data. Now, we all know that cliffhangers build anticipation, so without further delay, I want to share with you some of the tools NIH has been working on to answer that call.
First, if you have seen the Avengers movies, you likely will have noticed that they tend to introduce a new villain that the team needs to battle with either new tools (think of OSP with Thor’s Stormbreaker axe) or the help of new superheroes like Captain Marvel. While not exactly a new villain, the lack of consistent consent language to facilitate secondary research with data and biospecimens is certainly a challenge many of our stakeholders have raised and one that we thought we could help address.
NIH has a long history of developing consent language and, as such, our team worked across the agency – and with you! – to develop a new resource that shares best practices for developing informed consents to facilitate data/biospecimen storage and sharing for future use. It also provides modifiable sample language that investigators and IRBs can use to assist in the clear communication of potential risks and benefits associated with data/biospecimen storage and sharing. In developing this resource, we engaged with key federal partners, as well as scientific societies and associations. Importantly, we also considered the 102 comments from stakeholders in response to an RFI that we issued in 2021.
As for our second resource, we are requesting public comment on protecting the privacy of research participants when data are shared. I think I need to be upfront and acknowledge that we have issued many of these types of requests over the last several months, and NIH understands the effort that folks take to thoughtfully respond. With that said, we think the research community will greatly benefit from this resource and we want to hear your thoughts on whether it hits the mark or needs adjustment.
When reviewing the document, please bear in mind that the main purpose is to provide researchers with information on:
- Operational Principles for Protecting Participant Privacy when Sharing Scientific Data
- Best Practices for Protecting Participant Privacy when Sharing Scientific Data
- Points to Consider for Designating Scientific Data for Controlled Access
Comments on the draft will be accepted until June 27, 2022. Full information, including how to submit a comment, can be found here.
Finally, every sequel needs a twist ending! In November 2021, NIH published a request for comments on the future directions of the NIH Genomic Data Sharing Policy. We are still reviewing the many points and perspectives that were raised, but while we consider next steps, the comments we received are now available on the OSP website. Okay, so maybe that twist wasn’t as big as, say, Darth Vader revealing he is (spoiler alert) Luke’s father in The Empire Strikes Back, but it’s still pretty good for the science policy world.
With a little more than half a year left until the implementation date of the NIH DMS Policy, we will continue to provide updates and resources over the next several months.