
This Request for Information (RFI) seeks public comments on Processes for database of Genotypes and Phenotypes (dbGaP) Data Submission, Access, and Management.

Response to this RFI is voluntary. Responders are free to address any or all of the items below, or any other relevant topics respondents recognize as important for NIH to consider. Respondents should not feel compelled to address all items. Instructions on how to respond to this RFI are provided in “Concluding Comments.”

For more information, see NIH Guide Notice NOT-OD-17-044.

How to Submit a Response

Responses will be accepted through April 7, 2017. NIH will consider all public comments before taking any next steps. No proprietary, classified, confidential, or sensitive information should be included in your response. Comments on the topic areas of interest should be submitted by April 7, 2017, in one of three ways: electronically using this webpage; by mail to Office of Science Policy (OSP), National Institutes of Health, 6705 Rockledge Drive, Suite 750, Bethesda, MD 20892; or by fax to 301-496-9839.

Responses will be compiled and shared publicly in an unedited version on the NIH GDS website after the close of the comment period.

To ensure consideration, responses must be submitted by: April 7, 2017 11:59:59 PM EDT

( * = Required fields)


*If submitting comments on behalf of another individual, please submit the name and function of that other individual.



This Request for Information (RFI) seeks public comments on the data submission and access processes for the NIH National Center for Biotechnology Information (NCBI) database of Genotypes and Phenotypes (dbGaP), and on the management of data in dbGaP, in order to consider options to improve and streamline these processes and to maximize the utility of dbGaP.

Response to this RFI is voluntary. Responders are free to address any or all of the topics listed in the request or any other relevant topics respondents recognize as important for NIH to consider for dbGaP data submission, access, and management. Respondents should not feel compelled to address all items.



NIH Policies for the sharing of genomic and associated phenotypic data, the 2014 NIH Genomic Data Sharing (GDS) Policy [1] and its predecessor, the 2007 NIH Policy for Sharing of Data Obtained in NIH Supported or Conducted Genome-Wide Association Studies (GWAS Policy) [2], set forth expectations and responsibilities to ensure the broad and responsible sharing of genomic research data in a timely manner. Fundamental to NIH’s stewardship of these data is respect for and protection of research participants’ interests. In 2007, NIH developed dbGaP [3] to archive and distribute the results of human genotype-phenotype studies that fall under these policies, in a manner that is consistent with the consent of the study participants whose genomic data are to be shared. dbGaP, a controlled-access data repository, currently serves as a central portal to submit, locate and request access to human genomic (e.g., GWAS, sequencing, expression, epigenomics data) and associated phenotypic and exposure datasets. NIH has established a governance system to facilitate the development and oversight of consistent, transparent, and efficient processes for using dbGaP and related genomic data sharing activities under the NIH GDS Policy [4].

As of January 2017, dbGaP maintains 4,625 datasets from 786 studies, representing over 1.2 million unique research participants. To date, over 44,000 Data Access Requests (DARs) submitted by 4,898 investigators from 46 countries have been processed. Even though dbGaP is a rapidly growing and highly utilized resource and many improvements to the dbGaP data submission and access processes have been made [5], NIH believes that the processes for requesting and submitting data could be streamlined and improved. Through this RFI, NIH seeks public feedback on the dbGaP data submission and access processes, and data management practices, to inform NIH about how to make dbGaP systems more user-friendly and efficient as they continue to grow and evolve.


Information Requested

The NIH invites feedback pertaining to any opportunities or challenges related to the following topics, as well as potential areas and opportunities to improve understanding, efficiency, or transparency of the processes associated with these topics:

1. dbGaP Study Registration and Data Submission [6]

Examples of areas of possible comments include, but are not limited to:

  • dbGaP study registration process for NIH-funded studies or non-NIH-funded studies
  • dbGaP data submission process
  • Technical aspects of study registration, data submission, and data release (e.g., obtaining study accession numbers, data formatting and standards, data transmission)

(Maximum: 500 words)

2. dbGaP Data Access Request (DAR) and Review [6]

Examples of areas of possible comments include, but are not limited to:

  • DAR process
  • Data Access Committee review process
  • Downloading data from dbGaP
  • Project renewal and close-out processes

(Maximum: 500 words)

3. Policies for the Management and Use of dbGaP Data

Over time, NIH has received feedback about existing practices for managing access to data subject to the GDS Policy. To further inform NIH policies and practices for the management of dbGaP data, NIH is interested in public feedback on topics such as the following:

Benefits and risks associated with the availability of genomic study summary statistics [7]

This type of information is currently managed through controlled-access in dbGaP [8] and was the topic of the recent National Human Genome Research Institute (NHGRI) Workshop on Sharing Aggregate Genomic Data [9]. Examples of areas of possible comments include, but are not limited to:

  • Risks and benefits of different management models for genomic summary statistics related to participant privacy and/or scientific opportunity for its broad use
  • Alternative options for providing access to genomic summary statistics beyond unrestricted or controlled-access models (e.g., registered access)
  • Factors to consider in determining the risk-benefit balance in the management of and access to genomic summary statistics for specific datasets (e.g., those including sensitive information or vulnerable populations)
  • Methods for mitigating risks associated with unrestricted access to genomic summary statistics

(Maximum: 500 words)

Clinical Use of Genomic Research Data Maintained in Controlled-Access in dbGaP

The medical genetics community is increasingly interested in obtaining access to dbGaP resources (e.g., NCBI dbGaP Data Browser [10]) for clinical reference uses, such as obtaining additional information to help interpret the significance of a genomic variant. This type of use might be analogous to the way health care providers (e.g., physicians, genetic counselors, clinical laboratorians) consult NIH-supported reference resources such as Online Mendelian Inheritance in Man (OMIM) [11], ClinVar [12], or the Genetic Testing Registry (GTR) [13].


Submitting a Response

This RFI is for planning purposes only and should not be construed as a policy, solicitation for applications, or as an obligation on the part of the Government to provide support for any ideas identified in response to it. Please note that the United States Government will not pay for the preparation of any information submitted or for its use of that information.

Comments received, including any personal information, will be posted without change after the close of the comment period to the NIH GDS website [14]. Please do not include any proprietary, classified, confidential, or sensitive information in your response. We look forward to your input and hope that you will share this RFI document with your colleagues. Updates to this document, if any, will be noted.

The Government reserves the right to use any non-proprietary technical information in summaries of the state of the science, and any resultant solicitation(s).

The NIH may use information gathered by this RFI to inform development or modification of data sharing databases, websites, policies and practices, processes and procedures, and supporting documentation (e.g., guidance, FAQs).



[1] https://gds.nih.gov/03policy2.html

[2] http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html

[3] dbGaP was developed by the National Center for Biotechnology Information, National Library of Medicine, NIH. https://www.ncbi.nlm.nih.gov/gap

[4] https://gds.nih.gov/04po2.html

[5] OSP Poliscope blog post describing previous improvements to dbGaP. http://osp.od.nih.gov/under-the-poliscope/2017/02/de-clunking-dbgap-data-submission-and-access-process-we-re-all-ears

[6] https://gds.nih.gov/06researchers1.html

[7] For the purposes of this document, genomic summary statistics are defined as calculated summary statistics, including genotype counts, allele frequencies, effect size estimates and standard errors, and p-values calculated from a study sample. 

[8] http://www.nature.com/ng/journal/v46/n9/full/ng.3062.html

[9] https://www.genome.gov/27566089/workshop-on-sharing-aggregate-genomic-data/workshop-on-sharing-aggregate-genomic-data

[10] https://academic.oup.com/nar/article/45/D1/D819/2605794/The-dbGaP-data-browser-a-new-tool-for-browsing

[11] https://www.omim.org/

[12] https://www.ncbi.nlm.nih.gov/clinvar/

[13] https://www.ncbi.nlm.nih.gov/gtr/

[14] https://gds.nih.gov/index.html



Please direct all inquiries via email to:

NIH Office of Science Policy
Division of Scientific Data Sharing Policy
Telephone: 301-496-9838

Email: sciencepolicy@mail.nih.gov.
