Data Management Plan

A data management plan asks the researcher to consider how data and associated products of research (such as code or other files) will be handled across the life span of a project and beyond. This includes how the data will be stored, secured, accessed, documented, formatted and versioned. The plan should also include where and when data will be shared, whether it will be made publicly available, how it will be licensed for reuse, and how and for how long data will be archived. General best practices for data management and archiving should be considered, as well as any discipline-specific practices for file formats, metadata and documentation that would support the discovery and reuse of the data. If your research involves human subjects or other sensitive information, ethics, consent, and de-identification of data should also be addressed.

Research Informatics Services- Planning & Design

Research Informatics has provided guidance to support researchers in completing their data management and sharing plan, as described by the National Institutes of Health (NIH). Here you will find resources and tools aligned with the NIH's final policy.

Build Your Plan- Planning & Design

Create Data Management Plans that meet requirements and promote your research. You have free access to the DMPTool, an online tool for writing DMPs. To use it, you just need to sign in as a KUMC researcher, and you’ll have access to templates, example DMPs, and KUMC-specific guidance. You can also find some helpful public guidance on using DMPTool.

To build your Data Management Plan, visit DMPTool and follow these steps:

  1. Sign in/Sign up
  2. Create a plan
  3. Enter what research project you are planning
  4. Select the primary research organization (e.g., University of Kansas Medical Center)
  5. Select the primary funding organization: under funder, select “National Institutes of Health (nih.gov)”
  6. Which DMP template would you like to use? Select “NIH Gen DMSP (Forthcoming 2023)”
  7. Complete Project Details, Collaborators, and Write Plan. You can then download the plan as a Word document (docx) and use Finalize and Publish to control who gets to see the plan.

Data Management Elements

  • Data Type
  • Related Tools, Software and/or Code
  • Standards
  • Data Preservation, Access & Associated Timelines
  • Access, Distribution, or Reuse Considerations
  • Oversight of Data Management & Sharing

Resources for the Data Plan

Map out the processes and resources for the entire data life cycle. Start with the project goals (desired outputs, outcomes, and impacts) and work backwards to build a data management plan, supporting data policies, and sustainability plans.

Best Practices

For students and others new to data management, DataONE provides a Best Practices Primer as an introduction to this Best Practices database and data management in general.
  • DataONE's Data Management Skillbuilding Hub
  • Collect & Create

    Observations are made either by hand or with sensors or other instruments, and the data are placed into digital form. You can structure the process of collecting data up front to better implement data management.

    For cohort discovery, visit KUMC's Research Informatics public website.

    Assurance Stage

      Employ quality assurance and quality control procedures that enhance the quality of data (e.g., training participants, routine instrument calibration) and identify potential errors and techniques to address them. For more information, visit DataONE's Data Management Assurance Stage.

    Data Descriptions

    Sample Plans: Data Type

      Briefly describe the scientific data to be managed and shared:

    1. Summarize the types (for example, 256-channel EEG data and fMRI images) and amount (for example, from 50 research participants) of scientific data to be generated and/or used in the research. Descriptions may include the data modality (e.g., imaging, genomic, mobile, survey), level of aggregation (e.g., individual, aggregated, summarized), and/or the degree of data processing.

      • Demographic, clinical, and MRI, 1H fMRS and fMRI imaging data will be acquired from 110 affected youth and 110 matched healthy controls (described in detail in sections C.3 and C.4 of this application). All data will be de-identified prior to receipt by the repository, but the information needed to generate a global unique identifier for the NIMH Data Archive (NDA) will be collected for each subject.


    2. Describe which scientific data from the project will be preserved and shared. NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions.
      • Sufficient data from this project will be preserved to enable sharing, via NDA, of data of sufficient quality to validate and replicate research findings described in the Aims. NIMH requires data measured from human subjects to be shared using the NDA.


    3. A brief listing of the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.
      • In addition to the subject level data described above, all 1H fMRS and fMRI task related paradigm designs and experiment definitions will be deposited in the NDA.

    Consider the following:

    • What types of data will be collected? E.g. Spatial, temporal, instrument-generated, models, simulations, images, video etc.
    • How many data files of each type are likely to be generated during the project? What size will they be?
    • For each type of data file, what are the variables that are expected to be included?
    • What software programs will be used to generate the data?
    • How will the files be organized in a directory structure on a file system or in some other system?
    • Will metadata information be stored separately from the data during the project?
    • What is the relationship between the different types of data?
    • Which of the data products are of primary importance and should be preserved for the long-term, and which are intermediate working versions not of long-term interest?
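    The questions above about how many files of each type a project produces, and how large they are, can be answered from an existing project folder. The following is a minimal sketch, assuming the data live in an ordinary directory tree and that file extensions are a reasonable proxy for data type; the function name `summarize_files` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def summarize_files(root):
    """Return {extension: (file_count, total_bytes)} for every file under root."""
    totals = defaultdict(lambda: [0, 0])
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            ext = path.suffix.lower() or "(none)"
            totals[ext][0] += 1
            totals[ext][1] += path.stat().st_size
    # Sort by extension so the summary is stable from run to run.
    return {ext: (count, size) for ext, (count, size) in sorted(totals.items())}
```

    Calling `summarize_files("path/to/project")` on a pilot data set gives counts and sizes that can be scaled up to estimate the volume described in the plan.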

    Sample Plans: Standards

      State what common data standards will be applied to the scientific data and associated metadata to enable interoperability of datasets and resources and provide the name(s) of the data standards that will be applied and describe how these data standards will be applied to the scientific data generated by the research proposed in this project. If applicable, indicate that no consensus standards exist.
      • Participant age, sex, ethnicity, height, weight, socioeconomic status, and other demographic data will be collected using the following instruments as defined in NDA:

        1. Research Subject and Pedigree (ndar_subject01)

        2. Demographics Short Form (demsf01)

        3. Ethnic Group Questionnaire (ethgrp01)

        4. Height and Weight (height_weight01)

        5. Hollingshead Socioeconomic Rating Scale (ses01)

        6. Pubertal Development Scale (pds01)

        7. Edinburgh Handedness Inventory (edinburgh_hand01)

        8. WASI-2 (wasi201)

        In compliance with NOT-MH-20-067, the following data will be collected to facilitate aggregation of this data set with other data sets:

        1. DSM Crosscutting for Youth (dsm5crossch01)

        2. RCADS-25 (rcads2501)


        The clinical assessments we plan to collect for this study include:
        1. Kiddie-SADS-Present and Lifetime Version (ksads_pl01)

        2. Children’s Yale-Brown Obsessive Compulsive Scale (cybocs01)

        3. Schedule for Obsessive-Compulsive and Other Behavioral Syndromes (Hanna. Schedule for Obsessive-Compulsive and Other Behavioral Syndromes, Ann Arbor: University of Michigan, 2010, new data dictionary will be defined in NDA)

        4. Dimensional Obsessive Compulsive Scale (docs01)

        5. Yale Global Tic Severity Scale (yale01)

        6. Child Behavior Checklist (cbcl01)

        7. Multidimensional Anxiety Scale for Child Parent and Self (masc_p01)

        8. Conners 3 (conners3_ps01)

        9. Adolescent Depression Rating Scale (doi:10.1186/1471-244X-7-2, new data dictionary will be defined in NDA)

          • 1H fMRS and fMRI data will be shared with the Image (image03), Imaging Work Flow (iwf01), and Imaging Collection (imagingcollection01) data dictionaries as defined in NDA.

    Data Analysis

    Create analyses and visualizations to identify patterns, test hypotheses, and illustrate findings. During this process, record your methods, document data processing steps, and ensure your data are reproducible.

    Research Informatics Services- Data Analysis

    Sample Plans: Related Tools, Software and/or Code

    1. State whether specialized tools, software, and/or code are needed to access or manipulate shared scientific data, and if so, provide the name(s) of the needed tool(s) and software and specify how they can be accessed.

      • The clinical data will be analyzed with custom Python code written using the statsmodels, numpy, and pandas packages, all of which are freely available. 1H fMRS spectra will be analyzed with LCModel 6.3 software using LCMgui, which is freely available. fMRI images will be analyzed using the SPM8 toolbox for MATLAB. While MATLAB is commercial software, most universities have site licenses available and the SPM8 toolbox is free. It is also possible that the toolbox might run in Octave, an open-source alternative to MATLAB, but we have not tried it. All code will be shared on our GitHub lab website. The code can be found by searching for “labname” on GitHub. The main readme.md file for the project will also include instructions and parameter choices for the GUI-based analyses.


    Consider the following:
    • Describe method to create derived data products
    • Document steps used in data processing (visualizations, plots, statistical outputs, a new dataset created by integrating multiple datasets, etc.)
    • Ensure datasets used are reproducible
    • Identify most appropriate software
    • Identify outliers
    • Identify values that are estimated
    • Store data with appropriate precision
    • Understand the geospatial parameters of multiple data sources
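    As an illustration of the "Identify outliers" item above, here is a minimal sketch that flags candidate outliers with the common 1.5 × IQR rule. The rule, the function name `flag_outliers`, and the sample values are assumptions chosen for illustration; the appropriate method depends on the data and the discipline. The function only reports suspect values and leaves the input untouched, so the decision to exclude or annotate them can be documented separately.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return (index, value) pairs lying outside the k * IQR fences."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical measurements; the last reading is suspiciously large.
readings = [10, 12, 11, 13, 12, 11, 95]
print(flag_outliers(readings))
```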

    Archive & Preservation

      Plan to preserve data in the short term to minimize potential losses (e.g., via accidents), and in the long term so that project stakeholders and others can access, interpret, and use the data in the future. Decide what data to preserve, where to preserve it, and what documentation needs to accompany the data. For more information, visit DataONE's Data Preservation Stage.

      Research Informatics Services- Storage & Archive

      KU ScholarWorks (KUSW) is the institutional repository at the University of Kansas, Lawrence and Edwards campuses. Its aim is to centralize and provide persistent and reliable access to the research output, scholarship, and creative works of faculty, academic staff, and students at KU in addition to housing digital content from the University Archives. KUSW complements traditional publishing outlets by increasing access to the scholarly journal literature produced by KU researchers and by hosting journals edited by KU faculty and departments. As an evolving resource, an ongoing focus of the repository is to capture emerging research driven by the intellectual environment of the campus, in addition to providing access to documents that are of permanent value to the University. The collections in KUSW are focused on the research, scholarship, creative works of KU faculty and researchers, and in some cases students, as well as materials that document the history of the University and reflect its intellectual environment.

    Sample Plans: Data Preservation, Access, and Associated Timelines

    Give plans and timelines for data preservation and access, including:
    • The name of the repository(ies) where scientific data and metadata arising from the project will be archived. See Selecting a Data Repository for information on selecting an appropriate repository.

    • All data will be deposited to NDA starting 12 months after the award begins and will be deposited every six months thereafter following the usual NDA data submission dates.

    • How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.
    • Data will be findable for the research community through the NDA Collection that will be established when this application is funded. For all publications, an NDA study will be created. Each of those studies is assigned a digital object identifier (DOI). This data DOI will be referenced in the publication to allow the research community easy access to the exact data used in the publication.


    • When the scientific data will be made available to other users and for how long. Identify any differences in timelines for different subsets of scientific data to be shared.
    • Note that NIH encourages scientific data to be shared as soon as possible, and no later than the time of an associated publication or end of the performance period, whichever comes first. NIH also encourages researchers to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.

      Consider the following:
      • Identify data sensitivity
      • Identify data with long-term value
      • Identify suitable repositories for the data
      • Plan data management early in your project
      • Plan for effective multimedia management
      • Preserve information: Keep raw data raw
      • Provide a citation and document provenance for your dataset
      • Provide identifier for dataset used
      • Provide version information for use and discovery
      • Recognize stakeholders in data ownership
      • Store data with appropriate precision
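      One way to support "Keep raw data raw" from the list above is to record a checksum for every raw file when it is first stored and to compare against that record later. The sketch below uses SHA-256 from Python's standard library; the function names and the idea of keeping the manifest as a simple dictionary are assumptions for illustration, not a requirement of any repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files that were added, removed, or modified since the manifest."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

      An empty result from `changed_files` suggests the raw files are byte-for-byte identical to what was originally recorded.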

    Data Sharing

      Indicate how compliance with the DMS Plan will be monitored and managed, the frequency of oversight, and by whom (e.g., title, roles). This element refers to oversight by the funded institution, rather than by NIH. The DMS Policy does not create any expectations about who will be responsible for Plan oversight at the institution.

      Research Informatics Services- Oversight of Data Management and Sharing

      • See our Guidance for Institutional Repositories and Oversight (Planning & Design)

    Data Visualization & Integration

      Data from multiple sources are combined into a form that can be readily analyzed. For example, you could combine citizen science project data with other sources of data to enable new analyses and investigations. For more information, please visit DataONE's Data Preservation Stage.

    Data Discovery

      Identify complementary data sets that can add value to project data. Strategies to help ensure the data have maximum impact include registering the project on a project directory site, depositing data in an open repository, and adding data descriptions to metadata clearinghouses.

      For more information, please visit DataONE's Data Discovery Stage.

    Sample Plans: Oversight of Data Management and Sharing

    Describe how compliance with this Plan will be monitored and managed, frequency of oversight, and by whom at your institution (e.g., titles, roles).

    • The Office of Sponsored Programs at University X, which will administer this award, has created a data management and sharing plan compliance system as part of its process for submitting the annual NIH progress report. That Office is collecting information related to the number of research participants whose data are deposited each reporting year. The Office of Sponsored Programs will also look for the NDA data DOIs from NDA Studies and will include that information in the annual progress report.

      Validation Schedule (this section is required by NIMH)

      If funded, within 6 months of the Notice of Award date we will submit a Data Submission Agreement signed by the principal investigators and an institutional business official, as well as define and complete the Data Expected section of this project. Uploads of all initial demographic, clinical, and raw structural MRI, 1H fMRS and fMRI research data will be completed using the second submission cycle deadline following the Notice of Award date. Subsequent data uploads will be harmonized, validated, and submitted biannually on the standard January 15th and July 15th submission deadlines. We also plan to use the NDA validation tool as a quality control measure in the laboratory. The data manager in charge of submitting data to NDA will help researchers in the group validate their data once every month.
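      The schedule in this sample plan can be turned into concrete dates once the award date is known. The sketch below assumes the deadlines fall on exactly January 15th and July 15th each year, as the sample plan states, and that a deadline on the award date itself does not count; the function name and the award date in the example are hypothetical.

```python
from datetime import date


def upcoming_deadlines(start, count):
    """Return the next `count` submission deadlines strictly after `start`."""
    deadlines = []
    year = start.year
    while len(deadlines) < count:
        # Deadlines are assumed to be January 15th and July 15th of each year.
        for month in (1, 7):
            candidate = date(year, month, 15)
            if candidate > start and len(deadlines) < count:
                deadlines.append(candidate)
        year += 1
    return deadlines


# Hypothetical Notice of Award date, used only to illustrate the calculation.
award = date(2024, 3, 1)
first, second = upcoming_deadlines(award, 2)
print("Initial upload due by the second cycle deadline:", second)
```

      Under these assumptions, an award dated 1 March 2024 would have its initial upload due on 15 January 2025, with later uploads following every six months.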

    Element Descriptions

    • Data Definitions- A description of the information to be gathered; the nature and scale of the data that will be generated or collected.
    • Existing data- A survey of existing data relevant to the project and a discussion of whether and how these data will be integrated.
    • Format- Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats.
    • Metadata- A description of the metadata to be provided along with the generated data, and a discussion of the metadata standards used.
    • Storage and backup- Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data.
    • Security- A description of technical and procedural protections for information, including confidential information, and how permissions, restrictions, and embargoes will be enforced.
    • Responsibility- Names of the individuals responsible for data management in the research project.
    • Intellectual property rights- Entities or persons who will hold the intellectual property rights to the data, and how IP will be protected if necessary. Any copyright constraints (e.g., copyrighted data collection instruments) should be noted.
    • Access and sharing- A description of how data will be shared, including access procedures, embargo periods, technical mechanisms for dissemination and whether access will be open or granted only to specific user groups. A timeframe for data sharing and publishing should also be provided.
    • Audience- The potential secondary users of the data.
    • Selection and retention periods- A description of how data will be selected for archiving, how long the data will be held, and plans for eventual transition or termination of the data collection in the future.
    • Archiving and preservation- The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence.
    • Ethics and privacy- A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise.
    • Budget- The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included.
    • Data organization- How the data will be managed during the project, with information about version control, naming conventions, etc.
    • Quality Assurance- Procedures for ensuring data quality during the project.
    • Legal requirements- A listing of all relevant federal or funder requirements for data management and data sharing.
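    The "Data organization" element above mentions naming conventions and version control. A convention is easier to follow when file names can be checked automatically. The sketch below assumes a purely hypothetical convention of the form `project_subNNN_YYYYMMDD_vNN.ext`; the pattern and the function name are illustrations only and should be replaced with whatever convention the plan actually defines.

```python
import re

# Hypothetical convention: project_subNNN_YYYYMMDD_vNN.ext (all lowercase).
NAME_PATTERN = re.compile(
    r"(?P<project>[a-z0-9]+)_"
    r"(?P<subject>sub\d{3})_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2})\."
    r"(?P<ext>[a-z0-9]+)"
)


def parse_name(filename):
    """Return the name's components as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(filename)
    return match.groupdict() if match else None


print(parse_name("ocd_sub001_20240301_v02.csv"))
print(parse_name("final data (2).csv"))
```

    Names that return `None` can be listed and corrected before the files are shared or archived.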

    Research Informatics

    University of Kansas Medical Center
    Research Informatics
    Student Center, 3001C 
    3901 Rainbow Boulevard
    Kansas City KS 66160 
    913-588-7251 

    Research and Project Requests: dataconcierge@kumc.edu 
    HERON and REDCap support: ocriosupport@kumc.edu