Theme 3: Data sharing, IPR & human dimension issues

(Uhlir/Vande Castle)

To Share or Not to Share

  1. Legal, economic, and cultural factors and reasons for open data sharing
    1. Law/information policy re public-domain data
      1. Data not subject to protection under exclusive IP rights
        1. Data that cannot be protected because of its source (i.e., the U.S. federal government and many state agencies)
        2. Data for which the statutory period of protection has expired
        3. Ineligible or unprotectable components of otherwise protectable subject matter (e.g., factual data in databases, or ideas in copyrightable works)
      2. Otherwise protectable data which are expressly designated as unprotected and hence in the public domain
      3. Fair-use exceptions
    2. Economic principles that support open data resources in the public domain
      1. Basic research and related scientific data as a public good
      2. Promote positive externalities, especially network externalities on the Internet
      3. Scientific, economic, and social values of open and unrestricted data dissemination
    3. Scientific culture/policy supporting data sharing
      1. Non-commercial value system
      2. Sharing ethos ("full and open" data exchange policy)

  2. Legal, economic, and cultural factors and reasons for not sharing data openly
    1. Law/information policy re proprietary data
      1. Copyright
      2. Licensing + digital rights management technologies
      3. Database protection legislation
      4. Trade secrets
      5. Patents
      6. Countervailing, superseding information restrictions on government data, based on national security or privacy/confidentiality concerns
    2. Economic factors that support proprietary data production
      1. IP protection needed to stimulate creative production and investment, and protect economic activity from market failure
      2. Efficiencies of private sector
      3. Pressures on government to privatize data collection/dissemination functions (or to commercialize in other countries)
      4. Pressures on academics to commercialize (Bayh-Dole Act, university policies)
    3. Scientific culture/policy working against data sharing
      1. PI periods of exclusive use
      2. Weak sharing ethos in highly distributed, heterogeneous, individual PI-driven research (much of biology)


Specific Items:

An incentive structure is needed to encourage data archiving, and federally funded facilities are needed to maintain data over the long term. Because government is involved, such institutional mechanisms can help enforce public-domain availability. This is particularly important internationally, through mechanisms such as GBIF.

Data rights outside the United States are generally more restrictive, and include restrictions on exports of biodiversity data and samples. Data sets for global climate research are not always freely available. Privatized government data in the United States can also be a problem. So far this has mainly affected near-real-time data (NEXRAD, OrbView data), although other data sources are under pressure through Congress and OMB.

New legal mechanisms need to be developed to promote open data availability. Examples cited include general public licenses (first developed by the open-source software movement), copyleft, and data easements. These public access licenses, coupled with a software implementation, can be used to promote open access for nonprofit users while still allowing commercialization efforts in the private sector.

Data policies or legislation are needed to protect the identity of rare species and private lands so that such data can be released.


Human dimension issues:

  • Tools and resources must be available for data archiving and publication to help enforce the sharing ethos
  • A cultural shift toward data sharing is needed, with education as a tool
    • Data management must be part of educational curriculum
    • Training experts, facilities, and workshops for developing or existing programs, offered as continuing education ("Data Management for Dummies")
  • Professional recognition is needed for data-related activities and data management, for example professional society prizes and explicit consideration in tenure review.
  • Research funding agencies should provide financial incentives through data management and access requirements, with associated funding: no new grant if data from earlier awards are not available.
  • The reward system for publication needs to be extended to data publication, including peer review. This should be the responsibility of the professional societies through their journals.
    • Recommendations must come from the community itself
    • Data publication should be a requirement of peer-reviewed publication (e.g., the ESA data archive model should be expanded to other societies, such as AIBS)
    • An award structure is needed
  • More collaborative research is needed, both across individual projects and across disciplines. Multi-Directorate research opportunities should be developed by NSF and other science agencies (e.g., to integrate IT and education or social science aspects with traditional discipline research)
  • Peer review of proposal requirements (panel considerations) is important and should be the result of a community consensus.
