Situation
Angela Beringer, a sociologist, leads a long-term multi-national project to study changes in old-age income security programs in the US and in selected European countries. The project collects historical and archival data as well as current data. Surveys of current policy-makers occur from time to time. The project is expected to last approximately ten years and many researchers are involved.
After four years of data collection, Angela and two colleagues publish a paper from the study. In addition, a grad student involved with the study writes a dissertation based on the data collected during the first two years of the project. The researchers plan to produce papers and reports as the project continues but they are aiming for a significant monograph when the project is finished.
After the first paper and dissertation are published, a sociologist, Mike Gallo, asks for a copy of the data because he wants to do further analysis. He suspects that the Beringer team has not used the best statistical techniques and that there is still enough critical information missing from the archival data that the interpretations would be different depending on the assumptions made about the missing data.
Angela does not want to release the data because the study is in progress, and because she now has some of the data that were missing when the first paper was published. She and her colleagues plan to present a new analysis at an upcoming international conference. Also, she wants to protect the integrity of the project and fears that external analyses might make it more difficult to continue to collect data during the remaining years of the project.
Questions
- Should Angela release the data that the first paper and the dissertation were based on to Mike?
- While data sharing is encouraged in the code, does that mean that all data must be shared?
- The code states that data should be shared at the completion of the project or after significant publications. Does publishing a paper or a dissertation qualify as a significant publication?
- What is the appropriate time to share data collected during an ongoing project?
- Does Angela have a right to protect the project by ensuring that publications drawing on its data meet the standards she has set for data analysis?
Discussion
The issue of data sharing becomes much more complex with long-term projects. The norm of data sharing is important in science because it allows independent analysis of data, which provides a necessary safeguard against error and excessive researcher bias. It also allows the data collected to have more value because they can be analyzed more fully. Most major research projects now require a form of data sharing.
With Angela’s project, the issues are not clear, but she should be willing to share the data that were used as the basis for the publications. If that is not possible for technical or practical reasons, she might need to consider how she could provide data after subsequent publications.