Sharing Cancer Research Data
I read a recent opinion piece in the New York Times by a researcher at the Sloan-Kettering Cancer Center in New York. He states that researchers around the world are trying to find better ways to prevent and treat cancer, yet are often unwilling to work together or share their data. He points out that the patients who authorized the collection of these data gave them freely, so the data should be publicly available for validation and additional research. He suggests that this reluctance indicates many researchers care more about their own résumés than about the public interest.
There is a fine line here: researchers need some motivation to collect data, process them, and analyze them. Publishing is a great motivation because it can bring more funding, prestige, and an avenue for feedback. If researchers had to give up their data immediately, they would have to compete with others who had not invested the same effort. On the other hand, as the author points out, the data should not be owned completely by the researchers. Sharing data should advance the cause of science.
For many types of informatics research, authors are required to deposit their data in publicly available repositories when their papers are published. This creates a happy medium: findings can be published and recognition earned, yet the data are made available for others to validate those findings or do secondary research.
This topic is important to me because I will rely on such data repositories to do my PhD research without having to secure funding, find study subjects, and so on. I am getting “recycled” data that have been used by others, but the beauty is that I think I have some new and interesting ways to look at the data that the original authors did not consider.
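GEO (Gene Expression Omnibus) and CGEMS (Cancer Genetic Markers of Susceptibility) are examples of such data-sharing repositories. caBIG (the cancer Biomedical Informatics Grid) is an infrastructure being developed to help cancer researchers share data (my PhD advisor is involved in this effort).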
