Columbia Center for Children's Environmental Health Issues Public Comment on EPA Proposed Rule
* * *
The CCCEH is a national
Along with many academics, scientists, and clinicians, Center investigators have co-signed comments submitted online via Regulations.gov to docket EPA-HQ-OA-2018- by the Environmental Health Section of
The result of
This is because EPA has premised that, for any scientific results to be considered in the development of environmental regulations, the original "raw" data underlying the results must be available for scrutiny by all concerned parties: industry, government agencies, and the public.
This premise ignores the history of science-based government regulation, which has always been based on the weight of evidence obtained from a variety of sources: laboratory-based chemical and biological (test-tube) analysis, laboratory-based testing on non-human animals, and observational research of human subjects.
An example is the compelling weight of evidence from multiple sources that led to the regulation of tobacco marketing and distribution. As is well known, research in all of the above domains established that the use of tobacco products causes cancer, a variety of other diseases, and early death.
In addition to the weight of evidence from multiple domains, regulators have also examined consistency in findings from independently conducted studies. In the case of humans, the finding that a chemical agent is harmful becomes more compelling as studies conducted by unrelated investigators, using different subject pools, all point to the same conclusion.
A third factor used in assessing research findings is publication in a peer-reviewed journal. Peer reviewers are experts in the field who consider every aspect of a manuscript, including research design, results, and historical evidence, in making a recommendation on whether to publish an article. Peer review is a key element to consider in evaluating the strength of a research finding.
To the extent that there are questions about findings in a particular study, mechanisms are available for accessing and verifying the results without a public release of data. One approach is the establishment of data enclaves, in which an independent expert can access the raw data for statistical analysis, generate results to replicate the original findings and perform additional analyses, without taking possession of the data itself. The technology to create a secure data enclave is well-established, and many examples exist. This approach allows rigorous independent assessment of the raw data without compromising the sensitive information provided by human subjects.
The notion that the public release of raw data is a protection against fraud or the generation of misleading results is also flawed. There are instances in which an unscrupulous researcher alters or even fabricates data to fit a preconceived hypothesis. There is no way to tell from the data itself that it is not accurate. This is why the weight of evidence, replication of findings from multiple sources, and peer review by scientific journals are so important: these factors, not the public availability of raw data, are the main safeguards against flawed or fraudulent science.
There are substantial risks involved in sharing individual-level research data, especially if the data are available to anyone in a public release. Even if obvious identifiers like name, address, and phone number are redacted, detailed information about an individual can be linked to other data sources to re-identify the individual. Enormous technological advances have been made in data linkage techniques, and substantial resources have been devoted to data linkage as businesses, government agencies, and sophisticated rogue hacking groups seek to find out as much as possible about as many people as possible.
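The linkage risk described above can be illustrated with a minimal sketch. Everything in it is hypothetical and invented for illustration: the records, the field names, the choice of quasi-identifiers (ZIP code, birth year, sex), and the function name link_records. It is not drawn from any actual study or linkage tool; it only shows how a dataset stripped of names might be matched against a separate public list.

```python
# Toy illustration of re-identification by record linkage.
# All records, field names, and values are invented (hypothetical).

# A "de-identified" study file: names removed, other details kept.
deidentified_study = [
    {"zip": "10032", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10027", "birth_year": 1990, "sex": "M", "diagnosis": "none"},
]

# A separate, publicly available list that includes names.
public_list = [
    {"name": "Person A", "zip": "10032", "birth_year": 1984, "sex": "F"},
    {"name": "Person B", "zip": "10027", "birth_year": 1975, "sex": "M"},
]

# Assumed quasi-identifiers shared by both files.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(study, public, keys=QUASI_IDENTIFIERS):
    """Return (name, diagnosis) pairs where a study row matches exactly
    one person in the public list on all quasi-identifiers."""
    # Index the public list by its quasi-identifier values.
    index = {}
    for person in public:
        key = tuple(person[k] for k in keys)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for row in study:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the row
            matches.append((names[0], row["diagnosis"]))
    return matches


reidentified = link_records(deidentified_study, public_list)
print(reidentified)
```

In this toy case, one study row matches a single named person on all three fields, so that person's health information is exposed even though the study file contains no names. The more outside data sources exist, the more often such unique matches occur.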
The uses of sensitive personal data in the wrong hands can range from personal embarrassment through release of identified information on social media, to impacts on an individual's employment and insurance when health data are involved. And of course, as more and more data about individuals becomes public, and the techniques for identification and re-identification become more sophisticated, the risk of public release of human subject data will only increase. It is important to remember that the genie cannot be put back in the bottle: once a dataset becomes publicly released, it can be downloaded and copied to many locations; it can never be recalled. So, data that appear to be safely 'de-identified' today may become highly identifiable in 20 years, or 20 days, as data linkage technology improves and the number of sources of data about individuals expands.
Respectfully submitted,
Associate Professor,
Director,
Co-Director, Certificate Program in Molecular Epidemiology
* * *
The proposed rule can be viewed at: https://www.regulations.gov/document?D=EPA-HQ-OA-2018-0259-9322
TARGETED NEWS SERVICE (founded 2004) features non-partisan 'edited journalism' news briefs and information for news organizations, public policy groups and individuals; as well as 'gathered' public policy information, including news releases, reports, speeches. For more information contact