A new technique allows medical records to be used for research on the genetics of disease while still protecting patients from prying eyes.
Databases that link thousands of people’s DNA profiles to their medical histories are a powerful tool for researchers who want to use genetics to individualize the diagnosis and treatment of disease. But this promise of personalized medicine comes with concerns about patient privacy. Now scientists have come up with a way to alter personal medical information so it’s still meaningful for research, but meaningless to someone trying to ID an individual in a database.
“We’re hoping that it’s a game-changer,” says Bradley Malin, a biomedical informatics specialist from Vanderbilt University in Nashville who helped develop the method.
The new method, published online April 12 in the Proceedings of the National Academy of Sciences, disguises the parts of a medical history that are not relevant to a geneticist’s particular research question. An algorithm combs through health records and makes those parts more general, while leaving the relevant parts untouched.
For example, if scientists want to examine links between genes and asthma, parts of an individual’s medical record that pertain to asthma are kept intact. But if that asthmatic patient also had a broken arm as a teenager, the algorithm changes the medical code for a broken left forearm to a code that indicates only a broken bone.
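The asthma and broken-arm example can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers’ published algorithm: the dotted code labels (such as `fracture.forearm.left`) and the function `generalize_record` are hypothetical stand-ins for real medical codes, and the rule of keeping only the top level of an irrelevant code is an assumption made for the sketch.

```python
def generalize_record(codes, relevant_prefixes, levels_to_keep=1):
    """Keep codes relevant to the study intact; coarsen all others.

    codes: list of hypothetical hierarchical codes, e.g. "fracture.forearm.left"
    relevant_prefixes: top-level categories the study cares about, e.g. {"asthma"}
    levels_to_keep: how many levels of an irrelevant code survive (assumed rule)
    """
    generalized = []
    for code in codes:
        is_relevant = any(
            code == prefix or code.startswith(prefix + ".")
            for prefix in relevant_prefixes
        )
        if is_relevant:
            # Relevant to the research question: leave untouched.
            generalized.append(code)
        else:
            # Not relevant: truncate to a more general category.
            generalized.append(".".join(code.split(".")[:levels_to_keep]))
    return generalized


record = ["asthma.allergic", "fracture.forearm.left"]
print(generalize_record(record, {"asthma"}))
# prints ['asthma.allergic', 'fracture']
```

In this toy version the asthma entry stays fully specific, so it remains usable for a gene-asthma study, while the fracture entry no longer says which bone or which side.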
“What’s really great about this is even though it anonymizes the data, it still allows you to go in and find an association with medical history,” says Nils Homer of the University of California, Los Angeles, who was not involved with the research.
The researchers tested their algorithm against potential hackers using information from more than 2,600 patients. The team assumed a hacker might know a patient’s identity, some of their medical history and maybe some of the medical codes associated with that history. The technique stymied efforts to ID an individual based on that information, the researchers report.
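A rough way to picture such a test is to ask whether the codes a hacker already knows single out exactly one record. The sketch below assumes that simple criterion; the article does not say how the researchers actually measured success, and the patient IDs, code labels and function names here are all hypothetical.

```python
def matching_records(database, known_codes):
    """Return IDs of records containing every code the attacker knows."""
    known = set(known_codes)
    return [pid for pid, codes in database.items() if known <= set(codes)]


def is_uniquely_identified(database, known_codes):
    """True if the attacker's knowledge narrows the database to one record."""
    return len(matching_records(database, known_codes)) == 1


# Toy database before generalization (hypothetical codes).
raw = {
    "p1": ["asthma.allergic", "fracture.forearm.left"],
    "p2": ["asthma.allergic", "fracture.ankle.right"],
}

# The same records after irrelevant codes have been made more general.
generalized = {
    "p1": ["asthma.allergic", "fracture"],
    "p2": ["asthma.allergic", "fracture"],
}

# Knowing about the broken left forearm pinpoints p1 in the raw data...
print(is_uniquely_identified(raw, ["fracture.forearm.left"]))  # prints True

# ...but the best the attacker can do afterward is match "a fracture",
# which fits both patients.
print(matching_records(generalized, ["fracture"]))  # prints ['p1', 'p2']
```

With only two records the effect is trivial, but the same check scales to any number of patients: the more records share a generalized code, the less that code helps someone pick out an individual.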
“There is definitely a need to de-identify individuals,” says Homer, who was part of a team that demonstrated two years ago that it is possible to trace a genetic signature back to an individual even when that person’s DNA profile was buried in a pool of thousands. The finding prompted the National Institutes of Health to restrict access to genetic databases that had previously been available to anyone with Internet access.
Genome-wide association studies, which comb through these giant databases looking for links between genetic and physical traits, have the potential to generate clinically valuable information. Establishing such links could help doctors understand, for example, why patients respond differently to certain drugs.