Hidden keys to health

The medical community is sitting on mountains of e-health data that could lead to important medical discoveries. But will its value remain buried by privacy concerns and lack of funding?

Dr. Ross Fletcher, a cardiologist and chief of staff at the Department of Veterans Affairs Medical Center in Washington, D.C., wanted to know how well the hospital was treating patients with hypertension. Last October he checked the facility’s database of patient medical records to see how many had had blood pressure readings exceeding 140/90 on three separate days. The answer: about 45 percent of patients.

In January, he checked again, confident that increased attention to treating hypertension would have reduced its incidence. Instead, he found that about half of the hospital’s patients, 5 percentage points more than in his first check, had had high blood pressure readings.

Searching for an explanation, Fletcher checked the VA’s database of veterans’ medical records back to 1998. He found that average blood pressure readings increased every winter and dropped every summer.

It was a medical discovery that could improve control of blood pressure. Heart attacks are more common in the winter than in the summer, and Fletcher might have found a reason. That knowledge could help doctors head off some heart attacks, he said.

Without the electronic records in the VA’s system, Fletcher’s discovery probably would not have happened. “To see changes over time requires large numbers of patients,” he said, adding that it would be difficult at best to scour paper records for blood pressure readings for a sufficient number of patients.
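The two checks Fletcher describes, counting patients with high readings on three separate days and comparing winter and summer averages, can be sketched in a few lines of Python. This is only an illustration: the record layout, the sample readings, the function names and the choice of December through February as winter and June through August as summer are all assumptions, not details of the VA’s system.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical readings: (patient_id, date, systolic, diastolic).
READINGS = [
    ("p1", date(2005, 1, 10), 150, 95),
    ("p1", date(2005, 1, 10), 148, 92),  # same day as above, counts once
    ("p1", date(2005, 2, 3), 145, 88),
    ("p1", date(2005, 7, 19), 142, 91),
    ("p2", date(2005, 1, 5), 138, 85),
    ("p2", date(2005, 7, 8), 128, 80),
    ("p3", date(2005, 12, 1), 160, 100),
    ("p3", date(2005, 6, 15), 135, 84),
]


def is_high(systolic, diastolic):
    # Assumption: "exceeding 140/90" means either number is above its threshold.
    return systolic > 140 or diastolic > 90


def patients_with_repeated_high_readings(readings, min_days=3):
    """Return IDs of patients with high readings on at least min_days distinct days."""
    high_days = defaultdict(set)
    for patient_id, day, systolic, diastolic in readings:
        if is_high(systolic, diastolic):
            high_days[patient_id].add(day)
    return {pid for pid, days in high_days.items() if len(days) >= min_days}


def mean_systolic_by_season(readings):
    """Average systolic pressure for winter and summer readings (assumed month sets)."""
    winter_months = {12, 1, 2}
    summer_months = {6, 7, 8}
    buckets = defaultdict(list)
    for _patient_id, day, systolic, _diastolic in readings:
        if day.month in winter_months:
            buckets["winter"].append(systolic)
        elif day.month in summer_months:
            buckets["summer"].append(systolic)
    return {season: mean(values) for season, values in buckets.items()}


if __name__ == "__main__":
    print(patients_with_repeated_high_readings(READINGS))
    print(mean_systolic_by_season(READINGS))
```

With paper charts, each of those loops would mean pulling and reading files by hand, which is the practical obstacle Fletcher describes.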

Evidence-based medicine

Much of the public policy push for electronic health records (EHRs) has focused on the potential reduction in costs and increases in patient safety and quality of care. But there is another possible benefit — the ability to uncover knowledge buried deep within the data.

Large collections of medical records could help answer questions such as:

  • What is the most effective treatment for persistent lower back pain in a particular kind of case?

  • Which of the many drugs on the market is the most effective in treating a patient with a certain set of symptoms?

  • Do some medical devices work better for very young or old patients?

  • What are the links between certain kinds of residential environments and diseases?

Studies of medical issues such as those are under way today, of course, as many health care providers embrace what’s known as evidence-based medicine. But having large numbers of patient records available online for research will make a big difference in the quality and scope of the evidence.

“Sometimes the effects you’re looking for are subtle, and therefore you need large populations,” said Dr. Barry Hieb, a health care research director at Gartner. He cites the example of the pain reliever Vioxx, which was withdrawn from the market in 2004 after it was found to cause heart attacks in a small number of patients. “One could argue that the reason the federal government didn’t pull Vioxx off the shelf early is that it didn’t have enough big studies to see the link with the heart attack problem,” Hieb said. “The way to solve that is to have 250 million people in your database.”

It would take many years to build such a database, but President Bush has called for all Americans to have EHRs by 2014. Some observers say that goal is too ambitious.

Wiping out identities

Even when EHRs have become the norm, researchers might find them tantalizingly out of reach. To begin with, the Health Insurance Portability and Accountability Act of 1996 requires that patients give their consent before their records can be used in research. HIPAA does not require such consent when records are being shared for purposes of treatment or billing, but research is another matter under the law.

There is an exception for medical records that have been cleansed of identifying information — a process known as de-identification. However, the approach is somewhat controversial, and some privacy advocates say that, given today’s powerful data-mining techniques, one can never be sure that the information will remain anonymous.

“The really difficult thing is a thing called inferencing,” Hieb said. “You look at a set of data and you say, ‘Gee, this is a patient in the Cincinnati area. I don’t have the name, but I do have her birth date and I can go find some other piece of information,’ such as they had an appendectomy on July 14, and then you tag some other source, and before you know it, you know who the person is.”
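The inferencing Hieb describes amounts to joining a dataset that lacks names with an outside source that has them, using fields the two share. The sketch below shows the idea under stated assumptions: both datasets, their field names and the `link` function are invented for illustration and do not come from any real registry.

```python
from collections import defaultdict

# Hypothetical "de-identified" clinical rows: no names, but other details remain.
CLINICAL = [
    {"area": "Cincinnati", "birth_date": "1961-03-02",
     "procedure": "appendectomy", "procedure_date": "2005-07-14"},
    {"area": "Cincinnati", "birth_date": "1975-11-20",
     "procedure": "knee surgery", "procedure_date": "2005-08-01"},
]

# Hypothetical outside source that does carry names.
PUBLIC = [
    {"name": "Resident A", "area": "Cincinnati", "birth_date": "1961-03-02"},
    {"name": "Resident B", "area": "Cincinnati", "birth_date": "1980-01-15"},
    {"name": "Resident C", "area": "Dayton", "birth_date": "1961-03-02"},
]


def link(clinical, public, keys=("area", "birth_date")):
    """Return (name, procedure) pairs where the shared fields match exactly one person."""
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = []
    for row in clinical:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the row
            matches.append((names[0], row["procedure"]))
    return matches


if __name__ == "__main__":
    print(link(CLINICAL, PUBLIC))
```

In this toy data, one clinical row matches a single named person, so the missing name is recovered. The more fields two sources share, the more often a match will be unique.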

Hieb said it is possible to de-identify data, but it’s not easy. Like some others in the field, he said he believes de-identification is not a single process but rather a collection of techniques of varying strength that can be used as needed, depending on the dataset and the intended use. De-identification is like computer security, he said. There is no perfect solution, and the near-perfect ones exact a heavy toll in efficiency and performance. “You have to balance off how valuable or damaging would this data be versus what you do then to protect it,” Hieb said.

Making research harder

The problem is that stripping too many identifiers from data could render it useless to researchers. For example, medical experts studying causes of death need to know the ages at which people died. Those studying the health effects of pollution need to know how close people live to toxic sources.

In such cases, HIPAA and other privacy rules allow for releasing a limited dataset that is subject to a data-use agreement. That time-consuming and sometimes difficult process involves review boards and lawyers, and sometimes they cannot reach an agreement. When medical researchers ask the state of Wisconsin to release data collected by local health departments, “this is handled through a fairly ad hoc negotiation process,” said Raghu Ramakrishnan, a computer sciences professor at the University of Wisconsin.

“By being too hard-nosed with people who are actually using this data to improve our understanding of things like cancer, we are making things harder in our attempt to understand these diseases and how to attack them,” Ramakrishnan said.

He leads a team of researchers trying to develop standard techniques for de-identifying data while preserving its value to researchers. The team includes biomedical researchers and the Wisconsin state epidemiologist, along with information technology specialists. It received a $1.6 million grant from the National Science Foundation in 2005.

“This is an area that has not been widely studied,” Ramakrishnan said. “We need to understand these interactions better.”

One of the most promising approaches involves modifying certain data elements in ways that do not affect the validity of the results from the researcher’s perspective. The Wisconsin team will test several alternatives.
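One commonly discussed example of altering data without harming some kinds of analysis is date shifting: every date in a patient’s record is moved by the same hidden offset, so the intervals between events stay intact. The sketch below is an assumption chosen for illustration, not a description of the Wisconsin team’s techniques; the function names, the secret string and the 180-day limit are all hypothetical.

```python
import hashlib
from datetime import date, timedelta


def patient_offset(patient_id, secret, max_days=180):
    """Derive a repeatable per-patient offset in the range [-max_days, max_days]."""
    digest = hashlib.sha256(f"{secret}:{patient_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % (2 * max_days + 1) - max_days


def shift_dates(events, patient_id, secret):
    """Shift every (date, label) event for one patient by that patient's offset."""
    delta = timedelta(days=patient_offset(patient_id, secret))
    return [(day + delta, label) for day, label in events]


if __name__ == "__main__":
    events = [(date(2005, 7, 14), "surgery"), (date(2005, 7, 28), "follow-up")]
    shifted = shift_dates(events, "p1", "demo-secret")
    # The gap between the two events is still 14 days after shifting.
    print(shifted[1][0] - shifted[0][0])
```

The trade-off is visible even here. A study of recovery times would be unaffected, but a shift of several months would blur a seasonal pattern in the data, so the right alteration depends on the question being asked.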

Health research superhighway

Besides providing information about people’s medical histories and the course of diseases, EHR collections could help researchers locate participants for studies, including trials of promising new drugs.

For example, if a researcher needed to find 100 women in a certain age group who had a certain disease, a large database of patient records could speed the process.

Once participants were enrolled in the clinical trials, researchers could have access to their electronic records, saving hours of interviews by retrieving case histories.

The prospect of benefits such as those has prompted a nonprofit organization called FasterCures to join the ranks of EHR advocates.

The group aims to accelerate biomedical research that promises to find cures for deadly and debilitating diseases, and its leaders say that having medical records online will help. FasterCures President Greg Simon said he is concerned, however, that developers will not design the Bush administration’s proposed National Health Information Network with researchers’ needs in mind.

“Why are we going to build this new superhighway of health information and then have to dig it up two or three years later and put research infrastructure in it, which is the course we’re on if we’re not careful?” he asked.

Medical records as they exist today typically do not hold all the information researchers need, Simon said, adding that “researchers ask very different kinds of questions than your doctor does in the 10 minutes he spends with you.”

For example, medical records seldom indicate whether patients actually underwent the treatment their doctors recommended. “Maybe you didn’t do the surgery the doctor wanted to schedule for you,” he said. “You took garlic pills instead.”

And medical records — whether on paper or online — rarely note whether patients were cured.

Lessons from a gold mine

Dr. Shawn Murphy, an assistant professor of neurology at Harvard Medical School, directs Partners Healthcare System’s Research Patient Data Registry, a warehouse of data on nearly 2 million patients treated at Boston hospitals since the 1980s. The data is helping researchers locate patients to take part in clinical trials, and they say it has saved them substantial amounts of time.

Murphy said the registry is indeed a gold mine, but the data must be used carefully. For example, medical records show what diseases have been diagnosed in patients, he said, but they don’t indicate whether the patient is free of a specific ailment.

If you are a medical researcher trying to compare a set of patients who are diabetic with a similar set of patients who are not diabetic, you can’t rely on the absence of a diabetes diagnosis. “In general,” Murphy said, “clinical data doesn’t come with negative diagnoses — diagnoses that say a person does not have diabetes.”

Researchers also find discrepancies in medical records. For example, one set of records might indicate that a patient has Type 1 diabetes while another set indicates the presence of Type 2 diabetes in the same patient, Murphy said. In such cases, researchers can’t use the records or enroll a patient in a clinical trial without further verification or corroborating data. Nonetheless, hundreds of research projects have relied on the Partners registry and added to the world’s store of medical knowledge.

Privacy worries

When patients first go to Partners Healthcare for treatment, they sign a HIPAA consent form granting permission for their records to be used in research.

Such blanket consent is worrisome to privacy advocates such as Dr. Deborah Peel, a Texas psychiatrist who founded the Patient Privacy Rights Foundation, a nonprofit organization that advocates greater protection of medical records.

Peel said she became concerned about the privacy of medical records when some of her patients complained that their employers knew they were receiving psychiatric care. HIPAA does not prevent insurance companies from sharing such information with companies that contract for employee health benefits.

Peel’s group has joined forces with the Electronic Privacy Information Center, a Washington, D.C., organization, to promote stronger privacy protections for EHRs.

Peel said she is not opposed to electronic records. “IT can really protect privacy” by shielding records of patients who opt for privacy, she said. But unless the government requires the developers of medical record networks to incorporate stringent protections, she said she fears that a national network would allow records to be shared too freely. In particular, she said patients should not be forced to give up their privacy rights to get medical treatment.

Hieb expects the privacy issue to escalate as electronic records proliferate. “This is never going to be a totally solved problem,” he said.

Mary Woolley, president of a nonprofit group called Research!America, said she hopes the issues associated with EHRs prove manageable. Once more records are available in electronic form, she said, it will be easier to do biomedical research studies that examine how patients fare over time. “Today, for the most part [these studies] can be done only retrospectively,” she said.

Woolley, whose organization advocates more public investment in health research, also said EHRs will offer ways to find out more about the health of children and pregnant women, who often cannot be research subjects.

Everyone stands to benefit from a National Health Information Network, and that starts with patients, Woolley said.

Biomedical research: Big and getting bigger

Public and private spending on biomedical research in the United States doubled in the past decade, according to an analysis published in the Journal of the American Medical Association.

In the Sept. 21 article, Dr. Hamilton Moses III and fellow researchers reported that such spending increased from $37.1 billion in 1994 to $94.3 billion in 2003, which represents a doubling after adjusting for inflation. Industry paid for more than half of that research, while the National Institutes of Health sponsored 28 percent, the article states.

Only about 1.5 percent of the total covered research on health services. “The low proportion of spending on health services research is especially notable because it is the main tool available to evaluate the clinical benefit of technology,” the authors wrote.

They said the allocation of research dollars does not appear to reflect concerns about health care costs and medical quality, although that situation might be changing.

Removing identifiers from medical records

The National Institutes of Health lists 18 data elements that generally must be removed from medical records if they are to be considered de-identified. They are:

1. Names.

2. All geographic subdivisions smaller than a state, including street addresses, cities, counties, precincts and ZIP Codes.

3. Dates directly related to an individual, excluding years.

4. Telephone numbers.

5. Fax numbers.

6. E-mail addresses.

7. Social Security numbers.

8. Medical record numbers.

9. Health plan beneficiary numbers.

10. Account numbers.

11. Certificate/license numbers.

12. Vehicle identifiers and serial numbers.

13. Device identifiers and serial numbers.

14. Web addresses.

15. IP addresses.

16. Biometric identifiers, including fingerprints and voiceprints.

17. Full-face photographic and comparable images.

18. Any other unique identifying numbers.

Source: National Institutes of Health
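As a rough sketch of how a list like this might be applied to a structured record, the Python below drops fields that correspond to listed identifiers and reduces dates to the year. The field names and the sample record are hypothetical, the field set covers only part of the list, and the code assumes dates are stored as ISO strings such as 1961-03-02. It does not inspect free-text notes, which can also contain identifiers.

```python
# Hypothetical field names mapped from some of the listed identifiers.
IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "county", "zip_code",
    "phone", "fax", "email", "ssn", "medical_record_number",
    "health_plan_number", "account_number", "license_number",
    "vehicle_id", "device_id", "url", "ip_address",
    "fingerprint", "photo",
}

# Hypothetical date fields; only the year is kept.
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date"}


def deidentify(record):
    """Return a copy of record without identifier fields and with dates cut to years."""
    cleaned = {}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            continue  # drop the identifier entirely
        if field in DATE_FIELDS:
            # Assumes an ISO date string; keep the four-digit year only.
            cleaned[field.replace("_date", "_year")] = value[:4]
        else:
            cleaned[field] = value
    return cleaned


if __name__ == "__main__":
    record = {
        "name": "Jane Doe",
        "zip_code": "45202",
        "state": "OH",
        "birth_date": "1961-03-02",
        "ssn": "000-00-0000",
        "diagnosis": "hypertension",
    }
    print(deidentify(record))
```

Dropping whole fields is the bluntest option. As the main article notes, researchers who need ages or locations may instead have to seek a limited dataset under a data-use agreement.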

