Patent Issued for Computing system for de-identifying patient data (USPTO 11366927): Allscripts Software LLC
2022 JUL 12 (NewsRx) -- By a
Patent number 11366927 is assigned to
The following quote was obtained by the news editors from the background information supplied by the inventors: “De-identifying data is crucial in healthcare fields to protect patient privacy. Certain laws, such as the Health Insurance Portability and Accountability Act of 1996 (HIPAA), require that patient data of a patient be de-identified before the data can be transmitted to parties that are not authorized to view protected aspects of the patient data, such as a software developer that is to use the patient data to test the functionality of healthcare software. To test the functionality of the healthcare software, it is desirable that realistic clinical data is used. By using real clinical data collected from real patients instead of randomly generated data, the software developer is more likely to be able to verify the intended functionality of the healthcare software, which is done to ensure patient safety.
“HIPAA defines protected health information (PHI) as information, including demographic information, in a medical record or designated record set that can be used to identify an individual and that was created, used, or disclosed in the course of providing a health care service such as diagnosis or treatment. PHI includes many common identifiers for individuals, including names, addresses, dates of birth, social security numbers, etc.
“Conventional computer-implemented de-identifying techniques have been developed to alter PHI of a patient such that an identity of the patient cannot be readily determined from the altered PHI. However, there are various deficiencies associated with conventional computer-implemented de-identifying techniques. First, conventional computer-implemented de-identifying techniques were developed before the advent of large-scale analytics and therefore existing de-identifying techniques no longer sufficiently de-identify patient data. Large-scale analytics can be used to identify, to a reasonably high probability, a patient identity when provided with data that has been de-identified through use of these conventional techniques.
“Conventional de-identifying techniques also tend not to be well-suited for maintaining the clinical relevance of a patient record while de-identifying the data. Rather, conventional de-identifying techniques generally randomize the PHI. Hence, conventional de-identifying techniques can produce de-identified data that is not well-suited for use in testing functionality of healthcare software (particularly clinical decision support functionality), as the produced de-identified data is not representative of clinically relevant patient data. This is due to the fact that conventional de-identifying techniques were not developed to ensure that a de-identified patient record retains clinical meaning with regard to itself and related patient records. For example, conventional de-identifying techniques might change the date field of a birth of a patient’s child to occur prior to the patient’s pregnancy. In another example, a conventional de-identifying technique might change the last names in patient records for siblings to two different last names, thereby losing an indication that the patients are related.
“Additionally, conventional de-identifying techniques tend to lack configurability options and hence are not well-suited for varying end-uses. Furthermore, conventional de-identifying techniques are generally unable to recognize patient records belonging to the same patient across disparate databases. Therefore, when de-identifying multiple records pertaining to a same patient, a conventional computing system for de-identifying patient data creates multiple de-identified patient records, one for each of the multiple records that pertain to the patient. These multiple de-identified patient records are not identifiable as belonging to same patient, and therefore clinical relevance of these de-identified records to one another by virtue of being representative of a same patient is lost by a conventional system for de-identifying patient data, restricting the ability to do interoperability testing across systems.”
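The cross-database problem described in the background can be illustrated with a short Python sketch. It is not taken from the patent: the matching rule (normalized first name, last name, and date of birth), the keyed hash, and every name in the code (SECRET_KEY, normalize, patient_token, the sample rows) are assumptions chosen only to show how two records for one person might receive the same pseudonymous token.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real deployment would manage this securely.
SECRET_KEY = b"demo-key-not-for-production"


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not matter."""
    return " ".join(value.lower().split())


def patient_token(first: str, last: str, dob: str) -> str:
    """Derive a stable pseudonymous token from identifying fields (assumed matching rule)."""
    message = "|".join(normalize(v) for v in (first, last, dob)).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]


# Two hypothetical source databases holding records for the same person.
ehr_row = {"first": "Ana", "last": "Lopez", "dob": "1980-03-14", "dx": "asthma"}
billing_row = {"first": " ana", "last": "LOPEZ ", "dob": "1980-03-14", "payer": "Acme"}

ehr_token = patient_token(ehr_row["first"], ehr_row["last"], ehr_row["dob"])
billing_token = patient_token(billing_row["first"], billing_row["last"], billing_row["dob"])

# The same token on both de-identified records keeps them linkable to one another.
print(ehr_token == billing_token)
```

Exact matching on three fields is a simplification; it would miss records with misspelled names or differing date formats, which a production record-linkage step would need to handle.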
In addition to the background information obtained for this patent, NewsRx journalists also obtained the inventors’ summary information for this patent: “The following is a brief summary of subject matter that is described in greater detail herein. This summary is not intended to be limiting as to the scope of the claims.
“Described herein are various technologies pertaining to de-identifying patient data. More specifically, a de-identifying application that is configured to de-identify patient data is described herein. The de-identifying application generates de-identified patient data by modifying or replacing protected health information (PHI) and/or other sensitive information in patient data using various de-identifying techniques. In operation, a computing system that executes the de-identifying application receives a request to de-identify data pertaining to a patient. In exemplary embodiments, the computing system may be or be included in a cloud platform and the de-identifying application may execute in a virtual machine that runs on the cloud platform. The de-identifying application processes patient data of the patient based upon configuration data that is used to configure functionality of the de-identification application. By way of example, the de-identifying application can be configured to de-identify patient data of different PHI categories differently. For instance, when de-identifying an address, a street number and street name may be transformed, while the city remains unchanged. In another example, a date in the patient data may be de-identified by transforming a date and a month, while the year remains unchanged. In yet another example, a first name and a last name of a first patient is transformed and instances of the same last name appearing in a database are transformed to a same de-identified last name. Other methods of de-identification are also contemplated for other categories of patient data such as social security numbers, payor information, relatives, insurance, etc. The de-identifying application may then store the de-identified patient data. The de-identifying application (or another application) may then transmit the de-identified patient data to a computing device, whereupon the de-identified patient data may be utilized in healthcare software testing, clinical studies, incorporated into a statistical model, etc.
“In various exemplary embodiments, the de-identifying application can be configured to standardize an address in the patient data prior to de-identifying the patient data such that instances of similar addresses that are representative of a same physical address (e.g.,
“The above-described technologies present various advantages over conventional computer-implemented de-identifying techniques pertaining to de-identifying of patient data. First, unlike conventional techniques, the de-identifying application described above is well-suited for maintaining the clinical relevance of a patient record while de-identifying the data. Additionally, the de-identifying application is configurable to perform various types of de-identification operations and hence is well-suited for varying end-uses of the de-identified patient data. Furthermore, the de-identifying application is able to identify patient records belonging to the same patient across disparate databases. In an exemplary embodiment, the de-identifying application de-identifies multiple patient records pertaining to a same patient by creating a single de-identified patient record that is representative of all of the multiple patient records. In another exemplary embodiment, the de-identifying application de-identifies multiple patient records pertaining to a same patient by creating multiple de-identified patient records that are indicated as pertaining to a same patient. By contrast, conventional computing systems for de-identifying patient data do not recognize when patient records belong to a same patient, and therefore create multiple de-identified records that are not identifiable as pertaining to the same patient, restricting the ability to do interoperability testing. Finally, the de-identifying application does not require frequent user interaction with a graphical user interface (GUI) each time that patient data is de-identified, thereby enabling automation of electronic de-identification of patient data.
“The above summary presents a simplified summary in order to provide a basic understanding of some aspects of the systems and/or methods discussed herein. This summary is not an extensive overview of the systems and/or methods discussed herein. It is not intended to identify key/critical elements or to delineate the scope of such systems and/or methods. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.”
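The category-by-category behavior described in the summary (a date whose day and month change while the year is kept, a last name that maps to the same replacement wherever it appears, a city left unchanged) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the keyed hash, the small replacement pool, the CONFIG dictionary, and all function and field names are hypothetical.

```python
import hashlib
import hmac
from datetime import date

KEY = b"demo-key"  # hypothetical secret for illustration
SURNAME_POOL = ["Avery", "Brooks", "Carver", "Dalton", "Ellis"]  # hypothetical replacements


def _digest(category: str, value: str) -> int:
    """Map a (category, value) pair to a repeatable integer using a keyed hash."""
    mac = hmac.new(KEY, f"{category}:{value}".encode("utf-8"), hashlib.sha256)
    return int.from_bytes(mac.digest()[:8], "big")


def transform_last_name(value: str) -> str:
    """The same input surname always yields the same replacement surname."""
    return SURNAME_POOL[_digest("last_name", value.lower()) % len(SURNAME_POOL)]


def transform_date_keep_year(value: date) -> date:
    """Replace day and month, keep the year. Days are limited to 1-28 so any month is valid."""
    n = _digest("date", value.isoformat())
    return date(value.year, n % 12 + 1, (n // 12) % 28 + 1)


# Assumed configuration: fields listed here are transformed, all others pass through.
CONFIG = {
    "last_name": transform_last_name,
    "birth_date": transform_date_keep_year,
}


def de_identify(record: dict) -> dict:
    """Apply the configured transform to each field; unconfigured fields are copied as-is."""
    return {f: CONFIG[f](v) if f in CONFIG else v for f, v in record.items()}


patient = de_identify({"last_name": "Nguyen", "birth_date": date(1975, 8, 30), "city": "Austin"})
sibling = de_identify({"last_name": "nguyen", "birth_date": date(2003, 1, 5), "city": "Austin"})
print(patient, sibling)
```

With a pool this small, different real surnames can collide on the same replacement, and a transformed date can occasionally equal the original; a fuller version would need a larger pool and a check for both cases.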
The claims supplied by the inventors are:
“1. A computing system, comprising: a processor; and memory storing instructions that, when executed by the processor, cause the processor to perform acts comprising: receiving an indication that a software application is to be tested based upon values in a de-identified database; and responsive to receiving the indication, testing the software application based upon the values in the de-identified database, wherein the de-identified database is constructed based upon a database, wherein the de-identified database has a same format as the database such that fields in the database have corresponding fields in the de-identified database, and further wherein the database comprises: a first row that includes first data for a first person, where the first data comprises a first street address of the first person, and further wherein the first street address includes a first street name; a second row that includes second data for a second person, wherein the second data comprises a second street address of the second person, and further wherein the second street address includes the first street name, wherein a plurality of acts are undertaken to construct the de-identified database based upon the database, the plurality of acts comprising: transforming the first data into first de-identified data, wherein transforming the first data into the first de-identified data comprises replacing the first street name in the first data with a second street name such that the first de-identified data comprises a third street address that comprises the second street name; and transforming the second data into second de-identified data, wherein transforming the second data into the second de-identified data comprises replacing the first street name in the second data with the second street name such that the second de-identified data comprises a fourth street address that comprises the second street name, wherein the software application is tested based upon the first de-identified data and the second de-identified data.
“2. The computing system of claim 1, the acts further comprising: prior to transforming the first data into the first de-identified data and transforming the second data into the second de-identified data, modifying the first street address and the second street address to conform to a standardized address format.
“3. The computing system of claim 1, wherein the database further comprises a third row that includes third data for a third person, wherein the third data comprises a fifth street address of the patient, and further wherein: the fifth street address includes a third street name, the plurality of acts further comprising: transforming the third data into third de-identified data, wherein transforming the third data into the third de-identified data comprises replacing the third street name in the third data with a fourth street name such that the third de-identified data comprises a sixth street address that comprises the fourth street name.
“4. The computing system of claim 1, wherein the first data comprises a first last name, wherein transforming the first data into the first de-identified data further comprises replacing the first last name with a second last name such that the first de-identified data comprises the second last name.
“5. The computing system of claim 4, wherein the second data comprises the first last name, wherein transforming the second data into the second de-identified data further comprises replacing the first last name with the second last name such that the second de-identified data comprises the second last name.
“6. The computing system of claim 1, wherein the first data comprises a first date of birth of the first person, the first date of birth comprising a first day, a first month, and a first year, and wherein transforming the first data into the first de-identified data further comprises replacing the first day in the first data with a second day and replacing the first month with a second month such that the first de-identified data comprises a second date of birth that comprises the second day, the second month, and the first year.
“7. The computing system of claim 1, wherein the first street address further comprises a first street number and the second street address further comprises a second street number, wherein transforming the first data into the first de-identified data further comprises replacing the first street number with a third street number such that the third street address comprises the third street number, and wherein transforming the second data into the second de-identified data further comprises replacing the second street number with a fourth street number such that the fourth street address comprises the fourth street number.
“8. A method executed by a processor of a computing system, the method comprising: receiving an indication that a software application is to be tested based upon values in a de-identified database; and responsive to receiving the indication, testing the software application based upon the values in the de-identified database, wherein the de-identified database is constructed based upon a database, wherein the de-identified database has a same format as the database such that fields in the database have corresponding fields in the de-identified database, and further wherein the database comprises: a first row that includes first data for a first person, where the first data comprises a first street address of the first person, and further wherein the first street address includes a first street name; a second row that includes second data for a second person, wherein the second data comprises a second street address of the second person, and further wherein the second street address also comprises the first street name, wherein a plurality of acts are undertaken to construct the de-identified database based upon the database, the plurality of acts comprising: transforming the first data into first de-identified data, wherein transforming the first data into the first de-identified data comprises replacing the first street name in the first data with a second street name such that the first de-identified data comprises a third street address that comprises the second street name; and transforming the second data into second de-identified data, wherein transforming the second data into the second de-identified data comprises replacing the first street name in the second data with the second street name such that the second de-identified data comprises a fourth street address that comprises the second street name, wherein the software is tested based upon the first de-identified data and the second de-identified data.
“9. The method of claim 8, further comprising: prior to transforming the first data into the first de-identified data and transforming the second data into the second de-identified data, modifying at least one of the first street address or the second street address to conform to a standardized address format.
“10. The method of claim 8, wherein the database further comprises a third row that includes third data for a third person, wherein the third data comprises a fifth street address of the patient, and further wherein: the fifth street address comprises a third street name, the method further comprising: transforming the third data into third de-identified data, wherein transforming the third data into the third de-identified data comprises replacing the third street name in the third data with a fourth street name such that the third de-identified data comprises a sixth street address that comprises the fourth street name.
“11. The method of claim 8, wherein the first street address further comprises a first street number, the second street address further comprises the first street number, and the fifth street address further comprises the first street number, wherein transforming the first data into the first de-identified data further comprises replacing the first street number in the first data with a second street number such that the third street address comprises the second street number, wherein transforming the second data into the second de-identified data further comprises replacing the first street number in the second data with the second street number such that the fourth street address comprises the second street number, and wherein transforming the third data into the third de-identified data further comprises replacing the first street number in the third data with a third street number such that the sixth street address comprises the third street number.
“12. The method of claim 8, wherein transforming the first data into the first de-identified data comprises replacing a first last name with a second last name such that the first de-identified data comprises the second last name.
“13. The method of claim 12, wherein transforming the second data into the second de-identified data comprises replacing the first last name with the second last name such that the second de-identified data comprises the second last name.
“14. The method of claim 13, wherein transforming the first data into the first de-identified data comprises replacing a first day, comprising a first date of death, in the first data with a second day and the first month with a second month such that the first de-identified data comprises a second date of death that comprises the second day, the second month, and the first year.”
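The street-name behavior recited in claims 1 and 2 (addresses standardized first, then two rows sharing a street name both receiving the same replacement name, in a table with the same fields as the source) can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: the suffix table, the replacement pool, the keyed hash, the simple "number name suffix" address layout, and all identifiers are hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical secret for illustration
STREET_POOL = ["Alder", "Birch", "Cedar", "Dogwood", "Elm", "Fir"]  # hypothetical names
SUFFIXES = {"st": "Street", "st.": "Street", "street": "Street",
            "ave": "Avenue", "ave.": "Avenue", "avenue": "Avenue"}


def standardize(address: str):
    """Split an address such as '12 main st.' into ('12', 'Main', 'Street')."""
    parts = address.split()
    number, suffix = parts[0], parts[-1]
    name = " ".join(p.capitalize() for p in parts[1:-1])
    return number, name, SUFFIXES.get(suffix.lower(), suffix.capitalize())


def replace_street_name(name: str) -> str:
    """The same street name always maps to the same replacement street name."""
    mac = hmac.new(KEY, name.lower().encode("utf-8"), hashlib.sha256)
    return STREET_POOL[int.from_bytes(mac.digest()[:4], "big") % len(STREET_POOL)]


def de_identify_row(row: dict) -> dict:
    """Return a row with the same fields, with only the street name replaced."""
    number, name, suffix = standardize(row["street_address"])
    out = dict(row)  # copy so the de-identified row keeps the source format
    out["street_address"] = f"{number} {replace_street_name(name)} {suffix}"
    return out


# Two hypothetical rows whose addresses share a street name in different spellings.
database = [
    {"person_id": 1, "street_address": "12 main st.", "city": "Raleigh"},
    {"person_id": 2, "street_address": "480 Main Street", "city": "Raleigh"},
]
deidentified = [de_identify_row(r) for r in database]
print(deidentified)
```

The sketch leaves street numbers unchanged for brevity; claim 7 additionally recites replacing them, which would be a further transform on the first element returned by standardize.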
There are additional claims. Please visit the full patent to read further.
For the URL and more information on this patent, see: Chandrasekaran,
(Our reports deliver fact-based news of research and discoveries from around the world.)