“Systems And Methods For Computing With Private Healthcare Data” in Patent Application Approval Process (USPTO 20240119176): Nference Inc.
2024 MAY 01 (NewsRx) -- By a
This patent application is assigned to
The following quote was obtained by the news editors from the background information supplied by the inventors: “Hospitals, healthcare providers and care givers collect large amounts of data from patients. It is a necessary part of the processes by which healthcare is provided to members of the public. Typically, a patient provides data to the care giver as a part of receiving treatment for his/her ailments. This data is stored by the care giver and may be used later, inter alia, for research purposes. In another typical scenario data may be collected from consumers via one or more devices, e.g., pulse oximeter, glucose monitor, smart watch, fitness bracelet, etc. In such use cases, the collected data is often used to analyze a patient’s health in a continuous manner or over a period of time. Consequently, huge amounts of patient information may be accumulated by service providers.
“Many aspects of patient data collected by care givers and service providers may be subject to privacy regulations. The usefulness and benefit of processing data collected from patients is clear and acknowledged by the public. However, there is a growing concern of maintaining the privacy of user data, particularly when the data can be used to identify the patient. Such concerns are the basis of HIPAA (Health Insurance Portability and Accountability Act) regulations initially passed in 1996 by the
“There is thus a need to enable biomedical (and other types of) data to be analyzed by computational processes under the constraint of maintaining the privacy of the individual patient or consumer. Such a system and methods will consequently be of great commercial, social and scientific benefit to society.”
In addition to the background information obtained for this patent application, NewsRx journalists also obtained the inventors’ summary information for this patent application: “In an aspect, a de-identification method is disclosed. The de-identification method includes receiving a plurality of data sets, wherein the plurality of data sets includes a first data set, wherein the first data set includes a labeled data set for one or more entity types and a second data set, wherein the training data set includes an unlabeled data set for the one or more entity types, determining one machine-learning model from a plurality of machine-learning models for each of one or more entity types, fine-tuning the determined machine-learning model for each of the one or more entity types, wherein fine-tuning the determined machine-learning model includes creating a plurality of training data sets, wherein the plurality of training data sets includes a first training data set, wherein the first training data set includes the first data set and a second training data set, wherein the second training data set includes the second data set, training the determined machine-learning model using the first training data set, validating the trained machine-learning model and updating the trained machine-learning model using the second training data set as a function of the validation and obfuscating the second data set using the fine-tuned machine-learning model.
“These and other aspects and features of non-limiting embodiments of the present invention will become apparent to those skilled in the art upon review of the following description of specific non-limiting embodiments of the invention in conjunction with the accompanying drawings.
“The drawings are not necessarily to scale and may be illustrated by phantom lines, diagrammatic representations and fragmentary views. In certain instances, details that are not necessary for an understanding of the embodiments or that render other details difficult to perceive may have been omitted.”
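The validate-then-update step in the summary can be pictured with a short sketch. The following Python is an illustration only, not the inventors' implementation: the `Counts` class, the function names, the example counts and the 0.9 thresholds are all assumptions introduced here. It computes precision, recall and F-score for each entity type and reports which types fall below a threshold and would therefore need a further update pass.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical per-entity-type validation counts."""
    tp: int  # entities correctly tagged
    fp: int  # spans tagged that were not entities
    fn: int  # entities the model missed


def scores(c: Counts) -> dict:
    """Return precision, recall and F-score (F1) for one entity type."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_score": f_score}


def entity_types_needing_update(counts_by_type: dict, thresholds: dict) -> list:
    """List entity types whose scores fall below any configured threshold."""
    failing = []
    for entity_type, counts in counts_by_type.items():
        s = scores(counts)
        if any(s[metric] < limit for metric, limit in thresholds.items()):
            failing.append(entity_type)
    return failing


# Assumed example numbers, chosen only to show one passing and one failing type.
validation_counts = {
    "NAME": Counts(tp=95, fp=5, fn=5),
    "DATE": Counts(tp=60, fp=10, fn=40),
}
assumed_thresholds = {"precision": 0.9, "recall": 0.9, "f_score": 0.9}
needs_update = entity_types_needing_update(validation_counts, assumed_thresholds)
```

In this sketch a model for an entity type would be updated with the second (unlabeled) training data set only while that type remains in `needs_update`; how the update itself is performed is left open here, as the summary does not spell it out.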
The claims supplied by the inventors are:
“1. A de-identification method comprising: receiving a plurality of data sets, wherein the plurality of data sets comprises: a first data set, wherein the first data set comprises a labeled data set for one or more entity types; and a second data set, wherein the training data set comprises an unlabeled data set for the one or more entity types; determining one machine-learning model from a plurality of machine-learning models for each of one or more entity types; fine-tuning the determined machine-learning model for each of the one or more entity types, wherein fine-tuning the determined machine-learning model comprises: creating a plurality of training data sets, wherein the plurality of training data sets comprises: a first training data set, wherein the first training data set comprises the first data set; and a second training data set, wherein the second training data set comprises the second data set; training the determined machine-learning model using the first training data set; validating the trained machine-learning model; and updating the trained machine-learning model using the second training data set as a function of the validation; and obfuscating the second data set using the fine-tuned machine-learning model.
“2. The de-identification method of claim 1, wherein obfuscating the second data set further comprises replacing two or more entities that refer to a common subject with a common surrogate.
“3. The de-identification method of claim 2, further comprising: selecting the common surrogate based one or more attributes of the two or more entities.
“4. The de-identification method of claim 2, further comprising: selecting the common surrogate based on a gender associated with the two or more entities.
“5. The de-identification method of claim 2, further comprising: selecting the common surrogate based on an ethnicity associated with the two or more entities.
“6. The de-identification method of claim 1, wherein validating the trained machine learning model further comprises: generating a recall score for each entity type of the one or more entity types; comparing the recall score to a threshold for the recall score for each entity type of the one or more entity types.
“7. The de-identification method of claim 1, wherein validating the trained machine learning model further comprises: generating a precision score for each entity type of the one or more entity types; and comparing the precision score to a threshold for the precision score for each entity type of the one or more entity types.
“8. The de-identification method of claim 1, wherein validating the trained machine learning model further comprises: generating a F-score for each entity type of the one or more entity types; and comparing the F-score to a threshold for the F-score for each entity type of the one or more entity types.
“9. The de-identification method of claim 1, further comprising: updating the trained machine-learning model as a function of comparison of an average of F-score, precision score and recall score to a threshold success percentage.
“10. The de-identification method of claim 1, wherein: the one or more entity types comprises two or more personal names; and obfuscating the second data set further comprises replacing each of the two or more personal names with a different surrogate.
“11. The de-identification method of claim 1, wherein obfuscating the second data set further comprises replacing two or more entities that refer to a common person with surrogates that match a gender associated with the common person.
“12. The de-identification method of claim 1, wherein obfuscating the second data set further comprises replacing two or more entity types that refer to a common person with surrogates that match an ethnicity associated with the common person.
“13. The de-identification method of claim 1, wherein: obfuscating the second data set further comprises replacing two or more entities that represent dates with surrogate dates, wherein: the surrogate dates are based on the two or more entity types altered by a random value.
“14. The de-identification method of claim 13, wherein dates associated with a common patient are altered by the same random value.
“15. The de-identification method of claim 1, wherein obfuscating the second data set further comprises scrambling two or more entities that represent numeric identifiers with random values to scramble the numeric identifiers.
“16. The de-identification method of claim 1, wherein the one or more entity types comprises at least a portion of an electronic health record.
“17. The de-identification method of claim 1, further comprising: receiving a text sequence; and tagging one or more entities in the text sequence.
“18. The de-identification method of claim 17, further comprising: aggregating the tagged entities from the text sequence; and passing the aggregated tagged entities through a one or more dreg filters, wherein each of the one or more dreg filters is configured to filter a corresponding entity type based on a rule-based template.
“19. The de-identification method of claim 18, further comprising, creating the rule-based template, wherein creating the rule-based template comprises: mapping each of the one or more portions of the text sequence to a corresponding syntax template; identifying a candidate syntax template based on a machine learning model that infers one or more candidate syntax templates based on the one or more portions of the text sequence; and creating the rule-based template from the candidate syntax template by replacing each of the one or more tagged entities in the portion of the text sequence corresponding to the candidate template with a corresponding syntax token.
“20. The de-identification method of claim 18, wherein each of the one or more dreg filters is further configured to filter the corresponding entity type based on a pattern-based filter.”
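The obfuscation behavior described in claims 2, 10, 13 and 14 can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the patented method: the `Obfuscator` class, the surrogate name pool, the subject and patient identifiers and the 30-day shift range are all hypothetical. It shows entities that refer to the same subject receiving one common surrogate, distinct subjects receiving distinct surrogates, and all dates for one patient being shifted by the same random offset.

```python
import random
from datetime import date, timedelta

# Hypothetical pool of surrogate names; a real system would use a far larger one.
SURROGATE_NAMES = ["Alex Morgan", "Sam Rivera", "Jordan Lee", "Casey Patel"]


class Obfuscator:
    def __init__(self, seed=None, max_shift_days=30):
        self._rng = random.Random(seed)
        self._max_shift_days = max_shift_days
        self._name_by_subject = {}   # subject id -> surrogate name
        self._shift_by_patient = {}  # patient id -> date offset in days

    def surrogate_name(self, subject_id: str) -> str:
        """Same subject always maps to the same surrogate; new subjects get an unused one."""
        if subject_id not in self._name_by_subject:
            used = set(self._name_by_subject.values())
            unused = [n for n in SURROGATE_NAMES if n not in used]
            if not unused:
                raise ValueError("surrogate name pool exhausted")
            self._name_by_subject[subject_id] = self._rng.choice(unused)
        return self._name_by_subject[subject_id]

    def shift_date(self, patient_id: str, original: date) -> date:
        """Shift a date by a random offset that is fixed per patient."""
        if patient_id not in self._shift_by_patient:
            self._shift_by_patient[patient_id] = self._rng.randint(1, self._max_shift_days)
        return original + timedelta(days=self._shift_by_patient[patient_id])


obfuscator = Obfuscator(seed=0)
admitted = obfuscator.shift_date("patient-1", date(2024, 1, 1))
discharged = obfuscator.shift_date("patient-1", date(2024, 1, 15))
```

Because the offset is fixed per patient, the interval between `admitted` and `discharged` is still 14 days after shifting, so the sequence and spacing of clinical events survive even though the actual dates do not.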
For more information on this patent application, see: ARAVAMUDAN, Murali; ARDHANARI, Sankar; MURUGADOSS, Karthik; RAJASEKHARAN, Ajit. Systems And Methods For Computing With Private Healthcare Data.
(Our reports deliver fact-based news of research and discoveries from around the world.)


