Patent Issued for Utilizing a protected server environment to protect data used to train a machine learning system (USPTO 11210424): Deepintent Inc.
2022 JAN 13 (NewsRx) -- By a
The patent’s inventors are Dakic, Vaso (
This patent was filed on
From the background information supplied by the inventors, news correspondents obtained the following quote: “The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Further, it should not be assumed that any of the approaches described in this section are well-understood, routine, or conventional merely by virtue of their inclusion in this section.
“Machine learning systems have become popular for solving various types of problems based on training data. A key benefit of a machine learning system is the ability to learn based on data, bypassing any requirements for manual coding of an algorithm. Instead, the machine learning system generates an algorithm or model through repeated computations using the training data.
“A potential drawback of machine learning systems is that determining specific internal operating mechanisms of the core machine learning engine can be difficult. Most machine learning systems are configured to generate fairly complex patterns based on the given training data. Because machine learning systems use complex algorithms and execute continuous learning, determining why a machine learning system produced a particular result from a set of input data can be difficult, if not impossible. In some situations, this can lead to a lack of accountability; in other situations, this feature protects the training data. Because a trained machine learning system exists separately from the training data, any data that is protected or sensitive data can be safeguarded during the use of the machine learning system.
“A trained machine learning system inherently protects the data used to train it. However, the training phase can create issues, especially when the data used to train the machine learning system is robust but protected. Many people provide data under the assurance that data security measures will be used. As an example, the Health Insurance Portability and Accountability Act (HIPAA) has stringent requirements on the protection of medical claims data which would prevent a person from viewing any of the medical claims data to train a machine learning system.
“Additionally, even when information is protected from viewing, the training data or machine learning system can still provide protected information to a viewer. For instance, a machine learning system using ten inputs could memorize a vast majority of people in
“Thus, there is a need for a system that can protect personal, private, confidential, or otherwise protected information during training and validation of a machine learning system that utilizes the protected information.”
Supplementing the background information on this patent, NewsRx reporters also obtained the inventors’ summary information for this patent: “The appended claims may serve as a summary of the disclosure.”
The claims supplied by the inventors are:
“1. A computer implemented method comprising: storing, at a server computer executing within a protected environment, a plurality of media items, each of the media items corresponding to one of a plurality of different status values; receiving, from a requesting computing device that is outside the protected environment, a request to send a plurality of media items outside the protected environment to a client computing device; computing, using a plurality of machine learning systems executed by the server computer, each of the machine learning systems having been trained with one of the plurality of status values as an output, a plurality of likelihood values of a particular status value for the client computing device; the training the machine learning systems having comprised receiving, by the server computer executing within a protected environment, instructions to generate and train a particular machine learning system, using attribute values associated with personal data records as inputs, and existence or non-existence of a one of the plurality of different status values as outputs, the server computer storing first data comprising a plurality of attribute values for a plurality of the personal data records and second data indicating, for each personal data record of the plurality of personal data records, whether the personal data record has the status value, the server computer being configured to train the machine learning system in the protected environment only if the first data and the second data satisfy a first criterion and being configured to send the trained machine learning system to the requesting computing device only if the trained machine learning system satisfies a second criterion; identifying a particular status value, among the plurality of status values, having a highest likelihood value; determining, based on the highest likelihood value, a corresponding value for one or more media items as a percentage related to the highest likelihood value of the particular status value; selecting a plurality of media items based on the corresponding value for the one or more media items, in a number corresponding to the plurality of media items specified in the request to send the plurality of media items outside the protected environment to the client computing device; and sending, from the server computer to the client computing device, the plurality of media items that have been selected.
“2. The method of claim 1, further comprising the server computer using the highest likelihood value associated with the particular status value to dynamically price sending media items to the client computing devices by determining a charged price by discounting a standard price by an amount corresponding to the percentage value.
“3. The method of claim 1, further comprising the server computer requesting attribute data from an outside attribute database based on information received from the client computing device.
“4. The method of claim 1, further comprising: receiving, from the requesting computing device that is outside the protected environment, particular attributes for a client computing device; and determining, based on the particular attributes, whether to serve a particular media item to the client computing device.
“5. The method of claim 1, further comprising the server computer storing attribute values for a plurality of different client computing devices in an attribute database in the protected environment.
“6. The method of claim 1, the first criterion being a minimum number of instances in the second data of a particular personal data record having the status value.
“7. The method of claim 1, the second criterion being a maximum fraction of population at risk.
“8. The method of claim 7, further comprising computing the maximum fraction of population at risk as a quotient of a number of instances in the subset of the first data of a patient having the status value and a number of positive predictions of the status value from applying the trained machine learning system to each of the plurality of personal data records in the first data.
“9. The method of claim 1, further comprising training the machine learning system with a first set of parameters; and determining that the trained machine learning system does not satisfy the second criterion and, in response, training the machine learning system using a second set of parameters.
“10. The method of claim 1, the status being a particular medical diagnosis or prescription.
“11. One or more non-transitory computer-readable storage media storing sequences of instructions which when executed by one or more processors cause the one or more processors to execute: storing, at a server computer executing within a protected environment, a plurality of media items, each of the media items corresponding to one of a plurality of different status values; receiving, from a requesting computing device that is outside the protected environment, a request to send a plurality of media items outside the protected environment to a client computing device; computing, using a plurality of machine learning systems executed by the server computer, each of the machine learning systems having been trained with one of the plurality of status values as an output, a plurality of likelihood values of a particular status value for the client computing device; the training the machine learning systems having comprised receiving, by the server computer executing within a protected environment, instructions to generate and train a particular machine learning system, using attribute values associated with personal data records as inputs, and existence or non-existence of a one of the plurality of different status values as outputs, the server computer storing first data comprising a plurality of attribute values for a plurality of the personal data records and second data indicating, for each personal data record of the plurality of personal data records, whether the personal data record has the status value, the server computer being configured to train the machine learning system in the protected environment only if the first data and the second data satisfy a first criterion and being configured to send the trained machine learning system to the requesting computing device only if the trained machine learning system satisfies a second criterion; identifying a particular status value, among the plurality of status values, having a highest likelihood value; determining, based on the highest likelihood value, a corresponding value for one or more media items as a percentage related to the highest likelihood value of the particular status value; selecting a plurality of media items based on the corresponding value for the one or more media items, in a number corresponding to the plurality of media items specified in the request to send the plurality of media items outside the protected environment to the client computing device; and sending, from the server computer to the client computing device, the plurality of media items that have been selected.
“12. The non-transitory computer-readable storage media of claim 11, further comprising sequences of instructions which when executed by one or more processors cause the one or more processors to execute: the server computer using the highest likelihood value associated with the particular status value to dynamically price sending media items to the client computing devices by determining a charged price by discounting a standard price by an amount corresponding to the percentage value.
“13. The non-transitory computer-readable storage media of claim 11, further comprising sequences of instructions which when executed by one or more processors cause the one or more processors to execute: the server computer requesting attribute data from an outside attribute database based on information received from the client computing device.
“14. The non-transitory computer-readable storage media of claim 11, further comprising sequences of instructions which when executed by one or more processors cause the one or more processors to execute: receiving, from the requesting computing device that is outside the protected environment, particular attributes for a client computing device; and determining, based on the particular attributes, whether to serve a particular media item to the client computing device.
“15. The non-transitory computer-readable storage media of claim 11, further comprising sequences of instructions which when executed by one or more processors cause the one or more processors to execute: the server computer storing attribute values for a plurality of different client computing devices in an attribute database in the protected environment.
“16. The non-transitory computer-readable storage media of claim 11, the first criterion being a minimum number of instances in the second data of a particular personal data record having the status value.
“17. The non-transitory computer-readable storage media of claim 11, the second criterion being a maximum fraction of population at risk.
“18. The non-transitory computer-readable storage media of claim 17, further comprising sequences of instructions which when executed by one or more processors cause the one or more processors to execute: computing the maximum fraction of population at risk as a quotient of a number of instances in the subset of the first data of a patient having the status value and a number of positive predictions of the status value from applying the trained machine learning system to each of the plurality of personal data records in the first data.”
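The two gating criteria in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the threshold arguments, and the `train_and_predict` callable are all hypothetical. The quoted claims do not say which "subset" of the first data the numerator is counted over, so the sketch assumes one reading: records that have the status value among those the trained system predicts positive, divided by the number of positive predictions.

```python
from typing import Callable, Hashable, Iterable, Optional, Sequence


def meets_first_criterion(has_status: Sequence[bool], min_positive: int) -> bool:
    """Claims 6 and 16: train only if enough records carry the status value."""
    return sum(has_status) >= min_positive


def fraction_at_risk(has_status: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Claims 8 and 18, under the assumed reading described above.

    Numerator: records that have the status AND are predicted positive.
    Denominator: all positive predictions.
    """
    positives = sum(predicted)
    if positives == 0:
        return 0.0  # assumption: no positive predictions means nobody is exposed
    hits = sum(1 for actual, pred in zip(has_status, predicted) if actual and pred)
    return hits / positives


def may_release(has_status: Sequence[bool], predicted: Sequence[bool],
                max_fraction: float) -> bool:
    """Claims 7 and 17: release the model only under the maximum fraction."""
    return fraction_at_risk(has_status, predicted) <= max_fraction


def first_releasable(param_sets: Iterable[Hashable],
                     train_and_predict: Callable[[Hashable], Sequence[bool]],
                     has_status: Sequence[bool],
                     max_fraction: float) -> Optional[Hashable]:
    """Claim 9: if one parameter set fails the second criterion, try the next."""
    for params in param_sets:
        predicted = train_and_predict(params)  # hypothetical training step
        if may_release(has_status, predicted, max_fraction):
            return params
    return None
```

In this sketch a failed check never exposes the records themselves; the caller only learns whether training may proceed or which parameter set, if any, yields a releasable model.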
There are additional claims. Please visit full patent to read further.
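The selection and pricing steps recited in claims 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the method as the patent implements it. The quoted claims give no formulas, so the mapping from likelihood to percentage (likelihood times 100), the discount rule (standard price reduced by that percentage), and the lookup of media items by status value are hypothetical choices, as are all function names.

```python
from typing import Callable, Dict, List, Mapping, Tuple

# Hypothetical type: one trained scorer per status value, returning a likelihood in [0, 1].
Scorer = Callable[[Mapping[str, float]], float]


def top_status(scorers: Dict[str, Scorer],
               attributes: Mapping[str, float]) -> Tuple[str, float]:
    """Score the client against every status model and keep the highest likelihood."""
    likelihoods = {status: score(attributes) for status, score in scorers.items()}
    best = max(likelihoods, key=lambda status: likelihoods[status])
    return best, likelihoods[best]


def likelihood_to_percentage(likelihood: float) -> float:
    """Assumed mapping: express the highest likelihood as a percentage."""
    return likelihood * 100.0


def select_media(media_by_status: Dict[str, List[str]],
                 status: str, count: int) -> List[str]:
    """Assumed selection rule: take the requested number of items stored for the status."""
    return media_by_status.get(status, [])[:count]


def charged_price(standard_price: float, percentage: float) -> float:
    """One literal reading of claim 2: discount the standard price by the percentage."""
    return standard_price * (1.0 - percentage / 100.0)
```

Only the selected media items and the price leave the protected environment in this sketch; the attribute values and per-status likelihoods stay on the server side.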
For the URL and additional information on this patent, see: Dakic, Vaso. Utilizing a protected server environment to protect data used to train a machine learning system.