Patent Application Titled “Simplistic Machine Learning Model Generation Tool For Predictive Data Analytics” Published Online (USPTO 20220050695): Patent Application
2022 MAR 04 (NewsRx) -- By a
No assignee has been named for this patent application.
Reporters obtained the following quote from the background information supplied by the inventors: “Service providers in various consumer industries maintain a massive amount of data related to the consumers. This data is typically dispersed across multiple “dimensions” that reflect various characteristics of the consumers. Such dimensions include, for example, the age of the consumer, the gender of the consumer, the race of the consumer, the occupation of the consumer, the annual income of the consumer, the marital status of the consumer, the type of services that are consumed over the time, etc. Particularly, for service providers in the auto insurance industry, such dimensions of consumer data may also include the type of vehicle-specific services that are consumed over the time, the type of claims that are filed over the time, the traffic violations associated with the consumer over the time, etc.
“Numerous efforts have been undertaken to discover correlations among various dimensions of consumer data. However, for a given product or service, identifying the key features that influence sales based on such correlations can be complex and time consuming, and may require specialized training related to dataset analysis. Traditionally, data scientists with in-depth knowledge in statistics coupled with insurance domain knowledge have been relied on to develop and provide such analysis. More recently, machine learning (ML) algorithms have been relied on to identify correlations between items in large datasets. In such efforts, a dataset may be divided into multiple parts. One or more parts of the dataset can then be used to train a ML model and the rest of the dataset can be used to test the trained ML model (also referred to herein as the “trained ML model”). Once the trained ML model has been tested to verify that it satisfies a desired level of prediction accuracy, the trained ML model can be implemented across multiple enterprise platforms (e.g., across auto insurance and claim operations platforms).
“However, with the limited availability of data scientists and the long cycle time required to develop ML models, deploying such ML models can, at least initially, cause significant reductions in the efficiency of business operations. Example embodiments of the present disclosure are directed toward addressing these difficulties.”
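The split, train and test cycle that the inventors describe in the background can be illustrated with a minimal sketch. It is not taken from the patent application: the toy data, the one-feature threshold model, the function names and the 0.9 accuracy bar are all assumptions chosen to keep the example self-contained.

```python
import random


def split_dataset(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and divide them into a training part and a testing part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_threshold_model(train_rows):
    """Toy 'model': a cut-off halfway between the mean of each class."""
    positives = [x for x, label in train_rows if label == 1]
    negatives = [x for x, label in train_rows if label == 0]
    threshold = (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2
    return lambda x: 1 if x >= threshold else 0


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the recorded label."""
    return sum(model(x) == label for x, label in rows) / len(rows)


# Hypothetical data: (annual mileage in thousands, whether a claim was filed).
rows = [(miles, 1 if miles >= 12 else 0) for miles in range(1, 25)]

train_rows, test_rows = split_dataset(rows)
model = train_threshold_model(train_rows)

# Assumed acceptance bar; the application only speaks of "a desired level".
REQUIRED_ACCURACY = 0.9
test_accuracy = accuracy(model, test_rows)
ready_to_deploy = test_accuracy >= REQUIRED_ACCURACY
print(f"held-out accuracy: {test_accuracy:.2f}, deploy: {ready_to_deploy}")
```

The point of the sketch is the order of operations: the model never sees the held-out rows during training, and the deployment decision depends only on the score measured on those rows.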
In addition to obtaining background information on this patent application, NewsRx editors also obtained the inventors’ summary information for this patent application: “According to a first aspect, a method implemented by a computing device for predictive data analytics comprises generating a guided user interface (GUI) that guides one or more user operations on the user interface causing the computing device to construct a machine learning model, the one or more user operations on the user interface including: obtaining, from a database, a dataset including a plurality of data objects; determining one or more characteristics associated with a first data object of the plurality of data objects; identifying a subset of the dataset based at least in part on the one or more characteristics; selecting at least one machine learning algorithm; and training a machine learning (ML) model with respect to the first data object using the subset of the dataset and the at least one machine learning algorithm to generate a trained ML model with respect to the first data object; implementing the trained ML model with respect to the first data object in a cloud server to enable distributing the trained ML model to a plurality of client device via a network.
“According to a second aspect, a system for predictive data analytics comprises at least one processor, and memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform various actions. Such actions include generating a guided user interface (GUI) that guides one or more user operations on the user interface causing the computing device to construct a machine learning model, the one or more user operations on the user interface including: receiving a dataset including a plurality of data objects; determining one or more characteristics associated with a first data object of the plurality of data objects; identifying a subset of the dataset based at least in part on the one or more characteristics; selecting at least one machine learning algorithm; and training a machine learning (ML) model with respect to the first data object using the subset of the dataset and the at least one machine learning algorithm to generate a trained ML model with respect to the first data object; implementing the trained ML model with respect to the first data object in a cloud server to enable distributing the trained ML model to a plurality of client device via a network.
“A third aspect of the present disclosure includes a computer-readable storage medium storing computer-readable instructions executable by one or more processors. When executed by the one or more processors, the instructions cause the one or more processors to perform actions comprising: generating a guided user interface (GUI) that guides one or more user operations on the user interface including: obtaining, from a database, a dataset including a plurality of data objects; determining one or more characteristics associated with a first data object of the plurality of data objects; identifying a subset of the dataset based at least in part on the one or more characteristics; selecting at least one machine learning algorithm; and training a machine learning (ML) model with respect to the first data object using the subset of the dataset and the at least one machine learning algorithm to generate a trained ML model with respect to the first data object; implementing the trained ML model with respect to the first data object in a cloud server to enable distributing the trained ML model to a plurality of client device via a network.”
The claims supplied by the inventors are:
“1. A method implemented by a computing device, the method comprising: generating a guided user interface (GUI) that guides one or more user operations on the GUI causing the computing device to construct a machine learning model, the one or more user operations including: obtaining, from a database, a dataset associated with a plurality of data objects; determining one or more characteristics associated with a first data object of the plurality of data objects; identifying a subset of the dataset based at least in part on the one or more characteristics; selecting at least one machine learning algorithm; and training a machine learning (ML) model with respect to the first data object using the subset of the dataset and the at least one machine learning algorithm to generate a trained ML model with respect to the first data object; and implementing the trained ML model with respect to the first data object in a cloud server to enable distributing the trained ML model to a plurality of client device via a network.
“2. The method of claim 1, wherein the one or more user operations on the user interface further comprising at least one of: determining dimensions of the dataset; determining statistic information associated with the dataset; performing a null value treatment on the dataset; or performing an outlier value treatment on the dataset.
“3. The method of claim 1, wherein the one or more user operations further including generating a visualization of the first data object, together with other data objects of the plurality of data objects, on a guided user interface, wherein the one or more characteristics associated with the first data object are determined based at least in part on the visualization.
“4. The method of claim 1, wherein the one or more characteristics associated with the first data object indicates correlations between the first data object and other data objects of the plurality of data objects.
“5. The method of claim 1, wherein identifying a subset of the dataset based at least in part on the one or more characteristics further comprises: determining influence degrees between the first data object and other data objects of the plurality of data objects; and performing a dimension reduction on the dataset by mapping the dataset to the subset of the dataset based at least in part on the influence degrees, wherein a dimension of the subset of the dataset is less than a dimension of the dataset.
“6. The method of claim 5, wherein the dimension reduction is performed using at least one of a random forest algorithm, a single variable logistic regression algorithm, or a variable clustering algorithm.
“7. The method of claim 1, further comprising: receiving a request to predict a target value associated with a target object, the request including a new dataset; training an additional machine learning model with respect to the target object to obtain an additional trained ML model with respect to the target object; configuring one or more parameters associated with the additional trained ML model with respect to the target object, the one or more parameters including at least one of an ML algorithm, one or more additional objects in the new dataset correlated to the target object, or a cross-validation parameter; and determining the target value, using the additional trained ML model, based at least in part on the target object and the one or more parameters.
“8. A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform actions comprising: generating a guided user interface (GUI) that guides one or more user operations on the user interface causing the system to construct a machine learning model, the one or more user operations including: obtaining, from a database, a dataset including a plurality of data objects; determining one or more characteristics associated with a first data object of the plurality of data objects; identifying a subset of the dataset based at least in part on the one or more characteristics; selecting at least one machine learning algorithm; and training a machine learning (ML) model with respect to the first data object using of the subset of the dataset and the at least one machine learning algorithm to generate a trained ML model with respect to the first data object; implementing the trained ML model with respect to the first data object in a cloud server to enable distributing the trained ML model to a plurality of client device via a network.
“9. The system of claim 8, wherein the one or more operations further comprise at least one of: determining dimensions of the dataset; determining statistic information associated with the dataset; performing a null value treatment on the dataset; or performing an outlier value treatment on the dataset.
“10. The system of claim 8, wherein the one or more operations further comprise: determining the one or more characteristics associated with the first data object based at least on visualization of the first data object and other data objects on a guided user interface.
“11. The system of claim 8, wherein the one or more characteristics associated with the first data object indicates correlations between the first data object and other data objects of the plurality of data objects.
“12. The system of claim 8, wherein identifying a subset of the dataset based at least in part on the one or more characteristics further comprises: determining influence degrees between the first data object and other data objects of the plurality of data objects; and performing a dimension reduction on the dataset by mapping the dataset to the subset of the dataset based at least in part on the influence degrees, wherein the dimension of the subset of the dataset is lower than the dimension of the dataset.
“13. The system of claim 8, wherein the dimension reduction is performed using at least one of a random forest algorithm, a single variable logistic regression algorithm, or a variable clustering algorithm.
“14. The system of claim 8, wherein the one or more operations further comprise: receiving a request to predict a target value associated with a target object, the request including a new dataset; training an additional machine learning (ML) model with respect to the target object to obtain an additional trained ML model with respect to the target object; configuring one or more parameters associated with the additional trained ML model with respect to the target object, the one or more parameters including at least one of an ML algorithm, one or more additional objects in the new dataset correlated to the target object, or a cross-validation parameter; and determining the target value based at least in part on the additional trained machine learned model with respect to the target object and the one or more parameters.
“15. A computer-readable storage medium storing computer-readable instructions executable by one or more processors, that when executed by the one or more processors, cause the one or more processors to perform actions comprising: generating a guided user interface (GUI) that guides one or more user operations on the user interface, the one or more user operations including: obtaining, from a database, a dataset including a plurality of data objects; determining one or more characteristics associated with the first data object; identifying a subset of the dataset based at least in part on the one or more characteristics; selecting at least one machine learning algorithm; and training a machine learning (ML) model with respect to the first data object using of the subset of the dataset and the at least one machine learning algorithm to generate a trained ML model with respect to the first data object; implementing the trained ML model with respect to the first data object in a cloud server to enable distributing the trained ML model to a plurality of client device via a network.
“16. The computer-readable storage medium of claim 15, wherein the one or more operations further comprise at least one of: determining dimensions of the dataset; determining statistic information associated with the dataset; performing a null value treatment on the dataset; or performing an outlier value treatment on the dataset.
“17. The computer-readable storage medium of claim 15, wherein the one or more characteristics associated with the first data object indicates correlations between the first data object and other data objects of the plurality of data objects.
“18. The computer-readable storage medium of claim 17, wherein identifying a subset of the dataset based at least in part on the one or more characteristics further comprises: determining influence degrees between the first data object and other data objects of the plurality of data objects; and performing a dimension reduction on the dataset by mapping the dataset to the subset of the dataset based at least in part on the influence degrees, wherein the dimension of the subset of the dataset is lower than the dimension of the dataset.
“19. The computer-readable storage medium of claim 15, wherein the dimension reduction is performed using at least one of a random forest algorithm, a single variable logistic regression algorithm, or a variable clustering algorithm.”
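Claims 2, 9 and 16 mention a null value treatment and an outlier value treatment without saying how either is performed. The sketch below shows one common choice for each, median imputation and clipping to the Tukey fences. Both techniques, the function names and the sample income column are assumptions for illustration, not methods stated in the application.

```python
from statistics import median


def treat_nulls(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def quartiles(sorted_values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Expects at least two values.
    """
    half = len(sorted_values) // 2
    return median(sorted_values[:half]), median(sorted_values[-half:])


def treat_outliers(values, k=1.5):
    """Clip each value to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = quartiles(sorted(values))
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical column of annual incomes (in thousands) with one gap
# and one extreme entry.
incomes = [42, 55, None, 61, 48, 950, 52, 58]

filled = treat_nulls(incomes)
cleaned = treat_outliers(filled)
print(filled)
print(cleaned)
```

In this example the missing income is filled with 55, the median of the seven observed values, and the entry of 950 is pulled down to the upper fence of 73.75.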
There are additional claims. Please visit the full patent application to read further.
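The dimension reduction recited in claims 5, 12 and 18 scores how strongly each data object influences the first data object and keeps only the most influential ones. The claims name a random forest, single variable logistic regression or variable clustering for this step. The sketch below swaps in a simpler stand-in, the absolute Pearson correlation, purely to show the shape of the computation. The column names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def influence_degrees(dataset, target):
    """Score every non-target column by |correlation| with the target column."""
    return {
        name: abs(pearson(column, dataset[target]))
        for name, column in dataset.items()
        if name != target
    }


def reduce_dimensions(dataset, target, keep):
    """Map the dataset to a subset holding the `keep` top-scoring columns plus the target."""
    scores = influence_degrees(dataset, target)
    top = sorted(scores, key=scores.get, reverse=True)[:keep]
    return {name: dataset[name] for name in top + [target]}


# Hypothetical consumer dataset; "claims_filed" plays the role of the first data object.
dataset = {
    "age": [22, 35, 47, 51, 63, 29],
    "annual_mileage": [18, 12, 9, 8, 5, 16],
    "region_code": [3, 3, 3, 3, 3, 3],
    "claims_filed": [4, 2, 1, 1, 0, 3],
}

subset = reduce_dimensions(dataset, target="claims_filed", keep=2)
print(sorted(subset))
```

Here the constant `region_code` column scores zero and is dropped, so the subset has three columns where the dataset had four, matching the claim language that the dimension of the subset is less than that of the dataset.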
For more information, see this patent application: Chandrappa, Mahesh; Dickneite, Mark A.; Fiala, Charles T.; Gajendran, Suresh B.; Zaheer, Rashid. Simplistic Machine Learning Model Generation Tool For Predictive Data Analytics. Filed
(Our reports deliver fact-based news of research and discoveries from around the world.)