Patent Application Titled “Data Platform For Automated Data Extraction, Transformation, And/Or Loading” Published Online (USPTO 20190384849)
2020 JAN 06 (NewsRx) -- By a
The assignee for this patent application is
Reporters obtained the following quote from the background information supplied by the inventors: “As more and more industries become digitized, it is not uncommon for different kinds of information to be exchanged electronically. The healthcare industry is one in which an Electronic Data Interchange (EDI) plays a central role in facilitating the electronic communication and exchange of healthcare related data, including, for example, data pertaining to health insurance claims, health insurance enrollment, eligibility data, claims settlement, medical records, and/or the like. The Health Insurance Portability and Accountability Act (HIPAA) has led to standardized claims administration and automation in the healthcare industry by employing EDI messages to exchange data.”
In addition to obtaining background information on this patent application, NewsRx editors also obtained the inventors’ summary information for this patent application: “According to some possible implementations, a method may include receiving, by a computing resource of a cloud computing environment, a plurality of data files from a healthcare electronic data interchange (EDI). The plurality of data files may be received in a plurality of different data formats, and the plurality of data files may include data elements associated with healthcare data. The method may include converting, by a computing resource of the cloud computing environment, the plurality of data files received in the plurality of different data formats to a common data format. The method may include extracting, by a computing resource of the cloud computing environment, data elements from the plurality of data files converted to the common data format. The method may include assigning, by a computing resource of the cloud computing environment, the data elements extracted from the plurality of data files to file identifiers that identify from which of the plurality of data files the data elements were extracted. The method may include assigning, by a computing resource of the cloud computing environment, the data elements extracted from the plurality of data files to attribute identifiers that identify types of healthcare data represented by the data elements. The method may include aggregating, by a computing resource of the cloud computing environment, the data elements based on the file identifiers and the attribute identifiers to create a standardized data set. The method may include mapping, by a computing resource of the cloud computing environment, the data elements in the standardized data set to a plurality of functions contained in at least one function library based on a mapping between the attribute identifiers and the plurality of functions. 
The method may include generating, by a computing resource of the cloud computing environment, a plurality of values based on mapping the data elements to the plurality of functions. The method may include determining, by a computing resource of the cloud computing environment, a healthcare metric based on combining the plurality of values according to a healthcare metric definition. The method may include posting, by a computing resource of the cloud computing environment, the healthcare metric to the healthcare EDI for consumption by healthcare data clients.
“According to some possible implementations, a device may include one or more memories, and one or more processors, communicatively coupled to the one or more memories, to receive a plurality of data files. The plurality of data files may be received in a plurality of different data formats, and the plurality of data files may include data elements associated with healthcare data. The one or more processors may convert the plurality of data files received in the plurality of different data formats to a common data format, extract data elements from the plurality of data files converted to the common data format, assign the data elements extracted from the plurality of data files to file identifiers that identify from which of the plurality of data files the data elements were extracted, and assign the data elements extracted from the plurality of data files to attribute identifiers that identify types of healthcare data represented by the data elements. The one or more processors may aggregate the data elements based on the file identifiers and the attribute identifiers to create a standardized data set, examine the standardized data set to identify the attribute identifiers present in the standardized data set, and determine, using a data model, a list of healthcare metrics that are derivable from the standardized data set based on the attribute identifiers present in the standardized data set. The one or more processors may map the data elements in the standardized data set to a plurality of functions based on a mapping between the attribute identifiers and the plurality of functions. The plurality of functions may be configured to generate a healthcare metric included in the list of healthcare metrics. 
The one or more processors may generate a plurality of values based on processing the data elements using the plurality of functions, derive the healthcare metric based on combining the plurality of values according to a healthcare metric definition, and post the healthcare metric to a healthcare electronic data interchange (EDI) for consumption by healthcare data clients.
“According to some possible implementations, a non-transitory computer-readable medium may store one or more instructions that, when executed by one or more processors, cause the one or more processors to receive a plurality of data files. The plurality of data files may be received in a plurality of different data formats. The plurality of data files may include data elements. The one or more instructions, when executed by the one or more processors, may cause the one or more processors to convert the plurality of data files received in the plurality of different data formats to a common data format, and extract data elements from the plurality of data files converted to the common data format. The one or more instructions, when executed by the one or more processors, may cause the one or more processors to assign the data elements extracted from the plurality of data files to file identifiers that identify from which of the plurality of data files the data elements were extracted, and assign the data elements extracted from the plurality of data files to attribute identifiers that identify types of data represented by the data elements. The one or more instructions, when executed by the one or more processors, may cause the one or more processors to examine a data file of the plurality of data files to identify a combination of data elements present in the data file, and determine, using a first machine learning model, a first score for a data element in the data file based on the combination of data elements present in the data file. The first score may predict a type of data represented by the data element based on the combination of data elements present in the data file. 
The one or more instructions, when executed by the one or more processors, may cause the one or more processors to assign the data element to an attribute identifier based on the first score, aggregate the data elements based on the file identifiers and the attribute identifiers to create a standardized data set, and map the data elements in the standardized data set to a plurality of functions contained in at least one function library based on a mapping between the attribute identifiers and the plurality of functions. The one or more instructions, when executed by the one or more processors, may cause the one or more processors to generate a plurality of values based on mapping the data elements to the plurality of functions, derive a metric based on combining the plurality of values according to a metric definition, and post the metric to an electronic data interchange (EDI) for consumption by data clients.”
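The steps in the inventors' summary can be illustrated with a minimal Python sketch. It is not taken from the patent application: it assumes the input files have already been converted to a common format (lists of dictionaries), and every name in it (`ATTRIBUTE_IDS`, `FUNCTION_LIBRARY`, `paid_ratio`, the field names, the sample values) is hypothetical.

```python
from collections import defaultdict
from typing import Callable, NamedTuple


class Element(NamedTuple):
    file_id: str       # identifies the file the value was extracted from
    attribute_id: str  # identifies the type of data the value represents
    value: float


# Hypothetical mapping from raw field names to attribute identifiers.
ATTRIBUTE_IDS = {"paid": "claim_paid_amount", "billed": "claim_billed_amount"}


def extract(file_id: str, records: list[dict]) -> list[Element]:
    """Tag each recognized field with its file and attribute identifiers."""
    return [
        Element(file_id, ATTRIBUTE_IDS[field], float(raw))
        for record in records
        for field, raw in record.items()
        if field in ATTRIBUTE_IDS
    ]


def aggregate(elements: list[Element]) -> dict:
    """Group values by (file_id, attribute_id) into a standardized data set."""
    dataset = defaultdict(list)
    for e in elements:
        dataset[(e.file_id, e.attribute_id)].append(e.value)
    return dict(dataset)


# Hypothetical function library, keyed by attribute identifier.
FUNCTION_LIBRARY: dict[str, Callable[[list[float]], float]] = {
    "claim_paid_amount": sum,
    "claim_billed_amount": sum,
}


def generate_values(dataset: dict) -> dict[str, float]:
    """Apply the mapped function to each attribute's values across all files."""
    by_attribute = defaultdict(list)
    for (_, attribute_id), values in dataset.items():
        by_attribute[attribute_id].extend(values)
    return {a: FUNCTION_LIBRARY[a](v) for a, v in by_attribute.items()}


def paid_ratio(values: dict[str, float]) -> float:
    """Hypothetical metric definition: total paid divided by total billed."""
    return values["claim_paid_amount"] / values["claim_billed_amount"]


# Made-up sample input, already in the assumed common format.
files = {
    "file_a": [{"paid": "80", "billed": "100"}],
    "file_b": [{"paid": "20", "billed": "100", "note": "ignored"}],
}
elements = [e for fid, recs in files.items() for e in extract(fid, recs)]
dataset = aggregate(elements)
values = generate_values(dataset)
metric = paid_ratio(values)
```

Keying the function library by attribute identifier is one way to read the "mapping between the attribute identifiers and the plurality of functions"; the application itself does not specify a data structure.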
The claims supplied by the inventors are:
“1. A method, comprising: receiving, by a computing resource of a cloud computing environment, a plurality of data files from a healthcare electronic data interchange (EDI), wherein the plurality of data files are received in a plurality of different data formats, and wherein the plurality of data files include data elements associated with healthcare data; converting, by a computing resource of the cloud computing environment, the plurality of data files received in the plurality of different data formats to a common data format; extracting, by a computing resource of the cloud computing environment, data elements from the plurality of data files converted to the common data format; assigning, by a computing resource of the cloud computing environment, the data elements extracted from the plurality of data files to file identifiers that identify from which of the plurality of data files the data elements were extracted; assigning, by a computing resource of the cloud computing environment, the data elements extracted from the plurality of data files to attribute identifiers that identify types of healthcare data represented by the data elements; aggregating, by a computing resource of the cloud computing environment, the data elements based on the file identifiers and the attribute identifiers to create a standardized data set; mapping, by a computing resource of the cloud computing environment, the data elements in the standardized data set to a plurality of functions contained in at least one function library based on a mapping between the attribute identifiers and the plurality of functions; generating, by a computing resource of the cloud computing environment, a plurality of values based on mapping the data elements to the plurality of functions; determining, by a computing resource of the cloud computing environment, a healthcare metric based on combining the plurality of values according to a healthcare metric definition; and posting, by a computing resource of the cloud computing environment, the healthcare metric to the healthcare EDI for consumption by healthcare data clients.
“2. The method of claim 1, further comprising: extracting a first data element from a first data file of the plurality of data files; extracting a second data element from a second data file of the plurality of data files; determining that the first data element, from the first data file, requires a data transformation based on extracting the first data element from the first data file; determining that the second data element, from the second data file, requires the data transformation based on extracting the second data element from the second data file; mapping the first data element to a common transformation algorithm; mapping the second data element to the common transformation algorithm; transforming the first data element, using the common transformation algorithm, into a modified first data element; transforming the second data element, using the common transformation algorithm, into a modified second data element; assigning the modified first data element to a first attribute identifier; and assigning the modified second data element to the first attribute identifier.
“3. The method of claim 1, wherein assigning the data elements extracted from the plurality of data files to the attribute identifiers comprises examining a data file, of the plurality of data files, to identify a combination of data elements present in the data file; determining, using a machine learning model, a score for a data element in the data file based on the combination of data elements present in the data file, wherein the score predicts a type of healthcare data represented by the data element based on the combination of data elements present in the data file; and assigning the data element to one of the attribute identifiers based on the score.
“4. The method of claim 1, further comprising: examining a data file, of the plurality of data files, to identify a combination of data elements present in the data file; determining, using a machine learning model, a score for the data file based on the combination of data elements present in the data file, wherein the score predicts a healthcare subject area associated with the data file based on the combination of data elements present in the data file; and assigning the data file to a healthcare subject area repository based on the score.
“5. The method of claim 1, further comprising: examining the standardized data set to identify the attribute identifiers present in the standardized data set; determining, using a data model, a list of healthcare metrics that are derivable from the standardized data set based on the attribute identifiers present in the standardized data set; and presenting the list of healthcare metrics that are derivable from the standardized data set to one or more healthcare data clients.
“6. The method of claim 1, further comprising validating decimal and integer fields in the plurality of data files converted to the common data format.
“7. The method of claim 1, further comprising: calculating a plurality of key performance indicators (KPIs) based on mapping the data elements in the standardized data set to the plurality of functions contained in the at least one function library; and posting the plurality of KPIs to the healthcare EDI for consumption by healthcare data clients.
“8. The method of claim 1, wherein the plurality of data files are received in two or more data formats including: a HL7 message format, a
“9. A device, comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories, to: receive a plurality of data files, wherein the plurality of data files are received in a plurality of different data formats, and wherein the plurality of data files include data elements associated with healthcare data; convert the plurality of data files received in the plurality of different data formats to a common data format; extract data elements from the plurality of data files converted to the common data format; assign the data elements extracted from the plurality of data files to file identifiers that identify from which of the plurality of data files the data elements were extracted; assign the data elements extracted from the plurality of data files to attribute identifiers that identify types of healthcare data represented by the data elements; aggregate the data elements based on the file identifiers and the attribute identifiers to create a standardized data set; examine the standardized data set to identify the attribute identifiers present in the standardized data set; determine, using a data model, a list of healthcare metrics that are derivable from the standardized data set based on the attribute identifiers present in the standardized data set; map the data elements in the standardized data set to a plurality of functions based on a mapping between the attribute identifiers and the plurality of functions, wherein the plurality of functions is configured to generate a healthcare metric included in the list of healthcare metrics; generate a plurality of values based on processing the data elements using the plurality of functions; derive the healthcare metric based on combining the plurality of values according to a healthcare metric definition; and post the healthcare metric to a healthcare electronic data interchange (EDI) for consumption by healthcare data clients.
“10. The device of claim 9, wherein the one or more processors are further configured to: examine a data file, of the plurality of data files, to identify a combination of data elements present in the data file; determine, using a machine learning model, a score for the data file based on the combination of data elements present in the data file, wherein the score predicts a healthcare subject area associated with the data file based on the combination of data elements present in the data file; and assign the data file to a healthcare subject area repository based on the score.
“11. The device of claim 9, wherein the attribute identifiers are associated with a health insurance member, a health insurance claim, a healthcare provider, a hospital, or a pharmacy.
“12. The device of claim 9, wherein the plurality of functions is configured to calculate a plurality of key performance indicators (KPIs) associated with a healthcare subject area.
“13. The device of claim 12, wherein the healthcare subject area includes one of: a first subject area relating to a pharmacy, a second subject area relating to a hospital, a third subject area relating to a primary care physician, or a fourth subject area relating to health insurance.
“14. The device of claim 9, wherein the plurality of data files are received from the healthcare EDI.
“15. The device of claim 14, wherein the plurality of data files are received in two or more data formats including: a HL7 message format, a
“16. A non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors, cause the one or more processors to: receive a plurality of data files, wherein the plurality of data files are received in a plurality of different data formats, and wherein the plurality of data files include data elements; convert the plurality of data files received in the plurality of different data formats to a common data format; extract data elements from the plurality of data files converted to the common data format; assign the data elements extracted from the plurality of data files to file identifiers that identify from which of the plurality of data files the data elements were extracted; assign the data elements extracted from the plurality of data files to attribute identifiers that identify types of data represented by the data elements, wherein for a data file of the plurality of data files: examine the data file to identify a combination of data elements present in the data file, determine, using a first machine learning model, a first score for a data element in the data file based on the combination of data elements present in the data file, wherein the first score predicts a type of data represented by the data element based on the combination of data elements present in the data file, and assign the data element to an attribute identifier based on the first score; and aggregate the data elements based on the file identifiers and the attribute identifiers to create a standardized data set; map the data elements in the standardized data set to a plurality of functions contained in at least one function library based on a mapping between the attribute identifiers and the plurality of functions; generate a plurality of values based on mapping the data elements to the plurality of functions; derive a metric based on combining the plurality of values according to a metric definition; and post the metric to an electronic data interchange (EDI) for consumption by data clients.
“17. The non-transitory computer-readable medium of claim 16, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to: determine, using a second machine learning model, a second score for the data file based on the combination of data elements present in the data file, wherein the second score predicts a subject area associated with the data file based on the combination of data elements present in the data file; and assign the data file to a subject area repository based on the second score.
“18. The non-transitory computer-readable medium of claim 16, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to: examine the standardized data set to identify the attribute identifiers present in the standardized data set; determine, using a data model, a list of metrics that are derivable from the standardized data set based on the attribute identifiers present in the standardized data set; and present the list of metrics that are derivable from the standardized data set to one or more data clients.
“19. The non-transitory computer-readable medium of claim 16, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to: generate a plurality of key performance indicators (KPIs) based on mapping the data elements in the standardized data set to the plurality of functions contained in the at least one function library; and post the plurality of KPIs to the EDI for consumption by healthcare data clients.
“20. The non-transitory computer-readable medium of claim 16, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to: transmit the metric to one or more data clients.”
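Claims 5, 9 and 18 describe determining, from the attribute identifiers present in the standardized data set, which metrics are derivable. One simple way to picture that step is a subset check, sketched below in Python. The sketch assumes the "data model" can be reduced to a table of required attributes per metric; the application does not say so, and the metric and attribute names are hypothetical.

```python
# Hypothetical data model: each metric lists the attribute identifiers it needs.
METRIC_REQUIREMENTS = {
    "paid_ratio": {"claim_paid_amount", "claim_billed_amount"},
    "claims_per_member": {"claim_id", "member_id"},
    "average_paid": {"claim_paid_amount"},
}


def derivable_metrics(present: set[str]) -> list[str]:
    """Return, in sorted order, the metrics whose required attributes are all present."""
    return sorted(
        name
        for name, required in METRIC_REQUIREMENTS.items()
        if required <= present  # subset test: every required attribute is available
    )


# Attribute identifiers found in a made-up standardized data set.
present = {"claim_paid_amount", "claim_billed_amount", "member_id"}
available = derivable_metrics(present)
```

Here `claims_per_member` is excluded because `claim_id` is missing, so only the remaining two metrics would be presented to data clients.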
For more information, see this patent application: Sundararaman, Arun; Ramamoorthy, Udayakumar; Pargunarajan, Sureshkumar; Appusamy, Sangeetha. Data Platform For Automated Data Extraction, Transformation, And/Or Loading. Filed
(Our reports deliver fact-based news of research and discoveries from around the world.)

