Patent Application Titled “Data Governance Operations In Highly Distributed Data Platforms” Published Online (USPTO 20210250358)
2021 AUG 27 (NewsRx) -- By a
No assignee has been named for this patent application.
Reporters obtained the following quote from the background information supplied by the inventors: “Managing access to data in highly distributed data platforms spanning multiple geo-locations, multiple public cloud services and multiple data centers is very difficult and prone to unauthorized access. Conventional approaches for data access management utilize access controls associated with a centralized location for the data, such as, for example, a database, file system, object store, etc.
“These conventional techniques are not effective when the data leaves the centralized location, for example, in response to data requests made by authorized clients from remote locations. In other words, current data governance solutions are limited to controlling access to data at the locations where the access controls are implemented, and fail to protect the data outside of these locations. In a non-limiting example, sharing patient medical records with medical professionals or financial institutions that are not in a medical insurance provider’s network exposes the data to vulnerabilities once the data travels outside of and is no longer able to be controlled by the insurance provider’s network.
“Current access controls also lack the ability to track private data throughout the multiple transformations that may be part of the lifecycle of the data. Even when the data does not move out of a source database, once the data has gone through several transformations, such as, for example, data structure changes, data reductions, etc., conventional techniques are not able to determine what constitutes private data and put proper protections in place.”
In addition to obtaining background information on this patent application, NewsRx editors also obtained the inventors’ summary information for this patent application: “In one embodiment, an apparatus comprises at least one processing platform including a plurality of processing devices. The processing platform is configured to receive a plurality of requests for data records from a plurality of clients, wherein the data records are in a plurality of data systems of a global namespace, and wherein the plurality of data systems are in a plurality of locations. The processing platform is also configured to determine whether a given client of the plurality of clients is allowed access to one or more of the data records based on one or more of a plurality of data access policies, to retrieve one or more of the data records from at least one of the plurality of data systems based on a determination that the given client is allowed access to the one or more of the data records, and to provide the one or more of the data records to the given client. In retrieving the one or more of the data records, the processing platform is configured to determine a location of the plurality of locations for the one or more of the data records, and to generate a channel to the location through which the one or more of the data records are retrieved.
“These and other illustrative embodiments include, without limitation, apparatus, systems, methods and computer program products comprising processor-readable storage media.”
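The request flow in the inventors' summary (check a client against the data access policies, determine the location of the record, generate a channel to that location, and retrieve the record through it) can be illustrated with a minimal Python sketch. This is not the inventors' implementation. The names DataSystem, Channel, Policy and GovernancePlatform, the in-memory dictionaries, and the sample locations and clients are all hypothetical and serve only to make the sequence of steps concrete.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataSystem:
    """Hypothetical data system holding records at one location."""
    location: str
    records: Dict[str, str]


@dataclass
class Channel:
    """Hypothetical channel to a location; here just a reference to the system."""
    location: str
    system: DataSystem

    def fetch(self, record_id: str) -> str:
        return self.system.records[record_id]


@dataclass(frozen=True)
class Policy:
    """Hypothetical access policy: this client may read this record."""
    client: str
    record_id: str


class GovernancePlatform:
    """Sketch of the processing platform described in the summary."""

    def __init__(self, systems: List[DataSystem], policies: List[Policy]):
        self.policies = set(policies)
        self._by_location = {s.location: s for s in systems}
        # Assumed global namespace: record id -> location of its data system.
        self.namespace = {
            rid: s.location for s in systems for rid in s.records
        }

    def is_allowed(self, client: str, record_id: str) -> bool:
        return Policy(client, record_id) in self.policies

    def locate(self, record_id: str) -> str:
        return self.namespace[record_id]

    def open_channel(self, location: str) -> Channel:
        return Channel(location, self._by_location[location])

    def request(self, client: str, record_id: str) -> str:
        # Step 1: decide access from the data access policies.
        if not self.is_allowed(client, record_id):
            raise PermissionError(f"{client} may not read {record_id}")
        # Step 2: determine the location of the record.
        location = self.locate(record_id)
        # Step 3: generate a channel to that location and retrieve through it.
        channel = self.open_channel(location)
        return channel.fetch(record_id)


eu = DataSystem("eu-datacenter", {"rec-1": "patient A"})
us = DataSystem("us-cloud", {"rec-2": "patient B"})
platform = GovernancePlatform([eu, us], [Policy("clinic-7", "rec-1")])
print(platform.request("clinic-7", "rec-1"))  # prints "patient A"
```

In this sketch the policy check runs before any location lookup, so a client that is denied access learns nothing about where a record is stored. That ordering is a design choice of the example, not something the application specifies.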
The claims supplied by the inventors are:
“1. An apparatus comprising: at least one processing platform comprising a plurality of processing devices; said at least one processing platform being configured: to receive a plurality of requests for data records from a plurality of clients, wherein the data records are in a plurality of data systems of a global namespace, and wherein the plurality of data systems are in a plurality of locations; to determine whether a given client of the plurality of clients is allowed access to one or more of the data records based on one or more of a plurality of data access policies; to retrieve the one or more of the data records from at least one of the plurality of data systems based on a determination that the given client is allowed access to the one or more of the data records; and to provide the one or more of the data records to the given client; wherein in retrieving the one or more of the data records, said at least one processing platform is configured: to determine a location of the plurality of locations for the one or more of the data records; and to generate a channel to the location through which the one or more of the data records are retrieved.
“2. The apparatus of claim 1 wherein said at least one processing platform is further configured to generate a plurality of data location descriptors identifying locations of respective data records in the global namespace.
“3. The apparatus of claim 1 wherein said at least one processing platform is further configured to generate a plurality of data descriptors describing one or more characteristics of respective data records, wherein the one or more characteristics comprise at least one of a version, a type, and a schema of the respective data records.
“4. The apparatus of claim 1 wherein said at least one processing platform is further configured to generate a plurality of data descriptors describing one or more operational requirements of respective data records, wherein the one or more operational requirements comprise at least one of a resilience, a compliance constraint and a performance level of the respective data records.
“5. The apparatus of claim 1 wherein said at least one processing platform is further configured to provide to one or more users a plurality of schemas for at least one of presenting and storing the data records.
“6. The apparatus of claim 1 wherein said at least one processing platform is further configured: to monitor access to the data records by the plurality of clients; and to generate one or more reports regarding the monitored access to the data records by the plurality of clients.
“7. The apparatus of claim 1 wherein said at least one processing platform is further configured: to track changes to respective data records; and to publish the changes to a data dictionary.
“8. The apparatus of claim 1 wherein said at least one processing platform is further configured to generate one or more reports regarding transformations to the data records.
“9. The apparatus of claim 1 wherein said at least one processing platform is further configured to at least one of replicate the data records, generate one or more snapshots of the data records and replace the data records.
“10. The apparatus of claim 1 wherein the one or more of the plurality of data access policies are dynamically generated and analyzed in real-time during active sessions with one or more of the plurality of clients.
“11. The apparatus of claim 1 wherein said at least one processing platform is further configured: to assign a trustworthiness level to one or more of the plurality of data systems; and to disable access to the one or more of the plurality of data systems in response to a negative change in the trustworthiness level.
“12. The apparatus of claim 1 wherein said at least one processing platform is further configured: to assign a trustworthiness level to one or more of the plurality of data systems; and to quarantine data records in the one or more of the plurality of data systems produced after assignment of the trustworthiness level.
“13. The apparatus of claim 1 wherein said at least one processing platform is further configured: to determine whether given ones of the data records have been compromised; to quarantine the given ones of the data records that have been determined to be compromised; and to recreate the given ones of the data records from a snapshot.
“14. The apparatus of claim 1 wherein said at least one processing platform is further configured to validate a trustworthiness of a data collection endpoint of the channel prior to transmission of the one or more of the data records through the channel.
“15. A method comprising: receiving a plurality of requests for data records from a plurality of clients, wherein the data records are in a plurality of data systems of a global namespace, and wherein the plurality of data systems are in a plurality of locations; determining whether a given client of the plurality of clients is allowed access to one or more of the data records based on one or more of a plurality of data access policies; retrieving the one or more of the data records from at least one of the plurality of data systems based on a determination that the given client is allowed access to the one or more of the data records; and providing the one or more of the data records to the given client; wherein retrieving the one or more of the data records comprises: determining a location of the plurality of locations for the one or more of the data records; and generating a channel to the location through which the one or more of the data records are retrieved; wherein the method is performed by at least one processing platform comprising at least one processing device comprising a processor coupled to a memory.
“16. The method of claim 15 further comprising: assigning a trustworthiness level to one or more of the plurality of data systems; and disabling access to the one or more of the plurality of data systems in response to a negative change in the trustworthiness level.
“17. The method of claim 15 further comprising: assigning a trustworthiness level to one or more of the plurality of data systems; and quarantining data records in the one or more of the plurality of data systems produced after assignment of the trustworthiness level.
“18. The method of claim 15 further comprising validating a trustworthiness of a data collection endpoint of the channel prior to transmission of the one or more of the data records through the channel.
“19. A computer program product comprising a non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing platform causes said at least one processing platform: to receive a plurality of requests for data records from a plurality of clients, wherein the data records are in a plurality of data systems of a global namespace, and wherein the plurality of data systems are in a plurality of locations; to determine whether a given client of the plurality of clients is allowed access to one or more of the data records based on one or more of a plurality of data access policies; to retrieve the one or more of the data records from at least one of the plurality of data systems based on a determination that the given client is allowed access to the one or more of the data records; and to provide the one or more of the data records to the given client; wherein in retrieving the one or more of the data records, said at least one processing platform is configured: to determine a location of the plurality of locations for the one or more of the data records; and to generate a channel to the location through which the one or more of the data records are retrieved.
“20. The computer program product according to claim 19 wherein the program code further causes said at least one processing platform: to assign a trustworthiness level to one or more of the plurality of data systems; and to quarantine data records in the one or more of the plurality of data systems produced after assignment of the trustworthiness level.”
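Two of the dependent claims lend themselves to a short illustration: claim 11 (disable access to a data system when its trustworthiness level drops) and claim 13 (quarantine compromised records and recreate them from a snapshot). The Python sketch below is one possible reading under stated assumptions, not the claimed implementation. The GovernedSystem class, the integer trust scale, and the caller-supplied is_compromised check are hypothetical; the application does not say how trustworthiness is scored or how compromise is detected.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class GovernedSystem:
    """Hypothetical data system with a trust level and a snapshot."""
    name: str
    trust_level: int  # assumed scale: a higher number means more trusted
    records: Dict[str, str] = field(default_factory=dict)
    snapshot: Dict[str, str] = field(default_factory=dict)
    quarantine: Dict[str, str] = field(default_factory=dict)
    access_enabled: bool = True

    def take_snapshot(self) -> None:
        # Copy the current records so they can be recreated later.
        self.snapshot = dict(self.records)

    def update_trust(self, new_level: int) -> None:
        # Claim 11 (as read here): a negative change disables access.
        if new_level < self.trust_level:
            self.access_enabled = False
        self.trust_level = new_level

    def recover(self, record_id: str,
                is_compromised: Callable[[str], bool]) -> bool:
        # Claim 13 (as read here): quarantine a compromised record, then
        # recreate it from the snapshot. Returns True if recovery ran.
        value = self.records[record_id]
        if not is_compromised(value):
            return False
        self.quarantine[record_id] = value
        if record_id in self.snapshot:
            self.records[record_id] = self.snapshot[record_id]
        else:
            # No snapshot copy exists, so the record stays quarantined only.
            del self.records[record_id]
        return True


system = GovernedSystem("us-cloud", trust_level=3,
                        records={"rec-2": "patient B"})
system.take_snapshot()
system.records["rec-2"] = "tampered"

# Hypothetical detector: flags the tampered value used in this example.
system.recover("rec-2", lambda value: value == "tampered")
system.update_trust(1)
```

After these calls the record holds its snapshot value again, the tampered copy sits in quarantine for inspection, and access to the system is disabled because its trust level fell from 3 to 1.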
For more information, see this patent application: Chawla, Gaurav; Dumitru, Aurelian. Data Governance Operations In Highly Distributed Data Platforms. Filed
(Our reports deliver fact-based news of research and discoveries from around the world.)