Researchers Submit Patent Application, “Sensitive Data Identification In Real-Time for Data Streaming”, for Approval (USPTO 20230370426): Patent Application
2023 NOV 30 (NewsRx) -- By a
No assignee has been named for this patent application.
News editors obtained the following quote from the background information supplied by the inventors: “Identifying and protecting sensitive data is critical for data protection and for meeting regulation requirements (general data protection regulation (GDPR), the
“The data firewall typically captures or sniffs data accesses to a database (e.g., requests and responses) in real-time and analyzes the data according to policy rules to identify sensitive data. The data firewall may include a data activity monitor (DAM) and/or file activity monitor (FAM). The requests and responses sniffed by the data firewall may include data packets that may include a query, e.g., a structured query language (SQL) request, or a response, and associated header information. The header may include metadata such as machine information, network information, user information, client information, etc.
“The classification of data may be performed by parsing the captured data packets, extracting the mapping between the metadata and data (e.g., field name for every value), running a rule engine against the metadata and then scanning the data itself to identify sensitive data. Currently, DAM and FAM products are classifying the captured data offline due to the complexity and performance requirements of the classification process. However, using the classifier in offline mode may be too late for preventing data breach or data tampering.
“Therefore, a method for online classification and identification of sensitive data for data streaming is required.”
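As an illustration of the offline classification the background describes (map each value to its field name, run rules against the metadata, then scan the data itself), the following is a minimal Python sketch. The function name, the field-name list, and the email pattern are hypothetical and are not taken from the patent application; it assumes the packet has already been parsed into column names and rows.

```python
import re

# Hypothetical rules for illustration only; not from the patent application.
SENSITIVE_FIELD_NAMES = {"ssn", "credit_card", "email"}
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def classify_response(columns, rows):
    """Return the set of column names judged sensitive.

    Assumes the captured response was already parsed into column names
    (metadata) and rows of values (data).
    """
    flagged = set()
    # Step 1: run the rules against the metadata (field names).
    for name in columns:
        if name.lower() in SENSITIVE_FIELD_NAMES:
            flagged.add(name)
    # Step 2: scan the data itself, using the field-to-value mapping.
    for row in rows:
        for name, value in zip(columns, row):
            if isinstance(value, str) and EMAIL_PATTERN.search(value):
                flagged.add(name)
    return flagged


result = classify_response(
    ["id", "contact", "ssn"],
    [[1, "alice@example.com", "123"]],
)
```

Because every value must first be tied to its field, this style of classification depends on full parsing, which is the cost the inventors cite as the reason such analysis is typically run offline.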
As a supplement to the background information on this patent application, NewsRx correspondents also obtained the inventors’ summary information for this patent application: “According to embodiments of the invention, a system and method for classifying data in real-time may include: capturing a plurality of data packets flowing between a data source machine and a data client; searching at least one of the data packets for tokens associated with sensitive information; if tokens associated with sensitive information are not found in a data packet: allowing the data packet to flow between the data source machine and the data client; and sending the data packet to a comprehensive security analysis; and if tokens associated with sensitive information are found in the data packet: preventing the data packet from flowing between the data source machine and the data client; sending the data packet to a comprehensive security analysis.
“Furthermore, if tokens associated with sensitive information are found in the data packet, embodiments of the invention may include continuing to prevent the data packet from flowing between the data source machine and the data client if the comprehensive security analysis finds security issues; and allowing the data packet to flow between the data source machine and the data client if the comprehensive security analysis finds no security issues.
“According to embodiments of the invention, the data source machine may be selected from: a database server, a file server, a proxy and a database server, a combination of a proxy and a file server, a combination of a network gate and a database server, and a combination of a network gate and a file server.
“According to embodiments of the invention, the data packet may be one of: a query sent from the data client to the data source machine, and a response sent from the data source machine to the data client.
“According to embodiments of the invention, capturing and searching may be performed by a software agent that is installed on the data source machine.
“According to embodiments of the invention, performing a comprehensive security analysis may be performed by a dedicated security server, and wherein the data packet is sent to the dedicated security server for performing the comprehensive security analysis.
“According to embodiments of the invention, searching the data packet for tokens associated with sensitive information may include at least one of: wildcard search, pattern search and dictionary search.
“Embodiments of the invention may include updating the tokens associated with sensitive information based on results of the comprehensive security analysis.
“According to embodiments of the invention, the comprehensive security analysis may include: parsing the data packet; mapping metadata to data; building hierarchy of the data; and processing policy rules.
“Embodiments of the invention may include issuing a security alert if tokens associated with sensitive information are found in the data packet and if the comprehensive security analysis finds security issues.
“Embodiments of the invention may include: after capturing, decrypting the plurality of data packets to obtain a header of each packet; analyzing the headers to determine security status of packets associated with the headers; and selecting the at least one data packet based on the security status.
“It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.”
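The allow, hold and release flow described in the inventors' summary can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the inventors' implementation: the `Gate` class, the token list, and `toy_analysis` are hypothetical, and the comprehensive analysis is called synchronously here, whereas the summary describes sending packets to a dedicated security server.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical token list; the application does not publish one.
SENSITIVE_TOKENS = (b"ssn", b"password", b"credit_card")


def toy_analysis(packet: bytes) -> bool:
    """Stand-in for the comprehensive analysis: True means issues found."""
    return b"DROP" in packet


@dataclass
class Gate:
    analyze: Callable[[bytes], bool]
    forwarded: List[bytes] = field(default_factory=list)
    blocked: List[bytes] = field(default_factory=list)
    alerts: List[bytes] = field(default_factory=list)

    def handle(self, packet: bytes) -> str:
        lowered = packet.lower()
        has_token = any(tok in lowered for tok in SENSITIVE_TOKENS)
        if not has_token:
            # No tokens: let the packet flow, but still analyze it.
            self.forwarded.append(packet)
            self.analyze(packet)
            return "allowed"
        # Tokens found: hold the packet until the analysis decides.
        if self.analyze(packet):
            self.blocked.append(packet)
            self.alerts.append(packet)  # security alert
            return "blocked"
        self.forwarded.append(packet)
        return "released"


gate = Gate(analyze=toy_analysis)
outcomes = [
    gate.handle(b"SELECT name FROM staff"),
    gate.handle(b"SELECT ssn FROM staff"),
    gate.handle(b"DROP TABLE password"),
]
```

The token check is a cheap substring test on raw bytes, so only packets that trip it wait on the slower analysis.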
The claims supplied by the inventors are:
“1. A method for classifying data in real-time, the method comprising: capturing a plurality of data packets flowing between a data source machine and a data client; searching a header of at least one of the data packets for metadata to determine whether the data packet should be allowed or should be further analyzed, wherein the metadata includes at least one of machine information, network information, user information, and client information; and if the search of the header indicates that the at least one data packet should be further analyzed, searching raw data of a payload of the at least one of the data packets for tokens, values, expressions, words or phrases associated with sensitive information streaming in or out of a database in real-time without parsing the data packets or knowing which values in the payload fit into each field.
“2. The method of claim 1, wherein if, during the searching of the raw data of the payload, the tokens, values, expressions, words or phrases associated with sensitive information are not found in the payload of a data packet: allowing the data packet to flow between the data source machine and the data client and sending a copy of the data packet to an offline comprehensive security analysis; and if, during the searching of the raw data of the payload, tokens, values, expressions, words or phrases associated with sensitive information are found in the data packet: performing a wildcard search, a dictionary search, and a regular expression search of the payload in parallel for identified terms; and if identified terms are detected, preventing the data packet from flowing between the data source machine and the data client and sending the data packet or a copy of the data packet along with results from the searching of the raw data of the payload, to the offline comprehensive security analysis.
“3. The method of claim 1, wherein if tokens associated with sensitive information are not found in a data packet: allowing the data packet to flow between the data source machine and the data client; sending the data packet to a comprehensive security analysis; and if tokens associated with sensitive information are found in the data packet: preventing the data packet from flowing between the data source machine and the data client; and sending the data packet to a comprehensive security analysis.
“4. The method of claim 1, comprising, if tokens associated with sensitive information are found in the data packet: continuing to prevent the data packet from flowing between the data source machine and the data client if the comprehensive security analysis finds security issues; and allowing the data packet to flow between the data source machine and the data client if the comprehensive security analysis finds no security issues.
“5. The method of claim 1, wherein the data source machine is selected from the list consisting of: a database server, a file server, a proxy and a database server, a combination of a proxy and a file server, a combination of a network gate and a database server, and a combination of a network gate and a file server.
“6. The method of claim 1, wherein the data packet is one of: a query sent from the data client to the data source machine, and a response sent from the data source machine to the data client.
“7. The method of claim 1, wherein capturing and searching are performed by a software agent that is installed on the data source machine.
“8. The method of claim 2, wherein performing a comprehensive security analysis is performed by a dedicated security server, and wherein the data packet is sent to the dedicated security server for performing the comprehensive security analysis.
“9. The method of claim 1, wherein searching the data packet for tokens associated with sensitive information comprises at least one of: wildcard search, pattern search and dictionary search.
“10. The method of claim 2, comprising: updating the tokens associated with sensitive information based on results of the comprehensive security analysis.
“11. The method of claim 2, wherein the comprehensive security analysis comprises: parsing the data packet; mapping metadata to data; building hierarchy of the data; and processing policy rules.
“12. The method of claim 2, comprising: issuing a security alert if tokens associated with sensitive information are found in the data packet and if the comprehensive security analysis finds security issues.
“13. The method of claim 1, comprising: after capturing, decrypting the plurality of data packets to obtain a header of each packet; analyzing the headers to determine security status of packets associated with the headers; and selecting the at least one data packet based on the security status.
“14. A system for classifying data in real-time, the system comprising: a memory; and a processor configured to perform a method, the method comprising: capturing a plurality of data packets flowing between a data source machine and a data client; searching a header of at least one of the data packets for metadata to determine whether the data packet should be allowed or should be further analyzed, wherein the metadata includes at least one of machine information, network information, user information, and client information; and if the search of the header indicates that the at least one data packet should be further analyzed, searching raw data of a payload of the at least one of the data packets for tokens, values, expressions, words or phrases associated with sensitive information streaming in or out of a database in real-time without parsing the data packets or knowing which values in the payload fit into each field.
“15. The system of claim 14, wherein if, during the searching of the raw data of the payload, the tokens, values, expressions, words or phrases associated with sensitive information are not found in the payload of a data packet: allowing the data packet to flow between the data source machine and the data client and sending a copy of the data packet to an offline comprehensive security analysis; and if, during the searching of the raw data of the payload, tokens, values, expressions, words or phrases associated with sensitive information are found in the data packet: performing a wildcard search, a dictionary search, and a regular expression search of the payload in parallel for identified terms; and if identified terms are detected, preventing the data packet from flowing between the data source machine and the data client and sending the data packet or a copy of the data packet along with results from the searching of the raw data of the payload, to the offline comprehensive security analysis.
“16. The system of claim 14, wherein if tokens associated with sensitive information are not found in a data packet: allowing the data packet to flow between the data source machine and the data client; sending the data packet to a comprehensive security analysis; and if tokens associated with sensitive information are found in the data packet: preventing the data packet from flowing between the data source machine and the data client; and sending the data packet to a comprehensive security analysis.
“17. The system of claim 14, comprising, if tokens associated with sensitive information are found in the data packet: continuing to prevent the data packet from flowing between the data source machine and the data client if the comprehensive security analysis finds security issues; and allowing the data packet to flow between the data source machine and the data client if the comprehensive security analysis finds no security issues.
“18. A computer program product for classifying data in real-time, the computer program product comprising: one or more non-transitory computer readable storage media having computer-readable program instructions stored on the one or more computer readable storage media, said program instructions executes a computer-implemented method comprising: capturing a plurality of data packets flowing between a data source machine and a data client; searching a header of at least one of the data packets for metadata to determine whether the data packet should be allowed or should be further analyzed, wherein the metadata includes at least one of machine information, network information, user information, and client information; and if the search of the header indicates that the at least one data packet should be further analyzed, searching raw data of a payload of at least one of the data packets for tokens, values, expressions, words or phrases associated with sensitive information streaming in or out of a database in real-time without parsing the data packets or knowing which values in the payload fit into each field.
“19. The computer program product of claim 18, wherein if, during the searching of the raw data of the payload, the tokens, values, expressions, words or phrases associated with sensitive information are not found in the payload of a data packet: allowing the data packet to flow between the data source machine and the data client and sending a copy of the data packet to an offline comprehensive security analysis; and if, during the searching of the raw data of the payload, tokens, values, expressions, words or phrases associated with sensitive information are found in the data packet: performing a wildcard search, a dictionary search, and a regular expression search of the payload in parallel for identified terms; and if identified terms are detected, preventing the data packet from flowing between the data source machine and the data client and sending the data packet or a copy of the data packet, along with results from the searching of the raw data of the payload, to the offline comprehensive security analysis.”
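The two-stage check recited in claims 1 and 2 (a header metadata check, then a search of the raw payload without parsing it into fields, followed by wildcard, dictionary and regular expression searches) might look roughly like the Python sketch below. All names, rules and patterns are hypothetical stand-ins, and a thread pool is only one possible reading of "in parallel"; the application does not specify a mechanism.

```python
import fnmatch
import re
from concurrent.futures import ThreadPoolExecutor

# All rules below are hypothetical; the claims do not list concrete values.
TRUSTED_USERS = {"backup_service"}
QUICK_TOKENS = ("ssn", "passw", "card", "salary")
WILDCARDS = ["*passw*", "*card_n*"]
DICTIONARY = {"ssn", "salary", "diagnosis"}
REGEXES = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]


def header_needs_analysis(header: dict) -> bool:
    """Header metadata check: only untrusted users are analyzed further."""
    return header.get("user") not in TRUSTED_USERS


def wildcard_search(text: str) -> list:
    return [p for p in WILDCARDS if fnmatch.fnmatchcase(text, p)]


def dictionary_search(text: str) -> list:
    words = set(re.findall(r"[a-z_]+", text))
    return sorted(words & DICTIONARY)


def regex_search(text: str) -> list:
    return [r.pattern for r in REGEXES if r.search(text)]


def scan_packet(header: dict, payload: bytes):
    """Return (verdict, hits) without parsing the payload into fields."""
    if not header_needs_analysis(header):
        return "allow", []
    text = payload.decode("utf-8", errors="replace").lower()
    # First pass: cheap substring test on the raw payload.
    if not any(tok in text for tok in QUICK_TOKENS):
        return "allow", []
    # Second pass: run the three searches concurrently.
    searches = (wildcard_search, dictionary_search, regex_search)
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(fn, text) for fn in searches]
        hits = [hit for f in futures for hit in f.result()]
    return ("block" if hits else "allow"), hits


verdict, hits = scan_packet(
    {"user": "alice"},
    b"SELECT ssn FROM staff WHERE ssn='123-45-6789'",
)
```

In this sketch a "block" verdict would be followed by forwarding the packet and the search hits to the offline analysis, a step omitted here for brevity.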
There are additional claims. Please visit full patent to read further.
For additional information on this patent application, see: Biller,
(Our reports deliver fact-based news of research and discoveries from around the world.)