“Machine-Learning Based Detection And Classification Of Personally Identifiable Information” in Patent Application Approval Process (USPTO 20200081978) - Insurance News | InsuranceNewsNet

March 31, 2020 Newswires

Insurance Daily News

2020 MAR 31 (NewsRx) -- By a News Reporter-Staff News Editor at Insurance Daily News -- A patent application by the inventors Ahmed, Mohamed N. (Loudoun County, VA); Toor, Andeep S. (Chantilly, VA), filed on September 7, 2018, was made available online on March 12, 2020, according to news reporting originating from Washington, D.C., by NewsRx correspondents.

This patent application is assigned to International Business Machines Corporation (Armonk, New York, United States).

The following quote was obtained by the news editors from the background information supplied by the inventors: “Personally identifiable information (PII) is information that can be used on its own or with other information to identify, contact, or locate a single person, or to identify an individual in context. Corporations and agencies are often under an obligation to protect content containing PII to prevent exposure of the PII to unauthorized parties. Because of the significant reputational and financial consequences of failing to protect content containing PII, corporations and governmental agencies have made it a major goal to identify and protect such content. Privacy expectations arise from a number of laws in different jurisdictions such as the Health Insurance Portability and Accountability Act (HIPAA) and Payment Card Industry (PCI) data security standards. One of the most challenging aspects related to identifying and protecting PII is how to deal with ‘unstructured’ content. Unstructured content refers to information that does not have a pre-defined data model or is not organized in a pre-defined manner. Examples of unstructured content may include documents or files on file shares, personal computing devices, and content management systems. These documents and files may be generated within or outside of an organization using many applications, can be converted to multiple file formats (e.g., Portable Document Format (PDF)), and seemingly have unlimited form and content. By contrast, structured data such as data stored in databases and support systems often have defined fields in tables that have defined relationships with each other. For example, to protect social security numbers in a database, access to the field for social security numbers is controlled. With unstructured documents, the detection of PII is more challenging.”

In addition to the background information obtained for this patent application, NewsRx journalists also obtained the inventors’ summary information for this patent application: “The illustrative embodiments provide a method, system, and computer program product. An embodiment of a method for detection and classification of personally identifiable information includes identifying a document with a known author, and extracting a first set of features of the document using natural language processing. The embodiment further includes extracting a second set of features of the document based upon one or more past documents for the known author using a recurrent neural network, and classifying the first set of features and the second set of features using a classifier to produce classified extracted features. The embodiment further includes labelling personally identifiable information in the document based upon the classified extracted features.

“In another embodiment, the document is an unstructured document. In another embodiment, the first set of features includes text-based features. In another embodiment, the natural language processing includes one or more of n-grams, word embedding, part of speech, and dictionary-based natural language processing procedures. In another embodiment, the second set of features includes user-specific features.

“In another embodiment, the classifier includes a deep neural network classifier. In another embodiment, the classifier includes a maximum entropy classifier.

“Another embodiment further includes training the classifier based upon the classified extracted features. Another embodiment further includes receiving feedback associated with the classified extracted features, and modifying the training of the classifier based upon the feedback. In another embodiment, the feedback is received from a subject matter expert. In another embodiment, extracting the second set of features is based upon a user-specific model.

“In another embodiment, the user-specific model is trained based upon past results and user provided documents including labelled personally identifiable information.

“An embodiment includes a computer usable program product. The computer usable program product includes one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices.

“An embodiment includes a computer system. The computer system includes one or more processors, one or more computer-readable memories, and one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories.”
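As a rough illustration of the two feature sets described in the summary, the following Python sketch builds text-based features (n-grams plus a dictionary lookup) and merges them with an author-specific feature derived from past documents. It is not the inventors' implementation: the application describes a recurrent neural network for the author-specific features, and this sketch swaps in a simple vocabulary-overlap ratio instead. The names `PII_CUES`, `text_features`, `author_features`, and `combined_features` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical dictionary of cue words; a real system would use a curated list.
PII_CUES = {"ssn", "passport", "dob", "account"}


def tokenize(text):
    """Lowercase the text and split it into word-like tokens (hyphens kept)."""
    return re.findall(r"[A-Za-z0-9\-]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def text_features(text):
    """First feature set: unigram/bigram counts and dictionary hits."""
    tokens = tokenize(text)
    feats = Counter()
    for n in (1, 2):
        for gram in ngrams(tokens, n):
            feats[f"{n}gram={gram}"] += 1
    feats["dict_hits"] = sum(t in PII_CUES for t in tokens)
    return feats


def author_features(text, past_documents):
    """Second feature set (assumption: a simple stand-in for the RNN).

    Measures the share of tokens the known author has used in past documents.
    """
    seen = {t for doc in past_documents for t in tokenize(doc)}
    tokens = tokenize(text)
    familiar = sum(1 for t in tokens if t in seen)
    ratio = familiar / len(tokens) if tokens else 0.0
    return {"author_familiar_ratio": ratio}


def combined_features(text, past_documents):
    """Merge both feature sets into one dictionary for a downstream classifier."""
    feats = dict(text_features(text))
    feats.update(author_features(text, past_documents))
    return feats


features = combined_features("My SSN is 123-45-6789", ["the ssn is on file"])
print(features)
```

The merged dictionary is what a classifier would consume in the next step; keeping the two extractors separate mirrors the summary's distinction between text-based and user-specific features.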

The claims supplied by the inventors are:

“1. A method for detection and classification of personally identifiable information, the method comprising: identifying a document with a known author; extracting a first set of features of the document using natural language processing; extracting a second set of features of the document based upon one or more past documents for the known author using a recurrent neural network; classifying the first set of features and the second set of features using a classifier to produce classified extracted features, wherein the classifier employs a maximum entropy model to estimate a probability of a certain class occurring, and wherein the maximum entropy model uses a number of items extracted from a first word and from other words within a window of the first word; and labelling personally identifiable information in the document based upon the classified extracted features.

“2. The method of claim 1, wherein the document is an unstructured document.

“3. The method of claim 1, wherein the first set of features includes text-based features.

“4. The method of claim 1, wherein the natural language processing includes one or more of n-grams, word embedding, part of speech, and dictionary-based natural language processing procedures.

“5. The method of claim 1, wherein the second set of features includes user-specific features.

“6. The method of claim 1, wherein the classifier includes a deep neural network classifier.

“7. (canceled)

“8. The method of claim 1, further comprising training the classifier based upon the classified extracted features.

“9. The method of claim 8, further comprising: receiving feedback associated with the classified extracted features; and modifying the training of the classifier based upon the feedback.

“10. The method of claim 9, wherein the feedback is received from a user.

“11. The method of claim 1, wherein extracting the second set of features is based upon a user-specific model.

“12. The method of claim 11, wherein the user-specific model is trained based upon past results and user provided documents including labelled personally identifiable information.

“13. A computer usable program product comprising one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices, the stored program instructions comprising: program instructions to identify a document with a known author; program instructions to extract a first set of features of the document using natural language processing; program instructions to extract a second set of features of the document based upon one or more past documents for the known author using a recurrent neural network; program instructions to classify the first set of features and the second set of features using a classifier to produce classified extracted features, wherein the classifier employs a maximum entropy model to estimate a probability of a certain class occurring, and wherein the maximum entropy model uses a number of items extracted from a first word and from other words within a window of the first word; and program instructions to label personally identifiable information in the document based upon the classified extracted features.

“14. The computer usable program product of claim 13, wherein the document is an unstructured document.

“15. The computer usable program product of claim 13, wherein the first set of features includes text-based features.

“16. The computer usable program product of claim 13, wherein the natural language processing includes one or more of n-grams, word embedding, part of speech, and dictionary-based natural language processing procedures.

“17. The computer usable program product of claim 13, wherein the second set of features includes user-specific features.

“18. The computer usable program product of claim 13, wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system.

“19. The computer usable program product of claim 13, wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system.

“20. A computer system comprising one or more processors, one or more computer-readable memories, and one or more computer-readable storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising: program instructions to identify a document with a known author; program instructions to extract a first set of features of the document using natural language processing; program instructions to extract a second set of features of the document based upon one or more past documents for the known author using a recurrent neural network; program instructions to classify the first set of features and the second set of features using a classifier to produce classified extracted features, wherein the classifier employs a maximum entropy model to estimate a probability of a certain class occurring, and wherein the maximum entropy model uses a number of items extracted from a first word and from other words within a window of the first word; and program instructions to label personally identifiable information in the document based upon the classified extracted features.”
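The independent claims refer to a maximum entropy model that estimates class probabilities from items extracted from a word and from other words within a window around it. A minimal Python sketch of that general technique follows, assuming a window of two tokens on each side, a two-label scheme ("PII" and "O"), and plain stochastic gradient ascent for training. The toy sentences, the feature names, and the `MaxEntClassifier` class are invented for illustration and are not taken from the application.

```python
import math


def word_shape(word):
    """Map characters to a coarse shape, e.g. '123-45' -> 'ddd-dd'."""
    out = []
    for ch in word:
        if ch.isdigit():
            out.append("d")
        elif ch.isalpha():
            out.append("X" if ch.isupper() else "x")
        else:
            out.append(ch)
    return "".join(out)


def window_features(tokens, i, window=2):
    """Items from the word at position i and from words within the window."""
    feats = [f"w0={tokens[i].lower()}", f"shape={word_shape(tokens[i])}"]
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        neighbour = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
        feats.append(f"w{off:+d}={neighbour}")
    return feats


class MaxEntClassifier:
    """Multinomial logistic regression over binary string features."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = {}  # (feature, label) -> weight

    def predict_proba(self, feats):
        scores = [sum(self.weights.get((f, y), 0.0) for f in feats)
                  for y in self.labels]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return {y: e / total for y, e in zip(self.labels, exps)}

    def fit(self, examples, epochs=100, lr=0.2):
        """Stochastic gradient ascent on the log-likelihood."""
        for _ in range(epochs):
            for feats, gold in examples:
                probs = self.predict_proba(feats)
                for y in self.labels:
                    grad = (1.0 if y == gold else 0.0) - probs[y]
                    for f in feats:
                        key = (f, y)
                        self.weights[key] = self.weights.get(key, 0.0) + lr * grad


# Invented toy data: one label per token.
TRAIN = [
    ("my ssn is 123-45-6789 thanks".split(), ["O", "O", "O", "PII", "O"]),
    ("call me at 555-0100 tomorrow".split(), ["O", "O", "O", "PII", "O"]),
    ("the ssn is 987-65-4321 today".split(), ["O", "O", "O", "PII", "O"]),
]

examples = [(window_features(toks, i), lab)
            for toks, labs in TRAIN for i, lab in enumerate(labs)]
model = MaxEntClassifier(["O", "PII"])
model.fit(examples)

probe = "her ssn is 111-22-3333 ok".split()
probs = model.predict_proba(window_features(probe, 3))
print(probs)
```

In this sketch the unseen number in the probe sentence is still scored through its shape and its neighbouring words, which is the practical point of drawing features from a window and not from the word alone.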

For the URL and more information on this patent application, see: Ahmed, Mohamed N.; Toor, Andeep S. Machine-Learning Based Detection And Classification Of Personally Identifiable Information. Filed September 7, 2018 and posted March 12, 2020. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220200081978%22.PGNR.&OS=DN/20200081978&RS=DN/20200081978

(Our reports deliver fact-based news of research and discoveries from around the world.)
