“Extraction Of Information And Smart Annotation Of Relevant Information Within Complex Documents” in Patent Application Approval Process (USPTO 20190251182) - Insurance News | InsuranceNewsNet

September 2, 2019 Newswires
“Extraction Of Information And Smart Annotation Of Relevant Information Within Complex Documents” in Patent Application Approval Process (USPTO 20190251182)

Insurance Daily News

2019 SEP 02 (NewsRx) -- By a News Reporter-Staff News Editor at Insurance Daily News -- A patent application by the inventors Ray, Ritwik (Cary, NC); Angelopoulos, Marie (Cortlandt Manor, NY); Roberts, Frederick (Franklin, TN); Gagen, Christopher (Perrysburg, OH); Gabrani, Maria (Thalwil, CH), filed on February 12, 2018, was made available online on August 15, 2019, according to news reporting originating from Washington, D.C., by NewsRx correspondents.

This patent application is assigned to International Business Machines Corporation (Armonk, New York, United States).

The following quote was obtained by the news editors from the background information supplied by the inventors: “Present invention embodiments relate to extracting data from documents, and more specifically, to identifying the presence or absence of a key term or concept, which may be a unique key term or concept, in a complex document, extracting information related to the key term or concept, and/or comparing the extracted information.

“With the advent of sophisticated document generation and processing programs, it has become routine to generate complex documents comprising tables, graphs, and unstructured text. Such documents may be hundreds or even thousands of pages in length. Often, such documents are modified or merged with other documents, at least on an annual basis, making it difficult to compare information between various versions of documents.

“Additionally, in certain industries, such as the health insurance industry, documents may describe personalized or customized plan information specific to an organization. While each plan may vary from company to company, common concepts (e.g., deductible, co-pay, formulary, etc.) will be present in each document. In some cases, the same style of document, e.g., based on a common or similar template, may be provided, wherein each document is customized to the needs of a particular group. However, it is difficult to parse through such large, complex documents to find a key item or concept or an answer to a question about the content of the complex document.

“For example, Health Insurance Benefit Coverage Summary Plan Descriptions (SPDs), which describe medical, dental, vision and other health benefit coverage, are often more than several hundred pages long. SPDs are structured to comply with regulatory requirements laid down by government and other statutory bodies. Because these documents are required to comply with federal law regulations, SPDs contain many similarities, but are tailored to individual health plans for the respective organization for which a health care plan is provided, which is the underlying cause for the significant variations in the content.

“Such documents are difficult to understand by most people, and participants in the health plan often have difficulty locating information that is key to their question within such large, complex document(s), particularly when participants are in need of care for themselves or someone in their family. Additionally, it is difficult to identify specific changes in coverage from year to year as the same terminology may not be consistently applied.

“Further, it is difficult to apply business analytics (e.g., to benefit health care plan design and benchmarking) to unstructured text data with complex terminology, and it is difficult for employees to understand the terminology to determine which aspects of medical care are covered, and which are not.

“In particular, unstructured data provides particular challenges with regard to: usability, volume, variability, and quality. Regarding content format variability, documents may have information represented in multiple formats (e.g., unstructured text, diagrams, tables, figures, charts, etc.) and consumption of this type of data for decision making is challenging.

“Regarding volume, unstructured data is growing at a rate of approximately 62% per year, further complicating collection and extraction of data. Regarding variability, such documents often have a wide range of styles, formats, and codes with similar intents. Regarding quality, such documents frequently originate from different sources and have a high level of ambiguity in natural language. Accordingly, managing such documents is difficult, time consuming, and complex. Health benefits is one such type of complex document; other complex documents include but are not limited to insurance documents (e.g., home or auto), policy documents (e.g., employment, government), legal documents, etc.”

In addition to the background information obtained for this patent application, NewsRx journalists also obtained the inventors’ summary information for this patent application: “According to embodiments of the present invention, methods and systems are provided for extraction of information from complex documents comprising unstructured data to create a structured data repository. Such techniques may include using Natural Language Processing (NLP) in combination with machine learning and cognitive systems to identify relevant data.

“According to embodiments of the invention, information may be extracted from a plurality of complex documents, and the extracted information may be mapped to a semantic representation. Information may be extracted from text or non-text elements in the plurality of complex documents. Extracted information may include text extraction, symbol extraction, numerical extraction, and so forth, with such information extracted from any suitable location in the complex document, including text, tables, lists, charts, graphs, etc. Natural language processing and machine learning may be utilized to extract one or more entities from the semantic representation. Structured data comprising the extracted entities is generated from the semantic representation and corresponding attributes. In response to receiving a user query, a subset of the structured data corresponding to the query is returned, wherein the subset of structured data may be arranged to correlate entities with corresponding attributes for each complex document. Information, unless otherwise indicated, generally refers to unstructured text, which may include but is not limited to symbols, numbers, and alphanumeric characters, etc. associated with free text as well as other types of formatted data including but not limited to tables, charts, graphs, lists, etc.

“It is to be understood that the Summary is not intended to identify key or essential features of embodiments of the present disclosure, nor is it intended to be used to limit the scope of the present disclosure. Other features of the present disclosure will become easily comprehensible through the description below.”
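As a rough illustration of the flow the summary describes (extract information, keep its location as context, pull out entities, build structured data, answer a query), a minimal Python sketch might look like the following. It is not the inventors' implementation: the `Fragment` class, the `ENTITY_PATTERNS` table, the function names, and the sample plan text are all hypothetical, and simple regular expressions stand in for the natural language processing and machine learning steps named in the application.

```python
import re
from dataclasses import dataclass


@dataclass
class Fragment:
    """One piece of extracted information plus where it was found (its context)."""
    doc_id: str
    location: str  # hypothetical format, e.g. "page 9, table 1"
    text: str


# Hypothetical patterns standing in for trained NLP/ML entity extractors.
ENTITY_PATTERNS = {
    "deductible": re.compile(r"deductible[^$]*\$([\d,]+)", re.IGNORECASE),
    "co-pay": re.compile(r"co-?pay[^$]*\$([\d,]+)", re.IGNORECASE),
}


def extract_entities(fragments):
    """Turn fragments into structured rows: entity, attribute value, and context."""
    rows = []
    for frag in fragments:
        for entity, pattern in ENTITY_PATTERNS.items():
            match = pattern.search(frag.text)
            if match:
                rows.append({
                    "doc_id": frag.doc_id,
                    "entity": entity,
                    "amount": int(match.group(1).replace(",", "")),
                    "location": frag.location,
                })
    return rows


def query(rows, entity):
    """Return the subset of structured rows that matches the requested entity."""
    return [row for row in rows if row["entity"] == entity]


# Invented sample text from two imaginary plan documents.
fragments = [
    Fragment("plan_a", "page 4, paragraph 2", "The annual deductible is $1,500 per member."),
    Fragment("plan_a", "page 9, table 1", "Office visit co-pay: $25"),
    Fragment("plan_b", "page 6, list item 3", "Deductible (individual): $2,000"),
]

structured = extract_entities(fragments)
for row in query(structured, "deductible"):
    print(row)
```

Keeping the `location` field on every row mirrors the idea that the semantic representation records where in the document each piece of information was found, so an answer can point the reader back to its source.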

The claims supplied by the inventors are:

“1. A method, in a cognitive data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement an intelligent annotation and data extraction system for extracting and processing information across a plurality of complex documents to provide an answer to a user query comprising: extracting information from the plurality of complex documents; mapping the extracted information to a semantic representation; utilizing a natural language process and a machine learning process to extract one or more entities from the semantic representation; generating structured data comprising the extracted entities from the semantic representation along with corresponding attributes; and providing a subset of the structured data corresponding to the query, wherein the subset of structured data may be arranged to correlate entities with corresponding attributes for one or more complex document.

“2. The method of claim 1, wherein the extracted entities are compared to determine differences with regard to context and meaning as presented in the respective complex document.

“3. The method of claim 1, further comprising: extracting, for a given type of information, the information presented as a text element in at least one complex document and as a non-text element in at least one complex document.

“4. The method of claim 1, wherein the semantic representation includes the extracted information and a context indicating where in the complex document the extracted information is located.

“5. The method of claim 1, wherein the extracted information includes information extracted from both unformatted portions and formatted portions of the plurality of complex documents.

“6. The method of claim 5, wherein the formatted portions include a table, a list, a picture, a spreadsheet, a graph, or a chart.

“7. The method of claim 5, wherein the unformatted portions include text, numbers, or symbols.

“8. The method of claim 1, wherein the complex documents are summary plan descriptions.

“9. The method of claim 1, wherein an annotated data set is provided to the machine learning process and the machine learning process generates a machine learning model to extract entities based on the provided annotated data set.

“10. The method of claim 1, wherein an annotated data set is provided to the machine learning process to generate and train a machine learning model, and wherein the trained machine learning model is utilized to automatically annotate a received unannotated complex document.

“11. The method of claim 1, comprising: receiving a query requesting a type of information common to each of the plurality of complex documents; and returning the requested information in a readable format allowing side-by-side comparison of the extracted information for each of the plurality of complex documents.

“12. A cognitive data processing system for intelligent annotation and data extraction across a plurality of complex documents comprising: at least one processor configured to: extract information from the plurality of complex documents; map the extracted information to a semantic representation; utilize a natural language process and a machine learning process to extract one or more entities from the semantic representation; generate structured data comprising the extracted entities from the semantic representation along with corresponding attributes; and provide a subset of the structured data corresponding to the query, wherein the subset of structured data may be arranged to correlate entities with corresponding attributes for one or more complex document.

“13. The system of claim 12, wherein the extracted entities are compared to determine differences with regard to context and meaning as presented in the respective complex document.

“14. The system of claim 12, wherein the processor is further configured to extract a type of information presented as a text element in at least one complex document and as a non-text element in at least one other complex document.

“15. The system of claim 12, wherein the semantic representation includes the extracted information and a context indicating where in the complex document the extracted information is located.

“16. The system of claim 12, wherein the extracted information includes information extracted from both unformatted portions and formatted portions of the plurality of complex documents, wherein the formatted portions include a table, a list, a picture, a spreadsheet, a graph, or a chart, and the unformatted portions include text, numbers, or symbols.

“17. The system of claim 12, wherein an annotated data set is provided to the machine learning process and the machine learning process generates a machine learning model to extract entities based on the provided annotated data set.

“18. The system of claim 12, wherein an annotated data set is provided to the machine learning process to generate and train a machine learning model, and wherein the trained machine learning model is utilized to automatically annotate a received unannotated complex document.

“19. The system of claim 12, wherein the processor is further configured to: receive a query requesting a type of information common to each of the plurality of complex documents; and return the requested information in a readable format allowing side-by-side comparison of the extracted information for each of the plurality of complex documents.

“20. A computer program product for intelligent annotation and data extraction across a plurality of complex documents, the computer program product comprising a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code executable by at least one processor to cause the at least one processor to: extract information from the plurality of complex documents; map the extracted information to a semantic representation; utilize a natural language process and a machine learning process to extract one or more entities from the semantic representation; generate structured data comprising the extracted entities from the semantic representation along with corresponding attributes; and provide a subset of the structured data corresponding to the query, wherein the subset of structured data may be arranged to correlate entities with corresponding attributes for one or more complex document.”
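The side-by-side comparison named in claims 11 and 19, and the difference check in claims 2 and 13, can be pictured with a short Python sketch. This is an assumption-laden illustration rather than the claimed system: the row layout, the function names, and the sample values are hypothetical, and the sketch compares extracted values only, not the "context and meaning" the claims mention.

```python
# Hypothetical structured rows, as they might look after extraction from two plan years.
rows = [
    {"doc_id": "plan_2018", "entity": "deductible", "value": 1500},
    {"doc_id": "plan_2019", "entity": "deductible", "value": 2000},
    {"doc_id": "plan_2018", "entity": "co-pay", "value": 25},
    {"doc_id": "plan_2019", "entity": "co-pay", "value": 25},
]


def side_by_side(rows, entity):
    """Map each document to its value for one entity."""
    return {row["doc_id"]: row["value"] for row in rows if row["entity"] == entity}


def format_comparison(rows, entities, doc_ids):
    """Render a plain-text table with one row per entity and one column per document."""
    lines = ["entity".ljust(12) + "".join(d.ljust(12) for d in doc_ids)]
    for entity in entities:
        values = side_by_side(rows, entity)
        cells = "".join(str(values.get(d, "n/a")).ljust(12) for d in doc_ids)
        lines.append(entity.ljust(12) + cells)
    return "\n".join(lines)


def differing_entities(rows, entities, doc_ids):
    """List entities whose values differ between documents (a missing value counts)."""
    result = []
    for entity in entities:
        values = side_by_side(rows, entity)
        if len({values.get(d) for d in doc_ids}) > 1:
            result.append(entity)
    return result


docs = ["plan_2018", "plan_2019"]
entities = ["deductible", "co-pay"]
print(format_comparison(rows, entities, docs))
print("changed:", differing_entities(rows, entities, docs))
```

Laying the documents out as columns is one simple way to surface year-to-year coverage changes, which the background section identifies as hard to spot in the source documents.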

For the URL and more information on this patent application, see: Ray, Ritwik; Angelopoulos, Marie; Roberts, Frederick; Gagen, Christopher; Gabrani, Maria. Extraction Of Information And Smart Annotation Of Relevant Information Within Complex Documents. Filed February 12, 2018 and posted August 15, 2019. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220190251182%22.PGNR.&OS=DN/20190251182&RS=DN/20190251182

(Our reports deliver fact-based news of research and discoveries from around the world.)
