Patent Issued for Extraction of information and smart annotation of relevant information within complex documents (USPTO 11163837): International Business Machines Corporation - Insurance News | InsuranceNewsNet

November 24, 2021 Newswires

Insurance Daily News

2021 NOV 24 (NewsRx) -- By a News Reporter-Staff News Editor at Insurance Daily News -- According to news reporting originating from Alexandria, Virginia, by NewsRx journalists, a patent by the inventors Angelopoulos, Marie (Cortlandt Manor, NY, US), Gabrani, Maria (Thalwil, CH), Gagen, Christopher (Perrysburg, OH, US), Ray, Ritwik (Cary, NC, US), Roberts, Frederick (Franklin, TN, US), filed on June 18, 2019, was published online on November 2, 2021.

The assignee for this patent, patent number 11163837, is International Business Machines Corporation (Armonk, New York, United States).

Reporters obtained the following quote from the background information supplied by the inventors:

“1. Technical Field

“Present invention embodiments relate to extracting data from documents, and more specifically, to identifying the presence or absence of a key term or concept, which may be a unique key term or concept, in a complex document, extracting information related to the key term or concept, and/or comparing the extracted information.

“2. Discussion of the Related Art

“With the advent of sophisticated document generation and processing programs, it has become routine to generate complex documents comprising tables, graphs, and unstructured text. Such documents may be hundreds or even thousands of pages in length. Often, such documents are modified or merged with other documents, at least on an annual basis, making it difficult to compare information between various versions of documents.

“Additionally, in certain industries, such as the health insurance industry, documents may describe personalized or customized plan information specific to an organization. While each plan may vary from company to company, common concepts (e.g., deductible, co-pay, formulary, etc.) will be present in each document. In some cases, the same style of document, e.g., based on a common or similar template, may be provided, wherein each document is customized to the needs of a particular group. However, it is difficult to parse through such large, complex documents to find a key item or concept or an answer to a question about the content of the complex document.

“For example, Health Insurance Benefit Coverage Summary Plan Descriptions (SPDs), which describe medical, dental, vision and other health benefit coverage, are often more than several hundred pages long. SPDs are structured to comply with regulatory requirements laid down by government and other statutory bodies. Because these documents are required to comply with federal law regulations, SPDs contain many similarities, but are tailored to individual health plans for the respective organization for which a health care plan is provided, which is the underlying cause for the significant variations in the content.

“Such documents are difficult to understand by most people, and participants in the health plan often have difficulty locating information that is key to their question within such large, complex document(s), particularly when participants are in need of care for themselves or someone in their family. Additionally, it is difficult to identify specific changes in coverage from year to year as the same terminology may not be consistently applied.

“Further, it is difficult to apply business analytics (e.g., to benefit health care plan design and benchmarking) to unstructured text data with complex terminology, and it is difficult for employees to understand the terminology to determine which aspects of medical care are covered, and which are not.

“In particular, unstructured data provides particular challenges with regard to: usability, volume, variability, and quality. Regarding content format variability, documents may have information represented in multiple formats (e.g., unstructured text, diagrams, tables, figures, charts, etc.) and consumption of this type of data for decision making is challenging.

“Regarding volume, unstructured data is growing at a rate of approximately 62% per year, further complicating collection and extraction of data. Regarding variability, such documents often have a wide range of styles, formats, and codes with similar intents. Regarding quality, such documents frequently originate from different sources and have a high level of ambiguity in natural language. Accordingly, managing such documents is difficult, time consuming, and complex. Health benefits is one such type of complex document, other complex documents include but are not limited to insurance documents (e.g., home or auto), policy documents (e.g., employment, government), legal documents, etc.”

In addition to obtaining background information on this patent, NewsRx editors also obtained the inventors’ summary information for this patent: “According to embodiments of the present invention, methods and systems are provided for extraction of information from complex documents comprising unstructured data to create a structured data repository. Such techniques may include using Natural Language Processing (NLP) in combination with machine learning and cognitive systems to identify relevant data.

“According to embodiments of the invention, information may be extracted from a plurality of complex documents, and the extracted information may be mapped to a semantic representation. Information may be extracted from text or non-text elements in the plurality of complex documents. Extracted information may include text extraction, symbol extraction, numerical extraction, and so forth, with such information extracted from any suitable location in the complex document, including text, tables, lists, charts, graphs, etc. Natural language processing and machine learning may be utilized to extract one or more entities from the semantic representation. Structured data comprising the extracted entities is generated from the semantic representation and corresponding attributes. In response to receiving a user query, a subset of the structured data corresponding to the query is returned, wherein the subset of structured data may be arranged to correlate entities with corresponding attributes for each complex document. Information, unless otherwise indicated, generally refers to unstructured text, which may include but is not limited to symbols, numbers, and alphanumeric characters, etc. associated with free text as well as other types of formatted data including but not limited to tables, charts, graphs, lists, etc.

“It is to be understood that the Summary is not intended to identify key or essential features of embodiments of the present disclosure, nor is it intended to be used to limit the scope of the present disclosure. Other features of the present disclosure will become easily comprehensible through the description below.”
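As a rough illustration of the flow the summary describes (extract values from documents, store them as structured records with their location, and answer a query with a subset of those records), consider the Python sketch below. It is not taken from the patent: the regular expressions are a hypothetical stand-in for the NLP and machine learning models, and the `Record`, `extract`, and `query` names and the sample plan text are assumptions made for this example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    document: str  # which complex document the value came from
    entity: str    # e.g. "copay" or "deductible"
    value: str     # the extracted attribute value
    location: str  # context: where in the document the value was found


# Hypothetical patterns standing in for trained text and table models.
PATTERNS = {
    "deductible": re.compile(r"deductible (?:is|of) (\$[\d,]+)", re.I),
    "copay": re.compile(r"co-?pay (?:is|of) (\$[\d,]+)", re.I),
}


def extract(document: str, sections: dict[str, str]) -> list[Record]:
    """Scan each section and emit one structured record per matched entity."""
    records = []
    for location, text in sections.items():
        for entity, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                records.append(Record(document, entity, match.group(1), location))
    return records


def query(records: list[Record], entity: str) -> list[Record]:
    """Return the subset of structured data for the requested entity."""
    return [r for r in records if r.entity == entity]


# Invented sample text for two plans, keyed by section or table label.
plan_a = {
    "Section 2": "The annual deductible is $1,500.",
    "Section 5": "Your copay is $25 per office visit.",
}
plan_b = {
    "Section 3": "A deductible of $2,000 applies.",
    "Table 4": "Primary care co-pay of $30",
}

repository = extract("Plan A", plan_a) + extract("Plan B", plan_b)
deductibles = query(repository, "deductible")
for record in deductibles:
    print(record.document, record.value, record.location)
```

Keeping the location with each record mirrors the idea of retaining context about where in a document a value was found, so an answer can point the reader back to the source passage.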

The claims supplied by the inventors are:

“1. A method, in a cognitive data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement an intelligent annotation and data extraction system for extracting and processing information across a plurality of complex documents to provide an answer to a user query comprising: extracting information from the plurality of complex documents, wherein a text structure model, trained with a first set of labeled data, extracts information from text of the plurality of complex documents, wherein a table structure model, trained with a second set of labeled data, extracts information from tables of the plurality of complex documents, wherein the first set of labeled data includes text from one or more Summary Plan Description documents, wherein each Summary Plan Description document describes specific health care benefit coverage for plan members, and wherein the second set of labeled data includes tables from one or more Summary Plan Description documents, wherein the Summary Plan Description documents of the first and second sets of labeled data include annotations of exclusions to the health care benefit coverage, and wherein the annotations of exclusions are determined according to an extracted exclusion knowledge base; mapping the extracted information to a semantic representation; utilizing a natural language process and a machine learning process to extract one or more entities from the semantic representation, wherein the machine learning process is trained with a third set of labeled data and is trained using supervised machine learning, and wherein the one or more entities include one or more of: a plan sponsor entity, a plan entity, a conditions/exclusions entity, a prior authorizations entity, an eligibility entity, a copay entity, a coinsurance entity, and an out of pocket limits entity; generating structured data comprising the extracted entities from the semantic representation along with corresponding attributes; and providing a subset of the structured data corresponding to the query, wherein the subset of structured data may be arranged to correlate entities with corresponding attributes for one or more complex documents.

“2. The method of claim 1, wherein the extracted entities are compared to determine differences with regard to context and meaning as presented in the respective complex document.

“3. The method of claim 1, further comprising: extracting, for a given type of information, the information presented as a text element in at least one complex document and as a non-text element in at least one complex document.

“4. The method of claim 1, wherein the semantic representation includes the extracted information and a context indicating where in the complex document the extracted information is located.

“5. The method of claim 1, wherein the extracted information includes information extracted from both unformatted portions and formatted portions of the plurality of complex documents.

“6. The method of claim 5, wherein the formatted portions include a table, a list, a picture, a spreadsheet, a graph, or a chart.

“7. The method of claim 5, wherein the unformatted portions include text, numbers, or symbols.

“8. The method of claim 1, wherein the complex documents are summary plan descriptions.

“9. The method of claim 1, wherein the third set includes an annotated data set that is provided to the machine learning process and the machine learning process generates a machine learning model to extract entities based on the provided annotated data set.

“10. The method of claim 1, wherein an annotated data set is provided to the machine learning process to generate and train a machine learning model, and wherein the trained machine learning model is utilized to automatically annotate a received unannotated complex document.

“11. The method of claim 1, comprising: receiving a query requesting a type of information common to each of the plurality of complex documents; and returning the requested information in a readable format allowing side-by-side comparison of the extracted information for each of the plurality of complex documents.”
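The side-by-side comparison in claim 11 and the difference check in claim 2 can be pictured with a short Python sketch. This is an illustration only, assuming the entities have already been extracted into one dictionary per document; the function names, plan labels, and values are hypothetical and do not come from the patent. It also compares values as literal strings, which is far simpler than the comparison of context and meaning that the claims describe.

```python
from __future__ import annotations


def side_by_side(plans: dict[str, dict[str, str]]) -> list[tuple[str, ...]]:
    """Build a table with one row per entity and one column per document."""
    names = list(plans)
    entities = sorted({entity for attrs in plans.values() for entity in attrs})
    rows = [("entity", *names)]
    for entity in entities:
        # An entity missing from a document is reported rather than dropped.
        rows.append((entity, *(plans[name].get(entity, "not found") for name in names)))
    return rows


def differing_entities(plans: dict[str, dict[str, str]]) -> list[str]:
    """List entities whose extracted values are not identical across documents."""
    return [row[0] for row in side_by_side(plans)[1:] if len(set(row[1:])) > 1]


# Invented, already-extracted values for two plan years.
plans = {
    "2020 plan": {"copay": "$25", "deductible": "$1,500", "coinsurance": "20%"},
    "2021 plan": {"copay": "$30", "deductible": "$1,500"},
}

rows = side_by_side(plans)
changed = differing_entities(plans)
for row in rows:
    print(" | ".join(row))
print("Changed:", ", ".join(changed))
```

Reporting a missing entity as "not found" keeps the presence or absence of a concept visible in the comparison, which is one of the problems the background section raises.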

For more information, see this patent: Angelopoulos, Marie. Extraction of information and smart annotation of relevant information within complex documents. U.S. Patent Number 11163837, filed June 18, 2019, and published online on November 2, 2021. Patent URL: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=11163837.PN.&OS=PN/11163837RS=PN/11163837

(Our reports deliver fact-based news of research and discoveries from around the world.)
