Patent Issued for Passive Authentication Through Voice Data Analysis (USPTO 10,867,612)
2020 DEC 24 (NewsRx) -- By a
The patent’s inventors are Pollack,
This patent was filed on
From the background information supplied by the inventors, news correspondents obtained the following quote: “Various types of network-connected smart appliances, Internet of Things (IoT) devices, mobile devices, and/or other computing devices have become available to consumers. Such devices may serve a primary function (e.g., a washing machine washing clothes), while also providing smart capabilities for sensing its state and/or the state of the local environment, collecting state data, executing logic, communicating information to other devices over networks, and so forth. Different devices may have different capabilities with regard to data input and data output. For example, a device may accept audio input (e.g., speech input) and provide audio output, but may not include a display for visually presenting data, or may include a limited display that does not support a graphical user interface (GUI). As another example, a device such as a television may include a large display but may lack a full-featured user interface for inputting data.”
Supplementing the background information on this patent, NewsRx reporters also obtained the inventors’ summary information for this patent: “Implementations of the present disclosure are generally directed to authenticating a user based at least partly on audio information. More specifically, implementations are directed to collecting voice data from a user through a conversational user interface (CUI) of a device, passively authenticating the user based on the voice data, and controlling access to sensitive information based on the authentication of the user.
“In general, innovative aspects of the subject matter described in this specification can be embodied in methods that include operations of: receiving speech data provided by a user during a speech interaction with a conversational user interface (CUI) executing on a computing device; analyzing the speech data to attempt a passive authentication of the user during the speech interaction, wherein the passive authentication is attempted based at least partly on the speech data and does not include explicitly prompting the user for a credential; receiving a request to access sensitive information associated with the user, the request submitted by the user through the CUI during the speech interaction; and in response to the request and based on a determination that the passive authentication of the user is successful, providing access to the sensitive information through the CUI.
“Implementations can optionally include one or more of the following features: the speech data is further analyzed to identify the user among a plurality of users who are registered as users of the computing device; the operations further include based on a determination that the user has not been passively authenticated when the request is received, attempting to actively authenticate the user through the CUI; attempting to actively authenticate the user includes prompting the user to provide, through the CUI, one or more of a personal identification number (PIN), a password, and a passphrase; the operations further include receiving video data that is captured by at least one camera of the computing device; the video data is analyzed with the speech data to attempt to passively authenticate the user; analyzing the video data includes one or more of a facial recognition analysis, a posture recognition analysis, a gesture recognition analysis, and a gait recognition analysis; analyzing the speech data to attempt to passively authenticate the user includes: providing the speech data as input to a model of a speech pattern of the user, the model having been previously developed based on collected speech data of the user, receiving, from the model, a confidence metric indicating a likelihood that the speech data is spoken by the user, and determining that the user is authenticated based on the confidence metric exceeding a threshold value; the analyzed speech data is audio data that is recorded by at least one microphone of the computing device; the analyzed speech data is text data that is generated by transcribing at least a portion of audio data that is recorded by at least one microphone of the computing device; and/or the request to access the sensitive information includes one or more of: a request to access financial account information describing at least one account of the user, a request to perform a financial transaction involving at least one account of the user, and a request to perform a funds transfer involving at least one account of the user.
“Other implementations of any of the above aspects include corresponding systems, apparatus, and computer programs that are configured to perform the actions of the methods, encoded on computer storage devices. The present disclosure also provides a computer-readable storage medium coupled to one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations in accordance with implementations of the methods provided herein. The present disclosure further provides a system for implementing the methods provided herein. The system includes one or more processors, and a computer-readable storage medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations in accordance with implementations of the methods provided herein.
“Implementations of the present disclosure provide the following technical advantages and technical improvements over previously available solutions. Implementations provide for passive authentication that is performed unobtrusively based on collected speech data of the user during a conversation between the user and a personal assistant device, without explicitly asking or requiring the user to provide credentials (e.g., identifying information). Such passive authentication through speech data analysis is easy and transparent from the user’s point of view, and provides a more favorable user experience compared to traditional authentication methods based on user-provided credentials. The passive authentication can be performed on an ongoing basis, such that the user is authenticated and access is available to sensitive information should the user request such information during a conversation with the personal assistant device. Because implementations perform passive authentication based on collected audio data, such authentication may be performed with higher confidence than traditional techniques for authenticating the user, given that the analysis may continue collecting and analyzing the user’s speech data until authentication is successful. Accordingly, implementations make more efficient use of processing power, network bandwidth, active memory, storage space, and/or other computing resources that traditional authentication systems expend to recover from, and/or retry following, multiple failed authentication attempts based on inaccurately provided credentials. The passive authentication described herein is particularly advantageous on headless PA devices that may lack a display and/or other I/O components that would enable a user to enter traditional authentication credentials.
“In some implementations, the authentication can take place on a personal assistant device and can be passed to another device for a limited period of time. For example, a user can authenticate on their in-home personal assistant device, and the authentication is passed to the user’s car for a period of time. In some instances, this passed authentication can be for a subset of data and/or functionality on the receiving device.
“In some implementations, the authentication can also be used without a personally owned device but by leveraging public access devices such as speaker systems, help lines, smart TVs, and/or other voice-enabled interactive systems. Passive authentication is of particular advantage in controlling Internet of Things (IoT) devices in a smart home or other type of connected ecosystem. For example, voice commands can be used to control a thermostat in a different building, room, or floor, and the commands can be validated as coming from an authenticated source. Passive authentication can also be used at a point of interaction such as an automated teller machine (ATM) and/or other type of automated teller or agent system and can perform operations for authentication as well as for emotion detection (e.g., to detect when a user is under duress), and take appropriate actions. Other applications include but are not limited to drone control, assisted technologies, banking and investments, and so forth.
“It is appreciated that aspects and features in accordance with the present disclosure can include any combination of the aspects and features described herein. That is, aspects and features in accordance with the present disclosure are not limited to the combinations of aspects and features specifically described herein, but also include any combination of the aspects and features provided.
“The details of one or more implementations of the present disclosure are set forth in the accompanying drawings and the description below. Other features and advantages of the present disclosure will be apparent from the description and drawings, and from the claims.”
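As an illustration only, the decision flow in the inventors' summary (accept the user passively when a speech model's confidence metric exceeds a threshold, otherwise fall back to an explicit prompt such as a PIN) might be sketched as follows. This is not the patent's implementation: the threshold value, the `authenticate` function, the `AuthResult` type, and the PIN callback are all hypothetical names chosen for this sketch, and the speech model itself is assumed to exist elsewhere and to supply the confidence score.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical value: the patent text does not state a threshold.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class AuthResult:
    authenticated: bool
    method: str  # "passive", "active", or "none"


def authenticate(
    speech_confidence: float,
    prompt_for_pin: Callable[[], str],
    expected_pin: str,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> AuthResult:
    """Try passive authentication first, then fall back to an active prompt.

    speech_confidence stands in for the confidence metric that a
    per-user speech model (assumed, not shown) would return.
    """
    # Passive path: succeed only when the metric exceeds the threshold.
    if speech_confidence > threshold:
        return AuthResult(True, "passive")
    # Active path: explicitly ask the user for a credential.
    if prompt_for_pin() == expected_pin:
        return AuthResult(True, "active")
    return AuthResult(False, "none")
```

In this sketch the PIN prompt is only invoked when the passive check fails, which mirrors the summary's point that the user is not explicitly asked for a credential unless passive authentication has not succeeded.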
The claims supplied by the inventors are:
“The invention claimed is:
“1. A computer-implemented method performed by at least one processor, the method comprising: receiving, by the at least one processor, speech data provided by a user during a speech interaction with a conversational user interface (CUI) executing on a computing device; analyzing, by the at least one processor, the speech data to attempt a passive authentication of the user during the speech interaction, wherein the passive authentication is attempted based at least partly on the speech data and does not include explicitly prompting the user for a credential; storing, by the at least one processor, session information related to the speech interaction with the CUI, the session information including an indication of whether the passive authentication of the user has been successful during the speech interaction; in response to determining that the passive authentication of the user has been successful during the speech interaction, storing, with the session information, a time value that indicates a period of time following a cessation of speech interaction between the user and the CUI after which the successful passive authentication expires; receiving, by the at least one processor, a request to access sensitive information associated with the user, the request submitted by the user through the CUI during the speech interaction; and in response to the request: a) determining, from the session information, that the user has been successfully authenticated during the speech interaction, and b) based on the determination that the user has been successfully authenticated, providing, by the at least one processor, access to the sensitive information through the CUI.
“2. The method of claim 1, wherein the speech data is further analyzed to identify the user among a plurality of users who are registered as users of the computing device.
“3. The method of claim 1, further comprising: based on a determination that the passive authentication has expired when the request is received, attempting, by the at least one processor, to actively authenticate the user through the CUI.
“4. The method of claim 3, wherein attempting to actively authenticate the user includes prompting the user to provide, through the CUI, one or more of a personal identification number (PIN), a password, and a passphrase.
“5. The method of claim 1, further comprising: receiving, by the at least one processor, video data that is captured by at least one camera of the computing device; wherein the video data is analyzed with the speech data to attempt to passively authenticate the user.
“6. The method of claim 5, wherein analyzing the video data includes one or more of a facial recognition analysis, a posture recognition analysis, a gesture recognition analysis, and a gait recognition analysis.
“7. The method of claim 1, wherein analyzing the speech data to attempt to passively authenticate the user includes: providing the speech data as input to a model of a speech pattern of the user, the model having been previously developed based on collected speech data of the user; receiving, from the model, a confidence metric indicating a likelihood that the speech data is spoken by the user; and determining that the user is authenticated based on the confidence metric exceeding a threshold value.
“8. The method of claim 1, wherein the analyzed speech data is audio data that is recorded by at least one microphone of the computing device.
“9. The method of claim 1, wherein the analyzed speech data is text data that is generated by transcribing at least a portion of audio data that is recorded by at least one microphone of the computing device.
“10. The method of claim 1, wherein the request to access the sensitive information includes one or more of: a request to access financial account information describing at least one account of the user; a request to perform a financial transaction involving at least one account of the user; and a request to perform a funds transfer involving at least one account of the user.
“11. A system, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, the memory storing instructions which, when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving speech data provided by a user during a speech interaction with a conversational user interface (CUI) executing on a computing device; analyzing the speech data to attempt a passive authentication of the user during the speech interaction, wherein the passive authentication is attempted based at least partly on the speech data and does not include explicitly prompting the user for a credential; storing session information related to the speech interaction with the CUI, the session information including an indication of whether the passive authentication of the user has been successful during the speech interaction; in response to determining that the passive authentication of the user has been successful during the speech interaction, storing, with the session information, a time value that indicates a period of time following a cessation of speech interaction between the user and the CUI after which the successful passive authentication expires; receiving a request to access sensitive information associated with the user, the request submitted by the user through the CUI during the speech interaction; and in response to the request: a) determining, from the session information, that the user has been successfully authenticated during the speech interaction, and b) based on the determination that the user has been successfully authenticated, providing, by the at least one processor, access to the sensitive information through the CUI.
“12. The system of claim 11, wherein the speech data is further analyzed to identify the user among a plurality of users who are registered as users of the computing device.
“13. The system of claim 11, the operations further comprising: based on a determination that the passive authentication has expired when the request is received, attempting, by the at least one processor, to actively authenticate the user through the CUI.
“14. The system of claim 13, wherein attempting to actively authenticate the user includes prompting the user to provide, through the CUI, one or more of a personal identification number (PIN), a password, and a passphrase.
“15. The system of claim 11, the operations further comprising: receiving video data that is captured by at least one camera of the computing device; wherein the video data is analyzed with the speech data to attempt to passively authenticate the user.
“16. The system of claim 15, wherein analyzing the video data includes one or more of a facial recognition analysis, a posture recognition analysis, a gesture recognition analysis, and a gait recognition analysis.
“17. The system of claim 11, wherein analyzing the speech data to attempt to passively authenticate the user includes: providing the speech data as input to a model of a speech pattern of the user, the model having been previously developed based on collected speech data of the user; receiving, from the model, a confidence metric indicating a likelihood that the speech data is spoken by the user; and determining that the user is authenticated based on the confidence metric exceeding a threshold value.
“18. The system of claim 11, wherein the analyzed speech data is audio data that is recorded by at least one microphone of the computing device.
“19. The system of claim 11, wherein the analyzed speech data is text data that is generated by transcribing at least a portion of audio data that is recorded by at least one microphone of the computing device.
“20. One or more non-transitory computer-readable media storing instructions which, when executed by at least one processor, cause the at least one processor to perform operations comprising: receiving speech data provided by a user during a speech interaction with a conversational user interface (CUI) executing on a computing device; analyzing the speech data to attempt a passive authentication of the user during the speech interaction, wherein the passive authentication is attempted based at least partly on the speech data and does not include explicitly prompting the user for a credential; storing session information related to the speech interaction with the CUI, the session information including an indication of whether the passive authentication of the user has been successful during the speech interaction; in response to determining that the passive authentication of the user has been successful during the speech interaction, storing, with the session information, a time value that indicates a period of time following a cessation of speech interaction between the user and the CUI after which the successful passive authentication expires; receiving a request to access sensitive information associated with the user, the request submitted by the user through the CUI during the speech interaction; and in response to the request: a) determining, from the session information, that the user has been successfully authenticated during the speech interaction, and b) based on the determination that the user has been successfully authenticated, providing, by the at least one processor, access to the sensitive information through the CUI.
“21. The method of claim 3, further comprising: determining that the active authentication of the user is successful; and in response, applying the speech data from the speech interaction with the CUI to refine a model of the user’s speech.”
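The session handling recited in claims 1 and 3 (store whether passive authentication succeeded, store a time value after which it expires once the user stops speaking, and fall back to active authentication if it has expired) can be pictured with a small sketch. The `Session` class, its field names, and the string results below are assumptions made for illustration, not terms from the patent, and timestamps are passed in explicitly as plain seconds to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical session record for one speech interaction."""
    passively_authenticated: bool = False
    expiry_seconds: Optional[float] = None  # time value stored on success
    last_speech_at: Optional[float] = None  # when the user last spoke

    def record_speech(self, now: float) -> None:
        # Each utterance restarts the idle period.
        self.last_speech_at = now

    def mark_authenticated(self, expiry_seconds: float) -> None:
        # Store the success flag together with the expiry time value.
        self.passively_authenticated = True
        self.expiry_seconds = expiry_seconds

    def is_authenticated(self, now: float) -> bool:
        if (
            not self.passively_authenticated
            or self.expiry_seconds is None
            or self.last_speech_at is None
        ):
            return False
        # Valid only while the idle time since the last speech
        # has not passed the stored period.
        return (now - self.last_speech_at) <= self.expiry_seconds


def handle_sensitive_request(session: Session, now: float) -> str:
    """Grant access if the session is still authenticated; otherwise
    signal that an active prompt (e.g. a PIN) would be needed."""
    if session.is_authenticated(now):
        return "grant"
    return "active_auth_required"
```

For example, a session marked authenticated with a 60-second period and last speech at t=100 would be granted access at t=130 but would require active authentication at t=161.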
For the URL and additional information on this patent, see: Pollack,
(Our reports deliver fact-based news of research and discoveries from around the world.)


