Invented by Petr KARASHCHUK, Tomas A. VEGA GALVEZ, and Thomas R. GRUBER; assigned to Apple Inc.
The Apple Inc. invention works as follows: Systems and methods for operating an intelligent automated assistant in a messaging environment are provided. In one example, a graphical user interface (GUI) containing a plurality of previous messages exchanged between the user of the electronic device and the digital assistant is displayed on the device's display. The previous messages can be displayed in a conversational format. User input can be received and displayed in the GUI as a first message. A context state of the electronic device corresponding to the displayed user input can be stored. A task can be performed in accordance with a user intent derived from the user input, and a response based on the task can be displayed in the GUI as a second message.
Background for Intelligent Automated Assistant in a Messaging Environment
Intelligent automated assistants, or digital assistants, can provide a useful interface between electronic devices and human users. Such assistants allow users to interact with devices using natural language in spoken and/or textual form. For example, a user can provide a speech input containing a request to a digital assistant operating on an electronic device. The digital assistant can interpret the user's intent from the input and operationalize it into tasks. The tasks can then be performed by executing one or more services of the electronic device, and a relevant output responsive to the user request can be returned to the user.
Typically, electronic devices implement a dedicated user interface for interacting with the digital assistant. For example, an electronic device can implement a voice-based interface for this purpose. Such dedicated user interfaces can limit the opportunities for interaction, which may in turn limit the adoption and use of digital assistants.
Systems and methods for operating an intelligent automated assistant in a messaging environment are provided. In one example, a GUI containing a plurality of previous messages exchanged between the user of the electronic device and the digital assistant is displayed on the device's display. The previous messages can be displayed in a conversational format. User input can be received and displayed in the GUI as a first message. A context state of the electronic device corresponding to the displayed user input can be stored. A task can be performed in accordance with a user intent derived from the user input, and a response based on the task can be displayed in the GUI as a second message.
In another example, a GUI containing a plurality of previous messages between the user of the electronic device and the digital assistant can be displayed on the device's display. The previous messages can be displayed in a conversational format. A first user input that includes a media object can be received, and the media object can be displayed in the GUI in response. A second user input that includes text can be received, and the text can be displayed in the GUI as a second message in response. A user intent corresponding to both the first and second user inputs can be determined. It can then be determined whether satisfying the user intent requires extracting text from the media object. Upon determining that it does, the text can be extracted from the media object, a task can be performed in accordance with the user intent using the extracted text, and a response indicating whether the user intent was satisfied can be displayed in the GUI as a third message.
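The decision flow in this example can be sketched as follows. This is a minimal illustration, not the patent's implementation: the intent names, the keyword check, and `extract_text` are hypothetical stand-ins for the assistant's natural-language processing and its media analysis (e.g., OCR over an image).

```python
def requires_text_extraction(intent: str) -> bool:
    # Hypothetical: only some intents need text pulled from the media object.
    return intent in {"translate_text", "read_aloud"}

def extract_text(media_object: dict) -> str:
    # Stand-in for OCR or speech-to-text applied to the attached media item.
    return media_object.get("embedded_text", "")

def handle_media_and_text(media_object: dict, text_input: str) -> str:
    # A single user intent is inferred from BOTH inputs; a keyword match
    # stands in for real intent inference here.
    intent = "translate_text" if "translate" in text_input.lower() else "unknown"
    if requires_text_extraction(intent):
        extracted = extract_text(media_object)
        # Perform the task using the extracted text; the result would be
        # displayed in the GUI as a third message.
        return f"Translation of '{extracted}' requested."
    return "Sorry, I can't do that with this attachment."
```

A photo of a French sign plus the typed text "Translate this" would thus be resolved into one task that operates on the text extracted from the photo.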
In a third example, a GUI containing a plurality of previous messages exchanged between the user of the electronic device and a remote user can be displayed on the device's display. The previous messages can be displayed in a conversational format. A first user input directed to the digital assistant can be received from the user of the electronic device and displayed in the GUI as a first message. A task can be performed in accordance with a user intent derived from the first user input, and a response based on the task can be displayed in the GUI as a second message.
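The common message flow across these examples (display the input as a message, store the device's context state, derive an intent, perform a task, display the response) can be sketched as below. This is a toy illustration under stated assumptions: the names `Transcript`, `derive_intent`, and `handle_user_input`, and the keyword-based intent model, are all hypothetical and not from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    sender: str  # "user" or "assistant"
    text: str

@dataclass
class Transcript:
    messages: list = field(default_factory=list)
    # Maps the index of each user message to the device context state
    # captured when that message was displayed.
    context_states: dict = field(default_factory=dict)

    def add_user_message(self, text, device_context):
        self.messages.append(Message("user", text))
        self.context_states[len(self.messages) - 1] = device_context

    def add_assistant_response(self, text):
        self.messages.append(Message("assistant", text))

def derive_intent(text):
    # Toy keyword-based stand-in for natural-language intent inference.
    return "get_weather" if "weather" in text.lower() else "unknown"

def handle_user_input(transcript, text, device_context):
    # Display the user input as a first message and store the context state.
    transcript.add_user_message(text, device_context)
    intent = derive_intent(text)
    if intent == "get_weather":
        # Perform the task using the stored context (e.g., device location).
        response = f"Here is the weather for {device_context.get('location', 'your area')}."
    else:
        response = "Sorry, I didn't understand that."
    # Display the response as a second message.
    transcript.add_assistant_response(response)
    return response
```

Storing the context state alongside each message is what later lets the assistant interpret follow-up requests against the state the device was in when the earlier message was displayed.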
The following description refers to the accompanying drawings, which illustrate specific examples that can be practiced. Other examples can be used, and structural changes can be made, without departing from the scope of the various examples.
It may be desirable to integrate a digital assistant into a messaging environment. Messaging platforms allow messages containing multiple input modes (e.g., text, audio files, images, videos, etc.) to be sent and received. A user can therefore define a request to the digital assistant by combining inputs (e.g., an image and text), which can enhance the capabilities and functionality of the digital assistant and allow for a richer interactive experience between the user and the digital assistant. Because a messaging platform provides a visual interface, it also allows interaction in more environments than platforms based on voice/audio; a digital assistant implemented in a messaging environment can thus be more accessible, particularly in noisy environments or in environments where audio outputs are not desired (e.g., a library). Further, messaging platforms provide a conversational interface in which brief, concise communications are exchanged between two or more parties in chronological order. A digital assistant in a messaging environment can therefore participate in multi-party conversations, where the benefits of the digital assistant are shared by all participants. The chronological format also allows a user to easily review past interactions with the digital assistant and to use the context history associated with previous interactions to define an even wider range of tasks.
Although the following description uses the terms "first" and "second" to describe various elements, these terms should not limit the elements. These terms are used only to distinguish one element from another. For example, a first input can be termed a second input, and similarly a second input can be termed a first input, without departing from the scope of the described examples. The first input and the second input are both inputs and, in some cases, can be separate and different inputs.
The terminology used in the description of the examples herein is for the purpose of describing particular examples only and is not intended to be limiting. As used in the description of the examples and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. The terms "includes," "including," "comprises," and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term "if" can be construed to mean "when," "upon," "in response to determining," or "in response to detecting," depending on the context. Similarly, the phrase "if it is determined" or "if [a stated condition or event] is detected" can be construed to mean "upon determining," "in response to determining," "upon detecting [the stated condition or event]," or "in response to detecting [the stated condition or event]," depending on the context.
1. System and Environment
FIG. 1 illustrates a block diagram of system 100 according to various examples. In some examples, system 100 can implement a digital assistant. The terms "digital assistant," "virtual assistant," "intelligent automated assistant," and "automatic digital assistant" are used interchangeably and can refer to any information processing system that interprets natural language input in spoken and/or textual form to infer user intent, and performs actions based on the inferred user intent. For example, to act on an inferred user intent, the system can perform one or more of the following: identifying a task flow with steps and parameters designed to accomplish the inferred user intent; inputting specific requirements from the inferred user intent into the task flow; executing the task flow by invoking programs, methods, services, APIs, or the like; and generating responses to the user.
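The act-on-intent steps just listed (identify a task flow with its required parameters, fill in the requirements, execute, and respond) can be sketched as follows. The intent names and the `TASK_FLOWS` registry are hypothetical illustrations, not the patent's actual task-flow mechanism:

```python
# Hypothetical registry mapping inferred intents to task flows: each flow
# names its required parameters and the action that executes it.
TASK_FLOWS = {
    "set_alarm": {
        "required": ["time"],
        "action": lambda p: f"Alarm set for {p['time']}.",
    },
    "send_message": {
        "required": ["recipient", "body"],
        "action": lambda p: f"Message sent to {p['recipient']}.",
    },
}

def act_on_intent(intent, params):
    flow = TASK_FLOWS.get(intent)
    if flow is None:
        return "Sorry, I can't help with that yet."
    missing = [r for r in flow["required"] if r not in params]
    if missing:
        # The assistant would elicit missing parameters in a follow-up turn.
        return f"I need more information: {', '.join(missing)}."
    # Execute the task flow and generate a response to the user.
    return flow["action"](params)
```

A real system would invoke programs, services, or APIs at the execution step; the lambdas here simply stand in for those calls.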
A digital assistant is capable of accepting a user request at least partially in the form of a natural language command, request, statement, or narrative. The user can request either an informational answer to a question or the performance of a task by the digital assistant. The digital assistant can respond by providing the requested informational answer, by performing the requested task, or by a combination of the two. For example, a user can ask the digital assistant, "Where am I right now?" Based on the user's current location, the digital assistant can answer, "You are at the west gate of Central Park." The user can also request the performance of a task, for example, "Please invite my girlfriend's friends to her birthday party next Monday." In response, the digital assistant can acknowledge the request by saying "Yes, right away," and can then send a suitable calendar invite on behalf of the user to each of the friends listed in the user's electronic address book. During performance of a requested task, the digital assistant can sometimes interact with the user in a continuous dialogue over an extended period of time. There are numerous ways of interacting with a digital assistant to request information or the performance of various tasks. In addition to providing verbal responses and taking programmed actions, the digital assistant can also provide responses in other visual or audio forms, e.g., as text, alerts, music, videos, animations, etc.
As shown in FIG. 1, in some examples the digital assistant can be implemented according to a client-server model. The digital assistant can include a client-side portion 102 (hereafter "DA client 102") executed on user device 104, and a server-side portion 106 (hereafter "DA server 106") executed on server system 108. DA client 102 can communicate with DA server 106 through one or more networks 110. DA client 102 can provide client-side functionalities such as user-facing input and output processing and communication with DA server 106. DA server 106 can provide server-side functionalities for any number of DA clients 102, each residing on a respective user device 104.
In some examples, DA server 106 can include a client-facing I/O interface 112, one or more processing modules 114, data and models 116, and an I/O interface to external services 118. The client-facing I/O interface 112 can facilitate the client-facing input and output processing for DA server 106. The one or more processing modules 114 can utilize data and models 116 to determine the user's intent from natural language input, and can perform task execution based on the inferred user intent. In some examples, DA server 106 can communicate with external services through network(s) 110 for task completion or information acquisition. The I/O interface to external services 118 can facilitate such communications.
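One way to picture the division of labor among these numbered components (client-facing I/O 112, processing modules 114, data and models 116, external-services I/O 118) is the sketch below. The class and method names are illustrative assumptions, not the patent's interfaces:

```python
class DAServer:
    """Toy stand-in for DA server 106 and its internal components."""

    def __init__(self, intent_model, external_services):
        self.intent_model = intent_model            # plays the role of data and models 116
        self.external_services = external_services  # reached through I/O interface 118

    def handle_request(self, text):
        # Client-facing I/O interface 112: receive a request from a DA client.
        intent = self.intent_model(text)            # infer user intent from natural language
        # Processing modules 114: execute the task, calling an external
        # service if one is needed for task completion.
        service = self.external_services.get(intent)
        if service is None:
            return "No service available for this request."
        result = service()
        # Client-facing I/O interface 112 again: return the response to the DA client.
        return result
```

Keeping intent inference, task execution, and external-service access behind separate interfaces is what lets one server system serve many DA clients, as the surrounding text describes.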
Communication network(s) 110 can include local area networks (LAN) and wide area networks (WAN), e.g., the Internet. Communication network(s) 110 can be implemented using any known network protocol, including various wired or wireless protocols, such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), WiMAX, or any other suitable communication protocol.
Server system 108 can be implemented on one or more standalone computers or a distributed network of computers. In some examples, server system 108 can also employ various virtual devices and/or services of third-party service providers (e.g., third-party cloud service providers) to provide the underlying computing resources and/or infrastructure resources of server system 108.
In some examples, user device 104 can communicate with DA server 106 via a second user device 122. Second user device 122 can be similar or identical to user device 104. For example, second user device 122 can be similar to devices 200, 400, or 600, described below with reference to FIGS. 2A, 4, and 6A-B. User device 104 can be configured to communicatively couple to second user device 122 via a direct communication connection, such as Bluetooth, NFC, BTLE, or the like, or via a wired or wireless network, such as a local Wi-Fi network. In some examples, second user device 122 can be configured to act as a proxy between user device 104 and DA server 106. For example, DA client 102 of user device 104 can be configured to transmit information (e.g., a user request received at user device 104) to DA server 106 via second user device 122. DA server 106 can process the information and return relevant data (e.g., data content responsive to the user request) to user device 104 via second user device 122.
In some examples, user device 104 can be configured to communicate abbreviated requests for data to second user device 122 to reduce the amount of information transmitted from user device 104. Second user device 122 can be configured to determine supplemental information to add to the abbreviated request in order to generate a complete request to transmit to DA server 106. This system architecture can advantageously allow user device 104 having limited communication capabilities and/or limited battery power (e.g., a watch or a similar compact electronic device) to access services provided by DA server 106 by using second user device 122, having greater communication capabilities and/or battery power (e.g., a mobile phone, laptop computer, tablet computer, or the like), as a proxy to DA server 106. While only two user devices 104 and 122 are shown in FIG. 1, it should be appreciated that system 100 can include any number and type of user devices configured in this proxy configuration to communicate with DA server 106.
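A minimal sketch of this abbreviated-request scheme, assuming simple dictionaries for requests (the field names are hypothetical, not from the patent):

```python
def build_abbreviated_request(user_request: dict) -> dict:
    # The resource-constrained device (e.g., a watch) sends only the
    # minimal fields, to reduce the amount of information transmitted.
    return {"query": user_request["query"]}

def complete_request(abbreviated: dict, supplemental: dict) -> dict:
    # The proxy (second user device 122) determines supplemental
    # information and merges it in before forwarding the complete
    # request to the DA server.
    complete = dict(abbreviated)
    complete.update(supplemental)
    return complete
```

For example, a watch might send only the query text, while the paired phone fills in location and locale before forwarding the request upstream.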
Although the digital assistant shown in FIG. 1 can include both a client-side portion (e.g., DA client 102) and a server-side portion (e.g., DA server 106), in some examples the functions of a digital assistant can be implemented as a standalone application installed on a user device. In addition, the division of functionalities between the client and server portions of the digital assistant can vary in different implementations. For instance, in some examples, the DA client can be a thin client that provides only user-facing input and output processing functions and delegates all other functionalities of the digital assistant to a backend server.