Invented by David Chance Graham, Cyrus Daniel Irani, Aimee Piercy, Thomas Alsina, Apple Inc

The market for intelligent automated assistants for media exploration is rapidly growing, as more and more people are seeking efficient and personalized ways to discover and consume media content. These assistants, powered by artificial intelligence (AI) and machine learning algorithms, are revolutionizing the way we interact with media platforms and helping us navigate through the vast amount of available content.

In today’s digital age, we are bombarded with an overwhelming amount of media options, from streaming services and social media platforms to news websites and podcasts. With so much content to choose from, it can be challenging to find what we are truly interested in. This is where intelligent automated assistants come into play.

These assistants are designed to understand our preferences, interests, and behaviors by analyzing our past interactions with media. By leveraging AI and machine learning, they can make personalized recommendations, curate content, and even predict our future preferences. This level of personalization not only saves us time and effort but also enhances our overall media exploration experience.

One of the key advantages of intelligent automated assistants is their ability to adapt and learn from user feedback. As we interact with these assistants, they gather data on our likes and dislikes, allowing them to continuously refine their recommendations. This iterative process ensures that the content suggestions become more accurate and tailored to our individual tastes over time.

Furthermore, intelligent automated assistants are not limited to a single media platform. They can integrate with multiple platforms, such as streaming services, social media platforms, and news aggregators, to provide a holistic media exploration experience. This means that users can discover content from various sources without the need to switch between different apps or websites.

The market for intelligent automated assistants is not only driven by consumer demand but also by the increasing availability of data and advancements in AI technology. With the proliferation of smart devices and the internet of things (IoT), there is a wealth of data that can be collected and analyzed to improve the accuracy of recommendations. Additionally, advancements in natural language processing and machine learning algorithms have made it possible for these assistants to understand and interpret user preferences more effectively.

As the market continues to grow, we can expect to see further innovations in intelligent automated assistants for media exploration. For example, voice-activated assistants, such as Amazon’s Alexa and Apple’s Siri, are becoming increasingly popular, allowing users to interact with media platforms through voice commands. Virtual reality and augmented reality technologies are also being integrated into these assistants, providing users with immersive and interactive media exploration experiences.

In conclusion, the market for intelligent automated assistants for media exploration is expanding rapidly, driven by the need for personalized and efficient content discovery. These assistants leverage AI and machine learning algorithms to understand user preferences and provide tailored recommendations. As technology continues to advance, we can expect to see further developments in this field, enhancing our media exploration experience and making it easier than ever to discover and consume content.

The Apple Inc invention works as follows

Systems and methods for operating an intelligent automated assistant are provided. According to one example, a method comprises receiving, at an electronic device having one or more processors and memory, a natural-language input indicative of a media request, wherein the natural-language input includes a search parameter. The digital assistant then provides a first media item identified by the search parameter. While providing the first media item, the method includes receiving a second natural-language speech input and determining whether the second input corresponds to a user intent to refine the media request. In accordance with a determination that the second input corresponds to a user intent to refine the media request, the method includes identifying a second media item based on both the search parameter and the second input, and providing the second media item.

Background for Intelligent automated assistant for media exploration

Intelligent automated assistants, or digital assistants, can provide a useful interface between electronic devices and human users. These assistants allow users to interact with devices using natural language, in either spoken or text form. A user could, for example, provide a spoken input containing a request to a digital assistant operating on an electronic device. The digital assistant interprets the user’s intent from the input and translates that intent into tasks. The tasks are then performed by executing one or more services of the electronic device, and relevant output responsive to the user’s request is returned to the user.

In most cases, users rely at least partially on conventional graphical user interfaces to interact with electronic devices. However, in some cases, a digital assistant may be implemented on a device that has limited or no display capability.

Example methods are described herein. One example method includes receiving, at an electronic device with one or more processors and memory, a natural-language input indicating a media request, wherein the input includes a search parameter; providing, by the digital assistant, a first media item identified using the search parameter; receiving a second speech input while providing the first media item; and determining whether the second speech input corresponds to a user intent to refine the media request. In accordance with a determination that the second speech input corresponds to a user intent to refine the media request, the method includes identifying a second media item, different from the first media item, using the search parameter and the second input. The digital assistant then provides the second media item.
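The refinement flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the catalog, the keyword parser, the cue words used to detect refinement intent, and every function name here are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaItem:
    title: str
    genre: str
    decade: int


# Hypothetical catalog, used only for illustration.
CATALOG = [
    MediaItem("Jazz Track A", "jazz", 1950),
    MediaItem("Jazz Track B", "jazz", 1970),
    MediaItem("Rock Track C", "rock", 1970),
]

# Hypothetical cue words that mark an input as a refinement.
REFINEMENT_CUES = ("just", "only", "something")


def parse_parameters(utterance):
    """Toy parser: extract genre and decade keywords as search parameters."""
    params = {}
    for word in utterance.lower().split():
        if word in ("jazz", "rock"):
            params["genre"] = word
        elif word.endswith("s") and word[:-1].isdigit():
            params["decade"] = int(word[:-1])
    return params


def is_refinement(utterance):
    """Toy intent check: a cue word followed by at least one parameter."""
    text = utterance.lower()
    return text.startswith(REFINEMENT_CUES) and bool(parse_parameters(text))


def find_item(params, exclude=None):
    """Return the first catalog item matching every parameter."""
    for item in CATALOG:
        if item != exclude and all(
            getattr(item, key) == value for key, value in params.items()
        ):
            return item
    return None


def handle_second_input(first_params, current_item, utterance):
    """Combine the original parameters with the refinement, if there is one."""
    if not is_refinement(utterance):
        return current_item  # not a refinement: keep providing the first item
    merged = {**first_params, **parse_parameters(utterance)}
    return find_item(merged, exclude=current_item) or current_item


first_params = parse_parameters("play some jazz")
first = find_item(first_params)
second = handle_second_input(first_params, first, "just the 1970s")
print(first.title, "->", second.title)
```

Note that the second item is found using both the original search parameter (the genre) and the refinement (the decade), so the user does not have to repeat the genre.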

Another example method includes receiving a natural-language speech input at an electronic device with one or more processors and memory; identifying, by a digital assistant, a task based on the natural-language speech input; providing, by the digital assistant, a speech output indicative of a verbal response associated with the identified task; and, while providing the speech output, playing back a media item corresponding to the verbal response.
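One way to picture a speech output that overlaps with media playback is with two cooperative coroutines. The sketch below uses Python's standard `asyncio` module. The task name, the response table, and the event log standing in for real audio are all assumptions for illustration.

```python
import asyncio

# Hypothetical table: identified task -> (verbal response, related media item).
RESPONSES = {
    "recommend_album": ("Here is an album you might like", "Sample Album, track 1"),
}


def identify_task(utterance):
    """Toy task identification based on a single keyword."""
    if "recommend" in utterance.lower():
        return "recommend_album"
    raise ValueError("no task identified")


async def speak(text, log):
    """Stand-in for speech synthesis: emit one word at a time."""
    for word in text.split():
        log.append(("speech", word))
        await asyncio.sleep(0)  # yield so playback can proceed meanwhile


async def play_media(title, log):
    """Stand-in for audio playback of the related media item."""
    log.append(("playback_started", title))
    await asyncio.sleep(0)


async def respond(utterance, log):
    answer, media = RESPONSES[identify_task(utterance)]
    # Run the verbal response and the playback concurrently.
    await asyncio.gather(speak(answer, log), play_media(media, log))


log = []
asyncio.run(respond("Recommend an album", log))
print(log)
```

In the resulting log, playback starts before the verbal response has finished, which is the overlap the method describes.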

Another example method includes receiving a speech input indicating a media request; providing, in response, an audio output suggesting a first media item; and determining, by the digital assistant, whether the number of consecutive non-affirmative responses to the media request meets a threshold. If the number of consecutive non-affirmative responses does not meet the threshold, the method includes providing an audio output suggesting a second media item different from the first media item. If the number meets the threshold, the method includes forgoing the suggestion of another media item and instead providing an audio output requesting input from the user.
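The threshold logic above can be illustrated with a short Python sketch. The set of affirmative phrases, the wording of the prompts, the default threshold of 3, and the reading of "meets the threshold" as "greater than or equal to" are assumptions for the example, not details taken from the patent.

```python
# Hypothetical set of responses treated as affirmative.
AFFIRMATIVE = {"yes", "sure", "ok", "play it"}


def run_suggestions(suggestions, responses, threshold=3):
    """Suggest items until one is accepted or rejections meet the threshold.

    Assumes `suggestions` is non-empty. Any affirmative response ends the
    exchange, so the rejection count is always a consecutive count.
    """
    outputs = []
    candidates = iter(suggestions)
    rejections = 0
    outputs.append(f"How about {next(candidates)}?")
    for response in responses:
        if response.strip().lower() in AFFIRMATIVE:
            outputs.append("Playing it now.")
            break
        rejections += 1
        next_item = next(candidates, None)
        if rejections >= threshold or next_item is None:
            # Threshold met: stop suggesting and ask the user instead.
            outputs.append("What would you like to hear?")
            break
        outputs.append(f"How about {next_item}?")
    return outputs


dialogue = run_suggestions(
    ["Album A", "Album B", "Album C", "Album D"],
    ["no", "not that", "skip"],
    threshold=3,
)
for line in dialogue:
    print(line)
```

With a threshold of 3, the third rejection switches the assistant from suggesting Album D to asking the user what they want.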

Another example method includes: receiving, at an electronic device with one or more processors and memory, a speech input that indicates a media request; detecting, by a digital assistant, the physical presence of a plurality of users at the device; obtaining, in response to detecting this physical presence, a plurality of preference profiles corresponding to the plurality of users; providing, by the assistant, a merged preference profile based on the plurality of preference profiles; identifying, by the assistant, a media item using the merged profile; and providing an audio output that includes the identified media item.
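A merged preference profile could be computed in many ways, and the text does not say which one is used. The following Python sketch assumes a simple average of per-genre scores across the users who are present. The profile data, the catalog, and the list of present users (standing in for actual presence detection) are hypothetical.

```python
from collections import Counter

# Hypothetical per-user preference profiles: genre -> score from 0 to 10.
PROFILES = {
    "alice": {"jazz": 8, "rock": 2},
    "bob": {"jazz": 4, "rock": 6, "pop": 10},
}

# Hypothetical catalog of (title, genre) pairs.
CATALOG = [("Jazz Mix", "jazz"), ("Rock Mix", "rock"), ("Pop Mix", "pop")]


def merge_profiles(profiles):
    """Average each genre's score over all profiles (assumed merge strategy).

    A genre missing from a profile counts as a score of 0 for that user.
    """
    if not profiles:
        return {}
    totals = Counter()
    for profile in profiles:
        totals.update(profile)  # adds the scores genre by genre
    return {genre: total / len(profiles) for genre, total in totals.items()}


def pick_media(merged):
    """Pick the catalog item whose genre scores highest in the merged profile."""
    return max(CATALOG, key=lambda item: merged.get(item[1], 0))


# Presence detection is stubbed out as a fixed list of detected users.
present_users = ["alice", "bob"]
merged = merge_profiles([PROFILES[user] for user in present_users])
choice = pick_media(merged)
print(merged, choice)
```

Averaging favors items that everyone present finds acceptable: pop scores highest for one user here, but jazz wins once both profiles are combined.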

Example non-transitory computer-readable media are described herein. An example non-transitory computer-readable storage medium stores one or more programs. The one or more programs comprise instructions that, when executed by one or more processors of an electronic device, cause the device to: receive a natural-language input indicating a media request, wherein the natural-language input includes a search parameter; provide, by a digital assistant, a first media item identified by the search parameter; receive a second natural-language speech input while providing the first media item; and determine whether the second input corresponds to a user intent to refine the media request. The instructions also cause the electronic device, if it is determined that the second input corresponds to a user intent to refine the media request, to identify, using the search parameter and the second input, a second media item that differs from the first media item, and to provide the second media item by the digital assistant.

Another example non-transitory computer-readable storage medium stores one or more programs. The one or more programs include instructions that, when executed by one or more processors of an electronic device, cause the device to receive a natural-language speech input, identify a task based on the speech input, provide a speech output indicative of a verbal response associated with the task, and, while providing the speech output, play back a media item corresponding to the verbal response.

Another example non-transitory computer-readable storage medium stores one or more programs. The one or more programs include instructions that, when executed by one or more processors of an electronic device, cause the device to: receive a speech input indicating a media request; provide, in response to the speech input, an audio output indicating a suggestion of a first media item; and determine, by the digital assistant, whether the number of consecutive non-affirmative responses corresponding to the media request meets a threshold. If it is determined that the number of consecutive non-affirmative responses does not meet the threshold, the instructions cause the electronic device to provide, by the digital assistant, an audio output indicating a suggestion of a second media item different from the first media item. If it is determined that the number of consecutive non-affirmative responses meets the threshold, the instructions cause the electronic device to forgo providing an audio output indicating a suggestion of another media item, and instead provide an audio output requesting input from the user.

Another example non-transitory computer-readable storage medium stores one or more programs. The one or more programs include instructions that, when executed by one or more processors of an electronic device, cause the device to: receive a speech input indicative of a media request; detect, by a digital assistant, the physical presence of a plurality of users at the device; and obtain a plurality of preference profiles corresponding to those users.

Example electronic devices are described herein. An example electronic device includes one or more processors, memory, and one or more programs. The programs are stored in the memory and configured to be executed by the processors. They include instructions for: receiving a natural-language input indicative of a media request, wherein the natural-language input includes a search parameter; providing, by a digital assistant, a first media item identified by that search parameter; receiving a second natural-language speech input while providing the first media item; and determining whether the second input corresponds to a user intent to refine the media request. The one or more programs also include instructions for, if it is determined that the second input corresponds to a user intent to refine the media request, identifying, using the search parameter and the second input, a second media item different from the first media item, and providing the second media item by the digital assistant.

In another example electronic device, the digital assistant provides a speech output indicative of a verbal response associated with an identified task. While providing that speech output, the digital assistant plays back a media item that corresponds to the verbal response.

Another example electronic device includes one or more processors, memory, and one or more programs. The programs are stored in the memory and configured to be executed by the one or more processors. They include instructions for: receiving a speech input indicative of a media request; providing, in response, an audio output indicative of a suggestion of a first media item; and determining whether the number of consecutive non-affirmative responses corresponding to the media request meets a threshold. If the number of consecutive non-affirmative responses does not meet the threshold, the one or more programs include instructions for providing, by the digital assistant, an audio output indicative of a suggestion of a second media item different from the first. If the threshold is met, the one or more programs include instructions for forgoing an audio output indicative of a suggestion of another media item, and providing, by the digital assistant, an audio output requesting input from the user.

Another example electronic device includes one or more processors, memory, and one or more programs. The programs include instructions for: receiving a speech input indicative of a media request; detecting, by a digital assistant, the physical presence of a plurality of users at the electronic device; upon detecting this physical presence, obtaining, by the assistant, a plurality of preference profiles that correspond to the plurality of users; providing, by the assistant, a merged preference profile based on the plurality of preference profiles; identifying, by the assistant, a media item using the merged profile; and providing an audio output that includes the identified media item.

Another example electronic device includes means for receiving a second natural-language input while the first media item is being provided, and means for determining whether the second natural-language input corresponds to a user intent to refine the media request.

The digital assistant can provide a speech output that indicates a verbal response associated with the identified task. It can also play back a media item corresponding to this verbal response.

The digital assistant can determine whether the number of consecutive non-affirmative responses meets a threshold. If the threshold is not met, the digital assistant provides an audio output that suggests a second media item different from the first.

The digital assistant can provide a merged preference profile based on a plurality of preference profiles. It also provides an audio output that includes the media item identified using the merged profile.

The user can easily direct a search for media items by using natural language speech. The digital assistant lets the user refine their media request at any moment, without stopping the current playback. The digital assistant gives the user full control and flexibility over the media search. The digital assistant can also interact with the user in a way that is intuitive and natural by allowing the user to enter a conversation or steer it at any time. Flexible and intuitive control over the media search process improves the operability and efficiency of the device.

The user can quickly obtain desired content using a small number of inputs. This technique reduces the number of inputs the user must make because, for example, the user does not have to specify the same parameters repeatedly when refining a media request. The technique also allows for natural and intuitive interaction between the digital assistant and the user, for example by letting the user receive tailored recommendations and narrow down the desired content through a series of decisions. Reducing the number of user inputs and providing an intuitive user interface enhance the operability of the device and make the user-device interface more efficient (e.g., by helping the user to provide proper inputs and reducing user mistakes when operating/interacting with the device) which, additionally, reduces power usage and improves battery life of the device by enabling the user to use the device more quickly and efficiently.

Providing a speech output indicative of a verbal response to a user request while also playing back a related media item gives the digital assistant a rich, intuitive audio interface. Playing back the media item shortens the spoken output. It also helps the user make informed decisions, which reduces the number of inputs required. Providing a rich and intuitive auditory interface enhances the operability of the device and makes the user-device interface more efficient (e.g., by helping the user to provide proper inputs and reducing user mistakes when operating/interacting with the device) which, additionally, reduces power usage and improves battery life of the device by enabling the user to use the device more quickly and efficiently.
