Described techniques may be utilized to receive a transcription stream including transcribed text that has been transcribed from speech, and to receive a summary request for a summary to be provided on a display of a device. In response to the summary request, extracted text may be identified from the transcribed text. The extracted text may be processed using a summarization machine learning (ML) model to obtain a summary of the extracted text, and the summary may be displayed on the display of the device. When an image is captured, an augmented summary may be generated that includes the image together with a visual indication of one or more of an emotion, an entity, or an intent associated with the image, the summary, or the extracted text.
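The flow from summary request to extracted text can be sketched as follows. This is a minimal illustration rather than the described implementation: the `TranscriptionBuffer` class, the time-window extraction rule, the `lookback` parameter, and the `summarize` callable standing in for the summarization ML model are all assumptions introduced here.

```python
from collections import deque


class TranscriptionBuffer:
    """Holds timestamped segments of a transcription stream (hypothetical helper)."""

    def __init__(self):
        self.segments = deque()  # each item is (timestamp_seconds, text)

    def append(self, timestamp, text):
        self.segments.append((timestamp, text))

    def extract(self, now, lookback=60.0):
        # Assumed extraction rule: keep text transcribed within the lookback window.
        return " ".join(
            text for ts, text in self.segments if now - lookback <= ts <= now
        )


def handle_summary_request(buffer, now, summarize, lookback=60.0):
    """Identify extracted text in response to a request, then summarize it.

    `summarize` is a placeholder for the summarization ML model.
    """
    extracted = buffer.extract(now, lookback)
    return summarize(extracted)


# Example stream and a trivial stand-in summarizer (first sentence only).
buf = TranscriptionBuffer()
buf.append(0.0, "Hello.")
buf.append(50.0, "Budget approved.")
buf.append(70.0, "Next meeting Friday.")
summary = handle_summary_request(buf, 100.0, lambda t: t.split(".")[0] + ".")
```

A time window is only one possible way to identify the extracted text; the abstract does not say how the extraction is bounded.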
Emoticons or other images are inserted into text messages during a chat session, without leaving the chat session, by entering an input sequence on an input area of a touchscreen of an electronic device, which causes an emoticon library to be presented to a user. The user selects an emoticon, and the emoticon library either closes automatically or closes after the user enters a closing input sequence. The opening and closing input sequences are, for example, any combination of swipes and taps along or on the input area. Users are also able to add content to chat sessions and to generate mood messages in chat sessions.
Implementations described herein relate to causing emoji(s) that are associated with a given emotion class expressed by a spoken utterance to be visually rendered for presentation to a user at a display of a client device of the user. Processor(s) of the client device may receive audio data that captures the spoken utterance, process the audio data to generate textual data that is predicted to correspond to the spoken utterance, and cause a transcription of the textual data to be visually rendered for presentation to the user via the display. Further, the processor(s) may determine, based on processing the textual data, whether the spoken utterance expresses a given emotion class. In response to determining that the spoken utterance expresses the given emotion class, the processor(s) may cause emoji(s) that are stored in association with the given emotion class to be visually rendered for presentation to the user via the display.
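The lookup from emotion class to stored emoji(s) can be sketched as below. This is a hedged illustration only: the keyword-based `classify_emotion` function is a stand-in for whatever processing of the textual data an implementation actually uses, and the emotion classes, keywords, and emoji shown are hypothetical.

```python
from typing import List, Optional

# Hypothetical emotion classes with keyword cues (a stand-in for a trained classifier).
EMOTION_KEYWORDS = {
    "happy": {"great", "awesome", "love", "congratulations"},
    "sad": {"sorry", "miss", "unfortunately"},
}

# Emoji stored in association with each emotion class (illustrative values).
EMOJI_BY_EMOTION = {
    "happy": ["😀", "🎉"],
    "sad": ["😢"],
}


def classify_emotion(text: str) -> Optional[str]:
    """Return the first emotion class whose cues appear in the text, else None."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    for emotion, cues in EMOTION_KEYWORDS.items():
        if words & cues:
            return emotion
    return None


def emojis_for_utterance(text: str) -> List[str]:
    """Return emoji to render only when the text expresses a known emotion class."""
    emotion = classify_emotion(text)
    return EMOJI_BY_EMOTION.get(emotion, []) if emotion else []
```

For example, `emojis_for_utterance("Congratulations on the new job!")` returns the emoji stored for the hypothetical "happy" class, while a neutral utterance returns an empty list so that only the transcription is rendered.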
Systems and methods for capturing media content in accordance with viewer expression are disclosed. In some implementations, a method is performed at a computer system having one or more processors and memory storing one or more programs for execution by the one or more processors. The method includes: (1) while a media content item is being presented to a user, capturing a momentary reaction of the user; (2) comparing the captured user reaction with one or more previously captured reactions of the user; (3) identifying the user reaction as one of a plurality of reaction types based on the comparison; (4) identifying the portion of the media content item corresponding to the momentary reaction; and (5) storing an association between the identified user reaction and the portion of the media content item.
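Steps (2) through (5) above can be sketched as a nearest-neighbor comparison against previously captured reactions. This is an assumption-laden illustration: the feature vectors, the Euclidean distance measure, the fixed `window` used to locate the media portion, and the `ReactionStore` class are all hypothetical and not taken from the source.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ReactionStore:
    # Previously captured reactions as (feature_vector, reaction_type) pairs.
    previous: list = field(default_factory=list)
    # Stored associations as (reaction_type, (start_seconds, end_seconds)).
    associations: list = field(default_factory=list)

    def identify(self, features):
        """Compare with previous reactions and return the closest reaction type."""
        if not self.previous:
            return "unknown"
        closest = min(self.previous, key=lambda p: math.dist(p[0], features))
        return closest[1]

    def record(self, features, timestamp, window=5.0):
        """Identify the reaction, locate the media portion, and store the association."""
        reaction = self.identify(features)
        # Assumed rule: the portion is the `window` seconds leading up to the reaction.
        portion = (max(0.0, timestamp - window), timestamp)
        self.associations.append((reaction, portion))
        return reaction, portion


store = ReactionStore(previous=[((0.9, 0.1), "laugh"), ((0.1, 0.9), "frown")])
result = store.record((0.8, 0.2), 42.0)
```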
A method includes providing, by an audio playback interface, an initial playlist comprising audio tracks. The method includes receiving a user preference associated with an initial audio track during a listening session, wherein the user preference is indicative of a listening mood of a user and comprises one or more of a user behavior or a natural language input. The method includes generating a representation of the user preference in a joint audio-text embedding space by applying a two-tower model comprising an audio embedding network and a text embedding network. A proximity of two embeddings is indicative of semantic similarity. The method includes training a machine learning model to generate an updated playlist responsive to the listening mood of the user during the listening session. The method includes applying the machine learning model to generate the updated playlist. The method includes substituting the initial playlist with the updated playlist.
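The idea that proximity in the joint embedding space indicates semantic similarity can be illustrated with a small ranking sketch. It does not implement the two-tower model or the trained playlist model: the embeddings are hand-written toy vectors, cosine similarity is an assumed proximity measure, and the `updated_playlist` function and track names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def updated_playlist(preference_embedding, track_embeddings, k=2):
    """Rank tracks by proximity to the preference embedding and keep the top k."""
    ranked = sorted(
        track_embeddings,
        key=lambda name: cosine(preference_embedding, track_embeddings[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for outputs of the audio and text embedding networks.
tracks = {
    "calm_piano": [0.9, 0.1],
    "heavy_metal": [0.1, 0.9],
    "soft_jazz": [0.8, 0.3],
}
preference = [1.0, 0.0]  # e.g. an embedded "something relaxing" request
playlist = updated_playlist(preference, tracks)
```

Here the two tracks closest to the preference vector replace the initial playlist; a trained model could weigh additional signals such as skip behavior.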