Referring to Screen Texts with Voice Assistants

Voice assistants help users make phone calls, send messages, create events, navigate, and much more. However, assistants have limited capacity to understand their users’ context. In this work, we take a step toward closing that gap. We explore a new experience that lets users refer to phone numbers, addresses, email addresses, URLs, and dates on their phone screens. Our focus is reference understanding, which, much like visual grounding, becomes particularly interesting when multiple similar texts are present on screen. We collect a dataset and propose a lightweight…