Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation

The neural transducer is an end-to-end model for automatic speech recognition (ASR). While the model is well-suited for streaming ASR, the training process remains challenging. During training, the memory requirements may quickly exceed the capacity of state-of-the-art GPUs, limiting batch size and sequence lengths. In this work, we analyze the time and space complexity of …

Palantir Polska: Frequently Asked Questions

Many questions about Palantir's activities come up in public discussion. Given the strong interest in our company, we have gathered the questions asked most often and would like to answer them in this blog post. What is Palantir? Palantir Technologies is an international technology company founded in 2003 in Silicon Valley. Currently …

PRESTO – A multilingual dataset for parsing realistic task-oriented dialogues

Posted by Rahul Goel and Aditya Gupta, Software Engineers, Google Assistant

Virtual assistants are increasingly integrated into our daily routines. They can help with everything from setting alarms to giving map directions and can even assist people with disabilities to more easily manage their homes. As we use these assistants, we are also becoming more …

5 big things you can do at Google Data Cloud & AI Summit this week

Data is at the heart of digital transformation, and organizations are looking for new opportunities to transform customer experiences, boost revenue, and reduce costs. In a new study conducted by Harvard Business Review Analytic Services for Google Cloud, 91% of leaders say that democratizing access to data is imperative to business success, and …

Stable Diffusion v2-1-unCLIP model released

Information taken from the GitHub page: https://github.com/Stability-AI/stablediffusion/blob/main/doc/UNCLIP.MD
HuggingFace checkpoints and diffusers integration: https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip
Public web demo: https://clipdrop.co/stable-diffusion-reimagine
unCLIP is the approach behind OpenAI’s DALL·E 2, trained to invert CLIP image embeddings. We fine-tuned SD 2.1 to accept a CLIP ViT-L/14 image embedding in addition to the text encodings. This means that the model can be used …

I’m the creator of LoRA. How can I make it better?

I wrote this paper two years ago (https://arxiv.org/abs/2106.09685). Super happy that people find it useful for diffusion models. I had text in mind when I wrote the paper, so there are probably things we can tweak to make LoRA more suited for image generation. I want to better understand how exactly LoRA is used in …