Inpainting and Outpainting with Stable Diffusion

Inpainting and outpainting have long been popular and well-studied image processing domains. Traditional approaches to these problems often relied on complex algorithms and deep learning techniques yet still gave inconsistent outputs. However, recent advancements in the form of Stable Diffusion have reshaped these domains. Stable Diffusion now offers enhanced efficacy in inpainting and outpainting while …

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7× Faster Pre-training on Web-scale Image-Text Data

Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text embeddings. However, pairwise similarity computation in contrastive loss between image and text pairs poses computational challenges. This paper presents a novel weakly supervised pre-training of vision models on web-scale image-text data. The proposed method reframes …

AI transforms the IT support experience

We know that understanding clients’ technical issues is paramount for delivering effective support service. Enterprises demand prompt and accurate solutions to their technical issues, requiring support teams to possess deep technical knowledge and communicate action plans clearly. Product-embedded or online support tools, such as virtual assistants, can drive more informed and efficient support interactions with …

Deploy a Hugging Face (PyAnnote) speaker diarization model on Amazon SageMaker as an asynchronous endpoint

Speaker diarization, an essential process in audio analysis, segments an audio file based on speaker identity. This post delves into integrating Hugging Face’s PyAnnote for speaker diarization with Amazon SageMaker asynchronous endpoints. We provide a comprehensive guide on how to deploy speaker segmentation and clustering solutions using SageMaker on the AWS Cloud. You can use …

Announcing PyTorch/XLA 2.3: Distributed training, dev improvements, and GPUs

PyTorch’s flexibility and dynamic nature make it a popular choice for deep learning researchers and practitioners. Developed by Google, XLA is a specialized compiler designed to optimize linear algebra computations – the foundation of deep learning models. PyTorch/XLA offers the best of both worlds: the user experience and ecosystem advantages of PyTorch, with the compiler …

Why can’t robots outrun animals?

Robotics engineers have worked for decades and invested many millions of research dollars in attempts to create a robot that can walk or run as well as an animal. And yet, it remains the case that many animals are capable of feats that would be impossible for robots that exist today.

Adobe’s VideoGigaGAN uses AI to make blurry videos sharp and clear

A team of video and AI engineers at Adobe Research has developed an AI application called VideoGigaGAN that can accept a blurry video and enhance it to make it a much sharper product. The team describes their work and results in an article posted to the arXiv preprint server. They have also posted several examples …

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks. To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each …

Data privacy examples

An online retailer always gets users’ explicit consent before sharing customer data with its partners. A navigation app anonymizes activity data before analyzing it for travel trends. A school asks parents to verify their identities before giving out student information. These are just some examples of how organizations support data privacy, the principle that people …