Goal Representations for Instruction Following

A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use interface for humans to specify arbitrary tasks, but it is difficult to train robots to follow language instructions. Approaches like …

Unleashing real-time insights: Monitoring SAP BTP cloud-native applications with IBM Instana

Introducing the SAP Business Technology Platform: The SAP Business Technology Platform (BTP) is a technological innovation platform designed for SAP applications to combine data and analytics, AI, application development, automation and integration into a single, cohesive ecosystem. BTP is SAP’s integration and application development platform for SAP clients who want to extend their S/4 system. …

Learn how Amazon Pharmacy created their LLM-based chat-bot using Amazon SageMaker

Amazon Pharmacy is a full-service pharmacy on Amazon.com that offers transparent pricing, clinical and customer support, and free delivery right to your door. Customer care agents play a crucial role in quickly and accurately retrieving pharmacy-related information, including prescription clarifications and transfer status, order and dispensing details, and patient profile information, in …

Vertex AI adds Mistral AI model for powerful and flexible AI solutions

One of Europe’s leading providers of artificial intelligence (AI) solutions, Mistral AI, is on a mission to design highly performant and efficient open-source (OSS) foundation models. Mistral AI is teaming up with Google Cloud to natively integrate their cutting-edge AI model within Vertex AI. This integration can accelerate AI adoption by making it easy for …

Rethinking the Role of PPO in RLHF

TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way? Figure 1: This diagram illustrates the difference between reinforcement …

How to use foundation models and trusted governance to manage AI workflow risk

Artificial intelligence (AI) adoption is still in its early stages. As more businesses use AI systems and the technology continues to mature and change, improper use could expose a company to significant financial, operational, regulatory and reputational risks. Using AI for certain business tasks or without guardrails in place may also not align with an …