Training Software Engineering Agents and Verifiers with SWE-Gym

We present SWE-Gym, the first environment for training real-world software engineering (SWE) agents. SWE-Gym contains 2,438 real-world Python task instances, each comprising a codebase with an executable runtime environment, unit tests, and a task specified in natural language. We use SWE-Gym to train language-model-based SWE agents, achieving absolute gains of up to 19% in resolve rate on the popular SWE-Bench Verified and Lite test sets. We also experiment with inference-time scaling through verifiers trained on agent trajectories sampled from SWE-Gym. When combined with our fine-tuned SWE…
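To make the two quantities in the abstract concrete, here is a minimal sketch of a resolve-rate calculation and of verifier-guided selection among sampled trajectories. It assumes a best-of-n reranking setup, which is one common way to use a verifier at inference time; the excerpt does not say this is the exact procedure used. The names `best_of_n`, `resolve_rate`, and `toy_verifier` are hypothetical and not part of SWE-Gym.

```python
from typing import Callable, Sequence


def resolve_rate(resolved: Sequence[bool]) -> float:
    """Fraction of task instances whose unit tests pass after the agent's patch."""
    return sum(resolved) / len(resolved) if resolved else 0.0


def best_of_n(candidates: Sequence[str], verifier: Callable[[str], float]) -> str:
    """Return the candidate trajectory that the verifier scores highest.

    Assumption: the verifier maps a trajectory to a scalar score where
    higher means "more likely to resolve the task".
    """
    if not candidates:
        raise ValueError("need at least one candidate trajectory")
    return max(candidates, key=verifier)


def toy_verifier(trajectory: str) -> float:
    """Hypothetical stand-in for a learned verifier: counts 'PASS' markers."""
    return float(trajectory.count("PASS"))


if __name__ == "__main__":
    samples = ["FAIL", "PASS PASS", "PASS"]
    print(best_of_n(samples, toy_verifier))
    print(resolve_rate([True, False, True, True]))
```

In a real setup the toy scorer would be replaced by a model trained on agent trajectories, and sampling more candidates per task is what trades extra inference compute for a higher chance of selecting a resolving patch.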
AI Generated Robotic Content
