OpenAI

2297 Articles
PoE-World Outperforms Reinforcement Learning RL Baselines in Montezuma’s Revenge with Minimal Demonstration Data
OpenAI

PoE-World Outperforms Reinforcement Learning RL Baselines in Montezuma’s Revenge with Minimal Demonstration Data

The Importance of Symbolic Reasoning in World Modeling Understanding how the world works is key to creating AI agents that can adapt to...

Build an Intelligent Multi-Tool AI Agent Interface Using Streamlit for Seamless Real-Time Interaction
OpenAI

Build an Intelligent Multi-Tool AI Agent Interface Using Streamlit for Seamless Real-Time Interaction

In this tutorial, we’ll build a powerful and interactive Streamlit application that brings together the capabilities of LangChain, the Google Gemini API, and...

UC Berkeley Introduces CyberGym: A Real-World Cybersecurity Evaluation Framework to Evaluate AI Agents on Large-Scale Vulnerabilities Across Massive Codebases
OpenAI

UC Berkeley Introduces CyberGym: A Real-World Cybersecurity Evaluation Framework to Evaluate AI Agents on Large-Scale Vulnerabilities Across Massive Codebases

Cybersecurity has become a significant area of interest in artificial intelligence, driven by the increasing reliance on large software systems and the expanding...

This AI Paper from Google Introduces a Causal Framework to Interpret Subgroup Fairness in Machine Learning Evaluations More Reliably
OpenAI

This AI Paper from Google Introduces a Causal Framework to Interpret Subgroup Fairness in Machine Learning Evaluations More Reliably

Understanding Subgroup Fairness in Machine Learning ML Evaluating fairness in machine learning often involves examining how models perform across different subgroups defined by...

From Backend Automation to Frontend Collaboration: What’s New in AG-UI Latest Update for AI Agent-User Interaction
OpenAI

From Backend Automation to Frontend Collaboration: What’s New in AG-UI Latest Update for AI Agent-User Interaction

Introduction AI agents are increasingly moving from pure backend automators to visible, collaborative elements within modern applications. However, making agents genuinely interactive—capable of...

MiniMax AI Releases MiniMax-M1: A 456B Parameter Hybrid Model for Long-Context and Reinforcement Learning RL Tasks
OpenAI

MiniMax AI Releases MiniMax-M1: A 456B Parameter Hybrid Model for Long-Context and Reinforcement Learning RL Tasks

The Challenge of Long-Context Reasoning in AI Models Large reasoning models are not only designed to understand language but are also structured to...

OpenAI Releases an Open‑Sourced Version of a Customer Service Agent Demo with the Agents SDK
OpenAI

OpenAI Releases an Open‑Sourced Version of a Customer Service Agent Demo with the Agents SDK

OpenAI has open-sourced a new multi-agent customer service demo on GitHub, showcasing how to build domain-specialized AI agents using its Agents SDK. This...

ReVisual-R1: An Open-Source 7B Multimodal Large Language Model (MLLMs) that Achieves Long, Accurate and Thoughtful Reasoning
OpenAI

ReVisual-R1: An Open-Source 7B Multimodal Large Language Model (MLLMs) that Achieves Long, Accurate and Thoughtful Reasoning

The Challenge of Multimodal Reasoning Recent breakthroughs in text-based language models, such as DeepSeek-R1, have demonstrated that RL can aid in developing strong...