CURE: A Reinforcement Learning Framework for Co-Evolving Code and Unit Test Generation in LLMs

Large Language Models (LLMs) have shown substantial improvements in reasoning and precision through reinforcement learning (RL) and test-time scaling techniques. Despite outperforming...
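
The co-evolution loop can be sketched in a few lines: the coder policy is rewarded for passing the generated unit tests, while the test policy is rewarded for tests that accept correct solutions and reject buggy ones. A minimal, self-contained illustration (the helper names and reward shaping are assumptions for exposition, not the paper's implementation):

```python
# Toy sketch of CURE-style co-evolution rewards; helper names and the
# reward definitions are illustrative assumptions, not the paper's code.

def passes(candidate, test):
    """Run one unit test, given as an (input, expected) pair."""
    inp, expected = test
    try:
        return candidate(inp) == expected
    except Exception:
        return False

def coder_reward(candidate, tests):
    """Coder policy: fraction of generated unit tests passed."""
    return sum(passes(candidate, t) for t in tests) / max(len(tests), 1)

def tester_reward(tests, good_candidates, bad_candidates):
    """Test policy: reward tests that accept correct code and reject buggy
    code, which is what lets the two policies improve each other."""
    accept = sum(all(passes(c, t) for t in tests) for c in good_candidates)
    reject = sum(not all(passes(c, t) for t in tests) for c in bad_candidates)
    return (accept + reject) / max(len(good_candidates) + len(bad_candidates), 1)

square = lambda x: x * x          # a correct sample solution
cube = lambda x: x ** 3           # a buggy sample solution
tests = [(2, 4), (3, 9)]
print(coder_reward(square, tests))               # 1.0
print(tester_reward(tests, [square], [cube]))    # 1.0
```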

Develop a Multi-Tool AI Agent with Secure Python Execution using Riza and Gemini

In this tutorial, we’ll harness Riza’s secure Python execution as the cornerstone of a powerful, tool-augmented AI agent in Google Colab. Beginning with...
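
The heart of such an agent is a small loop in which Gemini drafts Python and Riza executes it in a sandbox. A rough sketch, assuming the rizaio quickstart client (Riza().command.exec) and the google-generativeai SDK, with GOOGLE_API_KEY and RIZA_API_KEY set in the environment; the model name, prompt wording, and fence stripping below are illustrative choices:

```python
# Gemini writes code, Riza runs it remotely in a sandbox (sketch).
import os
import google.generativeai as genai
from rizaio import Riza

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")
riza = Riza()  # picks up RIZA_API_KEY from the environment

def solve(task: str) -> str:
    """Ask Gemini for a standalone script, then execute it via Riza."""
    prompt = f"Write a standalone Python script that {task}. Reply with code only."
    code = model.generate_content(prompt).text
    code = code.strip().removeprefix("```python").removesuffix("```").strip()
    result = riza.command.exec(language="python", code=code)  # sandboxed run
    return result.stdout

print(solve("prints the 10th Fibonacci number"))
```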

How Do LLMs Really Reason? A Framework to Separate Logic from Knowledge

Unpacking Reasoning in Modern LLMs: Why Final Answers Aren’t Enough. Recent advancements in reasoning-focused LLMs like OpenAI’s o1/o3 and DeepSeek-R1 have led to...
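
The core move is to grade each reasoning step on two independent axes, factual correctness and logical validity, rather than only checking the final answer. A toy illustration with hand-labeled steps (the labels and aggregate scores are hypothetical stand-ins, not the paper's actual metrics):

```python
# Hypothetical illustration of scoring knowledge and logic separately.
steps = [
    {"fact_correct": True,  "logic_valid": True},
    {"fact_correct": False, "logic_valid": True},   # wrong fact, sound inference
    {"fact_correct": True,  "logic_valid": False},  # right fact, invalid step
]

knowledge = sum(s["fact_correct"] for s in steps) / len(steps)
logic = sum(s["logic_valid"] for s in steps) / len(steps)
print(f"knowledge={knowledge:.2f}  logic={logic:.2f}")
```

A model can score high on one axis and low on the other, which final-answer accuracy alone cannot reveal.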

Mistral AI Releases Magistral Series: Advanced Chain-of-Thought LLMs for Enterprise and Open-Source Applications

Mistral AI has officially introduced Magistral, its latest series of reasoning-optimized large language models (LLMs). This marks a significant step forward in the...

NVIDIA Researchers Introduce Dynamic Memory Sparsification (DMS) for 8× KV Cache Compression in Transformer LLMs

As the demand for reasoning-heavy tasks grows, large language models (LLMs) are increasingly expected to generate longer sequences or parallel chains of reasoning....
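
To see what an 8× cache reduction means mechanically, here is a toy PyTorch sketch that keeps only the most "important" eighth of cached positions. Note that DMS learns its eviction decisions during training; the static norm-based heuristic below is only a stand-in to show the shape of the operation:

```python
# Toy KV cache sparsification: keep 1/8 of positions (not DMS's learned policy).
import torch

def sparsify_kv(keys, values, ratio=8):
    """keys/values: [seq_len, num_heads, head_dim]; keep seq_len//ratio entries."""
    seq_len = keys.shape[0]
    keep = max(seq_len // ratio, 1)
    scores = keys.flatten(1).norm(dim=1)           # crude per-position importance
    idx = scores.topk(keep).indices.sort().values  # preserve temporal order
    return keys[idx], values[idx]

k, v = torch.randn(1024, 8, 64), torch.randn(1024, 8, 64)
k_small, v_small = sparsify_kv(k, v)
print(k.shape, "->", k_small.shape)  # [1024, 8, 64] -> [128, 8, 64]
```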

How Much Do Language Models Really Memorize? Meta’s New Framework Defines Model Capacity at the Bit Level

Introduction: The Challenge of Memorization in Language Models. Modern language models face increasing scrutiny regarding their memorization behavior. With models such as an...
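
The framework's headline figure is a capacity estimate of roughly 3.6 bits per parameter for the GPT-family models studied. Taking that number at face value gives a quick back-of-envelope calculator (treat the constant as the paper's empirical fit, not a universal law):

```python
# Back-of-envelope capacity from the paper's ~3.6 bits/parameter estimate.
BITS_PER_PARAM = 3.6

def capacity_gigabytes(num_params: float) -> float:
    return BITS_PER_PARAM * num_params / 8 / 1e9  # bits -> bytes -> GB

for n in (125e6, 1.3e9, 8e9):
    print(f"{n/1e9:.3g}B params -> ~{capacity_gigabytes(n):.2f} GB capacity")
```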

Meta Introduces LlamaRL: A Scalable PyTorch-Based Reinforcement Learning (RL) Framework for Efficient LLM Training at Scale

Reinforcement Learning’s Role in Fine-Tuning LLMs. Reinforcement learning has emerged as a powerful approach to fine-tune large language models (LLMs) for more intelligent...
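
Stripped of the distributed machinery, the loop such a framework scales up is ordinary policy-gradient RL. A self-contained toy in PyTorch, with a two-option "policy" standing in for an LLM and a hand-written reward standing in for a reward model (illustrative of the update LlamaRL distributes, not LlamaRL's API):

```python
# Minimal REINFORCE loop: the kind of update an RL framework scales up.
import torch

logits = torch.zeros(2, requires_grad=True)           # toy policy parameters
optimizer = torch.optim.Adam([logits], lr=0.1)
responses = ["concise answer", "rambling answer"]
reward_fn = lambda r: 1.0 if r == "concise answer" else 0.0

for _ in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()
    reward = reward_fn(responses[action.item()])
    loss = -reward * dist.log_prob(action)            # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.softmax(logits, dim=0))  # mass concentrates on the rewarded response
```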

ether0: A 24B LLM Trained with Reinforcement Learning (RL) for Advanced Chemical Reasoning Tasks

LLMs have primarily improved accuracy by scaling pre-training data and compute. However, attention has shifted toward alternative forms of scaling due to finite data...
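
What makes chemistry a good fit for RL is that rewards can be verified programmatically. A sketch of one such verifiable reward using RDKit (the reward values are my own illustration, not the paper's recipe): does a proposed SMILES string parse, and does it match a requested molecular formula?

```python
# Programmatically verifiable chemistry reward (illustrative; requires RDKit).
from rdkit import Chem
from rdkit.Chem.rdMolDescriptors import CalcMolFormula

def reward(smiles: str, target_formula: str) -> float:
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return 0.0                                  # unparseable SMILES
    if CalcMolFormula(mol) == target_formula:
        return 1.0                                  # valid and on-target
    return 0.5                                      # valid molecule, wrong formula

print(reward("CCO", "C2H6O"))           # ethanol -> 1.0
print(reward("C1CC1", "C2H6O"))         # cyclopropane, wrong formula -> 0.5
print(reward("not-a-smiles", "C2H6O"))  # -> 0.0
```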