OpenAI

1602 Articles
ReasonFlux: Elevating LLM Reasoning with Hierarchical Template Scaling
OpenAI

ReasonFlux: Elevating LLM Reasoning with Hierarchical Template Scaling

Large language models (LLMs) have demonstrated exceptional problem-solving abilities, yet complex reasoning tasks—such as competition-level mathematics or intricate code generation—remain challenging. These tasks...

DeepSeek AI Introduces CODEI/O: A Novel Approach that Transforms Code-based Reasoning Patterns into Natural Language Formats to Enhance LLMs’ Reasoning Capabilities
OpenAI

DeepSeek AI Introduces CODEI/O: A Novel Approach that Transforms Code-based Reasoning Patterns into Natural Language Formats to Enhance LLMs’ Reasoning Capabilities

Large Language Models (LLMs) have advanced significantly in natural language processing, yet reasoning remains a persistent challenge. While tasks such as mathematical problem-solving...

Google DeepMind Researchers Propose Matryoshka Quantization: A Technique to Enhance Deep Learning Efficiency by Optimizing Multi-Precision Models without Sacrificing Accuracy
OpenAI

Google DeepMind Researchers Propose Matryoshka Quantization: A Technique to Enhance Deep Learning Efficiency by Optimizing Multi-Precision Models without Sacrificing Accuracy

Quantization is a crucial technique in deep learning for reducing computational costs and improving model efficiency. Large-scale language models demand significant processing power,...

TransMLA: Transforming GQA-based Models Into MLA-based Models
OpenAI

TransMLA: Transforming GQA-based Models Into MLA-based Models

Large Language Models (LLMs) have gained significant importance as productivity tools, with open-source models increasingly matching the performance of their closed-source counterparts. These...

This AI Paper from UC Berkeley Introduces a Data-Efficient Approach to Long Chain-of-Thought Reasoning for Large Language Models
OpenAI

This AI Paper from UC Berkeley Introduces a Data-Efficient Approach to Long Chain-of-Thought Reasoning for Large Language Models

Large language models (LLMs)  process extensive datasets to generate coherent outputs, focusing on refining chain-of-thought (CoT) reasoning. This methodology enables models to break...

Microsoft Research Introduces Data Formulator: An AI Application that Leverages LLMs to Transform Data and Create Rich Visualizations
OpenAI

Microsoft Research Introduces Data Formulator: An AI Application that Leverages LLMs to Transform Data and Create Rich Visualizations

Most modern visualization authoring tools like Charticulator, Data Illustrator, and Lyra,  and libraries like ggplot2, and VegaLite expect tidy data, where every variable...

ByteDance Introduces UltraMem: A Novel AI Architecture for High-Performance, Resource-Efficient Language Models
OpenAI

ByteDance Introduces UltraMem: A Novel AI Architecture for High-Performance, Resource-Efficient Language Models

Large Language Models (LLMs) have revolutionized natural language processing (NLP) but face significant challenges in practical applications due to their large computational demands....

Salesforce AI Research Introduces Reward-Guided Speculative Decoding (RSD): A Novel Framework that Improves the Efficiency of Inference in Large Language Models (LLMs) Up To 4.4× Fewer FLOPs
OpenAI

Salesforce AI Research Introduces Reward-Guided Speculative Decoding (RSD): A Novel Framework that Improves the Efficiency of Inference in Large Language Models (LLMs) Up To 4.4× Fewer FLOPs

In recent years, the rapid scaling of large language models (LLMs) has led to extraordinary improvements in natural language understanding and reasoning capabilities....