OpenAI

1758 Articles
NVIDIA Open-Sources cuOpt: An AI-Powered Decision Optimization Engine, Unlocking Real-Time Optimization at an Unprecedented Scale

Every day, organizations face complex logistical challenges—from optimizing delivery routes and managing supply chains to streamlining production schedules. These tasks typically involve massive...

IBM and Hugging Face Researchers Release SmolDocling: A 256M Open-Source Vision Language Model for Complete Document OCR

Converting complex documents into structured data has long posed significant challenges in the field of computer science. Traditional approaches, involving ensemble systems or...

MemQ: Enhancing Knowledge Graph Question Answering with Memory-Augmented Query Reconstruction

LLMs have shown strong performance in Knowledge Graph Question Answering (KGQA) by leveraging planning and interactive strategies to query knowledge graphs. Many existing...

Building a Retrieval-Augmented Generation (RAG) System with FAISS and Open-Source LLMs

Retrieval-augmented generation (RAG) has emerged as a powerful paradigm for enhancing the capabilities of large language models (LLMs). By combining LLMs’ creative generation...

VisualWebInstruct: A Large-Scale Multimodal Reasoning Dataset for Enhancing Vision-Language Models

VLMs have shown notable progress in perception-driven tasks such as visual question answering (VQA) and document-based visual reasoning. However, their effectiveness in reasoning-intensive...

This AI Paper Introduces R1-Onevision: A Cross-Modal Formalization Model for Advancing Multimodal Reasoning and Structured Visual Interpretation

Multimodal reasoning is an evolving field that integrates visual and textual data to enhance machine intelligence. Traditional artificial intelligence models excel at processing...

Lowe’s Revolutionizes Retail with AI: From Personalized Shopping to Proactive Customer Assistance

Lowe’s, a leading home improvement retailer with 1,700 stores and 300,000 associates, is establishing itself as a pioneer in AI innovation. In a...

Speech-to-Speech Foundation Models Pave the Way for Seamless Multilingual Interactions

At NVIDIA GTC25, Gnani.ai experts unveiled groundbreaking advancements in voice AI, focusing on the development and deployment of Speech-to-Speech Foundation Models. This innovative...