Integrating Neural Systems for Visual Perception: The Role of Ventral Temporal Cortex (VTC) and Medial Temporal Cortex (MTC) in Rapid and Complex Object Recognition
Human and primate visual perception unfolds across multiple timescales: some visual attributes can be identified in under 200ms, a feat supported by the ventral temporal cortex (VTC). More complex visual inferences, however, such as recognizing novel objects, require additional time and multiple glances, with the high-acuity fovea and frequent gaze shifts helping to compose object representations. While much is understood about rapid visual processing, far less is known about how the brain integrates sequences of visual inputs. The medial temporal cortex (MTC), particularly the perirhinal cortex (PRC), may support this process, enabling visual inferences beyond VTC capabilities by integrating sequential visual inputs.

Stanford researchers evaluated the MTC’s role in object perception by comparing human visual performance to recordings from macaque VTC. Humans and VTC perform comparably at brief viewing times (<200ms), but with extended viewing, human performance significantly surpasses what VTC responses can support. The MTC appears to drive this improvement: humans with MTC lesions perform only as well as VTC models. Eye-tracking experiments further revealed that humans use sequential gaze patterns to make complex visual inferences. Together, these findings suggest that the MTC integrates visuospatial sequences into compositional representations, enhancing object perception beyond VTC capabilities.

To estimate the performance VTC responses can support, the researchers used a dataset of object images rendered in different orientations and settings. They implemented a cross-validation strategy in which each trial featured two images of a typical object and one outlier, presented in randomized configurations. Neural responses recorded from high-level visual areas were used to train a linear classifier to detect the odd object. This process was repeated many times, and results were averaged to yield a performance score for distinguishing each pair of objects.
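The snippet below is a minimal sketch of such a cross-validated odd-one-out readout, run on synthetic data. The population size, noise level, logistic-regression decoder, and decision rule are illustrative assumptions, not the study's exact pipeline.

```python
# Minimal sketch of a cross-validated odd-one-out readout (synthetic data).
# Population size, noise level, and the logistic-regression decoder are
# illustrative assumptions, not the study's exact pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_units, n_train, n_trials = 200, 100, 500

# Hypothetical mean population responses for two objects, plus trial noise.
mu_a, mu_b = rng.normal(size=n_units), rng.normal(size=n_units)
noise = lambda n: rng.normal(scale=2.0, size=(n, n_units))

# Train a linear decoder to discriminate object A from object B.
X = np.vstack([mu_a + noise(n_train), mu_b + noise(n_train)])
y = np.r_[np.zeros(n_train), np.ones(n_train)]
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Evaluate on held-out oddity trials: two samples of one object, one of the other.
correct = 0
for _ in range(n_trials):
    typical, odd = (mu_a, mu_b) if rng.random() < 0.5 else (mu_b, mu_a)
    trial = np.vstack([typical + noise(2), odd + noise(1)])  # odd item at index 2
    scores = clf.decision_function(trial)
    # Pick the item whose decoder score deviates most from the other two.
    pred = max(range(3), key=lambda i: abs(scores[i] - np.delete(scores, i).mean()))
    correct += pred == 2
print(f"simulated oddity accuracy: {correct / n_trials:.2f}")
```

Averaging this accuracy over many simulated trial sets, as done for each object pair, yields the per-pair performance score described above.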

For comparison, a convolutional neural network (CNN) pre-trained for object classification served as a computational model of VTC. Images were preprocessed for the CNN, and the same experimental setup was followed: a classifier was trained on the network’s features to detect the odd object across trials. The model’s accuracy was then compared to the neural-response-based predictions, offering insight into how closely its visual processing mirrored human-like inference.
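As a sketch of this comparison, the snippet below extracts image embeddings from a pretrained network; these would stand in for the neural responses when training the same odd-one-out decoder. The choice of a torchvision ResNet-50 and its penultimate layer is an assumption for illustration, not necessarily the study's model.

```python
# Sketch of a CNN-based stand-in for VTC population responses. ResNet-50 and
# the penultimate layer are assumptions; the study's model may differ.
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet50_Weights.IMAGENET1K_V2
cnn = models.resnet50(weights=weights).eval()
backbone = torch.nn.Sequential(*list(cnn.children())[:-1])  # drop classifier head
preprocess = weights.transforms()  # resize/crop/normalize expected by the weights

@torch.no_grad()
def embed(paths):
    """One feature vector per image, analogous to a neural population response."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    return backbone(batch).flatten(1).numpy()

# Hypothetical usage: these embeddings would replace the neural responses when
# training the same odd-one-out decoder sketched above.
# feats = embed(["object_a_view1.png", "object_a_view2.png", "object_b_view1.png"])
```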

The study compared human performance in two visual regimes: time-restricted (less than 200ms) and time-unrestricted (self-paced). In time-restricted tasks, participants must rely on immediate visual processing, since there is no opportunity for sequential sampling through eye movements. A three-way visual discrimination task and a match-to-sample paradigm were used to assess this regime. Results showed a strong correlation between time-restricted human performance and the performance predicted from macaque high-level VTC. With unlimited viewing time, however, human participants significantly outperformed both VTC-supported performance and computational models based on VTC, demonstrating that humans exceed VTC capabilities when given extended viewing time and suggesting reliance on additional neural mechanisms.
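The analysis pattern this implies can be sketched as follows; the per-pair accuracy values below are hypothetical placeholders, not numbers from the paper, and only the comparison logic follows the description above.

```python
# Sketch of the two-regime comparison across object pairs.
# All accuracy values are hypothetical placeholders, not results from the paper.
import numpy as np
from scipy.stats import pearsonr

vtc_pred   = np.array([0.72, 0.81, 0.64, 0.90, 0.77])  # decoder-predicted accuracy
human_fast = np.array([0.70, 0.83, 0.66, 0.88, 0.79])  # <200 ms viewing
human_slow = np.array([0.85, 0.95, 0.80, 0.97, 0.91])  # self-paced viewing

r_fast, _ = pearsonr(vtc_pred, human_fast)
gain = (human_slow - vtc_pred).mean()
print(f"time-restricted humans vs VTC decoder: r = {r_fast:.2f}")
print(f"mean gain with unrestricted viewing:   {gain:+.2f}")
```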

The study reveals complementary neural systems for visual object perception: the VTC enables rapid visual inferences within 100ms, while the MTC supports more complex inferences composed across sequential saccades. Time-restricted performance aligns with VTC capabilities, but given more time, humans surpass them, reflecting the MTC’s integration of visuospatial sequences. The findings emphasize the MTC’s role in compositional operations, extending its function beyond memory to perception. Models of human vision such as convolutional neural networks approximate VTC but fail to capture the MTC’s contributions, pointing to the need for biologically plausible models that integrate both systems.

