
Google AI Researchers Introduce a New Whale Bioacoustics Model that can Identify Eight Distinct Species, Including Multiple Calls for Two of Those Species



Whales produce a wide range of vocalizations, from very low to very high frequencies, that vary by species and location, making it difficult to develop models that automatically classify calls from multiple species. By analyzing whale vocalizations, researchers can estimate population sizes, track changes over time, and inform conservation strategies, including protected-area designation and mitigation measures. Effective monitoring is essential for conservation, but the complexity of whale calls, especially from elusive species, and the sheer volume of underwater audio data complicate efforts to track their populations.

Current methods for animal species identification through sound are more advanced for birds than for whales, as models like Google Perch can classify thousands of bird vocalizations. However, similar multi-species classification models for whales are more challenging to develop due to the diversity in whale vocalizations and a lack of comprehensive data for certain species. Previous efforts have focused on specific species like humpback whales, with earlier models developed by Google Research in partnership with NOAA and other organizations. These models helped classify humpback calls and identified new locations of whale activity.

To address the limitations of previous models, Google researchers developed a new whale bioacoustics model capable of classifying vocalizations from eight distinct species, including the mysterious “Biotwang” sound attributed to the Bryde’s whale. The new model expands on earlier efforts by classifying multiple species and vocalization types, and it is designed for large-scale application to long-term passive acoustic recordings.

The proposed whale bioacoustics model processes audio data by converting it into spectrogram images for each 5-second window of sound. The front-end of the model uses mel-scaled frequency axes and log amplitude compression. It then classifies these spectrograms into one of 12 classes, corresponding to eight whale species and several specific vocalization types. To ensure accurate classifications and minimize false positives, the model was trained not just on positive examples but also on negative and background noise data. The model’s performance, as measured by metrics such as the area under the receiver operating characteristic curve (AUC), showed strong discriminative abilities, particularly for species like Minke and Bryde’s whales.
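As a rough illustration of this kind of front-end, the sketch below converts a single 5-second audio window into a mel-scaled, log-compressed spectrogram using the librosa library. The sample rate, FFT size, and number of mel bands here are illustrative assumptions, not the published model’s actual configuration, and the input file name is a placeholder.

```python
# Minimal sketch of a mel-spectrogram front-end for a 5-second window.
# The parameter values below are assumptions for illustration only.
import numpy as np
import librosa

SAMPLE_RATE = 24_000      # assumed sample rate for the recordings
WINDOW_SECONDS = 5        # the 5-second analysis window described in the article
N_MELS = 128              # assumed number of mel frequency bands


def window_to_log_mel(audio_window: np.ndarray) -> np.ndarray:
    """Convert one 5-second waveform window into a log-mel spectrogram image."""
    mel = librosa.feature.melspectrogram(
        y=audio_window,
        sr=SAMPLE_RATE,
        n_fft=1024,
        hop_length=256,
        n_mels=N_MELS,
    )
    # Log amplitude compression, as described for the model's front-end.
    return librosa.power_to_db(mel, ref=np.max)


# Example: build the spectrogram "image" for one window of a recording.
# "recording.wav" is a placeholder path.
waveform, _ = librosa.load("recording.wav", sr=SAMPLE_RATE, duration=WINDOW_SECONDS)
spectrogram = window_to_log_mel(waveform)
print(spectrogram.shape)  # (n_mels, time_frames): the input to the 12-class classifier
```

The resulting spectrogram would then be passed to a classifier that outputs scores over the 12 classes (eight species plus several specific vocalization types); that classifier is not shown here.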

Along with the classification task, the model helped researchers discover new insights about species’ movements, including differences between central and western Pacific Bryde’s whale populations. By labeling over 200,000 hours of underwater recordings, the model also uncovered the seasonal migration patterns of some species. The model is now publicly available via Kaggle for further use in whale conservation and research efforts.
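To give a sense of how such a classifier can be applied at this scale, the hypothetical sketch below slices a long passive acoustic recording into consecutive 5-second windows and scores each one. The function classify_window is a stand-in for the released model’s inference call, not its real API, and all parameter values are assumptions.

```python
# Hypothetical sketch of scanning a long recording with a 12-class whale-call
# classifier. classify_window is a placeholder, not the released model's API.
import numpy as np
import librosa

SAMPLE_RATE = 24_000      # assumed
WINDOW_SECONDS = 5


def classify_window(window: np.ndarray) -> np.ndarray:
    """Stand-in for the released classifier: returns 12 per-class scores.

    This stub returns zeros; replace it with the actual model's inference call.
    """
    return np.zeros(12)


def scan_recording(path: str):
    """Yield (start_time_in_seconds, class_scores) for every 5-second window."""
    audio, _ = librosa.load(path, sr=SAMPLE_RATE)
    samples_per_window = SAMPLE_RATE * WINDOW_SECONDS
    n_windows = len(audio) // samples_per_window
    for i in range(n_windows):
        window = audio[i * samples_per_window:(i + 1) * samples_per_window]
        yield i * WINDOW_SECONDS, classify_window(window)


# Example usage over one placeholder file; in practice this loop would run
# over many terabytes of long-term deployments.
for start_time, scores in scan_recording("deployment_001.wav"):
    top_class = int(np.argmax(scores))
    print(f"t={start_time}s -> class {top_class}")
```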

In conclusion, Google’s new whale bioacoustics model is a significant advancement in the field, addressing the challenge of multi-species classification with a model that not only recognizes eight species but also provides detailed insights into their ecology. This model is a crucial tool in marine biology research, offering scalable and accurate underwater audio data classification and furthering our understanding of whale populations, especially for elusive species like Bryde’s whales.


Check out the Paper and Blog. All credit for this research goes to the researchers of this project.



Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast and has a keen interest in the scope of software and data science applications. She is always reading about the developments in different fields of AI and ML.




