
Google DeepMind at ICML 2024




Exploring AGI, the challenges of scaling and the future of multimodal generative AI

Next week the artificial intelligence (AI) community will come together for the 2024 International Conference on Machine Learning (ICML). Running from July 21-27 in Vienna, Austria, the conference is an international platform for showcasing the latest advances, exchanging ideas and shaping the future of AI research.

This year, teams from across Google DeepMind will present more than 80 research papers. At our booth, we’ll also showcase Gemini Nano, our multimodal on-device model, and LearnLM, our new family of AI models for education, and we’ll demo TacticAI, an AI assistant that can help with football tactics.

Here we introduce some of our oral, spotlight and poster presentations:

Defining the path to AGI

What is artificial general intelligence (AGI)? The phrase describes an AI system that’s at least as capable as a human at most tasks. As AI models continue to advance, defining what AGI could look like in practice will become increasingly important.

We’ll present a framework for classifying the capabilities and behaviors of AGI models. Depending on their performance, generality and autonomy, our paper categorizes systems ranging from non-AI calculators to emerging AI models and other novel technologies.

We’ll also show that open-endedness is critical to building generalized AI that goes beyond human capabilities. While many recent AI advances were driven by existing Internet-scale data, open-ended systems can generate new discoveries that extend human knowledge.

At ICML, we’ll be demoing Genie, a model that can generate a range of playable environments based on text prompts, images, photos, or sketches.

Scaling AI systems efficiently and responsibly

Developing larger, more capable AI models requires more efficient training methods, closer alignment with human preferences and better privacy safeguards.

We’ll show how using classification instead of regression techniques makes it easier to scale deep reinforcement learning systems and achieve state-of-the-art performance across different domains. Additionally, we propose a novel approach that predicts the distribution of consequences of a reinforcement learning agent’s actions, helping rapidly evaluate new scenarios.
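To illustrate the classification idea, here is a minimal sketch, not the method from the paper itself. It assumes one common way of turning a scalar value target into a classification target: a "two-hot" encoding over evenly spaced bins, trained with cross-entropy instead of squared error. The function names (`two_hot`, `cross_entropy`, `expected_value`) and the bin range are hypothetical and chosen only for this example.

```python
import math

def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar target over the two nearest bin centers (assumed encoding)."""
    # Clamp the target into the range covered by the bins.
    target = min(max(target, v_min), v_max)
    step = (v_max - v_min) / (num_bins - 1)
    pos = (target - v_min) / step           # fractional bin index
    lower = int(math.floor(pos))
    upper = min(lower + 1, num_bins - 1)
    weight_upper = pos - lower              # share of mass on the upper bin
    probs = [0.0] * num_bins
    probs[lower] += 1.0 - weight_upper
    probs[upper] += weight_upper
    return probs

def cross_entropy(logits, target_probs):
    """Cross-entropy between softmax(logits) and a target distribution."""
    m = max(logits)                          # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(p * (x - log_z) for x, p in zip(logits, target_probs))

def expected_value(probs, v_min, v_max):
    """Recover a scalar value estimate as the mean of the bin centers."""
    step = (v_max - v_min) / (len(probs) - 1)
    return sum(p * (v_min + i * step) for i, p in enumerate(probs))

# Hypothetical usage: a value target of 0.3 on five bins spanning [0, 1].
target_probs = two_hot(0.3, 0.0, 1.0, 5)
loss = cross_entropy([0.0] * 5, target_probs)   # loss for an untrained, uniform head
```

In this sketch the network's value head would output one logit per bin, and the scalar value is read back with `expected_value`, so the regression target is recovered from the predicted distribution.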

Our researchers present an alignment-maintaining approach that reduces the need for human oversight, and a new game-theoretic approach to fine-tuning large language models (LLMs) that better aligns an LLM’s output with human preferences.

We critique the approach of training models on public data and only fine-tuning with “differentially private” training, and argue that this approach may not offer the privacy or utility that is often claimed.

VideoPoet is a large language model for zero-shot video generation.

New approaches in generative AI and multimodality

Generative AI technologies and multimodal capabilities are expanding the creative possibilities of digital media.

We’ll present VideoPoet, which uses an LLM to generate state-of-the-art video and audio from multimodal inputs including images, text, audio and other video.

We’ll also share Genie (generative interactive environments), which can generate a range of playable environments for training AI agents, based on text prompts, images, photos, or sketches.

Finally, we introduce MagicLens, a novel image retrieval system that uses text instructions to retrieve images with richer relations beyond visual similarity.

Supporting the AI community

We’re proud to sponsor ICML and foster a diverse community in AI and machine learning by supporting initiatives led by Disability in AI, Queer in AI, LatinX in AI and Women in Machine Learning.

If you’re at the conference, visit the Google DeepMind and Google Research booths to meet our teams, see live demos and find out more about our research.


