European Large Open Multi-Modal Foundation Models For Robust Generalization On Arbitrary Data Streams

Project overview

ELLIOT is a four-year Horizon Europe project building open, trustworthy Multimodal Generalist Foundation Models (MGFMs) for real-world use.

ELLIOT unites images, text, video, audio, 3D and sensors into robust models, advancing European sovereignty in AI.

Ambition

Foundation models score well on benchmarks but struggle when real-world conditions shift, or when data arrives as long, mixed streams (text, images, audio, sensors). It’s hard to trust results when the training data and evaluation checks aren’t clear.

ELLIOT tackles this with an open, reproducible pipeline, from gathering data to training, fine-tuning, testing and releasing, so performance and transparency go hand in hand.

Mission

ELLIOT’s mission is to advance open, trustworthy multimodal generalist foundation models that reliably generalise to real-world conditions, reason across space–time signals, and remain reproducible and auditable end-to-end.

Objectives

Multimodal data collection and generation

Preprocessing and aligning diverse data types (text, structured data, video, images, audio, 3D and sensors) for seamless integration.

Pre-training of MGFMs

Training large-scale models on heterogeneous datasets in a task-agnostic way to ensure broad adaptability.

Fine-tuning of MGFMs

Enabling learning and unlearning to adapt models to specific downstream tasks and mitigate bias.

Trustworthy AI themes

Focusing on transparency, fairness, explainability, and robustness, while promoting Green AI and energy efficiency.

Testing and evaluation of MGFMs

Assessing models in task-agnostic and task-specific scenarios across strategic European applications.

Examining policy, legal, ethical and societal aspects

Ensuring responsible, human-centric AI through legal, ethical, and societal considerations.

Action plan

An end-to-end, open workflow for robust multimodal AI.

ELLIOT follows a full-pipeline plan: from multimodal data to pre-training, fine-tuning, and trust and evaluation, supported by policy, ethics and adoption activities.

The goal is simple: build models that hold up in the real world and make the process open and robust.

Data collection and generation

Collect and generate high-quality data to enable the training of broader, more generalist MGFMs.

Pre-training of MGFMs

Develop methods to systematically search for new pre-training strategies that make models improve more efficiently as data and compute scale up.

Fine-tuning of MGFMs

Develop and evaluate novel generalist methods for continual learning and unlearning, so models can adapt to new tasks while reducing the impact of outdated or biased data.

Testing and evaluation of MGFMs

Develop open, reproducible benchmarks that automatically test pre-training and fine-tuning methods, using multimodal judges to assess outputs and feed results back into model improvement.

Trustworthy AI aspects

Make trustworthiness part of the design by building models that treat people fairly, protect privacy and are easier to understand and adjust.

Application on downstream tasks

Validate and specialise the generalist models across multiple domains via use cases, demonstrating emergent capabilities.

Use cases

The action plan is validated through real-world use cases across multiple domains, ensuring scientific excellence translates into societal and industrial impact.
