Introduction
Galileo is an open-source, highly multimodal foundation model developed to process, analyze, and understand diverse Earth observation (EO) data streams at scale, including optical, radar, elevation, climate, and auxiliary maps. Galileo is developed with support from researchers at McGill University, NASA Harvest, Ai2, Carleton University, University of British Columbia, Vector Institute, and Arizona State University. Galileo aims to provide a unified, generalist solution for critical applications like agricultural land mapping, disaster response, and environmental monitoring.
In contrast to prior remote sensing models limited to a single data type or scale, Galileo flexibly fuses multiple sensing modalities and is designed to recognize phenomena ranging from tiny objects (such as fishing boats, measuring just 1–2 pixels) to vast, slowly changing features like glaciers.

Key Features and Architecture
Multimodal Transformer Design
Galileo is based on a Vision Transformer (ViT) architecture, meticulously adapted to process:
- Multispectral optical imagery (e.g., Sentinel-2)
- Synthetic Aperture Radar (SAR) (e.g., Sentinel-1)
- Elevation and slope data (e.g., NASA SRTM)
- Weather/climate data (e.g., precipitation and temperature from ERA5)
- Land cover maps, population, night-lights, and more
Flexible Input Handling:
Galileo's tokenization pipeline splits remote sensing inputs into spatial patches, timesteps, and logical channel groups. This allows the model to process images, time series, and static tabular data in a single architecture configuration.
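The splitting described above can be illustrated with a minimal sketch. It is not Galileo's actual tokenizer: the function names, the channel group names, and the tile and patch sizes are all hypothetical, chosen only to show how one token arises per combination of timestep, spatial patch, and channel group.

```python
from itertools import product


def enumerate_tokens(height, width, timesteps, channel_groups, patch_size):
    """Return one (timestep, patch_row, patch_col, group) index per token."""
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    patch_rows = height // patch_size
    patch_cols = width // patch_size
    return list(
        product(range(timesteps), range(patch_rows), range(patch_cols), channel_groups)
    )


def extract_patch(band, patch_row, patch_col, patch_size):
    """Slice one square patch out of a 2D band stored as a list of rows."""
    top = patch_row * patch_size
    left = patch_col * patch_size
    return [row[left:left + patch_size] for row in band[top:top + patch_size]]


# Hypothetical channel grouping for a Sentinel-2-like input (an assumption).
groups = ["rgb", "red_edge", "swir"]
tokens = enumerate_tokens(
    height=32, width=32, timesteps=3, channel_groups=groups, patch_size=8
)
print(len(tokens))  # 3 timesteps * (4 * 4) patches * 3 groups = 144
```

Under this scheme, a static map would be the special case `timesteps=1`, and a single pixel time series would be the case where the tile is one patch wide, which is one way a single configuration could cover images, time series, and static inputs.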
Unified Local and Global Feature Learning
A core innovation is Galileo's self-supervised pretraining algorithm, which combines:
- Global losses: Encourage abstraction over wide spatial or temporal contexts, ideal for identifying "big" or slowly changing features (glaciers, forest loss).
- Local losses: Enhance sensitivity to minute details, crucial for detecting small, fast-changing objects (boats, debris).
Local and global objectives differ in:
- Prediction depth: Global tasks target deep latent representations; local tasks use shallow, linearly projected features.
- Masking strategies: Global tasks use structured, correlated space-time masks (forcing predictions over large intervals); local tasks use random unstructured masks.
This dual-objective pretraining enhances multi-scale feature representation, making Galileo generalizable across tasks and robust even with limited labels.
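The difference between the two masking strategies can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: the function names and parameters are hypothetical, and masking a contiguous block of timesteps is only one possible form of a structured, correlated space-time mask.

```python
import random


def random_mask(num_timesteps, num_patches, mask_ratio, seed=0):
    """Unstructured mask: hide tokens chosen independently of position."""
    rng = random.Random(seed)
    all_tokens = [(t, p) for t in range(num_timesteps) for p in range(num_patches)]
    num_masked = round(mask_ratio * len(all_tokens))
    return set(rng.sample(all_tokens, num_masked))


def structured_time_mask(num_timesteps, num_patches, num_masked_steps, seed=0):
    """Structured mask (assumed form): hide every patch in a contiguous run of timesteps."""
    rng = random.Random(seed)
    start = rng.randrange(num_timesteps - num_masked_steps + 1)
    return {
        (t, p)
        for t in range(start, start + num_masked_steps)
        for p in range(num_patches)
    }


local_mask = random_mask(num_timesteps=6, num_patches=16, mask_ratio=0.5)
global_mask = structured_time_mask(num_timesteps=6, num_patches=16, num_masked_steps=3)
print(len(local_mask), len(global_mask))  # both hide 48 of the 96 tokens
```

Both masks hide the same number of tokens, but the structured one removes whole intervals, so the hidden content cannot be recovered by interpolating from immediate neighbors, whereas the random mask leaves nearby context visible for most hidden tokens.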
Pretraining Dataset and Strategy
To ensure both semantic and geographic diversity, Galileo's pretraining dataset covers the entire globe, sampled via a clustering approach to maximize both land cover variety and geographic spread. The dataset comprises over 127,000 spatiotemporally aligned samples, each including four categories and nine remote sensing data types.
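One way to picture cluster-based sampling is the sketch below. It assumes cluster labels have already been assigned to candidate locations (the clustering step itself is not specified here), and the function name and parameters are hypothetical. It simply draws up to the same number of samples from every cluster so that rare clusters are not swamped by common ones.

```python
import random
from collections import defaultdict


def sample_evenly_across_clusters(cluster_ids, per_cluster, seed=0):
    """Pick up to `per_cluster` candidate indices from each cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        by_cluster[cluster].append(index)
    chosen = []
    for cluster in sorted(by_cluster):
        members = by_cluster[cluster]
        # Small clusters contribute everything they have.
        chosen.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen


# Toy example: cluster 0 is common, clusters 1 and 2 are rare.
cluster_ids = [0] * 10 + [1] * 2 + [2] * 1
picked = sample_evenly_across_clusters(cluster_ids, per_cluster=2)
print(len(picked))  # 2 + 2 + 1 = 5
```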
Pretraining proceeds for 500 epochs on large compute resources. Key aspects:
- Batch size: Effective batch size of 512.
- Data augmentations: Flipping, rotation, and variable patch sizes.
- Optimization: AdamW with scheduled learning rate and weight decay sweeps.
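The flip and rotation augmentations listed above can be sketched on small 2D tiles. This is a generic illustration with hypothetical function names, not the project's training code. It assumes that co-registered inputs (for example an optical band and a radar band over the same area) must receive the same geometric transform to stay aligned.

```python
def flip_horizontal(tile):
    """Mirror a 2D tile (list of rows) left to right."""
    return [row[::-1] for row in tile]


def rotate_90_clockwise(tile):
    """Rotate a 2D tile by 90 degrees clockwise."""
    return [list(row) for row in zip(*tile[::-1])]


def augment_aligned(tiles, flip, quarter_turns):
    """Apply the same flip and rotation to every co-registered tile."""
    augmented = []
    for tile in tiles:
        if flip:
            tile = flip_horizontal(tile)
        for _ in range(quarter_turns % 4):
            tile = rotate_90_clockwise(tile)
        augmented.append(tile)
    return augmented


optical = [[1, 2], [3, 4]]
radar = [[5, 6], [7, 8]]
print(augment_aligned([optical, radar], flip=False, quarter_turns=1))
# [[[3, 1], [4, 2]], [[7, 5], [8, 6]]]
```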

Benchmark Results
Superior Generalization
Galileo is benchmarked on 11 diverse datasets and 15 downstream tasks, spanning image and pixel time series classification, as well as segmentation. In particular, it leads on public datasets such as EuroSat, BigEarthNet, So2Sat, MADOS (marine debris), Sen1Floods11 (SAR flood mapping), CropHarvest (multimodal crop classification), and many others.
Performance Highlights of Galileo-Base (ViT-Base):
- Classification (Finetune):
  - EuroSat: 97.7% (top-1 accuracy, 100% training data)
  - Outperforms specialist models like CROMA (96.6%) and SatMAE (96.6%)
- Pixel Timeseries:
  - CropHarvest (Kenya): 84.2% (tops Presto and AnySat)
  - Breizhcrops: 73.0%
- Segmentation (mIoU):
  - MADOS: 67.6%
  - PASTIS: 79.4%
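For readers unfamiliar with the segmentation metric, mean intersection over union (mIoU) can be computed as in the sketch below. It uses the standard definition on flattened label lists. Skipping classes that appear in neither prediction nor target is an assumption, since benchmark protocols differ on that detail.

```python
def mean_iou(predicted, target, num_classes):
    """Average per-class intersection over union for flat label sequences."""
    ious = []
    for cls in range(num_classes):
        intersection = sum(
            1 for p, t in zip(predicted, target) if p == cls and t == cls
        )
        union = sum(1 for p, t in zip(predicted, target) if p == cls or t == cls)
        if union:  # skip classes absent from both prediction and target
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


predicted = [0, 0, 1, 1]
target = [0, 1, 1, 1]
# Class 0: 1 / 2, class 1: 2 / 3, so the mean is 7 / 12.
print(round(mean_iou(predicted, target, num_classes=2), 4))  # 0.5833
```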
Model Flexibility:
Across all benchmarks, Galileo is the top performer overall, outperforming both image-specialized and time-series-specialized competitors. Notably, small model variants (ViT-Nano, ViT-Tiny) also achieve top or near-top results, which matters for resource-constrained settings.

Ablation and Input Importance
Removing any individual modality (e.g., VIIRS night lights, ERA5, Dynamic World maps) from pretraining leads to a measurable decline in performance, even on benchmarks not directly using that input type. For example, the absence of VIIRS data reduces MADOS mIoU from 67.8% to 63.5%, demonstrating the value of full multimodality for feature generalization.
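The size of that drop can be worked out directly from the two figures quoted above. The helper name below is hypothetical, and the snippet only restates the arithmetic.

```python
def ablation_drop(full_score, ablated_score):
    """Return the absolute drop (in points) and the drop relative to the full score."""
    absolute = full_score - ablated_score
    relative = absolute / full_score
    return absolute, relative


absolute, relative = ablation_drop(67.8, 63.5)
print(f"{absolute:.1f} points, {relative:.1%} relative")  # 4.3 points, 6.3% relative
```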
Open-Source and Real-World Impact
- Open Weights & Code: All code, model weights, and pretraining data are available on GitHub, fostering transparency and adoption by the global EO community.
- Societal Benefits: Galileo supports mission-critical NASA Harvest activities, such as global crop type mapping, rapid disaster mapping (floods, wildfires), and marine pollution detection. The model's ability to work with limited labeled data makes it especially valuable in regions where ground truth is scarce, supporting food security and climate adaptation efforts.
Technical Summary Table
| Model | Params | Tasks Supported | Rank (Lower=Better) | Input Modalities |
|---|---|---|---|---|
| Galileo-Base | 85M | Images, Time Series | 1 (overall) | Optical, SAR, Weather, etc. |
| Specialist SOTA | varies | Usually 1 or 2 types | 3–10 | Limited |
Galileo-Base: consistently superior performance and flexibility across all major EO benchmarks.
Conclusion
Galileo's methodological and engineering advances (multimodal inputs, multi-scale local-global feature learning, and large-scale globally diverse pretraining) set a new standard for generalist remote sensing AI. Its flexibility underpins practical deployments from environmental monitoring to climate resilience, offering reliable, high-quality maps and predictions regardless of the task or geography.
With open-source access and active development, Galileo is positioned to catalyze a new wave of innovation in earth system science, empowering practitioners everywhere.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.

