Spanishoddata: A package for accessing and working with Spanish Open Mobility Big Data

Egor Kotov, Eugeni Vidal-Tortosa, Oliva G. Cantú-Ros, Javier Burrieza-Galán, Ricardo Herranz, Tania Gullón Muñoz-Repiso, and Robin Lovelace (2026). Spanishoddata: A package for accessing and working with Spanish Open Mobility Big Data. Environment and Planning B: Urban Analytics and City Science. https://doi.org/10.1177/23998083251415040

Authors

Egor Kotov

Eugeni Vidal-Tortosa

Oliva G. Cantú-Ros

Javier Burrieza-Galán

Ricardo Herranz

Tania Gullón Muñoz-Repiso

Robin Lovelace

Published

January 1, 2026

DOI

https://doi.org/10.1177/23998083251415040

Abstract

We present spanishoddata, an R package that enables fast and efficient access to Spain’s open, high-resolution origin-destination human mobility datasets, derived from anonymised mobile-phone records and released by the Ministry of Transport and Sustainable Mobility. The package directly addresses challenges of data accessibility, reproducibility, and efficient processing identified in prior studies. spanishoddata automates retrieval from the official source, performs file and schema validation, and converts the data to efficient, analysis-ready formats (DuckDB and Parquet) that enable multi-month and multi-year analysis on consumer-grade hardware. The interface handles complexities associated with these datasets, enabling a wide range of people – from data science beginners to experienced practitioners with domain expertise – to start using the data with just a few lines of code. We demonstrate the utility of the package with example applications in urban transport planning, such as assessing cycling potential or understanding mobility patterns by activity type. By simplifying data access and promoting reproducible workflows, spanishoddata lowers the barrier to entry for researchers, policymakers, transport planners or anyone seeking to leverage mobility datasets.

Type: Journal Article
Venue: Environment and Planning B: Urban Analytics and City Science
Year: 2026

BibTeX

@article{kotov_spanishoddata_2026,
    title = {spanishoddata: {A} package for accessing and working with {Spanish} {Open} {Mobility} {Big} {Data}},
    issn = {2399-8083},
    shorttitle = {spanishoddata},
    url = {https://doi.org/10.1177/23998083251415040},
    doi = {10.1177/23998083251415040},
    abstract = {We present spanishoddata, an R package that enables fast and efficient access to Spain’s open, high-resolution origin-destination human mobility datasets, derived from anonymised mobile-phone records and released by the Ministry of Transport and Sustainable Mobility. The package directly addresses challenges of data accessibility, reproducibility, and efficient processing identified in prior studies. spanishoddata automates retrieval from the official source, performs file and schema validation, and converts the data to efficient, analysis-ready formats (DuckDB and Parquet) that enable multi-month and multi-year analysis on consumer-grade hardware. The interface handles complexities associated with these datasets, enabling a wide range of people – from data science beginners to experienced practitioners with domain expertise – to start using the data with just a few lines of code. We demonstrate the utility of the package with example applications in urban transport planning, such as assessing cycling potential or understanding mobility patterns by activity type. By simplifying data access and promoting reproducible workflows, spanishoddata lowers the barrier to entry for researchers, policymakers, transport planners or anyone seeking to leverage mobility datasets.},
    language = {EN},
    urldate = {2026-03-06},
    journal = {Environment and Planning B: Urban Analytics and City Science},
    publisher = {SAGE Publications Ltd STM},
    author = {Kotov, Egor and Vidal-Tortosa, Eugeni and Cantú-Ros, Oliva G. and Burrieza-Galán, Javier and Herranz, Ricardo and Gullón Muñoz-Repiso, Tania and Lovelace, Robin},
    month = jan,
    year = {2026},
    pages = {23998083251415040},
}

Notes

Impact Statement

This output, co-authored by an international team that includes staff from the Spanish Ministry of Transport and Sustainable Mobility (Ministerio de Transportes y Movilidad Sostenible) and the CEO of Nommon, a disruptive international mobility data science startup, is setting the international agenda for open science in transport planning. It outlines the potential for innovation and more evidence-based decision making that opens up when nation states go beyond open data and publish datasets that are not only open access but also analysis-ready. It does this through a case study of spanishoddata, the first R package to provide high-resolution, analysis-ready access to national-scale mobile phone origin-destination data. The rigour lies in its technical architecture, which uses DuckDB and the Parquet format to process large (200 GB+) datasets on consumer-grade hardware, with a 5-8x performance increase over traditional methods. Its significance lies in lowering the barrier for researchers to conduct multi-year mobility studies, as evidenced by its immediate uptake (2,200+ downloads). It addresses the “reproducibility crisis” in Mobile Network Data (MND) research by automating Extract, Transform, Load (ETL) processes and enforcing standardised citation.

Unit of Assessment: UoA 9 (Architecture, Built Environment and Planning).
Alternative UoA: UoA 11 (Computer Science and Informatics).
PGR student co-author: Yes (Egor Kotov).
Potential for Double Weighting: No.
Author rating: Likely 3* for technical innovation and reproducibility.