GeoLife+: Large-Scale Simulated Trajectory Datasets Calibrated to the GeoLife Dataset


Summary:

GeoLife+ addresses the limitations of real-world trajectory data by generating simulated datasets that mimic realistic human mobility patterns, using the popular but sparse GeoLife dataset as a reference. By calibrating a simulation model (the Pattern of Life Simulation) to GeoLife’s characteristics, GeoLife+ offers rich datasets that remain statistically similar to real human behavior, without the constraints of participant privacy and limited sample sizes.

Methodology:

The study employed a genetic algorithm to tune the parameters of the Pattern of Life Simulation so that its output resembles GeoLife’s patterns, such as trip distance and frequency. The algorithm iteratively adjusts the parameters, favoring settings whose synthetic data align most closely with GeoLife’s statistical features. The datasets generated with the calibrated simulation vary in size, from hundreds of agents up to 100,000, and cover different time spans.
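To make the calibration idea concrete, here is a minimal Python sketch of a genetic algorithm that tunes two parameters of a toy simulator until its summary statistics match a target. It is not the authors' implementation: the `toy_simulation` function, the two parameters, the `TARGET` values, and the `BOUNDS` are all hypothetical stand-ins for the Pattern of Life Simulation and the statistics measured from GeoLife.

```python
import random

# Hypothetical target statistics (mean trip distance in km, trips per day).
# These are illustrative numbers, not values taken from GeoLife.
TARGET = (5.0, 3.0)

# Hypothetical search bounds for the two simulation parameters.
BOUNDS = [(0.1, 20.0), (0.1, 10.0)]


def toy_simulation(params, seed=0, n_agents=50):
    """Toy stand-in for the real simulator: returns two summary statistics.

    A fixed seed makes the output deterministic for a given parameter vector.
    """
    scale_km, trip_rate = params
    rng = random.Random(seed)
    dists, counts = [], []
    for _ in range(n_agents):
        dists.append(rng.expovariate(1.0 / scale_km))    # trip distance
        counts.append(trip_rate * rng.uniform(0.8, 1.2))  # trips per day
    return sum(dists) / n_agents, sum(counts) / n_agents


def error(params):
    """Sum of squared relative differences between simulated and target stats."""
    sim = toy_simulation(params)
    return sum(((s - t) / t) ** 2 for s, t in zip(sim, TARGET))


def calibrate(pop_size=30, generations=40, mutation_sd=0.3, seed=1):
    """Simple elitist genetic algorithm; returns the best parameters and
    the best error recorded in each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=error)
        history.append(error(pop[0]))
        elite = pop[: pop_size // 5]  # keep the best 20% unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped to the search bounds.
            child = [min(hi, max(lo, g + rng.gauss(0, mutation_sd)))
                     for g, (lo, hi) in zip(child, BOUNDS)]
            children.append(child)
        pop = elite + children
    best = min(pop, key=error)
    history.append(error(best))
    return best, history


if __name__ == "__main__":
    best, history = calibrate()
    print("best parameters:", best)
    print("simulated stats:", toy_simulation(best), "target:", TARGET)
```

Because the elite individuals are carried over unchanged and the toy simulator is deterministic, the best error never increases from one generation to the next. A real calibration would replace `toy_simulation` with full simulation runs and would likely compare whole distributions rather than two averages.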

Results:

GeoLife+ produced large-scale trajectory datasets with up to 100,000 users, providing a density of human mobility data unattainable through GeoLife alone. This enriched dataset preserves realistic movement patterns and social interactions, making it a valuable resource for researchers studying human mobility.

Conclusion:

GeoLife+ represents a significant advancement in trajectory dataset simulation, allowing researchers to access high-density data that maintains social realism. This resource is available for further research and applications, with open access on GitHub, supporting large-scale mobility analysis and privacy-conscious data applications in areas like urban planning and infectious disease tracking.

Full Paper:


You can access the paper at this link.