Dataset: Pedestrian tracking with group annotations

This page provides several datasets containing pedestrian positions and group annotations. Pedestrians were tracked with automatic tracking systems, whereas the groups were labeled manually. The data were recorded in two environments: "DIAMOR" and "ATC".

The DIAMOR environment datasets were recorded in two large straight corridors connecting the Diamor shopping centre in Osaka, Japan, with the railway station. Pedestrians were tracked with multiple laser range finders, following the method described in [4].

The ATC environment datasets were taken in part of the ATC (Asia and Pacific Trade Center) shopping and business center in Osaka, Japan. The tracking of pedestrians in this environment was done using 3-D range sensors. For more details see [5]. For additional data from this environment you can also check the ATC dataset page.

All annotations were done by a single coder (note: this is different from [1], where an agreement between two coders was used). Apart from group annotations, consisting of IDs of all pedestrians in the group, the socially interacting partners are also annotated (the definition of social interaction is given in [1]).


The dataset contains 8 experiment days: 2 in the DIAMOR environment and 6 in the ATC environment. For each day, data is provided for four 1-hour periods: 10:00-11:00, 12:00-13:00, 15:00-16:00, 19:00-20:00.

Preprocessed data

The data has been resampled at a lower frequency to limit the impact of tracking noise and gait-induced oscillations.

The data is provided as zip compressed files, one file per experiment day. Each zip file contains one or more files containing the person tracking data (person_[DATASET_NAME]_[TIME].csv) and one containing the group annotations (groups_[DATASET_NAME].dat).

Person tracking files contain the data for all persons that were tracked in the environment on a given day and period of time. Each row corresponds to a single tracked person at a single instant, and it contains the following 8 fields (comma separated values):
TIME [s] (unixtime + milliseconds/1000), PEDESTRIAN_ID, POSITION_X [mm], POSITION_Y [mm], POSITION_Z (height) [mm], VELOCITY [mm/s], ANGLE_OF_MOTION (direction of velocity vector) [rad], FACING_ANGLE (body direction) [rad]
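As an illustration of the field order above, the following is a minimal sketch of a reader for the person tracking files, using only the Python standard library. It rests on several assumptions that the description does not confirm: the files have no header row, PEDESTRIAN_ID is an integer, and ANGLE_OF_MOTION is measured from the x axis. The names PersonSample, read_person_rows and velocity_components are hypothetical helpers introduced here.

```python
import csv
import math
from typing import Iterable, NamedTuple


class PersonSample(NamedTuple):
    """One row of a person tracking file (field order as listed above)."""
    time: float          # unix time [s], including fractional milliseconds
    pedestrian_id: int   # assumption: IDs are integers
    x: float             # POSITION_X [mm]
    y: float             # POSITION_Y [mm]
    z: float             # POSITION_Z, height [mm]
    velocity: float      # speed [mm/s]
    motion_angle: float  # direction of the velocity vector [rad]
    facing_angle: float  # body direction [rad]


def read_person_rows(lines: Iterable[str]) -> list[PersonSample]:
    """Parse comma-separated rows; rows without exactly 8 fields are skipped."""
    samples = []
    for row in csv.reader(lines):
        if len(row) != 8:
            continue  # blank or malformed line
        t, pid, x, y, z, v, motion, facing = row
        samples.append(PersonSample(float(t), int(pid), float(x), float(y),
                                    float(z), float(v), float(motion),
                                    float(facing)))
    return samples


def velocity_components(sample: PersonSample) -> tuple[float, float]:
    """Split the speed into x/y components [mm/s].

    Assumption: the angle of motion is measured from the x axis.
    """
    return (sample.velocity * math.cos(sample.motion_angle),
            sample.velocity * math.sin(sample.motion_angle))


# Usage with a made-up row (values are illustrative, not taken from the data).
# For a real file, pass an open file object instead of the list.
example = ["1353027600.123,1001,1500.0,-2300.5,1650.0,1200.0,0.0,0.1"]
rows = read_person_rows(example)
vx, vy = velocity_components(rows[0])
```

Grouping the returned samples by pedestrian_id then gives one trajectory per tracked person.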

Group files contain the group annotations for the given day. Only pedestrians in groups are listed; pedestrians walking alone are not included. Each row corresponds to a single tracked pedestrian inside one group, and it contains the following fields (space separated values):
PEDESTRIAN_ID  GROUP_SIZE  PARTNER_ID_1 ... (list of ids of all other pedestrians in group)  NUMBER_OF_INTERACTING_PARTNERS  INTERACTION_PARTNER_ID_1 ... (list of all socially interacting partners)
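Because the rows of a group file have variable length, a small parser can make the layout concrete. The sketch below assumes that the partner list holds exactly GROUP_SIZE - 1 IDs (every other member of the group) and that all IDs are integers; neither is stated explicitly above. GroupEntry and parse_group_line are hypothetical names.

```python
from typing import NamedTuple


class GroupEntry(NamedTuple):
    """One row of a group annotation file."""
    pedestrian_id: int
    group_size: int
    partner_ids: list[int]              # all other pedestrians in the group
    interacting_partner_ids: list[int]  # socially interacting partners


def parse_group_line(line: str) -> GroupEntry:
    """Parse one space-separated row of a group file."""
    tokens = [int(tok) for tok in line.split()]
    pedestrian_id, group_size = tokens[0], tokens[1]

    # Assumption: the partner list has GROUP_SIZE - 1 entries.
    n_partners = group_size - 1
    partner_ids = tokens[2:2 + n_partners]

    # The count of interacting partners follows the partner list.
    n_interacting = tokens[2 + n_partners]
    start = 3 + n_partners
    interacting = tokens[start:start + n_interacting]
    if len(partner_ids) != n_partners or len(interacting) != n_interacting:
        raise ValueError(f"unexpected number of fields: {line!r}")

    return GroupEntry(pedestrian_id, group_size, partner_ids, interacting)


# Made-up row: pedestrian 101 walks in a group of 3 with 102 and 103,
# and is annotated as socially interacting with 102 only.
entry = parse_group_line("101 3 102 103 1 102")
```

Since each member of a group has its own row, the full group can be recovered as the pedestrian's own ID together with partner_ids.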

Raw data and interaction annotations (DIAMOR only)

The raw data is provided as zip compressed files, one file per experiment day. Each zip file contains one folder with the trajectory data (sampled at around 33 Hz) and one with the interaction annotations.

In the trajectories folder, the trajectories are stored in .dat files with the following format:
TIME [s] (unixtime + milliseconds/1000) i PEDESTRIAN_ID 0 POSITION_X [mm] POSITION_Y [mm] POSITION_Z (height) [mm] ANGLE_OF_MOTION (direction of velocity vector) [rad], FACING_ANGLE (body direction) [rad] -1
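A raw trajectory row could be read along the following lines. This is only a sketch under assumptions: each row has ten tokens in the order given above, tokens are separated by whitespace (a stray comma is tolerated), and the tokens shown as i, 0 and -1 carry no information needed here and are ignored. RawSample and parse_raw_line are hypothetical names; the processing code referred to on this page remains the authoritative reference.

```python
from typing import NamedTuple


class RawSample(NamedTuple):
    """One row of a raw trajectory .dat file (no velocity field)."""
    time: float          # unix time [s], including fractional milliseconds
    pedestrian_id: int   # assumption: IDs are integers
    x: float             # POSITION_X [mm]
    y: float             # POSITION_Y [mm]
    z: float             # POSITION_Z, height [mm]
    motion_angle: float  # direction of the velocity vector [rad]
    facing_angle: float  # body direction [rad]


def parse_raw_line(line: str) -> RawSample:
    """Parse one row, assuming ten whitespace-separated tokens."""
    tokens = line.replace(",", " ").split()
    if len(tokens) != 10:
        raise ValueError(f"expected 10 fields, got {len(tokens)}: {line!r}")
    # Tokens 1, 3 and 9 (shown as "i", "0" and "-1" in the format) are skipped.
    return RawSample(
        time=float(tokens[0]),
        pedestrian_id=int(tokens[2]),
        x=float(tokens[4]),
        y=float(tokens[5]),
        z=float(tokens[6]),
        motion_angle=float(tokens[7]),
        facing_angle=float(tokens[8]),
    )


# Made-up row for illustration only.
raw = parse_raw_line("1286330400.033 i 2001 0 100.0 200.0 1700.0 1.57 1.60 -1")
```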

The annotations folder contains the interaction annotations. The file ids_wrt_group_size_tan_[DAY].pkl contains the list of all groups, where each group is represented as a list of pedestrian IDs. The file gt_2p_yos_[DAY].pkl contains annotations of the intensity of interaction (ranging from 0 for no interaction to 3 for strong interaction) within groups of 2 pedestrians. Each element in the list is another list containing the IDs of the two pedestrians (index 0 and 1) and the intensity of interaction (index 4).
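The indices described above can be used as in the following sketch, which assumes the .pkl files are standard Python pickles holding plain lists. The function name load_pair_interactions is hypothetical, and the example record is made up. Pickle files can execute code when loaded, so only files obtained from this dataset should be opened this way.

```python
import io
import pickle
from typing import BinaryIO


def load_pair_interactions(stream: BinaryIO) -> dict[frozenset, int]:
    """Map each annotated pair of pedestrians to its interaction intensity.

    Each record is indexed as described above: the two pedestrian IDs at
    index 0 and 1, and the intensity (0 to 3) at index 4.
    """
    records = pickle.load(stream)
    return {frozenset((rec[0], rec[1])): rec[4] for rec in records}


# Illustration with an in-memory pickle. Only indices 0, 1 and 4 are
# described above, so the remaining positions are filled with None here.
fake_file = io.BytesIO(pickle.dumps([[101, 102, None, None, 3]]))
intensity = load_pair_interactions(fake_file)
# For a real file, open it in binary mode and pass the file object,
# e.g. with open(path, "rb") as f: load_pair_interactions(f)
```

A frozenset key makes the lookup independent of the order in which the two IDs are given.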

The repository at contains the instruction and code to process the raw data and interaction annotations.

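Environment | Date | Times | File size
DIAMOR (preprocessed trajectories and group labels) | 2010/10/06 (Wed) | 11:00-15:40 | 133 MB
DIAMOR (preprocessed trajectories and group labels) | 2010/10/08 (Fri) | 11:20-17:00 | 257 MB
DIAMOR (raw trajectories and interaction labels) | 2010/10/06 (Wed) | 11:00-15:40 | 172 MB
DIAMOR (raw trajectories and interaction labels) | 2010/10/08 (Fri) | 11:20-17:00 | 278 MB
ATC (preprocessed trajectories and group labels) | 2013/01/09 (Wed) | 10:00-11:00, 19:00-20:00 | 171 MB
ATC (preprocessed trajectories and group labels) | 2013/02/17 (Sun) | | 366 MB
ATC (preprocessed trajectories and group labels) | 2013/03/24 (Sun) | | 303 MB
ATC (preprocessed trajectories and group labels) | 2013/04/24 (Wed) | | 130 MB
ATC (preprocessed trajectories and group labels) | 2013/05/05 (Sun) | | 515 MB
ATC (preprocessed trajectories and group labels) | 2013/05/08 (Wed) | | 168 MB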


The datasets are free to use for research purposes only.
If you use the datasets in your work, please be sure to cite the references below: [1], [2] (for the DIAMOR datasets) and [3] (for the ATC datasets).


Papers containing more details about the dataset:
[1] F. Zanlungo, T. Ikeda, and T. Kanda, "Potential for the dynamics of pedestrians in a socially interacting group," Physical Review E, Vol. 89, No. 1, 012811, 2014
[2] A. Gregorj, Z. Yucel, F. Zanlungo, T. Kanda, "Ecological Data Reveal Imbalances in Collision Avoidance Due to Groups' Social Interaction", arXiv:2406.06084, 2024
[3] F. Zanlungo, D. Brscic, and T. Kanda, "Pedestrian group spatial size scaling under growing density conditions," Physical Review E, Vol. 91, No. 6, 062810, 2015

Additional references:
Tracking using laser range finders:
[4] D. Glas, T. Miyashita, H. Ishiguro, and N. Hagita, "Laser-based tracking of human position and orientation using parametric shape modeling," Advanced Robotics, Vol. 23, No. 4, pp. 405-428, 2009
Tracking using 3-D range finders and ATC environment:
[5] D. Brscic, T. Kanda, T. Ikeda, T. Miyashita, "Person position and body direction tracking in large public spaces using 3-D range sensors," IEEE Transactions on Human-Machine Systems, Vol. 43, No. 6, pp. 522-534, 2013

Inquiries and feedback

For any questions concerning the datasets please contact: