Synthetic AIS Dataset of Vessel Proximity Events
The development of solutions and models for the analysis, early detection and mitigation of vessel collision events is a significant step towards ensuring future maritime safety. In this context, a synthetic vessel proximity event dataset is created using real vessel AIS messages. The synthetic dataset of trajectories with reconstructed timestamps is generated so that the two trajectories in a pair reach their intersection point simultaneously, simulating an unintended proximity event (a collision close call). The dataset aims to provide a basis for the development of methods for the detection and mitigation of maritime collisions and proximity events, as well as for the study and training of vessel crews in simulator environments.

The dataset consists of 4658 samples/AIS messages of 213 unique vessels from the Aegean Sea with simulated vessel proximity events.
The Automatic Identification System (AIS) allows vessels to share identification, characteristics, and location data through self-reporting. This information is periodically broadcast and can be received by other vessels with AIS transceivers, as well as by ground or satellite sensors. Since the International Maritime Organization (IMO) mandated AIS for vessels above 300 gross tonnage, extensive datasets have emerged, becoming a valuable resource for maritime intelligence.
Maritime collisions occur when two vessels collide or when a vessel strikes a floating or stationary object, such as an iceberg. They are among the most consequential marine accidents for several reasons:
- Injuries and fatalities of vessel crew members and passengers.
- Environmental effects, especially in cases involving large tanker ships and oil spills.
- Direct and indirect economic losses for local communities near the accident area.
- Adverse financial consequences for ship owners, insurance companies and cargo owners, including vessel loss and penalties.
- Growing exposure: as sea routes become more congested and vessel speeds increase, the likelihood of a significant accident, and of a collision between vessels in particular, during a ship's operational life rises.
The dataset consists of 4658 samples/AIS messages of 213 unique vessels from the Aegean Sea. The following steps were used to create it.
Given two vessels X (vessel_id1) and Y (vessel_id2) with their known locations (latitude [lat], longitude [lon]):
- Check whether the trajectories of vessels X and Y intersect spatially.
- If the trajectories of vessels X and Y intersect, temporally align the timestamp of vessel Y at the intersection point with X’s timestamp at the intersection point. The temporal alignment is performed so that the spatial intersection (nearest proximity point) occurs at the same time for both vessels.
- In addition, the timestamp of each vessel pair's proximity event differs from that of any proximity event that occurs later, so that different vessel trajectory pairs do not overlap temporally.
Two CSV files are provided. vessel_positions.csv contains the AIS positions of all vessels, with the fields vessel_id, t, lon, lat, heading, course and speed. Simulated_vessel_proximity_events.csv contains the id, position and timestamp of each identified proximity event, along with the vessel_id numbers of the associated vessels. In total, the dataset contains 237 unintended proximity events. Examples of unintended vessel proximity events are visualized in the accompanying png and gif files.
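The intersection check and temporal alignment described in the steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used to build the dataset. It assumes each trajectory is a time-ordered list of (t, lon, lat) tuples with t as a numeric timestamp, treats lon/lat as planar coordinates (reasonable only over short distances), interpolates time linearly along each segment, and aligns vessel Y by applying one constant offset to all of its timestamps. The function names segment_intersection, find_intersection and align_track are hypothetical.

```python
from typing import List, Optional, Tuple

# One AIS fix: (t, lon, lat). Field names follow vessel_positions.csv;
# treating t as a plain number (e.g. seconds) is an assumption.
Fix = Tuple[float, float, float]
XY = Tuple[float, float]


def segment_intersection(p1: XY, p2: XY, q1: XY, q2: XY
                         ) -> Optional[Tuple[float, float, float, float]]:
    """Return (u, v, lon, lat) if segment p1-p2 crosses q1-q2, else None.

    u and v are the fractions (0..1) along each segment at the crossing.
    Planar geometry on lon/lat is an assumption of this sketch.
    """
    (x1, y1), (x2, y2) = p1, p2
    (x3, y3), (x4, y4) = q1, q2
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear segments are ignored here
    u = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    v = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= u <= 1 and 0 <= v <= 1:
        return u, v, x1 + u * (x2 - x1), y1 + u * (y2 - y1)
    return None


def find_intersection(track_x: List[Fix], track_y: List[Fix]
                      ) -> Optional[Tuple[float, float, float, float]]:
    """Return (t_x, t_y, lon, lat) for the first crossing found, else None.

    t_x and t_y are the linearly interpolated times at which each vessel
    passes the crossing point on its own, unaligned timeline.
    """
    for i in range(len(track_x) - 1):
        (ta, xa, ya), (tb, xb, yb) = track_x[i], track_x[i + 1]
        for j in range(len(track_y) - 1):
            (tc, xc, yc), (td, xd, yd) = track_y[j], track_y[j + 1]
            hit = segment_intersection((xa, ya), (xb, yb), (xc, yc), (xd, yd))
            if hit is not None:
                u, v, lon, lat = hit
                return ta + u * (tb - ta), tc + v * (td - tc), lon, lat
    return None


def align_track(track_x: List[Fix], track_y: List[Fix]) -> Optional[List[Fix]]:
    """Shift Y's timestamps so both vessels reach the crossing together.

    Using a single constant offset for the whole of Y's track is an
    assumption; the dataset description only states that Y is aligned
    to X's timestamp at the intersection point.
    """
    hit = find_intersection(track_x, track_y)
    if hit is None:
        return None  # trajectories do not intersect; nothing to align
    t_x, t_y, _, _ = hit
    offset = t_x - t_y
    return [(t + offset, lon, lat) for t, lon, lat in track_y]


# Illustrative, made-up coordinates (not taken from the dataset).
x_track = [(0.0, 0.0, 0.0), (100.0, 2.0, 2.0)]
y_track = [(500.0, 0.0, 2.0), (600.0, 2.0, 0.0)]
y_aligned = align_track(x_track, y_track)
```

After alignment, calling find_intersection(x_track, y_aligned) returns equal times for both vessels at the crossing point, which is the condition the steps above describe. Keeping different vessel pairs from overlapping in time would need an additional per-pair offset, which this sketch does not attempt.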
The research leading to these results has received funding from the European Union's Horizon Europe Programme under the CREXDATA Project, grant agreement n° 101092749.