Detecting traffic problems (IJS, KUL, UPM)

Application domain: Traffic control
Source: Technical University Madrid
Dataset size: nearly 6000 observations (>1 MB)
Data format: Prolog
Systems Used: Claudien, Tilde
References: (Dzeroski et al. 1998)

The data

The data come from AIMSUN, a simulator of the car traffic around Barcelona, Spain. The simulator is detailed enough that the behavior of each individual car can be observed. Incidents and congestion can be simulated, and the readings of speed, saturation and occupation sensors recorded. Relational background knowledge describes the structure of the road network and the relative position of different road sections. A section is a cross-section of the road; sensors are attached to most, but not all, sections. In total, 128 different traffic situations, including traffic problems, were simulated and the sensor readings logged, resulting in 5952 sensor readings.

Experiment with Claudien

In an initial experiment, only 11 simulations were run. The system Claudien was used to find rules describing the type of problem (incident versus congestion caused by lack of capacity) and the critical section, i.e., the section that constrains traffic flow the most. Three rules were found that together cover all 9 incident examples. The first rule states that there is an incident at critical section X, where X is the previous section of an off-ramp link Y (enlace de salida) with next section O and ramp section R, if the speed (velocidad) at X is not high (alta), the speed at O is high and the saturation on R is low (baja).

  accidentat(X) :- 
    seccion(X), seccion_anterior(Y,X), seccion_posterior(Y,O), 
    enlace_de_salida(Y), velocidad(X,VX), not VX = alta, 
    velocidad(O,VO), VO = alta, 
    seccion_en_rampa(Y,R), saturacion(R,SR), SR = baja.

The rules found describe conditions already known to the domain experts, which encouraged us to conduct further experiments.
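
To illustrate how the first rule fires, here is a minimal sketch that runs it against a tiny, entirely hypothetical fact base. The section, link and ramp names (s1 to s4, e1, e2, r1, r2) and all sensor values are invented for illustration and are not taken from the dataset. The sketch also assumes a standard Prolog system such as SWI-Prolog, so the negation is written with `\+` instead of the `not` that appears in the rule as printed above.

```prolog
% Hypothetical road network: two off-ramp links, e1 and e2.
% All names and sensor values below are invented for illustration.
seccion(s1). seccion(s2). seccion(s3). seccion(s4).

enlace_de_salida(e1).
enlace_de_salida(e2).

% seccion_anterior(Link, Section): Section is the section before Link.
seccion_anterior(e1, s1).
seccion_anterior(e2, s3).

% seccion_posterior(Link, Section): Section is the section after Link.
seccion_posterior(e1, s2).
seccion_posterior(e2, s4).

% seccion_en_rampa(Link, Section): Section is the ramp section of Link.
seccion_en_rampa(e1, r1).
seccion_en_rampa(e2, r2).

% Discretized sensor readings.
velocidad(s1, baja).   % slow traffic before link e1
velocidad(s2, alta).
velocidad(s3, alta).   % free-flowing traffic before link e2
velocidad(s4, alta).

saturacion(r1, baja).
saturacion(r2, baja).

% The first rule from the text, with `not` rewritten as `\+`.
accidentat(X) :-
    seccion(X), seccion_anterior(Y, X), seccion_posterior(Y, O),
    enlace_de_salida(Y), velocidad(X, VX), \+ VX = alta,
    velocidad(O, VO), VO = alta,
    seccion_en_rampa(Y, R), saturacion(R, SR), SR = baja.
```

With these facts, the query `?- accidentat(X).` succeeds only for `X = s1`: traffic is slow before off-ramp link e1 but fast after it, and the ramp itself is not saturated. Section s3 is not flagged because the speed there is already high.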

Experiments with Tilde

Additional simulations were then performed to generate the dataset described above, and further experiments were conducted on this full dataset. Tilde was used to build a decision tree distinguishing between accident critical sections, congestion critical sections and noncritical sections. A 6-fold cross-validation on the complete dataset classified 80% of the congestions correctly and misclassified only 39 of the 5824 noncritical sections, but only 61% of the incidents were classified correctly. Repeating the experiment with only 90 randomly selected noncritical sections (so that the class distribution is less skewed) gives better overall results: 92% accuracy on congestions, 88% on incidents and 83% on noncritical sections.
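
The skew in the class distribution can be made concrete with a little arithmetic on the figures quoted above. The following sketch only recombines numbers from the text (5952 readings, 5824 noncritical sections, 39 misclassified); the predicate names are hypothetical, and the "always predict noncritical" baseline is our own illustration, not an experiment reported in the source.

```prolog
% Figures quoted in the text.
total_readings(5952).
noncritical(5824).
noncritical_misclassified(39).

% Number of critical sections (incidents plus congestions).
critical_count(C) :-
    total_readings(T), noncritical(N),
    C is T - N.

% Fraction of noncritical sections classified correctly by the tree.
noncritical_accuracy(A) :-
    noncritical(N), noncritical_misclassified(M),
    A is (N - M) / N.

% Overall accuracy of a trivial classifier that always answers
% "noncritical" (an illustrative baseline, not from the source).
majority_baseline(B) :-
    total_readings(T), noncritical(N),
    B is N / T.
```

Querying `critical_count(C)` gives 128, matching the 128 simulated situations; `noncritical_accuracy(A)` gives roughly 0.993 and `majority_baseline(B)` roughly 0.978. Because a classifier that never reports a problem would already be right on about 98% of all readings, the per-class figures are more informative than overall accuracy, which is why subsampling the noncritical sections is a reasonable step.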


  1. S. Dzeroski, N. Jacobs, M. Molina, and C. Moure. ILP experiments in detecting traffic problems. In Proc. Tenth European Conference on Machine Learning, 61-66. Springer, Berlin, 1998.
