Examining the Use of Scott’s Formula and Link Expiration Time Metric for Vehicular Clustering

By Fady Esmat Fathel Samann | CMES


Implementing machine learning algorithms in vehicular networks requires some adaptation, because the high computational complexity of these algorithms is poorly suited to such a constrained environment. K-clustering algorithms are simple, fast, and reasonably accurate. However, their implementation depends on the initial choice of the number of clusters (K), the initial cluster centers, and the clustering metric. This paper investigated using Scott’s histogram formula to estimate K and the Link Expiration Time (LET) as the clustering metric. Realistic traffic flows were considered for three maps, namely Highway, Traffic Light junction, and Roundabout junction, to study the effect of road layout on the estimate of K. A fast version of the PAM (Partitioning Around Medoids) algorithm, modified to reduce time complexity, was used for clustering. The Affinity Propagation algorithm set the baseline for the estimated K, and the Medoid Silhouette method was used to quantify clustering quality. OMNeT++, Veins, and SUMO were used to simulate the traffic, while the related algorithms were implemented in Python. The K estimated by Scott’s formula matched the baseline only when the road layout was simple. Moreover, the clustering algorithm required one iteration on average to converge when used with LET.
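
To make the two central quantities concrete, the following minimal Python sketch shows one common way to compute them. It is an illustration under stated assumptions, not the paper's implementation. The function names `scott_k` and `link_expiration_time` are hypothetical. The sketch assumes that Scott's rule (bin width h = 3.49 s n^(-1/3), with s the sample standard deviation) is applied to a one-dimensional sample and that the resulting number of histogram bins is taken as K. It also assumes the widely used closed-form LET for two vehicles moving at constant speed and heading with a common transmission range r. The abstract does not state which variable the formula is applied to or how edge cases are handled, so those choices are assumptions as well.

```python
import math
import statistics


def scott_k(values):
    """Estimate K as the number of histogram bins given by Scott's rule.

    Assumption: K = ceil(range / h), with h = 3.49 * s * n^(-1/3),
    where s is the sample standard deviation of a 1-D sample.
    """
    n = len(values)
    if n < 2:
        return 1
    s = statistics.stdev(values)
    if s == 0:
        return 1  # all values identical: a single cluster
    h = 3.49 * s * n ** (-1.0 / 3.0)
    return max(1, math.ceil((max(values) - min(values)) / h))


def link_expiration_time(xi, yi, vi, theta_i, xj, yj, vj, theta_j, r):
    """Closed-form LET (seconds) for two vehicles with constant velocity.

    Positions in metres, speeds in m/s, headings in radians, and r is the
    (assumed common) transmission range in metres.
    """
    a = vi * math.cos(theta_i) - vj * math.cos(theta_j)
    b = xi - xj
    c = vi * math.sin(theta_i) - vj * math.sin(theta_j)
    d = yi - yj
    rel_speed_sq = a * a + c * c
    if rel_speed_sq == 0:
        # Identical velocity vectors: the distance never changes.
        return math.inf
    disc = rel_speed_sq * r * r - (a * d - b * c) ** 2
    if disc < 0:
        # Assumption: vehicles that never come within range get LET = 0.
        return 0.0
    let = (-(a * b + c * d) + math.sqrt(disc)) / rel_speed_sq
    return max(0.0, let)
```

As a quick check, two vehicles heading the same way, 100 m apart, with the leader 10 m/s faster and a 300 m range, give `link_expiration_time(0, 0, 10, 0, 100, 0, 20, 0, 300)` equal to 20 s: the gap must grow by 200 m at 10 m/s. A pairwise LET matrix built this way could then serve as the similarity input to a medoid-based clustering routine.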