Font Size:
An efficient multicore CPU implementation of the DatabionicSwarm
Last modified: 2024-06-08
Abstract
We present an efficiency improved framework for an algorithm exploiting
swarm intelligence for self-organized clustering. The algorithm is able to cluster numeric data in three computational steps. First, a projection on two dimensions is achieved by defining each datapoint from the dataset as an agent randomly distributed on a polar grid, on which they self-organize themselves iteratively based on scent emission while their moving radius is cooled, finally resulting in local neighborhoods of similar datapoints. The second step computes the Delaunay triangulation of the projected points and weights the graph edges with the distances from the original high dimensional dataspace and computes the shortest paths with the Dijkstra algorithm. The third step applies hierarchical clustering using the shortest paths in the weighted Delaunay graph. The user can decide the number of clusters based on the resulting dendrogram, but also with a landscape visualization technique of the projection visualizing high-dimensional structures on the generalized U-matrix concept. A higher efficiency is achieved with a parallelized vectorization and minimization of number of operations resulting in the full usage of the CPU. The proposed framework
is showed to accelerate the performance of a previously implemented sequential algorithm by a factor over 20.
swarm intelligence for self-organized clustering. The algorithm is able to cluster numeric data in three computational steps. First, a projection on two dimensions is achieved by defining each datapoint from the dataset as an agent randomly distributed on a polar grid, on which they self-organize themselves iteratively based on scent emission while their moving radius is cooled, finally resulting in local neighborhoods of similar datapoints. The second step computes the Delaunay triangulation of the projected points and weights the graph edges with the distances from the original high dimensional dataspace and computes the shortest paths with the Dijkstra algorithm. The third step applies hierarchical clustering using the shortest paths in the weighted Delaunay graph. The user can decide the number of clusters based on the resulting dendrogram, but also with a landscape visualization technique of the projection visualizing high-dimensional structures on the generalized U-matrix concept. A higher efficiency is achieved with a parallelized vectorization and minimization of number of operations resulting in the full usage of the CPU. The proposed framework
is showed to accelerate the performance of a previously implemented sequential algorithm by a factor over 20.
Keywords
Self-organization, Emergence, Unsupervised Learning, Clustering, Swarm Intelligence