Clustering for Infrastructure Observability

Imagine walking through a large botanical garden filled with thousands of plants.
Some grow close together in thick patches, others are scattered in smaller clusters, and a few stand alone in remote corners.

If you were asked to group them, you’d probably do it by how similar they look: color, leaf shape, height, or the type of soil they prefer.
You’d quickly notice that some groups are dense and obvious, while others are sparse and loosely related.
A few plants wouldn’t fit anywhere at all.

That’s essentially what clustering does: it identifies natural groupings within a large, unlabeled space.
It doesn’t need prior knowledge or fixed categories. Instead, it observes how things naturally relate to each other and organises them accordingly.

Some groups are strong and well-defined, others are weaker or short-lived, and some points don’t belong anywhere — they’re outliers.
The goal is simple: find structure inside apparent randomness.

What Is Clustering

Clustering is the process of automatically grouping similar data points.

For engineers, it shows up in everyday tasks:

  • Anomaly detection: grouping normal vs abnormal signals
  • Workload segmentation: clustering GPUs, jobs, or nodes by behaviour
  • Log and metric deduplication: grouping repeating fault patterns
  • Embedding analysis: grouping semantically similar vectors

In short, clustering helps convert noisy telemetry into structured, actionable insight.

How Clustering Works

At its core, every clustering method follows three steps:

  1. Measure similarity: Define how close two data points are using a distance metric (e.g., Euclidean, cosine).
  2. Group related points: Combine points that are close together into clusters.
  3. Evaluate stability: Check how cohesive and distinct each cluster is, and merge or split as needed.

In the end, it’s about finding structure that helps you reason about your data.
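To make the first step concrete, here is a minimal sketch of the two distance metrics mentioned above, using SciPy. The vectors and the metric names (CPU, memory, network) are invented for illustration.

    # Step 1 in isolation: measuring similarity between two data points.
    # The vectors are invented, e.g. two nodes described by
    # (CPU utilisation, memory utilisation, network throughput).
    from scipy.spatial.distance import euclidean, cosine

    node_a = [0.72, 0.65, 0.30]
    node_b = [0.70, 0.65, 0.90]

    # Euclidean distance: absolute difference in position; sensitive to magnitude.
    print("euclidean:", euclidean(node_a, node_b))  # ~0.60

    # Cosine distance (1 - cosine similarity): compares direction rather than
    # magnitude, which often suits embedding vectors better.
    print("cosine:", cosine(node_a, node_b))        # ~0.10

Steps 2 and 3 are where the individual algorithms below differ.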

Two Broad Types of Clustering

There are many algorithms, but most fall into two broad categories:

1. Flat Clustering

Flat algorithms create a single partition of the data — every point belongs to exactly one cluster.
They’re simple and efficient but can struggle when cluster densities vary or when data has complex shapes.
Examples: k-means, k-medoids.
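As a minimal illustration (the two-dimensional points are synthetic stand-ins for per-node metrics), flat clustering with scikit-learn's k-means looks like this:

    # Flat clustering sketch: k-means assigns every point to exactly one cluster.
    # The data is synthetic: two blobs standing in for per-node metrics.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(42)
    quiet_nodes = rng.normal(loc=[0.3, 0.2], scale=0.05, size=(40, 2))
    busy_nodes = rng.normal(loc=[0.8, 0.9], scale=0.05, size=(15, 2))
    X = np.vstack([quiet_nodes, busy_nodes])

    # k must be chosen up front, which is part of why flat methods struggle
    # when the number or density of groups isn't known in advance.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(np.bincount(labels))  # two clusters of sizes 40 and 15 (in some order)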

2. Hierarchical Clustering

Hierarchical algorithms build a tree (dendrogram) of clusters, capturing how groups form, merge, or split at different similarity levels.
This approach helps reveal structure at multiple scales — from fine-grained subgroups to broader categories.
Examples: BIRCH, HDBSCAN.
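A comparable sketch with the hdbscan package (documented in reference 3 below); the synthetic data deliberately mixes a dense group, a looser group, and two stray points:

    # Hierarchical, density-based clustering sketch with HDBSCAN.
    import numpy as np
    import hdbscan

    rng = np.random.default_rng(7)
    X = np.vstack([
        rng.normal(loc=[0.3, 0.2], scale=0.03, size=(40, 2)),  # dense group
        rng.normal(loc=[0.8, 0.9], scale=0.10, size=(20, 2)),  # looser group
        [[0.05, 0.95], [0.95, 0.05]],                          # two stray points
    ])

    clusterer = hdbscan.HDBSCAN(min_cluster_size=5)
    labels = clusterer.fit_predict(X)

    # The number of clusters is not fixed in advance, and points that fit
    # nowhere are labelled -1 (noise), which is convenient for outlier detection.
    print("clusters found:", labels.max() + 1)
    print("noise points:", int((labels == -1).sum()))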

Evaluating Clustering Quality

Because clustering is unsupervised, there’s no single “correct” answer.
Common ways to assess cluster quality include:

  • Silhouette score: measures how well each point fits its own cluster versus neighbouring clusters.
  • Davies–Bouldin index: compares internal cohesion to the separation between clusters.
  • Stability checks: ensure clusters persist across different samples.
  • Visualization: project high-dimensional data into 2D (UMAP or t-SNE) to inspect structure and overlap.
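The first two scores are single calls in scikit-learn. A rough sketch on synthetic data (make_blobs is just a stand-in for real telemetry features):

    # Scoring a clustering result with scikit-learn's built-in metrics.
    from sklearn.datasets import make_blobs
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score, davies_bouldin_score

    X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

    # Silhouette: range [-1, 1]; closer to 1 means tight, well-separated clusters.
    print("silhouette:", silhouette_score(X, labels))

    # Davies-Bouldin: lower is better; it penalises clusters whose internal
    # spread is large relative to the distance between them.
    print("davies-bouldin:", davies_bouldin_score(X, labels))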

How Asama AI Uses Clustering

At Asama, we continuously observe a diverse range of devices to ensure optimal performance and proactively identify potential issues. The sheer volume and complexity of our infrastructure call for automated anomaly detection, and this is where clustering proves invaluable.

We cluster devices using a range of techniques, based on operational metrics, hardware specifications, and other behavioural parameters. By grouping similar devices and their activities, we can effectively identify outliers that deviate significantly from their respective clusters. These deviations often indicate anomalies that could signal performance degradation, security breaches, or other operational issues.

The primary goal of this clustering and anomaly-detection process is to enable logical, efficient remediation. Once an anomaly is detected, our teams can quickly pinpoint the affected devices, understand the nature of the deviation, and take targeted actions to resolve the issue before it escalates.
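As a purely hypothetical sketch (not a description of Asama's production pipeline), one common pattern is to flag devices whose features sit unusually far from their cluster's centroid; the function name, feature layout, and 3-sigma cutoff below are all invented for illustration:

    # Hypothetical cluster-based outlier flagging; illustrative only.
    import numpy as np
    from sklearn.cluster import KMeans

    def flag_outliers(features: np.ndarray, n_clusters: int = 4, z_cutoff: float = 3.0):
        """Return a boolean mask marking devices unusually far from their centroid."""
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
        # Distance from each device to the centroid of its assigned cluster.
        dists = np.linalg.norm(features - km.cluster_centers_[km.labels_], axis=1)
        flags = np.zeros(len(features), dtype=bool)
        for c in range(n_clusters):
            in_c = km.labels_ == c
            mu, sigma = dists[in_c].mean(), dists[in_c].std()
            # Flag devices more than z_cutoff standard deviations beyond the
            # typical centroid distance within their own cluster.
            flags[in_c] = dists[in_c] > mu + z_cutoff * sigma
        return flags

Flagged devices would then feed a triage and remediation workflow like the one described above.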

References

  1. https://www.youtube.com/watch?v=7xHsRkOdVwo
  2. https://www.youtube.com/watch?v=4AW_5nYQkuc
  3. https://hdbscan.readthedocs.io/en/latest/how_hdbscan_works.html

Summary

Clustering is a foundational technique in unsupervised learning, enabling us to discover natural order within complexity.
It helps systems, researchers, and engineers uncover structure where none was labelled before.