Additional PCI Device Metrics in Prometheus Node Exporter

Following up on our previous post, we are adding more features to the pcidevice collector in node_exporter through PR #3425.

This enhancement builds on three key themes:

  • Extended PCI Metrics (powered by PR #748)
  • Translating Numeric IDs into Human-Friendly Names
  • Improved Stability via Nil-Pointer Checks

What’s New

1. Extended PCI Metrics

The pcidevice collector is now enriched with several new fields:

  • NUMA node – identifies which NUMA node the device is attached to.
  • SR-IOV details – reports the number of virtual functions (VFs) currently enabled, the total number of VFs the device supports, and related settings.
  • Driver autoprobe flag – tracks whether driver probing is enabled.
  • Power state & D3Cold – exposes device power state and low-power capability.

Previously, collecting this data required a custom textfile collector. With this update, these attributes become first-class citizens in Node Exporter.

2. ID → Name Conversion

A convenience for anyone reading the metrics: numeric PCI IDs (vendor, device, class) can now optionally be mapped to human-readable names.

Example:

  • Before → {vendor_id="0x8086"}
  • After → {vendor_id="0x8086", vendor_name="Intel Corporation"}

This mapping relies on the system’s pci.ids file (or a user-specified alternative). It’s disabled by default to minimize overhead, but in dashboards, debug logs, and alerts, names are far easier to read than hex IDs.

3. Nil-Pointer Checks

Alongside the new features, PR #3425 adds nil-pointer checks across optional fields in the sysfs.PciDevice struct.

Why this is important:

  • Not all sysfs entries exist consistently across kernels, drivers, or hardware types.
  • Without these checks, the collector could panic when trying to read missing fields.
  • With them, the collector gracefully skips unavailable data, keeping metrics flowing reliably.
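The pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the pciDevice struct is a hypothetical stand-in for sysfs.PciDevice, and its field names and the metric names are invented for the example. The point is only that each optional field is a pointer that is checked before it is dereferenced.

```go
package main

import "fmt"

// pciDevice is a hypothetical stand-in for sysfs.PciDevice. Optional
// attributes are pointers so that "missing in sysfs" is represented as nil.
type pciDevice struct {
	Location    string
	NumaNode    *int32
	SriovNumVFs *uint32
}

// sample is a simplified metric: a name and a value (names are illustrative).
type sample struct {
	Name  string
	Value float64
}

// collect emits one sample per attribute that is actually present and
// silently skips the rest instead of dereferencing a nil pointer.
func collect(dev pciDevice) []sample {
	var out []sample
	if dev.NumaNode != nil {
		out = append(out, sample{"numa_node", float64(*dev.NumaNode)})
	}
	if dev.SriovNumVFs != nil {
		out = append(out, sample{"sriov_numvfs", float64(*dev.SriovNumVFs)})
	}
	return out
}

func main() {
	node := int32(1)
	// This device exposes numa_node but has no SR-IOV attributes.
	dev := pciDevice{Location: "0000:3b:00.0", NumaNode: &node}
	for _, s := range collect(dev) {
		fmt.Printf("%s{device=%q} %g\n", s.Name, dev.Location, s.Value)
	}
}
```

A device with no optional attributes at all simply produces no samples from this function, which is the "gracefully skips unavailable data" behavior in miniature.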

Why This Matters (Especially for Infra Teams)

Better Observability & Context

These enhancements unlock deeper insights into machine hardware, particularly in high-performance, virtualized, or containerized environments:

  • Detect and diagnose NUMA locality issues that impact performance.
  • Monitor power states and wake-up events for PCI devices.
  • Validate and observe SR-IOV setups, critical for NICs and accelerators.
  • Build cleaner dashboards with vendor and device names instead of cryptic hex codes.
  • Rely on robust collectors that won’t break due to missing sysfs entries.

Opt-In by Design

The optional nature of ID-to-name conversion is deliberate: users can enable richer context where needed, without forcing additional dependencies on minimal setups.

Conclusion

At Asama.ai, we believe in strengthening the open-source ecosystem we rely on. Contributing improvements like these ensures broader community benefit and reduces the need to carry private forks.

And yes, we’re always looking for developers who care about infrastructure, observability, and open source.

Cheers,

Jain

jj@asama.ai
