Module: metrics
File: src/spatial_graph_algorithms/metrics/
Status: Stable.
Purpose
Three responsibilities, kept in separate files:
1. Graph structure properties (graph_properties.py): degree, density, transitivity. These work on any SpatialGraph, with or without positions.
2. Graph report (report.py): GraphReport and graph_report(). Pre-reconstruction characterisation: topology, spatial geometry, and false-edge statistics in one object. Works on any SpatialGraph; the spatial section is populated only when positions is set.
3. Reconstruction quality (__init__.py, delegates to reconstruct/quality.py): CPD, KNN preservation, distortion. These require both positions and reconstructed_positions to be set.
The unified entry point evaluate() combines responsibilities 1 and 3 and optionally
writes a CSV row. quality_table() wraps evaluate() for side-by-side comparison of
multiple reconstructions. graph_report() is the entry point for responsibility 2.
What evaluate() Returns
{
# graph structure (always present)
"n_nodes": int,
"n_edges": int,
"min_degree": float,
"max_degree": float,
"mean_degree": float,
"degree_std": float,
"density": float,
"transitivity": float, # global clustering coefficient
"largest_component_fraction": float,
# reconstruction quality (None unless positions and reconstructed_positions are both set)
"cpd": float | None,
"knn": float | None,
"distortion": float | None, # [0, 1]; None unless compute_distortion=True
}
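As a rough illustration of what the graph-structure fields mean, the sketch below derives several of them from a plain undirected edge list. It is not the library's implementation: structure_summary is a hypothetical helper, it assumes a simple undirected graph whose nodes are numbered 0..n-1, and it assumes degree_std is the population standard deviation, which the source does not state.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def structure_summary(n_nodes, edges):
    """Hypothetical helper: degree and density statistics for a simple undirected graph."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Isolated nodes never appear in `edges`, so look up every node id explicitly.
    degrees = [degree[i] for i in range(n_nodes)]
    max_possible_edges = n_nodes * (n_nodes - 1) / 2
    return {
        "n_nodes": n_nodes,
        "n_edges": len(edges),
        "min_degree": float(min(degrees)),
        "max_degree": float(max(degrees)),
        "mean_degree": fmean(degrees),
        "degree_std": pstdev(degrees),  # assumption: population std
        "density": len(edges) / max_possible_edges if max_possible_edges else 0.0,
    }


# A 4-node path graph: 0 - 1 - 2 - 3
summary = structure_summary(4, [(0, 1), (1, 2), (2, 3)])
```

For the path graph, the degrees are 1, 2, 2, 1, so the mean degree is 1.5 and the density is 3 of 6 possible edges.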
Metric Interpretation
| Metric | Range | Excellent | Acceptable | Poor |
|---|---|---|---|---|
| CPD | [0, 1] | > 0.95 | > 0.85 | < 0.7 |
| KNN | [0, 1] | > 0.70 | > 0.50 | < 0.30 |
| Distortion | [0, 1] | < 0.05 | < 0.20 | > 0.40 |
All three are rotation-, reflection-, and translation-invariant — they measure structural fidelity, not absolute coordinate agreement.
CPD is the most interpretable: 1.0 means the reconstructed inter-node distances are perfectly correlated with the original ones, so relative distances are fully preserved. It is sensitive to global layout errors.
KNN preservation is more sensitive to local neighbourhood recovery — it catches reconstruction methods that get the global shape right but scramble local clusters.
Distortion penalises scale errors that CPD (a correlation) does not. The reconstruction is scale-aligned to the ground truth before scoring, so the result is always in [0, 1].
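The sketch below illustrates the invariance and scale behaviour described above. The definitions are assumptions, not taken from reconstruct/quality.py: CPD is taken to be the Pearson correlation between the two vectors of pairwise Euclidean distances, and KNN preservation the mean fraction of each node's k nearest neighbours that are kept. The library's exact formulas may differ.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def cpd(original, reconstructed):
    """Assumed definition: Pearson correlation of the pairwise-distance vectors."""
    x = pairwise_distances(original)
    y = pairwise_distances(reconstructed)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def knn_sets(points, k):
    """Indices of the k nearest neighbours of each point (excluding itself)."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def knn_preservation(original, reconstructed, k):
    """Assumed definition: mean fraction of each node's k neighbours that are kept."""
    a, b = knn_sets(original, k), knn_sets(reconstructed, k)
    return sum(len(s & t) / k for s, t in zip(a, b)) / len(a)


original = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (0.0, 2.0), (4.0, 5.0)]
# Rotate by 90 degrees, scale by 2, translate by (10, -3).
transformed = [(-2.0 * y + 10.0, 2.0 * x - 3.0) for x, y in original]

cpd_score = cpd(original, transformed)
knn_score = knn_preservation(original, transformed, k=2)
```

Under these assumed definitions both scores stay at 1.0 even though every distance has doubled, which is why a correlation-based score cannot detect a uniform scale error on its own.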
quality_table()
quality_table(reconstructions, *, k_neighbors=15) is the recommended way to compare
multiple reconstructions in one call:
from spatial_graph_algorithms.metrics import quality_table
qt = quality_table({"MDS": sg_mds, "STRND": sg_strnd})
# Returns a DataFrame indexed by method with columns CPD, KNN, Distortion.
It always computes distortion (unlike evaluate(), which requires compute_distortion=True).
GraphReport
graph_report(sg) returns a GraphReport object — the recommended entry point for
understanding any SpatialGraph before reconstruction.
from spatial_graph_algorithms.metrics import graph_report
r = graph_report(sg)
r # styled HTML table in Jupyter
r.n_connected_components # topology, always available
r.edge_length_stats # spatial, None when positions absent
r.diameter # on-demand, O(n²), cached after first access
r.plot_degree_distribution() # returns matplotlib Figure
r.to_dict() # flat dict for CSV / pandas pipelines
Always-computed (topology): n_nodes, n_edges, density, mean/min/max_degree,
degree_std, n_connected_components, largest_component_fraction, transitivity,
average_clustering_coefficient, assortativity.
Computed when positions is set (spatial): edge_length_stats (mean/median/std/min/max),
spatial_extent (bounding box + area/volume), local_spatial_density, false_edge_fraction.
On-demand (lazy, cached): diameter, average_path_length, betweenness_centrality_stats.
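The on-demand, cached behaviour can be pictured with functools.cached_property. This is a sketch of the pattern only: LazyReport is hypothetical, the source does not say how GraphReport implements its caching, and the diameter here is computed by breadth-first search on an unweighted, connected graph.

```python
from collections import deque
from functools import cached_property


class LazyReport:
    """Hypothetical stand-in for a report object with one lazy attribute."""

    def __init__(self, adjacency):
        self.adjacency = adjacency  # dict: node -> list of neighbours
        self.diameter_computations = 0  # instrumentation for this demo only

    def _eccentricity(self, source):
        """Longest shortest-path distance (in hops) from `source`, via BFS."""
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in self.adjacency[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        return max(dist.values())

    @cached_property
    def diameter(self):
        # Runs once on first access; the result is then stored on the instance.
        self.diameter_computations += 1
        return max(self._eccentricity(n) for n in self.adjacency)


# Path graph 0 - 1 - 2 - 3
report = LazyReport({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]})
first = report.diameter
second = report.diameter  # served from the cache, no recomputation
```

The counter shows that the expensive computation runs only once, however many times the attribute is read.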
graph_report and evaluate are complementary, not redundant:
| | graph_report() | evaluate() |
|---|---|---|
| When to use | Before reconstruction | After reconstruction |
| Inputs needed | Any SpatialGraph | reconstructed_positions must be set |
| Spatial metrics | Edge lengths, bounding box | CPD, KNN, distortion |
Design Decisions
Why is evaluate() defined in metrics/__init__.py rather than in a module of its own?
It is the public face of the module. Users import from spatial_graph_algorithms.metrics import evaluate
and should not need to know about the internal split between graph_properties.py and
reconstruct/quality.py.
Why does evaluate() import networkx lazily (import networkx as nx inside the function)?
transitivity and largest_component_fraction require constructing a NetworkX graph,
which is potentially slow for large graphs. Importing networkx inside the function keeps
that cost at the call site rather than at module import time. (In practice, sg.graph is
cached, so repeated calls do not rebuild it.)
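The lazy-import pattern looks roughly like the sketch below. To stay self-contained it uses the standard-library statistics module as a stand-in for networkx, and degree_spread is a hypothetical function, not part of the library.

```python
def degree_spread(degrees):
    """Hypothetical function that needs a helper module only when called."""
    # Imported here rather than at module top level, so merely importing
    # the enclosing module does not pay for loading this dependency.
    import statistics  # stand-in for `import networkx as nx`

    return statistics.pstdev(degrees)


spread = degree_spread([1, 2, 2, 1])
```

Python keeps imported modules in sys.modules, so only the first call pays the import cost; later calls reduce to a lookup.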
How to Add a New Graph Property
1. Add a function to metrics/graph_properties.py.
2. Add it to the results dict in metrics/__init__.py::evaluate().
3. Export it from metrics/__init__.py::__all__.
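As a sketch of the first step above, a new property could look like the function below. median_degree is a hypothetical example, and it takes a plain degree sequence because the source does not show how the existing functions in graph_properties.py read degrees from a SpatialGraph.

```python
from statistics import median


def median_degree(degrees):
    """Hypothetical new property: median of a graph's degree sequence.

    `degrees` is any iterable of per-node degrees; an empty graph yields 0.0.
    """
    degrees = list(degrees)
    if not degrees:
        return 0.0
    return float(median(degrees))


# The remaining steps would add a "median_degree" entry to the results dict
# in evaluate() and list the function in __all__ (not shown here).
value = median_degree([1, 2, 2, 5])
```

Returning a float keeps the new entry consistent with the other degree fields that evaluate() reports.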
How to Add a New Reconstruction Metric
See docs/modules/reconstruct.md — quality metrics live in
reconstruct/quality.py and are imported into metrics/__init__.py::evaluate().