Skip to content

spatial_graph_algorithms.metrics

Evaluate reconstruction quality and graph structural properties.

Metric interpretation guide

Metric Range Excellent Good Poor
CPD [0, 1] > 0.95 > 0.85 < 0.7
KNN [0, 1] > 0.7 > 0.5 < 0.3
Distortion* [0, 1] < 0.05 < 0.20 > 0.40

* Distortion is not computed by default — pass compute_distortion=True to enable it. Scale-aligns the reconstruction to ground-truth before scoring; 0.0 = perfect.

All metrics are rotation-, reflection-, and translation-invariant.

Usage

from spatial_graph_algorithms.metrics import evaluate, quality_table

m = evaluate(sg_reconstructed, k_neighbors=10)
print(m["cpd"], m["knn"])

# Include distortion (opt-in, O(n²)):
m = evaluate(sg_reconstructed, compute_distortion=True)

# Optional: save to CSV
evaluate(sg_reconstructed, output_csv="quality.csv")

# Compare multiple reconstructions side-by-side (Jupyter-friendly DataFrame):
qt = quality_table({"MDS": sg_mds, "STRND": sg_strnd})

API Reference

spatial_graph_algorithms.metrics.evaluate(sn, *, k_neighbors=15, compute_distortion=False, output_csv=None)

Compute graph structure properties and reconstruction quality metrics.

Always computes structural properties from the adjacency matrix. When both :attr:~spatial_graph_algorithms.network.SpatialGraph.positions and :attr:~spatial_graph_algorithms.network.SpatialGraph.reconstructed_positions are set, also computes reconstruction quality metrics.

Parameters:

Name Type Description Default
sn SpatialGraph

Input graph. Should have reconstructed_positions set (via :func:spatial_graph_algorithms.reconstruct.reconstruct) for quality metrics to be populated.

required
k_neighbors int

Number of neighbours used in the KNN metric. Default 10.

15
compute_distortion bool

If True, also compute the median relative pairwise-distance distortion. Disabled by default because it is O(n²) and rarely needed. Default False.

False
output_csv str or Path

If given, the result dictionary is written as a one-row CSV to this path. Parent directories are created if needed.

None

Returns:

Type Description
dict

Flat dictionary with the following keys:

Graph structure

  • n_nodes — number of nodes.
  • n_edges — number of undirected edges.
  • min_degree, max_degree, mean_degree, degree_std
  • density — edge density in [0, 1].
  • transitivity — global clustering coefficient.
  • largest_component_fraction — fraction of nodes in the LCC.

Reconstruction quality (None if positions unavailable)

  • cpd — R² of pairwise distances [0, 1].
  • knn — Mean k-NN overlap [0, 1].
  • distortion — Normalised pairwise-distance distortion [0, 1] (None unless compute_distortion=True).

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.reconstruct import reconstruct
>>> from spatial_graph_algorithms.metrics import evaluate
>>> sn = generate(n=300, seed=42)
>>> sn_rec = reconstruct(sn, method="mds", seed=42)
>>> m = evaluate(sn_rec)
>>> m["cpd"] > 0.7
True
Source code in src/spatial_graph_algorithms/metrics/__init__.py
def evaluate(
    sn: SpatialGraph,
    *,
    k_neighbors: int = 15,
    compute_distortion: bool = False,
    output_csv: str | Path | None = None,
) -> dict:
    """Compute graph structure properties and reconstruction quality metrics.

    Always computes structural properties from the adjacency matrix.  When
    both :attr:`~spatial_graph_algorithms.network.SpatialGraph.positions` and
    :attr:`~spatial_graph_algorithms.network.SpatialGraph.reconstructed_positions` are
    set, also computes reconstruction quality metrics.

    Parameters
    ----------
    sn : SpatialGraph
        Input graph.  Should have *reconstructed_positions* set (via
        :func:`spatial_graph_algorithms.reconstruct.reconstruct`) for quality metrics to
        be populated.
    k_neighbors : int
        Number of neighbours used in the KNN metric.  Default 10.
    compute_distortion : bool
        If ``True``, also compute the median relative pairwise-distance
        distortion.  Disabled by default because it is O(n²) and rarely
        needed.  Default ``False``.
    output_csv : str or Path, optional
        If given, the result dictionary is written as a one-row CSV to this
        path.  Parent directories are created if needed.

    Returns
    -------
    dict
        Flat dictionary with the following keys:

        **Graph structure**

        - ``n_nodes`` — number of nodes.
        - ``n_edges`` — number of undirected edges.
        - ``min_degree``, ``max_degree``, ``mean_degree``, ``degree_std``
        - ``density`` — edge density in ``[0, 1]``.
        - ``transitivity`` — global clustering coefficient.
        - ``largest_component_fraction`` — fraction of nodes in the LCC.

        **Reconstruction quality** (``None`` if positions unavailable)

        - ``cpd`` — R² of pairwise distances ``[0, 1]``.
        - ``knn`` — Mean k-NN overlap ``[0, 1]``.
        - ``distortion`` — Normalised pairwise-distance distortion ``[0, 1]``
          (``None`` unless ``compute_distortion=True``).

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.reconstruct import reconstruct
    >>> from spatial_graph_algorithms.metrics import evaluate
    >>> sn = generate(n=300, seed=42)
    >>> sn_rec = reconstruct(sn, method="mds", seed=42)
    >>> m = evaluate(sn_rec)
    >>> m["cpd"] > 0.7
    True
    """
    import networkx as nx

    results = graph_summary(sn)

    seq = degree_sequence(sn)
    results["degree_std"] = float(np.std(seq)) if len(seq) else 0.0

    G = sn.graph
    results["transitivity"] = float(nx.transitivity(G))
    n_lcc = max((len(c) for c in nx.connected_components(G)), default=0)
    results["largest_component_fraction"] = (
        n_lcc / results["n_nodes"] if results["n_nodes"] > 0 else 0.0
    )

    if sn.positions is not None and sn.reconstructed_positions is not None:
        pos   = sn.positions
        recon = sn.reconstructed_positions
        # Restrict to rows where both arrays are fully valid (no NaN).
        # For real bipartite data, bead rows have NaN positions; this ensures
        # metrics are computed only on the positioned (cell) nodes.
        valid = ~(np.isnan(pos).any(axis=1) | np.isnan(recon).any(axis=1))
        if valid.any():
            results["cpd"] = cpd(pos[valid], recon[valid])
            results["knn"] = knn(pos[valid], recon[valid], k=k_neighbors)
            results["distortion"] = (
                distortion(pos[valid], recon[valid]) if compute_distortion else None
            )
        else:
            results["cpd"] = None
            results["knn"] = None
            results["distortion"] = None
    else:
        results["cpd"] = None
        results["knn"] = None
        results["distortion"] = None

    if output_csv is not None:
        out = Path(output_csv)
        out.parent.mkdir(parents=True, exist_ok=True)
        pd.DataFrame([results]).to_csv(out, index=False)

    return results

spatial_graph_algorithms.metrics.evaluate_denoising(sg, *, scores=None, method=None)

Evaluate denoising quality against ground-truth is_false labels.

Compares the is_removed column (set by :class:~spatial_graph_algorithms.denoise.EdgeFilterer) against the is_false column (set by :func:~spatial_graph_algorithms.simulate.generate) to compute binary classification metrics.

Parameters:

Name Type Description Default
sg SpatialGraph

Graph after denoising. Must have edge_metadata with both an is_false column (ground truth) and an is_removed column (set by the filterer).

required
scores dict

Continuous edge scores from :class:~spatial_graph_algorithms.denoise.EdgeScorer. When provided, AUC-ROC and AUC-PR are also computed from the raw scores rather than from the binary is_removed predictions.

None
method str

Scoring method that produced scores. Required when scores is provided. Score polarity is looked up in :data:spatial_graph_algorithms.denoise.SCORE_POLARITY.

None

Returns:

Type Description
dict or None

None when ground truth is unavailable (is_false or is_removed columns missing). Otherwise a flat dict with:

  • precision — fraction of removed edges that are truly false.
  • recall — fraction of false edges that were removed.
  • f1 — harmonic mean of precision and recall.
  • n_removed — count of edges removed.
  • n_true_false_edges — count of false edges in the graph.
  • confusion_matrix — 2×2 array [[TN, FP], [FN, TP]].
  • auc_roc — area under ROC curve (None when scores absent).
  • auc_pr — area under precision-recall curve (None when scores absent).

Raises:

Type Description
ValueError

If scores is provided without method, or method is unknown.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.denoise import denoise
>>> from spatial_graph_algorithms.metrics import evaluate_denoising
>>> sg = generate(n=200, false_edges_fraction=0.10, seed=42)
>>> sg_clean = denoise(sg, method="jaccard", fraction_to_remove=0.10)
>>> report = evaluate_denoising(sg_clean)
>>> report is not None and "f1" in report
True
Source code in src/spatial_graph_algorithms/metrics/denoise_quality.py
def evaluate_denoising(
    sg: SpatialGraph,
    *,
    scores: dict[tuple[int, int], float] | None = None,
    method: str | None = None,
) -> dict | None:
    """Evaluate denoising quality against ground-truth ``is_false`` labels.

    Compares the ``is_removed`` column (set by
    :class:`~spatial_graph_algorithms.denoise.EdgeFilterer`) against the
    ``is_false`` column (set by
    :func:`~spatial_graph_algorithms.simulate.generate`) to compute binary
    classification metrics.

    Parameters
    ----------
    sg : SpatialGraph
        Graph after denoising.  Must have ``edge_metadata`` with both an
        ``is_false`` column (ground truth) and an ``is_removed`` column
        (set by the filterer).
    scores : dict, optional
        Continuous edge scores from
        :class:`~spatial_graph_algorithms.denoise.EdgeScorer`.  When
        provided, AUC-ROC and AUC-PR are also computed from the raw scores
        rather than from the binary ``is_removed`` predictions.
    method : str, optional
        Scoring method that produced *scores*.  Required when *scores* is
        provided.  Score polarity is looked up in
        :data:`spatial_graph_algorithms.denoise.SCORE_POLARITY`.

    Returns
    -------
    dict or None
        ``None`` when ground truth is unavailable (``is_false`` or
        ``is_removed`` columns missing).  Otherwise a flat dict with:

        - ``precision`` — fraction of removed edges that are truly false.
        - ``recall`` — fraction of false edges that were removed.
        - ``f1`` — harmonic mean of precision and recall.
        - ``n_removed`` — count of edges removed.
        - ``n_true_false_edges`` — count of false edges in the graph.
        - ``confusion_matrix`` — 2×2 array ``[[TN, FP], [FN, TP]]``.
        - ``auc_roc`` — area under ROC curve (``None`` when *scores* absent).
        - ``auc_pr`` — area under precision-recall curve (``None`` when
          *scores* absent).

    Raises
    ------
    ValueError
        If *scores* is provided without *method*, or *method* is unknown.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.denoise import denoise
    >>> from spatial_graph_algorithms.metrics import evaluate_denoising
    >>> sg = generate(n=200, false_edges_fraction=0.10, seed=42)
    >>> sg_clean = denoise(sg, method="jaccard", fraction_to_remove=0.10)
    >>> report = evaluate_denoising(sg_clean)
    >>> report is not None and "f1" in report
    True
    """
    if sg.edge_metadata is None:
        return None
    cols = set(sg.edge_metadata.columns)
    if "is_false" not in cols or "is_removed" not in cols:
        return None

    from sklearn.metrics import (
        confusion_matrix,
        f1_score,
        precision_score,
        recall_score,
    )

    y_true = sg.edge_metadata["is_false"].astype(int).to_numpy()
    y_pred = sg.edge_metadata["is_removed"].astype(int).to_numpy()

    result: dict = {
        "precision": float(precision_score(y_true, y_pred, zero_division=0)),
        "recall": float(recall_score(y_true, y_pred, zero_division=0)),
        "f1": float(f1_score(y_true, y_pred, zero_division=0)),
        "n_removed": int(y_pred.sum()),
        "n_true_false_edges": int(y_true.sum()),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
        "auc_roc": None,
        "auc_pr": None,
    }

    if scores is not None and len(scores) > 0:
        if method is None:
            raise ValueError("method is required when scores are provided.")
        from sklearn.metrics import average_precision_score, roc_auc_score

        n64 = np.int64(sg.adjacency_matrix.shape[0])
        sources = sg.edge_metadata["source"].values.astype(np.int64)
        targets = sg.edge_metadata["target"].values.astype(np.int64)
        lo = np.minimum(sources, targets)
        hi = np.maximum(sources, targets)
        encoded_meta = lo * n64 + hi

        score_by_code: dict[np.int64, float] = {
            np.int64(u) * n64 + np.int64(v): s
            for (u, v), s in scores.items()
        }
        y_score_raw = np.array(
            [score_by_code.get(int(e), float("nan")) for e in encoded_meta]
        )
        # Flip so that higher always means "more suspicious" for ROC/PR.
        y_score = y_score_raw if _score_direction(method) else -y_score_raw
        valid = ~np.isnan(y_score)
        if valid.sum() > 1 and y_true[valid].sum() > 0:
            result["auc_roc"] = float(roc_auc_score(y_true[valid], y_score[valid]))
            result["auc_pr"] = float(
                average_precision_score(y_true[valid], y_score[valid])
            )

    return result

spatial_graph_algorithms.metrics.graph_report(sg)

Return a GraphReport characterizing the given SpatialGraph.

Parameters:

Name Type Description Default
sg SpatialGraph

Input graph. May or may not have ground-truth positions.

required

Returns:

Type Description
GraphReport

Report with topology properties always populated and spatial properties populated when sg.positions is not None.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.metrics import graph_report
>>> sg = generate(n=200, seed=0)
>>> r = graph_report(sg)
>>> r.n_nodes
200
Source code in src/spatial_graph_algorithms/metrics/report.py
def graph_report(sg: SpatialGraph) -> GraphReport:
    """Return a GraphReport characterizing the given SpatialGraph.

    Parameters
    ----------
    sg : SpatialGraph
        Input graph. May or may not have ground-truth positions.

    Returns
    -------
    GraphReport
        Report with topology properties always populated and spatial
        properties populated when ``sg.positions`` is not ``None``.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.metrics import graph_report
    >>> sg = generate(n=200, seed=0)
    >>> r = graph_report(sg)
    >>> r.n_nodes
    200
    """
    return GraphReport(sg)

spatial_graph_algorithms.metrics.GraphReport

Structural and spatial characterization of a SpatialGraph.

Instantiation computes all fast properties immediately. Expensive metrics (diameter, average path length, betweenness) are lazy and cached on first access via :func:functools.cached_property.

Parameters:

Name Type Description Default
sg SpatialGraph

Input graph. May or may not have ground-truth positions.

required

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.metrics import graph_report
>>> sg = generate(n=200, seed=0)
>>> r = graph_report(sg)
>>> r.n_nodes
200
>>> r.edge_length_stats["mean"] > 0
True
Source code in src/spatial_graph_algorithms/metrics/report.py
class GraphReport:
    """Structural and spatial characterization of a SpatialGraph.

    Instantiation computes all fast properties immediately. Expensive metrics
    (diameter, average path length, betweenness) are lazy and cached on first
    access via :func:`functools.cached_property`.

    Parameters
    ----------
    sg : SpatialGraph
        Input graph. May or may not have ground-truth positions.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.metrics import graph_report
    >>> sg = generate(n=200, seed=0)
    >>> r = graph_report(sg)
    >>> r.n_nodes
    200
    >>> r.edge_length_stats["mean"] > 0
    True
    """

    def __init__(self, sg: SpatialGraph) -> None:
        import networkx as nx

        self._sg = sg
        G = sg.graph
        seq = degree_sequence(sg)
        n = int(sg.adjacency_matrix.shape[0])
        m = int(sg.adjacency_matrix.nnz // 2)

        # --- Topology (always computed) ---
        self.n_nodes: int = n
        self.n_edges: int = m
        max_edges = (n * (n - 1)) / 2 if n > 1 else 1
        self.density: float = float(m / max_edges)
        self.mean_degree: float = float(np.mean(seq)) if len(seq) else 0.0
        self.min_degree: float = float(np.min(seq)) if len(seq) else 0.0
        self.max_degree: float = float(np.max(seq)) if len(seq) else 0.0
        self.degree_std: float = float(np.std(seq)) if len(seq) else 0.0

        components = list(nx.connected_components(G))
        self.n_connected_components: int = len(components)
        lcc_size = max((len(c) for c in components), default=0)
        self.largest_component_fraction: float = lcc_size / n if n > 0 else 0.0
        self.transitivity: float = float(nx.transitivity(G))
        self.average_clustering_coefficient: float = float(nx.average_clustering(G))
        try:
            self.assortativity: float = float(nx.degree_assortativity_coefficient(G))
        except (ZeroDivisionError, nx.NetworkXError):
            self.assortativity = float("nan")

        # --- Spatial / ground truth (only when positions are set) ---
        if sg.positions is not None:
            self.edge_length_stats: dict | None = _edge_length_stats(sg)
            self.spatial_extent: dict | None = _spatial_extent(sg.positions)
            _area = self.spatial_extent.get("area") or self.spatial_extent.get("volume")
            self.local_spatial_density: float | None = (
                float(n / _area) if _area and _area > 0 else None
            )
            self.false_edge_fraction: float | None = _false_edge_fraction(sg)
        else:
            self.edge_length_stats = None
            self.spatial_extent = None
            self.local_spatial_density = None
            self.false_edge_fraction = None

    # ------------------------------------------------------------------
    # On-demand (lazy, cached after first access)
    # ------------------------------------------------------------------

    @functools.cached_property
    def diameter(self) -> float:
        """Diameter of the largest connected component.

        O(n²) — may be slow for large graphs.
        """
        import networkx as nx

        G = self._sg.graph
        lcc = G.subgraph(max(nx.connected_components(G), key=len)).copy()
        return float(nx.diameter(lcc))

    @functools.cached_property
    def average_path_length(self) -> float:
        """Mean shortest-path length in the largest connected component.

        O(n²) — may be slow for large graphs.
        """
        import networkx as nx

        G = self._sg.graph
        lcc = G.subgraph(max(nx.connected_components(G), key=len)).copy()
        return float(nx.average_shortest_path_length(lcc))

    @functools.cached_property
    def betweenness_centrality_stats(self) -> dict:
        """Min / max / mean betweenness centrality over all nodes.

        O(n · m) — may be slow for large graphs.

        Returns
        -------
        dict
            Keys: ``min``, ``max``, ``mean``.
        """
        import networkx as nx

        bc = np.array(list(nx.betweenness_centrality(self._sg.graph).values()))
        return {
            "min": float(np.min(bc)),
            "max": float(np.max(bc)),
            "mean": float(np.mean(bc)),
        }

    # ------------------------------------------------------------------
    # Output / display
    # ------------------------------------------------------------------

    def to_dict(self) -> dict:
        """Return a flat dictionary of all default properties.

        Returns
        -------
        dict
            Topology keys are always present. Spatial keys are ``None`` when
            the graph has no ground-truth positions.
        """
        d: dict = {
            "n_nodes": self.n_nodes,
            "n_edges": self.n_edges,
            "density": self.density,
            "mean_degree": self.mean_degree,
            "min_degree": self.min_degree,
            "max_degree": self.max_degree,
            "degree_std": self.degree_std,
            "n_connected_components": self.n_connected_components,
            "largest_component_fraction": self.largest_component_fraction,
            "transitivity": self.transitivity,
            "average_clustering_coefficient": self.average_clustering_coefficient,
            "assortativity": self.assortativity,
        }
        if self.edge_length_stats is not None:
            for k, v in self.edge_length_stats.items():
                d[f"edge_length_{k}"] = v
        else:
            for k in ("mean", "median", "std", "min", "max"):
                d[f"edge_length_{k}"] = None
        if self.spatial_extent is not None:
            d.update(self.spatial_extent)
        d["local_spatial_density"] = self.local_spatial_density
        d["false_edge_fraction"] = self.false_edge_fraction
        return d

    def __repr__(self) -> str:
        has_spatial = self.edge_length_stats is not None
        return (
            f"GraphReport(n={self.n_nodes}, m={self.n_edges}, "
            f"components={self.n_connected_components}, "
            f"spatial={'yes' if has_spatial else 'no'})"
        )

    def _repr_html_(self) -> str:
        def _row(label: str, value: object) -> str:
            return (
                f"<tr>"
                f"<td style='padding:2px 10px;'>{label}</td>"
                f"<td style='padding:2px 10px;'><b>{value}</b></td>"
                f"</tr>"
            )

        def _section(title: str) -> str:
            return (
                f"<tr><td colspan='2' style='"
                f"padding:4px 10px;background:#f0f0f0;font-weight:bold;'>"
                f"{title}</td></tr>"
            )

        rows = [
            "<table style='border-collapse:collapse;font-family:monospace;font-size:13px;'>",
            _section("Topology"),
            _row("Nodes", self.n_nodes),
            _row("Edges", self.n_edges),
            _row("Density", f"{self.density:.4f}"),
            _row("Mean degree", f"{self.mean_degree:.2f}"),
            _row("Min / Max degree", f"{int(self.min_degree)} / {int(self.max_degree)}"),
            _row("Degree std", f"{self.degree_std:.2f}"),
            _row("Connected components", self.n_connected_components),
            _row("Largest component fraction", f"{self.largest_component_fraction:.3f}"),
            _row("Transitivity", f"{self.transitivity:.4f}"),
            _row("Avg clustering coeff", f"{self.average_clustering_coefficient:.4f}"),
            _row("Assortativity", f"{self.assortativity:.4f}"),
        ]

        if self.edge_length_stats is not None:
            el = self.edge_length_stats
            rows += [
                _section("Spatial (ground truth)"),
                _row("Edge length mean ± std", f"{el['mean']:.4f} ± {el['std']:.4f}"),
                _row("Edge length min / max", f"{el['min']:.4f} / {el['max']:.4f}"),
                _row("Edge length median", f"{el['median']:.4f}"),
            ]
            if self.spatial_extent is not None:
                area = self.spatial_extent.get("area") or self.spatial_extent.get("volume")
                label = "Area" if "area" in self.spatial_extent else "Volume"
                rows.append(_row(label, f"{area:.4f}" if area is not None else "n/a"))
            if self.local_spatial_density is not None:
                rows.append(_row("Local spatial density", f"{self.local_spatial_density:.4f}"))
            if self.false_edge_fraction is not None:
                rows.append(_row("False edge fraction", f"{self.false_edge_fraction:.4f}"))

        rows.append("</table>")
        return "".join(rows)

    # ------------------------------------------------------------------
    # Plots
    # ------------------------------------------------------------------

    def plot_degree_distribution(self, *, log_scale: bool = False, **kwargs) -> plt.Figure:
        """Return a bar chart of the degree distribution.

        Parameters
        ----------
        log_scale : bool
            If ``True``, set the y-axis to log scale. Default is ``False``.
        **kwargs
            Passed through to :func:`matplotlib.axes.Axes.bar`.

        Returns
        -------
        matplotlib.figure.Figure
        """
        import matplotlib.pyplot as plt

        dist = degree_distribution(self._sg)
        degrees = sorted(dist["counts"].keys())
        counts = [dist["counts"][d] for d in degrees]

        fig, ax = plt.subplots()
        ax.bar(degrees, counts, color="#4C72B0", **kwargs)
        ax.set_xlabel("Degree")
        ax.set_ylabel("Count")
        ax.set_title("Degree Distribution")
        if log_scale:
            ax.set_yscale("log")
        fig.tight_layout()
        return fig

    def plot_edge_length_distribution(self, *, log_scale: bool = False, **kwargs) -> plt.Figure:
        """Return a histogram of edge lengths.

        Requires ground-truth positions.

        Parameters
        ----------
        log_scale : bool
            If ``True``, set the y-axis to log scale. Default is ``False``.
        **kwargs
            Passed through to :func:`matplotlib.axes.Axes.hist`.

        Returns
        -------
        matplotlib.figure.Figure

        Raises
        ------
        ValueError
            If the graph has no ground-truth positions.
        """
        if self._sg.positions is None:
            raise ValueError(
                "plot_edge_length_distribution requires ground-truth positions "
                "(sg.positions is None)."
            )
        import matplotlib.pyplot as plt

        lengths = _edge_lengths(self._sg)
        fig, ax = plt.subplots()
        ax.hist(lengths, bins=30, color="#55A868", **kwargs)
        ax.set_xlabel("Edge length")
        ax.set_ylabel("Count")
        ax.set_title("Edge Length Distribution")
        if log_scale:
            ax.set_yscale("log")
        fig.tight_layout()
        return fig

Attributes

diameter cached property

Diameter of the largest connected component.

O(n²) — may be slow for large graphs.

average_path_length cached property

Mean shortest-path length in the largest connected component.

O(n²) — may be slow for large graphs.

betweenness_centrality_stats cached property

Min / max / mean betweenness centrality over all nodes.

O(n · m) — may be slow for large graphs.

Returns:

Type Description
dict

Keys: min, max, mean.

Functions

to_dict()

Return a flat dictionary of all default properties.

Returns:

Type Description
dict

Topology keys are always present. Spatial keys are None when the graph has no ground-truth positions.

Source code in src/spatial_graph_algorithms/metrics/report.py
def to_dict(self) -> dict:
    """Return a flat dictionary of all default properties.

    Returns
    -------
    dict
        Topology keys are always present. Spatial keys are ``None`` when
        the graph has no ground-truth positions.
    """
    d: dict = {
        "n_nodes": self.n_nodes,
        "n_edges": self.n_edges,
        "density": self.density,
        "mean_degree": self.mean_degree,
        "min_degree": self.min_degree,
        "max_degree": self.max_degree,
        "degree_std": self.degree_std,
        "n_connected_components": self.n_connected_components,
        "largest_component_fraction": self.largest_component_fraction,
        "transitivity": self.transitivity,
        "average_clustering_coefficient": self.average_clustering_coefficient,
        "assortativity": self.assortativity,
    }
    if self.edge_length_stats is not None:
        for k, v in self.edge_length_stats.items():
            d[f"edge_length_{k}"] = v
    else:
        for k in ("mean", "median", "std", "min", "max"):
            d[f"edge_length_{k}"] = None
    if self.spatial_extent is not None:
        d.update(self.spatial_extent)
    d["local_spatial_density"] = self.local_spatial_density
    d["false_edge_fraction"] = self.false_edge_fraction
    return d

plot_degree_distribution(*, log_scale=False, **kwargs)

Return a bar chart of the degree distribution.

Parameters:

Name Type Description Default
log_scale bool

If True, set the y-axis to log scale. Default is False.

False
**kwargs

Passed through to :func:matplotlib.axes.Axes.bar.

{}

Returns:

Type Description
Figure
Source code in src/spatial_graph_algorithms/metrics/report.py
def plot_degree_distribution(self, *, log_scale: bool = False, **kwargs) -> plt.Figure:
    """Return a bar chart of the degree distribution.

    Parameters
    ----------
    log_scale : bool
        If ``True``, set the y-axis to log scale. Default is ``False``.
    **kwargs
        Passed through to :func:`matplotlib.axes.Axes.bar`.

    Returns
    -------
    matplotlib.figure.Figure
    """
    import matplotlib.pyplot as plt

    dist = degree_distribution(self._sg)
    degrees = sorted(dist["counts"].keys())
    counts = [dist["counts"][d] for d in degrees]

    fig, ax = plt.subplots()
    ax.bar(degrees, counts, color="#4C72B0", **kwargs)
    ax.set_xlabel("Degree")
    ax.set_ylabel("Count")
    ax.set_title("Degree Distribution")
    if log_scale:
        ax.set_yscale("log")
    fig.tight_layout()
    return fig

plot_edge_length_distribution(*, log_scale=False, **kwargs)

Return a histogram of edge lengths.

Requires ground-truth positions.

Parameters:

Name Type Description Default
log_scale bool

If True, set the y-axis to log scale. Default is False.

False
**kwargs

Passed through to :func:matplotlib.axes.Axes.hist.

{}

Returns:

Type Description
Figure

Raises:

Type Description
ValueError

If the graph has no ground-truth positions.

Source code in src/spatial_graph_algorithms/metrics/report.py
def plot_edge_length_distribution(self, *, log_scale: bool = False, **kwargs) -> plt.Figure:
    """Return a histogram of edge lengths.

    Requires ground-truth positions.

    Parameters
    ----------
    log_scale : bool
        If ``True``, set the y-axis to log scale. Default is ``False``.
    **kwargs
        Passed through to :func:`matplotlib.axes.Axes.hist`.

    Returns
    -------
    matplotlib.figure.Figure

    Raises
    ------
    ValueError
        If the graph has no ground-truth positions.
    """
    if self._sg.positions is None:
        raise ValueError(
            "plot_edge_length_distribution requires ground-truth positions "
            "(sg.positions is None)."
        )
    import matplotlib.pyplot as plt

    lengths = _edge_lengths(self._sg)
    fig, ax = plt.subplots()
    ax.hist(lengths, bins=30, color="#55A868", **kwargs)
    ax.set_xlabel("Edge length")
    ax.set_ylabel("Count")
    ax.set_title("Edge Length Distribution")
    if log_scale:
        ax.set_yscale("log")
    fig.tight_layout()
    return fig

spatial_graph_algorithms.metrics.graph_summary(sn)

Return a high-level structural summary of the graph.

Parameters:

Name Type Description Default
sn SpatialGraph

Input graph.

required

Returns:

Type Description
dict

Dictionary with keys:

  • n_nodes — number of nodes.
  • n_edges — number of undirected edges.
  • min_degree — minimum node degree.
  • max_degree — maximum node degree.
  • mean_degree — mean node degree.
  • density — edge density in [0, 1] (2 * n_edges / (n * (n-1))).

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.metrics import graph_summary
>>> sn = generate(n=200, seed=0)
>>> s = graph_summary(sn)
>>> s["n_nodes"]
200
Source code in src/spatial_graph_algorithms/metrics/graph_properties.py
def graph_summary(sn: SpatialGraph) -> dict[str, int | float]:
    """Return a high-level structural summary of the graph.

    Parameters
    ----------
    sn : SpatialGraph
        Input graph.

    Returns
    -------
    dict
        Dictionary with keys:

        - ``n_nodes`` — number of nodes.
        - ``n_edges`` — number of undirected edges.
        - ``min_degree`` — minimum node degree.
        - ``max_degree`` — maximum node degree.
        - ``mean_degree`` — mean node degree.
        - ``density`` — edge density in ``[0, 1]``
          (``2 * n_edges / (n * (n-1))``).

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.metrics import graph_summary
    >>> sn = generate(n=200, seed=0)
    >>> s = graph_summary(sn)
    >>> s["n_nodes"]
    200
    """
    seq = degree_sequence(sn)
    n = sn.adjacency_matrix.shape[0]
    edge_count = int(sn.adjacency_matrix.nnz // 2)
    max_edges = (n * (n - 1)) / 2 if n > 1 else 1
    return {
        "n_nodes": int(n),
        "n_edges": edge_count,
        "min_degree": float(np.min(seq)) if len(seq) else 0.0,
        "max_degree": float(np.max(seq)) if len(seq) else 0.0,
        "mean_degree": float(np.mean(seq)) if len(seq) else 0.0,
        "density": float(edge_count / max_edges) if max_edges else 0.0,
    }

spatial_graph_algorithms.metrics.degree_sequence(sn)

Return the per-node degree sequence.

Parameters:

Name Type Description Default
sn SpatialGraph

Input graph.

required

Returns:

Type Description
ndarray

Float array of shape (n_nodes,) where entry i is the degree of node i.

Source code in src/spatial_graph_algorithms/metrics/graph_properties.py
def degree_sequence(sn: SpatialGraph) -> np.ndarray:
    """Return the per-node degree sequence.

    Parameters
    ----------
    sn : SpatialGraph
        Input graph.

    Returns
    -------
    np.ndarray
        Float array of shape ``(n_nodes,)`` where entry *i* is the degree of
        node *i*.
    """
    deg = np.asarray(sn.adjacency_matrix.sum(axis=1)).ravel()
    return deg.astype(float)

spatial_graph_algorithms.metrics.quality_table(reconstructions, *, k_neighbors=15)

Build a quality-metrics table for one or more reconstructions.

Calls :func:evaluate on each entry in reconstructions and collects CPD, KNN, and Distortion into a tidy DataFrame indexed by method label.

Parameters:

Name Type Description Default
reconstructions dict[str, SpatialGraph]

Mapping of method label → reconstructed :class:~spatial_graph_algorithms.network.SpatialGraph.

required
k_neighbors int

Passed to :func:evaluate for the KNN metric. Default 15.

15

Returns:

Type Description
DataFrame

DataFrame with columns CPD, KNN, Distortion and the method label as the index.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.reconstruct import reconstruct
>>> from spatial_graph_algorithms.metrics import quality_table
>>> sg = generate(n=300, seed=42)
>>> qt = quality_table({"MDS": reconstruct(sg, method="mds", seed=42)})
>>> qt.columns.tolist()
['CPD', 'KNN', 'Distortion']
Source code in src/spatial_graph_algorithms/metrics/__init__.py
def quality_table(
    reconstructions: dict[str, SpatialGraph],
    *,
    k_neighbors: int = 15,
) -> pd.DataFrame:
    """Build a quality-metrics table for one or more reconstructions.

    Calls :func:`evaluate` on each entry in *reconstructions* and collects
    CPD, KNN, and Distortion into a tidy DataFrame indexed by method label.

    Parameters
    ----------
    reconstructions : dict[str, SpatialGraph]
        Mapping of method label → reconstructed
        :class:`~spatial_graph_algorithms.network.SpatialGraph`.
    k_neighbors : int
        Passed to :func:`evaluate` for the KNN metric.  Default 15.

    Returns
    -------
    pd.DataFrame
        DataFrame with columns ``CPD``, ``KNN``, ``Distortion`` and the
        method label as the index.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.reconstruct import reconstruct
    >>> from spatial_graph_algorithms.metrics import quality_table
    >>> sg = generate(n=300, seed=42)
    >>> qt = quality_table({"MDS": reconstruct(sg, method="mds", seed=42)})
    >>> qt.columns.tolist()
    ['CPD', 'KNN', 'Distortion']
    """
    rows = []
    for label, sg_rec in reconstructions.items():
        m = evaluate(sg_rec, k_neighbors=k_neighbors, compute_distortion=True)
        rows.append({
            "Method": label, "CPD": m["cpd"], "KNN": m["knn"], "Distortion": m["distortion"],
        })
    return pd.DataFrame(rows).set_index("Method")

spatial_graph_algorithms.metrics.degree_distribution(sn)

Compute the degree distribution as counts and proportions.

Parameters:

Name Type Description Default
sn SpatialGraph

Input graph.

required

Returns:

Type Description
dict

A dictionary with two sub-dicts:

  • "counts"{degree: node_count} mapping.
  • "proportions"{degree: fraction} mapping. Proportions sum to 1.0.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.metrics import degree_distribution
>>> sn = generate(n=100, seed=0)
>>> dist = degree_distribution(sn)
>>> sum(dist["proportions"].values())
1.0
Source code in src/spatial_graph_algorithms/metrics/graph_properties.py
def degree_distribution(sn: SpatialGraph) -> dict:
    """Compute the degree distribution as counts and proportions.

    Parameters
    ----------
    sn : SpatialGraph
        Input graph.

    Returns
    -------
    dict
        A dictionary with two sub-dicts:

        - ``"counts"`` — ``{degree: node_count}`` mapping.
        - ``"proportions"`` — ``{degree: fraction}`` mapping.  Proportions
          sum to 1.0.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.metrics import degree_distribution
    >>> sn = generate(n=100, seed=0)
    >>> dist = degree_distribution(sn)
    >>> sum(dist["proportions"].values())
    1.0
    """
    seq = degree_sequence(sn).astype(int)
    if len(seq) == 0:
        return {"counts": {}, "proportions": {}}
    unique, counts = np.unique(seq, return_counts=True)
    total = counts.sum()
    cdict = {int(k): int(v) for k, v in zip(unique, counts)}
    pdict = {int(k): float(v / total) for k, v in zip(unique, counts)}
    return {"counts": cdict, "proportions": pdict}