Module: verify

File: src/spatial_graph_algorithms/verify/
Status: Stable.

Purpose

run_report() is a convenience wrapper that runs the full pipeline in one call and writes all output artefacts to a timestamped directory.

It is not a new algorithm: it calls simulate, reconstruct, metrics, and plot in sequence. It exists so that users can get a complete output folder without writing boilerplate pipeline code.

What `run_report()` Does

generate(simulation_kwargs)
    ↓
render_simulation_visualization_bundle()  → verify_network.png, verify_edge_length_histogram.png
    ↓
degree_distribution()                     → degree_distribution.csv
    ↓
graph_summary() + shortest_path_stats() + false_edge_stats()
    ↓
run_parameters.csv
    ↓
For each method in reconstruct_methods:
    reconstruct(method)
        ↓
    evaluate()                            → reconstruction_quality.csv
        ↓
    plot_comparison()                     → comparison_{method}.png
    ↓
report.csv
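
The degree_distribution() step in the flow above writes per-degree counts and proportions. As a rough illustration of what such a table contains, here is a minimal sketch; the function name degree_distribution_sketch, the edge-list input, and the column names are assumptions made for this example and are not taken from the package.

```python
from collections import Counter


def degree_distribution_sketch(edges, n_nodes):
    """Return one row per observed degree for an undirected edge list.

    Hypothetical helper: the real degree_distribution() may take a
    different input type and use different column names.
    Assumes n_nodes > 0 and node ids in range(n_nodes).
    """
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree)
    return [
        {"degree": d, "count": c, "proportion": c / n_nodes}
        for d, c in sorted(counts.items())
    ]


# Path graph 0-1-2-3: the two end nodes have degree 1, the inner two degree 2.
rows = degree_distribution_sketch([(0, 1), (1, 2), (2, 3)], n_nodes=4)
```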

Output Files

File	Always written	Contents
`report.csv`	Yes	Graph properties, shortest-path stats, false-edge stats
`degree_distribution.csv`	Yes	Per-degree counts and proportions
`run_parameters.csv`	Yes	`simulation_kwargs` as a CSV row
`reconstruction_quality.csv`	When methods given	CPD, KNN, distortion per method
`{prefix}_network.png`	Yes	2-D graph (or 3-D if `dim=3`)
`{prefix}_edge_length_histogram.png`	Yes	Edge-length histogram
`comparison_{method}.png`	When methods given	Original vs reconstructed
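
The table above can be turned into a small checklist, which is handy when asserting on the contents of a run directory. This is a sketch only: expected_outputs is a hypothetical helper, not part of the verify module, and it simply restates the table under the assumption that one comparison plot is written per method.

```python
def expected_outputs(prefix="verify", reconstruct_methods=()):
    """List the file names the table above says a run should contain.

    Hypothetical helper for illustration; it mirrors the documented
    table and does not inspect the real run_report() implementation.
    """
    files = [
        "report.csv",
        "degree_distribution.csv",
        "run_parameters.csv",
        f"{prefix}_network.png",
        f"{prefix}_edge_length_histogram.png",
    ]
    if reconstruct_methods:
        # These artefacts only appear when at least one method is given.
        files.append("reconstruction_quality.csv")
        files.extend(f"comparison_{m}.png" for m in reconstruct_methods)
    return files


without_methods = expected_outputs()
with_methods = expected_outputs(reconstruct_methods=["mds", "strnd"])
```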

VerifyConfig

from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class VerifyConfig:
    output_root: Path = Path(".planning/artifacts/verification_runs")
    run_id: str | None = None   # auto-timestamps when None
    prefix: str = "verify"      # filename prefix for plots
    reconstruct_methods: list[str] = field(default_factory=list)

run_id becomes the subdirectory name inside output_root. If None, a UTC timestamp (YYYYMMDD_HHMMSS) is used automatically.
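
The run_id rule described above can be sketched as follows. The helper resolve_run_dir and its now parameter are hypothetical and exist only to make the behaviour easy to see and test; the actual code in the verify module may be structured differently.

```python
from datetime import datetime, timezone
from pathlib import Path


def resolve_run_dir(output_root, run_id=None, now=None):
    """Return output_root / run_id, defaulting run_id to a UTC timestamp.

    Hypothetical helper. The `now` argument is an assumption added so the
    timestamp can be fixed in tests; it is not part of VerifyConfig.
    """
    if run_id is None:
        now = now or datetime.now(timezone.utc)
        run_id = now.strftime("%Y%m%d_%H%M%S")  # YYYYMMDD_HHMMSS
    return Path(output_root) / run_id


explicit = resolve_run_dir("runs", run_id="experiment_a")
fixed_time = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
automatic = resolve_run_dir("runs", now=fixed_time)
```

An explicit run_id makes the output location reproducible, while leaving it as None keeps successive runs from overwriting each other as long as they start at least one second apart.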

Design Decisions

Why a separate config object instead of keyword arguments? run_report() already accepts many parameters through simulation_kwargs. A config dataclass groups the pipeline-level settings cleanly, and new settings can be added as fields with defaults without breaking the function signature.

Why does verify own the shortest-path and false-edge stats instead of metrics? These stats are cheap to compute on the original (unfitted) graph and are specific to the reporting use case. They don't belong in metrics, which focuses on reconstruction quality.

How to Extend the Report

To add a new statistic to report.csv:
1. Compute it in run_report.py and add it to the report_row dict before writing.
2. Document the new column in docs/modules/verify.md and docs/api/verify.md.
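
Step 1 might look roughly like the sketch below. It assumes report.csv holds a single row built from report_row; the column names n_nodes, n_edges, and mean_degree and the helper write_single_row_csv are placeholders invented for this example, not the real columns or functions.

```python
import csv
import io

# Placeholder columns standing in for whatever run_report.py already collects.
report_row = {"n_nodes": 4, "n_edges": 3}

# New statistic (hypothetical): mean degree of an undirected graph.
report_row["mean_degree"] = 2 * report_row["n_edges"] / report_row["n_nodes"]


def write_single_row_csv(row, handle):
    """Write a dict as a header line plus one data line."""
    writer = csv.DictWriter(handle, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)


# An in-memory buffer keeps the example free of side effects;
# the real code would write to a file inside the run directory.
buffer = io.StringIO()
write_single_row_csv(report_row, buffer)
```

Because the header is derived from the dict keys, a key added before the write call shows up as a new column without further changes to the writer.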

To add a new output file:
1. Generate and write it inside run_report() after the existing outputs.
2. Add it to the "Output Files" table above.

Tests

tests/test_verify_report.py      — CSV exists, correct columns, methods present
tests/test_examples_verify.py    — end-to-end with mds + strnd