Module: verify

File: src/spatial_graph_algorithms/verify/
Status: Stable.


Purpose

run_report() is a convenience wrapper that runs the full pipeline in one call and writes all output artefacts to a timestamped directory.

It is not a new algorithm: it calls simulate, reconstruct, metrics, and plot in sequence. It exists so that users can get a complete output folder without writing boilerplate pipeline code.


What run_report() Does

generate(simulation_kwargs)
render_simulation_visualization_bundle()  → verify_network.png, verify_edge_length_histogram.png
degree_distribution()                     → degree_distribution.csv
graph_summary() + shortest_path_stats() + false_edge_stats()
run_parameters.csv
For each method in reconstruct_methods:
    reconstruct(method)
    evaluate()                            → reconstruction_quality.csv
    plot_comparison()                     → comparison_{method}.png
report.csv
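
As an illustration of the degree_distribution() step, the sketch below builds a per-degree table of counts and proportions from an undirected edge list. It is not the package's implementation: the function name degree_table and the edge-list input format are assumptions made for this example.

```python
from collections import Counter


def degree_table(edges):
    """Return (degree, count, proportion) rows for an undirected edge list.

    Hypothetical helper: nodes that appear in no edge are not counted here.
    """
    node_degree = Counter()
    for u, v in edges:
        node_degree[u] += 1
        node_degree[v] += 1

    # How many nodes have each degree value.
    histogram = Counter(node_degree.values())
    n_nodes = len(node_degree)
    return [(d, c, c / n_nodes) for d, c in sorted(histogram.items())]


# Path graph 0-1-2-3: two end nodes of degree 1, two inner nodes of degree 2.
rows = degree_table([(0, 1), (1, 2), (2, 3)])
```

Each row would map to one line of a CSV such as degree_distribution.csv, assuming that file stores one row per distinct degree.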

Output Files

| File | Always written | Contents |
|---|---|---|
| report.csv | Yes | Graph properties, SP stats, false-edge stats |
| degree_distribution.csv | Yes | Per-degree counts and proportions |
| run_parameters.csv | Yes | simulation_kwargs as a CSV row |
| reconstruction_quality.csv | When methods given | CPD, KNN, distortion per method |
| {prefix}_network.png | Yes | 2-D graph (or 3-D if dim=3) |
| {prefix}_edge_length_histogram.png | Yes | Edge-length histogram |
| comparison_{method}.png | When methods given | Original vs reconstructed |
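
The table above can be turned into a small checklist, for example to confirm that a run directory is complete. The helper below is a sketch based only on the file names listed here; expected_outputs is a hypothetical name and is not part of the verify module.

```python
def expected_outputs(prefix="verify", methods=()):
    """List the file names a run should contain, per the Output Files table.

    Hypothetical helper: `prefix` and `methods` mirror VerifyConfig.prefix
    and VerifyConfig.reconstruct_methods.
    """
    files = [
        "report.csv",
        "degree_distribution.csv",
        "run_parameters.csv",
        f"{prefix}_network.png",
        f"{prefix}_edge_length_histogram.png",
    ]
    if methods:
        # These files are only written when reconstruction methods are given.
        files.append("reconstruction_quality.csv")
        files.extend(f"comparison_{m}.png" for m in methods)
    return files


baseline = expected_outputs()
with_methods = expected_outputs(methods=["mds", "strnd"])
```

Comparing this list against the contents of the run directory gives a quick completeness check without opening any of the files.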

VerifyConfig

from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class VerifyConfig:
    output_root: Path = Path(".planning/artifacts/verification_runs")
    run_id: str | None = None   # auto-timestamps when None
    prefix: str = "verify"      # filename prefix for plots
    reconstruct_methods: list[str] = field(default_factory=list)  # empty list: no reconstruction step

run_id becomes the subdirectory name inside output_root. If None, a UTC timestamp (YYYYMMDD_HHMMSS) is used automatically.
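
The directory rule can be sketched as follows. This is an illustration of the behaviour described above, not the module's code; resolve_run_dir is a hypothetical name.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def resolve_run_dir(output_root: Path, run_id: Optional[str] = None) -> Path:
    """Return output_root/run_id, using a UTC timestamp when run_id is None."""
    if run_id is None:
        # YYYYMMDD_HHMMSS in UTC, matching the format described above.
        run_id = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return Path(output_root) / run_id


named = resolve_run_dir(Path("runs"), "baseline")   # runs/baseline
stamped = resolve_run_dir(Path("runs"))             # runs/<UTC timestamp>
```

Because the timestamp has one-second resolution, two runs started within the same second would resolve to the same directory under this sketch; passing an explicit run_id avoids that.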


Design Decisions

Why a separate config object instead of keyword arguments? run_report already has many parameters through simulation_kwargs. A config dataclass groups the pipeline-level settings cleanly and is easier to extend without signature breakage.

Why does verify own the shortest-path and false-edge stats instead of metrics? These stats are cheap to compute on the original (unfitted) graph and are specific to the reporting use case. They don't belong in metrics, which focuses on reconstruction quality.


How to Extend the Report

To add a new statistic to report.csv:

1. Compute it in run_report.py and add it to the report_row dict before writing.
2. Document the new column in docs/modules/verify.md and docs/api/verify.md.
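
A minimal sketch of the first step, assuming report_row is a flat dict that is written as a single CSV row. The column names n_nodes, n_edges and mean_degree and the helper write_report are hypothetical and only illustrate the pattern.

```python
import csv
import tempfile
from pathlib import Path


def write_report(report_row: dict, path: Path) -> None:
    """Write one dict as a header line plus a single data row."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(report_row))
        writer.writeheader()
        writer.writerow(report_row)


# Hypothetical existing columns.
report_row = {"n_nodes": 4, "n_edges": 3}

# New statistic: add it to the dict before the row is written.
report_row["mean_degree"] = 2 * report_row["n_edges"] / report_row["n_nodes"]

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "report.csv"
    write_report(report_row, out)
    header = out.read_text().splitlines()[0]
```

Deriving the header from the dict keys means a new statistic appears as a new column without a separate column list to maintain, at least under the single-row assumption made here.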

To add a new output file:

1. Generate and write it inside run_report() after the existing outputs.
2. Add it to the "Output Files" table above.


Tests

tests/test_verify_report.py      — CSV exists, correct columns, methods present
tests/test_examples_verify.py    — end-to-end with mds + strnd