spatial_graph_algorithms.reconstruct

Recover 2-D or 3-D node coordinates from graph topology alone.

Available methods

Method	Algorithm	When to use
`mds`	All-pairs shortest-path distances → sklearn metric MDS	Fast baseline; no extra dependencies
`landmark_mds`	Landmark shortest-path distances → MDS + triangulation	Faster approximation when all-pairs MDS is too expensive
`strnd`	Node2Vec (pecanpy PreComp) → UMAP	State-of-the-art quality; included in base install

How to choose

Use MDS for quick checks or small graphs (< 500 nodes).
Use landmark MDS when you want a seeded MDS-style approximation and can choose an explicit landmark count.
Use STRND for production-quality reconstructions. It consistently achieves CPD > 0.95 on well-connected circle/square graphs.

Quality evaluation

After reconstruction, use spatial_graph_algorithms.metrics.evaluate() to score the result. See the metrics reference for the interpretation guide.

API Reference

`spatial_graph_algorithms.reconstruct.reconstruct(sn, method='mds', dim=2, seed=None, **kwargs)`

Reconstruct node coordinates from graph topology.

Returns a copy of sn with :attr:~spatial_graph_algorithms.network.SpatialGraph.reconstructed_positions set. The original sn is never modified.

Parameters:

Name	Type	Description	Default
`sn`	`SpatialGraph`	Input graph. Only the adjacency matrix is used; positions are not required (they are used only for quality evaluation afterwards).	required
`method`	`str`	Reconstruction algorithm. One of: `"mds"` — Metric MDS on shortest-path distances. Fast; no extra dependencies. `"strnd"` — Node2Vec (pecanpy) embeddings → UMAP. Higher quality; included in the base install. `"landmark_mds"` — Landmark shortest-path MDS. Requires `n_landmarks`. `"gse"` — Graph Spectral Embedding. Bipartite graphs only; requires `pip install "spatial_graph_algorithms[gse]"`.	`'mds'`
`dim`	`int`	Target number of output dimensions (2 or 3).	`2`
`seed`	`int`	Random seed for reproducibility.	`None`

Returns:

Type	Description
`SpatialGraph`	Deep copy of sn with reconstructed_positions of shape `(n_nodes, dim)`.

Raises:

Type	Description
`ValueError`	If method is not one of the supported values.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.reconstruct import reconstruct
>>> sn = generate(n=300, seed=42)
>>> sn_rec = reconstruct(sn, method="mds", dim=2, seed=42)
>>> sn_rec.reconstructed_positions.shape
(300, 2)

Source code in src/spatial_graph_algorithms/reconstruct/__init__.py

def reconstruct(
    sn: SpatialGraph,
    method: str = "mds",
    dim: int = 2,
    seed: int | None = None,
    **kwargs
) -> SpatialGraph:
    """Reconstruct node coordinates from graph topology.

    Returns a copy of *sn* with
    :attr:`~spatial_graph_algorithms.network.SpatialGraph.reconstructed_positions`
    set. The original *sn* is never modified.

    Parameters
    ----------
    sn : SpatialGraph
        Input graph.  Only the adjacency matrix is used; positions are not
        required (they are used only for quality evaluation afterwards).
    method : str
        Reconstruction algorithm.  One of:

        - ``"mds"`` — Metric MDS on shortest-path distances.  Fast; no extra
          dependencies.
        - ``"strnd"`` — Node2Vec (pecanpy) embeddings → UMAP.  Higher quality;
          included in the base install.
        - ``"landmark_mds"`` — Landmark shortest-path MDS. Requires
          ``n_landmarks``.
        - ``"gse"`` — Graph Spectral Embedding.  Bipartite graphs only;
          requires ``pip install "spatial_graph_algorithms[gse]"``.

    dim : int
        Target number of output dimensions (2 or 3).
    seed : int, optional
        Random seed for reproducibility.

    Returns
    -------
    SpatialGraph
        Deep copy of *sn* with *reconstructed_positions* of shape
        ``(n_nodes, dim)``.

    Raises
    ------
    ValueError
        If *method* is not one of the supported values.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.reconstruct import reconstruct
    >>> sn = generate(n=300, seed=42)
    >>> sn_rec = reconstruct(sn, method="mds", dim=2, seed=42)
    >>> sn_rec.reconstructed_positions.shape
    (300, 2)
    """
    method = method.lower()
    if method == "mds":
        coords = run_mds(sn.adjacency_matrix, dim=dim, random_state=seed)
    elif method == "strnd":
        from .strnd import run_strnd

        coords = run_strnd(sn.adjacency_matrix, dim=dim, random_state=seed)
    elif method == "landmark_mds":
        from .landmark_mds import run_landmark_mds

        coords = run_landmark_mds(sn.adjacency_matrix, dim=dim, random_state=seed, **kwargs)
    elif method == "gse":
        from .gse import run_gse

        coords = run_gse(sn.adjacency_matrix, dim=dim, **kwargs)
    else:
        raise ValueError(f"Unsupported reconstruction method '{method}'")

    sn2 = sn.copy()
    sn2.reconstructed_positions = coords
    return sn2

`spatial_graph_algorithms.reconstruct.mds.run_mds(adjacency, dim=2, random_state=None)`

Reconstruct node coordinates using metric MDS on shortest-path distances.

Computes all-pairs shortest-path distances from adjacency and passes the resulting dissimilarity matrix to sklearn's MDS with dissimilarity='precomputed'. Disconnected pairs receive twice the maximum finite distance.

Parameters:

Name	Type	Description	Default
`adjacency`	`scipy.sparse matrix`	Symmetric, unweighted adjacency matrix of shape `(n, n)`.	required
`dim`	`int`	Number of output dimensions. Default is 2.	`2`
`random_state`	`int`	Seed for sklearn's MDS initialisation.	`None`

Returns:

Type	Description
`ndarray`	Float array of shape `(n, dim)` with the embedded coordinates.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.reconstruct.mds import run_mds
>>> sn = generate(n=100, seed=0)
>>> coords = run_mds(sn.adjacency_matrix, dim=2, random_state=0)
>>> coords.shape
(100, 2)

Source code in src/spatial_graph_algorithms/reconstruct/mds.py

def run_mds(adjacency, dim: int = 2, random_state: int | None = None) -> np.ndarray:
    """Reconstruct node coordinates using metric MDS on shortest-path distances.

    Computes all-pairs shortest-path distances from *adjacency* and passes the
    resulting dissimilarity matrix to sklearn's ``MDS`` with
    ``dissimilarity='precomputed'``.  Disconnected pairs receive twice the
    maximum finite distance.

    Parameters
    ----------
    adjacency : scipy.sparse matrix
        Symmetric, unweighted adjacency matrix of shape ``(n, n)``.
    dim : int
        Number of output dimensions.  Default is 2.
    random_state : int, optional
        Seed for sklearn's MDS initialisation.

    Returns
    -------
    np.ndarray
        Float array of shape ``(n, dim)`` with the embedded coordinates.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.reconstruct.mds import run_mds
    >>> sn = generate(n=100, seed=0)
    >>> coords = run_mds(sn.adjacency_matrix, dim=2, random_state=0)
    >>> coords.shape
    (100, 2)
    """
    D = shortest_path(adjacency, directed=False, unweighted=True)
    max_finite = np.nanmax(D[np.isfinite(D)]) if np.isfinite(D).any() else 1.0
    D[~np.isfinite(D)] = max_finite * 2
    mds = MDS(n_components=dim, dissimilarity="precomputed", random_state=random_state)
    return mds.fit_transform(D)

`spatial_graph_algorithms.reconstruct.landmark_mds.run_landmark_mds(adjacency, dim=2, random_state=None, *, n_landmarks=None, weighted=False)`

Reconstruct coordinates using landmark MDS.

Selects n_landmarks random nodes, computes shortest-path distances from those landmarks to every node, embeds the landmarks with metric MDS, and triangulates every node from its distances to the landmark set.

Parameters:

Name	Type	Description	Default
`adjacency`	`scipy.sparse matrix`	Symmetric adjacency matrix of shape `(n, n)`.	required
`dim`	`int`	Number of output dimensions. Default is 2.	`2`
`random_state`	`int`	Seed for landmark selection and sklearn's MDS initialisation.	`None`
`n_landmarks`	`int`	Number of landmark nodes. Must satisfy `dim < n_landmarks <= n`.	`None`
`weighted`	`bool`	If `True`, use adjacency values as shortest-path edge weights. If `False`, treat all nonzero edges as unit weight. Default is `False`.	`False`

Returns:

Type	Description
`ndarray`	Float array of shape `(n, dim)` with the reconstructed coordinates.

Raises:

Type	Description
`ValueError`	If `n_landmarks` is missing or outside the valid range, if `dim` is invalid, or if no finite off-diagonal shortest-path distances exist.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.reconstruct.landmark_mds import run_landmark_mds
>>> sg = generate(n=100, seed=0)
>>> coords = run_landmark_mds(sg.adjacency_matrix, dim=2, random_state=0, n_landmarks=16)
>>> coords.shape
(100, 2)

Source code in src/spatial_graph_algorithms/reconstruct/landmark_mds.py

def run_landmark_mds(
    adjacency: scipy.sparse.spmatrix,
    dim: int = 2,
    random_state: int | None = None,
    *,
    n_landmarks: int | None = None,
    weighted: bool = False,
) -> np.ndarray:
    """Reconstruct coordinates using landmark MDS.

    Selects ``n_landmarks`` random nodes, computes shortest-path distances from
    those landmarks to every node, embeds the landmarks with metric MDS, and
    triangulates every node from its distances to the landmark set.

    Parameters
    ----------
    adjacency : scipy.sparse matrix
        Symmetric adjacency matrix of shape ``(n, n)``.
    dim : int
        Number of output dimensions. Default is 2.
    random_state : int, optional
        Seed for landmark selection and sklearn's MDS initialisation.
    n_landmarks : int
        Number of landmark nodes. Must satisfy ``dim < n_landmarks <= n``.
    weighted : bool
        If ``True``, use adjacency values as shortest-path edge weights. If
        ``False``, treat all nonzero edges as unit weight. Default is ``False``.

    Returns
    -------
    np.ndarray
        Float array of shape ``(n, dim)`` with the reconstructed coordinates.

    Raises
    ------
    ValueError
        If ``n_landmarks`` is missing or outside the valid range, if ``dim`` is
        invalid, or if no finite off-diagonal shortest-path distances exist.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.reconstruct.landmark_mds import run_landmark_mds
    >>> sg = generate(n=100, seed=0)
    >>> coords = run_landmark_mds(sg.adjacency_matrix, dim=2, random_state=0, n_landmarks=16)
    >>> coords.shape
    (100, 2)
    """
    n_nodes = adjacency.shape[0]
    n_landmarks = _validate_parameters(n_nodes, dim, n_landmarks)

    rng = np.random.default_rng(random_state)
    landmarks = rng.choice(n_nodes, n_landmarks, replace=False)

    distances_landmarks_to_all = shortest_path(
        adjacency,
        directed=False,
        unweighted=not weighted,
        indices=landmarks,
    )
    distances_landmarks_to_all = _replace_infinite_distances(distances_landmarks_to_all)
    all_distances_to_landmarks = distances_landmarks_to_all.T

    landmark_distance_matrix = all_distances_to_landmarks[landmarks]
    landmark_distance_matrix = np.maximum(landmark_distance_matrix, landmark_distance_matrix.T)
    np.fill_diagonal(landmark_distance_matrix, 0.0)

    mds = MDS(
        n_components=dim,
        metric=True,
        dissimilarity="precomputed",
        random_state=random_state,
    )
    landmark_positions = np.asarray(mds.fit_transform(landmark_distance_matrix))

    landmark_squared = landmark_distance_matrix**2
    all_squared = all_distances_to_landmarks**2
    mean_column = landmark_squared.mean(axis=0)
    landmark_pinv = np.linalg.pinv(landmark_positions)
    return (0.5 * landmark_pinv.dot((mean_column - all_squared).T)).T

`spatial_graph_algorithms.reconstruct.strnd.run_strnd(adjacency, dim=2, random_state=None, embedding_dim=64, p=1.0, q=1.0, workers=4, umap_n_neighbors=15, umap_min_dist=1.0, verbose=False)`

Reconstruct coordinates using STRND: Node2Vec embeddings → UMAP.

Exact port of the NSC compute_pecanpy_embeddings_from_df + UMAP pipeline. Node2Vec biased random walks (pecanpy PreComp) produce a 64-D embedding which UMAP reduces to dim spatial coordinates.

Parameters:

Name	Type	Description	Default
`adjacency`	`scipy.sparse matrix`	Symmetric, unweighted adjacency matrix.	required
`dim`	`int`	Number of output dimensions. Default is 2.	`2`
`random_state`	`int`	Seed for UMAP reproducibility.	`None`
`embedding_dim`	`int`	Dimensionality of the Node2Vec embedding before UMAP reduction. Default is 64 (matches NSC defaults).	`64`
`p`	`float`	Node2Vec return parameter. `p=1` (default) gives unbiased walks.	`1.0`
`q`	`float`	Node2Vec in-out parameter. `q=1` (default) gives unbiased walks.	`1.0`
`workers`	`int`	Number of worker threads for pecanpy. Default is 4.	`4`
`umap_n_neighbors`	`int`	UMAP `n_neighbors` parameter. Default is 15.	`15`
`umap_min_dist`	`float`	UMAP `min_dist` parameter. Default is 1.0.	`1.0`
`verbose`	`bool`	If `True`, pecanpy prints timing information.	`False`

Returns:

Type	Description
`ndarray`	Float array of shape `(n, dim)` with the embedded coordinates.

Examples:

>>> from spatial_graph_algorithms.simulate import generate
>>> from spatial_graph_algorithms.reconstruct.strnd import run_strnd
>>> sn = generate(n=200, seed=0)
>>> coords = run_strnd(sn.adjacency_matrix, dim=2, random_state=0)
>>> coords.shape
(200, 2)

Source code in src/spatial_graph_algorithms/reconstruct/strnd.py

def run_strnd(
    adjacency: scipy.sparse.spmatrix,
    dim: int = 2,
    random_state: int | None = None,
    embedding_dim: int = 64,
    p: float = 1.0,
    q: float = 1.0,
    workers: int = 4,
    umap_n_neighbors: int = 15,
    umap_min_dist: float = 1.0,
    verbose: bool = False,
) -> np.ndarray:
    """Reconstruct coordinates using STRND: Node2Vec embeddings → UMAP.

    Exact port of the NSC ``compute_pecanpy_embeddings_from_df`` + UMAP pipeline.
    Node2Vec biased random walks (pecanpy PreComp) produce a 64-D embedding which
    UMAP reduces to ``dim`` spatial coordinates.

    Parameters
    ----------
    adjacency : scipy.sparse matrix
        Symmetric, unweighted adjacency matrix.
    dim : int
        Number of output dimensions.  Default is 2.
    random_state : int, optional
        Seed for UMAP reproducibility.
    embedding_dim : int
        Dimensionality of the Node2Vec embedding before UMAP reduction.
        Default is 64 (matches NSC defaults).
    p : float
        Node2Vec return parameter.  ``p=1`` (default) gives unbiased walks.
    q : float
        Node2Vec in-out parameter.  ``q=1`` (default) gives unbiased walks.
    workers : int
        Number of worker threads for pecanpy.  Default is 4.
    umap_n_neighbors : int
        UMAP ``n_neighbors`` parameter.  Default is 15.
    umap_min_dist : float
        UMAP ``min_dist`` parameter.  Default is 1.0.
    verbose : bool
        If ``True``, pecanpy prints timing information.

    Returns
    -------
    np.ndarray
        Float array of shape ``(n, dim)`` with the embedded coordinates.

    Examples
    --------
    >>> from spatial_graph_algorithms.simulate import generate
    >>> from spatial_graph_algorithms.reconstruct.strnd import run_strnd
    >>> sn = generate(n=200, seed=0)
    >>> coords = run_strnd(sn.adjacency_matrix, dim=2, random_state=0)
    >>> coords.shape
    (200, 2)
    """
    edge_df = _adjacency_to_edge_df(adjacency)
    embeddings = _pecanpy_embeddings(
        edge_df,
        embedding_dim=embedding_dim,
        p=p,
        q=q,
        workers=workers,
        verbose=verbose,
    )
    reducer = umap.UMAP(
        n_components=dim,
        n_neighbors=umap_n_neighbors,
        min_dist=umap_min_dist,
        random_state=random_state,
    )
    return reducer.fit_transform(embeddings)

`spatial_graph_algorithms.reconstruct.quality`

Reconstruction quality metrics comparing original and reconstructed positions.

All three metrics are rotation-, reflection-, and translation-invariant.

Functions

`cpd(true_positions, recon_positions)`

Compute the Correlation of Pairwise Distances (CPD).

R² of the pairwise distance matrices of the original and reconstructed coordinates. CPD = 1.0 means all inter-node distances are perfectly preserved.

Parameters:

Name	Type	Description	Default
`true_positions`	`ndarray`	Ground-truth node coordinates, shape `(n, dim)`.	required
`recon_positions`	`ndarray`	Reconstructed node coordinates, shape `(n, dim)`.	required

Returns:

Type	Description
`float`	R² in `[0, 1]`. Values > 0.9 indicate excellent reconstruction; < 0.7 is considered poor.

Examples:

>>> import numpy as np
>>> from spatial_graph_algorithms.reconstruct.quality import cpd
>>> pts = np.random.default_rng(0).random((50, 2))
>>> cpd(pts, pts + 0.001) > 0.99
True

Source code in src/spatial_graph_algorithms/reconstruct/quality.py

def cpd(true_positions: np.ndarray, recon_positions: np.ndarray) -> float:
    """Compute the Correlation of Pairwise Distances (CPD).

    R² of the pairwise distance matrices of the original and reconstructed
    coordinates.  CPD = 1.0 means all inter-node distances are perfectly
    preserved.

    Parameters
    ----------
    true_positions : np.ndarray
        Ground-truth node coordinates, shape ``(n, dim)``.
    recon_positions : np.ndarray
        Reconstructed node coordinates, shape ``(n, dim)``.

    Returns
    -------
    float
        R² in ``[0, 1]``.  Values > 0.9 indicate excellent reconstruction;
        < 0.7 is considered poor.

    Examples
    --------
    >>> import numpy as np
    >>> from spatial_graph_algorithms.reconstruct.quality import cpd
    >>> pts = np.random.default_rng(0).random((50, 2))
    >>> cpd(pts, pts + 0.001) > 0.99
    True
    """
    a = pdist(true_positions)
    b = pdist(recon_positions)
    if a.std() < 1e-12 or b.std() < 1e-12:
        return 0.0
    r = float(np.corrcoef(a, b)[0, 1])
    return r * r

`knn(true_positions, recon_positions, k=10)`

Compute k-nearest-neighbour overlap.

For each node, the fraction of its k true nearest neighbours that also appear among its k reconstructed nearest neighbours, averaged over all nodes.

Parameters:

Name	Type	Description	Default
`true_positions`	`ndarray`	Ground-truth coordinates, shape `(n, dim)`.	required
`recon_positions`	`ndarray`	Reconstructed coordinates, shape `(n, dim)`.	required
`k`	`int`	Number of neighbours to compare. Clamped to `[1, n-1]`.	`10`

Returns:

Type	Description
`float`	Mean neighbourhood overlap in `[0, 1]`. 1.0 = all neighbourhoods perfectly recovered.

Examples:

>>> import numpy as np
>>> from spatial_graph_algorithms.reconstruct.quality import knn
>>> pts = np.random.default_rng(0).random((50, 2))
>>> knn(pts, pts)
1.0

Source code in src/spatial_graph_algorithms/reconstruct/quality.py

def knn(
    true_positions: np.ndarray,
    recon_positions: np.ndarray,
    k: int = 10,
) -> float:
    """Compute k-nearest-neighbour overlap.

    For each node, the fraction of its *k* true nearest neighbours that also
    appear among its *k* reconstructed nearest neighbours, averaged over all
    nodes.

    Parameters
    ----------
    true_positions : np.ndarray
        Ground-truth coordinates, shape ``(n, dim)``.
    recon_positions : np.ndarray
        Reconstructed coordinates, shape ``(n, dim)``.
    k : int
        Number of neighbours to compare.  Clamped to ``[1, n-1]``.

    Returns
    -------
    float
        Mean neighbourhood overlap in ``[0, 1]``.  1.0 = all neighbourhoods
        perfectly recovered.

    Examples
    --------
    >>> import numpy as np
    >>> from spatial_graph_algorithms.reconstruct.quality import knn
    >>> pts = np.random.default_rng(0).random((50, 2))
    >>> knn(pts, pts)
    1.0
    """
    k = max(1, min(k, len(true_positions) - 1))
    nn_t = NearestNeighbors(n_neighbors=k + 1).fit(true_positions)
    nn_r = NearestNeighbors(n_neighbors=k + 1).fit(recon_positions)
    idx_t = nn_t.kneighbors(return_distance=False)[:, 1:]
    idx_r = nn_r.kneighbors(return_distance=False)[:, 1:]
    overlaps = []
    for a, b in zip(idx_t, idx_r):
        overlaps.append(len(set(a).intersection(set(b))) / k)
    return float(np.mean(overlaps))

`distortion(true_positions, recon_positions)`

Compute normalised pairwise-distance distortion.

Scale-aligns the reconstructed distances to the ground-truth mean scale, then returns median(|d_true − d_recon_scaled|) / max(d_true).

Parameters:

Name	Type	Description	Default
`true_positions`	`ndarray`	Ground-truth coordinates, shape `(n, dim)`.	required
`recon_positions`	`ndarray`	Reconstructed coordinates, shape `(n, dim)`.	required

Returns:

Type	Description
`float`	Normalised distortion in `[0, 1]`. 0.0 = perfect; 1.0 = residuals as large as the widest pair in the ground truth.

Examples:

>>> import numpy as np
>>> from spatial_graph_algorithms.reconstruct.quality import distortion
>>> pts = np.random.default_rng(0).random((50, 2))
>>> distortion(pts, pts)
0.0

Source code in src/spatial_graph_algorithms/reconstruct/quality.py

def distortion(true_positions: np.ndarray, recon_positions: np.ndarray) -> float:
    """Compute normalised pairwise-distance distortion.

    Scale-aligns the reconstructed distances to the ground-truth mean scale,
    then returns ``median(|d_true − d_recon_scaled|) / max(d_true)``.

    Parameters
    ----------
    true_positions : np.ndarray
        Ground-truth coordinates, shape ``(n, dim)``.
    recon_positions : np.ndarray
        Reconstructed coordinates, shape ``(n, dim)``.

    Returns
    -------
    float
        Normalised distortion in ``[0, 1]``.  0.0 = perfect; 1.0 = residuals
        as large as the widest pair in the ground truth.

    Examples
    --------
    >>> import numpy as np
    >>> from spatial_graph_algorithms.reconstruct.quality import distortion
    >>> pts = np.random.default_rng(0).random((50, 2))
    >>> distortion(pts, pts)
    0.0
    """
    dt = pdist(true_positions)
    dr = pdist(recon_positions)

    max_true = dt.max()
    if max_true < 1e-12:
        return 0.0

    mean_recon = dr.mean()
    if mean_recon < 1e-12:
        return 1.0

    dr_scaled = dr * (dt.mean() / mean_recon)
    return float(min(np.median(np.abs(dt - dr_scaled)) / max_true, 1.0))