spatial_graph_algorithms.simulate

Generate synthetic spatial graphs for benchmarking and testing.

Supported connectivity modes

| Mode | Description | Key parameters |
| --- | --- | --- |
| delaunay_corrected | Delaunay triangulation (recommended default) | |
| delaunay | Standard Delaunay (may include long edges) | |
| knn | k nearest neighbours | k |
| epsilon / epsilon-ball | All nodes within radius ε | epsilon |
| lattice | Regular grid | |
| distance_decay | Gaussian-decay connection probability | quantile_scale, power_law_exp |
| knn_bipartite | k-NN between two disjoint partitions | k, bipartite_ratio |
| epsilon_bipartite | Epsilon-ball between two partitions | epsilon, bipartite_ratio |
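
To make the difference between the knn and epsilon rules concrete, here is a minimal pure-Python sketch. The helper names `knn_edges` and `epsilon_edges` are hypothetical and are not part of the package. Treating k-NN edges as undirected (the union of each node's neighbour list) is an assumption, not something the documentation confirms.

```python
import math
from itertools import combinations


def knn_edges(points, k):
    """Undirected edges linking each point to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            # Store each edge once, smaller index first (assumed convention).
            edges.add((min(i, j), max(i, j)))
    return edges


def epsilon_edges(points, epsilon):
    """Undirected edges between every pair of points within radius epsilon."""
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= epsilon
    }


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.12), (1.0, 0.9)]
print(sorted(knn_edges(points, k=1)))               # [(0, 1), (0, 2), (2, 3)]
print(sorted(epsilon_edges(points, epsilon=0.15)))  # [(0, 1), (0, 2)]
```

With k-NN the isolated point at (1.0, 0.9) still gets an edge, whereas the epsilon rule leaves it disconnected. Which behaviour is preferable depends on the benchmark.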

False edges

Set false_edges_fraction=0.05 to inject random long-range edges numbering 5% of the existing edges, simulating the sequencing noise present in real Slide-tags / Pixelgen data. Each injected edge is labelled is_false=True in the edge_metadata of the returned graph.
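
The idea can be sketched in plain Python as follows. This illustrates the concept only and is not the package's own inject_false_edges; the name `inject_false_edges_sketch` is hypothetical. It assumes the fraction is taken relative to the number of true edges and that false edges are added rather than swapped in. It picks uniformly random node pairs, which are usually but not necessarily long-range.

```python
import random


def inject_false_edges_sketch(edges, n_nodes, fraction, seed=None):
    """Add random edges that are not already present.

    Returns (all_edges, false_edges), both as sets of (low, high) index pairs.
    """
    rng = random.Random(seed)
    true_edges = {(min(u, v), max(u, v)) for u, v in edges}

    # Assumption: the fraction is relative to the number of true edges.
    n_false = round(fraction * len(true_edges))
    # Never ask for more false edges than there are free node pairs.
    free_pairs = n_nodes * (n_nodes - 1) // 2 - len(true_edges)
    n_false = min(n_false, free_pairs)

    false_edges = set()
    while len(false_edges) < n_false:
        u, v = rng.sample(range(n_nodes), 2)  # two distinct nodes
        candidate = (min(u, v), max(u, v))
        if candidate not in true_edges:
            false_edges.add(candidate)
    return true_edges | false_edges, false_edges


# A 40-node ring has 40 edges, so a 5% fraction yields 2 false edges.
ring = [(i, (i + 1) % 40) for i in range(40)]
all_edges, false_edges = inject_false_edges_sketch(ring, 40, 0.05, seed=42)

# Label every edge, mirroring the is_false column described above.
labelled = [
    {"source": u, "target": v, "is_false": (u, v) in false_edges}
    for u, v in sorted(all_edges)
]
print(len(all_edges), sum(row["is_false"] for row in labelled))  # 42 2
```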

API Reference

spatial_graph_algorithms.simulate.generate(*, n=1000, dim=2, shape='circle', mode='delaunay_corrected', seed=None, scale=1.0, image_path=None, k=8, epsilon=0.15, bipartite_ratio=2, quantile_scale=0.05, power_law_exp=4.0, false_edges_number=None, false_edges_fraction=None)

Generate a simulated spatial graph and return it as a SpatialGraph.

Points are sampled uniformly inside the chosen shape, then connected according to mode. False (shortcut) edges can be injected to simulate the sequencing noise present in real spatial omics data.
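
As an illustration of what "sampled uniformly inside the chosen shape" means for the default circle, the sketch below draws points with the square-root radius trick. The function `sample_in_circle` is hypothetical, and treating scale as the circle's radius is an assumption; the package's generate_points may work differently.

```python
import math
import random


def sample_in_circle(n, scale=1.0, seed=None):
    """Draw n points uniformly from a disc of radius `scale` (assumed meaning)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Taking the square root of a uniform variate makes the density
        # uniform in area; a plain uniform radius would crowd the centre.
        r = scale * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


points = sample_in_circle(500, scale=1.0, seed=42)
print(len(points))  # 500
```

Reusing the same seed reproduces the same point cloud, which is the behaviour the seed parameter is documented to provide.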

Parameters:

Name Type Description Default
n int

Number of nodes to generate. Must be > 1.

1000
dim int

Spatial dimension. Must be 2 or 3.

2
shape str

Point-cloud geometry. One of: "circle", "square", "sphere", "cube", "image" / "image_2d", or any name returned by :func:list_shapes (e.g. "star", "ring", "triangle"). Named shapes are 2-D only.

'circle'
mode str

Edge-construction rule. Supported modes:

  • "delaunay_corrected" — Delaunay triangulation (recommended default)
  • "delaunay" — standard Delaunay
  • "knn" — k nearest neighbours
  • "epsilon" / "epsilon-ball" — epsilon-ball connectivity
  • "lattice" — regular grid
  • "distance_decay" — Gaussian-decay connection probability
  • "knn_bipartite" — k-NN on two disjoint node sets
  • "epsilon_bipartite" — epsilon-ball on two disjoint node sets

'delaunay_corrected'
seed int

Random seed for reproducible results.

None
scale float

Characteristic length scale of the point cloud. Default is 1.0.

1.0
image_path str

Path to a binary image for shape="image". Black pixels are used as valid sampling regions.

None
k int

Number of neighbours for knn and knn_bipartite modes.

8
epsilon float

Neighbourhood radius for epsilon and epsilon_bipartite modes.

0.15
bipartite_ratio int

Ratio of the two node sets for bipartite modes (e.g. 2 → equal halves).

2
quantile_scale float

Distance quantile used as the length-scale for distance_decay mode.

0.05
power_law_exp float

Exponent controlling the steepness of the distance-decay kernel.

4.0
false_edges_number int

Exact number of false (random long-range) edges to inject. Mutually exclusive with false_edges_fraction.

None
false_edges_fraction float

Fraction of existing edges to replace with random false edges. Mutually exclusive with false_edges_number.

None

Returns:

Type Description
SpatialGraph

A new graph with positions, adjacency_matrix, node_metadata, and edge_metadata populated. edge_metadata["is_false"] is a boolean column marking injected edges.

Raises:

Type Description
ValueError

If n ≤ 1, dim not in {2, 3}, mode is unsupported, k ≤ 0, epsilon ≤ 0, or both false_edges_number and false_edges_fraction are provided.
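
The conditions listed above can be pictured as a small checker. This is a sketch under stated assumptions, not the package's private _validate_inputs: the function name `check_generate_args`, the `SUPPORTED_MODES` constant, and the error messages are all hypothetical.

```python
# Mode names as listed in the documentation above.
SUPPORTED_MODES = {
    "delaunay_corrected", "delaunay", "knn", "epsilon", "epsilon-ball",
    "lattice", "distance_decay", "knn_bipartite", "epsilon_bipartite",
}


def check_generate_args(n=1000, dim=2, mode="delaunay_corrected", k=8,
                        epsilon=0.15, false_edges_number=None,
                        false_edges_fraction=None):
    """Raise ValueError for the documented invalid argument combinations."""
    if n <= 1:
        raise ValueError("n must be > 1")
    if dim not in (2, 3):
        raise ValueError("dim must be 2 or 3")
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if k <= 0:
        raise ValueError("k must be > 0")
    if epsilon <= 0:
        raise ValueError("epsilon must be > 0")
    if false_edges_number is not None and false_edges_fraction is not None:
        raise ValueError(
            "false_edges_number and false_edges_fraction are mutually exclusive"
        )


check_generate_args(n=500, mode="knn", k=6, false_edges_fraction=0.05)  # passes
try:
    check_generate_args(false_edges_number=10, false_edges_fraction=0.05)
except ValueError as exc:
    print(exc)
```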

Examples:

Minimal usage:

>>> from spatial_graph_algorithms.simulate import generate
>>> sn = generate(n=500, seed=42)

With false-edge noise:

>>> sn = generate(n=500, mode="knn", k=6, false_edges_fraction=0.05, seed=42)
>>> # Count the injected false edges.
>>> sn.edge_metadata["is_false"].sum()  # doctest: +SKIP

Source code in src/spatial_graph_algorithms/simulate/__init__.py
def generate(
    *,
    n: int = 1000,
    dim: int = 2,
    shape: str = "circle",
    mode: str = "delaunay_corrected",
    seed: int | None = None,
    scale: float = 1.0,
    image_path: str | None = None,
    k: int = 8,
    epsilon: float = 0.15,
    bipartite_ratio: int = 2,
    quantile_scale: float = 0.05,
    power_law_exp: float = 4.0,
    false_edges_number: int | None = None,
    false_edges_fraction: float | None = None,
) -> SpatialGraph:
    """Generate a simulated spatial graph and return it as a SpatialGraph.

    Points are sampled uniformly inside the chosen *shape*, then connected
    according to *mode*.  False (shortcut) edges can be injected to simulate
    the sequencing noise present in real spatial omics data.

    Parameters
    ----------
    n : int
        Number of nodes to generate.  Must be > 1.
    dim : int
        Spatial dimension.  Must be 2 or 3.
    shape : str
        Point-cloud geometry.  One of: ``"circle"``, ``"square"``,
        ``"sphere"``, ``"cube"``, ``"image"`` / ``"image_2d"``, or any
        name returned by :func:`list_shapes` (e.g. ``"star"``, ``"ring"``,
        ``"triangle"``).  Named shapes are 2-D only.
    mode : str
        Edge-construction rule.  Supported modes:

        - ``"delaunay_corrected"`` — Delaunay triangulation (recommended default)
        - ``"delaunay"`` — standard Delaunay
        - ``"knn"`` — *k* nearest neighbours
        - ``"epsilon"`` / ``"epsilon-ball"`` — epsilon-ball connectivity
        - ``"lattice"`` — regular grid
        - ``"distance_decay"`` — Gaussian-decay connection probability
        - ``"knn_bipartite"`` — *k*-NN on two disjoint node sets
        - ``"epsilon_bipartite"`` — epsilon-ball on two disjoint node sets

    seed : int, optional
        Random seed for reproducible results.
    scale : float
        Characteristic length scale of the point cloud.  Default is 1.0.
    image_path : str, optional
        Path to a binary image for ``shape="image"``.  Black pixels are used
        as valid sampling regions.
    k : int
        Number of neighbours for ``knn`` and ``knn_bipartite`` modes.
    epsilon : float
        Neighbourhood radius for ``epsilon`` and ``epsilon_bipartite`` modes.
    bipartite_ratio : int
        Ratio of the two node sets for bipartite modes (e.g. 2 → equal halves).
    quantile_scale : float
        Distance quantile used as the length-scale for ``distance_decay`` mode.
    power_law_exp : float
        Exponent controlling the steepness of the distance-decay kernel.
    false_edges_number : int, optional
        Exact number of false (random long-range) edges to inject.  Mutually
        exclusive with *false_edges_fraction*.
    false_edges_fraction : float, optional
        Fraction of existing edges to replace with random false edges.
        Mutually exclusive with *false_edges_number*.

    Returns
    -------
    SpatialGraph
        A new graph with *positions*, *adjacency_matrix*, *node_metadata*, and
        *edge_metadata* populated.  ``edge_metadata["is_false"]`` is a boolean
        column marking injected edges.

    Raises
    ------
    ValueError
        If *n* ≤ 1, *dim* not in {2, 3}, *mode* is unsupported, *k* ≤ 0,
        *epsilon* ≤ 0, or both *false_edges_number* and *false_edges_fraction*
        are provided.

    Examples
    --------
    Minimal usage:

    >>> from spatial_graph_algorithms.simulate import generate
    >>> sn = generate(n=500, seed=42)

    With false-edge noise:

    >>> sn = generate(n=500, mode="knn", k=6, false_edges_fraction=0.05, seed=42)
    >>> # Count the injected false edges.
    >>> sn.edge_metadata["is_false"].sum()  # doctest: +SKIP
    """
    _validate_inputs(
        n=n,
        dim=dim,
        shape=shape,
        mode=mode,
        k=k,
        epsilon=epsilon,
        false_edges_number=false_edges_number,
        false_edges_fraction=false_edges_fraction,
    )

    rng = np.random.default_rng(seed)

    positions = generate_points(
        n=n,
        dim=dim,
        shape=shape,
        rng=rng,
        scale=scale,
        image_path=image_path,
    )

    edges = build_edges(
        positions=positions,
        mode=mode,
        k=k,
        epsilon=epsilon,
        bipartite_ratio=bipartite_ratio,
        quantile_scale=quantile_scale,
        power_law_exp=power_law_exp,
        rng=rng,
        dim=dim,
    )

    is_bipartite = "bipartite" in mode
    all_edges, false_edges, requested_false = inject_false_edges(
        edges,
        n_nodes=len(positions),
        is_bipartite=is_bipartite,
        bipartite_ratio=bipartite_ratio,
        false_edges_number=false_edges_number,
        false_edges_fraction=false_edges_fraction,
        rng=rng,
    )

    adjacency = _edges_to_adjacency(len(positions), all_edges)

    edge_df = pd.DataFrame(sorted(all_edges), columns=["source", "target"])
    edge_df["edge_id"] = np.arange(len(edge_df), dtype=int)
    edge_df["is_false"] = edge_df.apply(
        lambda r: (int(r["source"]), int(r["target"])) in false_edges,
        axis=1,
    )

    node_df = pd.DataFrame({"node_id": np.arange(len(positions), dtype=int)})
    node_df.attrs["simulation"] = {
        "seed": seed,
        "shape": shape,
        "mode": mode,
        "n": int(n),
        "dim": int(dim),
        "k": int(k),
        "epsilon": float(epsilon),
        "requested_false_edges": int(requested_false),
        "is_bipartite": bool(is_bipartite),
    }

    return SpatialGraph(
        adjacency_matrix=adjacency,
        positions=positions,
        node_metadata=node_df,
        edge_metadata=edge_df,
        keep_lcc=False,
    )
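
The last steps of the listing turn an edge set into an adjacency structure and a labelled edge table. The sketch below mirrors that flow with plain Python containers. The helper `edges_to_adjacency_sketch` is hypothetical, and both the dense list-of-lists representation and the symmetric (undirected) fill are assumptions; the real _edges_to_adjacency may return a different type.

```python
def edges_to_adjacency_sketch(n_nodes, edges):
    """Build a dense, symmetric 0/1 adjacency matrix from (u, v) pairs."""
    adjacency = [[0] * n_nodes for _ in range(n_nodes)]
    for u, v in edges:
        adjacency[u][v] = 1
        adjacency[v][u] = 1  # assumed undirected
    return adjacency


all_edges = {(0, 1), (1, 2), (0, 3)}
false_edges = {(0, 3)}

adjacency = edges_to_adjacency_sketch(4, all_edges)

# Sort the edges, number them, and flag the injected ones, as the listing
# does when it builds edge_df.
edge_rows = [
    {"edge_id": i, "source": u, "target": v, "is_false": (u, v) in false_edges}
    for i, (u, v) in enumerate(sorted(all_edges))
]

print(adjacency[0])  # [0, 1, 0, 1]
print([row["is_false"] for row in edge_rows])  # [False, True, False]
```

The membership test only works if all_edges and false_edges store node pairs in the same orientation, which the sketch ensures by construction.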