Synthetic Network Generators
A gallery of generators walked stage by stage on the same small input. A community detector cannot be stress-tested without synthetics whose ground truth is known: each generator here takes a reference network and its community structure, and produces fresh draws keyed to that ground truth.
Given a network \(G\) (undirected, unweighted, simple) with a community structure \(\mathcal{C}\) on its nodes, we want to sample a family of networks that are statistically similar to \(G\) and \(\mathcal{C}\) without being identical. \(\mathcal{C}\) can be ground-truth or produced by a detection algorithm; the generator does not care which. What "statistically similar" means differs from one generator to the next: each freezes a different summary of \(G\) and \(\mathcal{C}\) and randomises the rest.
Scoring a community-detection algorithm requires networks whose answer can be checked. An empirical network gives only one labelling, which may itself be wrong, and a single network is a sample of one. Synthetics fix that: each generator emits a network and its ground-truth clustering together, across as many seeds, sizes, and noise levels as the test demands.
Each page below walks one generator from input to output on the same 20-node example, keeps one interactive widget per stage, and closes with a plain note on what holds and what drifts.
Two labels recur across the pages: a node is either clustered (inside a non-trivial community) or an outlier (standing alone), and generators treat the two cases differently.
Every generator runs the same three-stage pipeline: profile the input to extract the parameters the generator needs, generate a fresh graph from those parameters, then post-process the output. Post-processing covers up to three steps. An optional match-degree rewire aligns the generated degree sequence to the input (see degree matchers). A simplify pass collapses parallel edges and drops self-loops. A singleton drop removes size-1 clusters and relabels the lone node as an outlier.
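The simplify pass and the singleton drop can be sketched in a few lines. This is a minimal illustration, not the generators' actual code: the function names, the edge-list representation, and the use of `None` as the outlier label are all assumptions made for the example. The match-degree rewire is left out because the text does not specify it.

```python
from collections import Counter

def simplify(edges):
    """Collapse parallel edges and drop self-loops from an undirected edge list."""
    seen = set()
    for u, v in edges:
        if u == v:
            continue  # self-loop: dropped
        # Store each edge in canonical order so (u, v) and (v, u) merge.
        seen.add((u, v) if u <= v else (v, u))
    return sorted(seen)

def drop_singletons(labels):
    """Relabel the lone node of every size-1 cluster as an outlier (None)."""
    sizes = Counter(c for c in labels.values() if c is not None)
    return {
        node: (c if c is not None and sizes[c] > 1 else None)
        for node, c in labels.items()
    }

# Hypothetical generator output: a duplicate edge, a reversed duplicate, a self-loop.
raw_edges = [(0, 1), (1, 0), (2, 2), (1, 2), (0, 1)]
raw_labels = {0: "a", 1: "a", 2: "b", 3: None}

clean_edges = simplify(raw_edges)          # [(0, 1), (1, 2)]
clean_labels = drop_singletons(raw_labels) # node 2 becomes an outlier
```

Running the simplify pass before the singleton drop matches the order given above, although nothing in this sketch depends on that order.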
The 20-node example:
- Nodes: 18 clustered, 2 outliers
- Edges: 27 intra-cluster, 8 inter-cluster, 4 clustered-outlier, 1 outlier-outlier
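The four edge categories in the tally above follow directly from the two node labels. The sketch below shows one way to compute such a tally. The helper names and the convention that outliers carry the label `None` are assumptions for illustration, and the toy graph is not the 20-node example.

```python
from collections import Counter

def edge_category(u, v, labels):
    """Classify an undirected edge by the labels of its endpoints.

    `labels` maps node -> cluster id; outliers (and unlisted nodes) map to None.
    """
    cu, cv = labels.get(u), labels.get(v)
    if cu is None and cv is None:
        return "outlier-outlier"
    if cu is None or cv is None:
        return "clustered-outlier"
    return "intra-cluster" if cu == cv else "inter-cluster"

def tally_edges(edges, labels):
    """Count the edges falling in each of the four categories."""
    return Counter(edge_category(u, v, labels) for u, v in edges)

# Toy input: two clusters of two nodes each, plus two outliers.
labels = {0: "a", 1: "a", 2: "b", 3: "b", 4: None, 5: None}
edges = [(0, 1), (2, 3), (1, 2), (3, 4), (4, 5)]
counts = tally_edges(edges, labels)
# counts: intra-cluster 2, inter-cluster 1, clustered-outlier 1, outlier-outlier 1
```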
Block-model generators
Central objects: the cluster assignment \(\mathcal{C}\), the block-to-block edge-count matrix, and the per-node degree sequence (every variant here uses the degree-corrected SBM). Variants add further structural constraints (per-cluster edge connectivity) on top.
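The profile stage for this family can be pictured as a single pass over the edge list. The sketch below extracts the degree sequence and the block-to-block edge counts. It assumes every node carries an orderable cluster label, and it stores the symmetric matrix sparsely, keyed by sorted label pairs. How the actual generators represent the matrix or handle outliers is not stated here, so treat both choices as illustrative.

```python
from collections import Counter

def profile_blocks(edges, labels):
    """Return (degree, block_edges) for an undirected simple graph.

    degree[node]        -> number of incident edges
    block_edges[(a, b)] -> number of edges between clusters a and b, with a <= b
    """
    degree = Counter()
    block_edges = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        a, b = labels[u], labels[v]
        # Sort the pair so each unordered block pair has exactly one key.
        block_edges[(a, b) if a <= b else (b, a)] += 1
    return degree, block_edges

# Toy input: a 4-cycle split into two clusters of two nodes.
labels = {0: 0, 1: 0, 2: 1, 3: 1}
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
degree, block_edges = profile_blocks(edges, labels)
# degree: every node has degree 2
# block_edges: {(0, 0): 1, (0, 1): 2, (1, 1): 1}
```

A degree-corrected SBM would then take these two summaries, together with the cluster assignment, as its parameters.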
Mixing-parameter generators
Central object: a single scalar setting the fraction of edges that cross cluster boundaries. Variants differ in what drives the degrees and cluster sizes (resampled from a power law, or taken as given).
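Read literally, the scalar is the number of boundary-crossing edges divided by the total number of edges. The sketch below computes that global fraction. It assumes every node has a cluster label; the function name is hypothetical, and individual generators may define the parameter per node or treat outliers differently.

```python
def mixing_parameter(edges, labels):
    """Fraction of edges whose endpoints lie in different clusters."""
    if not edges:
        raise ValueError("mixing parameter is undefined for a graph with no edges")
    crossing = sum(1 for u, v in edges if labels[u] != labels[v])
    return crossing / len(edges)

# Toy input: a 4-cycle split into two clusters of two nodes.
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
mu = mixing_parameter(edges, labels)  # 2 crossing edges out of 4 -> 0.5
```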
Geometric generators
Nodes placed in a latent geometric space; edges drawn from proximity. The only family here where clustering coefficient is a design goal rather than a side effect.
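The simplest instance of this idea is a hard-threshold random geometric graph: scatter points uniformly in the unit square and connect any pair closer than a fixed radius. The sketch below is only that textbook baseline. The choice of space, the uniform placement, and the sharp cutoff are assumptions, and the generators in this family may use a different geometry or a probabilistic connection rule.

```python
import math
import random

def random_geometric_edges(n, radius, seed=0):
    """Place n points uniformly in the unit square; link pairs within `radius`."""
    rng = random.Random(seed)  # local RNG so the draw is reproducible per seed
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= radius
    ]
    return points, edges

points, edges = random_geometric_edges(n=20, radius=0.3, seed=1)
```

Clustering arises here because two neighbours of the same node are both close to it, and so tend to be close to each other as well.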