nPSO
Nonuniform Popularity-Similarity Optimisation
Embeds nodes on a hyperbolic disk and connects pairs that lie close in the geometry; a temperature knob trades clustering coefficient against randomness.
Most generators in this project are driven by degree and block statistics, not geometry. They place edges from counts, so triangles happen only by coincidence.
nPSO is the one exception. It gets triangles by embedding nodes in a hyperbolic disk and connecting nodes that are geometrically close. A temperature knob \(T \in (0, 1)\) controls how strictly "close" is enforced: small \(T\) means only very close pairs connect (high clustering), large \(T\) means random-ish (low clustering). Matching an empirical clustering coefficient means searching for the right \(T\).
nPSO has no outlier category. Every node lives in a cluster.
By default, every node that is not part of a size-2+ community becomes its own 1-node cluster, and the generator treats those size-1 clusters like any other cluster. The graph below colours the 20-node fixture by the resulting 5 clusters (3 real + 2 singletons).
Fit every scalar the generator needs.
Profile reads the reference network + clustering and distils them into a small, explicit contract. The default model is nPSO2, so alongside the scalars profile also emits a weight vector \(\rho_k = \text{size}_k / N\), one entry per mixture component (including the singleton outliers introduced above).
| N | · | Number of nodes. |
|---|---|---|
| m | · | Half of the mean degree, rounded. Controls how many existing nodes each arrival attaches to. |
| γ | · | Power-law exponent of the degree distribution, fit on the input degree sequence and floored at \(\gamma \ge 2\). |
| C | · | Number of mixture components. Each input outlier becomes its own 1-node cluster, bumping C by the outlier count. |
| CG | · | Global clustering coefficient of the input. The secant search tries to hit this value. |
The paper defines three angular-distribution variants. nPSO1 uses uniform weights \(\rho_k = 1/C\); it is the paper's primary benchmark. nPSO2 accepts caller-supplied \(\rho_k\); the paper introduces it as the way to generate "communities of different sizes". nPSO3 mixes Gaussians with Gammas to build asymmetric lobes. Our inputs have clusters of very different sizes, so we fit the mixing proportions to the cluster sizes and pass them through: this is why profile emits the \(\rho_k\) vector.
Embed every node on the Poincaré disk.
Each node gets two polar coordinates. The radial position encodes popularity (hubs near the centre, low-degree nodes at the rim); the angular position encodes similarity (nearby angles = similar).
Nodes are indexed in descending order by degree, so the biggest hub is \(i = 1\) and the lowest-degree node is \(i = N\). The rank appears as each node's label in the disk below.
With \(\beta = 1/(\gamma - 1)\), following the PSO growth process (Papadopoulos et al. 2012, extended for communities in Muscoloni & Cannistraci 2018), node \(i\)'s radial coordinate at simulation time \(t \ge i\) is
\[ r_i(t) \;=\; 2\beta \, \ln(i) \;+\; 2(1-\beta) \, \ln(t). \]Each new arrival pushes every existing node outward (popularity fading), so a node born early eventually drifts out as later nodes arrive. Step the disk below through \(t = 1, 2, \ldots, N\) to watch \(r_i(t)\) evolve: arrivals \(i > t\) are not yet drawn, and arrived nodes drift outward as \(t\) grows. At \(t = N\) you see the final-state layout \(r_i(N)\).
Default model is nPSO2: the angular coordinate is sampled from a Gaussian mixture with \(C\) equidistant means \(\mu_k = 2\pi k / C\), common width \(\sigma = 2\pi / (6C)\), and profile-supplied weights \(\rho_k = \text{size}_k / N\):
\[ \theta_i \;\sim\; \sum_{k=1}^{C} \rho_k \, \mathcal{N}(\mu_k, \sigma^2) \pmod{2\pi}. \]The original PSO model (Papadopoulos et al. 2012) draws \(\theta_i \sim \text{Uniform}(0, 2\pi)\): angles spread evenly around the disk, so the output graph has only a radial popularity ladder and no community structure. Muscoloni & Cannistraci 2018 replace the uniform angular law with the Gaussian mixture above: angles concentrate around \(C\) component means \(\mu_k\), which is what produces visible clusters on the disk. The leading "n" stands for nonuniform, naming exactly that change to the angular distribution; the radial law is unchanged from PSO.
Each sample first picks a component \(k\) with probability \(\rho_k\), then draws \(\theta_i\) from \(\mathcal{N}(\mu_k, \sigma^2)\). Input cluster labels do not survive this stage. Every node is re-assigned to the nearest component mean:
\[ \mathcal{C}'(i) \;=\; \arg\min_k d_\mathrm{ang}(\theta_i, \mu_k), \]where \(d_\mathrm{ang}\) is the shortest circular distance between two angles, i.e. the minimum of \(|\theta_i - \mu_k|\) and \(2\pi - |\theta_i - \mu_k|\).
Nodes drawn from low-weight components can land closer to a high-weight component's mean and get re-assigned; that is by design, and it is why singleton outliers (with \(\rho = 1/N\)) can end up absorbed by nearby large clusters rather than staying isolated.
Every component gets a full angular sector regardless of weight, so that part is fixed; what changes is how many nodes each sector attracts. Treating singletons as \(1/C\)-weighted components (nPSO1) would have each outlier claim a similar share of the nodes as a real cluster, producing outlier "communities" with several nodes each. nPSO2 with \(\rho_k = \text{size}_k / N\) gives singleton outliers correspondingly tiny mixture weight, so they attract almost no nodes, which is closer to the reality that they are outliers.
Argmin reassignment can leave a cluster with only one node, or with zero nodes. A post-process drops both: a one-node cluster gets relabelled as an outlier, and a zero-node cluster is removed entirely. The final clustering only lists clusters of size 2 or more.
Close on the disk, likely connected.
Zoom in on one arrival. Node \(i\) has just landed on the disk at polar coordinates \((r_i, \theta_i)\) and needs to decide, for each earlier arrival \(j \in \{1, \ldots, i-1\}\), how likely the pair \((i, j)\) is to be connected. The generator does this in three sequential computations.
Step 1. Compute the disk's connection radius at this arrival, \(R_i(T)\). This scalar is a closed form in \(i\), \(m\), \(T\), and \(\beta = 1/(\gamma - 1)\); it does not depend on \(j\), so it is computed once per arrival:
\[ R_t(T) \;=\; 2 \ln t \;-\; 2 \ln\!\left[\; \frac{2T \bigl(1 - e^{-(1 - \beta) \ln t}\bigr)}{\sin(\pi T)\, m\, (1 - \beta)} \;\right] \qquad (\beta \neq 1). \]The \(\beta = 1\) case is the L'Hopital limit of the same expression, not a "drop the \((1-\beta)\) factor" shortcut: at \(\beta = 1\) the general formula goes \(0/0\), so a plain substitution doesn't work; the limit yields
\[ R_t(T) \;=\; 2 \ln t \;-\; 2 \ln\!\left[\; \frac{2T \ln t}{\sin(\pi T)\, m} \;\right] \qquad (\beta = 1). \]Because \(t\) is the current arrival index, the disk grows one notch per arrival, and every new vertex sees a fresh radius. The closed form is undefined at \(t = 1\); we do not need to compute it for the first node, and the next section explains why.
Step 2. For each candidate \(j\) already on the disk, compute the hyperbolic distance between \(i\) and \(j\) using the radii at the current arrival time \(t = i\):
\[ h_{ij}(t) \;=\; \mathrm{arccosh}\bigl( \cosh r_i(t) \cosh r_j(t) \;-\; \sinh r_i(t) \sinh r_j(t) \cos d_\mathrm{ang}(\theta_i, \theta_j) \bigr). \]Close in radius AND close in angle gives a small \(h_{ij}\); a big radial gap or a diametric angular gap blows it up. One distance per candidate.
Step 3. Plug the pair \((h_{ij}, R_i(T))\) into the Fermi-Dirac sigmoid to get the connection probability:
\[ p(i, j) \;=\; \frac{1}{1 + \exp\!\bigl((h_{ij} - R_i(T)) / (2T)\bigr)}. \]Pairs closer than \(R_i(T)\) get \(p \to 1\); pairs further than \(R_i(T)\) get \(p \to 0\); pairs sitting right on the radius get \(p = 1/2\). \(T\) controls how sharp the transition is: small \(T\) means hard step at the radius, large \(T\) means gradual slope.
The plot below shows \(p\) as a function of \(h_{ij}\) with the radius pinned at the dashed vertical. The \(T\) slider scales the sigmoid: a hard step at \(T \to 0\), a shallow slope at \(T \to 1\).
Same picture on the disk: pick any node as source to colour the rest of the disk by \(p(i, \cdot)\). The hyperbolic ball boundary becomes a soft gradient that hugs the dashed cutoff line at low \(T\) and diffuses across the disk at high \(T\). The widget below merges the step 1 ball, step 2 geodesic, and step 3 sigmoid into one click-to-pick view; toggle show heat to drop the cell shading and see the bare ball/geodesic.
The \(p(i, j)\) values shown above are the weights the next section uses to pick which \(m\) predecessors node \(i\) actually connects to.
Drag \(T\). Watch which edges get picked.
Edges are placed via the paper's implementation 3. Process nodes in arrival order \(i = 1, 2, \ldots, N\). For each new arrival \(i\) look at the set of earlier arrivals \(V_i = \{1, \ldots, i-1\}\) and draw a target set \(S_i \subseteq V_i\) by weighted sampling without replacement:
- If \(|V_i| \leq m\) (the first \(m+1\) arrivals), connect \(i\) to every earlier node, creating \(m(m+1)/2\) edges.
- Otherwise pull out \(m\) distinct targets from \(V_i\) one at a time: the probability of picking \(j\) on each pull is proportional to \(p(i, j)\). Remove \(j\) from the pool, repeat until \(|S_i| = m\). That spends exactly \(m\) edges per arrival for the remaining \(N - m - 1\) arrivals.
The totals add up to \(m(m+1)/2 + (N - m - 1) \cdot m\), independent of \(T\). The temperature knob reshuffles which pairs end up in each \(S_i\) through the weights \(p(i, j)\), but never the size of any \(S_i\).
The first \(m+1\) arrivals form a complete graph \(K_{m+1}\), which is \(m\)-edge-connected. Every later arrival attaches to exactly \(m\) existing nodes; the same inductive argument that underpins EC-SBM v1's \(K_{k+1}\) proof (Theorem 1 in Vu-Le et al. 2025) carries over, so the entire output graph ends up with edge connectivity \(\geq m\). This is a property of the graph, not of each individual cluster: nPSO places edges geometrically on the disk, not per-cluster, so a cluster's induced subgraph can still have a thin cut.
Why does \(T\) move \(cc\) when edge counts are fixed? Triangles need three pairwise-close nodes to all be wired up.
At low \(T\) the weights \(p(i, j)\) collapse onto the hyperbolic ball: each arrival's \(m\) picks concentrate on its few nearest predecessors, so when two of \(i\)'s targets are themselves close to each other they get picked together and the triangle closes.
At high \(T\), \(p(i, j)\) flattens toward uniform on \(V_i\). Picks scatter across the disk, the "two of my targets are also close to each other" coincidence stops happening, and triangle density (and so \(cc\)) drops. Same edge budget, different geometry of which edges, different clustering.
Bracket, then secant, with a margin guard.
\(cc(T)\) trends downward in \(T\) but is noisy and not strictly monotone. The code treats this as 1D root-finding. Start with \([T_{\min}, T_{\max}] = [0, 1]\) and bisect. Residual signs drive the bracket update backwards from the textbook: \(cc > \mathrm{target}\) means \(T\) is too small, so \(T_{\min}\) moves right; \(cc < \mathrm{target}\) means \(T\) is too large, so \(T_{\max}\) moves left.
Once both bracket ends have opposite-signed residuals, secant fires:
\[ T_{\mathrm{sec}} \;=\; T_{\min} \;-\; \frac{f_{\min}}{f_{\max} - f_{\min}} \cdot (T_{\max} - T_{\min}). \]Margin guard: if the secant iterate falls within \(5\%\) of the bracket width from either endpoint (\(T_{\mathrm{sec}} \leq T_{\min} + 0.05(T_{\max} - T_{\min})\) or \(T_{\mathrm{sec}} \geq T_{\max} - 0.05(T_{\max} - T_{\min})\)), the code falls back to the bracket midpoint for that iter. Without the guard, secant stalls hugging the endpoint when \(cc(T)\) flattens near a bracket edge.
Stop conditions: residual \(< 0.005\), step \(< 0.0001\), or \(100\) iters. The wrapper keeps the lowest-residual iterate even when no stop condition fired.
With --search-samples-per-T S, each iter draws \(S\) realisations at the same \(T\) with distinct seeds; the residual driver is the sample mean. Noise on the mean shrinks like \(1/\sqrt{S}\), at \(S\times\) MATLAB cost per iter. The output edge list is the realisation whose \(cc\) lands closest to the mean.
The framework assumes \(cc(T)\) is monotone so residual signs carry information. The ghost band below shows the mean curve drifting downward but not strictly so, and the \(\pm 1\sigma\) / \(\pm 2\sigma\) bands show the single-shot noise on a real search iterate. The cobalt inner band tracks the mean's \(\pm \sigma/\sqrt{S}\) envelope at the current \(S\); that is the residual the search actually consumes. Equal-signed residuals can cluster on the wrong side, the secant never fires, and bisection walks into a local minimum.
nPSO's achievable \(cc\) ceiling depends on \((N, m, \gamma, C)\). If the target line sits above the top of the \(\pm 2\sigma\) band at every \(T\), even driving \(T\) toward \(0\) cannot push \(cc(T)\) up to it. When that happens the search never crosses the target, residuals stay same-signed across the whole bracket, and bisection walks \(T \to 0\) until the step collapses.
On this \(20\)-node synthetic the target sits at the top edge of the ghost band: barely reachable. The DNC case study below is the extreme: target \(0.548\), ghost band top barely above \(0.1\).
cc(T) mean (\(64\) draws per \(T\)) with \(\pm 1\sigma\) and \(\pm 2\sigma\) shaded bands; mint dashed line is the target. Cobalt dashed envelope (only when \(S>1\)) is the mean's \(\pm \sigma/\sqrt{S}\), the residual the search actually consumes. Azure dots = mean cc per iter (current iter haloed, best-so-far ringed). Bottom: bracket heatmap per iter. Blue = cc below target, red = above. Red ring = sampled \(T\); mint dashed vertical = where mean cc crosses the target inside the bracket.What you get on the shipped example.
Default run on dnc + sbm-flat-best+cc at --seed 1, nPSO2, every input outlier as its own singleton cluster:
| stat | input | nPSO output | note |
|---|---|---|---|
| # nodes | 906 | 906 | Exact: the disk is built node by node up to \(N\). |
| # clusters | 87 | 168 | Argmin reassignment redistributes nodes across 442 mixture components (87 real + 355 singleton outliers); singleton outliers can catch stray nodes and graduate, real clusters can fragment. |
| # edges | 10,429 | 10,794 | Within \(3.5\%\). Fixed at \(m(m+1)/2 + m(N - m - 1)\), where \(m = \lfloor \text{mean deg} / 2 \rceil = 12\); independent of \(T\) and search outcome. |
| global cc | 0.548 | 0.106 | Did not converge. The search bisects down to \(T = 0.00391\) with \(cc = 0.106\), residual \(0.443\); MATLAB's datasample blows up below that, so the walk hits a numerical wall before the secant can rule out a lower-\(T\) ceiling. |
| char. time | 40.41 | 8.15 | Early-arrival hubs sit near the disk centre and connect to most later nodes, so they shortcut walks across angular sectors. |
| pseudo-diameter | 8 | 3 | Same hub effect: rank-1 connects to almost every later node, so most pairs are within two hops through the centre. |
nPSO with \((N = 906, m = 12, \gamma = 2.0, C = 442)\) peaks at \(cc \approx 0.106\) across the sampled \(T\) range: the hyperbolic disk with these parameters cannot generate clustering as high as \(0.548\). The search bisects down the bracket from \(T = 0.5\) to \(T = 0.00391\) over 9 iters, picks up \(cc\) only from \(0.099\) to \(0.106\), then hits MATLAB's datasample failure at \(T \approx 0.001\); the secant never fires because residuals stay same-signed.