ERGM: Creating Large Fully-Connected Network objects in Statnet
TL;DR
Use the following approach:
net <- network(matrix(1, n, n), directed=FALSE)
(At least, this holds for Statnet 1.7 on 64-bit R 2.13.)
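A quick sanity check, offered as a sketch rather than part of the original experiment: a fully connected undirected graph on n nodes without self-loops should hold n(n-1)/2 edges. The names n, net, and expected_edges are my own, and the comparison only runs if the network package (part of Statnet) is installed. It also assumes network() ignores the diagonal of the matrix by default, which is worth confirming on your own version.

```r
n <- 20                               # small size so the check runs instantly
expected_edges <- n * (n - 1) / 2     # complete undirected graph, no self-loops

if (requireNamespace("network", quietly = TRUE)) {
  net <- network::network(matrix(1, n, n), directed = FALSE)
  # network.edgecount() reports the number of edges stored in the object
  cat("edges stored:", network::network.edgecount(net),
      "expected:", expected_edges, "\n")
}
```

If the two numbers differ, the most likely cause is self-loops being kept, so check the loops argument.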
In more depth
Statnet seems optimized for sparsely connected graphs. This is not too surprising, since many of the "real" graphs I deal with have a density around d = 0.0002, and even some fairly large small-world graphs, like IMDB, only have a density of ~0.18. However, there is one special case where I need a fully connected graph: the input to an edgecov() term in an ERGM. This graph has to contain all possible edges, not just the observed ones, so its density is 1.0.
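To get a feel for the difference in scale, here is a minimal sketch of the arithmetic. The helper edges_at_density is a hypothetical name of mine, and it assumes an undirected graph without self-loops, so the maximum number of edges is n(n-1)/2.

```r
# Number of undirected edges implied by a given density on n nodes
# (assumes no self-loops, so the maximum is n * (n - 1) / 2).
edges_at_density <- function(n, density) {
  density * n * (n - 1) / 2
}

n <- 500
edges_at_density(n, 1.0)      # fully connected: 124750 edges
edges_at_density(n, 0.0002)   # sparse: about 25 edges
```

At 500 nodes the fully connected covariate graph carries several thousand times as many edges as a graph at the sparse densities mentioned above.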
One of the challenges is how to create and initialize these very large networks. The creation step would often take a very long time, and it wasn't clear I was using the best approach. At least two methods seemed plausible: use matrix to build a full adjacency matrix and pass it to network, or use network.initialize and then add all of the edges afterwards. It was not clear up front which one would be faster, so I did a quick experiment: I ran each method 50 times on graphs of various sizes and compared the results.
library(network)  # part of Statnet

# Method 1: network(matrix())
startTime <- proc.time()
for (i in 1:50) n <- network(matrix(1, 200, 200), directed=FALSE)
proc.time() - startTime

# Method 2: network.initialize() and then assignment
startTime <- proc.time()
for (i in 1:50) {
  n <- network.initialize(200, directed=FALSE)
  n[,] <- 1  # set every dyad to 1, i.e. add all possible edges
}
proc.time() - startTime
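The two loops above can also be wrapped in a small helper built on base R's system.time(), which avoids the manual start/stop bookkeeping. This is a sketch, not the code used for the numbers in this post: time_reps is a hypothetical name, and the graph size and repetition count are deliberately small so that it finishes quickly. The network part only runs if the package is installed.

```r
# Hypothetical helper: total elapsed seconds for `reps` calls of `f`.
time_reps <- function(f, reps = 50) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

if (requireNamespace("network", quietly = TRUE)) {
  library(network)
  size <- 50  # kept small here; the post used 200 and 500

  # Method 1: build from a full adjacency matrix
  t_matrix <- time_reps(function() {
    network(matrix(1, size, size), directed = FALSE)
  }, reps = 5)

  # Method 2: initialize an empty network, then assign all edges
  t_assign <- time_reps(function() {
    g <- network.initialize(size, directed = FALSE)
    g[, ] <- 1
    g
  }, reps = 5)

  print(c(matrix = t_matrix, assign = t_assign))
}
```

Raising size and reps back to the values in the post should reproduce the comparison, though absolute times will depend on the package version and the machine.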
The results were pretty clear (total time in seconds for the 50 runs, as reported by proc.time(), except where noted):
Method | 200x200 | 500x500
#1 | 12.32 | 361.91
#2 | 42.42 | 8+ hours
I used proc.time() for the timing. There are suggestions that it is not very accurate, but the difference is so stark that I think even 1 s resolution is more than enough. Also, I've discovered that 32-bit R is a really bad environment for working with even "medium-sized" graphs (500 nodes or so), much less "large" graphs. The extra address space afforded by the 64-bit version of R avoids a lot of out-of-memory conditions.
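For anyone who wants to see how much memory these objects take on their own setup, base R's object.size() gives a rough figure. This is only a sketch: it does not reproduce the out-of-memory conditions mentioned above, the names n, adj, and net are mine, and the network part only runs if the package is installed.

```r
n <- 200
adj <- matrix(1, n, n)

# A double-precision matrix stores 8 bytes per cell plus a small header.
print(object.size(adj), units = "Mb")

if (requireNamespace("network", quietly = TRUE)) {
  net <- network::network(adj, directed = FALSE)
  # The network object keeps per-edge structures, so it is expected
  # to be larger than the raw matrix; the exact size varies by version.
  print(object.size(net), units = "Mb")
}
```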