Contents:
Script demonstrating the use of the estrangement library to detect and visualize temporal communities.
Function to run simulations, based on a specified dataset, and created tiled plots of the temporal communities.
Parameters can be specified at the command line, when calling this script. Alternatively, a config file specifed at the command line can be used to set the parameter. At the very minimum, a path to the data set must be specified.
Each experiment requires a name, which is used to create a folder to store the results of the simulation. If the results already exist in the folder specified by the experiment name, plots are created using these existing results and the simulation is not run on subsequent calls to EstrangementDemo.py. To run the simulation again, delete the experiment folder before running this script, or use a different experiment name.
Examples
>>> # To see all configuarable parameters use the -h option
>>> EstrangementDemo.py -h
>>> # Configurable parameters can be specified at the command line
>>> EstrangementDemo.py --dataset_dir ./data --display_on True --exp_name my_experiment
>>> # A config file can be used, but it must be preceeded by an '@'
>>> # Three config files are provided as examples, check that that path to the dataset is valid.
>>> EstrangementDemo.py @senate.conf
>>> EstrangementDemo.py @markovian.conf
>>> EstrangementDemo.py @realitymining.conf
This module implements the Estrangement Confinement Algorithm (ECA) and various functions necessary to read the input snapshots, process information and return the results and/or print them to file.
Constructs and returns the Zgraph, as defined in Eq 3 of the paper.
Parameters : | g0, g1: networkx.Graph :
g0_label_dict: dictionary {node:community} :
|
---|---|
Returns : | Z: graph :
|
Examples
>>> g0 = nx.Graph()
>>> g0.add_edges_from([(1,2,{'weight':2}),(1,3,{'weight':1}),(2,4,{'weight':1})])
>>> g1 = nx.Graph()
>>> g1.add_edges_from([(1,2,{'weight':2}),(1,3,{'weight':1}),(3,4,{'weight':1})])
>>> labels = {1:'b',2:'b',3:'b',4:'b',5:'b',6:'b'}
>>> print(make_Zgraph(g0,g1,labels)
Function to read datasets from files in datadir.
Each file represents a graph for a particular timestamp. The name of the files is expected to be <timestamp>.ncol, and each line in the file represents one edge in the graph e.g. line:’ 1 2 5 ‘ indicates there is an edge between nodes ‘1’ and ‘2’ with weight ‘5’
Parameters : | datadir: string :
tolerance: float,optional :
minrepeats: integer :
|
---|---|
Returns : | t: list :
g1: networkx.Graph :
initial_label_dictionary: dictionary { node: community} :
|
Function which returns the partitioning of the input graph, into communities, which maximizes the value of the quality function Q.
In this case, the quality function is modularity. See, [Blondel08] for more details. Note that estrangement is not taken into cosideration in this calculation. This is only used for computing a partition for the intial snapshot.
Parameters : | g1: networkx.Graph :
minrepeats: integer, optional :
tolerance: float, optional :
|
---|---|
Returns : | dictPartition: dictionary {node:community} :
|
Examples
>>> g0 = nx.Graph
>>> g0.add_edges_from([(1,2,{'weight':2}),(1,3,{'weight':1}),(2,4,{'weight':1})])
>>> print(maxQ(g0,minrepeats=10))
Function to make repeated calls to the Label Propagation Algorithm (LPA) and store the values of Q, E and F, as well as the corresponding partitioning for later use. F is the Lagrangian, referred to in the paper as L.
Parameters : | g1: networkx graph :
delta: float :
tolerance: float :
Zgraph: networkx graph :
repeats: integer :
|
---|---|
Returns : | dictPartition: List of dictionaries representing community labeling :
|
Examples
>>> g0 = nx.Graph()
>>> g0.add_edges_from([(1,2,{'weight':1}),(2,3,{'weight':1}),(1,3,{'weight':1}),(3,4,{'weight':1}),(4,5,{'weight':1}),(4,6,{'weight':1}),(5,7,{'weight':1}),(6,7,{'weight':1}),(8,9,{'weight':1}),(8,10,{'weight':1}),(9,10,{'weight':1})])
>>> dictPartition,dictQ,DictE,DictF = estrangement.repeated_runs(g0, delta=0.05, tolerance=0.01, lambduh=1,g0,repeats = 3)
The Estrangement Reduction Algorithm, as decribed in: [KS12]. This function detects temporal communities and returns the results.
Parameters : | dataset_dir: string :
results_filename: string :
tolerance: float, optional :
convergence_tolerance: float,optional :
delta: float :
minrepeats: integer,optional :
increpeats: integer,optional :
maxfun: integer, optional :
write_stats, optional :
|
---|---|
Returns : | matched_labels : dictionary {time: {node:label}}
|
References
[KS12] | (1, 2) Vikas Kawadia and Sameet Sreenivasan, ``Sequential detection of temporal communities by estrangement confinement ,’’ in Nature Scientific Reports, Nov 2012, http://www.nature.com/srep/2012/121109/srep00794/full/srep00794.html |
Examples
>>> ECA(dataset_dir='tests/sample_data',delta=0.001)
This module implements community detection using an agglomerative approach as decribed in [Blondel08]. It has been modified for estrangement confinement. Tje original code is available here: http://perso.crans.org/aynaud/communities/
Function which return the partition of the nodes at the given level.
A dendogram is a tree and each level is a partition of the graph nodes. Level 0 is the first partition, which contains the smallest communities, and the best partition is at height [len(dendogram) - 1]. The higher the level in the dendogram, the bigger the communities in that level. This function merely processes an existing dendogram, which is created by estrangement.agglomerate.generate_dendogram.
Parameters : | dendogram : list of dict
level : int
|
---|---|
Returns : | partition : dictionary
|
Raises : | KeyError :
|
See also
Examples
>>> G=nx.erdos_renyi_graph(100, 0.01)
>>> dendo = generate_dendogram(G)
>>> for level in range(len(dendo) - 1) :
>>> print "partition at level", level, "is", partition_at_level(dendo, level)
Function to compute the modularity of a partitioning of a graph, as defined in [Newman04].
Parameters : | partition : dictionary {node:community label}
graph : networkx.Graph
|
---|---|
Returns : | modularity : float
|
Raises : | KeyError: :
ValueError: :
TypeError: :
|
References
[Newman04] | (1, 2, 3) M.E.J. Newman and M. Girvan, “Finding and evaluating community structure in networks.” Physical Review E 69:26113 (2004). |
Examples
>>> G=nx.erdos_renyi_graph(100, 0.01)
>>> part = best_partition(G)
>>> modularity(part, G)
Function to compute the partition of the graph nodes which maximises the modularity (or tries to) using the Louvain heuristices.
This is the partition of highest modularity, i.e. the highest partition of the dendogram generated by the Louvain algorithm, see: [Blondel08]
Parameters : | graph : networkx.Graph
tolerance: float :
lambduh: float :
Zgraph : networkx.Graph
partition: dictionary {node : community label}
q : multiprocessing.Queue
|
---|---|
Returns : | partition : dictionnary {node
|
See also
References
[Blondel08] | (1, 2, 3, 4, 5, 6) Blondel, V.D. et al. Fast unfolding of communities in large networks. J. Stat. Mech 10008, 1-12(2008). |
Examples
>>> #Basic usage
>>> G=nx.erdos_renyi_graph(100, 0.01)
>>> part = best_partition(G)
>>> #other example to display a graph with its community :
>>> #better with karate_graph() as defined in networkx examples
>>> #erdos renyi don't have true community structure
>>> G = nx.erdos_renyi_graph(30, 0.05)
>>> #first compute the best partition
>>> partition = community.best_partition(G)
>>> #drawing
>>> size = float(len(set(partition.values())))
>>> pos = nx.spring_layout(G)
>>> count = 0.
>>> for com in set(partition.values()) :
>>> count = count + 1.
>>> list_nodes = [nodes for nodes in partition.keys()
>>> if partition[nodes] == com]
>>> nx.draw_networkx_nodes(G, pos, list_nodes, node_size = 20,
node_color = str(count / size))
>>> nx.draw_networkx_edges(G,pos, alpha=0.5)
>>> plt.show()
Function to find communities in a graph and return the associated dendogram.
A dendogram is a tree and each level is a partition of the graph nodes. Level 0 is the first partition, which contains the smallest communities, and the best is len(dendogram) - 1. The higher the level, the bigger the size of the communities. At each step, nodes (which are the communities from the previous step) are merged into aggregate communities such that modularity is maximized, subject to the estrangement constraint. For more details on the basic agglomerative procedure, see: [Blondel08].
Parameters : | graph : networkx.Graph
delta : float
tolerance: float :
lambduh: float :
Zgraph : networkx.Graph
part_init : dictiionary, optional {node:community}
|
---|---|
Returns : | dendogram : list of dictionaries
|
Raises : | TypeError: :
|
See also
Notes
Uses Louvain algorithm: [Blondel08]
Examples
>>> G=nx.erdos_renyi_graph(100, 0.01)
>>> dendo = generate_dendogram(G)
>>> for level in range(len(dendo) - 1) :
>>> print "partition at level", level, "is", partition_at_level(dendo, level)
Function to produce a graph where the nodes belonging to the same community are aggregated into a single node.
In the induced graph, there is a link of weight w between communities if the sum of the weights of the links between their elements is w. Communities are merged until there is no significant improvement on the objective function.
Parameters : | partition : dictionary {node:community}
graph : networkx.Graph
Zgraph : networkx.Graph
|
---|---|
Returns : | g : networkx.Graph
|
Examples
>>> n = 5
>>> g = nx.complete_graph(2*n)
>>> part = dict([])
>>> for node in g.nodes() :
>>> part[node] = node % 2
>>> ind = induced_graph(part, g)
>>> goal = nx.Graph()
>>> goal.add_weighted_edges_from([(0,1,n*n),(0,0,n*(n-1)/2), (1, 1, n*(n-1)/2)])
>>> nx.is_isomorphic(int, goal)
True
This module implements the Label Propagation Algorithm (LPAm) as decribed in: [Barber09].
Function to run the Label Propagation Algorithm and return the labelling upon convergence.
The Label Propagation algorithm initially assigns each node a unique label and sequentially allows nodes to change their label so as to maximize a specified objective function, which in this case is modularity.
Parameters : | G : networkx.Graph
tolerance: float, optional :
Z: networkx.Graph, optional :
lambduh: :
initial_label_dict : dictionary {node_identifier:label}, optional
|
---|---|
Returns : | label_dict : dictionary {node_identifier:label}
|
Raises : | NetworkXError :
|
See also
agglomerate.generate_dendogram
References
[Barber09] | (1, 2) M.J. Barber and J.W. Clark, “Detecting Network Communities by propagating labels under constraints.” Physical Review E 80:026129 |
[Raghavan07] | U.N. Raghavan, R. Albert and S. Kumara, “Near linear time algorithm to detect community structure in large-scale networks.” Physical Review E 76:036106 |
Examples
>>> labeling = lpa.lpa(current_graph, tolerance, lambduh, Z=current_Zgraph)
>>> new_labeling = lpa.lpa(current_graph, tolerance, lambduh, labelling, Z=current_Zgraph)
>>> list(lpa.lpa(current_graph, tolerance, lambduh, Z=current_Zgraph))
[(0,1),(1,1),(2,2),(3,1)]
This module implements functions used to produce evolution charts and other plots based on the output of estrangement confinement estrangement.estrangement.ECA .
Function to scan for simulation folders in the current working directory and read the values of delta used in ECA.
The estrangement.estrangement.ECA function creates a folder specific to the value of delta used in each simulation (e.g. task_delta_0.01). Within each of these folders, a config file (simulation.conf) specifies the value of delta used in the simulation. This function reads the value of delta from each such config file and returns them in a list.
Alternatively, delta can be specified in the function calls, bypassing this function. See EstrangementDemo for more details.
Returns : | deltas : list
|
---|
Examples
>>> deltas = GetDeltas()
>>> print(deltas)
Given dictionaries, dictX with key=label, val = iterable of X values, and dictY with key=label, val = iterable of Y values, this function plots lines for each the labels specified on the same axes.
Parameters : | dictX : dictionary {label:[list of values]}
dictY : dictionary {lable:[list of values]}
deltas : list of floats
linewidth : float, optional
markersize : float, optional
label_fontsize : integer, optional
xfigsize : float, optional
yfigsize : float, optional
fontsize : integer, optional
fname : string, optional
listLinesstyles : list of strings, optional
xlabel : string, optional
ylabel : string, optional
title : string, optional
xscale : string, optional
yscale : string, optional
dictErr : Dictionary {label:[list of values]}
display_on : boolean
|
---|
Examples
>>> dictX = {'Estrangement:delta_0.05': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'Estrangement:delta_0.025': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'Estrangement:delta_0.01': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'Estrangement:delta_1.0': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
>>> dictY = {'Estrangement:delta_0.05': [0.0, 0.020202020202020204, 0.04040404040404041, 0.031578947368421054, 0.010309278350515464, 0.010101010101010102, 0.020202020202020204, 0.030612244897959183, 0.030303030303030304, 0.0103092783505154], 'Estrangement:delta_0.025': [0.0, 0.020202020202020204, 0.0, 0.021052631578947368, 0.0, 0.020202020202020204, 0.010101010101010102, 0.02040816326530612, 0.010101010101010102, 0.0], 'Estrangement:delta_0.01': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 'Estrangement:delta_1.0': [0.061224489795918366, 0.020202020202020204, 0.04040404040404041, 0.05263157894736842, 0.0, 0.020202020202020204, 0.06060606060606061, 0.02040816326530612, 0.04040404040404041, 0.041237113402061855]}
>>> plot_by_param(dictX, dictY,deltas=[],fname='estrangement.svg', listLinestyles=['bs--', 'ro-',], xlabel="Time", ylabel='Estrangement', title="Estrangement")
Plots a graph with the attributes specified in listNames.
This function relies on the file output of estrangement.estrangement.ECA. The value of the parameter write_stats should be set to True when calling estrangement.estrangement.ECA.
Parameters : | listNames : list of strings
image_extension : string, optional
|
---|---|
Returns : | Nothing : The image is displayed on the screen and/or written to file. |
Notes
The function reads the relevant input data from file and formats it. The actual plotting is done in estrangement.plots.plot_by_param.
Examples
>>> deltas = [0.01,0.025,0.05,1.0]
>>> for d in deltas:
... estrangement.ECA(dataset_dir='../data',delta=d,increpeats=opt.increpeats,minrepeats=opt.minrepeats)
>>> plot_function(['Q', 'F',])
>>> plot_function(['Estrangement'])
Function to plot avg(Q*-E) versus delta to get insights into the best delta for the given dataset.
This module merely processes the data, the plotting is done by estrangement.plots.plot_by_param. It requires the results from estrangement.estrangement.ECA to be outputted to file.
Parameters : | image_extension : string
deltas : list of floats, optional
|
---|---|
Returns : | Nothing : The plot is displayed on the screen and/or written to file. |
Notes
To produce the necessary stat files, set ‘write_stats=True’ when calling estrangement.estrangement.ECA.
Examples
>>> deltas = [0.01,0.025,0.05,1.0]
>>> for d in deltas:
... estrangement.ECA(dataset_dir='../data',delta=d,increpeats=opt.increpeats,minrepeats=opt.minrepeats)
>>> ChooseingDelta()
Function to perform the necessary preprocessing before making the tiled plots, of the temporal communities, for all simulation parameters.
Parameters : | matched_labels : dictionary {delta:{time: {node:label}}}
deltas : list of floats, optional
nodes of interest : list of integers, optional
delta_to_use_for_node_ordering : float, optional
label_sorting_keyfunc : string, optional
|
---|---|
Returns : | node_index_dict : dictionary {nodename:index_value}
t_index_dict : dictionary {timestamp
label_index_dict : dictionary {label
labels_of_interest_dict : dictionary {delta
|
Examples
>>> deltas = [0.01,0.025,0.05,1.0]
>>> for d in deltas:
... estrangement.ECA(dataset_dir='../data',delta=d,increpeats=opt.increpeats,minrepeats=opt.minrepeats)
>>> node_index_dict, t_index_dict, label_index_dict, labels_of_interest_dict = preprocess_temporal_communities(
nodes_of_interest=nodes_of_interest)
>>> print(node_index_dict)
>>> print(t_index_dict)
Module to create tiled plots, illustrating the temporal communities, for different values of paramters.
Parameters : | matched_labels : dictionary {delta
nodes of interest : list of integers, optional
deltas : list of floats, optional
tiled_figsize : string, optional
manual_colormap : dictionary, optional
label_cmap : string, optional
frameon : boolean, optional
xlabel : string, optional
ylabel : string, optional
label_fontsize : integer, optional
show_title : boolean, optional
fontsize : integer, optional
colorbar : boolean, optional
show_y_ticklabels : boolean, False
nodelabel_func : String, optional
xtick_separation : integer, optional
snapshotlabel_func : string, optional
wspace : float, optional
bottom : float, optional
image_extension : string, optional
dpi : integer, optional
display_on : boolean, optional
|
---|---|
Returns : | Nothing, the plot is written to file and/or displayed on the screen depending on specified parameters. : |
Examples
>>> deltas = [0.01,0.025,0.05,1.0]
>>> for d in deltas:
... estrangement.ECA(dataset_dir='../data',delta=d,increpeats=opt.increpeats,minrepeats=opt.minrepeats)
>>> plot_temporal_communities()
Function to plot F with lambduhs for various snapshots.
Parameters : | linewidth : float, optional
image_extension : string, optional
|
---|---|
Returns : | Nothing. The plot is written to file. : |
Examples
>>> deltas = [0.01,0.025,0.05,1.0]
>>> for d in deltas:
... estrangement.ECA(dataset_dir='../data',delta=d,increpeats=opt.increpeats,minrepeats=opt.minrepeats)
>>> plot_with_lambdas()