Commit 399c6fa

fix: minor fixes in some notebooks
1 parent d82f30d commit 399c6fa

6 files changed: +80 -51 lines changed
doc/source/community_detection_guide/notebooks/community_detection_algorithms.ipynb

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@
 "# __Community detection algorithms__\n",
 "## __Optimization based methods: modularity maximization__\n",
 "Modularity maximization methods are a prominent class of algorithms in community detection that aim to discover partitions of a network by optimizing a specific quality function called modularity.\n",
-"<div style=\"background-color: #e6ffe6; padding: 20px; border-radius: 5px;\">\n",
+"<div style=\"background-color: #e6ffe6; padding: 0px; border-radius: 5px;\">\n",
 " \n",
 "**NOTE:** You can find a more detailed explanation of **modularity** [here](./modularity.ipynb).\n",
 "\n",
@@ -92,7 +92,7 @@
 "\n",
 "### __community_fluid_communities__\n",
 "\n",
-"When is community_fluid_communities applied?\n",
+"#### When is community_fluid_communities applied?\n",
 "\n",
 "The Fluid Communities algorithm is typically applied when:\n",
 "\n",
@@ -107,7 +107,7 @@
 "\n",
 "### __community_edge_betweenness (The Girvan-Newman Algorithm)__\n",
 "\n",
-"When is community_edge_betweenness applied?\n",
+"#### When is community_edge_betweenness applied?\n",
 "\n",
 "The Edge Betweenness (Girvan-Newman) algorithm is typically applied when:\n",
 "\n",
@@ -122,7 +122,7 @@
 "\n",
 "### __community_label_propagation__\n",
 "\n",
-"When is community_label_propagation applied?\n",
+"#### When is community_label_propagation applied?\n",
 "\n",
 "The Label Propagation Algorithm (LPA) is typically applied when:\n",
 "\n",

doc/source/community_detection_guide/notebooks/initial_workflow.ipynb

Lines changed: 5 additions & 5 deletions
@@ -552,11 +552,11 @@
 "\n",
 "* **Note on local seeds:** Many `igraph` algorithms, such as the Leiden algorithm, rely on random processes. To ensure a specific part of your analysis is reproducible without affecting the rest of your notebook, you can use a custom utility function, such as `local_random()`, imported from [here](./functions.ipynb).\n",
 "\n",
-" For example:\n",
-" ```python\n",
-" with local_random(seed=123):\n",
-" g.community_leiden()\n",
-" ```\n",
+"For example:\n",
+"```python\n",
+"with local_random(seed=123):\n",
+" g.community_leiden()\n",
+"```\n",
 "\n",
 "</div>"
 ]
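
The `local_random()` helper referenced in the note above is defined in `functions.ipynb` and does not appear in this diff. The following is only a minimal sketch of how such a context manager might look. It assumes the helper needs nothing more than to seed Python's global `random` module (which, as far as I know, is what python-igraph draws from by default) and to restore the previous state afterwards. The body is hypothetical, not the notebook's actual implementation.

```python
import random
from contextlib import contextmanager


@contextmanager
def local_random(seed=None):
    """Temporarily seed Python's global RNG, then restore its previous state.

    Hypothetical stand-in for the helper in functions.ipynb.
    """
    state = random.getstate()   # remember the caller's RNG state
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)  # undo the seeding, even if the block raised


# Usage: the same seed gives the same draws, and the outer stream is untouched.
random.seed(0)
outer_state = random.getstate()

with local_random(seed=123):
    first = [random.random() for _ in range(3)]

with local_random(seed=123):
    second = [random.random() for _ in range(3)]
```

Restoring the saved state in `finally` is what keeps the seeding local: code after the `with` block continues from the random stream it was already using.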

doc/source/community_detection_guide/notebooks/membership_vector.ipynb

Lines changed: 1 addition & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -7,15 +7,6 @@
 "source": [
 "# Membership vector\n",
 "\n",
-"### What is a __membership vector__?\n",
-"A membership vector is a list or array that assigns a cluster or group identifier to each data point or object.\n",
-"\n",
-"* __Structure:__ It's a one-dimensional sequence where the length is equal to the number of data points.\n",
-"\n",
-"* __Indexing:__ Each index in the vector corresponds to a specific data point. For example, the value at index i is the cluster ID for the i-th data point.\n",
-"\n",
-"* __Values:__ The values in the vector are the cluster IDs. These are typically non-negative integers (e.g., 0, 1, 2, 3, ...).\n",
-"\n",
 "### Karate club network: A case study in `igraph`\n",
 "Let's apply the concept of a membership vector to the famous Zachary's Karate Club network. \n",
 "\n",
@@ -87,7 +78,7 @@
 "id": "0160c803-fa2a-4dd4-802c-8ce94df9051e",
 "metadata": {},
 "source": [
-"### Why it's useful\n",
+"### Why is it useful?\n",
 "\n",
 "The membership vector is the most direct and compact representation of a clustering. It serves as the basis for almost all subsequent analyses and visualizations:\n",
 "\n",

doc/source/community_detection_guide/notebooks/modularity.ipynb

Lines changed: 63 additions & 18 deletions
@@ -5,7 +5,8 @@
 "id": "94f8164e-537f-41a1-bfc4-ed15c7b00cf8",
 "metadata": {},
 "source": [
-"# Modularity formula\n",
+"# Modularity\n",
+"## Modularity formula\n",
 "\n",
 "Modularity is a quantitative metric used to evaluate the strength of a network's division into modules (or communities). It measures how well the network is partitioned by comparing the density of edges within communities to the expected density of such edges in a randomized network that preserves the original degree distribution. The formula for modularity is given below.\n",
 "\n",
@@ -36,7 +37,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 1,
 "id": "a4766cc0-3157-493f-a072-9fd87ed92519",
 "metadata": {},
 "outputs": [
@@ -100,7 +101,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": 2,
 "id": "eccd7d47-1606-4fc8-ac0c-31ad49604f0b",
 "metadata": {},
 "outputs": [
@@ -127,7 +128,7 @@
 "metadata": {},
 "source": [
 "### Modularity calculation\n",
-"### Computation for \"good\" partitioning ($P_{good}$)\n",
+"### Computation for \"good\" partitioning ($P_\\text{good}$)\n",
 "\n",
 "This partition correctly identifies the two cliques.\n",
 "\n",
@@ -154,9 +155,38 @@
 "\n",
 "**Final modularity** ($Q_{good}$):\n",
 "$Q_{good} = \\frac{1}{2m} \\times (\\text{Total Sum}) = \\frac{1}{14} \\times 5 = \\frac{5}{14} \\approx \\mathbf{0.357}$\n",
+"\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 3,
+"id": "b6de50ca-146b-4323-9828-80d883d2d8e9",
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/plain": [
+"0.3571428571428571"
+]
+},
+"execution_count": 3,
+"metadata": {},
+"output_type": "execute_result"
+}
+],
+"source": [
+"membership_good = [0, 0, 0, 1, 1, 1]\n",
+"g.modularity(membership_good)"
+]
+},
+{
+"cell_type": "markdown",
+"id": "8bdbc1dc-5c26-4b33-b5f4-072fee61fc7f",
+"metadata": {},
+"source": [
 "\n",
-"\n",
-"### 2. Computation for \"bad\" partitioning ($P_{bad}$)\n",
+"### Computation for \"bad\" partitioning ($P_\\text{bad}$)\n",
 "\n",
 "This partition incorrectly splits a clique and merges nodes from both communities.\n",
 "\n",
@@ -186,14 +216,36 @@
 "$Q_{bad} = \\frac{1}{2m} \\times (\\text{Total Sum}) = \\frac{1}{14} \\times (-3) = -\\frac{3}{14} \\approx \\mathbf{-0.214}$"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": 4,
+"id": "894ddb6b-25d5-4060-be09-5758f2d3db45",
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/plain": [
+"-0.2142857142857143"
+]
+},
+"execution_count": 4,
+"metadata": {},
+"output_type": "execute_result"
+}
+],
+"source": [
+"membership_bad = [0, 0, 1, 0, 1, 1]\n",
+"g.modularity(membership_bad)"
+]
+},
 {
 "cell_type": "markdown",
 "id": "f3981bc5-f6bf-4b41-a168-cae9c17ec764",
 "metadata": {},
 "source": [
 "*Note:* Based on our previous analysis, the \"good\" partitioning yields a significantly higher modularity score. It is important to note, however, that a high modularity score is not always a definitive indicator of a better community partitioning, as was previously demonstrated with the Grid Graph [here](test_significance_of_community.ipynb).\n",
 "\n",
-"# Directed Modularity\n",
+"## Directed modularity\n",
 "\n",
 "While the classic modularity formula works for undirected networks, a different approach is needed for **directed networks**, where edges have a specific direction (e.g., from node *i* to node *j*). In this context, the direction of an edge is crucial and should not be ignored.\n",
 "\n",
@@ -213,7 +265,7 @@
 "* **Flipping a single edge:** Reversing a single edge (e.g., from A → B to B → A) will change the modularity score. This is because the out-degree of A and the in-degree of B would change, altering the null model's calculation and, consequently, the overall score.\n",
 "* **Flipping all edges:** If you reverse the direction of **every single edge** in the network, the modularity score will **remain the same**. This is due to a symmetry property of the formula. The set of in-degrees becomes the new set of out-degrees, and vice versa. When the formula is applied to this completely reversed network, the total modularity score is unchanged. This is a fascinating property of directed modularity.\n",
 "\n",
-"# From directed to undirected formula\n",
+"## From directed to undirected formula\n",
 "Start with the directed formula:\n",
 "\n",
 "$$Q = \\frac{1}{m} \\sum_{i,j} \\left[ A_{ij} - \\gamma \\frac{k_i^\\text{out} k_j^\\text{in}}{m} \\right] \\delta(c_i, c_j)$$\n",
@@ -232,7 +284,7 @@
 "\n",
 "\n",
 "\n",
-"# Why the resolution parameter is important\n",
+"## Why the resolution parameter is important\n",
 "\n",
 "The resolution parameter addresses a fundamental limitation of the original modularity measure, known as the **\"resolution limit\"**. This is the tendency of the original formula (where $\\gamma=1$) to fail at detecting small communities, especially in large graphs. It often merges smaller, distinct communities into a single larger one to maximize the modularity score.\n",
 "\n",
@@ -242,7 +294,8 @@
 "* $\\gamma < 1$: Decreasing the resolution parameter reduces the penalty. This allows the algorithm to find **more and smaller communities**, as it becomes easier for closely-knit groups to be identified as their own communities.\n",
 "\n",
 "In essence, the resolution parameter provides a flexible way to explore the community structure of a network at different scales, moving beyond the limitations of a single, fixed-scale partition.\n",
-"# Density-based modularity for undirected graphs\n",
+"\n",
+"## Density-based modularity for undirected graphs\n",
 "\n",
 "While modularity is a powerful metric, it suffers from a well-known flaw called the **resolution limit**. This problem causes the modularity-maximizing algorithm to fail to detect small, tightly-knit communities, especially in large networks. Instead of finding these small groups, it often merges them into a single larger one to maximize the modularity score.\n",
 "\n",
@@ -270,14 +323,6 @@
 "\n",
 "In this formulation, the null model assumes **uniform edge probability**, so communities are favored if their **internal density** is higher than the global density.\n"
 ]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "1b9c3fa0-05b7-461f-acf8-8348f344a2d9",
-"metadata": {},
-"outputs": [],
-"source": []
 }
 ],
 "metadata": {
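
The two code cells added in this file report `0.3571428571428571` and `-0.2142857142857143` for the good and bad partitions, but the graph `g` is built outside the hunks shown. As a cross-check, here is a dependency-free sketch of the undirected and directed modularity formulas. The edge list (two triangles joined by one bridge, so m = 7) is an assumption, chosen because it reproduces 2m = 14 and both reported scores. The function name and signature are illustrative, not igraph's API.

```python
from collections import defaultdict


def modularity(edges, membership, directed=False, gamma=1.0):
    """Modularity of a partition, summed community by community.

    Undirected: Q = sum_c [ e_c / m - gamma * (d_c / 2m)^2 ]
    Directed:   Q = sum_c [ e_c / m - gamma * out_c * in_c / m^2 ]
    """
    m = len(edges)
    internal = defaultdict(int)  # e_c: edges with both ends in community c
    out_deg = defaultdict(int)   # edge tails per community
    in_deg = defaultdict(int)    # edge heads per community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        if cu == cv:
            internal[cu] += 1
        out_deg[cu] += 1
        in_deg[cv] += 1
    q = 0.0
    for c in sorted(set(membership)):
        if directed:
            null = out_deg[c] * in_deg[c] / m**2
        else:
            d_c = out_deg[c] + in_deg[c]  # total degree: every edge endpoint in c
            null = (d_c / (2 * m)) ** 2
        q += internal[c] / m - gamma * null
    return q


# Assumed graph: triangles {0,1,2} and {3,4,5} joined by the bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
membership_good = [0, 0, 0, 1, 1, 1]
membership_bad = [0, 0, 1, 0, 1, 1]

q_good = modularity(edges, membership_good)  # 5/14, about 0.357
q_bad = modularity(edges, membership_bad)    # -3/14, about -0.214

# Reversing every edge swaps in- and out-degrees, leaving directed Q unchanged.
reversed_edges = [(v, u) for u, v in edges]
q_fwd = modularity(edges, membership_bad, directed=True)
q_rev = modularity(reversed_edges, membership_bad, directed=True)
```

Grouping the sum by community rather than by node pair gives the same value as the pairwise formula, because only pairs with $\delta(c_i, c_j) = 1$ contribute.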

doc/source/community_detection_guide/notebooks/resolution.ipynb

Lines changed: 6 additions & 5 deletions
Large diffs are not rendered by default.

doc/source/community_detection_guide/notebooks/test_significance_of_community.ipynb

Lines changed: 1 addition & 9 deletions
@@ -270,7 +270,7 @@
 "id": "94db88ee-12f7-4e68-a2ab-d76136a6d983",
 "metadata": {},
 "source": [
-"## Testing Significance of Community Structure on a Grid Graph"
+"## Testing significance of community structure on a grid graph"
 ]
 },
 {
@@ -427,14 +427,6 @@
 "\n",
 "plot_nmi_histogram(er_graph, pairwise_nmi_values, title)"
 ]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "54dc19c6-8e76-491d-bc20-b76b08b128bd",
-"metadata": {},
-"outputs": [],
-"source": []
 }
 ],
 "metadata": {
