This project explores the hyperlink structure of government websites using social network analysis (SNA). Each government website is represented as a node, and hyperlinks between sites are modeled as directed edges. The resulting network graph is used to analyze the interconnectivity and structure of the digital public sector.
- 🌐 Web Graph Construction: Builds a directed graph from hyperlinks among government websites.
- 🔍 Breadth-First Traversal: Extracts data using BFS to capture relevant and reachable nodes.
- 📊 Social Network Metrics: Applies centrality and connectivity measures using `igraph`.
- 🖼️ Visualization: Includes a graphical representation of the resulting network.
- 📁 Reproducible Workflow: Code and data included for replicating the analysis in R.
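
The breadth-first extraction step can be illustrated with a minimal base-R sketch. It is not the code in `website_network.R`: the `links` list, the page names, and the `bfs_edges` helper are hypothetical and exist only to show how a BFS from a seed page yields the reachable nodes and a directed edge list.

```r
# Toy link structure: names are pages, values are the pages they link to.
# These page names are hypothetical, not taken from hyperlinkTraversal.csv.
links <- list(
  home     = c("taxes", "benefits"),
  taxes    = c("forms", "home"),
  benefits = c("forms"),
  forms    = character(0),
  orphan   = c("home")  # links to "home" but is never reached from it
)

# Breadth-first traversal from a seed page. Every hyperlink followed out of a
# visited page is recorded as a directed edge (from, to).
bfs_edges <- function(links, seed) {
  visited <- seed
  queue <- seed
  from <- character(0)
  to <- character(0)
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    for (target in links[[current]]) {
      from <- c(from, current)
      to <- c(to, target)
      if (!(target %in% visited)) {
        visited <- c(visited, target)
        queue <- c(queue, target)
      }
    }
  }
  list(
    order = visited,  # pages in the order BFS discovered them
    edges = data.frame(from = from, to = to, stringsAsFactors = FALSE)
  )
}

result <- bfs_edges(links, "home")
print(result$order)
print(result$edges)
```

Only pages reachable from the seed appear in the result, so `orphan` is excluded even though it links to `home`. The resulting edge data frame has the shape that `igraph::graph_from_data_frame()` accepts for building a directed graph.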
Make sure you have the following installed:
- R (version 3.6 or newer)
- R packages:
  - `igraph`
  - `readr`
  - `ggplot2` (optional, for further visualization)

1. Clone this repository and change into its directory: `git clone https://github.com/casualcomputer/social-network-analysis.git`, then `cd social-network-analysis`.
2. Open the `website_network.R` file in RStudio or your preferred R IDE.
3. Install required packages (if not already installed): `install.packages(c("igraph", "readr", "ggplot2"))`
4. Run the script: `source("website_network.R")`

| File | Description |
|---|---|
| `website_network.R` | Core script to construct the graph and analyze the network. |
| `hyperlinkTraversal.csv` | Input data containing hyperlink relationships between sites. |
| `social_network_normal.PNG` | Visualization of the resulting network. |
| `README.md` | This file. |

1. How do people typically navigate CRA websites to find information?
   Identify common user paths through the CRA web ecosystem using hyperlink traversal analysis. This helps improve access to frequently requested information.
2. Which CRA websites are the most important or influential?
   Determine key content hubs using network centrality measures to prioritize updates and resource focus.
3. Is the CRA's website network well-connected or fragmented?
   Analyze the overall structure to identify isolated groups of pages and enhance internal linking where needed.
4. What does the web structure reveal about how CRA sites are organized?
   Use metrics like link distribution and cluster size to assess and optimize the navigation experience.

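
As a rough illustration of how these questions map onto `igraph` calls, the sketch below builds a small directed graph and computes in-degree, PageRank, weakly connected components, and a degree distribution. The edge list is hypothetical toy data, and the choice of these particular measures is an assumption; the metrics actually used in `website_network.R` may differ.

```r
library(igraph)

# Hypothetical edge list; the real input is hyperlinkTraversal.csv, whose
# column names are not assumed here.
edges <- data.frame(
  from = c("home", "home", "taxes", "taxes", "benefits", "news"),
  to   = c("taxes", "benefits", "forms", "home", "forms", "archive"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Importance: pages with many incoming links, and PageRank scores.
in_deg <- degree(g, mode = "in")
pr <- page_rank(g)$vector

# Fragmentation: weakly connected components ignore link direction.
comp <- components(g, mode = "weak")

# Organization: how many pages have each total degree.
deg_table <- table(degree(g, mode = "all"))

print(sort(in_deg, decreasing = TRUE))
print(round(sort(pr, decreasing = TRUE), 3))
print(comp$no)     # number of components
print(comp$csize)  # size of each component
print(deg_table)
```

In this toy graph `news` and `archive` form a group that is disconnected from the other four pages, which is the kind of isolated cluster the third question is meant to surface.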
This project is licensed under the GNU GPL-3.0 License.
Developed by @casualcomputer. Contributions and suggestions welcome!