Implement CAIDA ITDK #167

@m-appel

Discussed in #146

Originally posted by m-appel October 25, 2024

Description

The Macroscopic Internet Topology Data Kit (ITDK) contains data about connectivity and routing gathered from a large cross-section of the global Internet.

The ITDK is produced by a multi-step process (too complex to describe here in detail) that results in a router-level topology for both IPv4 and IPv6. The core components are nodes, interfaces, and links.

  • Node: Represents a physical router.
    • Has one or more interfaces
    • Is mapped to exactly one AS
    • Can be mapped to a geographical location consisting of continent, country, region, city, and GPS coordinates
  • Interface: Represents a router interface.
    • Belongs to exactly one router
    • Unsure: Belongs to exactly one link? There may be dangling interfaces; this needs to be checked against the data.
    • Can have an IP address
    • Can have a DNS name
  • Link: Represents a layer 2 connection between two or more routers.
    • Consists of routers and, if available, the interface IPs
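
The components above could be sketched as plain Python data classes. This is only an illustration of the constraints listed in the bullets; all class and field names are hypothetical and not taken from the ITDK files or from existing code, and the AS number and IP in the usage note are documentation-range placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interface:
    """A router interface; IP address and DNS name are both optional."""
    node_id: str  # the one router this interface belongs to
    ip: Optional[str] = None
    dns_name: Optional[str] = None


@dataclass
class Node:
    """A physical router, mapped to exactly one AS."""
    node_id: str
    asn: int
    interfaces: list[Interface] = field(default_factory=list)
    # Optional geolocation (continent, country, region, city, coordinates).
    location: Optional[dict] = None


@dataclass
class Link:
    """A layer 2 connection between two or more routers."""
    link_id: str
    # One (node_id, interface IP or None) pair per attached router.
    members: list[tuple[str, Optional[str]]]

    def __post_init__(self) -> None:
        if len(self.members) < 2:
            raise ValueError('a link connects two or more routers')
```

For example, `Link('L1', [('N1', '192.0.2.1'), ('N2', None), ('N3', None)])` describes a link shared by three routers where only one interface IP is known.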

A new version of this dataset is only released every few months.

Geolocation data is not available for IPv6.

Modeling

reference_org: CAIDA
reference_name: caida.itdk_v[4|6]

New nodes

  1. Router: Represents the node that connects to IPs, router links, AS, geolocation, etc.
  2. RouterLink: Represents the layer 2 connection between Router nodes. This needs to be a node and not a relationship since there can be more than two routers on one link.
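
To illustrate the point about RouterLink being a node: a relationship joins exactly two nodes, so a link shared by three routers would have to be split into pairwise relationships, losing the information that all three sit on the same link. A minimal sketch, using a hypothetical helper name and made-up IDs:

```python
def link_as_node(link_id, router_ids):
    """Return (start, type, end) triples attaching each router to one link node."""
    return [(router_id, 'PART_OF', link_id) for router_id in router_ids]


# One link shared by three routers: three PART_OF relationships,
# all pointing at the same RouterLink node.
rels = link_as_node('L1', ['N1', 'N2', 'N3'])
for rel in rels:
    print(rel)
```

Every router on the link ends up one hop away from the same RouterLink, regardless of how many routers share it.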

Relationships

(:Router)<-[:ASSIGNED]-(:IP) // Router-interface mapping
(:Router)-[:MANAGED_BY]->(:AS)
(:Router)-[:COUNTRY]->(:Country)
(:Router)-[:PART_OF]->(:RouterLink)

(:IP)-[:PART_OF]->(:RouterLink)
(:IP)<-[:RESOLVES_TO]-(:HostName)

(:Router)-[:LOCATED_IN]->(:City) // Cities not yet modeled

One caveat with this modeling is that the interface IP (if present) is separated from the Router in the RouterLink:

(r:Router)-[:PART_OF]->(:RouterLink)<-[:PART_OF]-(:IP)-[:ASSIGNED]->(r)

An optional helper property on the relationship might be useful:

(:Router)-[:PART_OF {ip: 'x.x.x.x'}]->(:RouterLink)
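
A sketch of how such a relationship could be built during import, setting the `ip` property only when the interface IP is known. The function name and the dictionary layout are assumptions for illustration, not an existing API.

```python
def part_of_relationship(router_id, link_id, ip=None):
    """Describe (:Router)-[:PART_OF]->(:RouterLink) with an optional 'ip' property."""
    props = {}
    if ip is not None:
        # Optional helper property: the interface IP this router uses on the link.
        props['ip'] = ip
    return {'start': router_id, 'type': 'PART_OF', 'end': link_id, 'props': props}


with_ip = part_of_relationship('N1', 'L1', ip='192.0.2.1')
without_ip = part_of_relationship('N2', 'L1')
```

With the property in place, the interface a router uses on a link can be read directly from the relationship instead of matching the IP through both `PART_OF` and `ASSIGNED`.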

Open Questions

  • There is an additional data file available that identifies if IPs were seen as a transit or destination hop (or both) in traceroute. I don't think this is particularly relevant for us?
  • This dataset is only available every few months and I expect it to be very large. Maybe we should implement a caching mechanism that just imports pre-processed node/relationship files?
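
One possible shape for such a caching mechanism, as a rough sketch only: pre-processed rows are written to a CSV file once and read back on later runs. The function name, the CSV layout, and the idea of one cache file per dataset release are all assumptions.

```python
import csv
import os


def load_or_build(cache_file, build_rows):
    """Return rows from cache_file, calling build_rows() only on a cache miss.

    Rows read back from the cache are tuples of strings, so numeric
    fields would need to be converted by the caller.
    """
    if os.path.exists(cache_file):
        with open(cache_file, newline='') as f:
            return [tuple(row) for row in csv.reader(f)]
    rows = [tuple(row) for row in build_rows()]
    # Write to a temporary file first so an interrupted run
    # does not leave a partial cache behind.
    tmp_file = cache_file + '.tmp'
    with open(tmp_file, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
    os.replace(tmp_file, cache_file)
    return rows
```

Here `build_rows` stands in for the expensive parsing step; including the dataset release in the cache file name would make a new release trigger a rebuild.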

Labels: enhancement
