
Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists

On this website we present additional information about our IMC paper Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists and provide access to our IPv6 Hitlist Service.

IPv6 Hitlist Service

We provide an IPv6 Hitlist Service where we publish responsive IPv6 addresses, aliased prefixes, and non-aliased prefixes for interested researchers. The IPv6 Hitlist Service consists of two parts: an openly accessible service and a registration-first service.

Openly Accessible Service

You can use the weekly generated lists of responsive IPv6 addresses, aliased prefixes, and non-aliased prefixes without registration. The list of responsive addresses contains addresses from non-aliased prefixes only. To make use of the aliased prefixes, please see the notes on longest prefix matching for aliased prefixes below.

Registration-First Service

We provide additional data which can be used to conduct in-depth research on IPv6 networks and addresses. To get free access to this registration-first service, send a short registration email. We use the gathered data for statistical purposes and may very occasionally send a survey or other request for feedback.

Referencing the Hitlist Service

If you are using data from the IPv6 Hitlist Service in your publication, please refer to it with the following reference:
@inproceedings{gasser2018clusters,
   title = {Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists},
   author = {Gasser, Oliver and Scheitle, Quirin and Foremski, Pawel and Lone, Qasim and Korczynski, Maciej and Strowes, Stephen D. and Hendriks, Luuk and Carle, Georg},
   booktitle = {Proceedings of the 2018 Internet Measurement Conference},
   year = {2018},
   location = {Boston, MA, USA},
   numpages = {15},
   doi = {10.1145/3278532.3278564},
   publisher = {ACM},
   address = {New York, NY, USA},
}

The same reference applies to both the open and the registration-first service. [bib]

Software and Tools

During our IPv6 hitlist analysis we developed software to analyze and understand IPv6 hitlists. We publish the following software and tools for use by the scientific community:

ZMapv6

We extend the original ZMap to add IPv6 capabilities. ZMapv6 supports new IPv6-specific probe modules and can read IPv6 target addresses from a file or from standard input.
Source: github.com/tumi8/zmap
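
Because a scanner of this kind consumes a plain list of targets, it can help to clean a custom address list before scanning. The following is a minimal sketch, not part of ZMapv6: it assumes a one-address-per-line input format, and the helper name `normalize_targets` is hypothetical. It validates each line, drops anything that is not an IPv6 address, and removes duplicates.

```python
import ipaddress


def normalize_targets(lines):
    """Return unique, compressed IPv6 addresses in first-seen order.

    Assumption: the input holds one address per line; blank lines and
    lines starting with '#' are treated as comments and skipped.
    """
    seen = set()
    targets = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        try:
            addr = ipaddress.IPv6Address(text)
        except ValueError:
            # Not a valid IPv6 address (e.g. IPv4 or garbage): skip it.
            continue
        if addr not in seen:
            seen.add(addr)
            targets.append(addr.compressed)
    return targets
```

The returned list can be written to a file or printed line by line to standard output and handed to whichever scanner you use.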

zesplot

zesplot is a tool to visualize IPv6 networks. It uses the concept of squarified treemaps and plots IPv6 networks in a space-filling way. Note that unlike a Hilbert curve visualizing IPv4 address space, zesplot does not plot the entire IPv6 address space.
Source: github.com/zesplot/zesplot

Entropy Clustering

Entropy Clustering is a tool to find and visualize clusters in IPv6 addressing schemes.
Source: github.com/pforemski/entropy-clustering
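
To illustrate the general idea behind entropy-based analysis of addressing schemes, the sketch below computes a normalized Shannon entropy for each of the 32 hex characters (nybbles) of a set of IPv6 addresses. This is an illustrative assumption about how such a fingerprint can be built, not the code of the Entropy Clustering tool; the function name `nybble_entropies` is hypothetical, and the clustering step itself is omitted.

```python
import ipaddress
import math
from collections import Counter


def nybble_entropies(addresses):
    """Return 32 normalized entropies (0.0 to 1.0), one per nybble position.

    0.0 means the nybble is constant across all addresses,
    1.0 means all 16 hex values occur equally often.
    """
    # Expand every address to its full 32-character hex representation.
    rows = [format(int(ipaddress.IPv6Address(a)), "032x") for a in addresses]
    if not rows:
        raise ValueError("at least one address is required")
    total = len(rows)
    fingerprint = []
    for pos in range(32):
        counts = Counter(row[pos] for row in rows)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        # A nybble carries at most 4 bits of entropy, so divide by 4 to normalize.
        fingerprint.append(entropy / 4.0)
    return fingerprint
```

Computing one such fingerprint per network yields vectors that a standard clustering algorithm could group into addressing schemes; for example, a network that only varies the last nybble has an entropy close to 1.0 at position 31 and 0.0 elsewhere.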

Entropy/IP

Entropy/IP is a tool to find patterns in IPv6 addresses and generate addresses based on these patterns. Entropy/IP was presented at the 2016 Internet Measurement Conference. For more information see the Entropy/IP website. The Entropy/IP software is published by Akamai.
Source: github.com/akamai/entropy-ip

New Entropy/IP Generator

This generator uses the output of Entropy/IP to generate IPv6 addresses which follow the specified model.
Source: github.com/pforemski/eip-generator

Longest prefix matching for aliased prefixes

To make use of our published lists of aliased and non-aliased prefixes for your custom IPv6 address list, you need to perform longest prefix matching. This ensures that your addresses are matched to the longest aliased or non-aliased prefix. For your convenience we publish a simple Python script and a Go tool for this purpose.
Python source: aliases-lpm.py
Go source: aliases-lpm.go
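
The idea of longest prefix matching can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the published aliases-lpm.py script; the function names `build_table` and `longest_match`, and the in-memory `(prefix, is_aliased)` input format, are assumptions made for this example.

```python
import ipaddress


def build_table(entries):
    """Build a lookup table from (prefix_string, is_aliased) pairs."""
    table = {}
    for prefix, is_aliased in entries:
        net = ipaddress.IPv6Network(prefix)
        # Key on (network address as integer, prefix length).
        table[(int(net.network_address), net.prefixlen)] = (str(net), is_aliased)
    return table


def longest_match(table, address):
    """Return (prefix, is_aliased) for the longest matching prefix, or None."""
    addr = int(ipaddress.IPv6Address(address))
    # Try the most specific prefix length first and stop at the first hit.
    for plen in range(128, -1, -1):
        mask = ((1 << plen) - 1) << (128 - plen)
        hit = table.get((addr & mask, plen))
        if hit is not None:
            return hit
    return None
```

For example, with the table built from `[("2001:db8::/32", False), ("2001:db8:1::/48", True)]`, the address `2001:db8:1::42` matches the aliased /48, while `2001:db8:2::1` falls back to the non-aliased /32. Trying all 129 prefix lengths per address is simple rather than fast; a trie would be the usual choice for very large lists.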

Paper

Abstract. Network measurements are an important tool in understanding the Internet. Due to the expanse of the IPv6 address space, exhaustive scans as in IPv4 are not possible for IPv6. In recent years, several studies have proposed the use of target lists of IPv6 addresses, called IPv6 hitlists.
In this paper, we show that addresses in IPv6 hitlists are heavily clustered. We present novel techniques that allow IPv6 hitlists to be pushed from quantity to quality. We perform a longitudinal active measurement study over 6 months, targeting more than 50 M addresses. We develop a rigorous method to detect aliased prefixes, which identifies 1.5 % of our prefixes as aliased, pertaining to about half of our target addresses. Using entropy clustering, we group the entire hitlist into just 6 distinct addressing schemes. Furthermore, we perform client measurements by leveraging crowdsourcing.
To encourage reproducibility in network measurement research and to serve as a starting point for future IPv6 studies, we publish source code, analysis tools, and data.
Paper. Read the final version of our paper at arXiv.org: [abstract] and [PDF].
Authors. Oliver Gasser, Quirin Scheitle, Paweł Foremski, Qasim Lone, Maciej Korczyński, Stephen D. Strowes, Luuk Hendriks, Georg Carle.

Additional Plots

We provide additional plots for in-depth analysis accompanying the evaluations in our paper.

Interactive zesplot plots

Entropy clustering plots

Responsiveness over time

Reproducibility

We publish data and scripts to reproduce our analysis at the TUM university library to guarantee long-term availability.
Dataset DOI: 10.14459/2018mp1452739

Contact

gasser [AT] net.in.tum.de

Partners

TUM
IITiS
TU Delft
University Grenoble Alpes
RIPE NCC
UTwente