Scanning I2P’s NetDB

2017/11/15

Through most of 2015, I wrote and operated a program named i2spy. The purpose of i2spy was to perform a census of the I2P network. Because of the way I2P is run, there is no centralized way to see even basic statistics about the network. Although I2P is an anonymity network, there is still value in having some basic diagnostic information about the network as a whole.

To see more of the I2P network, I ran nodes on VPSs in New York, the Netherlands, and Singapore. Two other nodes were run by friends in unknown locations. Each of these nodes reported in to a centralized node with various data points (which you’ll see below).

The code for this project can be found on GitHub.

What I Collected

Minimizing Data

I’d like to start by noting that, as of this writing, hiding the fact that you are using I2P is explicitly not part of I2P’s threat model. So from an ethical point of view, this research project was probably within what an I2P user could expect.

While i2spy was running, I collected the following hourly from roughly five nodes distributed around the world:

Note that I specifically did not collect the following:

I could have collected more, but I decided less is better. One concern of mine was that I could be subpoenaed for the data, so I tried to collect only as much as could not be used to harm any I2P users.

Things I Wish I Collected

If I write another implementation (likely), these are the other attributes I would collect.

Results

User Count

Based on the peak number of detected router infos over several days, it appears there are about 50k routers. Please note that one user can run multiple routers. I did not do a super serious mathematical analysis on this number. I figured an approximate number was good enough.

Also note that the I2P rekeying happened during this time. This number tries to take the rekey into account.
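To make the "peak over several days" idea concrete, here is a minimal sketch in Python. It assumes each node reports an hourly count of router infos as a `(timestamp, node, count)` tuple. That layout, the function name, and the sample numbers are all invented for illustration and are not i2spy’s actual schema or data.

```python
from collections import defaultdict

def daily_peaks(reports):
    """Map each day to the largest router-info count any node reported that day.

    `reports` is an iterable of (timestamp, node, count) tuples, where
    timestamp is an ISO 8601 string such as "2015-10-01T17:00".
    This layout is hypothetical, not i2spy's real schema.
    """
    peaks = defaultdict(int)
    for timestamp, _node, count in reports:
        day = timestamp[:10]  # "YYYY-MM-DD" prefix of the ISO timestamp
        peaks[day] = max(peaks[day], count)
    return dict(peaks)

# Made-up sample values, only to show the shape of the calculation.
sample = [
    ("2015-10-01T16:00", "nyc", 41000),
    ("2015-10-01T17:00", "ams", 48500),
    ("2015-10-02T17:00", "sgp", 50200),
    ("2015-10-02T18:00", "nyc", 47000),
]

peaks = daily_peaks(sample)
estimate = max(peaks.values())  # overall peak across the observed days
print(peaks, estimate)
```

Taking the peak rather than an average is a deliberately rough choice: no single node sees every router at once, so the largest snapshot is simply the closest lower bound on hand.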

Fast Pushing of Updates

Luckily, two updates happened during the observation period, so I was able to observe I2P’s update process live! Since the update system is based on BitTorrent, every I2P user helps every other I2P user upgrade.

Within two weeks, approximately 80% of I2P nodes had upgraded to the newest version! In my opinion, that’s a pretty good turnaround. MUCH better than certain mobile operating systems and some web browsers.
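An adoption figure like this is just the share of observed routers reporting the newest version. The sketch below shows that calculation in Python. The function name and the sample list are hypothetical; the list was chosen only so the result lands on 80%.

```python
from collections import Counter

def share_on_version(versions, target):
    """Return the fraction of observed routers whose version equals `target`."""
    counts = Counter(versions)
    total = sum(counts.values())
    # Guard against an empty snapshot to avoid dividing by zero.
    return counts[target] / total if total else 0.0

# Made-up snapshot: 8 routers on the new release, 2 on the previous one.
snapshot = ["0.9.22"] * 8 + ["0.9.21"] * 2
share = share_on_version(snapshot, "0.9.22")
print(f"{share:.0%}")
```

Running the same calculation on each hourly snapshot would give an adoption curve over time.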

Russians!

About half the network is in Russia. This has an added benefit that neither country likes to cooperate, so traffic analysis is, in theory, more difficult.

Most Activity around 17:00 UTC

This matches the country stats: 17:00 UTC is 20:00 in Moscow.
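The time zone arithmetic is easy to check. This small sketch uses a fixed UTC+3 offset for Moscow, which is accurate for 2015 because Moscow has not observed daylight saving time since late 2014. The specific date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset: Moscow has been UTC+3 year-round since October 2014.
MSK = timezone(timedelta(hours=3), "MSK")

# Arbitrary 2015 date; only the hour matters here.
peak_utc = datetime(2015, 10, 1, 17, 0, tzinfo=timezone.utc)
peak_msk = peak_utc.astimezone(MSK)
print(peak_msk.strftime("%H:%M %Z"))
```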

NetDB ReKey Working

In I2P 0.9.22, the default signing algorithm for router infos was switched from DSA_SHA1 to EdDSA_SHA256_Ed25519. To keep network conditions stable, a router would, at each restart, change its router info’s signing key only with some percent probability, so the network migrated gradually rather than all at once. After two or so releases, the I2P team made EdDSA_SHA256_Ed25519 the only signing key type.
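To see why a per-restart probability spreads the migration out, consider a simplified model in which every restart independently rekeys with the same probability `p`. The expected fraction of routers that have switched after `n` restarts is then `1 - (1 - p)**n`. The value of `p` below is a placeholder, since the actual probability I2P used is not given here, and the independence assumption belongs to this sketch rather than to I2P’s implementation.

```python
def fraction_rekeyed(p, restarts):
    """Expected fraction of routers rekeyed after `restarts` restarts.

    Assumes each restart independently triggers a rekey with probability `p`.
    This is a simplified model, not I2P's actual logic.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    # A router is still on its old key only if every restart skipped the rekey.
    return 1 - (1 - p) ** restarts

# Placeholder probability, chosen only for illustration.
P_HYPOTHETICAL = 0.1
for n in (1, 5, 10, 20):
    print(n, round(fraction_rekeyed(P_HYPOTHETICAL, n), 3))
```

Under this model the migrated share climbs steeply at first and then flattens, so a long tail of routers stays on the old key unless something else forces the change.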

Using this data, str4d was able to prove that the network rekey worked! I was cited indirectly in a presentation at Real World Cryptography 2016!

Can an I2P User Opt Out?

Short answer: no. Long answer: yes.

With enough work I2P could make this kind of analysis more difficult. I don’t know exactly how, but I am sure it could happen 🙂.

Follow Ups

Overall, this was fun to research, and I learned a lot about I2P and data analysis. In the future, I hope to write another implementation that is much more modular, to make up for a lot of the poorly thought-out design decisions I made early on.

If you’re a researcher and would like the data, feel free to contact me.