torrentMonitor: Tracking torrent user downloads

Python and Elasticsearch for torrent tracking

A few weeks ago I read a news article about how Disney’s new Mulan had become one of the most pirated films in history. That article presented some stats about user downloads, and I had always thought about doing something similar. So let’s see how we can create our own tracking system.

BitTorrent protocol

Direct downloads now seem to be a thing of the past, and the most popular option for downloading content is the BitTorrent network. BitTorrent has been evolving since its initial release in 2001, and its use of a DHT (distributed hash table) has made it almost impossible to block access to any type of content inside the network. To download a file, we just need to ask the BitTorrent network for a specific hash. Someone has to be sharing that file, but BitTorrent avoids the classical client/server architecture by using a P2P network. To remove a file from the network, we would have to stop every client that is sharing it. That is not feasible in practice, and even if we managed to do it, anyone who had already downloaded the file could start sharing it again.
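
To make the "specific hash" concrete: for classic (v1) torrents, it is the SHA-1 digest of the bencoded info dictionary stored inside the .torrent file. The following is a minimal sketch using only the standard library; the helper names bdecode and info_hash are made up for this illustration and are not part of libtorrent or torrentMonitor.

```python
import hashlib


def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at data[i]; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    colon = data.index(b":", i)            # byte string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


def info_hash(torrent_bytes: bytes) -> str:
    """Return the hex SHA-1 of the raw bencoded 'info' dictionary (v1 torrents)."""
    if torrent_bytes[:1] != b"d":
        raise ValueError("a .torrent file must be a bencoded dictionary")
    i = 1
    while torrent_bytes[i:i + 1] != b"e":
        key, i = bdecode(torrent_bytes, i)
        start = i                          # the value starts here in the raw bytes
        _, i = bdecode(torrent_bytes, i)
        if key == b"info":
            return hashlib.sha1(torrent_bytes[start:i]).hexdigest()
    raise ValueError("no info dictionary found")
```

The hash is computed over the raw bytes of the info dictionary exactly as they appear in the file, rather than over a re-encoded copy, so a file with unusual key ordering still yields the hash that other peers use.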

On the other hand, BitTorrent is not a private network. The “privacy issues” associated with the BitTorrent protocol are the ones related to each peer's connection. The moment you start downloading a file, you make your IP address and some metadata public. This happens because as soon as you start downloading a file you also start sharing it with the rest of the network. In a client/server scenario only the server knows your address, but because this is a collaborative network you also share content with other users, called peers. Moreover, you do not need the full file to start sharing: BitTorrent splits the file into pieces, so every piece you have downloaded can already be uploaded to other peers. Once you hold the complete file, you become a seeder.

What information can we collect?

I already mentioned the possibility of obtaining the IP address of each user downloading a file from the network, but there is also client metadata. Some BitTorrent clients publish their name and version, and that information can give us some interesting insights. For example, which clients and client versions are being used to download a certain file?
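
One common way clients announce themselves is the 20-byte peer ID, where many of them follow the so-called Azureus-style convention: a dash, a two-letter client code, four version characters, and another dash. The sketch below decodes that prefix. The function name and the small lookup table are illustrative assumptions, the table is far from complete, and libtorrent already performs this kind of decoding for us.

```python
# Small, non-exhaustive table of Azureus-style client codes.
CLIENT_CODES = {
    "AZ": "Azureus/Vuze",
    "DE": "Deluge",
    "LT": "libtorrent (Rasterbar)",
    "lt": "libTorrent (Rakshasa)",
    "qB": "qBittorrent",
    "TR": "Transmission",
    "UT": "µTorrent",
}


def decode_peer_id(peer_id: bytes):
    """Return (client, raw_version) for an Azureus-style peer ID, else None."""
    if len(peer_id) != 20 or peer_id[0:1] != b"-" or peer_id[7:8] != b"-":
        return None                      # some other convention, or no convention
    try:
        code = peer_id[1:3].decode("ascii")
        version = peer_id[3:7].decode("ascii")
    except UnicodeDecodeError:
        return None
    # The four version characters are returned untouched because clients
    # do not agree on how to map them to a dotted version number.
    return CLIENT_CODES.get(code, f"unknown ({code})"), version
```

Peers that use a different scheme, or that randomize the whole ID, simply return None here, which matches the observation that not every client reveals its name and version.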

[Figure: Peers tab in the Deluge client]

Monitor torrents with libtorrent

TL;DR: download torrentMonitor and use it.

OK, so how can we monitor the downloads of a torrent? In my opinion, the easiest way is to use an open-source library that implements the BitTorrent protocol, or to modify an open-source client. In my case, I chose libtorrent, a C++ implementation of the BitTorrent protocol, and its Python bindings.

To use libtorrent, the first thing we need to do is install it. The easiest way is to use apt on Ubuntu/Debian systems or brew on macOS:

sudo apt install python3-libtorrent
brew install libtorrent-rasterbar

After completing this step, libtorrent and its Python bindings will be ready on our system. Usage is pretty simple, and there are very useful implementation examples in the project's GitHub repository.

import sys
import time
import libtorrent as lt
# Start a torrent session listening on port 6881
ses = lt.session({'listen_interfaces': '0.0.0.0:6881'})
# Load the .torrent file passed as the first command-line argument
info = lt.torrent_info(sys.argv[1])
h = ses.add_torrent({'ti': info, 'save_path': '.'})
s = h.status()
print('starting', s.name)

Then you can start downloading any file published on the BitTorrent network and collect all the peers (the people sharing and downloading it) using the get_peer_info() method.

while not s.is_seeding:
    s = h.status()
    # Collect the peers currently connected for this torrent
    peers = h.get_peer_info()
    print('\r%.2f%% complete (down: %.1f kB/s up: %.1f kB/s peers: %d) %s' % (
        s.progress * 100, s.download_rate / 1000, s.upload_rate / 1000,
        s.num_peers, s.state), end=' ')
    # Print any error alerts raised by the session
    alerts = ses.pop_alerts()
    for a in alerts:
        if a.category() & lt.alert.category_t.error_notification:
            print(a)
    sys.stdout.flush()
    time.sleep(1)

This code is a very small modification of the simple_client.py example, and it stops collecting peers once the file has been downloaded. After making some changes and adding further functionality, I created torrentMonitor, with a monitoring scenario in mind.

Modifications

torrentMonitor has several modifications that make torrent monitoring easy. One of the first things I tweaked was the download rate limit. I am not looking to download any files, so I do not need to run the client at full speed. After some tests, I found that 10000 bytes per second is enough to download peer information and metadata; with lower rate limits, client information was not retrieved.

'download_rate_limit': 10000,
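
As a rough sketch of where such a setting can live, assuming it is passed in the same settings dictionary that creates the session: monitor_settings and open_session below are hypothetical helper names, and torrentMonitor may organize this differently.

```python
def monitor_settings(download_rate_limit=10_000, listen="0.0.0.0:6881"):
    """Build a libtorrent settings dictionary for a low-bandwidth monitor."""
    return {
        "listen_interfaces": listen,
        # Bytes per second; libtorrent treats 0 as "unlimited".
        "download_rate_limit": download_rate_limit,
    }


def open_session(settings=None):
    """Create a libtorrent session with the monitoring settings.

    libtorrent is imported lazily so that monitor_settings() stays usable
    on machines where the bindings are not installed.
    """
    import libtorrent as lt
    return lt.session(settings or monitor_settings())
```

Keeping the limit in one place makes it easy to experiment with other values when a given swarm does not return client information.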

This is important because each peer obtained using get_peer_info() has an IP address, a port, and a client. As mentioned before, not all BitTorrent clients send their name and version.
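
A minimal sketch of turning those peer objects into plain records could look like this. It assumes each peer exposes an ip attribute holding an (address, port) tuple and a client attribute that may be str or bytes depending on the version of the bindings; peer_record and collect_peers are illustrative names, and a stand-in object is used so the snippet runs without libtorrent.

```python
from types import SimpleNamespace  # only needed for the stand-in peer below


def peer_record(peer):
    """Flatten one peer_info-like object into a plain dictionary."""
    address, port = peer.ip            # assumed to be an (address, port) tuple
    client = peer.client
    if isinstance(client, bytes):      # some binding versions return bytes
        client = client.decode("utf-8", errors="replace")
    # An empty client string means the peer did not identify itself.
    return {"ip": address, "port": port, "client": client or None}


def collect_peers(peers, seen=None):
    """Merge peers into `seen`, keyed by (ip, port), and return it."""
    seen = {} if seen is None else seen
    for peer in peers:
        record = peer_record(peer)
        seen[(record["ip"], record["port"])] = record
    return seen


# Stand-in object; in a real run the list would come from h.get_peer_info().
fake = SimpleNamespace(ip=("203.0.113.7", 51413), client=b"Deluge 2.0.3")
print(collect_peers([fake]))
```

Keying the dictionary by (ip, port) means a peer that stays connected across several polling rounds is only stored once.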

Geolocation

Another capability that I programmed into torrentMonitor is IP geolocation. torrentMonitor can use a MaxMind database for this. To use this functionality, you will need to download the MaxMind GeoLite2 Country database.

This can be done in two steps: set up a MaxMind account and then download the GeoLite2 Country database.
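
Once the database file is on disk, a country lookup can be sketched as follows. This assumes the geoip2 Python package and a file named GeoLite2-Country.mmdb in the working directory; both are assumptions for illustration, as are the helper names, and torrentMonitor's own lookup code may differ.

```python
import ipaddress

try:
    from geoip2.errors import AddressNotFoundError
except ImportError:                       # geoip2 is not installed
    AddressNotFoundError = LookupError    # placeholder so the sketch still imports


def open_reader(path="GeoLite2-Country.mmdb"):
    """Open the MaxMind database (the default path is an assumption)."""
    import geoip2.database
    return geoip2.database.Reader(path)


def country_code(reader, ip):
    """Return the ISO country code for `ip`, or None if it cannot be located."""
    if not ipaddress.ip_address(ip).is_global:
        return None                       # private or reserved address
    try:
        return reader.country(ip).country.iso_code
    except AddressNotFoundError:
        return None                       # address is not in the database
```

Opening the reader once and reusing it for every peer avoids reloading the database on each lookup.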

Saving data

All the information collected by torrentMonitor can be saved as a CSV file using the -o option, but in my opinion a better option is to save the data in Elasticsearch.

python3 torrentMonitor.py -t AB4DEB4C2B2BE9EBCEB74955B3727BA45060C34B.torrent -o peers
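
For readers who want to reproduce the CSV output on their own, here is a minimal sketch with the standard csv module. The column names are hypothetical, and the actual file written by torrentMonitor may use a different layout.

```python
import csv
import io

FIELDS = ["ip", "port", "client", "country"]   # hypothetical column names


def write_peers_csv(peers, stream):
    """Write peer dictionaries to an open text stream as CSV."""
    # extrasaction="ignore" drops keys that are not listed in FIELDS;
    # keys missing from a record are written as empty cells.
    writer = csv.DictWriter(stream, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for peer in peers:
        writer.writerow(peer)


buffer = io.StringIO()
write_peers_csv(
    [{"ip": "203.0.113.7", "port": 51413, "client": "Deluge 2.0.3", "country": "ES"}],
    buffer,
)
print(buffer.getvalue())
```

To write to disk instead of memory, the same function can be given a file opened with open("peers.csv", "w", newline="").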

Elasticsearch allows us to search, analyze, and visualize the data in real time, and it is open source, so everyone can set up their own installation. torrentMonitor already incorporates Elasticsearch support through the -ek option. By default, torrentMonitor will try to communicate with an Elasticsearch cluster at localhost:9200. If you are running your cluster somewhere else, or with authentication, you will need to modify the Elasticsearch(...) line in the following function.

def save_elasticsearch_es(self, index, data):
    es = Elasticsearch(hosts="localhost:9200")
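
When many peers are collected at once, sending them in a single bulk request is usually cheaper than indexing them one by one. The sketch below is not torrentMonitor's code: to_actions and save_peers are hypothetical helpers, the torrent and @timestamp fields are assumed names, and it relies on the helpers.bulk function from the official elasticsearch Python client.

```python
from datetime import datetime, timezone


def to_actions(index, peers, torrent_name):
    """Yield one bulk action per peer in the shape helpers.bulk() expects."""
    now = datetime.now(timezone.utc).isoformat()
    for peer in peers:
        yield {
            "_index": index,
            # Field names "torrent" and "@timestamp" are assumptions.
            "_source": {**peer, "torrent": torrent_name, "@timestamp": now},
        }


def save_peers(es, index, peers, torrent_name):
    """Send all peers in one bulk request; returns (success_count, errors)."""
    from elasticsearch import helpers     # imported lazily
    return helpers.bulk(es, to_actions(index, peers, torrent_name))
```

For a remote or secured cluster, the es client would be created with the cluster URL and credentials. The exact keyword for credentials depends on the client version (basic_auth in the 8.x client, http_auth in 7.x), so it is worth checking against the version you have installed.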

Visualization

Bringing in Elasticsearch gives us the option of using Kibana to visualize and study our results. In the past, tools like Pentaho were useful for visualization and analysis, but right now Kibana offers more and better possibilities.

With Kibana it is really fast and easy to create visualizations, such as the most used clients or a map representing the number of downloads per country. I am working on some analyses of torrent downloads, like Mulan or The Mandalorian, but that information will be published in other posts.
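
The same two questions can be sanity-checked offline before building any dashboard. This is a small sketch that counts values over peer records held in memory; the field names client and country are assumptions about how the records are stored.

```python
from collections import Counter


def top_values(records, field, n=5):
    """Return the n most common values of `field`, skipping missing ones."""
    counts = Counter(r[field] for r in records if r.get(field))
    return counts.most_common(n)


records = [
    {"client": "qBittorrent 4.2.5", "country": "ES"},
    {"client": "qBittorrent 4.2.5", "country": "US"},
    {"client": "Deluge 2.0.3", "country": "ES"},
    {"client": None, "country": None},          # peer that revealed nothing
]
print(top_values(records, "client"))
print(top_values(records, "country"))
```

In Kibana the equivalent is a terms aggregation on the same fields, which scales to far more records than an in-memory counter.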

[Figure: Top BitTorrent clients]
[Figure: Choropleth map representing downloads by country]

Running torrentMonitor

To run torrentMonitor, you need some magnet links or torrent files. torrentMonitor can read either of them using the -t option.

python3 torrentMonitor.py -t AB4DEB4C2B2BE9EBCEB74955B3727BA45060C34B.torrent -g -ek

The previous example tracks a torrent file, geolocates all the peers, and saves all the results to an Elasticsearch cluster.
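
Since the same option accepts both kinds of input, it helps to see what a magnet link actually carries. The sketch below pulls the info hash, display name, and trackers out of a magnet URI with the standard library. parse_magnet is an illustrative helper, not a function from torrentMonitor, and the example link reuses the hash from the command above with a made-up name and tracker.

```python
from urllib.parse import parse_qs, urlparse


def parse_magnet(uri):
    """Return (info_hash, display_name, trackers) from a magnet URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)
    # "xt" (exact topic) holds the hash; BitTorrent v1 uses the urn:btih: prefix.
    for xt in params.get("xt", []):
        if xt.startswith("urn:btih:"):
            info_hash = xt[len("urn:btih:"):]
            break
    else:
        raise ValueError("no BitTorrent info hash (xt=urn:btih:...) found")
    name = params.get("dn", [None])[0]          # optional display name
    return info_hash, name, params.get("tr", [])  # optional tracker URLs


link = ("magnet:?xt=urn:btih:AB4DEB4C2B2BE9EBCEB74955B3727BA45060C34B"
        "&dn=example&tr=udp%3A%2F%2Ftracker.example.org%3A1337")
print(parse_magnet(link))
```

A magnet link contains no file metadata at all, only the hash and optional hints, which is why a client has to fetch the metadata from other peers before it can list any files.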

Conclusions and further work

The tool is already published on GitHub, so feel free to download it and play with it! Also, to clarify: I am not being paid by any company to do this. I am just a cybersecurity researcher looking at how people use the BitTorrent network. This kind of monitoring can be done by anybody, and the release of this tool simply highlights the need for some kind of privacy measure, like a VPN.

As for further work, it would be great to integrate a VPN detection service and try to find out how many people are using VPNs or other privacy solutions. I did not find any free VPN detection service without a lot of limitations. If you are looking to do something like that, my recommendation is getipintel. You will need to pay for heavier use, because only the first 500 queries per day are free. But after testing several options, I think it is the most reliable and affordable one.

Finally, if you would like to see some research results about the BitTorrent network, just follow me ;) because I am planning to release some posts with in-depth analysis of popular torrent downloads.

Bachelor of Computer Science and MSc in Cyber Security. Currently working as a cybersecurity researcher at the University of Alcalá.
