Research: A novel user-friendly system for monitoring darknet marketplaces
Over the past few years, darknet markets have emerged as online platforms for trading various forms of illicit goods and services, including drugs, weapons, counterfeit documents, stolen private data, and hacking tools. As such, it is pivotal for law enforcement agencies all over the world to develop effective means of monitoring darknet marketplaces and tracing individuals who engage in illegal activities on these dark web-based platforms.
A recently published research study introduced the design of a novel system for collecting information about the goods and services offered for sale on darknet marketplaces. The proposed system lets users search the harvested data and alerts them whenever changes take place on the monitored marketplaces. In this article, we give an overview of this system and of how it can benefit cybercrime investigation units.
Design of the system:
The goal of this research is to develop a system for monitoring darknet marketplaces, or cryptomarkets, for the cybercrime division of the SKPV police, Czech Republic. The system obtains data from cryptomarkets and organizes it into structured data that can be easily searched. The system crawls the same page repeatedly in order to track changes such as new product listings, price changes, etc. Users of the system will be notified, e.g. via email, whenever any changes are identified. The system maps darknet marketplaces and harvests the following information:
– Overview of product listings
– Prices of different products, shipping information, and other parameters that vary from one product listing to another
– Vendor information including number of completed sales, buyers’ feedback, etc.
A Tor proxy (SOCKS5) is used so that the system’s login module and the crawler can access darknet marketplaces. The login module creates and stores a cookie that enables the crawler to collect information from the marketplace while emulating a user on the platform. As all cryptomarkets use CAPTCHA, the system relies on a CAPTCHA rescue service to overcome this problem. These services rely on human power to solve the CAPTCHA and send the solution automatically via an API to the login module.
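To make the proxy and cookie handling more concrete, here is a minimal sketch in Python. It is not the authors' code: the names TorProxy, parse_session_cookie, and CookieStore are hypothetical, the proxy is assumed to be a Tor daemon listening locally on its default SOCKS port 9050, and the in-memory store only stands in for the system's database. No network access or CAPTCHA handling is shown.

```python
from dataclasses import dataclass
from http.cookies import SimpleCookie


@dataclass(frozen=True)
class TorProxy:
    """Proxy settings for a locally running Tor daemon (assumed setup)."""
    host: str = "127.0.0.1"
    port: int = 9050  # default SocksPort of a standalone Tor daemon

    def as_proxies(self) -> dict:
        # The "socks5h" scheme asks the HTTP client to resolve hostnames
        # through the proxy, which .onion addresses require. This mapping
        # has the shape many HTTP clients accept as a proxy setting.
        url = f"socks5h://{self.host}:{self.port}"
        return {"http": url, "https": url}


def parse_session_cookie(set_cookie_header: str) -> dict:
    """Turn a Set-Cookie header value into a name -> value mapping."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


class CookieStore:
    """In-memory stand-in for the database that holds session cookies."""

    def __init__(self):
        self._by_market = {}

    def save(self, market: str, cookies: dict) -> None:
        self._by_market[market] = dict(cookies)

    def load(self, market: str) -> dict:
        # Return a copy so callers cannot modify the stored cookies.
        return dict(self._by_market.get(market, {}))


if __name__ == "__main__":
    store = CookieStore()
    # Hypothetical header, as a login response might return it.
    store.save("example-market",
               parse_session_cookie("session=abc123; Path=/; HttpOnly"))
    print(TorProxy().as_proxies())
    print(store.load("example-market"))
```

In this sketch the login module would call save() once per marketplace, and the crawler would call load() before each run so that it reuses the same session.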
The crawler’s task is to browse through HTML pages, search them for predefined data, and find links in them to other content pages. The crawler is programmed to find product listings from different categories, e.g. cannabis, opioids, synthetic cannabinoids, etc.
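The crawling loop described above can be sketched as a breadth-first traversal that saves listing pages and follows links. This is an illustration under stated assumptions, not the paper's implementation: the fetch and is_listing callables, the example URLs, and the max_pages limit are all hypothetical, and fetch is passed in so that the sketch runs without any network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, is_listing, max_pages=100):
    """Breadth-first crawl that returns {url: html} for listing pages.

    fetch(url) returns the page HTML or None; is_listing(url) decides
    whether a page should be kept for the scraper. Both are assumptions.
    """
    seen = {start_url}
    pending = deque([start_url])
    saved = {}
    visited = 0
    while pending and visited < max_pages:
        url = pending.popleft()
        html = fetch(url)
        visited += 1
        if html is None:
            continue
        if is_listing(url):
            saved[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                pending.append(target)
    return saved


if __name__ == "__main__":
    # Tiny in-memory "site" standing in for a marketplace.
    pages = {
        "http://market.example/": '<a href="/category/a">A</a>',
        "http://market.example/category/a": '<a href="/listing/1">1</a>',
        "http://market.example/listing/1": "<h1>Item 1</h1>",
    }
    found = crawl("http://market.example/", pages.get,
                  lambda url: "/listing/" in url)
    print(sorted(found))
```

A real crawler would also restrict itself to the marketplace's own host and pace its requests; both are left out here for brevity.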
The crawler has to be configured individually for each marketplace before launch, and a session cookie has to be created for it. The crawler was tested on Dream Market, which included more than 4,000 pages. It works with the session cookie that the login module created and stored in the database, and it saves the pages it obtains as HTML documents for the scraper.
After the crawler saves the HTML documents, the scraper module extracts information about individual products from them. The extracted data is compared with the data previously saved in the database, and if a new item is detected, it is added to the database. If the data concerns a product that was already in the database, the scraper only adds the modifications that have been identified. The scraper also records a timestamp for every database update.
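The compare-and-update step can be sketched as follows. This is a minimal illustration rather than the system's actual code: upsert_listing is a hypothetical name, a plain dictionary stands in for the NoSQL database, and the record layout (data, history, updated_at) is an assumption.

```python
from datetime import datetime, timezone


def upsert_listing(db, listing_id, scraped, now=None):
    """Insert a new listing or store only the fields that changed.

    Returns a description of the change, or None if nothing changed.
    """
    now = now or datetime.now(timezone.utc)
    record = db.get(listing_id)

    if record is None:
        # First time this listing is seen: store it in full.
        db[listing_id] = {"data": dict(scraped), "history": [], "updated_at": now}
        return {"type": "new", "id": listing_id, "changes": dict(scraped)}

    # Keep only the fields whose value differs from the stored one.
    changes = {k: v for k, v in scraped.items() if record["data"].get(k) != v}
    if not changes:
        return None

    # Remember the old values together with the time of the update.
    old_values = {k: record["data"].get(k) for k in changes}
    record["history"].append({"at": now, "old": old_values})
    record["data"].update(changes)
    record["updated_at"] = now
    return {"type": "changed", "id": listing_id, "changes": changes}


if __name__ == "__main__":
    db = {}
    print(upsert_listing(db, "item-1", {"title": "Example", "price": 10.0}))
    print(upsert_listing(db, "item-1", {"title": "Example", "price": 12.5}))
```

Returning the change description, instead of only writing to the store, gives other components something to react to, for instance when deciding whether to notify a user.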
Collected data is stored in datasheets in NoSQL databases. The main reasons for choosing NoSQL databases are the ease of implementation, the flexible database model, and the feasibility of notifying other system components of data modifications.
The database will be used to store the following data:
1- Cookies created via the login module to crawl various marketplaces
2- Description of products obtained via the crawler and organized via the scraper
3- Timestamps of the data extracted by the scraper
To make it easy for users to access the data, an index of the recorded data is included in the database. Database candidates include RethinkDB and MongoDB, while Apache Solr or Elasticsearch can be used for the database index.
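Search engines such as Apache Solr and Elasticsearch are built around an inverted index, which maps each term to the records that contain it. The toy version below only illustrates that idea; it does not use either product's API, and the InvertedIndex class, the tokenizer, and the sample records are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Maps each term to the set of record ids whose text contains it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, record_id, text):
        for term in tokenize(text):
            self._postings[term].add(record_id)

    def search(self, query):
        """Return the ids of records that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


if __name__ == "__main__":
    index = InvertedIndex()
    index.add("item-1", "Example ebook bundle")
    index.add("item-2", "Example software license")
    print(index.search("example ebook"))
```

A production index adds ranking, stemming, and persistence, which is why the article names dedicated search engines rather than a hand-written structure.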
The notification system:
The notification system is composed of two components:
1- The notification queue, which receives changes to the data stored in the database
2- The notifier, which processes the notification queue and informs the user about the changes made. The user can be notified via the web application’s push notifications, or via notifications sent to external services such as Twitter or Slack.
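The two components above can be sketched with Python's standard queue module. This is an assumption-laden illustration, not the system's implementation: the Notifier class, the format_change helper, and the shape of a change record (type, id, changes) are hypothetical, and each delivery channel is modeled as a plain callable instead of a real push, Twitter, or Slack client.

```python
import queue


def format_change(change):
    """Render one change record as a short, human-readable message."""
    details = ", ".join(f"{k}={v}" for k, v in sorted(change["changes"].items()))
    return f"[{change['type']}] {change['id']}: {details}"


class Notifier:
    """Drains the notification queue and forwards messages to channels."""

    def __init__(self, channels):
        # Each channel is a callable that takes the message text.
        self._channels = list(channels)

    def drain(self, notifications):
        """Deliver every queued change; return how many were delivered."""
        delivered = 0
        while True:
            try:
                change = notifications.get_nowait()
            except queue.Empty:
                return delivered
            message = format_change(change)
            for send in self._channels:
                send(message)
            delivered += 1


if __name__ == "__main__":
    notifications = queue.Queue()
    notifications.put({"type": "changed", "id": "item-1",
                       "changes": {"price": 12.5}})
    Notifier([print]).drain(notifications)
```

Keeping the queue between the database and the notifier means a slow or failing channel does not hold up the scraper, which only has to enqueue a change and move on.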
The web application:
The web application is the main method via which the user can browse the data harvested by the system. The web application is composed of two main parts: a backend that handles the application’s logic as well as data acquisition and a frontend that enables the user to manage the application. The frontend is the application’s presentation layer for the data requested by the user. The frontend is displayed via a web browser and has three modules: a search engine module, a visualization module, and a notification module.
Testing the system:
A prototype of the system was tested on Dream Market. The test crawl was restricted to one of the largest categories on this marketplace: digital goods. At the time of the experiment, this category contained 49,263 items. The system created one user account via the login module, which successfully logged into the market to crawl its content. The run lasted around 45 minutes and collected data on 46,657 product listings, which corresponds to a data acquisition rate of approximately 62,209 items per hour.
During the data acquisition run, the system was blocked by DDoS protection once every 30 requests. For every identified product, the system obtained product description, price, vendor, shipping details, and feedback of buyers who bought the same product in the past. The system’s test run showed an accuracy of 94.71%.
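The reported figures can be checked with a few lines of arithmetic. The article does not say how the accuracy was measured; one reading that reproduces the 94.71% figure, offered here only as an assumption, is the share of the category's listed items that the run collected.

```python
listed = 49_263      # items in the digital goods category at test time
collected = 46_657   # product listings collected during the run
minutes = 45         # approximate duration of the run

# Items per hour, extrapolated from the 45-minute run.
rate_per_hour = collected / (minutes / 60)

# Share of listed items that were collected (assumed meaning of "accuracy").
coverage = collected / listed

print(f"{rate_per_hour:.0f} items per hour")         # 62209 items per hour
print(f"{coverage:.2%} of listed items collected")   # 94.71% of listed items collected
```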
This user-friendly darknet marketplace monitoring system is one of the first systems of its kind, and it can be helpful not only to law enforcement agents but also to researchers interested in studying cryptomarkets. However, the system can be easily detected and blocked by the administrators of these marketplaces, so it is recommended to look for ways to mask it, for example by rotating between different user accounts or by regularly changing Tor nodes. Another area that would benefit from further development is finding ways to expose the information collected by the system and make it available to law enforcement agencies and researchers.