Research: Three novel user node selection algorithms for I2P
As internet applications continue to develop at a rapid rate, anonymous technologies play a pivotal role in promoting personal privacy. I2P is one of the most popular anonymous online protocols that provides internet users with high anonymity levels via its advanced communication and encryption algorithms. Nevertheless, I2P does not focus on users’ preferences, which renders it difficult to meet the specific demands of individual users.
A recently published research paper proposes three innovative user-oriented node selection schemes that can boost anonymity and lower communication delay across the I2P network. BNSA, or basic node selection algorithm, is introduced to group routing nodes in a unique manner that can provide node candidates with high performance. Two more algorithms, known as the communication delay oriented node selection algorithm (CDNSA) and the geographic diversity oriented node selection algorithm (GDNSA), are proposed on the basis of the BNSA. Throughout this article, we will overview these three novel I2P user-oriented node selection algorithms.
Basic node selection algorithms (BNSA):
The tunnel node selection algorithm designed for the I2P network operates in source route mode, i.e. the sender is responsible for choosing tunnel nodes and then creating a tunnel. Apart from other anonymous communication networks, routing nodes across the I2P network do not trust performance information provided by other routing nodes, because ‘lazy’ nodes may claim lower performance for free-riding, or malicious ones can occasionally claim higher performance to boost the probability of joining a target tunnel intentionally. As such, the routing nodes across the I2P network actively measure the performance of other routing nodes, such as bandwidth, workload, tunnel creation success rate, and reachability. In the meantime, node evaluation continuously updates the status in real time, including the time a routing node spends to respond to the query, the number of tunnel failures that a routing node takes part in, and the last communication time of a routing node.
In the study we are reviewing, the researchers refer to a tunnel as the rerouting path, and tunnel nodes are chosen according to node evaluation. Nonetheless, the existing node evaluation mainly relies on two factors: speed and capacity. Consequently, in order to illustrate the comprehensive performance of routing nodes, the researchers introduce several new evaluation factors for BNSA. These factors include capacity, bandwidth, online time, and reachability/delay.
BNSA divides the I2P routing nodes into three groups, namely, Reachable Group (RG), High-Reliable Group (HRG), and High-Performance Group (HPG). HPG represents a subset of HRG, and HRG represents a subset of RG. In BNSA’s design, all of the routing nodes are set into different groups in each polling based on the aforementioned evaluation factors, which include capacity, bandwidth, and online time. Specifically, developers of the algorithm first initiate these three groups and then test their reachability. All of the reachable nodes are then added into the RG. Thereafter, the reliability of each node within the RG group is scored on the basis of its capacity and online time. The nodes with scores higher than the average score, are inserted into the HRG. Within the HRG, the performance score of every node is calculated according to its bandwidth, and the nodes whose score is higher than the average score are inserted into the HPG. Finally, HPG will represent the basis of the GDNSA and CDNSA algorithms.
Geographic Diversity Oriented Node Selection Algorithms (GDNSA):
Malicious I2P nodes, controlled by attackers, not only could refuse to serve, but can also phish user information via conspiring with other malicious ones. If a tunnel hosts many malicious nodes, it is hard to guarantee communication anonymity. In order to protect the I2P network against such attacks, some approaches act to increase the length of a tunnel in order to reduce the proportion of malicious nodes within the tunnel. However, a long tunnel results in a significant rise in communication delay, because the process of encryption and decryption of data packets requires high overhead. Other approaches act to increase the size of the anonymous network to reduce the proportion of malicious nodes. However, this requires a large number of trusted nodes, which is hard to deploy.
Because there are more routing nodes in developed countries, such as the US and the UK, it is likely that multiple nodes within the tunnel are located within the same geographical location using a traditional node selection algorithm. The attackers can easily implement their high-performance malicious nodes without a high cost to crack the anonymity of communications. As such, authors of the paper propose the GDNSA, which can increase attack cost via selection of routing nodes in different regions to create a tunnel. GDNSA represents an optimized node selection algorithm based on BNSA.
Figure (1): GDNSA’s algorithm (algorithm 2)
As shown in figure (1), the input is the HPG, and tunnel length is represented by L. The output is routing node queue TN, containing L − 1 routing nodes that take part in the tunnel. GDNSA first initializes the TN to an empty queue and then divides the routing nodes in the HPG into n groups on the basis of different regions. Thereafter, it randomly arranges the n groups to produce the regions queue AQ=. Following that, it performs L − 1 routing node selection. In the kth selection round, the first element Ai within the AQ is selected. If Ai is not empty, a routing node Nk is selected randomly from Ai to append to the TN tail. Thereafter, Nk is removed from Ai and Ai is moved to the tail of AQ. In the meantime, AQ =, TN=, where 1 ≤ k ≤ L − 1, i = k%n. In the end, the queue of L −1 routing nodes TN= can be generated.
Communication Delay Oriented Node Selection Algorithm (CDNSA):
Occasionally, users expect anonymous systems to yield lower communication delay. For instance, in instant messaging or online browsing, if the communication delay is relatively high, it will significantly impact user experience. The traditional node selection algorithm increases transmission efficiency via selection of routing nodes with higher bandwidth, yet the increase in bandwidth is not necessarily associated with reduction in communication delay. For instance, the routing nodes within tunnel A are associated with better bandwidth, but the delay of each hop within tunnel A is relatively higher. The bandwidth of the routing nodes within tunnel B is lower than that of those within tunnel A, but the delay of each hop within tunnel B is relatively lower. If the communication data is small, it is clear that tunnel B has a lower communication delay than tunnel A. Consequently, authors of the paper propose the CDNSA, which produces lower communication delay via reduction of the delay of each hop within the tunnel.
Figure (2): CDNSA’s algorithm (algorithm 3)
CDNSA’s algorithm is illustrated in figure (2). The input is the HPG and tunnel length is represented by L. The output is the routing node queue TN formed by L − 1 routing nodes. In the beginning, TN is initialized to an empty queue and the lowest delay L − 1 nodes are selected in the HPG to form LDS0. Thereafter, a routing node N1 is randomly selected and added into TN. At this time, TN =. Thereafter, N1 also uses L−1 nodes associated with the lowest delay in its HPG to form LDS1, and randomly adds an unused routing node N2 from the LDS1 to TN. In the meantime, TN=. CDNSA iterates continuously until TN= is generated.
The paper proposes three user oriented node selection algorithms. Theoretical analysis and experimental testing prove the effectiveness of these novel node selection algorithms, which can boost system anonymity and lower communication delay across the I2P network. In the future, developers of these algorithms plan to investigate misbehavior of I2P nodes that might be compromised. Differentiating compromised nodes from normal nodes can boost anonymity and provide better node candidates.