CRITICAL - Blockpool Technical Update - 2019-01-14
The Blockpool main chain will undergo a hard fork at block height 2,500,000. All BPL delegates and relay nodes will need to be updated before then. At this height, the network will no longer communicate with nodes below version 0.5.0.

Situation

The BPL chain is currently in a state where it can’t be synced from zero. This is because there are blocks from round 9827 (height 1975042) onwards that are forged out of place and cause “expected generator” errors. The chain is currently propagating forward since all nodes are using the same snapshot that has those out of place blocks, so everyone is in agreement despite the blocks being inherently in the wrong slots.

Cause

The cause of the divergence can be attributed to two independent events. The first problem started occurring in round 9810, when the vote balances of delegates in the current snapshot started to drift from their true values. These discrepancies were on the order of one or two BPLtoshis for those delegates that were affected. Initially this did not cause a problem, but as the drift accumulated, it eventually caused two delegates to swap ranks as we entered round 9827, and that changed the forging order for that round. When syncing from zero, the drift is not present, and balances remain accurate, leading to disagreement for that round. The drift persisted for several more rounds, until it closed back to zero in round 9831. This issue was responsible for three blocks being forged out of turn.
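
To illustrate how a drift of one or two BPLtoshis can change the forging order, here is a minimal sketch. The delegate names, the balances, and the name-based tie-break are all hypothetical; the real ranking code may differ.

```python
# Illustrative only: delegate names and balances (in BPLtoshis) are made up.
def rank(balances):
    """Order delegates by vote balance, highest first (name breaks ties)."""
    return sorted(balances, key=lambda name: (-balances[name], name))

# True state, as a node synced from zero would compute it.
true_state = {"delegate_a": 1_000_000_001, "delegate_b": 1_000_000_000}

# Snapshot state: delegate_b has accumulated two BPLtoshis of drift.
drifted = dict(true_state)
drifted["delegate_b"] += 2

true_order = rank(true_state)      # delegate_a ranks first
drifted_order = rank(drifted)      # delegate_b ranks first
```

When two delegates are this close in vote balance, the two nodes disagree on rank and therefore on who is expected to forge in a given slot.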

The second event occurred during round 9823, when a block was propagated that contained several unvote transactions. Two of the unvotes in that block were properly handled, but the other three were not fully processed, and thus continued to be counted for subsequent rounds. Of the three unvotes, only one affected an active delegate. Because the vote was not removed, this delegate maintained a higher rank in the bad snapshot database than it would have had if the vote had been properly removed. It is only by coincidence that this issue does not immediately cause an error on a cleanly synced chain. At the time of this event, many delegates, including the one affected by the unvote, were not forging. Because of this, the delegate did not have a chance to forge out of turn. However, once the network recovered from the incident, the delegate and others began to forge based on the order determined by not having the unvote transaction processed. This causes continuous “expected generator” errors on a cleanly synced chain, since every round order is improperly calculated, and because those delegates are forging, someone always forges out of turn multiple times per round.

Solution

Correcting these issues, so that the chain can sync from zero and then persist in the proper state, involves allowing the three blocks from round 9827 to be forged out of place, and then continuing to count the votes from round 9823 that should’ve been removed, until a round by which everyone has had ample time to update.

The solution for the three blocks forged during round 9827 is fairly straightforward. The update will allow blocks by the specific delegate, in that particular round, to be forged in place of the delegate the code would normally have expected. No other delegate, in any other round, is given this exception.

The solution for the unvote transaction issue is a bit more involved. Upon beginning to sync the chain from zero, everything will proceed normally until round 9823 has been downloaded and processed. At the completion of the round, prior to calculating the forging order for round 9824, the three votes will be added back into the database, to generate the same forging order as the one in the corrupted snapshot. These votes will persist in the clean database and continue to influence the forging order on the clean-synced node, so that it matches that of nodes using the bad snapshot and the chain can reach current height. This will continue until block height 2499838 (end of round 12437, around January 22nd), at which point the votes will be removed from the database, restoring it to its proper state. All block forging order from that point onwards will be calculated from the proper vote balances. This interim period will allow everyone ample time to update their nodes to version 0.4.3, which contains this code.
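
The window during which the three votes are kept can be sketched as a simple predicate. This is an assumption-laden illustration, not the code shipped in 0.4.3: the function and constant names are hypothetical, and it assumes the votes are already gone when height 2499838 itself is processed.

```python
# Hypothetical names; the round and height values come from this update.
PATCH_START_ROUND = 9823       # votes are re-added once this round completes
PATCH_END_HEIGHT = 2_499_838   # votes are dropped at this height

def patch_votes_present(last_completed_round: int, height: int) -> bool:
    """Return True while the three re-added votes should still be counted."""
    return last_completed_round >= PATCH_START_ROUND and height < PATCH_END_HEIGHT
```

A syncing node would consult a check like this before calculating each round's forging order.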

Once the chain reaches height 2499838, nodes running version 0.4.3 will drop the votes, and will ignore blocks sent from nodes running versions older than 0.4.3.
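
As a rough sketch of that rule, a node could compare the sending peer's version against 0.4.3 once the cutoff height is reached. The function names are hypothetical, and the parsing assumes plain "major.minor.patch" version strings.

```python
# Hypothetical helper names; version and height values come from this update.
MIN_VERSION = (0, 4, 3)
CUTOFF_HEIGHT = 2_499_838

def parse_version(version: str) -> tuple:
    """Turn '0.4.3' or 'v0.4.3' into a comparable tuple such as (0, 4, 3)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))

def accept_block_from_peer(peer_version: str, height: int) -> bool:
    """Before the cutoff accept any peer; afterwards require 0.4.3 or newer."""
    if height < CUTOFF_HEIGHT:
        return True
    return parse_version(peer_version) >= MIN_VERSION
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "0.10.0" sorting below "0.4.3".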

Fixed Rewards

During the same round, specifically at height 2500000, BPL will activate fixed rewards. This means that delegates will forge an additional 5 BPL in every block, on top of the normal rewards. This will be provided by update v0.5.0, which will be available shortly after v0.4.3. This update will be a hard fork, so older nodes will not be allowed to communicate with those running v0.5.0.
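
The reward change amounts to a height check. In this minimal sketch the function name is hypothetical, the normal reward is passed in as a parameter because its value is not stated here, and the block at height 2500000 itself is assumed to be the first to carry the fixed reward.

```python
# Hypothetical names; the height and the 5 BPL figure come from this update.
FIXED_REWARD_HEIGHT = 2_500_000
FIXED_REWARD_BPL = 5

def block_reward(height: int, normal_reward_bpl: int) -> int:
    """Return the total reward in BPL for the block at the given height."""
    extra = FIXED_REWARD_BPL if height >= FIXED_REWARD_HEIGHT else 0
    return normal_reward_bpl + extra
```

For example, with an illustrative normal reward of 3 BPL, the total stays at 3 BPL below the activation height and becomes 8 BPL from it onward.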

Technical Summary

First issue: When comparing the total votes allocation between two nodes, one restored from the bad snapshot and one synced from zero, the balances begin to drift in round 9810 by ~30 BPLtoshis. This is spread out among several delegates, with the affected delegates having a vote total 1 or 2 BPLtoshis higher on the bad snapshot than on the true state of the zero-synced node. This persists for about 20 rounds, with the discrepancy varying round by round, but remaining on the order of tens of BPLtoshis. By round 9831, the discrepancy closes back to zero. In round 9827, this discrepancy shifted the rank of delegate BPL_MC_116 up by one and swapped it with BPL_MC_132. This altered the round forging order and 116 forged in 132’s place, causing an “expected generator” (EG) error. The mitigation is to have the slot validation allow 116 to forge in 132’s place, for round 9827 ONLY. There were 3 blocks forged by 116 that were on the wrong slot, but because of network issues at the time, they all occurred in round 9827, and 132 was not affected as it was not forging at that time. The code for this mitigation specifically checks for round 9827, the forger public key for 116, and the expected public key for 132, and only when all three conditions are met will it allow the out of place block to be validated.
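
The three-condition check described above could look roughly like the following. This is a sketch rather than the actual patch: the function name is hypothetical, and the two key constants are placeholders standing in for the real public keys, which are not reproduced here.

```python
# Placeholders only: the real public keys are not listed in this update.
EXCEPTION_ROUND = 9827
KEY_BPL_MC_116 = "<public key of BPL_MC_116>"
KEY_BPL_MC_132 = "<public key of BPL_MC_132>"

def slot_is_valid(round_number: int, forger_key: str, expected_key: str) -> bool:
    """Accept a block if forged by the expected delegate, or by the one exception."""
    if forger_key == expected_key:
        return True  # normal case: the expected generator forged the block
    # Exception: in round 9827 only, 116 may forge in 132's slot.
    return (
        round_number == EXCEPTION_ROUND
        and forger_key == KEY_BPL_MC_116
        and expected_key == KEY_BPL_MC_132
    )
```

Requiring all three conditions keeps the exception as narrow as possible: the same pair of delegates in any other round, or any other pair in round 9827, still fails validation.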

Second issue: In round 9823, five unvote transactions were broadcast. Two of those unvotes were fully and properly processed. The remaining three unvotes were only processed partially; entries referring to the unvote tx were added to the votes table, but the relevant votes were not removed from the table holding all active votes. This table is used to calculate delegate vote balances, so having those votes still present altered the delegate ranks. This caused one of the delegates to have a higher rank than it should have had (only one of those three unvotes was for an active delegate), and resulted in a different forging order. Once again, this led to persistent EG errors, as it altered the order for every round onward. There was no immediate EG error since that delegate was down at the time. The mitigation is to have a node replicate the state of the DB during this window. A node syncing from zero will properly process all five of those unvote txs, which would lead to an EG error. So the solution is to insert the three votes that remained in the bad snapshot back into the table at the end of the round in which they were broadcast. That will cause the round order to match that of the blocks being forged. Those votes will remain until round 12437, at which point they will be removed from the table and the DB will reflect the true state of the votes. From that point onward, nodes that have been running from the bad snapshot will be the ones getting the EG error instead.
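
The difference between a fully and a partially processed unvote can be modelled with two in-memory stand-ins for the tables mentioned above. Everything here is hypothetical (table shapes, names, balances); it only illustrates why the leftover vote inflates a delegate's balance and how re-inserting it on a clean node makes both states agree.

```python
# Hypothetical stand-ins: a list for the unvote tx entries, a set for active votes.
def apply_unvote(vote_log, active_votes, tx_id, voter, delegate, complete=True):
    """Record an unvote tx; remove the active vote only if processing completes."""
    vote_log.append(tx_id)                       # entry referring to the unvote tx
    if complete:
        active_votes.discard((voter, delegate))  # the step that was skipped

def vote_balance(active_votes, wallet_balances, delegate):
    """Sum the balances of all wallets with an active vote for the delegate."""
    return sum(wallet_balances[v] for v, d in active_votes if d == delegate)

wallets = {"voter_1": 500, "voter_2": 300}       # illustrative balances
clean = {("voter_1", "delegate_x"), ("voter_2", "delegate_x")}
bad = set(clean)
clean_log, bad_log = [], []

apply_unvote(clean_log, clean, "tx_1", "voter_2", "delegate_x", complete=True)
apply_unvote(bad_log, bad, "tx_1", "voter_2", "delegate_x", complete=False)

clean_balance = vote_balance(clean, wallets, "delegate_x")  # vote removed
bad_balance = vote_balance(bad, wallets, "delegate_x")      # vote still counted

# Mitigation: re-insert the vote on the clean node so both nodes agree.
clean.add(("voter_2", "delegate_x"))
patched_balance = vote_balance(clean, wallets, "delegate_x")
```

Both nodes record the same unvote entry, yet only the clean node lowers the delegate's balance; after the re-insert, the clean node reproduces the bad snapshot's balance and hence its forging order.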
