Last month, we identified **a minor issue** in our implementation of the BLS signature on G1 used in the newly launched `fastnet` network. This RFC compliance issue led us to plan the launch of a new, compliant `quicknet` network and to spin down `fastnet`.
While not a security issue for drand, the problem affects our “hash to curve” function, which maps round numbers to a point on the elliptic curve that then gets signed using **threshold BLS** by the drand network. The issue relates to the upcoming Hash To Curve RFC 9380, which mandates specific “Domain Separation Tags” (DST) for certain curves. In the BLS curve case, where we have two different groups (G1 and G2) that we can map to, the RFC recommends using a different DST for each group (it’s the point of having a DST!). Sadly, our implementation of BLS signatures was initially written to perform signatures on G2 and not on G1… (Stay tuned for an upcoming blog post about the choice of the group for BLS signatures!) Because our codebase used a global variable for the DST, both our G1 and G2 implementations shared the same DST. Since we don’t have “official test vectors” for BLS signatures on G1 and G2, this went unnoticed until **someone tried** to verify our G1 signatures with a C++ implementation and reported the issue in our Slack.
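To make the role of the DST concrete, here is a minimal sketch of the first stage of hashing to the curve, the `expand_message_xmd` function from RFC 9380 with SHA-256. It is not drand's code. The two DST strings are the ones defined for the BLS signature ciphersuites and are assumed here for illustration, and `round_message` is a hypothetical helper that assumes an unchained beacon signs the SHA-256 digest of the round number encoded as 8 big-endian bytes. The later stages that map these bytes onto G1 are omitted.

```python
import hashlib
import math

# Assumed for illustration: DSTs from the BLS signature ciphersuites.
DST_G1 = b"BLS_SIG_BLS12381G1_XMD:SHA-256_SSWU_RO_NUL_"
DST_G2 = b"BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_NUL_"


def expand_message_xmd(msg: bytes, dst: bytes, len_in_bytes: int) -> bytes:
    """expand_message_xmd from RFC 9380, instantiated with SHA-256."""
    b_in_bytes = hashlib.sha256().digest_size  # hash output size (32)
    s_in_bytes = hashlib.sha256().block_size   # hash input block size (64)
    ell = math.ceil(len_in_bytes / b_in_bytes)
    if ell > 255 or len_in_bytes > 65535 or len(dst) > 255:
        raise ValueError("invalid expand_message_xmd parameters")

    dst_prime = dst + bytes([len(dst)])          # DST || I2OSP(len(DST), 1)
    z_pad = bytes(s_in_bytes)                    # block of zero bytes
    l_i_b_str = len_in_bytes.to_bytes(2, "big")  # I2OSP(len_in_bytes, 2)

    b_0 = hashlib.sha256(z_pad + msg + l_i_b_str + b"\x00" + dst_prime).digest()
    b_i = hashlib.sha256(b_0 + b"\x01" + dst_prime).digest()
    out = b_i
    for i in range(2, ell + 1):
        xored = bytes(x ^ y for x, y in zip(b_0, b_i))
        b_i = hashlib.sha256(xored + bytes([i]) + dst_prime).digest()
        out += b_i
    return out[:len_in_bytes]


def round_message(round_number: int) -> bytes:
    # Hypothetical helper; assumes the unchained message is
    # SHA-256(round number as 8 big-endian bytes).
    return hashlib.sha256(round_number.to_bytes(8, "big")).digest()


msg = round_message(1)
# Hashing to G1 needs 2 field elements of 64 bytes each, hence 128 bytes.
u_g1 = expand_message_xmd(msg, DST_G1, 128)
u_g2_tag = expand_message_xmd(msg, DST_G2, 128)
print("same bytes with both DSTs:", u_g1 == u_g2_tag)
```

The same round number expands to unrelated bytes under the two tags, so it ends up at a different curve point. A verifier that follows the RFC and uses the G1 tag will therefore reject a G1 signature produced over a point derived with the other tag, even though the signature itself is mathematically sound.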
We launched our `fastnet` Testnet in February to try and identify any potential issues, while also allowing people to start building their applications on top of our new design featuring unchained randomness and timelock capabilities. Despite this, the issue went unnoticed in every signature verification implementation that we tried with our drand beacons (including our TypeScript **drand-client** codebase and two different third-party Rust implementations of drand verification!). Amusingly, within weeks of this issue being identified, a second team, building their own timelock scheme on top of our new `fastnet` network, also identified the issue and reported it to us.
For us, the main takeaways here are:
- This issue means that all beacons emitted for the new `fastnet` network that we launched on March 1st feature signatures that are non-compliant with the hash to curve spec, and the same is true for beacons from our testnet.
- While not a security issue for our usage, using the wrong domain separation tag to map points on G1 is non-compliant with the RFC and therefore not great for future compatibility and adoption of our new network. We have **already implemented a new, RFC-compliant scheme** for drand, affectionately named `bls-unchained-g1-rfc9380`.
During investigations, we identified 3 main ways to solve this issue:
- Launch a new `quicknet` network and keep the existing `fastnet` network running, causing a 90% increase in the load on our existing nodes and forcing us to increase our tech debt and maintenance burden, but not disrupting any existing users. (The verifiability of drand beacons allows our users to re-use or redistribute them without us knowing about it. This means that we do not have visibility into our user base, and therefore getting in contact with users to notify them of a “`fastnet` shutdown” is not an option.)
- Launch a new `fastnet` network (`quicknet`) using a compliant implementation of the signature function and shut down the existing non-compliant one immediately. This would inevitably leave our current `fastnet` users completely stranded, having to switch quickly to our new `quicknet` network, causing serious disruption for them.
- Launch a new `quicknet` network using a compliant implementation of the signature function and sunset the existing one over multiple months before shutting it down entirely. This would allow our users to ensure 100% uptime of their services while gradually migrating to the new network.

Given the above, the League of Entropy has voted, using its governance process, to choose the last option, and therefore will:
- Sunset the `fastnet` network, reduce the number of Mainnet nodes running it (from 21 with a threshold of 11 to a committee of 14 with a threshold of 8), stop onboarding new nodes to it, and most importantly stop it entirely as soon as possible.
- Launch a new `quicknet` network with the same settings, e.g. 3 seconds frequency, 2 seconds catch-up period, **unchained randomness** (thus compatible with timelock schemes), except that it will use our newly released `bls-unchained-g1-rfc9380` scheme, which is RFC-compliant.
- Sunset the `testnet-g` network, shutting it down before Mainnet, in order to allow us to effectively test the shutdown scenario.
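As a quick sanity check on the committee figures quoted above, both are consistent with a simple-majority threshold, that is, the smallest number of nodes strictly greater than half the committee. The helper below is hypothetical and only reproduces that arithmetic. It is an observation about the numbers in this post, not a statement of how the League configures its thresholds.

```python
def majority_threshold(n: int) -> int:
    # Hypothetical helper: smallest integer strictly greater than n / 2.
    return n // 2 + 1


for committee_size in (21, 14):
    print(committee_size, "nodes -> threshold", majority_threshold(committee_size))
```

For 21 nodes this gives 11, and for 14 nodes it gives 8, matching the figures in the list.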