fastnet to be sunset, long live quicknet

Last month, we have identified **a minor issue** in our implementation of the BLS signature on G1 used in the newly launched fastnet network. This is an RFC compliance issue which led us to plan the launch of a new compliant quicknet network and to spin down fastnet.

The Issue

While not a security issue for drand, the problem is affecting our “hash to curve” function, used to map round numbers to a point on the elliptic curve that get signed using **threshold BLS** by the drand network. The issue relates to the upcoming Hash To Curve RFC 9380, which mandates specific “Domain Separation Tags“ (DST) for certain curves. In the BLS curve case, where we have two different groups (G1 and G2) that we can map to, the RFC recommends using different DST for both groups (it’s the point of having a DST!). Sadly, our implementation of BLS signatures was initially written to perform signatures on G2 and not on G1… (Stay tuned for an upcoming blog post about the choice of the group for BLS signatures!) The usage of a global variable in our codebase for the DST meant that both our G1 and G2 implementations were sharing the same DST. Since we don’t have “official test vectors” for BLS signatures on G1 and G2, this went unnoticed until **someone tried** to verify our signatures done on G1 with a C++ implementation and reported the issue in our Slack.

We launched our fastnet Testnet in February to try and identify any potential issues, while also allowing people to start building their applications on top of our new design featuring unchained randomness and timelock capabilities. Despite this, the issue went unnoticed in all signature verification implementations that we tried with our drand beacons (including our typescript **drand-client** codebase and two different third party Rust implementations of drand verification!). Amusingly, within weeks of this issue being identified, a second team, building their own timelock scheme on top of our new fastnet network, also identified the issue and reported it to us.

For us, the main takeaways here are:

Make sure to have as many test vectors as possible, generated using multiple, different implementations.
Generate more noise and community outreach around new network launches, including new testnets.
Expect early adopters to take at least 2-3 months to start testing their implementations building on top of your new features, and therefore plan your testnet and mainnet launches accordingly.

Next Steps

This issue means that all beacons emitted for the new fastnet network that we launched on March 1st are featuring signatures that are non-compliant with the hash to curve spec, and so is the case for beacons from our testnet.

While not a security issue for our usage, using the wrong domain separator tag to map points on G1 is non-compliant with the RFC and therefore not great for future compatibility and adoption of our new network. We have **already implemented a new, RFC-compliant scheme** for drand, affectionately named bls-unchained-g1-rfc9380.

During investigations, we identified 4 main ways to solve this issue:

Do nothing and document the non-compliance of the signatures on the drand website, however this would have caused significant friction for future adoption, reducing the usability and the developer experience of drand.
Create a new quicknet network and keep the existing fastnet network running, causing a 90% increase of the load of our existing nodes, forcing us to increase our tech debt and maintenance burden but not disrupting any existing users. (The verifiability of drand beacons allows our users to re-use them or redistribute them without us knowing about it. This further means that we do not have visibility into our user-base, and therefore, getting in contact with them to notify them of a “fastnet shutdown” is not an option).
Create a new fastnet network (quicknet) using a compliant implementation of the signature function and shutdown the existing non-compliant one immediately. This would inevitably cause our current fastnet users to be completely stranded and having to switch quickly to our new quicknet network, causing serious disruption for our users.
Create a new quicknet network using a compliant implementation of the signature function and sunset the existing one over multiple months before shutting it down entirely. This would allow our users to ensure 100% uptime of their services, while gradually migrating to the new network.

Given the above, The League of Entropy has voted and elected to choose the last option, using its governance process, and therefore will:

Sunset the current fastnet network, reduce the number of Mainnet nodes running it (from 21 with a threshold of 11 to a committee of 14 with a threshold of 8), stop onboarding new nodes to it, and most importantly stop it entirely as soon as possible.
launch a new quicknet network with the same settings, e.g. 3 seconds frequency, 2 seconds catch-up period, **unchained randomness** (thus compatible with timelock schemes), except it would be using our newly released bls-unchained-g1-rfc9380 scheme which is RFC-compliant.
do the same on our Testnet testnet-g network, shutting it down before Mainnet, in order to allow us to effectively test the shutdown scenario.