Covenant AI’s PULSE: Making Decentralized RL for LLMs as Fast as Centralized Training

Covenant AI just released PULSE (Patch Updates via Lossless Sparse Encoding), a technique that slashes bandwidth for weight synchronization in decentralized reinforcement learning (RL) by 100×+, while staying completely lossless (bit-identical reconstruction, SHA-256 verified on every sync).

Media credit: Erfan Miahi

The biggest bottleneck in decentralized RL is weight synchronization: training can run on fast interconnects, but inference nodes are spread around the globe on ordinary internet links. Syncing full model weights becomes a bandwidth nightmare. A 7B checkpoint is ~14 GB, and resyncing that much data over and over drags down performance.

PULSE fixes this by exploiting a powerful (and honestly surprising) property: RL weight updates are extremely sparse—around 99% sparse per step.

What makes this solid is how they measured it: bitwise sparsity between consecutive steps, not just a comparison of the initial and final checkpoints. They tested across model families, sizes, and asynchronous delays of up to 32 steps. Sparsity consistently stayed above 98% under realistic training conditions.
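To make the measurement concrete, here is a minimal sketch of per-step bitwise sparsity. It assumes each checkpoint is a flat list of raw weight bit patterns stored as integers. The function names and toy data are hypothetical and are not taken from the PULSE code.

```python
def bitwise_sparsity(prev: list[int], curr: list[int]) -> float:
    """Fraction of weights whose stored bit pattern is identical in both checkpoints."""
    if len(prev) != len(curr):
        raise ValueError("checkpoints must have the same number of weights")
    unchanged = sum(a == b for a, b in zip(prev, curr))
    return unchanged / len(prev)


def per_step_sparsity(checkpoints: list[list[int]]) -> list[float]:
    """Sparsity between each pair of consecutive checkpoints."""
    return [bitwise_sparsity(a, b) for a, b in zip(checkpoints, checkpoints[1:])]


# Toy example: three checkpoints of four weights each (hypothetical bit patterns).
checkpoints = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 7, 5]]
step_sparsity = per_step_sparsity(checkpoints)
first_vs_last = bitwise_sparsity(checkpoints[0], checkpoints[-1])
```

In this toy run each step leaves 75% of the weights untouched, while the first-versus-last comparison reports only 50%. That gap is why measuring step by step gives a different picture than comparing endpoints.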

The reason lies in the update itself. Gradients are dense, but after the Adam step in BF16 precision, at RL learning rates around 1–3×10⁻⁶, most updates are too small to change the stored BF16 value, so they round away to nothing. So you get sparsity “for free” with standard RL hyperparameters.

Learning rate matters a lot: lower LR → smaller deltas → more values fall below BF16 precision → higher sparsity.
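The rounding effect can be illustrated with a small sketch that emulates BF16 using only the standard library. It assumes a per-weight update roughly the size of the learning rate, which is a simplification of how Adam behaves. The helper names and the example weights are hypothetical.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # rounding bias for the 16 dropped bits
    bits &= 0xFFFF0000                   # keep sign, exponent, 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def step_changes_weight(weight: float, update: float) -> bool:
    """True if subtracting `update` changes the weight once it is stored in BF16."""
    before = to_bf16(weight)
    after = to_bf16(before - update)
    return after != before


# Near 0.02 the BF16 spacing is 2**-13 (about 1.2e-4), far larger than 3e-6.
typical_weight_moves = step_changes_weight(0.02, 3e-6)
# Near 1e-4 the spacing is 2**-21 (about 4.8e-7), so the same update registers.
tiny_weight_moves = step_changes_weight(1e-4, 3e-6)
# A much larger update (as with a higher learning rate) moves the typical weight too.
large_update_moves = step_changes_weight(0.02, 1e-4)
```

Under these assumptions the update to the 0.02 weight vanishes, while the update to the much smaller weight survives. A larger step size changes both, which matches the learning-rate trend described above.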

From there, they built the full compression pipeline:

Bitwise checkpoint diff: extract changed indices + values
Delta encoding for indices
Type downscaling (uint8/16 for small deltas)
zstd compression
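The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PULSE implementation: weights are modeled as a list of 16-bit patterns, the patch layout is invented for the example, type downscaling is applied to the index gaps, and `zlib` from the standard library stands in for zstd. All function names are hypothetical.

```python
import hashlib
import random
import struct
import zlib


def make_patch(old: list[int], new: list[int]) -> bytes:
    # 1. Bitwise diff: positions whose 16-bit pattern changed, plus the new values.
    idx = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
    vals = [new[i] for i in idx]
    # 2. Delta-encode indices: store the gap from the previous changed index.
    gaps = [j - i for i, j in zip([0] + idx[:-1], idx)]
    # 3. Type downscaling: pick the narrowest unsigned type that fits every gap.
    max_gap = max(gaps, default=0)
    code, width = ("B", 1) if max_gap < 2**8 else ("H", 2) if max_gap < 2**16 else ("I", 4)
    header = struct.pack("<IB", len(idx), width)
    body = struct.pack(f"<{len(gaps)}{code}", *gaps) + struct.pack(f"<{len(vals)}H", *vals)
    # 4. General-purpose compression (zlib here as a stand-in for zstd).
    return zlib.compress(header + body, 9)


def apply_patch(old: list[int], patch: bytes) -> list[int]:
    raw = zlib.decompress(patch)
    n, width = struct.unpack_from("<IB", raw, 0)
    code = {1: "B", 2: "H", 4: "I"}[width]
    offset = struct.calcsize("<IB")
    gaps = struct.unpack_from(f"<{n}{code}", raw, offset)
    vals = struct.unpack_from(f"<{n}H", raw, offset + n * width)
    new, pos = list(old), 0
    for gap, val in zip(gaps, vals):
        pos += gap          # undo the delta encoding
        new[pos] = val
    return new


def digest(words: list[int]) -> str:
    """SHA-256 over the raw 16-bit words, used to check bit-identical reconstruction."""
    return hashlib.sha256(struct.pack(f"<{len(words)}H", *words)).hexdigest()


# Toy checkpoint: 10,000 weights, 1% of them changed in the next step.
rng = random.Random(0)
old = [rng.randrange(2**16) for _ in range(10_000)]
new = list(old)
for i in rng.sample(range(len(old)), 100):
    new[i] ^= 1  # flip one bit so the value is guaranteed to differ
patch = make_patch(old, new)
restored = apply_patch(old, patch)
```

Comparing `digest(restored)` with `digest(new)` is the lossless check: the receiver only accepts the patch if the hashes match.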

Average compression lands at ~79×, often reaching 100×+.

And this isn’t theoretical. PULSE is already deployed live on Grail AI — Covenant’s decentralized, incentivized RL network.

In real deployment:

Patch sizes stabilize around ~108 MB (~130× smaller than 14 GB full syncs)
Bandwidth drops from 20 Gbit/s to 0.2 Gbit/s
GPU utilization stays around 90%
Every sync is SHA-256 verified, meaning zero information loss and no drift
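As a quick check on the deployment figures, the sketch below reproduces the size ratio and estimates transfer times. The 100 Mbit/s link speed is an assumption chosen for illustration and does not come from the article. Sizes use decimal units (1 GB = 10⁹ bytes).

```python
FULL_SYNC_BYTES = 14e9           # ~14 GB full checkpoint (from the article)
PATCH_BYTES = 108e6              # ~108 MB patch (from the article)
ASSUMED_LINK_BITS_PER_S = 100e6  # hypothetical 100 Mbit/s consumer connection


def transfer_seconds(size_bytes: float, link_bits_per_s: float) -> float:
    """Idealized transfer time, ignoring protocol overhead and latency."""
    return size_bytes * 8 / link_bits_per_s


reduction = FULL_SYNC_BYTES / PATCH_BYTES
full_sync_s = transfer_seconds(FULL_SYNC_BYTES, ASSUMED_LINK_BITS_PER_S)
patch_s = transfer_seconds(PATCH_BYTES, ASSUMED_LINK_BITS_PER_S)

print(f"size reduction: {reduction:.1f}x")
print(f"full sync: {full_sync_s:.0f} s, patch: {patch_s:.2f} s")
```

On that assumed link, a full sync takes close to 19 minutes while a patch takes under 10 seconds.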

That bandwidth reduction is the real unlock. It takes what used to require datacenter-grade networking and makes it viable over normal consumer connections. Decentralized RL starts matching centralized performance.

Learn more about the PULSE update:

Paper: https://arxiv.org/pdf/2602.03839

Code: https://github.com/one-covenant/grail
