All demos
Tech Demo

Replicate by Tree: PlumtreeKV

A software-only way to see a peer-to-peer overlay replicate shared state: a multi-writer key-value store with no server. Set a key in your browser and watch it spread to every peer along a broadcast tree; because each key is last-writer-wins, all nodes converge to the identical registry no matter what order the updates arrive in — and every node reads its own local copy, with no coordinator. It's a hands-on look at the same membership and dissemination building blocks our platform runs on — and a starting point for your own distributed-systems prototypes in Java.

Java 17 Babel HyParView Plumtree Multi-writer Open Source
Feature-complete; first release coming. The store works end-to-end today — membership, tree-based dissemination, convergence, snapshot-on-join, self-healing after node failure, and a live web UI. For now, build it from source (below); a downloadable release and browsable API docs land with the first tagged version. Star the repo to follow along.

Who it's for

This demo is for students, researchers, and engineers who want to understand — and start building — peer-to-peer and decentralized systems, without first hand-rolling sockets, membership, failure detection, and message dissemination. You write ordinary Java; Babel gives you protocols, events, and timers as clean composition units, and this store wires real overlay protocols together end-to-end so you can read, run, and modify them.

It's a companion to our Babel P2P chat and gossip canvas: where the canvas floods each update to a random fanout of neighbours, PlumtreeKV steps up to a broadcast tree — it sends one copy per link in the common case, yet keeps gossip's resilience through a lazy back-channel that repairs the tree when links or nodes fail. It's the efficient end of the same family.

Read it, run it, fork it, and build your own replicated application on top. If that's you, we'd love to hear what you make.

What it is

A replicated key-value store — a small, shared configuration registry — built from real Babel overlay protocols. There is no central server: every participant runs the same program, the nodes form a peer mesh, and every node may write any key. Each write is disseminated across the overlay, and every node applies writes last-writer-wins per key, so any two nodes that have seen the same writes hold the identical value everywhere — convergence is purely a matter of the dissemination reaching everyone, which is exactly what makes the registry a clear, watchable demonstration (and a clean correctness oracle: each node periodically reports a 64-bit digest of its live entries).

What it shows — and why it matters

Behind the key-value table, the demo makes a few properties of decentralized systems tangible — the reasons this kind of overlay is worth building on:

  • No server, no single point of failure. Every node is equal and writable; there is nothing in the middle to provision, scale, or bring the whole thing down.
  • It scales because nobody needs to know everyone. HyParView gives each node only a small partial view of the system — a handful of neighbours — yet those views stitch together into one connected overlay. The state and connections each node maintains stay small as the system grows.
  • Dissemination is efficient and robust. Plumtree carries each update along an embedded spanning tree — one copy per link, none of gossip's redundant fanout — while a lazy IHAVE back-channel announces what each node has along the remaining links. That lazy channel is what recovers a dropped message and re-grows the tree, so you get gossip-grade reliability at a fraction of the messages.
  • It heals itself — no separate repair protocol. Plumtree's lazy push is its recovery: a node missing an announced update grafts the announcer to fetch it. Stop a node and the survivors recover what they missed and reach full convergence on their own, with no anti-entropy layer behind it.
  • Many writers, a tree each. The default MultiPlumtree builds a separate, latency-optimal tree per writer, so a workload where every node writes doesn't fight over one shared tree. Each row in the UI is colour-coded by the writer that owns it, so the per-writer trees are visible at a glance.
  • Everyone ends up agreeing. Last-writer-wins per key makes the registry converge to the same state on every node regardless of the order updates arrive in — eventual consistency you can literally watch settle.

These are the same properties ParadigmShift's platform relies on to build robust, self-managed, decentralized systems — here distilled into something you can open in a browser and poke at.

What it looks like

Each node serves a small web UI: the shared registry as a live table, a box to set or delete a key, and a side panel listing this node's current overlay neighbours. Write a key on one node and it appears on all of them within a dissemination round; click any row to load its key and value back into the editor for a quick edit or overwrite. On startup each node prints exactly how it bound, which dissemination protocol it's running, and how it will find peers:

  PlumtreeKV — replicated key-value store
  network     : en0  →  192.168.1.20   (auto-detected sole interface en0)
  membership  : 192.168.1.20:6000  (HyParView, TCP)
  dissemination: MultiPlumtree (per-writer trees), shared HyParView channel
  web UI      : http://localhost:8000/
  discovery   : multicast 233.138.122.123
  bootstrap   : auto-discovery — probe the LAN and connect to whoever answers

Convergence is observable, not just asserted: every node periodically reports a digest of its registry, and once dissemination settles the digests match across the mesh — the same signal our validation harness checks automatically, including that the survivors still converge after a node is killed.

How it works

Three cooperating layers — the real overlay stack our platform builds on, with the store on top:

  • Membership — HyParView. A partial-view membership protocol: each node keeps a small active view (its overlay neighbours) and a larger passive view of fallback peers, using random walks to propagate joins and periodic shuffles to refresh the passive view. The result is a connected, self-healing overlay that survives churn without anyone holding the full membership. HyParView was introduced by Leitão et al. (DSN 2007 — see references).
  • Dissemination — Plumtree. A push-lazy-push multicast tree: each update is eager-pushed down an embedded spanning tree to neighbours, and lazy-pushed as a compact IHAVE announcement along the rest of the overlay. A node that receives a duplicate eager copy prunes that tree link; a node that sees only an IHAVE for something it's missing grafts the announcer — turning a lazy link back into a tree link. The tree pays gossip's redundancy only once, at the edges, and the lazy channel keeps it reliable and self-repairing. Plumtree was introduced by Leitão et al. (SRDS 2007 — see references).
  • Key-value application. Turns each set/delete into one operation, broadcasts it, and applies every delivered operation to a last-writer-wins registry (a delete is a tombstone an older set can't resurrect). A joining node fetches the current registry from a neighbour so it sees existing state at once. The web UI is served by the JDK's built-in HTTP server — no extra dependencies.

Two trees, one binary. A single switch (plumtreekv.protocol) chooses the dissemination protocol: MultiPlumtree (the default) builds one tree per writer, ideal when many nodes write; Plumtree builds a single shared tree optimised for the first broadcaster, cheaper in state and signalling when one node dominates writes. Both are the same protocol family, sharing their wire messages.

The layers never call each other directly — they cooperate through the reusable babel-protocols-common abstractions (broadcast request/delivery, neighbour up/down, the shared-channel announcement), the very same building blocks our production protocols use.

Where this technology fits

A replicated key-value registry is a friendly stand-in for a broad class of systems that need many parties to share slowly-changing state without a server in the middle — and tree-based dissemination is exactly the tool when updates are frequent enough that gossip's redundancy would be wasteful. The same Plumtree design turns up in production, for example:

  • Cluster metadata and configuration stores. Plumtree was created for precisely this: Riak's cluster_metadata / riak_core_broadcast and Leapsight's plum_db replicate ring state, bucket types, and config across every node by exactly this push-lazy-push tree.
  • Edge and IoT fleets. Spreading configuration, feature flags, or device state across thousands of gateways and sensors with no central broker to provision or overload — the decentralized, self-managing edge ParadigmShift's platform targets. The c12s edge platform replicates its control-plane state this way.
  • CRDT and delta-state dissemination. Conflict-free replicated data types need their updates delivered to every replica reliably and cheaply; a broadcast tree is a natural carrier, and convergence (here, last-writer-wins) does the rest.
  • Service discovery and membership metadata. Propagating "who is here and what they offer" across a large, churning fleet — the kind of view that has to reach everyone quickly but changes far less often than telemetry does.
  • Decentralized and blockchain networks. Block, transaction, and mempool propagation increasingly lean on structured gossip / broadcast-tree overlays to reach every participant robustly while keeping per-node bandwidth bounded as the network grows.

What these share is the trade the demo makes visible: give up the central coordinator, and in return get a system that scales with its members, tolerates failure and churn, and — by building a tree instead of flooding — spreads each update at close to the theoretical minimum cost.

Run it

You need Java 17 and Maven. Build the runnable JAR once:

mvn package

Then start a node. It serves a small web UI and opens your browser at it automatically (it also prints the address — by default http://localhost:8000/; pass plumtreekv.ui.open=false to suppress the auto-open):

java -jar target/babel-plumtree-demo.jar

That's the whole demo on one node — set a key and you're writing to the shared registry. Start more nodes to watch them converge; how they find each other is just below. To run the single-shared-tree variant instead of the per-writer default, add plumtreekv.protocol=single.

Running two (or more) nodes

A node joins by connecting to one node already in the overlay, and that first introduction happens one of two ways: automatically, by multicast — nodes on the same local network find each other with no addresses typed — or explicitly, by handing a node the address of an existing peer, which works anywhere two nodes can open a connection. Once a node is in, HyParView's gossip takes over and it discovers the rest on its own. Pick the option below that fits your setup.

How it works. Each node joins a well-known IP multicast group and periodically announces itself and listens on it; when one node hears another, the overlay introduces them — so nodes on the same network find each other with no addresses typed. Multicast isn't routed across the internet, so this works only within one local network, and a VPN, firewall, multi-NIC machine, or (recent macOS) the “Local Network” privacy permission can block it. It's opt-in: name the discovery protocol on the command line.

One node per machine on the same LAN — run this on each:

java -jar target/babel-plumtree-demo.jar \
  babel.discovery=pt.unl.fct.di.novasys.babel.core.protocols.discovery.MulticastDiscoveryProtocol

To put two multicast nodes on one machine, also give each a distinct port and discovery socket:

DISC=babel.discovery=pt.unl.fct.di.novasys.babel.core.protocols.discovery.MulticastDiscoveryProtocol
java -jar target/babel-plumtree-demo.jar $DISC
java -jar target/babel-plumtree-demo.jar $DISC babel.port=6010 babel.discovery.unicast.port=1027

How it works. You tell a joining node where an existing one is with HyParView.contact=<host>:<port>; it dials that peer to get into the overlay, and HyParView's gossip does the rest — the contact is just the door in, not a coordinator or relay. The first node (with nobody to contact) is marked HyParView.contact=none. This uses no multicast at all — nothing extra to configure — so it works on one machine, across networks, anywhere two nodes can open a TCP connection.

Two nodes on one machine, pinned to loopback:

# first node — the one others contact
java -jar target/babel-plumtree-demo.jar babel.address=127.0.0.1 babel.port=6000 HyParView.contact=none
# second node — dials the first to join
java -jar target/babel-plumtree-demo.jar babel.address=127.0.0.1 babel.port=6010 HyParView.contact=127.0.0.1:6000

Across machines, give the second node the first node's reachable address instead of 127.0.0.1.

Either way, to run several nodes on one machine give each a distinct babel.port spaced by at least 10; the dissemination protocol shares HyParView's channel (no second port to manage), and the web-UI port follows automatically (babel.port + 2000). If a node auto-selects the wrong network interface, set babel.interface=<nic> (e.g. en0, eth0) or babel.address=<ip> explicitly — that always wins. For a robust tree, keep HyParView's active view at ≥ 4 neighbours (the bundled config uses 5).

Source & credits

PlumtreeKV is a ParadigmShift tech demo, built on ParadigmShift's own implementations of the Babel framework and of the two protocols beneath the key-value store — HyParView membership and Plumtree tree-based broadcast (here in its MultiPlumtree, per-writer-tree variant). These protocols are established distributed-systems research: HyParView was introduced by Leitão et al. at DSN 2007, and Plumtree — the push-lazy-push “Epidemic Broadcast Trees” — by Leitão et al. at SRDS 2007; both are cited below. Our implementations, evolved, maintained, and distributed by ParadigmShift, descend from the original Babel, HyParView, and Plumtree implementations developed at NOVA FCT by the Computer Systems Group of NOVA LINCS, in the context of the TaRDIS European project.

The protocols underpinning this demo are described in:

  • J. Leitão, J. Pereira, and L. Rodrigues, “Epidemic Broadcast Trees,” in Proc. 26th IEEE Int'l Symp. on Reliable Distributed Systems (SRDS'07), Beijing, China, Oct. 2007, pp. 301–310, doi: 10.1109/SRDS.2007.27. [PDF]
  • J. Leitão, J. Pereira, and L. Rodrigues, “HyParView: A Membership Protocol for Reliable Gossip-Based Broadcast,” in Proc. 37th Annual IEEE/IFIP Int'l Conf. on Dependable Systems and Networks (DSN'07), Edinburgh, UK, Jun. 2007, pp. 419–429, doi: 10.1109/DSN.2007.56. [PDF]
  • J. Leitão, Gossip-Based Broadcast Protocols, MSc thesis, Faculdade de Ciências da Universidade de Lisboa, 2007.
  • P. Fouto, P. Á. Costa, N. Preguiça, and J. Leitão, “Babel: A Framework for Developing Performant and Dependable Distributed Protocols,” in Proc. 41st Int'l Symp. on Reliable Distributed Systems (SRDS), Vienna, Austria, Sep. 2022, pp. 146–155, doi: 10.1109/SRDS55811.2022.00022. [PDF]

The MultiPlumtree (per-writer-tree) variant additionally adopts design elements — per-root trees and the lazy re-advertisement / acknowledgement model — from Riak's riak_core_broadcast, originally implemented by Jordan West (source). No code from that project is included; the plumtree library carries the full acknowledgement.

Build with us

This demo is open source — read it, run it, fork it, and build your own replicated application on top of Babel. We'd love to see what you make.