Move Less, Move Faster: Speeding Up Citus Cluster Scaling | POSETTE: An Event for Postgres 2026

Explore new approaches to scaling Citus clusters more efficiently. Muhammad Usama (Microsoft) explains the improvements in his talk “Move Less, Move Faster: Speeding Up Citus Cluster Scaling” at POSETTE: An Event for Postgres 2026.

Abstract: Scaling a distributed Postgres cluster often isn’t limited by “adding a VM”; it’s limited by how long it takes to rebalance data safely. In this talk, I’ll give a minimal mental model of how Citus distributes data (shards, placements, and coordinator/worker roles), then explain why cluster scaling can feel painfully slow: data movement is expensive, and concurrency is constrained by safety and resource limits.

We’ll then look at two concrete steps toward faster elastic scaling:
* Shard rebalancing improvements that increase parallelism and reduce bottlenecks.
* Snapshot-based node addition, where a new worker starts as a clone of an existing one, dramatically reducing how much data needs to be copied during rebalancing.

Attendees will come away with a clearer way to reason about scaling time, plus actionable guidance for running scale-out/scale-in events safely.

Muhammad Usama is a major contributor and committer to the Pgpool-II open-source project, specializing in performance and high availability. He has developed key features, including the HA component for Pgpool, and is currently working on global connection pooling. With 20 years of experience, he has been actively involved with PostgreSQL since 2006. Now a Principal Software Engineer at Microsoft, Muhammad is working on distributed database systems and continues to contribute to the PostgreSQL community.

► Video chapters:
⏩ 00:00 – Music & introduction
⏩ 00:22 – “Move Less, Move Faster” overview
⏩ 00:46 – Real-world slow rebalance example
⏩ 02:33 – Citus basics and architecture
⏩ 04:15 – How rebalancing works in practice
⏩ 05:44 – Why rebalancer performance was slow
⏩ 08:21 – Parallelising reference table copies
⏩ 10:45 – Fixing locking for parallel shard moves
⏩ 14:17 – Orchestrating a parallel rebalancer
▶️ 22:15 – Snapshot-based scaling: move less
▶️ 24:59 – Snapshot-based node addition
▶️ 27:30 – Faster rebalance by deleting instead of moving
▶️ 29:31 – Choosing the right scaling strategy

📕 Everything you need to know about POSETTE: An Event for Postgres can be found at: https://posetteconf.com
✅ Learn more: watch more POSETTE talks: https://aka.ms/posette-playlist

📌 Let’s connect:
LinkedIn: https://www.linkedin.com/company/posetteconf/
X – @PosetteConf, https://x.com/PosetteConf
Mastodon – @posetteconf, https://mastodon.social/@posetteconf
Bluesky – @posetteconf.com, https://aka.ms/posette-on-bluesky

#PosetteConf #PostgreSQL #Citus
