Sharing some anecdotes about hedging requests in distributed systems. When we built the blob storage at LinkedIn, every blob was broken into chunks and streamed to different storage nodes. Each chunk had multiple replicas that were written with a majority quorum, and a metadata chunk recorded the locations of all the chunks. A request for a blob would typically issue read requests for its chunks and stream them to the client. At scale, we found that the tail latency (99.9th percentile) was 3-5x higher than typical. This happens for various reasons: network blips, a momentary traffic spike on a specific node, or deployments. One quick fix was to send requests for the same chunk to multiple replicas. This is called hedging requests. The benefit of this approach is that you get the chunk from whichever replica responds first. This dramatically reduced the tail latency and brought it almost on par with the 99th percentile. The downside was that we immediately doubled the request load across all nodes. To alleviate this problem, we sent the second request only if the first had not started responding within 10-20 ms. This approach increased the load by just 1.2x while keeping the tail latency comparable to the always-hedge approach. One caveat of hedging is that the operations have to be idempotent; immutable blobs were a good candidate for this. We also had to empirically figure out the right interval for sending the second request.
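A minimal sketch of the delayed-hedging idea in Python (the names, the fake replicas, and the 15 ms delay are illustrative, not LinkedIn's actual implementation):

```python
import concurrent.futures as cf
import time

def hedged_fetch(replicas, fetch, hedge_delay=0.015):
    """Ask the first replica for the chunk; if it hasn't responded within
    hedge_delay seconds, send a second request to another replica and
    return whichever answer arrives first. Safe only because reads of
    immutable chunks are idempotent."""
    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(fetch, replicas[0])
        done, _ = cf.wait([first], timeout=hedge_delay)
        if done:
            return first.result()  # fast path: no hedge, no extra load
        second = pool.submit(fetch, replicas[1])  # first one is slow: hedge
        done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
        return done.pop().result()

# Demo with fake replicas: "a" is having a bad day, "b" answers quickly.
def fetch(replica):
    time.sleep(0.2 if replica == "a" else 0.01)
    return f"chunk-from-{replica}"

print(hedged_fetch(["a", "b"], fetch))  # chunk-from-b
```

With the delay in place, most requests never hedge at all, which is where the 1.2x (rather than 2x) load figure comes from.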
The highest-performing developers I worked with at Amazon asked better questions than everyone else. After 18 years in tech, here's what I learned: while average engineers jump to solutions, exceptional ones pause to ask the right questions first. The 6 questions that separated high performers:
"What problem are we actually solving?"
"What happens when this fails?"
"How will we know if this is working?"
"What's the simplest solution that could work?"
"Who else has solved this?"
"What are we NOT going to do?"
The career-changing insight: The quality of your questions determines the quality of your solutions. These thinking patterns apply beyond engineering to any complex problem-solving role. More systematic approaches to advancing your tech career: go.alifeengineered.com/?utm_…
Both VictoriaMetrics and VictoriaLogs are optimized for persistent storage with low IOPS (aka HDD):

- Data ingestion path. Both systems buffer recently ingested data in memory before writing it to files. Both rely on the OS page cache to minimise the number of write operations to persistent storage when writing the buffered data to files - regular writes to files do not induce physical writes to disk; the operating system stores the written data in memory (the page cache) and writes it to disk in the background. Both VictoriaMetrics and VictoriaLogs call fsync() before closing just-created immutable data files in order to guarantee that the data is properly stored on disk. This prevents data loss and data corruption on unclean shutdown, such as an accidental power-off in the middle of creating a data file. The frequency of fsync() calls is usually quite low, even at high data ingestion rates (e.g. a few tens of fsync() calls per second for millions of ingested samples per second / hundreds of thousands of ingested logs per second). This allows using HDD disks with 100-200 IOPS for high data ingestion workloads in VictoriaMetrics and VictoriaLogs.

- Querying path. Usually the most frequent queries are executed over recently ingested data - the majority of alerting and recording rules rely on the data ingested during the last few hours. There is a very high chance that this data is located in the OS page cache, since the operating system automatically and transparently keeps recently written and read file data there (provided there is enough RAM for the working set). In this case the majority of queries don't induce any read operations from disk - they are satisfied from the OS page cache. Some rare and heavy queries over historical data may need disk read operations for reading that data from disk.
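The fsync-before-close write path described above can be sketched like this (the function name and temp-file convention are illustrative, not the actual VictoriaMetrics code):

```python
import os

def write_immutable_part(path, data):
    """Buffered writes land in the OS page cache; a single fsync() before
    close makes the finished immutable file durable, so an unclean
    shutdown can't leave a torn, partially written file behind."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)         # goes to the page cache, not straight to disk
        f.flush()             # drain Python's userspace buffer to the OS
        os.fsync(f.fileno())  # one fsync per file, however large the file is
    os.rename(tmp, path)      # atomically publish the completed file
```

The key property is that the fsync cost is per file, not per write, which is why even slow disks keep up with very high ingestion rates.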
Both VictoriaMetrics and VictoriaLogs have a data storage format optimized for minimising random disk reads when executing heavy queries over historical data. The format is optimized for sequential reads of large chunks of data from disk, and these chunks are stored in compressed form with a high compression ratio. This minimises both the random disk reads and the disk read bandwidth needed for heavy queries over historical data that is missing from the OS page cache, so regular HDD disks are enough for the majority of use cases.

When to use SSD and NVMe with VictoriaMetrics and VictoriaLogs? There are two cases:

- When the working set doesn't fit in the OS page cache. Then the operating system needs to read the frequently requested data from disk, so it needs disks with high IOPS. But I'd recommend increasing the available RAM instead of switching from HDD to SSD in this case - it will give better performance and capacity for both queries and data ingestion.

- When heavy queries need higher performance and the bottleneck is disk read bandwidth or disk read IOPS.
Much of the internet runs on Elastic Block Store. It's the default (and in most cases, required) storage layer for every EC2 instance running in AWS. This article by @molson was a delightful read on its history and engineering challenges.
TLA+ Modeling of AWS outage DNS race condition muratbuffalo.blogspot.com/20… AWS's N. Virginia region suffered a DynamoDB outage triggered by a DNS automation defect. This post focuses narrowly on the race condition at the core of the bug, which is best understood through TLA+ modeling.
Once a queue in a distributed system or a buffer in a network gets large enough, it takes so long to reach the front that the client has already timed out and retried. You'll spend all your time processing useless work. Good systems have small queues. If your queue has to get long, then switching to LIFO under load at least ensures you're making progress on requests users still care about.
What’s the logic for LIFO instead of FIFO?
Counter-intuitive advice for designing very large scale live-site services:
1. Don't have retries inside your system, only at the edges.
2. Don't have queues inside your system, only at the edges.
3. When shit really goes down, process requests in LIFO order.
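The FIFO-vs-LIFO effect is easy to see in a toy simulation of an overloaded server (all numbers here are made up for illustration): requests arrive faster than they can be served, and a response only counts if it arrives before the client's timeout.

```python
def useful_responses(order, n=200, service=2, timeout=10):
    """One request arrives per tick, but the server completes only one
    request every `service` ticks, so the queue grows without bound.
    A response counts as useful only if it lands within `timeout`
    ticks of the request's arrival."""
    queue = []
    useful = 0
    for t in range(n * service):
        if t < n:
            queue.append(t)  # request arriving at tick t
        if t % service == 0 and queue:
            # FIFO serves the oldest (often already timed-out) request;
            # LIFO serves the newest, whose client is still waiting.
            req = queue.pop(0) if order == "fifo" else queue.pop()
            if t - req <= timeout:
                useful += 1
    return useful

print(useful_responses("fifo"), useful_responses("lifo"))
```

Under FIFO almost every served request has already timed out; under LIFO nearly every service slot produces a response someone still wants.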
TIL about the slotted counter pattern. Makes total sense! Continuously reminded that parallel performance is about partitioning.
While researching for the podcast with @samlambert I came across his blog post on the "slotted counter pattern", which is very interesting. I implemented a very similar pattern to solve high-contention single-row updates for an analytics use case. Give it a read! It's a simple and effective pattern to avoid hammering a single row by spreading the updates across N rows, reducing the potential for contention. ps: I am looking forward to the discussion and releasing the episode as soon as I can. 😎 planetscale.com/blog/the-slo… Also subscribe here to stay updated: piped.video/c/TheGeekNarrato…
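For reference, a minimal sketch of the slotted counter pattern using SQLite (the slot count and schema are illustrative; the same upsert works in MySQL and Postgres, which is where the pattern is usually applied):

```python
import random
import sqlite3

SLOTS = 16  # hypothetical slot count; size it to your write concurrency

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE page_views (
    page_id INTEGER,
    slot    INTEGER,
    count   INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (page_id, slot))""")

def increment(page_id):
    # Each writer picks a random slot, so concurrent increments for the
    # same page land on different rows instead of contending on one
    # hot row. (Upsert syntax needs SQLite >= 3.24.)
    slot = random.randrange(SLOTS)
    conn.execute("""INSERT INTO page_views (page_id, slot, count)
                    VALUES (?, ?, 1)
                    ON CONFLICT (page_id, slot)
                    DO UPDATE SET count = count + 1""", (page_id, slot))

def total(page_id):
    # Reads pay a small cost: summing across up to SLOTS rows.
    row = conn.execute(
        "SELECT COALESCE(SUM(count), 0) FROM page_views WHERE page_id = ?",
        (page_id,)).fetchone()
    return row[0]

for _ in range(1000):
    increment(42)
print(total(42))  # 1000
```

The trade is classic partitioning: writes spread across N rows, reads aggregate over them.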
How to implement the Outbox pattern in Go and Postgres by @pliutau packagemain.tech/p/how-to-im…
How did Bluesky move off AWS and continue to thrive despite crossing 25 million users and only employing 15 software engineers? Learn how a distributed social network was built and the challenges encountered along the way. ow.ly/gWps50Waa3m #ScyllaDB
the question I get the most at every conference or meetup: "what even are durable objects?" I spent the past 2 years building almost exclusively with durable objects, and today I'm compiling my notes and favourite patterns into a single blog post boristane.com/blog/what-are-…
she actually summarized everything you must know from the “AI Engineering” book in 76 minutes. if you don't have the time to read the book, you need to watch this. foundational models, evaluation, prompt engineering, RAG, memory, fine-tuning and many more. great starting point.
TernFS — an exabyte scale, multi-region distributed filesystem xtxmarkets.com/tech/2025-ter… This post motivates TernFS, explains its high-level architecture, and then explores some key implementation details.
📝 Blogged: "'You Don't Need Kafka, Just Use Postgres' Considered Harmful" In which I'm arguing that both Postgres and Kafka are great tools for their respective purposes. But don't create your custom implementation of one on top of the other. 👉 morling.dev/blog/you-dont-ne…
Flink's 95% problem #apacheflink Flink might sound like the holy grail of streaming data processing, but for 95% of us, it's just a complex headache we don't need. tinybird.co/blog/flink-is-95…
built my own vector db from scratch with linear scan, kd-tree, HNSW, and IVF indexes, just to understand things from first principles. all the way from:
> recursive BST insertion with d cycling split
> hyperplane splitting perpendicular to the axis at depth % d
> branch-and-bound pruning + greedy DFS
> optimizing tree pruning by ~50% on avg
> skip-list probability distribution
> selecting M diverse neighbors to prevent long-range edge clustering
> bidirectional edge creation
> greedy descent on upper layers for fast O(log n) coarse search
> exhaustive beam search at layer 0 + BFS pruning
> heuristic picks of diverse M from candidate set
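The baseline all those indexes are trying to beat is the exact linear scan, which fits in a few lines (an illustrative sketch, not the author's code):

```python
import heapq
import math

def linear_scan_knn(vectors, query, k=3):
    """Brute-force k-nearest-neighbors: exact results, O(n * d) per
    query. kd-trees, IVF, and HNSW all trade some of this exactness
    or simplicity for sublinear query time."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Return the indices of the k vectors closest to the query.
    return heapq.nsmallest(k, range(len(vectors)),
                           key=lambda i: dist(vectors[i], query))

vecs = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.1, 0.0)]
print(linear_scan_knn(vecs, (0.0, 0.0), k=2))  # [0, 3]
```

Everything else in the list above exists to avoid touching all n vectors per query.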
Adding a few more that go deeper into the real-world scaling and distributed systems side of databases:
Consensus protocols (Raft, Paxos, Zab)
Sharding strategies (range, hash, consistent hashing)
Quorum reads/writes
Conflict resolution (CRDTs, vector clocks)
Snapshot isolation & MVCC
Compaction & tombstones (esp. in LSM-based systems)
Query optimizers & cost-based planning
Columnar storage formats (Parquet, ORC)
OLTP vs OLAP design differences
Data lake vs data warehouse architecture
Hot key mitigation & load balancing
Schema evolution in distributed systems
Data locality & co-location strategies
Write amplification and read amplification
ZK / etcd / Consul for coordination
Storage engines (InnoDB, RocksDB, WiredTiger)
Database stuff I’d study if I wanted to understand scaling deeply: Bookmark this.
B+ Trees
LSM Trees
Write-Ahead Logging
Two-Phase Commit
Three-Phase Commit
Read Replicas
Leader-Follower Replication
Partitioning
Query Caching
Secondary Indexes
Vector Indexes (FAISS, HNSW)
Distributed Joins
Materialized Views
Event Sourcing
Change Data Capture