Blog

Technical write-ups on backend engineering, data systems, and performance work.

Designing a Convergent Event Deduplication Pipeline

April 21, 2026 · 6 min read

How to merge multi-source event records with graph-based deduplication, aggressive search-space reduction, and stable IDs across runs.

Achieving ~100x Compression on Scraped Pricing Data in ScyllaDB

April 20, 2026 · 8 min read

How we designed a storage layer for competitor hotel prices that compresses ~30 TB of raw scraped data into ~300 GB in ScyllaDB, with ~20x logical-to-physical compression in production.

From Custom Python per Customer to a Parameterized Pricing Engine

April 19, 2026 · 5 min read

How we moved from per-customer pipelines, a startup-speed trade-off, to one parameterized engine, migrated gradually with no downtime, and shipped previews of changes before they are saved.