Rewriting a Core Pricing Engine: Lessons from Replacing Legacy Infrastructure
Smartpricing is an AI-powered dynamic pricing platform for hotels. The core engine - the system that turns models, rules, and market data into the actual prices that get pushed to a hotel’s PMS - had been in production for years and had accumulated the usual debts: a handful of competing abstractions, a few layers of caches written in response to specific incidents, and behaviour that nobody was comfortable changing because too many clients depended on subtle output quirks. I spent the better part of a year rewriting it, nearly single-handedly, and these are the notes I wish I’d had at the start.
What the engine does
On paper the engine has a simple job: for each room type at each property, for every future check-in date within a horizon, produce a recommended price. In practice that involves pulling occupancy and booking pace from the PMS integration, evaluating a stack of models and deterministic rules configured per-property, applying revenue-manager overrides, and writing the result to storage in a shape that both the PMS push path and the UI can consume. It runs on a schedule but also on demand, for thousands of properties, under latency budgets that matter because a revenue manager is often sitting in front of a dashboard waiting for a recalculation.
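The shape of that job can be sketched as a loop over room types and future check-in dates. This is a minimal illustration only: the names (`recommend_prices`, `PriceRecommendation`) are mine, and the flat `base_rates` lookup stands in for the real stack of models, rules, and overrides.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class PriceRecommendation:
    property_id: str
    room_type: str
    check_in: date
    price: float

def recommend_prices(property_id, room_types, base_rates, horizon_days=365, today=None):
    """One recommendation per room type per future check-in date.

    base_rates maps room_type -> nightly price; the real engine derives
    this from models, deterministic rules, and revenue-manager overrides.
    """
    today = today or date.today()
    out = []
    for room_type in room_types:
        for offset in range(horizon_days):
            check_in = today + timedelta(days=offset)
            out.append(PriceRecommendation(property_id, room_type, check_in, base_rates[room_type]))
    return out
```

Even in this toy form, the fan-out is visible: room types times horizon days, per property, which is why the latency budget matters at scale.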
Why rewrite instead of refactor
I started assuming I’d refactor. The first month was a careful reading of the existing code paths, sketching the dependency graph, and trying to carve out one module at a time. What I found was that the hot path crossed the same three abstractions in different orders depending on configuration, that the caches were load-bearing for correctness in one edge case, and that “the engine” was actually two engines that had been glued together during a previous migration and shared state in ways that weren’t documented anywhere except in the incidents that had happened because of them.
The tipping point was an experiment: I wrote a small prototype of the new architecture and fed it a frozen snapshot of production inputs from a dozen representative properties. The prototype, which I’d written in a week, produced the same prices as production for all of them, and did so in a fraction of the time. That ended the refactor conversation. The old code wasn’t going to be gently pulled apart; it was going to be replaced.
The migration strategy
Running old and new in parallel is table stakes for this kind of rewrite. What’s harder is deciding what “agreement” means. The old engine was not always correct - there were bugs we knew about and likely others we didn’t - so the goal wasn’t to match it bit-for-bit but to match it where it was right and to diverge from it, in a documented and defensible direction, where it was wrong.
Concretely: every scheduled run produced two sets of prices, old and new, and a diff job computed per-property statistics on the differences. Any property whose diff fell outside a quiet band was flagged for review. Divergences were either accepted (with a note explaining why the new behaviour was better), patched in the new engine (when the old one happened to be right), or added to the known-divergent list. Only after a property spent a full week in the quiet band did it get cut over to the new engine as the source of truth.
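The diff job can be sketched as per-property statistics plus a threshold check. The stats chosen and the tolerances below are illustrative assumptions, not the band we actually used:

```python
import statistics

def diff_stats(old_prices, new_prices):
    """Per-property relative differences between old- and new-engine output.

    old_prices / new_prices map (room_type, check_in) -> price, keyed
    identically for the same run.
    """
    rel = [abs(new_prices[k] - old_prices[k]) / old_prices[k] for k in old_prices]
    return {
        "mean_rel_diff": statistics.mean(rel),
        "max_rel_diff": max(rel),
    }

def in_quiet_band(stats, mean_tol=0.01, max_tol=0.05):
    """A property is flagged for review unless both stats sit inside the band."""
    return stats["mean_rel_diff"] <= mean_tol and stats["max_rel_diff"] <= max_tol
```

Tracking both a mean and a max matters: a property can look quiet on average while one date far in the horizon diverges badly.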
The cutover itself was gradual. We rolled out by cohort, starting with internal test properties, then a small set of friendly customers, then the long tail. At no point was there a big-bang migration day, and at no point did a customer experience the change as an incident.
Architecture decisions
The rewrite is organized around an explicit pricing pipeline: inputs are fetched once per run and passed through a sequence of pure transformations, each of which can be inspected and tested in isolation. The pipeline is configured per-property from a single declarative object, not by flags scattered through the code. Caches live at the edges, not in the middle, so nothing in the pipeline has to know that caching exists.
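A minimal sketch of that structure, assuming a hypothetical stage registry and a single per-property config object. The stage names and parameters here are invented for illustration, not the real ones:

```python
# Each stage is a pure function (state, params) -> state, so stages
# compose in any order and can be tested in isolation.
STAGES = {
    "base_rate": lambda s, p: {**s, "price": p["base_rate"]},
    "occupancy": lambda s, p: {**s, "price": s["price"] * (1 + p["uplift"] * s["occupancy"])},
    "overrides": lambda s, p: {**s, "price": p.get("override", s["price"])},
}

def run_pipeline(inputs, property_config):
    """property_config is one declarative object: an ordered list of stage
    names plus per-stage parameters, configured per property."""
    state = dict(inputs)
    for stage in property_config["stages"]:
        state = STAGES[stage["name"]](state, stage["params"])
    return state["price"]
```

Because behaviour lives in the config object rather than in scattered flags, asking "what does property X actually do?" has a single answer you can read, diff, and version.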
The second big decision was to separate the scheduled path and the on-demand path only at the boundaries. They run the same pipeline over the same inputs; the only difference is how the inputs get hydrated. That removed an entire class of “it works on schedule but not on demand” bugs the old system was prone to.
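That boundary can be sketched as two hydrators feeding one function. Everything here is a stand-in (the pipeline body, the snapshot shape, the fetcher), but the point is structural: the paths converge before any pricing logic runs.

```python
def run_pipeline(inputs):
    # Stand-in for the shared pricing pipeline.
    return inputs["base_rate"] * (1 + inputs["occupancy"])

def hydrate_scheduled(property_id, snapshot):
    # Scheduled path: inputs come from a precomputed bulk snapshot.
    return snapshot[property_id]

def hydrate_on_demand(property_id, fetch_live):
    # On-demand path: inputs are fetched fresh for one property.
    return fetch_live(property_id)

def price(property_id, hydrate):
    # Both entry points converge here; only hydration differed above.
    return run_pipeline(hydrate(property_id))
```

Given identical inputs, the two paths are the same code, so they cannot drift apart the way the old system's did.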
What I’d do differently
I under-invested in observability in the first few months. The pipeline is cleaner than the old system, which is exactly why it’s harder to debug when something goes wrong: there are fewer log lines and fewer places where a confused developer used to dump state. Next time I’d instrument the pipeline stages from day one, with structured events per stage and easy replay from a captured input snapshot.
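The instrumentation I have in mind looks roughly like this: wrap each stage, emit one structured event per stage, and support replay from a captured input snapshot. The event fields, the `emit` sink, and the JSON snapshot format are all assumptions for the sketch:

```python
import json
import time

def run_instrumented(stages, inputs, emit=print):
    """Run a pipeline of (name, fn) stages, emitting a structured JSON
    event per stage; `emit` stands in for a real event sink."""
    state = dict(inputs)
    for name, fn in stages:
        start = time.perf_counter()
        state = fn(state)
        emit(json.dumps({
            "stage": name,
            "ms": round((time.perf_counter() - start) * 1000, 3),
            "state": state,
        }))
    return state

def replay(snapshot_path, stages):
    """Re-run the pipeline from a captured input snapshot (a JSON file)."""
    with open(snapshot_path) as f:
        return run_instrumented(stages, json.load(f))
```

With per-stage events, "which stage changed the price?" becomes a query rather than a debugging session, and replay turns any production surprise into a local test case.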
I also wish I’d set the bar for cutover higher sooner. “Spend a week in the quiet band” was a reasonable criterion, but it didn’t catch some seasonal divergences that only showed up when we hit a new kind of demand pattern. A version of this that required a full pricing cycle across a seasonal transition would have caught those earlier, at the cost of a slower rollout.