Bid Response Latency
Median · p95 · p99 · industry comparison
Why bid response latency is the single most under-discussed lever in DOOH
A 300 ms bid window sounds generous compared with web display's 100 ms norm, but DOOH ad calls cascade through several systems before bytes start flowing to the player. The screen's ad client opens an HTTP request to our ad-decision service; we construct an OpenRTB bid request and fan it out to every connected SSP that matches the inventory profile; each SSP routes the opportunity to its downstream DSPs and waits for responses; each DSP has its own bid-decisioning pipeline (audience-match, frequency-cap, brand-safety, bid-shading) before it can respond. Every 50 ms of latency consumed along that path shrinks the decision budget left for sophisticated buyers, shifting spend toward faster, simpler decisioners and away from the buyers with the best identity match.
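The budget arithmetic implied by this cascade can be sketched in a few lines. The per-stage timings below are hypothetical round numbers for illustration, not measured values from our network:

```python
# Illustrative sketch: how fixed per-hop costs eat into a 300 ms bid window
# before a DSP's own decisioning pipeline ever runs. All stage timings are
# hypothetical round numbers.

BID_WINDOW_MS = 300

# Hypothetical per-stage costs on the path from screen to DSP and back.
stage_costs_ms = {
    "player -> ad-decision service": 20,
    "build + dispatch OpenRTB request": 5,
    "ad-decision service -> SSP": 15,
    "SSP routing to DSPs": 10,
    "response path back (DSP -> SSP -> us)": 25,
}

def dsp_decision_budget_ms(window_ms: int, costs: dict) -> int:
    """Milliseconds left for the DSP's bid-decisioning pipeline."""
    return window_ms - sum(costs.values())

budget = dsp_decision_budget_ms(BID_WINDOW_MS, stage_costs_ms)
print(budget)  # 225 -- and every extra 50 ms of path latency comes out of this
```

Under these assumed numbers, a quarter of the nominal window is gone before the DSP sees the request, which is why upstream latency disproportionately punishes slow, sophisticated decisioners.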
The asymmetry is brutal: the buyers willing to pay the most are often the most latency-sensitive, because their bid-decisioning pipelines are the most sophisticated. A DSP doing rigorous audience-match against a TV-acquired-cohort signal in real time will time out before a DSP applying a static seat-list rule. The fast-but-simple side of the house wins by default when latency is uncontrolled, which pushes CPM downward for the publisher.
The right framing is that latency budget IS supply-path optimization. Every millisecond shaved off bid response time is a millisecond of buyer-decision budget restored to the buyers most likely to bid high.
What we measure, and how
We measure bid response latency from the moment our ad-decision service dispatches the OpenRTB request to the moment we have a complete response back (bid or no_bid), exclusive of network egress to the screen. The clock starts when the request leaves our server and stops when the bytes are fully received and schema-validated. We do not include the time to construct the request (we pre-build them) or the time to ship the winning creative to the screen (that's a separate measurement).
Latency is bucketed at 25 ms granularity into a histogram, with separate histograms for bid-received and no-bid outcomes. We aggregate across the full multi-SSP fanout — never per-partner — and report median, p95, and p99 percentiles. Aggregating across the SSP fanout is the correct view because that's what the screen actually experiences: the latency that matters is the slowest of the parallel calls (since we wait for all responses up to the auction timeout), not any individual partner's number.
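The bucketing and the "slowest of the parallel calls" rule can be sketched concretely. The sample latencies below are invented; only the 25 ms granularity and the per-auction max come from the methodology described above:

```python
# Sketch of the measurement described above: 25 ms histogram buckets, with
# per-auction latency taken as the slowest of the parallel SSP calls (since
# we wait for all responses up to the timeout). Sample latencies are made up.

BUCKET_MS = 25

def bucket(latency_ms: float) -> int:
    """Lower edge of the 25 ms bucket this latency falls in."""
    return int(latency_ms // BUCKET_MS) * BUCKET_MS

# Each inner list: latencies of the parallel SSP calls for one auction.
auctions = [
    [110, 180, 95],   # auction latency = 180 ms (slowest call)
    [210, 160],       # 210 ms
    [90, 130, 145],   # 145 ms
]

histogram: dict[int, int] = {}
for calls in auctions:
    auction_latency = max(calls)  # what the screen actually experiences
    b = bucket(auction_latency)
    histogram[b] = histogram.get(b, 0) + 1

print(sorted(histogram.items()))  # [(125, 1), (175, 1), (200, 1)]
```

Taking the max of the fanout is what makes the aggregate view honest: any single partner's histogram can look fast while the auction the screen waits on is slow.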
The auction timeout itself is configurable per inventory profile. We default to 300 ms for first-look auctions (where the buyer pool is fast and well-tuned) and 450 ms for second-look auctions (where slower decisioners have a chance to participate). Beyond the timeout, any still-in-flight bid is dropped and we proceed with whatever bids have arrived.
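A minimal asyncio sketch of that timeout behavior (not our production service; SSP names, delays, and the call shape are all illustrative):

```python
# Toy model of the auction timeout: fan out to all matched SSPs in parallel,
# wait up to the inventory profile's timeout, keep whatever bids have arrived,
# and drop still-in-flight calls. SSP names and delays are illustrative.
import asyncio

TIMEOUTS_MS = {"first_look": 300, "second_look": 450}  # per inventory profile

async def call_ssp(name: str, delay_ms: int) -> tuple[str, str]:
    await asyncio.sleep(delay_ms / 1000)  # stand-in for the real HTTP call
    return (name, "bid")

async def run_auction(profile: str, ssps: dict) -> list:
    timeout = TIMEOUTS_MS[profile] / 1000
    tasks = [asyncio.create_task(call_ssp(n, d)) for n, d in ssps.items()]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:          # drop any still-in-flight bid
        task.cancel()
    return [task.result() for task in done]

bids = asyncio.run(run_auction("first_look",
                               {"ssp_a": 100, "ssp_b": 250, "ssp_c": 500}))
print(sorted(name for name, _ in bids))  # ['ssp_a', 'ssp_b'] -- ssp_c missed the window
```

The key property matches the text: the auction never blocks past the timeout, and a slow partner costs itself the impression rather than delaying the screen.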
Aggregate latency distribution
Across the full network during the observation window, the aggregate distribution of bid response latency shows median well below the IAB Tech Lab's 300 ms recommendation, with p95 inside the 450 ms second-look auction timeout. The distribution is right-skewed (a small fraction of responses are slower than typical, dragging the long tail toward the timeout boundary), which is the expected shape for any system that fans out to multiple downstream services with independent latency profiles.
The p99 is the most operationally important number: it drives whether we extend the auction timeout, switch a venue category to a different SSP routing, or invest in a faster fallback. Our p99 sits inside 650 ms across the network, which we treat as the upper bound for an acceptable long tail. Anything sustained above that triggers a routing investigation.
Geographic distribution matters: US screens see lower median latency than international screens, primarily because the SSP datacenters cluster on the US East Coast and the round-trip to APAC screens adds a structural floor of ~120 ms. We compensate by extending the auction timeout for non-US inventory profiles, which protects buyer participation without sacrificing fill.
Industry comparison
The IAB Tech Lab's OpenRTB specification recommends a 300 ms bid response target as the floor for real-time programmatic auctions. AdExchanger's 2025 supply-path optimization analysis cited a 250–400 ms band as typical for CTV and DOOH, with the higher end driven by international DSP routing. Our aggregate profile sits comfortably inside the lower half of that band on median and inside the upper half on p95 — a normal and well-behaved distribution for a multi-SSP supply chain serving a global screen fleet.
There are two ways a publisher can game these numbers, both of which we deliberately avoid. The first is to artificially lower auction timeouts, which produces a better median latency but kills participation from slower DSPs: fill drops, and CPM drops with it. The second is to report latency only for bid_received outcomes (excluding no_bids), which can shave 30–50 ms off the median because no_bids, including the ones that run the clock out to the timeout, are systematically slower than bids. We report the full mixed distribution because that's what the auction actually sees.
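The second gaming tactic is easy to demonstrate with invented numbers (the latencies below are illustrative, not network data):

```python
# Sketch of the reporting bias described above: dropping no_bid outcomes
# (which include responses that run the clock out) flatters the median.
# All latencies below are invented for illustration.
from statistics import median

responses = [
    (120, "bid"), (140, "bid"), (165, "bid"), (190, "bid"),
    (260, "no_bid"), (280, "no_bid"), (295, "no_bid"), (300, "no_bid"),
]

full = median(lat for lat, _ in responses)
bids_only = median(lat for lat, outcome in responses if outcome == "bid")

print(full, bids_only)  # 225 152.5 -- bid-only reporting flatters the median
```

Same auction, two medians: which one gets reported is purely a choice about what to exclude.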
Latency is one of the levers we tune continuously. Each SSP integration goes through a latency-validation cycle at onboarding, and a continuous aggregate monitor flags any sustained drift in p95/p99 above baseline. Drift is investigated on the SSP side first, then the network path, then our own service if neither of those is the cause.
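A toy version of such a drift monitor, to make "sustained drift above baseline" concrete. The tolerance, streak length, and sample windows are hypothetical, not our production thresholds:

```python
# Toy drift monitor: flag when a rolling window's p95 exceeds the baseline by
# a tolerance factor for several consecutive windows. Thresholds and sample
# data are hypothetical.

def p95(samples: list) -> float:
    """Nearest-rank-style p95 over a sample window."""
    ordered = sorted(samples)
    return ordered[int(0.95 * (len(ordered) - 1))]

def drifting(windows: list, baseline_p95: float,
             tolerance: float = 1.2, sustain: int = 3) -> bool:
    """True if p95 > baseline * tolerance for `sustain` consecutive windows."""
    streak = 0
    for window in windows:
        streak = streak + 1 if p95(window) > baseline_p95 * tolerance else 0
        if streak >= sustain:
            return True
    return False

hot = [210, 220, 230, 250, 260]   # window p95 = 250 ms
cool = [150, 160, 170, 180, 190]  # window p95 = 180 ms
print(drifting([cool, hot, hot, hot], baseline_p95=200))  # True
print(drifting([hot, cool, hot, hot], baseline_p95=200))  # False (streak broken)
```

Requiring a sustained streak rather than a single hot window is what keeps the monitor from paging on ordinary long-tail noise.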
Reading the histogram: what the long tail tells us
A bid-response latency histogram for any sufficiently broad multi-SSP fanout has a characteristic shape. The body of the distribution clusters tightly around the median — DSPs with fast decisioning return responses in a predictable, bounded window. The tail to the right is the interesting part: it tells you which class of buyer is bidding at the edge of the timeout, and how much price the publisher leaves on the table when those buyers are squeezed out.
If the histogram's tail is short and steep, the timeout is roughly correct for the buyer pool — no one is being systematically excluded. If the tail is long and gradual, raising the timeout would extend bid participation to slower decisioners (potentially at a cost to median screen ad-load latency). The trade-off is the classic publisher tension between fill quality and player latency. We treat the timeout as a tunable per inventory profile, with different values for high-CPM premium venues versus high-volume long-tail screens.
Beyond median / p95 / p99, the metric we watch most carefully is the response-rate-by-bucket curve: the fraction of bid requests that receive at least one bid within each 25 ms latency bucket. The shape of that curve answers a different question than the latency distribution alone — it shows how the effective fill rate ramps as more buyer time is granted. The optimal timeout is where the response-rate curve plateaus; pushing past that plateau yields nothing but a slower ad load.
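The curve and the plateau test can be sketched directly. The first-bid latencies below are invented (None meaning the request never drew a bid), and the 1% plateau threshold is an illustrative choice:

```python
# Sketch of the response-rate-by-bucket curve described above: the fraction of
# requests with at least one bid inside each cumulative latency cutoff, plus a
# naive plateau finder. First-bid latencies are invented (None = no bid ever).

BUCKET_MS = 25

first_bid_ms = [40, 60, 90, 95, 110, 140, 150, 155, 160, None, None, None]

def response_rate_curve(latencies: list, max_ms: int = 300) -> list:
    total = len(latencies)
    curve = []
    for cutoff in range(BUCKET_MS, max_ms + 1, BUCKET_MS):
        answered = sum(1 for lat in latencies if lat is not None and lat <= cutoff)
        curve.append((cutoff, answered / total))
    return curve

def plateau_cutoff(curve: list, eps: float = 0.01) -> int:
    """First cutoff after which the rate stops improving by more than eps."""
    for (cutoff, rate), (_, nxt) in zip(curve, curve[1:]):
        if nxt - rate <= eps:
            return cutoff
    return curve[-1][0]

curve = response_rate_curve(first_bid_ms)
print(plateau_cutoff(curve))  # 175 -- granting more than 175 ms buys no extra fill here
```

In this invented data the curve flattens at 175 ms: under the logic above, a timeout much beyond that point only slows the ad load without adding buyers.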
Related reading
- SSP and DSP integration depth — the multi-SSP architecture whose parallel fanout these latency numbers describe.
- State of DOOH 2026: Fill Rates & Waterfall Outcomes — how request volume converts to bid_received and play_completed across the funnel.
Cite this page: Trillboards (2026). Programmatic DOOH Bid Response Latency Benchmarks. Trillboards Network Data, observed 2026-01-01 through 2026-05-11. Retrieved from https://trillboards.com/data/demand-ecosystem/latency-benchmarks/