Edge Inference
Running ML model inference on the end device (the screen) rather than in the cloud — sub-100ms latency, no raw data egress, GDPR-clean.
Edge inference is the practice of running ML model inference on the end device (in Trillboards' case, the Android tablet powering the screen) rather than uploading frames to a cloud service. The trade-offs are well-rehearsed by 2026: edge inference keeps latency under 100 ms, eliminates raw-data egress (faces never leave the device), and keeps working through network drops; cloud inference offers larger model capacity and easier rollout.
The Trillboards sensing SDK is edge-first for both privacy and economics. Privacy: GDPR Article 9 treats biometric data as a special category that requires explicit consent to process, and edge inference sidesteps most of that burden because raw frames never leave the device. Economics: a fleet of N screens running edge inference adds essentially no compute cost beyond the devices' existing power draw, while the same N screens streaming 30 fps to a cloud vision API would cost orders of magnitude more; the back-of-envelope sketch below makes the gap concrete.
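A minimal arithmetic sketch of that gap, in Kotlin. The fleet size, frame rate, and per-image price are illustrative assumptions, not Trillboards or cloud-vendor pricing.

```kotlin
// Back-of-envelope cost comparison: edge vs. cloud inference for a fleet.
// All inputs below are illustrative assumptions.
fun main() {
    val screens = 1_000                 // hypothetical fleet size N
    val fps = 30.0                      // frames each screen would stream
    val secondsPerDay = 86_400.0
    val usdPerThousandImages = 1.50     // hypothetical cloud vision API price

    val imagesPerScreenPerDay = fps * secondsPerDay          // 2,592,000
    val fleetImagesPerDay = imagesPerScreenPerDay * screens  // ~2.6 billion
    val cloudUsdPerDay = fleetImagesPerDay / 1_000 * usdPerThousandImages

    println("Fleet images/day: %.2e".format(fleetImagesPerDay))
    println("Cloud cost/day:  USD %,.0f".format(cloudUsdPerDay))  // ~USD 3,888,000
    // Edge inference on the same fleet adds no per-frame fee: the tablets
    // are already deployed and powered, and the models run locally.
}
```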
Implementation: the SDK ships pre-compiled TensorFlow Lite models for the detection and classification pipelines, plus a thin orchestrator that runs them in the right order with the right backpressure. On modern Snapdragon-class SoCs, the full pipeline runs at 5-10 fps on a single thread; on budget devices it runs at 1-2 fps. Both rates are sufficient for the second-scale attention measurement DOOH needs.
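A minimal sketch of that orchestration, assuming hypothetical model file names, tensor shapes, and a frame-dropping backpressure policy; the shipped SDK's actual models and scheduling may differ. The only library call used is TensorFlow Lite's `Interpreter.run`.

```kotlin
import org.tensorflow.lite.Interpreter
import java.io.File
import java.util.concurrent.atomic.AtomicBoolean

// Sketch of a two-stage TFLite pipeline with frame-dropping backpressure.
// Model file names and tensor shapes below are assumptions for illustration.
class EdgePipeline(modelDir: File) {
    private val detector = Interpreter(File(modelDir, "detect.tflite"))      // hypothetical
    private val classifier = Interpreter(File(modelDir, "classify.tflite"))  // hypothetical
    private val inFlight = AtomicBoolean(false)

    // Called from the camera callback with an [H][W][3] float frame.
    // Backpressure: if the previous frame is still being processed,
    // drop this one rather than queueing work the SoC cannot absorb.
    fun onFrame(frame: Array<Array<FloatArray>>): FloatArray? {
        if (!inFlight.compareAndSet(false, true)) return null  // drop frame
        try {
            // Stage 1: detection. Assumed output: [1][10][4] boxes.
            val boxes = Array(1) { Array(10) { FloatArray(4) } }
            detector.run(arrayOf(frame), boxes)

            // Stage 2: classification. A production pipeline would crop the
            // frame to each detected box first; for brevity this sketch
            // classifies the full frame. Assumed output: [1][8] scores.
            val scores = Array(1) { FloatArray(8) }
            classifier.run(arrayOf(frame), scores)
            return scores[0]
        } finally {
            inFlight.set(false)
        }
    }
}
```

Running both interpreters on one thread matches the single-thread throughput figures above; TFLite interpreters are not thread-safe, so a busy flag plus frame dropping is the simplest safe backpressure.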
Cloud inference still has a role for the higher-context signals: cohort composition and the buyer-grade narrative fields come from a call to the Gemini API. The edge owns what its sensors saw; the cloud fills the semantic gaps.
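A hedged sketch of that cloud leg, assuming the public Gemini `generateContent` REST endpoint; the model name, prompt, and summary format are illustrative, not the SDK's actual payload. Note that only an aggregate text summary crosses the network, never frames.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Sends an aggregate, non-biometric edge summary to the Gemini API and
// returns the raw JSON response. Model name and prompt are illustrative.
fun enrichSummary(apiKey: String, edgeSummary: String): String {
    val url = URL(
        "https://generativelanguage.googleapis.com/v1beta/models/" +
            "gemini-1.5-flash:generateContent?key=$apiKey"
    )
    // Assumes edgeSummary is already JSON-safe (no quotes or backslashes).
    val body = """{"contents":[{"parts":[{"text":
        "Describe the likely audience cohort given this sensor summary: $edgeSummary"
    }]}]}"""

    val conn = url.openConnection() as HttpURLConnection
    conn.requestMethod = "POST"
    conn.setRequestProperty("Content-Type", "application/json")
    conn.doOutput = true
    conn.outputStream.use { it.write(body.toByteArray()) }
    return conn.inputStream.bufferedReader().use { it.readText() }
}
```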
Authoritative reference
TensorFlow Lite — On-Device Inference (tensorflow.org)