Albertsons Companies’ Aditya R. Singh on the online order batching problem grocery fulfilment has been getting wrong
By Aditya R. Singh, Principal Product Manager, Picking Systems, Albertsons Companies
The operations research literature on order batching is large and technically mature. Since the foundational work of Gademann, van den Berg, and van der Hoff in 2001, the problem has been studied extensively - optimal algorithms for fixed-aisle warehouses, heuristics for zoned layouts, travel-distance minimisers, makespan schedulers. The field knows a great deal about how to batch orders efficiently.
What it knows considerably less about is how to batch orders on time.
The distinction matters because efficiency and timeliness are not the same objective. In classical order batching problem (OBP) formulations, the goal is to minimise total travel distance or total completion time across a batch wave. These are appropriate objectives for a distribution centre. They are the wrong objectives for a store running buy-online-pickup-in-store (BOPIS) grocery - an environment where each order carries a hard customer deadline, orders arrive continuously in real time, and the primary operational metric is promise-window adherence, not aggregate throughput.
This article presents Rolling Pick Lists (RPL), a slack-aware online batching and release policy developed to address this gap.
The gap in the literature is not obscure. Most OBP work assumes one of two settings: either all orders are known before batching begins (the offline assumption), or the optimisation objective is efficiency-oriented. Store-based BOPIS violates both.
Orders arrive dynamically throughout an eight-hour operational window. Each order i has an arrival time r_i, a customer-specified promise window W_i, and a hard deadline D_i = r_i + W_i. Promise windows in grocery e-commerce typically span a wide range - in my simulation, 10% of orders carried windows of 30 to 45 minutes, 50% carried 60 to 240 minutes, and 40% carried 240 to 360 minutes. The system must make batching decisions in real time without knowing what orders will arrive in the next hour.
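For readers who think in code, the order model above can be sketched as a small data structure. The class and field names here are mine, chosen for illustration, not a production schema:

```python
from dataclasses import dataclass

@dataclass
class Order:
    """One BOPIS order: arrival r_i, promise window W_i, deadline D_i = r_i + W_i."""
    order_id: str
    r_i: float              # arrival time, minutes from start of the operational window
    W_i: float              # customer-specified promise window, minutes
    item_count: int
    temp_zones: frozenset   # e.g. frozenset({"ambient", "chilled"})

    @property
    def D_i(self) -> float:
        # hard deadline: D_i = r_i + W_i
        return self.r_i + self.W_i

# an order arriving 125 minutes into the window with a 55-minute promise
o = Order("A1", r_i=125.0, W_i=55.0, item_count=12,
          temp_zones=frozenset({"ambient"}))
print(o.D_i)  # 180.0
```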
Pickers work in parallel - typically four to six in a store-based setting - and are non-preemptive: once a picker starts a picklist, they finish it before receiving another assignment. Each picklist is subject to hard constraints: a cart capacity of six totes, temperature zone compatibility, and a practical maximum batch size of three orders.
The objective is to minimise late-order rate - the proportion of orders completing after their deadline - subject to maintaining acceptable picker productivity. This is an online, multi-picker, SLA-constrained joint batching and release problem with capacity and temperature constraints. No prior OBP work I am aware of simultaneously addresses all five dimensions.
Classical approaches fail in this setting for a specific reason: they treat all orders as interchangeable candidates for batching. Fixed time-window batching accumulates orders over a set interval regardless of individual deadline urgency. A 35-minute-promise order and a four-hour-promise order that arrive in the same window receive identical treatment. The 35-minute order absorbs the delay overhead of the batch - queue wait, combined travel, multi-order pick time - and has no remaining buffer by the time the picker starts.
The mechanism that resolves this is slack. The calculation is a single expression, but it is worth working through carefully - the logic it encodes is the entire policy. For order i at decision time t:
S_i(t) = [D_i − t] − [q̂(t) + p̂_i(t)]
where D_i − t is the time remaining to deadline, q̂(t) is estimated queue delay, and p̂_i(t) is estimated processing time for order i. S_i(t) is the time buffer remaining after accounting for system load. If this quantity is negative, the order cannot safely absorb any additional delay. Batching it is a decision to make it late.
To make this concrete: an order arriving at 2:05 with a 3:00 PM deadline has 55 minutes remaining. If queue delay is 10 minutes and single-order pick time is 50 minutes, S_i(t) = 55 − (10 + 50) = −5 minutes. Negative slack - the order is already at risk. It should never have been a batch candidate.
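The expression and the worked example translate directly into code. This is a minimal sketch of the slack calculation (the function name is mine); times are in minutes, measured here from 2:00 PM so the 2:05 arrival is t = 5 and the 3:00 PM deadline is D_i = 60:

```python
def slack(D_i: float, t: float, q_hat: float, p_hat_i: float) -> float:
    """S_i(t) = (D_i - t) - (q_hat(t) + p_hat_i(t)): the time buffer
    remaining after accounting for queue delay and processing time."""
    return (D_i - t) - (q_hat + p_hat_i)

# the worked example: 55 minutes to deadline, 10 min queue delay,
# 50 min estimated single-order processing time
s = slack(D_i=60.0, t=5.0, q_hat=10.0, p_hat_i=50.0)
print(s)  # -5.0 -> negative slack: already at risk, never a batch candidate
```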
This is structurally analogous to least-slack-time (LST) scheduling, but applied at the batching eligibility decision rather than the sequencing decision. In LST, slack determines the order in which jobs are processed. Here, slack determines whether an order is eligible for batching at all - a prior and more fundamental question.
RPL is a control policy, not a one-time optimiser. At each decision epoch - triggered every 60 seconds or on picker availability, whichever comes first - the policy computes S_i(t) for all open orders, partitions them into forced singles (S_i(t) < θ) and batch candidates (S_i(t) ≥ θ), releases forced singles in earliest-deadline-first order, forms batches from candidates greedily subject to capacity and temperature constraints, and assigns to idle pickers.
The threshold θ is the single operational parameter: the minimum slack an order must carry to qualify for batching. At θ = 8 minutes, an order needs at least eight minutes of buffer beyond its estimated completion time before it enters the batch queue.
On complexity: the offline OBP is NP-hard. RPL avoids the hardness by using slack to structurally decompose the problem into a polynomial-time greedy heuristic. Forced singles require no batch optimisation at all. The n batch candidates form a reduced instance over which greedy construction runs in O(n² × C) - in practice, this means the decision runs in milliseconds on a standard server, with no solver and no infrastructure overhead.
The structural property worth noting is that at high demand - when the batch optimisation problem is largest - the forced-singles partition is also largest. The hardest instances are the ones the policy most aggressively removes from the batch optimiser. This makes RPL tractable in real time without solver infrastructure.
I validated RPL using discrete-event simulation in Python: an 8-hour horizon, five pickers, and Poisson arrivals calibrated to 80, 130, and 180 orders per day, with 50 replications per scenario, Welch's t-tests, and Bonferroni correction on all pairwise comparisons.
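The arrival process in that setup can be illustrated with a short sampler: Poisson arrivals over the 480-minute horizon, with promise windows drawn from the 10/50/40 mixture described earlier. The rates and mixture bounds follow the text; the sampler itself is a sketch, not the validated simulation code:

```python
import random

def sample_arrivals(orders_per_day: int, horizon_min: float = 480.0, seed: int = 7):
    """Poisson arrivals over an 8-hour horizon with the 10/50/40
    promise-window mixture. Returns (r_i, W_i, D_i) tuples in minutes."""
    rng = random.Random(seed)
    rate = orders_per_day / horizon_min      # arrivals per minute
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate)           # exponential inter-arrival gap
        if t > horizon_min:
            break
        u = rng.random()
        if u < 0.10:
            w = rng.uniform(30, 45)          # 10%: tight windows
        elif u < 0.60:
            w = rng.uniform(60, 240)         # 50%: mid-range windows
        else:
            w = rng.uniform(240, 360)        # 40%: relaxed windows
        arrivals.append((t, w, t + w))       # (r_i, W_i, D_i = r_i + W_i)
    return arrivals

medium_demand = sample_arrivals(130)         # the 130-orders-per-day scenario
```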
At high demand, RPL reduced the late-order rate from 6.4% to 3.1% - a 51% reduction (p < 0.001, d = −1.01). Productivity came in at 117 versus 129 items per picker-hour under time-window batching, a statistically significant difference (p < 0.001, d = 0.87) representing the cost of releasing tight-deadline orders as singles rather than batching them.
The more significant finding is at medium demand. RPL matched single-order picking's late-order rate - under 1% - while delivering productivity statistically equivalent to time-window batching: 118 versus 111 items per picker-hour (p = 0.25, d = 0.23). This is a Pareto improvement: RPL occupies a position on the SLA-productivity frontier that neither competing policy achieves. The mechanism is structural - at medium demand, most orders have sufficient slack to be batched safely. The forced-singles partition is small, and batch candidates capture essentially the same efficiency gains as unrestricted batching.
Threshold sensitivity at medium demand follows a consistent pattern: each two-minute increase in θ from 4 to 12 minutes reduces the late-order rate by 0.1 to 0.3 percentage points at a cost of three to five items per picker-hour, with the marginal benefit flattening above θ = 10 minutes.
RPL requires two real-time estimates: queue delay q̂(t) and processing time p̂_i(t). Queue delay is a rolling computation - pending picklists divided by active pickers, multiplied by average picklist duration - maintained as a continuously updated state variable. Processing time is a sum of travel time (estimated from zone count) and pick time (estimated from item count at a calibrated fixed rate).
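Both proxies fit in a few lines. The constants here are placeholders I have chosen for illustration; in practice the average picklist duration is a rolling state variable and the travel and pick rates are calibrated per store:

```python
AVG_PICKLIST_MIN = 18.0   # rolling average picklist duration (assumed value)
TRAVEL_PER_ZONE = 2.5     # minutes of travel per zone visited (assumed value)
PICK_RATE_MIN = 0.5       # minutes per item at a calibrated fixed rate (assumed)

def queue_delay(pending_picklists: int, active_pickers: int) -> float:
    """q_hat(t): pending picklists per active picker, times average duration."""
    return (pending_picklists / active_pickers) * AVG_PICKLIST_MIN

def processing_time(zone_count: int, item_count: int) -> float:
    """p_hat_i(t): travel time (from zone count) plus pick time (from item count)."""
    return zone_count * TRAVEL_PER_ZONE + item_count * PICK_RATE_MIN
```

With these placeholder rates, ten pending picklists across five pickers give q_hat = 36 minutes, and a twelve-item order spanning two zones gives p_hat_i = 11 minutes.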
These are deliberately simple proxies. Underestimating either quantity understates the true forced-singles partition, which errs toward batching orders that should have been released individually. The error is recoverable and immediately observable in monitoring. The policy fails in the direction of a slightly elevated late-order rate - not silent failure. It fails safely.
For deployment, three integration patterns are viable. The first and recommended is RPL as a policy layer inside the fulfillment orchestrator - the orchestrator already holds order state, picker state, and the queue, so RPL becomes a decision function called at each epoch before picklist assignment, requiring no new APIs and no changes to OMS or WMS.
The second embeds RPL inside the WMS batching module, enabling tighter integration with picker-state data but creating a vendor dependency. The third is a sidecar service that observes shared state via read APIs and submits picklist requests — the most loosely coupled option, appropriate when the orchestrator cannot be modified directly.
In all three patterns, the key read surfaces are order metadata from the OMS (arrival time, promise window, item count, temperature zones) and picker state from the WMS or picker app (status, current assignment, estimated completion). The write surface is the picklist assignment queue. Analytics should capture actual versus estimated completion times to allow calibration of the processing time proxy over time.
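The policy-layer shape of pattern one reduces to a single function the orchestrator calls each epoch. The callables below stand in for the read and write surfaces just described; none of these names come from a real OMS or WMS API, and the picklist-formation step is deliberately stubbed (each order released as its own picklist) so the sketch stays self-contained:

```python
def rpl_epoch(read_open_orders, read_idle_pickers, submit_picklist):
    """One policy-layer epoch, invoked every 60 s or on picker availability.

    read_open_orders(): OMS read surface - arrival, window, items, zones.
    read_idle_pickers(): WMS/picker-app read surface - status, assignment.
    submit_picklist(picklist, picker): write surface - the assignment queue.
    """
    orders = read_open_orders()
    pickers = read_idle_pickers()
    # stub for the slack partition + greedy batching step described above:
    # here every order is released as a single so the sketch runs standalone
    picklists = [[o] for o in orders]
    for picklist, picker in zip(picklists, pickers):
        submit_picklist(picklist, picker)
```

Because the function only reads state and writes assignments, swapping it into patterns two or three is a matter of rebinding the three callables, which is the sense in which the policy requires no new APIs.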
The most operationally significant extension is preparatory department integration. The current model treats all items as immediately available. In practice, deli, bakery, and hot-food items require preparation before picking - a separate service stage with its own queue. Incorporating prep queue delay into p̂_i(t) would extend the slack calculation to the full completion timeline and surface cases where tight deadline orders are already at risk before a picker is assigned.
The second meaningful extension is adaptive threshold selection. A contextual bandit or Q-learning formulation could update θ in real time based on queue depth, time of day, and recent late-order rate - allowing the policy to tighten automatically during demand spikes, precisely when a static threshold is most likely to underperform.
The online, SLA-driven order batching problem is underexplored relative to its practical importance. The offline OBP literature is technically rich. The gap between those formulations and the real-time, deadline-first environment of store-based grocery fulfillment is substantial. RPL addresses that gap with a tractable policy grounded in a specific insight: slack determines batching eligibility before sequencing is even a question.
The simulation results support the policy's practical value. The deployment architecture maps to production stacks. The open questions - whether tighter theoretical bounds on the late-order rate under RPL are achievable, and whether the adaptive-threshold problem admits a clean regret characterisation - are worth pursuing. The operational case for deploying the base policy is already established.
About the author: Aditya R. Singh is Principal Product Manager for Picking Systems at Albertsons Companies. He leads product development for in-store e-commerce fulfillment systems. Connect with him on LinkedIn.