How it works
celery-batches makes no changes to how tasks are created or sent to the broker; it operates on the workers to process multiple tasks at once. Exactly how tasks are processed depends on the configuration. The explanation below assumes the default “prefork” pool of a Celery worker. It doesn’t change significantly if the gevent, eventlet, or threads worker pools are used, but the math is different.
As background, Celery workers have a “main” process which fetches tasks from the
broker. By default it fetches up to “worker_prefetch_multiplier x worker_concurrency”
tasks (if available). For example, if the prefetch multiplier is 100 and the
concurrency is 4, it attempts to fetch up to 400 items from the broker’s queue.
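
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not code from Celery or celery-batches; the function name `prefetch_limit` is hypothetical.

```python
def prefetch_limit(prefetch_multiplier: int, concurrency: int) -> int:
    """Upper bound on how many tasks the "main" process tries to fetch.

    Hypothetical helper: it mirrors the
    worker_prefetch_multiplier x worker_concurrency product described above.
    """
    return prefetch_multiplier * concurrency


if __name__ == "__main__":
    # The example from the text: multiplier 100, concurrency 4.
    print(prefetch_limit(100, 4))  # prints 400
```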
Once the messages are in memory, the worker deserializes them and runs the
Strategy of each task. For a normal Celery Task this is the default() strategy,
which passes the tasks to the workers in the processing pool one at a time.
The Batches task provides a different strategy, which instructs
the “main” Celery worker process to queue tasks in memory until either
the flush_interval elapses or flush_every tasks have been queued. It then
passes that list of tasks together to a worker in the processing pool.
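
The buffering behaviour described above can be modelled with a small, self-contained sketch. It is a simplified toy, not the celery-batches implementation: the class `BatchBuffer`, its `add`, `tick`, and `flush` methods, and the injectable `clock` are all hypothetical names, and it assumes flush_interval is measured in seconds.

```python
import time
from typing import Any, Callable, List


class BatchBuffer:
    """Toy model of the "queue until flush_every or flush_interval" idea.

    Hypothetical class for illustration; the real library hooks into the
    worker's Strategy and timers instead.
    """

    def __init__(
        self,
        handler: Callable[[List[Any]], None],
        flush_every: int = 10,
        flush_interval: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.handler = handler
        self.flush_every = flush_every
        self.flush_interval = flush_interval
        self.clock = clock
        self._pending: List[Any] = []
        self._last_flush = clock()

    def add(self, request: Any) -> None:
        """Queue one request; flush once flush_every requests are pending."""
        self._pending.append(request)
        if len(self._pending) >= self.flush_every:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes pending requests after flush_interval."""
        elapsed = self.clock() - self._last_flush
        if self._pending and elapsed >= self.flush_interval:
            self.flush()

    def flush(self) -> None:
        """Hand every pending request to the handler as one list."""
        batch, self._pending = self._pending, []
        self._last_flush = self.clock()
        if batch:
            self.handler(batch)


if __name__ == "__main__":
    buffer = BatchBuffer(handler=print, flush_every=3, flush_interval=10.0)
    for n in range(7):
        buffer.add(n)  # prints [0, 1, 2] and later [3, 4, 5]
    buffer.flush()  # prints [6]
```

The handler always receives a list, which matches the point made above: the pool worker gets the whole batch in one call rather than one task at a time.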