
✨ add timeout option to batched event handler#641

Open
rparcus wants to merge 5 commits into commanded:master from Spectral-Finance:rp/add_batch_timeout

Conversation


@rparcus rparcus commented Nov 15, 2025

Adds a batch_timeout option to event handlers.

Comment on lines 881 to 885

    if batch_timeout_provided? do
      raise ArgumentError,
            inspect(module) <>
              " :batch_timeout requires :batch_size. Remove the timeout or configure batching."
    end

I think this check makes more sense with the checks up above like "both :concurrency and :batch_size are specified..." at the beginning of this function.

A bonus of moving the checks is that I think you would no longer need the logic surrounding __no_batch_timeout__.

Comment on lines 1193 to 1195

    # No batch_size configured - process immediately
    is_nil(batch_size) ->
      handle_batch(events, %{flush_reason: :immediate}, state)
@TylerPachal TylerPachal Feb 2, 2026


Is this case actually possible? If we are in :batch callback mode doesn't there have to be a batch_size configured upstream from here? Enforced by this code: https://github.com/commanded/commanded/blob/master/lib/commanded/event/handler.ex#L782-L784

If this case is not possible, and there are actually only two possible cases here, I think this code would be a little easier to follow if you moved things up into the handle_info above and had all three code paths directly in the case statement:

try do
  events = Upcast.upcast_event_stream(events, additional_metadata: %{application: application})

  state =
    case {callback, state.batch_timeout} do
      {:event, _} ->
        # Non-batched: process immediately
        Enum.reduce(events, state, &handle_event/2)

      # Batched with no timeout configured
      {:batch, nil} ->
        handle_batch(events, state)

      # Batched with timeout
      {:batch, batch_timeout} ->
        buffer_and_maybe_flush(events, state)
    end
  end

# ...

defp buffer_and_maybe_flush(events, %Handler{} = state) do
  %Handler{
    batch_buffer: buffer,
    batch_size: batch_size,
    batch_timeout: batch_timeout
  } = state

  current_buffer = buffer || []
  new_buffer = current_buffer ++ events
  state = %Handler{state | batch_buffer: new_buffer}

  # Start timer if this is first event in batch
  state = maybe_start_batch_timer(state)

  # Check if we should flush based on size
  if length(new_buffer) >= batch_size do
    state
    |> cancel_batch_timer()
    |> flush_batch_buffer(:size)
  else
    state
  end
end

I think this makes more sense because the buffer_and_maybe_flush function is now clearly only on the code path with the timer. It was a little confusing why all :batch paths went through the buffer + flush mechanism.

    } = state

    # Upcast events before buffering
    events = Upcast.upcast_event_stream(events, additional_metadata: %{application: application})
@TylerPachal TylerPachal Feb 2, 2026


If this is being called here, and in the :event codepath of handle_info({:events, events}, state) here, then I think it's simpler for it to reside only in the handle_info({:events, events}, state) function (at the top of the try block).

- Move "batch_timeout requires batch_size" validation earlier
- Remove __no_batch_timeout__ sentinel pattern
- Restructure handle_info with clear three-way code paths
- Move upcasting to single location before branching
- Simplify buffer_and_maybe_flush (remove dead code paths)
- Fix unused default argument warning in handle_batch/3
@rparcus rparcus force-pushed the rp/add_batch_timeout branch from 1cadc59 to 66ab766 on February 2, 2026 at 08:36
@drteeth drteeth left a comment

I really like the idea of this change. Thanks for putting the time in. There are a few things I'm unclear about still, but we'll get there.

    {batch_timeout, config} = Keyword.pop(config, :batch_timeout, :infinity)

    # Validate batch_size
    unless is_nil(batch_size) or (is_integer(batch_size) and batch_size > 0) do

unless was deprecated in Elixir 1.18, can you refactor this to use if please?
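A minimal sketch of that refactor, using an illustrative validator (the function name and error message here are hypothetical, not from the PR):

```elixir
defmodule BatchConfig do
  # `unless expr do ... end` is soft-deprecated since Elixir 1.18;
  # the direct replacement is `if` with a negated condition.
  def validate_batch_size!(batch_size) do
    if not (is_nil(batch_size) or (is_integer(batch_size) and batch_size > 0)) do
      raise ArgumentError, ":batch_size must be a positive integer"
    end

    batch_size
  end
end
```

Wrapping the original condition in not/1 keeps the diff minimal; the condition itself could also be rewritten in positive form.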

    end

    # Validate batch_timeout
    unless batch_timeout == :infinity or (is_integer(batch_timeout) and batch_timeout > 0) do

unless was deprecated in Elixir 1.18, can you refactor this to use if please?


    # Clear buffer and timer BEFORE processing to prevent race condition
    # If timer fires during batch processing, it will see empty buffer
    state = %Handler{state | batch_buffer: [], batch_timer_ref: nil}
@drteeth drteeth Feb 2, 2026


Setting batch_timer_ref to nil here can't prevent a race condition, can it? You are already cancelling the timer, draining any flush messages that made it through, and setting the timer ref to nil in cancel_batch_timer/1 before this call.

      {:batch, _timeout} ->
        # Batched with timeout: buffer events and flush on size or timeout
        buffer_and_maybe_flush(events, state)
    end

Love this.


defp drain_flush_batch_timeout_message do
  receive do
    :flush_batch_timeout -> :ok

What happens if the next message isn't a flush message? I think we'll crash here, no? And if you do manage to change the pattern match, then we've taken the message out of the mailbox and we have to process it, whatever it is?

Am I wrong?

@rparcus (Author)

The after 0 clause below makes this a "non-blocking selective receive", meaning that it only takes :flush_batch_timeout from the mailbox if one is already there, otherwise returns immediately.

It never blocks and never consumes other messages. This pattern is used for draining a specific message after cancelling a timer (Process.cancel_timer/1 returns false when the timer already fired, meaning the message may be in the mailbox).
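That drain-after-cancel pattern can be sketched in isolation like this (the module name is illustrative; the message atom matches the PR):

```elixir
defmodule TimerDrain do
  # Cancel a batch timer; if it already fired, its message may be
  # queued, so drain it without blocking or touching other messages.
  def cancel_and_drain(timer_ref) do
    case Process.cancel_timer(timer_ref) do
      # `false` means the timer already fired (or never existed), so a
      # :flush_batch_timeout message may already be in the mailbox.
      false -> drain_flush_message()
      _remaining_ms -> :ok
    end
  end

  defp drain_flush_message do
    receive do
      :flush_batch_timeout -> :ok
    after
      # `after 0` makes the receive non-blocking: return immediately
      # when no :flush_batch_timeout is queued.
      0 -> :ok
    end
  end
end
```

Because the receive only matches :flush_batch_timeout, unrelated messages already in the mailbox are never consumed, and `after 0` guarantees the call returns immediately either way.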

@TylerPachal
Contributor

@rparcus Could you explain a little bit what behavior you are trying to add/change with this PR?

From what I can see, if I set batch_size: 100 and dispatch a command, my event handler sees that event immediately (i.e. it does not wait for another 99 events to "fill" the batch).

Are you seeing different behavior than that?

batch_size controls the EventStore subscription's in-flight buffer,
not handler-level accumulation. batch_timeout adds opt-in time-based
buffering at the handler level for use cases like batching projection
writes during steady-state operation.

Made-with: Cursor
@rparcus
Copy link
Author

rparcus commented Feb 27, 2026

> @rparcus Could you explain a little bit what behavior you are trying to add/change with this PR?
>
> From what I can see, if I set batch_size: 100 and dispatch a command, my event handler sees that event immediately (i.e. it does not wait for another 99 events to "fill" the batch).
>
> Are you seeing different behavior than that?

You're right, batch_size alone does not cause accumulation or hold-up of events. I tested this again against the real PostgreSQL EventStore: with batch_size: 100, dispatching a single command results in handle_batch/1 being called immediately with that one event. batch_size only sets the subscription's buffer_size (the in-flight window), so real batches only form during catch-up replay or under back-pressure. I also found the old issue describing this: in most cases handle_batch is nothing more than a normal handler wrapped in a list, since it only "does its thing" during catch-up.

With this change, the batch_timeout option adds something that doesn't exist in commanded today: time-based buffering at the handler level. When configured, events are accumulated in the handler process and flushed when either batch_size events have collected OR batch_timeout milliseconds have elapsed. This enables use cases like batching projection writes during steady-state live operation, not just during catch-up. I have updated the module docs to make this distinction more evident.
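As a sketch of how the proposed option might be used (module names and the handle_batch/1 shape are assumptions based on this thread, not verified against the final API):

```elixir
defmodule MyApp.AccountsProjector do
  use Commanded.Event.Handler,
    application: MyApp.Application,
    name: __MODULE__,
    # Flush once 100 events have buffered...
    batch_size: 100,
    # ...or 250 ms after the first buffered event, whichever comes first.
    batch_timeout: 250

  # Per this thread, the batch callback receives the whole flushed
  # batch at once, enabling a single projection write per flush.
  def handle_batch(events) do
    MyApp.Projections.write_batch(events)
    :ok
  end
end
```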
