RaceTrackerCatchUpHook
final class RaceTrackerCatchUpHook implements CatchUpHookInterface (View source)
| internal |
We had some race conditions in projections We saw some non-deterministic, random errors when running the tests - unluckily only on Linux, not on OSX: On OSX, forking a new subprocess in {SubprocessProjectionCatchUpTrigger} is WAY slower than in Linux; and thus the race conditions which appears if two projector instances of the same class run concurrently won't happen (or are way less likely).
Our Goal: Detect if/when a Projection runs twice at the same time.
The system must ENSURE that a given projection NEVER runs concurrently; so this is the case we need to detect.
This means, the following is the behavior we want to have:
Process A acquireLock( |[ ) processEvent() releaseLock( ] )
Process B acquireLock( | [ ) processEvent() releaseLock( ] )
(i.e. Process B will wait inside acquireLock() until the lock is released (i.e. Process A finished), and then continue.
A WRONG and UNDEFINED behavior looks as follows:
Process A acquireLock( |[ ) applyEvent() releaseLock( ] )
Process B acquireLock( | [ ) applyEvent() releaseLock( ] )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
During this time period, two
processes run the projection
concurrently.
Legend for the flow diagrams above
--> time | [ ]
^ ^ ^
try to acquire lock acquired lock released
the lock
Implementation Idea: Race Detector with Redis
We implement a custom CatchUpHook (this class {\Neos\ContentRepository\BehavioralTests\ProjectionRaceConditionTester\RaceTrackerCatchUpHook}) which is notified during the projection run.
When {\Neos\ContentRepository\BehavioralTests\ProjectionRaceConditionTester\onBeforeEvent} is called, we know that we are inside applyEvent() in the diagram above, thus we know the lock HAS been acquired. When {\Neos\ContentRepository\BehavioralTests\ProjectionRaceConditionTester\onAfterCatchUp}is called, we know the lock will be released directly afterwards.
We track these timings across processes in a single Redis Stream. Because Redis is single-threaded, we can be sure that we observe the correct, total order of interleavings across multiple processes inside the single trace.
Race Detector Algorithm
We sequentially go through the stream, we continuously track for which PIDs a transaction is currently open.
When a transaction is open for more than one PID, we know that we found a race.
This algorithm is implemented in {[\Neos\ContentRepository\BehavioralTests\ProjectionRaceConditionTester\Dto\TraceEntries::findProjectionConcurrencyViolations()}.
](../../../../Neos/ContentRepository/BehavioralTests/ProjectionRaceConditionTester/Dto/TraceEntries.html) Duplicate Processing Algorithm
At the same time, an Event should never be processed multiple times by the same Projector. We additionally detect this by remembering the sequence numbers of seen events; and if we have already seen the sequence number already, we know this is an error. This is implemented in {\Neos\ContentRepository\BehavioralTests\ProjectionRaceConditionTester\Dto\TraceEntries::findDoubleProcessingOfEvents()}.
Properties
| protected mixed[] | $configuration |
Methods
This hook is called at the beginning of a catch-up run, after the database lock is acquired, but before any projection was called.
This hook is called for every event during the catchup process, before the projection is updated but in the same transaction.
This hook is called for every event during the catchup process, after the projection is updated but in the same transaction,
This hook is called for each batch of processed events during the catchup process, after the projection and their new position is updated and the transaction is commited.
This hook is called at the END of a catch-up run, after the projection and their new position is updated and the transaction is commited.
Details
void
onBeforeCatchUp(SubscriptionStatus $subscriptionStatus)
This hook is called at the beginning of a catch-up run, after the database lock is acquired, but before any projection was called.
Note that any errors thrown will be collected and the current catchup batch will be finished as normal. The collect errors will be returned and rethrown by the content repository.
void
onBeforeEvent(EventInterface $eventInstance, EventEnvelope $eventEnvelope)
This hook is called for every event during the catchup process, before the projection is updated but in the same transaction.
Note that any errors thrown will be collected and the current catchup batch will be finished as normal. The collect errors will be returned and rethrown by the content repository.
void
onAfterEvent(EventInterface $eventInstance, EventEnvelope $eventEnvelope)
This hook is called for every event during the catchup process, after the projection is updated but in the same transaction,
Note that any errors thrown will be collected and the current catchup batch will be finished as normal. The collect errors will be returned and rethrown by the content repository.
void
onAfterBatchCompleted()
This hook is called for each batch of processed events during the catchup process, after the projection and their new position is updated and the transaction is commited.
The database lock is directly acquired again after it is released if the batching needs to continue. It can happen that this method is called even without having seen events in the meantime.
Note that any errors thrown will be collected but no further batch is started. The collect errors will be returned and rethrown by the content repository.
void
onAfterCatchUp()
This hook is called at the END of a catch-up run, after the projection and their new position is updated and the transaction is commited.
Note that any errors thrown will be collected and the catchup will finish as normal. The collect errors will be returned and rethrown by the content repository.