Event-Driven Architecture
Event-driven architecture lets services communicate asynchronously by publishing facts about state changes. It improves decoupling and fan-out, but introduces eventual consistency and debugging complexity.
Core Idea
A service emits events such as OrderPlaced or PaymentSucceeded, and downstream systems react independently. The producer has no knowledge of who consumes its events or what they do with them, which is what makes the architecture loosely coupled by design.
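The decoupling described above can be sketched with a tiny in-memory bus. This is an illustration only: EventBus, the handler lambdas, and the order_id field are hypothetical names, and delivery here is a synchronous function call, whereas a real system would normally hand events to a message broker.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-memory stand-in for a message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The producer names only the event type; it never references a consumer.
        for handler in list(self._handlers[event_type]):
            handler(payload)


bus = EventBus()
emails_sent = []
stock_reserved = []

# Two independent consumers react to the same fact.
bus.subscribe("OrderPlaced", lambda e: emails_sent.append(e["order_id"]))
bus.subscribe("OrderPlaced", lambda e: stock_reserved.append(e["order_id"]))

bus.publish("OrderPlaced", {"order_id": "o-1"})
```

Adding a third consumer would be one more subscribe call, with no change to the code that publishes.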
Benefits
- Loose coupling: producers and consumers evolve independently without shared deployment cycles
- Better fan-out scalability: a single event can trigger many consumers in parallel without the producer doing extra work
- Easier independent evolution: adding a new consumer requires no changes to the producer
- Natural fit for analytics and notifications: event streams double as an audit log and a data pipeline input
- Improved resilience: a slow or failed consumer does not block the producer or other consumers
Challenges
- Duplicate delivery: at-least-once delivery means consumers must handle the same event arriving more than once
- Event ordering: many brokers guarantee order only within a single partition or queue, so events from the same producer may arrive out of order across partitions or network paths
- Versioning: changing an event schema can break consumers silently if no compatibility contract is enforced
- Eventual consistency: downstream state reflects the world as it was when the event was emitted, not necessarily right now
- More complex tracing: a single user action may fan out to dozens of async handlers, making end-to-end tracing harder
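One possible way to cope with the ordering challenge listed above is to have consumers discard stale events. The sketch below assumes the producer stamps each event with a per-entity sequence number; the sequence, entity_id, and status fields are hypothetical, and the approach only suits state where the latest value wins.

```python
latest = {}  # entity_id -> (sequence, status)


def apply_status_event(event):
    """Apply the event unless an equal or newer one was already seen."""
    seen = latest.get(event["entity_id"])
    if seen is not None and event["sequence"] <= seen[0]:
        return False  # stale or duplicate: ignore it
    latest[event["entity_id"]] = (event["sequence"], event["status"])
    return True


incoming = [
    {"entity_id": "o-1", "sequence": 2, "status": "shipped"},
    {"entity_id": "o-1", "sequence": 1, "status": "paid"},  # late arrival
]
results = [apply_status_event(e) for e in incoming]
```

The same comparison also drops exact duplicates, because a repeated event carries a sequence number that is not greater than the stored one.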
Event Types
- Domain events: something that happened in the business domain, such as OrderShipped or UserRegistered
- Integration events: domain events published beyond a service boundary for other systems to consume
- Commands: a request for something to happen, directed at a specific handler rather than broadcast; unlike an event, which records a fact, a command can be rejected
- Queries: a request for current state, typically synchronous and outside the event-driven flow
Event Schema Design
Events should be self-contained and carry enough context for consumers to act without querying back. Include the event type, version, timestamp, source service, and the entity ID at minimum. Avoid embedding mutable objects that could change meaning after the event is emitted.
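A minimal sketch of such an envelope follows. The class name EventEnvelope and the field names are assumptions chosen for illustration, not a standard; the point is that the minimum fields listed above travel with every event and that the payload is a snapshot of values taken at emit time.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventEnvelope:
    """Hypothetical envelope carrying the minimum context listed above."""

    event_type: str
    version: int
    source: str
    entity_id: str
    data: dict  # snapshot of values at emit time, not a reference to live state
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)


event = EventEnvelope(
    event_type="OrderPlaced",
    version=1,
    source="order-service",
    entity_id="o-1",
    data={"total_cents": 4200, "currency": "EUR"},
)
wire = event.to_json()
```

Carrying total_cents in the event, rather than only the order ID, is what lets a consumer act without querying the order service back.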
Schema Evolution
Use a schema registry to enforce compatibility rules between producers and consumers. Backward-compatible changes such as adding optional fields are safe. Removing fields or changing types is a breaking change that requires a versioned event type and a migration period where both versions are published in parallel.
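On the consumer side, the migration period might look like the sketch below. The event shapes are invented for illustration: version 1 is assumed to carry the amount as a decimal string in total, version 2 changes that to an integer total_cents (a breaking change), and coupon is an optional field added later without a version bump.

```python
from decimal import Decimal


def read_order_placed(raw):
    """Normalize any supported version of a hypothetical OrderPlaced event."""
    version = raw.get("version", 1)
    if version == 1:
        # v1 (assumed shape): amount as a decimal string, e.g. "42.00".
        total_cents = int(Decimal(raw["total"]) * 100)
    elif version == 2:
        # v2 (assumed shape): field renamed and retyped, hence the new version.
        total_cents = raw["total_cents"]
    else:
        raise ValueError(f"unsupported OrderPlaced version: {version}")
    return {
        "order_id": raw["order_id"],
        "total_cents": total_cents,
        # Optional field: absent in older events, so fall back to a default.
        "coupon": raw.get("coupon"),
    }


old = read_order_placed({"version": 1, "order_id": "o-1", "total": "42.00"})
new = read_order_placed(
    {"version": 2, "order_id": "o-2", "total_cents": 999, "coupon": "SPRING"}
)
```

Once no version 1 events remain in flight or in retained logs, the first branch can be deleted.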
Idempotent Consumers
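Because duplicate delivery is normal, every consumer must be idempotent. Common techniques are tracking processed event IDs in a database, using upsert semantics instead of plain inserts, and designing state transitions that are safe to apply more than once. An idempotency key derived from the event ID is the simplest approach; record it in the same transaction as the state change so that a crash between the two cannot leave an event skipped or applied twice.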
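A minimal sketch of a consumer that tracks processed event IDs, using Python's built-in sqlite3 module with an in-memory database. The table names, the handler name, and the event fields are assumptions for illustration, and the upsert syntax needs SQLite 3.24 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE balances (account_id TEXT PRIMARY KEY, cents INTEGER NOT NULL)"
)


def handle_payment_succeeded(event):
    """Apply the event at most once; return False for a duplicate."""
    with conn:  # one transaction: commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["event_id"],),
        )
        if cur.rowcount == 0:
            return False  # ID already recorded, so skip the side effect
        conn.execute(
            "INSERT INTO balances (account_id, cents) VALUES (?, ?) "
            "ON CONFLICT(account_id) DO UPDATE SET cents = cents + excluded.cents",
            (event["account_id"], event["amount_cents"]),
        )
    return True


event = {"event_id": "e-1", "account_id": "a-1", "amount_cents": 500}
first = handle_payment_succeeded(event)
second = handle_payment_succeeded(event)  # simulated redelivery
balance = conn.execute(
    "SELECT cents FROM balances WHERE account_id = ?", ("a-1",)
).fetchone()[0]
```

Recording the event ID and updating the balance inside one transaction means either both happen or neither does.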
Event Sourcing
Event sourcing takes the pattern further by storing every state change as an immutable event rather than overwriting a row. The current state of an entity is derived by replaying its event log. This gives a complete audit trail and enables temporal queries but adds complexity to reads and projections.
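The replay idea can be sketched as a fold over the event log. The Account type, the event names, and the replay helper are hypothetical; a real event store would also persist the log and would usually add snapshots so that long histories need not be replayed from the start.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Account:
    balance_cents: int = 0
    closed: bool = False


def apply(state, event):
    """Return the next state; events are never mutated or deleted."""
    kind = event["type"]
    if kind == "Deposited":
        return Account(state.balance_cents + event["amount_cents"], state.closed)
    if kind == "Withdrawn":
        return Account(state.balance_cents - event["amount_cents"], state.closed)
    if kind == "AccountClosed":
        return Account(state.balance_cents, True)
    return state  # unknown event types are ignored


def replay(events, upto=None):
    """Derive state from the first `upto` events (all of them if None)."""
    return reduce(apply, events[:upto], Account())


log = [
    {"type": "Deposited", "amount_cents": 1000},
    {"type": "Withdrawn", "amount_cents": 300},
    {"type": "Deposited", "amount_cents": 50},
]
current = replay(log)
as_of_second_event = replay(log, upto=2)  # a simple temporal query
```

The upto parameter shows why temporal queries come almost for free: past state is the same fold stopped earlier.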
Choreography vs Orchestration
In choreography, each service reacts to events and emits its own events in response, with no central coordinator. In orchestration, a central saga or workflow engine directs each step explicitly. Choreography is more loosely coupled and has no central component to scale, but the overall workflow is implicit and harder to trace. Orchestration makes the workflow explicit and easier to reason about, but the coordinator can become a bottleneck and a single point of failure.
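The contrast can be sketched with the same three-step order flow written both ways. Everything here is a simplified, in-process illustration with hypothetical names (on, emit, place_order_saga); real implementations would run the steps in separate services and add failure handling and compensation.

```python
from collections import defaultdict

# --- Choreography: each handler reacts to one event and emits the next ---
handlers = defaultdict(list)
choreography_trace = []


def on(event_type):
    def register(fn):
        handlers[event_type].append(fn)
        return fn
    return register


def emit(event_type, order_id):
    choreography_trace.append(event_type)
    for fn in list(handlers[event_type]):
        fn(order_id)


@on("OrderPlaced")
def take_payment(order_id):
    emit("PaymentSucceeded", order_id)


@on("PaymentSucceeded")
def ship_order(order_id):
    emit("OrderShipped", order_id)


emit("OrderPlaced", "o-1")


# --- Orchestration: one coordinator owns the sequence of steps ---
def charge(order_id):
    return "PaymentSucceeded"


def ship(order_id):
    return "OrderShipped"


def place_order_saga(order_id):
    trace = ["OrderPlaced"]
    for step in (charge, ship):  # the whole workflow is visible in one place
        trace.append(step(order_id))
    return trace


orchestration_trace = place_order_saga("o-1")
```

Both versions produce the same sequence of facts. In the first, the workflow exists only as the sum of the subscriptions; in the second, it can be read from a single function.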
Distributed Tracing
A single user action in an event-driven system can fan out to many async handlers across services. Propagate a correlation ID or trace context through every event so that all downstream processing can be linked back to the originating request in your observability platform.
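A minimal sketch of that propagation follows. The correlation_id and causation_id field names and the new_event helper are assumptions for illustration; in practice a tracing library or a standard trace-context header would usually carry this information.

```python
import uuid


def new_event(event_type, data, cause=None):
    """Build an event that inherits tracing context from the event that caused it."""
    return {
        "event_id": str(uuid.uuid4()),
        "event_type": event_type,
        "data": data,
        # Minted once at the originating request, then copied unchanged.
        "correlation_id": cause["correlation_id"] if cause else str(uuid.uuid4()),
        # Direct parent, which lets a tool rebuild the fan-out tree.
        "causation_id": cause["event_id"] if cause else None,
    }


order_placed = new_event("OrderPlaced", {"order_id": "o-1"})
payment = new_event("PaymentSucceeded", {"order_id": "o-1"}, cause=order_placed)
shipped = new_event("OrderShipped", {"order_id": "o-1"}, cause=payment)
```

Filtering logs by correlation_id then returns every handler invocation that the original request triggered, while causation_id gives the order in which they were caused.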
When Not to Use It
Event-driven architecture adds operational complexity that is not always justified. For simple CRUD applications, synchronous REST or RPC is easier to reason about, debug, and test. Reach for events when you genuinely need fan-out, decoupling across team boundaries, or an audit log of state changes.
Interview Tip
Explicitly mention eventual consistency and idempotent consumers. Also distinguish between choreography and orchestration to show you understand the trade-offs in coordinating multi-step workflows across services.