HLD · Intermediate

Event-Driven Architecture

Event-driven architecture lets services communicate asynchronously by publishing facts about state changes. It improves decoupling and fan-out, but introduces eventual consistency and debugging complexity.

Reading time: 11 min

Tags: event-driven, asynchronous, kafka, architecture

Core Idea

A service emits events such as OrderPlaced or PaymentSucceeded, and downstream systems react independently. The producer has no knowledge of who consumes its events or what they do with them, which is what makes the architecture loosely coupled by design.
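The decoupling described above can be sketched with a toy in-process bus. This is only an illustration: `EventBus`, `subscribe`, and `publish` are hypothetical names, and a real broker would add durability and asynchronous delivery that this sketch leaves out.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-process stand-in for a broker; not durable, not asynchronous."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The producer only names the event; it never sees who is subscribed.
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()
emails: List[str] = []
reserved: List[str] = []

# Two independent consumers react to the same fact.
bus.subscribe("OrderPlaced", lambda e: emails.append(f"receipt for {e['order_id']}"))
bus.subscribe("OrderPlaced", lambda e: reserved.append(e["order_id"]))

bus.publish("OrderPlaced", {"order_id": "o-1"})
```

Adding a third consumer is one more `subscribe` call; the `publish` call site does not change.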

Benefits

Loose coupling: producers and consumers evolve independently without shared deployment cycles
Better fan-out scalability: a single event can trigger many consumers in parallel without the producer doing extra work
Easier independent evolution: adding a new consumer requires no changes to the producer
Natural fit for analytics and notifications: event streams double as an audit log and a data pipeline input
Improved resilience: a slow or failed consumer does not block the producer or other consumers

Challenges

Duplicate delivery: at-least-once delivery means consumers must handle the same event arriving more than once
Event ordering: brokers such as Kafka guarantee order only within a partition, so events from the same producer can arrive out of order when they travel through different partitions or network paths
Versioning: changing an event schema can break consumers silently if no compatibility contract is enforced
Eventual consistency: downstream state reflects the world as it was when the event was emitted, not necessarily right now
More complex tracing: a single user action may fan out to dozens of async handlers, making end-to-end tracing harder
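One common way to soften the ordering challenge listed above is to route all events for an entity to the same partition and have consumers ignore stale updates. The sketch below assumes the producer attaches a per-entity `sequence` number starting at 1; that field, `partition_for`, and `LatestStatusProjection` are hypothetical names used for illustration.

```python
import hashlib
from typing import Dict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class LatestStatusProjection:
    """Keeps the newest status per entity and drops stale or repeated events."""

    def __init__(self) -> None:
        self.last_sequence: Dict[str, int] = {}
        self.status: Dict[str, str] = {}

    def apply(self, event: dict) -> bool:
        entity_id = event["entity_id"]
        # Anything at or below the last applied sequence is old news.
        if event["sequence"] <= self.last_sequence.get(entity_id, 0):
            return False
        self.last_sequence[entity_id] = event["sequence"]
        self.status[entity_id] = event["status"]
        return True


projection = LatestStatusProjection()
projection.apply({"entity_id": "o-1", "sequence": 2, "status": "shipped"})
# A late-arriving older event is ignored rather than overwriting newer state.
projection.apply({"entity_id": "o-1", "sequence": 1, "status": "placed"})
```

Dropping stale events only suits "latest value wins" state. Handlers that need every event in order would have to buffer or reject gaps instead.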

Event Types

Domain events: something that happened in the business domain, such as OrderShipped or UserRegistered
Integration events: domain events published beyond a service boundary for other systems to consume
Commands: a request for something to happen, directed at a specific handler rather than broadcast
Queries: a request for current state, typically synchronous and outside the event-driven flow

Event Schema Design

Events should be self-contained and carry enough context for consumers to act without querying back. Include a unique event ID, the event type, version, timestamp, source service, and the entity ID at minimum. Avoid embedding references to mutable data whose meaning could change after the event is emitted; snapshot the values consumers need at the time of emission instead.
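A minimal envelope carrying those fields might look like the following. The class name `EventEnvelope` and its field names are assumptions for illustration, not a standard format.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventEnvelope:
    event_type: str   # e.g. "OrderPlaced"
    version: int      # schema version of `data`
    source: str       # emitting service
    entity_id: str    # the entity the fact is about
    data: dict        # snapshot of the values consumers need
    # Generated at creation time so every event is uniquely identifiable.
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


event = EventEnvelope(
    event_type="OrderPlaced",
    version=1,
    source="order-service",
    entity_id="o-1",
    data={"total_cents": 1250, "currency": "USD"},
)
wire_format = event.to_json()
```

The envelope is frozen so the metadata cannot be reassigned after creation, which mirrors the idea that an emitted event is an immutable fact.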

Schema Evolution

Use a schema registry to enforce compatibility rules between producers and consumers. Backward-compatible changes such as adding optional fields are safe. Removing fields or changing types is a breaking change that requires a versioned event type and a migration period where both versions are published in parallel.
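During a migration period a consumer may need to accept both versions. The sketch below assumes a hypothetical breaking change in which version 1 carried `amount` as a decimal number and version 2 carries `amount_cents` as an integer, plus an optional `coupon` field added later; none of these names come from a real schema.

```python
def read_order_placed(event: dict) -> dict:
    """Normalise either known version of OrderPlaced into one internal shape."""
    data = event["data"]
    version = event["version"]

    if version == 1:
        # Breaking change handled explicitly: old events used decimal units.
        amount_cents = round(data["amount"] * 100)
    elif version == 2:
        amount_cents = data["amount_cents"]
    else:
        # Fail loudly instead of silently misreading an unknown schema.
        raise ValueError(f"unsupported OrderPlaced version: {version}")

    return {
        "order_id": data["order_id"],
        "amount_cents": amount_cents,
        # Optional field: absence is fine, so this addition stays compatible.
        "coupon": data.get("coupon"),
    }


old = {"version": 1, "data": {"order_id": "o-1", "amount": 12.5}}
new = {"version": 2, "data": {"order_id": "o-1", "amount_cents": 1250, "coupon": "SPRING"}}
```

Both inputs normalise to the same `amount_cents`, so the rest of the consumer does not need to know which version arrived.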

Idempotent Consumers

Because duplicate delivery is normal, every consumer must be idempotent. Track processed event IDs in a database, use upsert semantics instead of insert, and design state transitions that are safe to apply more than once. An idempotency key derived from the event ID is the simplest approach. Record the key in the same transaction as the state change; otherwise a crash between the two writes can still lose or double-apply an event.
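As a sketch of that approach, the handler below uses Python's built-in `sqlite3` module, with the event ID as a primary key and an upsert for the state change, both inside one transaction. The table names and event fields are hypothetical, and the `ON CONFLICT ... DO UPDATE` syntax assumes SQLite 3.24 or newer.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE balances (account_id TEXT PRIMARY KEY, cents INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def handle_payment(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event once; return False if it was already processed."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # The primary key rejects a second insert of the same event ID.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event["event_id"],),
            )
            conn.execute(
                "INSERT INTO balances (account_id, cents) VALUES (?, ?) "
                "ON CONFLICT(account_id) DO UPDATE SET cents = cents + excluded.cents",
                (event["account_id"], event["cents"]),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: nothing was changed


conn = make_db()
payment = {"event_id": "evt-1", "account_id": "acct-1", "cents": 500}
first = handle_payment(conn, payment)
second = handle_payment(conn, payment)  # redelivery of the same event
```

Because the duplicate check and the balance update share a transaction, a redelivered event rolls back as a unit and the balance is credited only once.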

Event Sourcing

Event sourcing takes the pattern further by storing every state change as an immutable event rather than overwriting a row. The current state of an entity is derived by replaying its event log. This gives a complete audit trail and enables temporal queries but adds complexity to reads and projections.
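Replaying a log to derive state can be sketched as a simple fold. The `AccountEvent` type and its two event kinds are hypothetical, and the `upto` parameter is one simple way to approximate a temporal query by position in the log.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AccountEvent:
    kind: str   # "Deposited" or "Withdrawn"
    cents: int


def apply(balance: int, event: AccountEvent) -> int:
    """Pure transition function: old state plus one event gives new state."""
    if event.kind == "Deposited":
        return balance + event.cents
    if event.kind == "Withdrawn":
        return balance - event.cents
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events: List[AccountEvent], upto: Optional[int] = None) -> int:
    """Fold the first `upto` events (all of them if None) into a balance."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


log = [
    AccountEvent("Deposited", 1000),
    AccountEvent("Withdrawn", 300),
    AccountEvent("Deposited", 50),
]
current = replay(log)             # state after every event
historic = replay(log, upto=2)    # state as it was after the second event
```

Production systems typically add snapshots and precomputed projections so that reads do not replay the whole log each time, which is the read-side complexity mentioned above.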

Choreography vs Orchestration

In choreography, each service reacts to events and emits its own events in response, with no central coordinator. In orchestration, a central saga or workflow engine directs each step explicitly. Choreography tends to scale better and is more loosely coupled, but the overall workflow is implicit and harder to trace. Orchestration is easier to reason about, but the coordinator becomes a central dependency that can turn into a bottleneck.
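The difference is easier to see side by side. In the sketch below, the step names, `emit`, and `run_order_saga` are all hypothetical, everything runs in-process, and compensation for failed steps is left out for brevity.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# --- Choreography: each handler reacts to one event and emits the next. ---
handlers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)
trace: List[str] = []


def emit(event_type: str, order_id: str) -> None:
    trace.append(event_type)
    for handler in handlers[event_type]:
        handler(order_id)


# No component knows the whole workflow; it emerges from the subscriptions.
handlers["OrderPlaced"].append(lambda oid: emit("PaymentSucceeded", oid))
handlers["PaymentSucceeded"].append(lambda oid: emit("OrderShipped", oid))
emit("OrderPlaced", "o-1")


# --- Orchestration: one coordinator owns the sequence of steps. ---
Step = Tuple[str, Callable[[str], bool]]


def run_order_saga(order_id: str, steps: List[Step]) -> List[str]:
    completed: List[str] = []
    for name, step in steps:
        if not step(order_id):
            # A real saga would run compensating actions for `completed` here.
            return completed + [f"{name}:failed"]
        completed.append(name)
    return completed


happy = run_order_saga("o-1", [("charge", lambda oid: True), ("ship", lambda oid: True)])
failed = run_order_saga("o-1", [("charge", lambda oid: False), ("ship", lambda oid: True)])
```

In the choreographed half the order of steps is only visible by following the subscriptions, while in the orchestrated half it is written down in one list.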

Distributed Tracing

A single user action in an event-driven system can fan out to many async handlers across services. Propagate a correlation ID or trace context through every event so that all downstream processing can be linked back to the originating request in your observability platform.
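Propagation can be as simple as copying an ID from the triggering event into every event it causes. The field names `correlation_id` and `causation_id` and the helper `new_event` below are assumptions for illustration; real systems often carry an equivalent trace context in message headers instead.

```python
import uuid
from typing import Optional


def new_event(event_type: str, data: dict, cause: Optional[dict] = None) -> dict:
    """Create an event, inheriting the correlation ID from its cause if any."""
    return {
        "event_id": str(uuid.uuid4()),
        "event_type": event_type,
        # Same for every event in the chain started by one user action.
        "correlation_id": cause["correlation_id"] if cause else str(uuid.uuid4()),
        # Points at the direct parent, which lets tooling rebuild the tree.
        "causation_id": cause["event_id"] if cause else None,
        "data": data,
    }


placed = new_event("OrderPlaced", {"order_id": "o-1"})
paid = new_event("PaymentSucceeded", {"order_id": "o-1"}, cause=placed)
shipped = new_event("OrderShipped", {"order_id": "o-1"}, cause=paid)
```

Filtering logs by the shared `correlation_id` returns everything triggered by the original request, and following `causation_id` links shows which event led to which.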

When Not to Use It

Event-driven architecture adds operational complexity that is not always justified. For simple CRUD applications, synchronous REST or RPC is easier to reason about, debug, and test. Reach for events when you genuinely need fan-out, decoupling across team boundaries, or an audit log of state changes.

Interview Tip

Explicitly mention eventual consistency and idempotent consumers. Also distinguish between choreography and orchestration to show you understand the trade-offs in coordinating multi-step workflows across services.
