Centralized logging is not enough; we needed user event trails across services.

Tools:
- Kafka with Avro schema
- Unique trace IDs
- Schema registry for evolution

This helped correlate issues, even weeks after they occurred.
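
As a rough illustration of how a unique trace ID can tie events from different services into one user trail, here is a minimal sketch. It uses an in-memory list in place of Kafka and a plain dataclass in place of an Avro record, so nothing here reflects the actual pipeline. The names `UserEvent`, `new_trace_id`, `build_trails`, and all field names are hypothetical, chosen only for this example.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserEvent:
    """Hypothetical event envelope; the field names are assumptions."""
    trace_id: str      # shared by every event caused by one user action
    service: str       # service that emitted the event
    action: str        # what happened
    timestamp_ms: int  # event time in milliseconds


def new_trace_id() -> str:
    """Create a unique trace ID, typically once at the edge of the system."""
    return uuid.uuid4().hex


def build_trails(events):
    """Group events by trace_id and order each trail by timestamp."""
    trails = defaultdict(list)
    for event in events:
        trails[event.trace_id].append(event)
    return {
        trace_id: sorted(trail, key=lambda e: e.timestamp_ms)
        for trace_id, trail in trails.items()
    }


def demo():
    """Rebuild one user's trail from events that arrived out of order."""
    trace_a = new_trace_id()
    trace_b = new_trace_id()
    events = [
        UserEvent(trace_a, "billing", "charge_failed", 3000),
        UserEvent(trace_b, "gateway", "login", 1500),
        UserEvent(trace_a, "gateway", "checkout_clicked", 1000),
        UserEvent(trace_a, "orders", "order_created", 2000),
    ]
    trail = build_trails(events)[trace_a]
    return [(e.service, e.action) for e in trail]
```

Calling `demo()` returns the `trace_a` events in time order (gateway, then orders, then billing), with the unrelated `trace_b` event left out. In a real setup the same grouping would presumably run over events consumed from Kafka topics and decoded with the registered Avro schema, which is what makes it possible to rebuild a trail long after the fact.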