Why an idempotency key isn’t an idempotency guarantee
It started with a one-line message from a finance team on a Tuesday afternoon: A handful of customers had been charged twice that day, and one was disputing a duplicate charge with their bank. I went straight to the monitoring, expecting to find something broken. Instead, everything looked healthy: By the system’s own records, every order had been paid exactly once.

It took the team a month of digging through production incidents to close the gap between a dashboard that said, “all good,” and a customer billed twice. I’ve since seen this kind of failure across multiple payment systems, some handling hundreds of thousands of transactions a day.

What follows is a composite and doesn’t describe any single system or organization. The numbers, timings and identifying details have been changed to keep anything proprietary out.

The retry that charged twice

A customer clicked Pay; the order service called the payment service, which called the external provider. The provider charged the card fo