Something happened this month with Romania’s ING Bank. I’m sure you’re aware of it. They managed to execute several (well, maybe more than just several) transactions more than once. Well, shit happens, I guess. They eventually fixed it. At least they say so. I choose to believe them.
This unfortunate event triggered a memory of my first time working in a mission-critical environment where certain operations were supposed to be executed exactly, absolutely, only once. It was for a German company, back in 2013. I am not allowed to mention or make any reference to them or the project, so let’s anonymously call them Weltschmerz Inc. It went something like this (oversimplified diagram):
I don’t claim that ING’s systems can be oversimplified to this level, but for the sake of the argument, and to protect the so-called Weltschmerz Inc., let’s go with the banking example.
The trusted actor is me, using a payment instrument that allows me to initiate a transaction (it can be me using my card, or me being authenticated in any of their systems).
The trusted application is the initial endpoint where I place the input describing my transaction (it can be a POS, an e-banking application, anything).
The Mission-Critical Operation is the magic. Somehow, the application (be it POS, e-banking, whatever) knows how to construct such a dangerous operation.
The trick is that whoever handles the execution of this operation must do it exactly, absolutely, only once. If the trusted application has a bug, suffers an attack or some other misfortune, and generates two consecutive identical operations, one of them will never get executed. If I make a silly mistake and am somehow allowed to quickly press a button twice, or if the e-banking application or POS undergoes an attack, the second operation will be invalid. If anyone tries to pull off a replay attack, it will still not work.
How to tackle this? Well, there are a lot of solutions to this problem. Most of them gravitate around cryptography and efficient searching. Here’s the approach we took back then:
Digitally signing the operation: necessary in order to obtain a trusted fingerprint of the operation, the perfect unique identifier of the operation.
I understand, it is not easy to accommodate a digital signature ecosystem inside your infrastructure. There’s a lot of trust, PKI + certificates, guns, doors, locks, bureaucracy and shit to handle. It is expensive, but that’s life; unfortunately there is no other way around it.
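To make the fingerprint idea concrete, here is a minimal sketch. It is not what we ran back then. It assumes the operation is a plain dictionary, and it uses an HMAC from the Python standard library as a stand-in for a real digital signature, which would need an asymmetric scheme and the PKI mentioned above. The key, the function names and the operation fields are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical key, for illustration only. A real deployment would use an
# asymmetric signature scheme with the private key kept in an HSM.
SIGNING_KEY = b"demo-key-not-for-production"


def canonical_bytes(operation: dict) -> bytes:
    """Serialise deterministically, so equal operations give equal bytes."""
    return json.dumps(operation, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(operation: dict, key: bytes = SIGNING_KEY) -> str:
    """Keyed SHA-256 fingerprint (HMAC) of the canonical operation."""
    return hmac.new(key, canonical_bytes(operation), hashlib.sha256).hexdigest()


def verify(operation: dict, tag: str, key: bytes = SIGNING_KEY) -> bool:
    """Constant-time check that the tag belongs to this exact operation."""
    return hmac.compare_digest(fingerprint(operation, key), tag)
```

The canonical serialisation matters as much as the key: two operations that mean the same thing must produce the same bytes, otherwise a duplicate slips through with a different fingerprint.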
Storing and partitioning: this signed version is stored wherever. However, its signed hash must be partitioned based on variables that derive from the business itself. If we consider banking, and if we speculate, we could come up with: time of the operation, identified recipient, initiator, requested value, actual value, and so many more possibilities. This partitioning is needed because, well, theory and practice tell us that “unicity has no value unless confined”. If you are a very young developer, keep that in mind; it will save you some trouble later in life.
Storing this hash uniquely inside a partition is easy now. It is ultimately just a careful comparison between the hashes already inside a partition and the hash of the new operation, which is a candidate for execution.
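The uniqueness check inside a partition can be sketched like this. It is an in-memory, single-process illustration only. The class name and the shape of the partition key (initiator, recipient, day) are my assumptions for the banking example, not a description of any real system.

```python
from collections import defaultdict


class PartitionedStore:
    """Keeps operation hashes grouped by a business-derived partition key."""

    def __init__(self) -> None:
        # partition key -> set of hashes already admitted in that partition
        self._partitions: dict[tuple, set[str]] = defaultdict(set)

    def admit(self, partition_key: tuple, op_hash: str) -> bool:
        """Return True if the hash is new in its partition, False if it is a duplicate."""
        seen = self._partitions[partition_key]
        if op_hash in seen:
            return False
        seen.add(op_hash)
        return True


store = PartitionedStore()
key = ("me", "shop", "2013-05-01")           # hypothetical partition key
first = store.admit(key, "abc123")           # new hash, accepted
second = store.admit(key, "abc123")          # same hash, same partition, rejected
```

Because the lookup only touches one partition, the comparison stays cheap no matter how many operations the whole system has seen.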
Hint: be careful about including time in your partition. Time should not only be a part of the signed operation, but should also come from a separate, synchronised, independent clock. I’m sure you already know this.
If you do this partitioning and time handling by the book, no replay attack will ever work.
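One way to read the time hint, as a sketch under my own assumptions: the timestamp inside the signed operation picks the partition, so a replay always lands in the same bucket as the original, while the independent clock decides whether the operation is still fresh enough to be considered at all. The 30-second tolerance, the hourly bucket and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(seconds=30)  # hypothetical tolerance between the two clocks


def is_fresh(claimed: datetime, trusted_now: datetime,
             max_skew: timedelta = MAX_SKEW) -> bool:
    """Compare the signed timestamp against the independent, synchronised clock."""
    return abs(trusted_now - claimed) <= max_skew


def time_bucket(claimed: datetime) -> str:
    """Hourly partition bucket derived from the signed timestamp, in UTC."""
    return claimed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")


claimed = datetime(2013, 5, 1, 10, 59, 50, tzinfo=timezone.utc)
now = datetime(2013, 5, 1, 11, 0, 5, tzinfo=timezone.utc)
fresh = is_fresh(claimed, now)                           # 15 seconds apart
stale = is_fresh(claimed, now + timedelta(minutes=5))    # well outside tolerance
bucket = time_bucket(claimed)
```

Deriving the bucket from the trusted clock instead would let a replay that crosses an hour boundary land in an empty partition, which is exactly the kind of mistake the hint warns about.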
Execution: goes into all partitions that have something inside of them, gets the operations, does the magic. The magic does not include deleting the operation hash from the partition afterwards. It includes some other magic maker. I chose my words carefully here :). #ACID.
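The #ACID part can be illustrated with SQLite from the Python standard library. This is only a sketch of the idea, with hypothetical table names and a trivial ledger insert standing in for the magic: the hash is recorded and the operation is applied in one transaction, and the hash is never deleted afterwards.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) tables: executed hashes and a toy ledger."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS executed ("
        " partition_key TEXT NOT NULL,"
        " op_hash TEXT NOT NULL,"
        " PRIMARY KEY (partition_key, op_hash))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ledger ("
        " op_hash TEXT NOT NULL, amount INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def execute_once(conn: sqlite3.Connection, partition_key: str,
                 op_hash: str, amount: int) -> bool:
    """Record the hash and apply the operation atomically; False on a duplicate."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO executed (partition_key, op_hash) VALUES (?, ?)",
                (partition_key, op_hash),
            )
            # Stand-in for the real operation; it shares the same transaction.
            conn.execute(
                "INSERT INTO ledger (op_hash, amount) VALUES (?, ?)",
                (op_hash, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the primary key rejected a hash that was already executed
```

The primary key on (partition_key, op_hash) does the comparison for us, and the transaction guarantees there is never a recorded hash without its effect, or an effect without its recorded hash.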
There’s a lot more to it:
- signed hashes should be considered highly sensitive secrets, so an encryption mechanism must be employed. Key management in this case is an issue. That’s why you will probably need an HSM or some sort of similar vault for the keys and key derivatives
- choose your algorithms carefully. If you have no real expertise in cryptography, please call someone that does. Never assume anything here unless you really know how to validate your assumptions
- maintaining such an infrastructure comes with a cost. It’s not such a deal breaker, but it is to be considered.
Again, I am not claiming that ING Romania did anything less than the best in order to ensure singular execution; this article is not directly related to them. It is just a kind reminder that it is possible to design such a mission-critical environment for the singular execution of certain operations.
As for my experience, it was not in banking, but rather a more open environment. #Marine, #Navigation.
Cheers to us all.