Dirty Data

I was briefing the senior architect of an extremely complex DoD program a while back. Tens of millions of dollars slated for just this part of the system and far more for certain other connected elements. He is an altogether excellent guy and has been in IT for about twenty years.

So I say, as I outline how a couple subsystems should communicate, "It would be best, of course, if the two subsystems had a transactional connection, so that data integrity will be protected." He nodded and we discussed other options also, but I realized after a bit that we weren't quite communicating. I suggested that maybe we were using the term transactional differently and asked his definition. He offered that transactional programs were short in duration -- that was his concept.

Now listen -- this is awful. Every IT professional or manager should have the concept of transactionality clear in his mind, and especially a senior architect. I'm sure all readers of WITDW know all about this, but just in case just one or two don't, transactionality means that when some hunk of software starts executing and updates some data:
If you think it through, some of these stipulations sound impossible -- if the server loses power in the middle of updating your customer file to, say, lower the base interest rate and double the late charge fee, obviously half of the customers are updated and half aren't. The data is a mess. You can't roll the clock back and undo what's been done. Except you can. If you have magical widgets like write-ahead logs, image copies, two-phase commit, journals, and transaction-isolation tables. Which is exactly what database, transaction management, and application server products from IBM, BEA, Microsoft, and Oracle have inside them. Since about 1970.

Maybe It's Yoko's Fault

So technology has existed since before the Beatles broke up that enables the construction of systems that should never lose data and never have inconsistent data. Any of your corporate systems corrupted or lost data since Abbey Road? Maybe your IT guys forgot to use this technology?

This technology was, at first, limited to data on a single system (think a mainframe at a bank here) and subject to other technical limitations (TANSTAAFL), but within its applicable problem domains, it was a terrific improvement over either (a) trying to invent solutions to this problem, or, (b) having corrupt data.

Now back to my architect friend from above. What's this got to do with an interface between two subsystems? About fifteen years ago, distributed transactionality began to be implemented in the vendor products. With this technology, clever architects could design enterprise solutions that connected applications on different servers (and, eventually, using different vendors' technology) such that a business transaction could execute across multiple machines and still have the ACID features I defined above.
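For anyone who wants to see the all-or-nothing idea in miniature, here is a rough Python sketch of a two-phase commit across two subsystems. Everything in it is an assumption made for illustration: the `Participant` class, the `add_employee_everywhere` coordinator, and the `employee` table are hypothetical names, and each "system" is just an in-memory SQLite database. It is not how any vendor's transaction manager is built. In particular, a real one records each participant's prepared state in a durable log so it can recover from a crash between the two phases, which this toy does not attempt.

```python
import sqlite3


class Participant:
    """One hypothetical subsystem with its own database (in-memory SQLite)."""

    def __init__(self, name):
        self.name = name
        # isolation_level=None means we issue BEGIN/COMMIT/ROLLBACK ourselves.
        self.conn = sqlite3.connect(":memory:", isolation_level=None)
        self.conn.execute(
            "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def prepare(self, emp_id, emp_name):
        """Phase 1: do the work inside an open transaction, then vote."""
        try:
            self.conn.execute("BEGIN")
            self.conn.execute(
                "INSERT INTO employee (id, name) VALUES (?, ?)",
                (emp_id, emp_name),
            )
            return True   # vote "yes": ready to commit
        except sqlite3.Error:
            return False  # vote "no": this system cannot apply the change

    def commit(self):
        """Phase 2, success path: make the prepared work permanent."""
        self.conn.execute("COMMIT")

    def rollback(self):
        """Phase 2, failure path: undo any uncommitted work."""
        if self.conn.in_transaction:
            self.conn.execute("ROLLBACK")

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]


def add_employee_everywhere(participants, emp_id, emp_name):
    """Coordinator: commit on every system only if every system votes yes."""
    votes = [p.prepare(emp_id, emp_name) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


if __name__ == "__main__":
    systems = [Participant("payroll"), Participant("benefits")]
    print(add_employee_everywhere(systems, 1, "Ada"))
    # Make id 2 already exist on one system only, so its prepare step fails.
    systems[1].conn.execute("INSERT INTO employee (id, name) VALUES (2, 'Old')")
    print(add_employee_everywhere(systems, 2, "Bob"))
    print([(p.name, p.count()) for p in systems])
```

The point to notice is the second call: one system could have added the record, but because the other voted no, the coordinator rolls both back, so neither ends up with a half-applied business transaction.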
I will once again say that this technology is nearly magical -- a 'new employee' record could be created on, say, the Payroll system, the 401K system, the employee club system, and the PAC-dunning system (built on mainframes, Unix, Intel, and Linux, perhaps) from one data entry screen and when the response is displayed 'Employee added,' every system is current. The transaction managers on all four systems interoperate, no matter where they are on the network, to guarantee that either all or none of the systems add the record.

Is this technology always needed? Nah -- there are four or five ways of interfacing systems that are appropriate for different circumstances. The whole buzz-field of Enterprise Application Integration (EAI) was invented a few years ago by software vendors to give a name to various means of (usually) loose, risky application integration. But transactionality is the most rigorous way of connecting systems and when you need it, you need it.

Big Problem or Detail?

As always, the question is, how does this affect your enterprise? Is it a technical detail that only software and database engineers should worry about, or is it something senior technologists and management also need to think about? Well, it's the latter, unfortunately:
I'll close today's sermon with this comment: I believe the failure to take data integrity seriously enough in many enterprises has a couple roots: (a) it can be a subtle problem and frankly, a lot of IT decision makers don't want to tackle it. Better career path to make some pretty slides about a new application that is under development. I say that's wrong. 75,000 accounts in error is not OK. $20M is real money. That's what I think.
Copyright © 2005 Why IT Doesn't Work. Last modified: 11/28/2005. No Project Managers were harmed during construction of this site.