I like software - reading, tinkering, designing, coding. I have been doing so for 20 years or so and I would not mind continuing this for the foreseeable future. Fortunately for me, this is my profession as well and I have managed to get paid for this for some 14 years now. Although I do not have any strong bias for any business domain, I have been working with some pretty big names in the finance domain and you might get a hint of that from my entries. Partha is a DZone MVB and is not an employee of DZone and has posted 20 posts at DZone.

The Benefits of Cynicism in Coding

10.10.2012
We had no reason to be anxious about this component. It had been running for about a year. It handled around 1000 messages per day and emailed out an automated report twice a day. The solution was based on robust integration tools and technologies, namely TIBCO EMS for delivering messages and Spring Integration for reading and handling them. Everything was predictable, boring and nice.

And one morning everything changed. This component froze with a null pointer exception. Nothing more, nothing less. There were no logs. There never are when you need them. Nothing had changed in the code or in the mode of delivery. There were no obvious miscreants. The business had found out about the break - as one of the automated reports had failed - and was demanding an estimated time for the fix. It was a picture perfect start for the firefighters of the product team - and they poured out their first cup of coffee.

So, the team swung into action. Half a day later - after multiple calls with the business (none of them very pleasant, mind you) - it was suggested that a couple of messages among the 1000 or so might - just might - be missing a required field, which by the way was guaranteed to be there by the business processes. So we took these two messages off and switched on the component. Lo and behold, it crashed again. This time because there were many more messages than it could handle (remember, messages kept coming in while the team was troubleshooting the problem). I will not bore you with the multitude of calls that followed, and how a fix was arrived at and delivered. Suffice it to say that too many man hours were spent on this for my comfort. And this led me to write down my thoughts on it.

I am all for communications, meetings, workshops, and the creation of all sorts of requirements and design documents. I see the value in all of them. I really do - although I have been accused many a time of not doing so. But, at the end of the day, there is no substitute for a minimal amount of street smarts. A healthy amount of cynicism goes a long way in designing a resilient system. In this particular case, a couple of things had gone wrong.

1. We trusted the data quality of the feed coming in from a different system. And we should not have. No, this is not going to be written down in any book discussing integration patterns. It is just something that a seasoned developer would not do, but a new one - although as sharp as a tack - would slip up on. Folks had trusted the requirements document that guaranteed that certain fields would be populated. But the fact is, when the fields were not populated, it was not OK for our component to go down. A seasoned developer would have consulted the requirements document and developed to it - but would not have trusted it. He would have been cynical.

2. We trusted the data volume of the feed. And we should not have. Again, this was something written down in the document and hence the code was technically correct. But, if only the developer had said, "Hang on, if you are saying 1000 is the tops that you expect, fine, I will pull only 1000 at one go. If there are more, I will pull a second batch. And more batches if I need. But never more than 1000." we would have been fine. We should not have pulled all the data from the message queue on the assumption that there would be fewer than 1000 messages just because it was written down in the document. A seasoned developer would have been cynical of the document.
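Both points can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the actual fix that went into the component: an in-memory `Queue` stands in for the TIBCO EMS / Spring Integration plumbing, a message is modelled as a simple `Map`, and the names `CynicalConsumer`, `pullBatch`, `isValid` and `MAX_BATCH` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: the real component read from TIBCO EMS via Spring
// Integration; a plain in-memory Queue stands in for that here.
final class CynicalConsumer {

    // The documented "tops" is treated as a hard cap per pull, not as a promise.
    static final int MAX_BATCH = 1000;

    /** Outcome of one pull: messages safe to process, and messages set aside. */
    static final class BatchResult {
        final List<Map<String, String>> accepted = new ArrayList<>();
        final List<Map<String, String>> rejected = new ArrayList<>();
    }

    /** A message is usable only if the "guaranteed" field is really there. */
    static boolean isValid(Map<String, String> message, String requiredField) {
        if (message == null) {
            return false;
        }
        String value = message.get(requiredField);
        return value != null && !value.trim().isEmpty();
    }

    /**
     * Pulls at most MAX_BATCH messages from the queue. Messages missing the
     * required field are set aside for reporting instead of crashing the run.
     */
    static BatchResult pullBatch(Queue<Map<String, String>> queue, String requiredField) {
        BatchResult result = new BatchResult();
        int pulled = 0;
        while (pulled < MAX_BATCH && !queue.isEmpty()) {
            Map<String, String> message = queue.poll();
            pulled++;
            if (isValid(message, requiredField)) {
                result.accepted.add(message);
            } else {
                result.rejected.add(message);
            }
        }
        return result;
    }
}
```

A caller would invoke `pullBatch` repeatedly until the queue is empty, process each `accepted` list, and surface the `rejected` messages loudly (a log entry, an alert, a line in the report) rather than letting one bad record take the whole component down.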

The component is fixed and everything is back in business. It is no biggie. This was not the first time something like this happened and I am willing to wager that it will not be the last. The point that I am trying to make is that the business of software production is not - and perhaps will never be - like the production line of a hardware commodity. It is most unlikely to enjoy the stability, predictability, and repeatability of the production line of, say, a car. So, the proliferation of processes, documents and meetings is not going to be as successful in this business.

Processes are fine. Documents are fine. Productivity measuring tools and code quality metrics are great. Workshops are great. Peer reviews are a must. But they are quite unlikely to be a substitute for a person who loves coding, takes pride in it, and goes that extra mile to ensure that his code does not fail. These people will always be in short supply and in great demand. As an industry, sooner or later we will have to find a way to create, foster and retain these individuals.

That's it for today. Happy coding.
Published at DZone with permission of Partha Bhattacharjee, author and DZone MVB. (source)


Comments

Lund Wolfe replied on Sat, 2012/10/13 - 3:45am

Nice article.  The code is only as good as the developers.  Defensive coding that clutters the code is no substitute for a robust design.  Requirements steer the API and design but shouldn't dictate low level design or be interpreted as an implementation.  Best practices will dictate how to implement functionality and make it robust. 

Data quantity and quality are always highly suspect, and even Subject Matter Experts won't really know the real world state of the incoming data. The data should be validated and cleaned/formatted to make it consistent and safe for the system. If you need to make an assumption about the data, at least make it stick out like a sore thumb (in the code and to the app operators) and filter it out or halt the system on a failed precondition. Add robustness around the data and any interface to external systems, which are the risk/failure points that can bring your system down, or make it look like a defect in your system. You pay for quality development now or with costly production surprises later.
