2010-10-04

Time problems

Last week, I encountered two interesting problems.

Background: The product that I am building has a newly-developed licensing feature. The feature uses hardware information of the user's machine a la Windows activation. In addition, licenses are annual, so the program must commit suicide at the end of one year.

Problem 1: After the first successful run of the product, all attempts at running it again were failing with the error message `Current time older than that of the most recent run.' The time of last run is updated in the license file (which is an encrypted file) at the beginning of each run. After several attempts at debugging, I could find nothing. I switched to other work, and returned a few hours later. To my surprise, the program ran successfully, but just once! It then returned to the now-familiar error message.

I noticed that 4 hours had elapsed between the two successful runs. I suddenly remembered that I was storing the time of last run as UTC, and was signed in into my collaborator's computer at Toronto (UTC-4:00). Upon searching the code for all instances of storing/reading time, I noticed that conversions of local time to and from UTC were inconsistent. I made them consistent, and Problem 1 disappeared.

Problem 2: No sooner had I finished resolving the above than I needed to run the program - in batch - about 6000 times. I wrote a shell script to do that. I tested it for 10 runs. Surprise! The error message was back, only even more confusingly. Jobs 1 and 2 ran successfully. Job 3 failed with the above error. Job 4 ran successfully. Job 5 failed. Jobs 6 and 7 ran successfully. Job 8 failed. Jobs 9 and 10 ran successfully. The actual jobs that ran successfully changed but slightly each time I re-ran the shell script. It really left me scratching my head.

I then noticed that there was a pattern. The jobs before the failed ones completed really quickly. I inserted a `sleep 1' after each job in the shell script. Now, every job completed successfully!

I then checked the license verifier code. One of the important checks was that current time should be greater than that of the last run. However, resolution of time was in seconds. Thus, if a job completed in less than a second, its immediately next job would fail. So, I modified the condition to current time should not be less than that of the last run. The shell script ran happily ever after!

No comments: