2012-01-07

Braces, scopes and compilers

I ran into an interesting problem with my legacy chemistry application (not the new one). We have over 5,700 tests for the application. Each test includes a molecule that characterises a family of products and another of reactants. Outside those tests, we use several standard – as in drugs – molecules to test the capabilities of the program.

For some time now, due an error in the test scripts driver, the tests have not been actually getting run. The driver has falsely been reporting success for each test. Two days ago, I noticed that, and corrected the driver script. I then ran the corrected script. 5,743 of the 5,744 tests passed.

Test m5212, the failing molecule, was an interesting case. After some debugging, I localised the error to a file SRC/map.cpp, function initialmaps(). Here is the code segment in question.

4533:  if (ignoretop)
4534:      for (j = 1; j <= pmol.numat; ++j)
4535:        if ((uniqueinmol(pmol,j) || (pmol.at[j].top == j))
4536:          &&
4537:          (1 != pmol.at[j].unsat))
4538:              for (g = 1; g <= mol1.numat; ++g)
4539:                  if ((uniqueinmol(mol1,g) || mol1.at[g].top)
4540:                  &&
4541:                  ((pmol.at[j].attribute == mol1.at[g].attribute) ||
4542:                  ((pmol.at[j].atnum == mol1.at[g].atnum) &&
4543:                  pmol.at[j].heteroaromatic &&
4544:                  mol1.at[g].heteroaromatic) ||
4545:                  tautomeric(pmol,mol1,j,g)))
4546:                      if (fit(pmol,tempmap,j,g))
4547:                          {
4548:                          tempmap.addapair(pmol,j,g);
4549:                          nmap.AddToTail(tempmap);
4550:                          check_nummap_limit();
4551:                          tempmap.clear();
4552:                          }
4553:  else
4554:      for (j = 1; j <= pmol.numat; ++j)
4555:          if ((uniqueinmol(pmol,j) ||
4556:          (pmol.at[j].top == j) ||
4557:          ((6 != pmol.at[j].atnum) && !pmol.at[j].top) ||
4558:          ((gring = atring.ringcontaining(pmol,j)) &&
4559:          pmol.ringlist[gring].present(pmol.at[j].top)))
4560:          &&
4561:          (1 != pmol.at[j].unsat))
4562:              for (g = 1; g <= mol1.numat; ++g)
4563:                  if ((uniqueinmol(mol1,g) ||
4564:                  (mol1.at[g].top == g) ||
4565:                  ((rring = atring.ringcontaining(mol1,g)) &&
4566:                  (mol: 1.ringlist[rring].present(mol1.at[g].top))))
4567:                  &&
4568:                  ((pmol.at[j].attribute == mol1.at[g].attribute) ||
4569:                  ((pmol.at[j].atnum == mol1.at[g].atnum) &&
4570:                  pmol.at[j].heteroaromatic && mol1: .at[g].heteroaromatic)
4571:                  || tautomeric(pmol,mol1,j,g))) // ex.  MOL/: m2944
4572:                      if (fit(pmol,tempmap,j,g))
4573:                          {
4574:                          tempmap.addapair(pmol,j,g);
4575:                          nmap.AddToTail(tempmap);
4576:                          check_nummap_limit();
4577:                          tempmap.clear();
4578:                          }

Experienced programmers must have already realised the issue. For the novices, here is a little explanation. From the indentation, it appears as if the programmer's intention was to have the else on line 4553 act in conjunction with the if on line 4533. Unfortunately, though, C++ does not look at the code that way. It matches the else on line 4553 with the most recent unbalanced if. And, such an if occurs on line 4546. Here is how re-indented code looks.

4533:      if (ignoretop)
4534:          for (j = 1; j <= pmol.numat; ++j)
4535:              if ((uniqueinmol(pmol,j) || (pmol.at[j].top == j))
4536:                      &&
4537:                      (1 != pmol.at[j].unsat))
4538:                  for (g = 1; g <= mol1.numat; ++g)
4539:                      if ((uniqueinmol(mol1,g) || mol1.at[g].top)
4540:                              &&
4541:                              ((pmol.at[j].attribute == mol1.at[g].attribute) ||
4542:                               ((pmol.at[j].atnum == mol1.at[g].atnum) &&
4543:                                pmol.at[j].heteroaromatic &&
4544:                                mol1.at[g].heteroaromatic) ||
4545:                               tautomeric(pmol,mol1,j,g)))
4546:                          if (fit(pmol,tempmap,j,g))
4547:                          {
4548:                              tempmap.addapair(pmol,j,g);
4549:                              nmap.AddToTail(tempmap);
4550:                              check_nummap_limit();
4551:                              tempmap.clear();
4552:                          }
4553:                          else
4554:                              for (j = 1; j <= pmol.numat; ++j)
4555:                                  if ((uniqueinmol(pmol,j) ||
4556:                                              (pmol.at[j].top == j) ||
4557:                                              ((6 != pmol.at[j].atnum) && !pmol.at[j].top) ||
4558:                                              ((gring = atring.ringcontaining(pmol,j)) &&
4559:                                               pmol.ringlist[gring].present(pmol.at[j].top)))
4560:                                          &&
4561:                                          (1 != pmol.at[j].unsat))
4562:                                      for (g = 1; g <= mol1.numat; ++g)
4563:                                          if ((uniqueinmol(mol1,g) ||
4564:                                                      (mol1.at[g].top == g) ||
4565:                                                      ((rring = atring.ringcontaining(mol1,g)) &&
4566:                                                       (mol1.ringlist[rring].present(mol1.at[g].top))))
4567:                                                  &&
4568:                                                  ((pmol.at[j].attribute == mol1.at[g].attribute) ||
4569:                                                   ((pmol.at[j].atnum == mol1.at[g].atnum) &&
4570:                                                    pmol.at[j].heteroaromatic && mol1.at[g].heteroaromatic)
4571:                                                   || tautomeric(pmol,mol1,j,g))) // ex.  MOL/m2944
4572:                                              if (fit(pmol,tempmap,j,g))
4573:                                              {
4574:                                                  tempmap.addapair(pmol,j,g);
4575:                                                  nmap.AddToTail(tempmap);
4576:                                                  check_nummap_limit();
4577:                                                  tempmap.clear();
4578:                                              }

How nice! This disconnect between the programmer's intention and C++'s understanding of the code is – evidently – caused by the absence of braces { and } in appropriate places to delimit the scopes. For that – and to aid my own comprehension – I use braces always; even if the scope has only one statement.

Now, if I insert a { at the end of line 4533 and a } before the else on line 4553, test m5212 passes. But, four other tests, all of which pass otherwise, fail! They are m3651, m3747, m4750 and m5211. Sigh! This needs a deeper investigation.

2 comments:

Anonymous said...

Isn't this the notorious "dangling else" problem?

Unknown said...

Anonymous : Yes, it is.