2011-07-28

Performance improvement - 3

Hmm … a new addition to the set of ring detection algorithms brings the distance matrix back into the game. Over the last week, I interacted with potential users to assess the accuracy of the algorithm. They wanted an SSSR (Smallest Set of Smallest Rings) result for each input molecule, not the more exhaustive set that I had deemed necessary.

Accordingly, I implemented a new SSSR algorithm. It relies on eliminating contours that can be short-circuited. That, in turn, requires the shortest path between every pair of atoms to be determined first. So, the matrix is back!
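To make the pair-wise shortest path step concrete, here is a minimal sketch in Python. It is not my actual implementation. It assumes the molecule is given as an adjacency list of atom indices with unweighted bonds, and it runs one breadth-first search per atom. The names `distance_matrix` and `benzene` are made up for illustration.

```python
from collections import deque


def distance_matrix(adjacency):
    """Return all-pairs shortest path lengths, counted in bonds.

    `adjacency[i]` lists the atoms bonded to atom `i` (hypothetical input
    format). Unreachable pairs keep the value -1.
    """
    n = len(adjacency)
    dist = [[-1] * n for _ in range(n)]
    for source in range(n):
        row = dist[source]
        row[source] = 0
        queue = deque([source])
        # Plain breadth-first search: every bond has length 1.
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if row[neighbour] == -1:
                    row[neighbour] = row[atom] + 1
                    queue.append(neighbour)
    return dist


# A six-membered ring (benzene skeleton, hydrogens omitted).
benzene = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
matrix = distance_matrix(benzene)
print(matrix[0])
```

For a molecule with V atoms and E bonds this costs O(V * (V + E)) time and O(V^2) memory, so it is easy to see how it can come to dominate the running time when repeated for every molecule.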

Consequently, the running time for my test set of 19946 molecules is now in the range of 16.5-17 s. Preliminary profiling shows that building this matrix for each molecule takes almost 60% of the running time. I should investigate better spanning tree algorithms!
