Wednesday, March 10, 2010

Program Understanding

I’m starting research into Program Understanding and Program Comprehension. I am doing this because software maintenance is the portion of the software development lifecycle that takes the most time and money. Some studies claim that maintenance can consume 60 to 80% of the total cost of a software project.

I would like to create tools and techniques that help people get up to speed faster on legacy code bases. To me, every code base is legacy, or will be legacy shortly.

The first idea that I want to look at for understanding an arbitrary code base is finding how the classes are related. The goal is to get a feel for the overall structure of the code base. Is it a standard “n-tier” architecture, where all dependencies point downward? Is it a client-server code base, where two distinct pieces communicate with each other over a well-defined boundary? Or is it the classic “Big Ball of Mud” architecture that is common in tools that have grown organically since they were started?
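As a rough sketch of what such a check could look like, assume the class relationships have already been extracted into a dictionary that maps each class name to the set of classes it depends on, and that each class has been assigned a layer number by hand (higher numbers are closer to the user interface). The class names, the layer numbers, and both function names are hypothetical, made up purely for illustration. The first function reports dependencies that point upward, which an n-tier design should not have; the second looks for dependency cycles, one common symptom of a Big Ball of Mud.

```python
from collections import defaultdict


def upward_dependencies(deps, layer_of):
    """Return sorted (source, target) pairs where source depends on a higher layer.

    Dependencies within the same layer are allowed in this sketch.
    """
    violations = []
    for source, targets in deps.items():
        for target in targets:
            if layer_of[target] > layer_of[source]:
                violations.append((source, target))
    return sorted(violations)


def has_cycle(deps):
    """Return True if the dependency graph contains a cycle (depth-first search)."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = defaultdict(int)  # every class starts as UNVISITED

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in deps.get(node, ()):
            if state[nxt] == IN_PROGRESS:
                return True  # reached a class that is still on the current path
            if state[nxt] == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[name] == UNVISITED and visit(name) for name in list(deps))


# Hypothetical three-tier example: page -> service -> repository.
layers = {"OrderPage": 2, "OrderService": 1, "OrderRepository": 0}
layered = {
    "OrderPage": {"OrderService"},
    "OrderService": {"OrderRepository"},
    "OrderRepository": set(),
}

# The same code base after the repository starts calling back into the page.
muddy = dict(layered)
muddy["OrderRepository"] = {"OrderPage"}

print(upward_dependencies(layered, layers), has_cycle(layered))
print(upward_dependencies(muddy, layers), has_cycle(muddy))
```

Extracting the dependency dictionary from real source code is the hard part and is not shown here. The recursive search is also only suitable for small graphs; a large code base would need an iterative version.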