Dependence Communities in Source Code

October 01, 2012 - 00:00

The concept of community structure arises from the analysis of social networks in sociology. Community structure can be found in many real world graphs other than social networks. Recently, efficient community detection algorithms have been developed which can cope with very large graphs with millions of nodes and potentially billions of edges. So, for the first time, there is the potential for investigating communities in real industrial-strength software at the statement level. We provide empirical evidence that dependence between statements in software does, indeed, give rise to community structure. Initial findings suggest that the separate dependence communities are far from arbitrary. They appear to decompose systems into areas of distinct functionality. This new approach to system decomposition has tremendous potential in many areas of software engineering, particularly in reverse engineering of legacy software and program comprehension.

In Proceedings of the IEEE 28th International Conference on Software Maintenance, Early Research Achievements Track