Toward a Taxonomy of Clones in Source Code: A Case Study

Authors: Cory J. Kapser, Michael W. Godfrey

Venue: 2003 Intl. Workshop on Evolution of Large-scale Industrial Software Applications

Year: 2003

Abstract: Code cloning (that is, the gratuitous duplication of source code within a software system) is an endemic problem in large, industrial systems [9, 7]. While there has been much research into techniques for clone detection and analysis, there has been relatively little empirical study on characterizing how, where, and why clones occur in industrial software systems. In this paper, we present a preliminary categorization scheme for code clones, and we discuss how we have applied this taxonomy in a case study performed on the file system subsystem of the Linux operating system. Our case study yielded several interesting results, including that cloning is rampant both within particular file system implementations and across different ones, and that as many as 13% of the 4407 functions that are more than six lines long were involved in a clone-pair relationship.

BibTeX:

@inproceedings{coryj.kapser2003tatociscacs,
    author = "Cory J. Kapser and Michael W. Godfrey",
    title = "Toward a Taxonomy of Clones in Source Code: A Case Study",
    year = "2003",
    booktitle = "Proc. of the 2003 Intl. Workshop on Evolution of Large-scale Industrial Software Applications"
}

Plain Text:

Cory J. Kapser and Michael W. Godfrey, "Toward a Taxonomy of Clones in Source Code: A Case Study," 2003 Intl. Workshop on Evolution of Large-scale Industrial Software Applications.