Studying the impact of evolution in R libraries on software engineering research

Authors: Catherine Ramirez, Meiyappan Nagappan, Mehdi Mirakhorli

Venue: 2015 IEEE 1st International Workshop on Software Analytics (SWAN), pp. 29--30, 2015

Year: 2015

Abstract: Empirical software engineering has become an integral and important part of software engineering research in both academia and industry. Every year, several new theories are empirically validated by mining and analyzing historical data from open source and closed source projects. Researchers rely on statistical libraries in tools like R, Weka, SAS, SPSS, and Matlab for their analysis. However, these libraries, like any software library, undergo periodic maintenance. Such maintenance may improve performance, but it may also alter the core algorithms behind the library. If the core algorithms are indeed changed, then the empirical results compiled with previous versions may no longer be current. However, this problem exists only if (a) statistical libraries are constantly edited and (b) the results they produce differ from one version to another. Hence, in this paper, we first explore whether the above two conditions hold for one library in the statistical package R. We find that both conditions are true in the case of the randomForest method in the randomForest package.

BibTeX:

@inproceedings{catherineramirez2015stioeirloser,
    author = "Catherine Ramirez and Meiyappan Nagappan and Mehdi Mirakhorli",
    title = "Studying the impact of evolution in R libraries on software engineering research",
    year = "2015",
    pages = "29--30",
    booktitle = "Proc. of 2015 IEEE 1st International Workshop on Software Analytics (SWAN)"
}

Plain Text:

Catherine Ramirez, Meiyappan Nagappan, and Mehdi Mirakhorli, "Studying the impact of evolution in R libraries on software engineering research," 2015 IEEE 1st International Workshop on Software Analytics (SWAN), pp. 29--30, 2015.