**Authors:**
Catherine Ramirez
Meiyappan Nagappan
Mehdi Mirakhorli

**Venue:**
2015 IEEE 1st International Workshop on Software Analytics (SWAN), pp. 29--30, 2015

**Year:**
2015

**Abstract:** Empirical software engineering has become an integral and important part of software engineering
research in both academia and industry. Every year several new theories are empirically validated by mining
and analyzing historical data from open source and closed source projects. Researchers rely on statistical
libraries in tools like R, Weka, SAS, SPSS, and Matlab for their analysis. However, these libraries, like any
software library, undergo periodic maintenance. Such maintenance can be to improve performance, but can also
be to alter the core algorithms behind the library. If indeed the core algorithms are changed, then the
empirical results that have been compiled with the previous versions may not be current anymore. However,
this problem exists only if (a) statistical libraries are constantly edited and (b) the results they produce
are different from one version to another. Hence, in this paper, we first explore whether either of the above two
conditions holds for one library in the statistical package R. We find that both conditions are true in
the case of the randomForest method in the randomForest package.

**BibTeX:**

```
@inproceedings{catherineramirez2015stioeirloser,
author = "Catherine Ramirez and Meiyappan Nagappan and Mehdi Mirakhorli",
title = "Studying the impact of evolution in R libraries on software engineering research",
year = "2015",
pages = "29--30",
booktitle = "Proc. of 2015 IEEE 1st International Workshop on Software Analytics (SWAN)"
}
```

**Plain Text:**

`Catherine Ramirez, Meiyappan Nagappan, and Mehdi Mirakhorli, "Studying the impact of evolution in R libraries on software engineering research," 2015 IEEE 1st International Workshop on Software Analytics (SWAN), pp. 29--30`