CPPX is a free, open-source, general-purpose parser and fact extractor for C++. It relies on the preprocessing, parsing, and semantic analysis of GNU g++, and produces a graph according to the Datrix fact model, in GXL, TA, or VCG format, suitable for use by architecture recovery, data flow analysis, pointer analysis, program slicing, query techniques, source code visualization, object recovery, restructuring, refactoring, remodularization, and the like.


SWAG Kit is a toolkit developed by the Software Architecture Group at the University of Waterloo that can be used to extract, abstract, and explore software architectures. Currently, Swagkit supports the extraction of C/C++ code, abstraction to the architectural level, and presentation in landscape form. Swagkit has been used to visualize the Linux operating system, the VIM editor, a variety of Unix shells (Bash, C shell, and others), and other software. Swagkit currently runs under x86 Linux. A port to Solaris is possible, but not currently available.


QLDX is a reverse engineering toolkit for exploring and visualizing software architectures. It comes with its own fact extractors (ldx and bfx), a fact base query and manipulation language (QL/JGrok), and visualization software (LSEdit).


ASX is a fact extraction tool that extracts source information from C, C++, assembler, object files, libraries, dynamic libraries, and executables, in a format that can be visualised immediately using lsedit.

Javex: Java Fact Extractor

Javex is a fact extractor for Java that extracts facts from Java class files.


Javap2 pretty-prints the contents of a Java class file.


Xcise is a C/C++ dead code detection and elimination tool that uses as its input the output of the ASX fact extraction tool.
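The description above does not say how Xcise decides that code is dead. One common approach, shown here purely as an illustrative sketch, treats extracted call facts as a graph and reports every function that cannot be reached from the program's entry points. The function name `unreachable_functions`, the `(caller, callee)` pair format, and the sample facts are all assumptions made for this example; they are not the ASX output format or the Xcise implementation.

```python
from collections import deque


def unreachable_functions(calls, roots):
    """Return the functions that no root can reach through call edges.

    `calls` is an iterable of (caller, callee) pairs, a hypothetical
    stand-in for call facts produced by a fact extractor.
    """
    successors = {}
    functions = set(roots)
    for caller, callee in calls:
        successors.setdefault(caller, set()).add(callee)
        functions.update((caller, callee))

    # Breadth-first traversal from the entry points.
    reached = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for callee in successors.get(current, ()):
            if callee not in reached:
                reached.add(callee)
                queue.append(callee)

    # Anything mentioned in the facts but never reached is a dead-code candidate.
    return functions - reached


calls = [("main", "parse"), ("parse", "lex"), ("old_init", "legacy_helper")]
print(sorted(unreachable_functions(calls, ["main"])))
```

A real tool also has to account for calls through function pointers and for symbols referenced from outside the program, which a plain reachability pass like this one ignores.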


This tool (mainpath) can be used either in conjunction with lsedit or as a standalone program to show call paths from the mainline to specified functions. To use it with lsedit, set up the invocation of mainpath as a command, complete with a reference to the appropriate TA file, and use the macro capability to pass the name of the function of interest as the second parameter to mainpath.
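To make the idea of "call paths from the mainline to a specified function" concrete, the sketch below enumerates the simple (cycle-free) paths between two functions in a small call graph. It is only an illustration of the concept: the `call_paths` function, the pair-based input, and the sample graph are hypothetical, and nothing here reflects how mainpath reads a TA file or formats its output.

```python
def call_paths(calls, start, target):
    """List every cycle-free call path from `start` to `target`.

    `calls` is a list of (caller, callee) pairs, assumed for this example.
    """
    successors = {}
    for caller, callee in calls:
        successors.setdefault(caller, []).append(callee)

    paths = []

    def walk(node, path):
        if node == target:
            paths.append(list(path))
            return
        for callee in successors.get(node, []):
            if callee not in path:  # do not follow recursive cycles
                path.append(callee)
                walk(callee, path)
                path.pop()

    walk(start, [start])
    return paths


calls = [
    ("main", "init"),
    ("main", "run"),
    ("init", "log"),
    ("run", "log"),
    ("log", "run"),
]
for path in call_paths(calls, "main", "log"):
    print(" -> ".join(path))
```

Enumerating all simple paths can grow exponentially on large call graphs, so a practical tool would likely bound the search or report only the shortest paths.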


PROWL: Converting BASH scripts to use XML Configuration files


Bash2py is a Bash-to-Python script translator, implemented by modifying the open-source C code of bash 4.3.30 so that, instead of executing bash commands, the modified tool emits (to the extent currently possible) equivalent Python statements for the commands it sees.

PBS: The Portable Bookshelf

The Software Bookshelf is a web-based paradigm for the presentation and navigation of information representing large software systems. The Portable Bookshelf (PBS) is one implementation of this concept. The PBS Toolkit is our set of tools for the generation of a PBS Bookshelf.

LSEdit: The Graphical Landscape Editor

LSEdit is a tool for viewing, manipulating, querying, laying out, and clustering higraphs. An overview of LSEdit's capabilities can be found here.

BTV: Build Time View

BTV stands for Build Time Architecture View, which captures the structural and behavioral properties that are apparent only at system build time.


Grok is a programming language designed for manipulating collections of binary relations. The initial version of Grok was created by Dr. Ric Holt in 1995, and it has since evolved into a language for manipulating factbases. Grok operates at the level of a relational database, in that operators generally apply across entire relations, and not just to single entities. The Grok interpreter has been optimized to handle large factbases (up to several hundred thousand facts, or tuples). It keeps all of its data structures in memory. Grok is written in the Turing language.
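The notion of operators that apply across entire relations can be illustrated with a few lines of Python. This is not Grok syntax; it is a sketch that models each binary relation as a set of pairs and defines relational composition, inverse, and transitive closure over whole sets. The relation names (`contain`, `call`) and the sample facts are invented for the example.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def inverse(r):
    """Swap the two columns of a binary relation."""
    return {(b, a) for (a, b) in r}


def transitive_closure(r):
    """Repeatedly extend r with longer chains until nothing new appears."""
    closure = set(r)
    while True:
        extended = closure | compose(closure, r)
        if extended == closure:
            return closure
        closure = extended


# Hypothetical facts: which module contains which function, and who calls whom.
contain = {("fs", "open"), ("mm", "alloc")}
call = {("open", "alloc")}

# Lift function-level calls to module-level dependencies.
module_call = compose(compose(contain, call), inverse(contain))
print(module_call)

print(sorted(transitive_closure({("main", "parse"), ("parse", "lex")})))
```

Lifting low-level facts to a higher level of containment, as in `module_call`, is the kind of whole-relation manipulation that architecture abstraction depends on.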


QL is a Java re-implementation of Grok, written by Jingwei Wu at the University of Waterloo. It serves essentially the same purpose as Grok but is not identical to it. Although QL is slower than Grok, it compensates with new operators and built-in commands.

JCD: Java Clone Detector

This program tries to detect candidate clones that arise in Java class files, by matching pcode against pcode.

DEXCD: Dex Clone Detector

This program tries to detect candidate clones that arise in dex class files, by matching pcode against pcode.

ACD: Assembler Clone Detector

This program tries to detect candidate clones that arise in assembler code, typically derived from C and C++ source code. It can perform clone detection on C, C++, and assembler produced from other source languages. It only runs on platforms that generate Linux gcc/g++ assembler for the Intel machine language.

CLICS: CLoning Analysis and Categorization System

The CLoning Analysis and Categorization System (CLICS) is a tool developed for the investigation of duplication of code within a software system. It categorizes duplicates according to a documented taxonomy, and provides multiple methods of exploration to guide users to data relevant to their task.


Beagle is a research tool for exploring software evolution. It incorporates techniques from reverse engineering, visualization, and databases, and builds a platform where users can navigate through and annotate software releases to gain a better understanding of them.


Perses is a language-agnostic program reducer that minimizes a program with respect to a set of constraints. It takes as input a program to reduce and a test script that specifies the constraints. It outputs a minimized program that still satisfies the constraints specified in the test script. Compared to Delta Debugging and Hierarchical Delta Debugging, Perses leverages the syntax information in the ANTLR grammar and prunes the search space by avoiding the generation of syntactically invalid programs.
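The general shape of test-driven program reduction can be sketched as follows. This is deliberately not the Perses algorithm: it is a naive greedy reducer that drops one line at a time and keeps any candidate the check still accepts, with a Python predicate (`still_interesting`, a hypothetical name) standing in for the test script. Because it knows nothing about syntax, it will try candidates that a grammar-aware reducer would never generate.

```python
def reduce_lines(lines, still_interesting):
    """Greedily remove lines while the predicate keeps accepting the result."""
    current = list(lines)
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if still_interesting(candidate):
                current = candidate  # keep the smaller program
                changed = True
            else:
                i += 1  # this line is needed; move on
    return current


# Hypothetical input program and constraint: the reduced program must
# still contain the division by zero.
program = ["int a = 1;", "int b = 2;", "int c = a / 0;", "print(b);"]


def still_interesting(candidate):
    return any("/ 0" in line for line in candidate)


reduced = reduce_lines(program, still_interesting)
print(reduced)
```

In practice the predicate would run the real test script on each candidate, so every rejected candidate costs a full test execution; avoiding syntactically invalid candidates is one way to cut that cost.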