Papers & Posters

*Award winning

Tools / Downloads

Makefile Corpus

As part of my PhD research, I compiled a corpus of approximately 20,000 Makefiles from 271 open source projects, ranging from the hand-written Makefiles of Android and the Linux kernel to Makefiles generated by CMake, QMake, and Automake. The Makefiles were collected from the latest version of each project, for projects updated between January 2010 and October 2015.

Download Makefile Corpus

Makefile Framework

In our work studying Makefiles, we developed a framework for statically analyzing them using TXL. The framework consists of a Makefile grammar for TXL, TXL rules for extracting and counting features, as well as some scripts to run them.
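To give a rough idea of what "extracting and counting features" means, here is a minimal Python sketch. It is not the TXL framework itself: the regular expressions, the feature names, and the `count_features` function are all assumptions made for illustration, and a real Makefile grammar handles many cases (line continuations, target-specific variables, pattern rules) that these patterns ignore.

```python
import re
from collections import Counter

# Hypothetical, simplified feature patterns (the actual framework uses a TXL grammar).
INCLUDE = re.compile(r"^\s*-?include\b")
CONDITIONAL = re.compile(r"^\s*(?:ifeq|ifneq|ifdef|ifndef)\b")
ASSIGNMENT = re.compile(r"^\s*[A-Za-z_][A-Za-z0-9_.]*\s*(?::=|\?=|\+=|=)")
RULE = re.compile(r"^[^\s#:=][^:=]*:(?!=)")


def count_features(makefile_text):
    """Count a few coarse Makefile features, one per non-recipe line."""
    counts = Counter()
    for line in makefile_text.splitlines():
        # Recipe lines start with a tab; comment lines start with '#'.
        if line.startswith("\t") or line.lstrip().startswith("#"):
            continue
        if INCLUDE.match(line):
            counts["include"] += 1
        elif CONDITIONAL.match(line):
            counts["conditional"] += 1
        elif ASSIGNMENT.match(line):
            counts["assignment"] += 1
        elif RULE.match(line):
            counts["rule"] += 1
    return counts


SAMPLE = (
    "CC := gcc\n"
    "CFLAGS ?= -O2\n"
    "include config.mk\n"
    "ifdef DEBUG\n"
    "CFLAGS += -g\n"
    "endif\n"
    "all: main.o\n"
    "\t$(CC) -o app main.o\n"
    "main.o: main.c\n"
    "\t$(CC) $(CFLAGS) -c main.c\n"
)

if __name__ == "__main__":
    for feature, n in sorted(count_features(SAMPLE).items()):
        print(feature, n)
```

Running a counter like this over every file in a corpus and summing the results gives per-project feature totals, which is the general shape of a feature analysis.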

Download Makefile Analysis Framework

Makefile Analyses

For my PhD thesis, I conducted two analyses. The details of each analysis can be found in the publications above. The data is provided below:

Download Make Feature Analysis Data
Download Make Complexity Analysis Data

Web Services Restructuring of Descriptions (WSRD)

As part of my research, I developed a tool called WSRD (Web Service Restructuring of Descriptions) that gathers related pieces of operations inside WSDL descriptions into self-contained units called Web Service Cells, or WSCells (pronounced “wizzles”). WSCells make WSDL more human-readable and better suited for analysis of operations.

WSRD includes a set of scripts that automate the generation of the complete set of WSCells for all WSDL files in a directory. The actual restructuring is accomplished using a source transformation language called TXL. Extracted WSCells can either be listed in a single file, or separated into individual files.
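The idea of gathering the scattered pieces of an operation into one unit can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WSRD implementation (which uses TXL): it assumes WSDL 1.1, the `gather_cells` function and the dictionary layout are hypothetical, and a real WSCell also pulls in the related schema types, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace; the sketch assumes WSDL 1.1 documents.
NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def local_name(qname):
    """Drop a namespace prefix such as 'tns:' from a qualified name."""
    return qname.split(":", 1)[-1]


def gather_cells(wsdl_text):
    """For each operation, collect the parts of its input and output messages."""
    root = ET.fromstring(wsdl_text)
    messages = {m.get("name"): m for m in root.findall("wsdl:message", NS)}
    cells = {}
    for op in root.findall("wsdl:portType/wsdl:operation", NS):
        cell = {}
        for direction in ("input", "output"):
            ref = op.find(f"wsdl:{direction}", NS)
            if ref is None:
                continue
            msg = messages.get(local_name(ref.get("message", "")))
            parts = []
            if msg is not None:
                for part in msg.findall("wsdl:part", NS):
                    parts.append((part.get("name"), part.get("element") or part.get("type")))
            cell[direction] = parts
        cells[op.get("name")] = cell
    return cells


SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:tns="urn:example">
  <message name="GetPriceRequest"><part name="item" type="xsd:string"/></message>
  <message name="GetPriceResponse"><part name="price" type="xsd:float"/></message>
  <portType name="Shop">
    <operation name="GetPrice">
      <input message="tns:GetPriceRequest"/>
      <output message="tns:GetPriceResponse"/>
    </operation>
  </portType>
</definitions>
"""

if __name__ == "__main__":
    for name, cell in gather_cells(SAMPLE_WSDL).items():
        print(name, cell)
```

The point of the restructuring is visible even in this small case: the operation, its messages, and their parts sit in different places in the WSDL file, and the gathered unit puts them side by side.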

Download WSRD

  • Bash Shell
  • TXL (available at
  • Python (available at - installed by default on some distributions)
  • dos2unix (utility NOT installed by default on some distributions - use "sudo apt-get install tofrodos" on Ubuntu)

wsdlToGrok - From CASCON 2012 Demo

In our work with WSDL, we found it tedious to look up operations to see what they have in common. It was also difficult to find similar operations ahead of an analysis, which we needed in order to calculate the recall of the approaches we tried for discovering these relationships (clone detection and LDA). In this project, we use TXL and Grok to extract facts from WSDL files and query those facts to answer questions about WSDL descriptions. The extractor is available below for you to use.
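The extract-then-query workflow can be illustrated with a small Python sketch. It does not use TXL or Grok and is not the wsdlToGrok extractor: the relation names (`contains`, `input`, `output`), the triple layout, and both functions are assumptions chosen for the example, and it assumes WSDL 1.1 input.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# WSDL 1.1 namespace; the sketch assumes WSDL 1.1 documents.
NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def extract_facts(wsdl_text):
    """Return (relation, source, target) triples describing each operation."""
    root = ET.fromstring(wsdl_text)
    facts = []
    for port_type in root.findall("wsdl:portType", NS):
        for op in port_type.findall("wsdl:operation", NS):
            facts.append(("contains", port_type.get("name"), op.get("name")))
            for direction in ("input", "output"):
                ref = op.find(f"wsdl:{direction}", NS)
                if ref is not None and ref.get("message"):
                    # Strip a prefix such as 'tns:' from the message reference.
                    message = ref.get("message").split(":", 1)[-1]
                    facts.append((direction, op.get("name"), message))
    return facts


def operations_sharing(facts, relation):
    """Query: which targets of `relation` are shared by more than one operation?"""
    by_target = defaultdict(list)
    for rel, source, target in facts:
        if rel == relation:
            by_target[target].append(source)
    return {t: sorted(ops) for t, ops in by_target.items() if len(ops) > 1}


SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:tns="urn:example">
  <message name="ItemRequest"/>
  <message name="PriceResponse"/>
  <message name="QuoteResponse"/>
  <portType name="Shop">
    <operation name="GetPrice">
      <input message="tns:ItemRequest"/>
      <output message="tns:PriceResponse"/>
    </operation>
    <operation name="GetQuote">
      <input message="tns:ItemRequest"/>
      <output message="tns:QuoteResponse"/>
    </operation>
  </portType>
</definitions>
"""

if __name__ == "__main__":
    facts = extract_facts(SAMPLE_WSDL)
    print(operations_sharing(facts, "input"))
```

Once the facts are in triple form, questions such as "which operations take the same input message?" become simple queries rather than manual lookups through the WSDL file.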

Download wsdlToGrok