Week 1 (Sep 14): Introduction and Course Overview

Admin details (To Register -- use http://goo.gl/zeogAC)
Intro Lecture based on ICSE Tutorial by Ahmed E. Hassan and Tao Xie.
Assigned Reading: The Road Ahead for Mining Software Repositories by Ahmed E. Hassan (slides)

Week 2 (Sep 21): MSR Tutorial

Mining Software Repositories (MSR) Tutorial, continued.
Assigned Reading: Future of Mining Software Archives: A Roundtable

Week 3 (Sep 28): Predicting Bugs
Predicting fault incidence using software change history
Todd L. Graves, Alan F. Karr, J. S. Marron, and Harvey P. Siy
Analysis Techniques: Basic linear regression, GLM, R², model error, exponential decay
Predictors of customer perceived software quality
Audris Mockus, Ping Zhang, and Paul Luo Li
Analysis Techniques: Classification, Logistic Regression (Building and Interpreting Coefficients), R², model error
Predicting Defects for Eclipse
Thomas Zimmermann, Rahul Premraj, and Andreas Zeller
Analysis Techniques: Using R, Classification, Ranking
[READING]
[ASSIGNMENT]
How, and Why, Process Metrics are better
Foyzur Rahman and Premkumar Devanbu
[READING] [ASSIGNMENT]
Predicting Bugs from History
Thomas Zimmermann, Nachiappan Nagappan, and Andreas Zeller (Evolution Book)
[READING]
Week 4 (Oct 5): Mining Social Structures and Code Reviews
Will My Patch Make it? and How Fast?: Case Study on the Linux Kernel.
Yujuan Jiang, Bram Adams, Daniel M. German
Analysis Techniques: Decision Tree
An Empirical Study of the Impact of Modern Code Review Practices on Software Quality
Shane McIntosh, Yasutaka Kamei, Bram Adams, and Ahmed E. Hassan
Analysis Techniques: Bootstrap validation
[ASSIGNMENT]
On the Role of Developer's Scattered Changes in Bug Prediction
Dario Di Nucci, Fabio Palomba, Sandro Siravo, Gabriele Bavota, Rocco Oliveto, and Andrea De Lucia
Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, and Brendan Murphy
[READING]
Week 5 (Oct 12): Thanksgiving (No Class)
Week 6 (Oct 19): Mining of Non-Structured Data
Assignment Status Update -- Oct 16 (10 min presentation)
Creating and Evolving Developer Documentation: Understanding the Decisions of Open Source Contributors
Barthelemy Dagenais and Martin P. Robillard
Analysis Techniques: Grounded Theory
Semantic clustering: Identifying topics in source code
Adrian Kuhn, Stephane Ducasse, and Tudor Girba
Analysis Techniques: LDA, LSI
Listening to programmers: Taxonomies and characteristics of comments in operating system code
Yoann Padioleau, Lin Tan, Yuanyuan Zhou
[READING]
Identifying reasons for software changes using historic databases.
Audris Mockus and Larry G. Votta
[READING]
Week 7 (Oct 26): Assignment Presentation
Assignment DUE -- Oct 26 (20 mins presentation + 10 page IEEE report)
Project Proposal DUE -- Oct 30 (2 pages IEEE format)
Week 8 (Nov 2): Project Proposal Presentations
Project Proposal Presentation (10 mins + 10 mins questions)
Week 9 (Nov 9): Mining Mobile Apps
A Measurement Study of Google Play
Nicolas Viennot, Edward Garcia, Jason Nieh
API Change and Fault Proneness: a Threat to the Success of Android Apps
Mario Linares Vasquez, Gabriele Bavota, Carlos Bernal-Cardenas, Massimiliano Di Penta, Rocco Oliveto, Denys Poshyvanyk
Software Analytics for Mobile Applications - Insights & Lessons Learned
Roberto Minelli, Michele Lanza
Visual Analytics in Software Maintenance: Challenges and Opportunities
Alex Telea, Ozan Ersoy, and Lucian Voinea
[READING]
Week 10 (Nov 16): Large Scale Analysis I
Improving Software Diagnosability via Log Enhancement
Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, and Stefan Savage
The Promises and Perils of Mining GitHub
Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, Daniela Damian
Towards Building a Universal Defect Prediction Model
Feng Zhang, Audris Mockus, Iman Keivanloo, and Ying Zou
[READING]
Bugs as deviant behavior: A general approach to inferring errors in systems code
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf
Analysis Techniques: Markov Models
[READING]
Scalable statistical bug isolation
Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan
[READING]
Week 11 (Nov 23): Large Scale Analysis II
Capturing, indexing, clustering, and retrieving system history
Ira Cohen, Steve Zhang, Moises Goldszmidt, Julie Symons, Terence Kelly, and Armando Fox
vPerfGuard: an Automated Model-Driven Framework for Application Performance Diagnosis in Consolidated Cloud Environments
Pengcheng Xiong, Calton Pu, Xiaoyun Zhu, and Rean Griffith
Performance Debugging in the Large via Mining Millions of Stack Traces
Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie
Amassing and indexing a large sample of version control systems: towards the census of public source code history
Audris Mockus
[READING]
Week 12 (Nov 30): Project Presentations
Project Presentation DUE -- Nov 30 (20 mins presentation)


Project Report DUE -- Dec 22 (10 page IEEE report)