Week 1 (Sep 11): Introduction and Course Overview

Admin details (To Register -- use http://goo.gl/zeogAC)
Intro Lecture based on ICSE Tutorial by Ahmed E. Hassan and Tao Xie.
Assigned Reading: The Road Ahead for Mining Software Repositories by Ahmed E. Hassan (slides)

Week 2 (Sep 18): MSR Tutorial

Mining Software Repositories (MSR) Tutorial, continued.
Assigned Reading: Future of Mining Software Archives: A Roundtable

Week 3 (Sep 25): Predicting Bugs
Predicting fault incidence using software change history
Todd L. Graves, Alan F. Karr, J. S. Marron, and Harvey P. Siy
Analysis Techniques: Basic linear regression, GLM, R2, model error, exponential decay
Predictors of customer perceived software quality
Audris Mockus, Ping Zhang, and Paul Luo Li
Analysis Techniques: Classification, Logistic Regression (Building and Interpreting Coefficients), R2, model error
Predicting Defects for Eclipse
Thomas Zimmermann, Rahul Premraj, and Andreas Zeller
Analysis Techniques: Using R, Classification, Ranking
[READING]
[ASSIGNMENT]
Predicting Bugs from History
Thomas Zimmermann, Nachiappan Nagappan, and Andreas Zeller (Evolution Book)
[READING]
Evaluating Defect Prediction Approaches: A Benchmark and an Extensive Comparison
Marco D'Ambros, Michele Lanza, Romain Robbes
[READING]
[ASSIGNMENT]
Week 4 (Oct 2*): Mining Social Structures
Will My Patch Make it? and How Fast?: Case Study on the Linux Kernel.
Yujuan Jiang, Bram Adams, Daniel M. Germán
Analysis Techniques: Decision Tree
Latent Social Structure in Open Source Projects
Christian Bird, David Pattison, Raissa D'Souza, Vladimir Filkov and Premkumar Devanbu
Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, and Brendan Murphy
Week 5 (Oct 9): Large Scale Analysis I
Capturing, indexing, clustering, and retrieving system history
Ira Cohen, Steve Zhang, Moises Goldszmidt, Julie Symons, Terence Kelly, and Armando Fox
vPerfGuard: an Automated Model-Driven Framework for Application Performance Diagnosis in Consolidated Cloud Environment
Pengcheng Xiong, Calton Pu, Xiaoyun Zhu, and Rean Griffith
Performance Debugging in the Large via Mining Millions of Stack Traces
Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie
Amassing and indexing a large sample of version control systems: towards the census of public source code history
Audris Mockus
[READING]
Week 6 (Oct 16): Mining of Non-Structured Data
Assignment Status Update -- OCT 16 (10 min presentation)
Creating and Evolving Developer Documentation: Understanding the Decisions of Open Source Contributors
Barthelemy Dagenais and Martin P. Robillard
Analysis Techniques: Grounded Theory
Semantic clustering: Identifying topics in source code
Adrian Kuhn, Stephane Ducasse, and Tudor Girba
Analysis Techniques: LDA, LSI
Listening to programmers: Taxonomies and characteristics of comments in operating system code
Yoann Padioleau, Lin Tan, Yuanyuan Zhou
[READING]
Identifying reasons for software changes using historic databases
Audris Mockus and Larry G. Votta
[READING]
Week 7 (Oct 23): Assignment Presentation
Assignment DUE -- OCT 23 (30 mins presentation + 10 page IEEE report)
Project Proposal DUE -- Oct 28 (2 pages IEEE format)
Week 8 (Oct 30): Project Proposal Presentations
Project Proposal Presentation (15 mins + 10 mins questions)
Week 9 (Nov 6): Mining Mobile Apps
A Measurement Study of Google Play
Nicolas Viennot, Edward Garcia, Jason Nieh
API Change and Fault Proneness: a Threat to the Success of Android Apps
Mario Linares Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto, Denys Poshyvanyk
Software Analytics for Mobile Applications - Insights & Lessons Learned
Roberto Minelli, Michele Lanza
Visual Analytics in Software Maintenance: Challenges and Opportunities
Alex Telea, Ozan Ersoy, and Lucian Voinea
[READING]
Week 10 (Nov 13): Large Scale Analysis II
Improving Software Diagnosability via Log Enhancement
Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, and Stefan Savage
The Promises and Perils of Mining GitHub
Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, Daniela Damian
Towards Building a Universal Defect Prediction Model
Feng Zhang, Audris Mockus, Iman Keivanloo, and Ying Zou
[READING]
Bugs as deviant behavior: A general approach to inferring errors in systems code
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf
Analysis Techniques: Markov Models
[READING]
Scalable statistical bug isolation
Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan
[READING]
Week 11 (Nov 20): Project Presentations
Project Presentation DUE -- Nov 20 (20 mins presentation)


Project Report DUE -- DEC 22 (10 page IEEE report)