Week 1 (Sep 16): Introduction and Course Overview

Admin details (To Register -- use http://goo.gl/v8LUT)
Intro Lecture based on ICSE Tutorial by Ahmed E. Hassan and Tao Xie.
Assigned Reading: The Road Ahead for Mining Software Repositories by Ahmed E. Hassan (slides)

Week 2 (Sep 23): MSR Tutorial

Mining Software Repositories (MSR) Tutorial, continued.
Assigned Reading: Future of Mining Software Archives: A Roundtable

Week 3 (Sep 30): Predicting Bugs
Predicting fault incidence using software change history
Todd L. Graves, Alan F. Karr, J. S. Marron, and Harvey P. Siy
Analysis Techniques: Basic linear regression, GLM, R2, model error, exponential decay
Predictors of customer perceived software quality
Audris Mockus, Ping Zhang, and Paul Luo Li
Analysis Techniques: Classification, Logistic Regression (Building and Interpreting Coefficients), R2, model error
Don't Touch My Code! Examining the Effects of Ownership on Software Quality
Christian Bird, Nachiappan Nagappan, Brendan Murphy, Harald Gall, and Premkumar Devanbu
Analysis Techniques: Variance Explained
[ASSIGNMENT]
Predicting Defects for Eclipse
Thomas Zimmermann, Rahul Premraj, and Andreas Zeller
Analysis Techniques: Using R, Classification, Ranking
[READING]
[ASSIGNMENT]
Predicting Bugs from History
Thomas Zimmermann, Nachiappan Nagappan, and Andreas Zeller (Evolution Book)
[READING]
Evaluating Defect Prediction Approaches: A Benchmark and an Extensive Comparison
Marco D'Ambros, Michele Lanza, Romain Robbes
[READING]
Week 4 (Oct 7): Mining Social Structures
Studying the Impact of Social Structures on Software Quality
Nicolas Bettenburg and Ahmed E. Hassan
Analysis Techniques: VIF Analysis
Latent Social Structure in Open Source Projects
Christian Bird, David Pattison, Raissa D'Souza, Vladimir Filkov and Premkumar Devanbu
Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, and Brendan Murphy
Week 5 (Oct 14): Mining of Non-Structured Data
Assignment Status Update -- Oct 14 (10-minute presentation)
Listening to programmers: Taxonomies and characteristics of comments in operating system code
Yoann Padioleau, Lin Tan, Yuanyuan Zhou
Creating and Evolving Developer Documentation: Understanding the Decisions of Open Source Contributors
Barthelemy Dagenais and Martin P. Robillard
Analysis Techniques: Grounded Theory
Semantic clustering: Identifying topics in source code
Adrian Kuhn, Stephane Ducasse, and Tudor Girba
Analysis Techniques: LDA, LSI
Identifying reasons for software change using historic databases
Audris Mockus and Larry G. Votta
[READING]
Week 6 (Oct 21): Tools and Mining Challenges (No Class)
Preprocessing CVS Data for Fine-Grained Analysis
Thomas Zimmermann and Peter Weißgerber
[READING]
The Promises and Perils of Mining Git
Christian Bird, Peter C. Rigby, Earl T. Barr, David J. Hamilton, Daniel M. German, and Prem Devanbu
[READING]
Automatic identification of bug-introducing changes
Sunghun Kim, Thomas Zimmermann, Kai Pan, and E. James Whitehead, Jr.
[READING]
Week 7 (Oct 28): Guiding Software Development
Assignment DUE -- Oct 28 (20-minute presentation + 10-page IEEE report)
Mining version histories to guide software changes
Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, and Andreas Zeller
Visual Analytics in Software Maintenance: Challenges and Opportunities
Alex Telea, Ozan Ersoy, and Lucian Voinea
Project Proposal DUE -- Oct 31 (2 pages, IEEE format)
Week 8 (Nov 4): Project Proposal Presentations
Week 9 (Nov 11): Large Scale Analysis I
Capturing, indexing, clustering, and retrieving system history
Ira Cohen, Steve Zhang, Moises Goldszmidt, Julie Symons, Terence Kelly, and Armando Fox
Scalable statistical bug isolation
Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan
Amassing and indexing a large sample of version control systems: towards the census of public source code history
Audris Mockus
Week 10 (Nov 18): Large Scale Analysis II
Local vs. Global Models for Effort Estimation and Defect Prediction
Tim Menzies, Andrew Butcher, Andrian Marcus, Thomas Zimmermann, and David Cok
Bugs as deviant behavior: A general approach to inferring errors in systems code
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf
Analysis Techniques: Markov Models
Improving Bug Triage with Bug Tossing Graphs
Gaeul Jeong, Sunghun Kim, and Thomas Zimmermann
Analysis Techniques: Markov Models
Week 11 (Nov 25): Project Presentations
Project Presentation DUE -- Nov 25 (20-minute presentation)


Project Report DUE -- Dec 22 (10-page IEEE report)