Week 1 (Sep 14): Introduction and Course Overview
Admin details
Intro Lecture based on ICSE 2010 Tutorial by Ahmed E. Hassan and Tao Xie.
Assigned Reading: The Road Ahead for Mining Software Repositories by Ahmed E. Hassan (slides)

Week 2 (Sep 21): MSR Tutorial

Mining Software Repositories (MSR) Tutorial, continued.
Assigned Reading: Future of Mining Software Archives: A Roundtable

Week 3 (Sep 28): Predicting Bugs
Predicting fault incidence using software change history
Todd L. Graves, Alan F. Karr, J. S. Marron, and Harvey P. Siy
Analysis Techniques: Basic linear regression, GLM, R², model error, exponential decay
Predictors of customer perceived software quality
Audris Mockus, Ping Zhang, and Paul Luo Li
Analysis Techniques: Classification, Logistic Regression (Building and Interpreting Coefficients), R², model error
Predicting Bugs from History
Thomas Zimmermann, Nachiappan Nagappan, and Andreas Zeller (Evolution Book)
[READING]
Predicting Defects for Eclipse
Thomas Zimmermann, Rahul Premraj, and Andreas Zeller
Analysis Techniques: Using R, Classification, Ranking
[READING]
[ASSIGNMENT]
Week 4 (Oct 5): Bugs and Effort
Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study
Nachiappan Nagappan, Thomas Ball
Analysis Techniques: PCA, classification, ranking, data splitting
Effort-Aware Defect Prediction Models
Thilo Mende and Rainer Koschke
Analysis Techniques: Random Forest, AUC
Defect Prediction from Static Code Features: Current Results, Limitations, New Approaches
Tim Menzies, Zach Milton, Burak Turhan, Bojan Cukic, Yue Jiang, and Ayse Bener
Analysis Techniques: Naive Bayes, CART, SVM
Benchmarking classification models for software defect prediction: a proposed framework and novel findings
Stefan Lessmann, Bart Baesens, Christophe Mues, and Swantje Pietsch
[READING]
[ASSIGNMENT]
Week 5 (Oct 12): Mining of Non-Structured Data
Listening to programmers: Taxonomies and characteristics of comments in operating system code
Yoann Padioleau, Lin Tan, Yuanyuan Zhou
Creating and Evolving Developer Documentation: Understanding the Decisions of Open Source Contributors
Barthelemy Dagenais and Martin P. Robillard
Analysis Techniques: Grounded Theory
Semantic clustering: Identifying topics in source code
Adrian Kuhn, Stephane Ducasse, and Tudor Girba
Analysis Techniques: LDA, LSI
Identifying reasons for software change using historic databases
Audris Mockus and Larry G. Votta
[READING]
Week 6 (Oct 19): Large Scale Mining
Assignment Status Update -- OCT 19 (5 min presentation)
Instant Code Clone Search
Mu-Woong Lee, Jong-Won Roh, Seung-won Hwang, and Sunghun Kim
MAPO: Mining and Recommending API Usage Patterns
Hao Zhong, Tao Xie, Lu Zhang, Jian Pei, and Hong Mei
Scalable statistical bug isolation
Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan
Amassing and indexing a large sample of version control systems: towards the census of public source code history
Audris Mockus
[READING]
Week 7 (Oct 26): Guiding Software Development
Assignment DUE -- OCT 26 (15 min presentation + 10-page IEEE report)
Mining version histories to guide software changes
Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, Andreas Zeller
Visual Analytics in Software Maintenance: Challenges and Opportunities
Alex Telea, Ozan Ersoy, and Lucian Voinea
Preprocessing CVS Data for Fine-Grained Analysis
Thomas Zimmermann and Peter Weißgerber
[READING]
The Promises and Perils of Mining Git
Christian Bird, Peter C. Rigby, Earl T. Barr, David J. Hamilton, Daniel M. German, and Prem Devanbu
[READING]
Project Proposal DUE -- Oct 29 (2 pages IEEE format)
Week 8 (Nov 2): Tools and Mining Challenges (No Class)
Automatic identification of bug-introducing changes
Sunghun Kim, Thomas Zimmermann, Kai Pan, and E. James Whitehead, Jr.
Week 9 (Nov 9): Bug Detection and Dashboards
Capturing, indexing, clustering, and retrieving system history
Ira Cohen, Steve Zhang, Moises Goldszmidt, Julie Symons, Terence Kelly, and Armando Fox
Bugs as deviant behavior: A general approach to inferring errors in systems code
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf
Awareness 2.0: Staying Aware of Projects, Developers and Tasks using Dashboards and Feeds
Christoph Treude and Margaret-Anne Storey
Week 10 (Nov 16): Mining Social Structures
Latent Social Structure in Open Source Projects
Christian Bird, David Pattison, Raissa D'Souza, Vladimir Filkov and Premkumar Devanbu
Improving Bug Triage with Bug Tossing Graphs
Gaeul Jeong, Sunghun Kim, and Thomas Zimmermann
Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, and Brendan Murphy
Week 11 (Nov 23): Not done yet
Project Presentation First Round DUE -- Nov 23 (20 min presentation)


Week 12 (Nov 30): Project Presentations
Project Presentation DUE -- Nov 30 (20 min presentation)


Project Report DUE -- DEC 22 (10-page IEEE report)