CISC 859 Pattern Recognition, Fall 2017
Instructor: Dorothea Blostein, 720 Goodwin Hall, 533-6537, blostein@cs.queensu.ca
Lectures: Mondays and Thursdays 10:00-11:15 in Kingston Hall 104
Office hours: Mondays and Thursdays 11:30-12:30 in Goodwin 720
Textbook and Course Reader
Pattern Classification, Second Edition, by Duda, Hart, and Stork, Wiley 2001.
This book is well established as the standard reference book in pattern recognition. Available from Queen’s bookstore; the publisher offers an ebook version.
The CISC859 course reader contains my notes about the course material. Available from Queen’s bookstore. If their stock runs out, the bookstore does not automatically print more copies – you must ask them to print another copy for you.
Course description and prerequisites
CISC859 is an introductory pattern recognition course geared toward students who have some background in computer science. The course material is relevant to many areas of research, including data mining, artificial intelligence, computer vision and signal processing. CISC859 has been taken by graduate students from computing, electrical engineering, mechanical engineering, geology, chemistry, and mathematics.
Here is a course overview.
Familiarity with the following subjects is helpful. Students missing this background have successfully taken the course by doing some extra reading.

Elementary calculus: Integrals, and how they relate to the area under a curve.

Elementary probability theory: Probability distribution, probability density, random variables. I provide a review.

Elementary formal languages: Context-free grammars, and how they define a language. I provide a review.

Programming: For the course project, you are expected to implement a classifier. Toolboxes such as Weka may be used.
Marking Scheme; Information about Assignments, Presentation, Project
The marking scheme is as follows.
32% Assignments and quiz

12% Four assignments (posted in the next section of this website). Assignments may be completed individually, or in groups of two or three students. The assignment mark is based on effort: I quickly assess assignment completion rather than marking in detail. Please see me in office hours if you want more detailed feedback on your assignment answers.

20% Quiz. Questions are based on the assignments, with one major question about the Bayes classifier and another major question about the Anderson grammar for recognition of math notation.
38% Study of a pattern recognition topic, with oral presentation and written report. Here is a list of suggested topics; or choose your own topic.

3% One-page written plan for your oral presentation, due one week before your presentation. State the topic, the main points you want to present, and the background you are assuming audience members have. If you are unsure about formulating a plan, please discuss this with me in office hours or via email before your due date.

15% Oral presentation to the class. This is my evaluation of your success in presenting according to the plan you submitted: did you convey the main points in a way that is understandable to an audience with the background you assumed in the plan?

15% Written report, due the same day as your presentation.
Required format for this report: 2-4 pages of text (not counting figures and references) that succinctly presents the main points. Use 12 point font and at least 15 point line spacing. If you wish, you can include appendices to provide more detailed information. In my marking I will concentrate on your 2-4 pages of text, and will only read the appendices if your writing makes me eager to do so. I impose this strict page limit to give you practice in the vital skill of writing concise documents that convey the main ideas in an informative, convincing and engaging way. See my advice about technical writing.

5% Participation during presentations by other students: fill out a feedback form for each presenter.
30% Digit recognizer project. Here is a project description and here are sample programs for doing image I/O in C and Java, as well as digit images for classifier training and testing.

1% On-time submission of Digit Recognizer Part 1. This is marked pass/fail (100% or 0%).

14% Work done for the project, as described in the final report.

15% Quality of the final report.
Required format for this report: 2-4 pages of text (not counting figures and references) that succinctly presents the main points. Use 12 point font and at least 15 point line spacing. If you wish, you can include appendices to provide more detailed information. In my marking I will concentrate on your 2-4 pages of text, and will only read the appendices if your writing makes me eager to do so. I impose this strict page limit to give you practice in the vital skill of writing concise documents that convey the main ideas in an informative, convincing and engaging way. See my advice about technical writing.
Assignments and Schedule
This schedule may be adjusted slightly during the term.

September 21: Assignment 1 is due. Please hand in hardcopy at the lecture (handwritten or typed answers are fine).

October 2: Assignment 2 is due
Here is a website for evaluating the Normal density, if you want to use that for problem 2b to obtain a numerical value for the probability of error when p(x | ω) is normally distributed. Alternatively, you can leave your answer for P(error) in the form of an integral.

October 5, or earlier: Email me a description (one or two sentences, and one or two references) of the pattern recognition topic you choose to study. You will give an oral presentation on this topic later in the term and submit a written report.

October 12: Assignment 3 is due

October 19: Digit recognizer part 1 is due. Please hand in hardcopy at the lecture.

October 23: Assignment 4 is due

October 26 or later: quiz. We will choose the date for the quiz during the first few weeks of term.

October 23 to November 30: Student presentations. A detailed schedule will be posted later. Your one-page written plan is due one week before your presentation, and your written report is due the same day as your presentation.
Order of presentation will be alphabetical by last name. If you can find a student who wants to switch times with you, that's fine with me. There will be three presentations per class meeting: 20 minutes for each presentation, plus 5 minutes for questions and the transition to the next presenter.

November 30: Digit recognizer report is due. Please hand in hardcopy at the lecture.
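
For Assignment 2, problem 2b, the Normal density and the area under its tail can also be evaluated locally instead of through a website. The following is a minimal sketch in Python, which is an assumption here since the course does not prescribe a language for the assignments. The helper names normal_pdf, normal_cdf and upper_tail are hypothetical, and the numbers in the usage lines are placeholders, not values from the assignment. It relies only on the standard relation between the Normal cumulative distribution and the error function.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Value of the Normal density with mean mu and std. dev. sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the Normal density from minus infinity up to x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def upper_tail(x, mu=0.0, sigma=1.0):
    """Area under the Normal density from x to plus infinity."""
    return 1.0 - normal_cdf(x, mu, sigma)


if __name__ == "__main__":
    # Placeholder parameters for illustration only.
    print(normal_pdf(0.5, mu=0.0, sigma=1.0))
    print(upper_tail(0.5, mu=0.0, sigma=1.0))
```

A probability of error expressed as an integral of a Normal density over an interval can then be obtained as a difference of two normal_cdf values, or from upper_tail when the interval extends to infinity.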
Pattern Recognition Resources
IAPR is the International Association for Pattern Recognition. The IAPR education committee provides researcher/student resources for three areas of core technology (symbolic PR; statistical PR; machine learning) and two broad families of application areas (1D signal analysis; 2D image analysis). For each area, they provide links to tutorials and surveys; explanations; online demos; datasets; books; code.
I also recommend taking a look at the information provided by the Technical Committees of the IAPR, including:

TC 1 Statistical Pattern Recognition Techniques.

TC 2 Structural and Syntactical Pattern Recognition.

TC 3 Neural Networks and Computational Intelligence

TC 7 Remote Sensing and Mapping.

TC 10 Graphics Recognition.
Analysis of engineering drawings, maps, tables,
forms, drawings, math notation, music notation, etc.

TC 11 Reading Systems.
OCR (Optical Character Recognition), document image processing, pen-based computing, signature verification.

TC 15 Graph Based Representations in Pattern Recognition and Image Analysis.

TC 20 Pattern Recognition for Bioinformatics.
The Duda Hart Stork textbook has an accompanying toolbox written in MATLAB.
Read this Introduction to the DHS toolbox written by graduate student Nawei Chen and me; it illustrates some of the basic pattern recognition ideas we discuss in class.
You are not required to buy or use this toolbox. You can use software such as Weka or R instead.
R is a free software environment for statistical computing and graphics. A past CISC 859 student writes: "R implements a lot of the concepts that you discuss in the course, and provides many built-in datasets (like the IRIS dataset and a car manufacturer dataset), so it's a great way to get some fast and easy experience with classifiers from a practical perspective. Coupled with the more theoretical perspective provided by the lectures, I found it to be great one-two punch."
A variety of other computing environments also provide implementations of classification algorithms: Weka (with graphical user interface added by RapidMiner), MATLAB, and OpenCV.
Mirage is a publicly available Java-based tool for exploratory data analysis written by Tin Kam Ho at Bell Labs; she now works at IBM Watson. Tin is one of the world's top researchers in statistical pattern recognition. This tool offers excellent support for exploratory analysis and visualization of large data sets.
An extensive list of links for pattern recognition and statistics.
Document Layout Interpretation and its Application lists research groups, conferences, data sets, software, and bibliographies.
Computer Vision Resources
CVonline, a compendium of computer vision. Covers many topics, such as Hidden Markov Models (HMMs).
Supplemental information with CVonline: online and hardcopy books, datasets for research and student projects, software packages.
Video lectures for an introductory course on computer vision. Topics include flat part recognition, deformable part recognition, range data and stereo data 3D part recognition, detecting and tracking objects in video, and behaviour recognition.
CVDICT: Dictionary of Computer Vision and Image Processing.