Trinity College Dublin

Module ST4003: Data Mining

Credit weighting (ECTS)
10 credits
Semester/term taught
Michaelmas term 2012-2013
Contact Hours
4 lectures and 1 lab per week over Michaelmas term
Lecturer
Associate Professor Myra O'Regan
Learning Outcomes
On successful completion of this module, students should be able to understand the theory of the following techniques and apply them to a set of data:
  • Classification trees;
  • Neural networks;
  • Association rules;
  • Ensemble methods;
  • Random Forests;
  • RuleFit procedure (Jerome Friedman);
  • Support vector machines;
  • Evaluation of models.
Module Content
  • Handling missing data;
  • Detailed discussion of Classification Trees;
  • Detailed discussion of Evaluation of Models;
  • Overview of Association Rules;
  • Overview of Neural Nets;
  • Overview of Support vector machines;
  • Ensemble methods:
      • General overview of ensemble methods;
      • Detailed discussion of Random Forests;
      • Detailed discussion of the RuleFit procedure.
Module Prerequisite
ST3007 - Multivariate Analysis and Applied Forecasting
Assessment Detail
This module will be assessed by a 3-hour examination in Trinity term, accounting for 60% of the total mark, and by a project accounting for the remaining 40%. The project will consist of a series of mini projects over the term, in which students will apply the above techniques to a set of data using R.