STAT0023 Computing for Practical Statistics: course information sheet


Aims of course: to extend students’ practical experience of statistical software environments; to extend their ability to apply, in a practical context, ideas and methods already taught; and to enable them to perform computer-assisted statistical analyses.


Objectives of course: on successful completion, a student should be able to perform a systematic analysis independently, using the statistical software suites R and SAS, to answer data-based or methodological questions, and to report on it in line with current scientific standards.


Applications: this course provides training in performing statistical analyses with the R and SAS statistical software suites. R is one of the most widely used non-commercial statistical software packages; it is predominant in research, increasingly used in industry, and can easily be used for non-routine statistical analyses. SAS is the commercial statistical analytics suite with the largest worldwide market share, widely used in business and industry. The course provides, amongst other things, basic programming skills, an introduction to R and SAS, and practice in basic statistical analysis workflows.


Prerequisites: Completed STAT0002, STAT0003 and STAT0004 or equivalent. Simultaneous or previous attendance on STAT0005 and STAT0006 or equivalent.


Course content: Introduction to SAS commands and the R environment. Use of these packages for descriptive statistics, graphics and the fitting of regression and ANOVA models. Non-linear regression and generalised linear model fitting, simulation, programming and numerical minimisation.


Lecturers: Prof Richard Chandler (REC: [email protected]), Dr Ioanna Manolopoulou (IM: [email protected]).


Moodle page: ‘STAT0023: Computing for Practical Statistics’. This should be available to all Portico-registered students and will be the primary source of information about the course.


Useful texts: see the Moodle page.


Workload:

●     Synchronous sessions:

–     Q&A / demo session: one per week, via Zoom (link on the Moodle page) on Mondays from 14.00 to 15.00. These sessions will be recorded.

–     Workshop: one 90-minute session per week. For these workshops, the class is split into four groups: see your personal timetable to find out which group you’re in. The Moodle page gives Zoom links to the workshops.

●     Asynchronous study: each week’s material will be introduced by a collection of prerecorded videos with accompanying self-study exercises and Moodle quizzes, released the previous week. Students are expected to watch the videos and work through the exercises in advance of the Q&A sessions each Monday. The Moodle quizzes should be attempted after each week’s workshop.


The topics to be covered in each week are as follows:

Week  Topics covered

1     Revision of R language, graphics and simple commands
2     Linear regression and ANOVA in R
3     Simple programming techniques in R
4     Optimisation, maximum likelihood and nonlinear least squares in R
5     Simulation in R, with applications
6     Generalised linear and generalised additive models in R
7     Introduction to SAS: data manipulation and exploration
8     Simple data analysis in SAS
9     Linear and generalised linear models in SAS
10    Computing for practical statistics: case study for ICA2


Assessment for examination grading: two in-course assessments (ICA1 and ICA2), each comprising a Moodle quiz and a piece of extended coursework. These will be set during the Q&A sessions on Monday 8th February and Monday 22nd March respectively, with the Moodle quizzes taking place during the workshops of those weeks. There is no examination.