# STAT Courses

# STAT 202 – Introductory Statistics for Scientists

Elementary probability, populations, samples and distributions with biological examples. Methods for data summary and presentation. Estimation, hypothesis testing, two-sample techniques and paired comparisons, regression, correlation. [Offered: F,W]

# STAT 206 – Statistics for Software Engineering

Empirical problem solving with applications to software engineering. An introduction to probability theory. An introduction to distribution theory and to methods of statistical inference, including confidence intervals and hypothesis testing. An introduction to regression. [Offered: F]

# STAT 211 – Introductory Statistics and Sampling for Accounting

Descriptive statistics, probability, discrete and continuous random variables. Sampling distributions and simple hypothesis testing. Introduction to survey sampling. [Offered: W]

# STAT 220 – Probability (Non-Specialist Level)

The laws of probability, discrete and continuous random variables, expectation, central limit theorem. [Offered: F,W]

# STAT 221 – Statistics (Non-Specialist Level)

Empirical problem solving, measurement systems, causal relationships, statistical models, estimation, confidence intervals, tests of significance. [Offered: F,W]

# STAT 230 – Probability

This course provides an introduction to probability models including sample spaces, mutually exclusive and independent events, conditional probability and Bayes' Theorem. The named distributions (Discrete Uniform, Hypergeometric, Binomial, Negative Binomial, Geometric, Poisson, Continuous Uniform, Exponential, Normal (Gaussian), and Multinomial) are used to model real phenomena. Discrete and continuous univariate random variables and their distributions are discussed. Joint probability functions, marginal probability functions, and conditional probability functions of two or more discrete random variables and functions of random variables are also discussed. Students learn how to calculate and interpret means, variances and covariances particularly for the named distributions. The Central Limit Theorem is used to approximate probabilities.

# STAT 231 – Statistics

This course provides a systematic approach to empirical problem solving which will enable students to critically assess the sampling protocol and conclusions of an empirical study including the possible sources of error in the study and whether evidence of a causal relationship can be reasonably concluded. The connection between the attributes of a population and the parameters in the named distributions covered in STAT 230 will be emphasized. Numerical and graphical techniques for summarizing data and checking the fit of a statistical model will be discussed. The method of maximum likelihood will be used to obtain point and interval estimates for the parameters of interest as well as testing hypotheses. The interpretation of confidence intervals and p-values will be emphasized. The Chi-squared and t distributions will be introduced and used to construct confidence intervals and tests of hypotheses including likelihood ratio tests. Contingency tables and Gaussian response models including the two sample Gaussian and simple linear regression will be used as examples.

# STAT 232 – Introduction to Medical Statistics

This course will provide an introduction to statistical methods in health research. Topics to be covered include types of medical data, measures of disease prevalence and incidence, age and sex adjustment of disease rates, sensitivity and specificity of diagnostic tests, ROC curves, measures of association between risk factors and disease, major sources of medical data in the Canadian context including surveys, registries, and clinical studies such as cohort studies, clinical trials and case-control studies. Papers from the medical literature will be used throughout to illustrate the concepts. Introduction to SAS for data analysis and an introduction to database management tools. [Offered: F]

# STAT 240 – Probability (Advanced Level)

STAT 240 is an advanced-level enriched version of STAT 230.

# STAT 241 – Statistics (Advanced Level)

STAT 241 is an advanced-level enriched version of STAT 231.

# STAT 316 – Introduction to Statistical Problem Solving by Computer

This is an applications oriented course which prepares the nonmathematical student to use the computer as a research tool. Topics include aids for statistical analysis and the preparation of documents such as reports and theses. The course provides sufficient background for application to other problems specific to the individual's field. [Offered: W]

# STAT 321 – Regression and Forecasting (Non-Specialist Level)

Modeling the relationship between a response variable and several explanatory variables via regression models. Model diagnostics and improvement. Using regression models for forecasting. Exponential smoothing. Simple time series modeling. [Offered: W]

# STAT 322 – Sampling and Experimental Design (Non-Specialist Level)

Planning sample surveys; simple random sampling; stratified sampling. Observational and experimental studies. Blocking, randomization, factorial designs. Analysis of variance. Applications of design principles. [Offered: F]

# STAT 330 – Mathematical Statistics

This course provides a mathematically rigorous treatment of topics covered in STAT 230 and 231, and makes essential extensions to the multivariate case. Maximum likelihood estimation. Random variables and distribution theory. Generating functions. Functions of random variables. Limiting distributions. Large sample theory of likelihood methods. Likelihood ratio tests. Joint probability (density) functions, marginal probability (density) functions, and conditional probability (density) functions of two or more random variables are discussed. Topics covered include independence of random variables, conditional expectation and the determination of the distribution of functions of random variables using the cumulative distribution method, change of variable and moment generating functions. Properties of the Multinomial and Bivariate Normal distributions are proved. Limiting distributions, including convergence in probability and convergence in distribution, are discussed. Important results, including the Weak Law of Large Numbers, Central Limit Theorem, Slutsky's theorem, and the Delta Method, are introduced with applications. The maximum likelihood method is discussed for the multi-parameter case. Asymptotic properties of the maximum likelihood estimator are examined and used to construct confidence intervals or regions. Tests for simple and composite hypotheses are constructed using the Likelihood Ratio Test. [Offered: F,W,S]

# STAT 331 – Applied Linear Models

Modeling the relationship between a response variable and several explanatory variables (an output-input system) via regression models. Least squares algorithm for estimation of parameters. Hypothesis testing and prediction. Model diagnostics and improvement. Algorithms for variable selection. Nonlinear regression and other methods. [Offered: F,W,S]

# STAT 332 – Sampling and Experimental Design

Designing sample surveys. Probability sampling designs. Estimation with elementary designs. Observational and experimental studies. Blocking, randomization, factorial designs. Analysis of variance. Designing for comparison of groups. [Offered: F,W,S]

# STAT 333 – Applied Probability

Review of basic probability. Generating functions. Theory of recurrent events. Markov chains, Markov processes, and their applications. [Offered: F,W,S]

# STAT 334 – Probability Models for Business and Accounting

Random variables and distribution theory, conditional expectations, moment and probability generating functions, change of variables, random walks, Markov chains, Markov processes. [Offered: F,S]

# STAT 337 – Introduction to Biostatistics

This course will provide an introduction to statistical methods in health research. Topics to be covered include types of medical data, measures of disease prevalence and incidence, age and sex adjustment of disease rates, sensitivity and specificity of diagnostic tests, ROC curves, measures of association between risk factors and disease, major sources of medical data in the Canadian context including surveys, registries, and clinical studies such as cohort studies, clinical trials and case-control studies. Papers from the medical literature will be used throughout to illustrate the concepts. Introduction to SAS for data analysis and an introduction to database management tools. [Offered: F]

# STAT 340 – Computer Simulation of Complex Systems

Building and validation of stochastic simulation models useful in computing, operations research, engineering and science. Related design and estimation problems. Variance reduction. The implementation and the analysis of the results. [Offered: W,S]

# STAT 341 – Computational Statistics and Data Analysis

Approximation and optimization of noisy functions. Simulation from univariate and multivariate distributions, multivariate normal distribution, mixture distributions and introduction to Markov Chain Monte Carlo. Introduction to supervised statistical learning including discrimination methods. [Offered: F]

# STAT 371 – Applied Linear Models and Process Improvement for Business

Practical and theoretical aspects of simple and multiple linear regression models. Model building, fitting and assessment. Process thinking and improvement. Strategies for variation reduction such as control charting. Process monitoring, control and adjustment. Applications to problems in business. [Offered: F,W,S]

# STAT 372 – Survey Sampling and Experimental Design Techniques for Business

Design and analysis of surveys. Management of sample and non-sample error. Simple random sampling and stratified random sampling. Additional topics in survey sampling. Observational and experimental studies. Principles for the design of experiments. Analysis of Variance, factorial experiments and interaction. Application to problems in business. [Offered: F,W,S]

# STAT 373 – Regression and Forecasting Methods in Finance

Application of regression and time series models in finance; multiple regression; algebraic and geometric representation of least squares; inference methods - confidence intervals and hypothesis tests, ANOVA, prediction; model building and assessment; time series modeling; autoregressive AR(1) models - fitting, assessment and prediction; moving average smoothing, seasonal adjustment; non-stationarity and differencing. [Offered: F]

# STAT 430 – Experimental Design

Review of experimental designs in a regression setting; analysis of variance; replication, balance, blocking, randomization, and interaction; one-way layout, two-way layout, and Latin square as special cases; factorial structure of treatments; covariates; treatment contrasts; two-level fractional factorial designs; fixed versus random effects; split-plot and repeated-measures designs; other topics. [Offered: F,S]

# STAT 431 – Generalized Linear Models and their Applications

Review of the normal linear model and maximum likelihood estimation; regression models for binomial, Poisson and multinomial data; generalized linear models; and other topics in regression modelling. [Offered: F,W,S]

# STAT 433 – Stochastic Processes

Point processes. Renewal theory. Stationary processes. Selected topics. [Offered: F]

# STAT 435 – Statistical Methods for Process Improvements

Statistical methods for improving processes based on observational data. Assessment of measurement systems. Strategies for variation reduction. Process monitoring, control and adjustment. Clue generation techniques for determining the sources of variability. Variation transmission. [Offered: W]

# STAT 436 – Introduction to the Analysis of Spatial Data in Health Research

The objective of this course is to develop understanding and working knowledge of spatial models and analysis of spatial data. The course provides an introduction to the rudiments of statistical inference based on spatially correlated data. Methods of estimation and testing will be developed for geostatistical models based on variograms and spatial autoregressive models. Concepts and application of methods will be emphasized through case studies and projects with health applications. [Offered: W]

# STAT 437 – Statistical Methods for Life History Analysis

Statistical methods for the analysis of longitudinal data; hierarchical models, marginal models, and transitional models. Parametric and semiparametric methods for the analysis of survival data under censoring and truncation. [Offered: W]

# STAT 438 – Advanced Methods in Biostatistics

Causal inference methodologies including propensity score matching and inverse probability weighting. Methods for handling incomplete data and covariate measurement error; likelihood based on joint models, estimating functions.

# STAT 440 – Computational Inference

Introduction to and application of computational methods in statistical inference. Monte Carlo evaluation of statistical procedures, exploration of the likelihood function through graphical and optimization techniques including EM. Bootstrapping, Markov Chain Monte Carlo, and other computationally intensive methods. [Offered: W]

# STAT 441 – Statistical Learning - Classification

Given known group membership, methods which learn from data how to classify objects into the groups are treated. Review of likelihood and posterior based discrimination. Main topics include logistic regression, neural networks, tree-based methods and nearest neighbour methods. Model assessment, training and tuning. [Offered: F]

# STAT 442 – Data Visualization

Visualization of high dimensional data including interactive methods directed at exploration and assessment of structure and dependencies in data. Methods for finding groups in data including traditional and modern methods of cluster analysis. Dimension reduction methods including multi-dimensional scaling, nonlinear and other methods. [Offered: F]

# STAT 443 – Forecasting

Modelling techniques for forecasting time series data: smoothing methods, regression including penalty/regularization methods, the Box-Jenkins framework, stationary and non-stationary processes, both with and without seasonal effects. Other topics may include: ARCH/GARCH models, Bayesian methods, dynamic linear models, Markov Chain Monte Carlo simulation, spectral density analysis, and periodograms. [Offered: F,W,S]

# STAT 444 – Statistical Learning - Function Estimation

Methods for finding surfaces in high dimensions from incomplete or noisy functional information. Both data-adaptive methods and methods based on fixed parametric structure will be treated. Model assessment, training and tuning. [Offered: W]

# STAT 446 – Mathematical Models in Finance

Mathematical techniques used to price and hedge derivative securities in modern finance. Modelling, analysis and computations for financial derivative products, including exotic options and swaps in all asset classes. Applications of derivatives in practice. [Offered: F,W]

# STAT 450 – Estimation and Hypothesis Testing

Discussion of inference problems under the headings of hypothesis testing and point and interval estimation. Frequentist and Bayesian approaches to inference. Construction and evaluation of tests and estimators. Large sample theory of point estimation. [Offered: W]

# STAT 454 – Sampling Theory and Practice

Sources of survey error. Probability sampling designs, estimation and efficiency comparisons. Distribution theory and confidence intervals. Generalized regression estimation. Software for survey analysis. [Offered: W]

# STAT 464 – Topics in Probability Theory

Special Topics course as announced by the department.

# STAT 466 – Topics in Statistics 1

Special Topics course as announced by the department.

# STAT 467 – Topics in Statistics 2

Special Topics course as announced by the department.

# STAT 468 – Readings in Statistics 1

Reading course as announced by the department.

# STAT 469 – Readings in Statistics 2

Reading course as announced by the department.

# STAT 631 – Introduction to Statistical Methods in Health Informatics

Exploratory data analysis and data visualization. Confounding, censoring, selection bias, study designs and meta-analysis. Statistical modelling for continuous and binary data. Use of a statistics package, such as SAS, to analyze case studies will be important throughout. This is open only to students registered in the Masters of Health Informatics plan.

# STAT 690 – Literature and Research Studies

# STAT 814 – Systematic Review and Meta-Analysis

This course will provide students with an overview of the rationale and stages involved in the conduct of a formal systematic review and meta-analysis of a well-defined clinical/health research question. The overarching aim is to provide students with the tools to critically appraise and conduct a systematic review and meta-analysis. Students will largely work in pairs to progress through each step involved (with feedback from instructors at each stage) and to produce a final systematic review and meta-analysis to be presented/submitted at the end of the course. Course Objectives: 1. To demonstrate an understanding of the rationale underlying a systematic review and meta-analysis and its relevance to clinical care and health policy; 2. To critically appraise a systematic review; 3. To develop a focused research question amenable to a systematic review; 4. To develop and implement a comprehensive and systematic literature search strategy; 5. To determine and apply procedures for including/excluding potential studies for a systematic review and meta-analysis; 6. To develop and implement a data abstraction process and study database; 7. To demonstrate an understanding of the fundamental statistical/biostatistical issues relevant to the conduct of a formal systematic review and meta-analysis; 8. To perform statistical analyses for a systematic review and meta-analysis and complete/present a final report demonstrating all stages involved. Students will need to meet with Course Coordinators and provide evidence of appropriate previous experience with linear and/or logistic regression techniques.

# STAT 830 – Experimental Design

Review of experimental designs in a regression setting; analysis of variance; replication, balance, blocking, randomization, and interaction; one-way layout, two-way layout, and Latin square as special cases; factorial structure of treatments; covariates; treatment contrasts; two-level fractional factorial designs; fixed versus random effects; split-plot and repeated-measures designs; other topics.

# STAT 831 – Generalized Linear Models and Applications

Review of normal linear regression and maximum likelihood estimation. Computational methods, including Newton-Raphson and iteratively reweighted least squares. Binomial regression; the role of the link function. Goodness-of-fit, goodness-of-link, leverage. Poisson regression models. Generalized linear models. Other topics in regression modelling.

# STAT 833 – Stochastic Processes

Random walks, renewal theory and processes and their application, Markov chains, branching processes, statistical inference for Markov chains.

# STAT 835 – Statistical Methods for Process Improvement

Statistical methods for improving processes based on observational data. Assessment of measurement systems. Strategies for variation reduction. Process monitoring, control and adjustment. Clue generation techniques for determining the sources of variability. Variation transmission.

# STAT 836 – Introduction to the Analysis of Spatial Data in Health Research

The objective of this course is to develop understanding and working knowledge of spatial models and analysis of spatial data. The course provides an introduction to the rudiments of statistical inference based on spatially correlated data. Methods of estimation and testing will be developed for geostatistical models based on variograms and spatial autoregressive models. Concepts and application of methods will be emphasized through case studies and projects with health applications.

# STAT 837 – Analysis of Longitudinal Data in Health Research

This course will provide an introduction to principles and methods for the analysis of longitudinal data. Conditional and random effects modeling approaches to regression analysis will be covered, as well as semiparametric methods based on generalized estimating equations. The importance of model assessment and parameter interpretation will be emphasized. Problems will be motivated by applications in epidemiology, clinical medicine, health services research, and disease natural history studies. Students will be required to think critically about appropriate strategies for data analysis. Analysis will be carried out with appropriate statistical software.

# STAT 840 – Computational Inference

Introduction to and application of computational methods in statistical inference. Monte Carlo evaluation of statistical procedures, exploration of the likelihood function through graphical and optimization techniques including EM. Bootstrapping, Markov Chain Monte Carlo, and other computationally intensive methods.

# STAT 841 – Statistical Learning - Classification

Given known group membership, methods which learn from data how to classify objects into the groups are treated. Review of likelihood and posterior based discrimination. Main topics include logistic regression, neural networks, tree-based methods and nearest neighbour methods. Model assessment, training and tuning.

# STAT 842 – Data Visualization

Visualization of high dimensional data including interactive methods directed at exploration and assessment of structure and dependencies in data. Methods for finding groups in data including traditional and modern methods of cluster analysis. Dimension reduction methods including multi-dimensional scaling, nonlinear and other methods.

# STAT 844 – Statistical Learning - Function Estimation

Methods for finding surfaces in high dimensions from incomplete or noisy functional information. Both data-adaptive methods and methods based on fixed parametric structure will be treated. Model assessment, training and tuning.

# STAT 846 – Mathematical Models in Finance

Mathematical techniques used to price and hedge derivative securities in modern finance. Modelling, analysis and computations for financial derivative products, including exotic options and swaps in all asset classes. Applications of derivatives in practice.

# STAT 850 – Estimation and Hypothesis Testing

Discussion of inference problems under the headings of hypothesis testing and point and interval estimation. Frequentist and Bayesian approaches to inference. Construction and evaluation of tests and estimators. Large sample theory of point estimation.

# STAT 854 – Sampling Theory and Practice

Sources of survey error. Probability sampling designs, estimation and efficiency comparisons. Distribution theory and confidence intervals. Generalized regression estimation. Software for survey analysis.

# STAT 890 – Topics in Statistics

# STAT 891 – Topics in Probability

# STAT 901 – Theory of Probability 1

Probability measures, random variables as measurable functions, expectation, independence, characteristic functions, limit theorems, applications.

# STAT 902 – Theory of Probability 2

Review of conditioning on sigma-fields; martingale theory (discrete and continuous-time) and applications; counting processes; Brownian motion; stochastic differential and integral equations and applications; general theory of Markov processes (including martingale problems and semigroup theory), diffusions; weak convergence of stochastic processes on function spaces; functional versions of the central limit theorem and strong laws; convergence of empirical processes.

# STAT 906 – Computer Intensive Methods for Stochastic Models in Finance

Review of basic numerical methods. Simulation of random variables, stochastic processes and stochastic models in finance. Numerical solution of deterministic and stochastic differential equations. Valuation of complex financial instruments and derivative securities. Project and paper.

# STAT 908 – Statistical Inference

Principles of Inference: sufficiency, conditionality, and likelihood; examples and counterexamples; conditional inference and ancillarity. Theory of Hypothesis Testing: Neyman-Pearson lemma; similar tests; invariant tests. Asymptotic Theory: maximum likelihood and related theory; large-sample properties of parametric significance tests. Interval Estimation: confidence intervals and significance intervals; location and scale models, conditional intervals. Introduction to Decision Theory: loss and risk functions, admissibility; minimax and Bayes rules; prior and posterior analysis. The course content of STAT 850 is a presumed prerequisite for STAT 908.

# STAT 923 – Multivariate Analysis

Multivariate problems as extensions of univariate problems, discriminant analysis, canonical correlation and principal component analysis.

# STAT 929 – Time Series 1

Iterative model building. ARIMA models, application to forecasting, seasonal models, applications.

# STAT 930 – Time Series 2

Multiple time series modeling including transfer function and intervention analysis. Various special topics in time series such as outliers, robustness, order determination methods, Kalman filtering, sampling and aggregation, seasonal adjustments.

# STAT 931 – Statistical Methods for the Design and Analysis of Epidemiological Studies

This course covers a wide range of topics pertaining to the design and statistical analysis of observational health studies. The course is divided into three areas: 1) classical epidemiological study designs including cross-sectional, cohort, population and family-based case-control studies, and issues related to selection of controls; 2) advanced epidemiological study designs including case-cohort, nested case-control, and multiphase sampling designs; 3) causal inference using propensity scores with matching, stratification and regression adjustments, marginal structural models, and instrumental variables. Studies will be discussed from the epidemiological literature and other sources in the public domain. Simulations and data analyses will be carried out using software (e.g. R or SAS). Students will be trained and assessed in part based on the preparation of reports and delivery of presentations.

# STAT 932 – Statistical Methods for the Design and Analysis of Randomized Intervention Trials

This course covers topics relevant for the design, conduct and analysis of clinical intervention trials. The statistical theory behind the methodology as well as practical issues will be discussed. The course is divided into three areas: 1) important methods for the design of randomized controlled trials including randomization techniques, sample size and power calculations, factorial designs, crossover designs, cluster-randomized trials, non-inferiority trials, adaptive designs, group sequential trials, and ethical issues in design and conduct of clinical trials; 2) topics of predictive modeling including ROC curves, explained variation, and biomarker analyses; 3) dealing with missing data in randomized trials through imputation and inverse weighting, missing covariates, non-compliance, contamination, unplanned crossover, and surrogate outcomes. Clinical trials from the medical literature and other sources in the public domain will be used as case studies for illustration. Simulations and data analyses will be carried out using statistical software (e.g. R or SAS). Students will be trained and assessed in part based on the preparation of reports and delivery of presentations.

# STAT 935 – Analysis of Survival Data

This course deals with methods of analyzing data on the time to failure with particular emphasis on the use of regression models for such data. Both parametric and semi-parametric regression models will be considered.

# STAT 936 – Longitudinal Data Analysis

This course is designed to teach students the appropriate techniques for analyzing data that are collected over time. Such data could arise from biomedical and population public health studies, as well as from finance and actuarial science applications. The course will teach how to recognize the added complexity of longitudinal data versus the univariate response data which are typically seen in introductory and generalized linear model courses. The course emphasizes the importance of the covariance structure for longitudinal responses. The students will study the difference between subject-specific and population-averaged models and how to recognize problems where one or the other approach might be more appropriate. They will be expected to use statistical software in applications in order to analyze longitudinal data.

# STAT 937 – Introduction to Biostatistics and Epidemiology

Design of medical studies including cohort designs, case-control studies and clinical trials. Measures of association in epidemiological studies. Analysis of data from cohort studies, clinical trials and case-control studies. Other topics such as multiplicity in clinical trials, group randomized designs, sequential and group sequential procedures as time permits. Use of a statistical package (e.g. SAS or S-Plus) to analyze medical data.

# STAT 938 – Statistical Consulting

This course will cover some of the basic tools of a statistical consultant. Topics will include the use of statistical packages, problem-solving techniques, discussion of common statistical consulting problems, effective communication of statistical concepts and management of consulting sessions.

# STAT 946 – Topics in Probability and Statistics

Topics of current interest.

# STAT 947 – Topics in Biostatistics

Topics of current interest.

# STAT 974 – Financial Econometrics

The focus of this course is on the statistical modelling, estimation, inference and forecasting of nonlinear financial time series, with a special emphasis on volatility and correlation of asset prices and returns. Topics to be covered normally include: review of the distribution and dynamic behaviour of financial time series, univariate and multivariate GARCH processes, long-memory time-series processes, stochastic volatility models, modelling of extreme values, copulas, realized volatility and correlation modelling for ultra high frequency data and continuous time models.