Applied Categorical Data Analysis
EdPsych/Psych/Soc 589
C.J. Anderson
Spring 2009
Last revised: May 4, 2009
Questions or problems regarding this site should be sent to
cja@illinois.edu.
General Information:
Announcements:
- 5/4/2009:
- The wrong dataset for problem 3 of the final was posted; the correct one is now posted.
- All homework answer keys and SAS code are posted.
- 2/18/2009:
- Office hours canceled Wed Feb 18.
- Homework 4 is posted.
- Note that other computer labs that have SAS are in the Foreign Language Building,
ATLAS (2nd floor of Lincoln Hall), the Union, the Undergraduate Library, and rm 15/17 Education Building.
- ATLAS is an OPEN statistics and GIS computer lab in 202 Lincoln Hall. Below are links to more information and the services that
it offers.
- Besides the SAS courses at ATLAS, the Statistics Department also offers SAS instruction. For more information see
http://www.stat.uiuc.edu/iso/training.htm (these cost $50 to $75).
- 2/2/2009: Homework 2 is posted.
- 1/28/2009: We have very good students this year who are catching various mistakes.
- The page numbers for problems 2 & 3 in the first homework assignment have been
corrected.
- The first set of notes now has all known errors taken out:
- The computations of the probabilities for the Poisson distribution are now correct.
- Although the SAS commands in the notes I gave in class and the SAS program online were
correct, those in the online version of the lecture notes weren't. The latter have been
corrected.
- 1/27/2009: An introduction to SAS was given in rm 15/17 Education building on Monday 9-10am.
- 1/21/2009: A student found some errors that I made in the example on Poisson
regression. I've revised the SAS program (makes a much nicer graph too), and
I added another one for this example. I'll also fix the notes.
- 1/8/2009: The course is full.
Lecture Notes:
Suggestion: Only print one or two lectures at a time. I often make
changes to the notes.
- Section 1. Introduction.
SAS code and Extra pages illustrating PROC FREQ using the BINOMIAL options:
- (date & time TBA) Optional: Introduction to SAS to be held in the computer lab. We will go over the first two items.
- Section 2. Two-way tables
- Section 3. Three-way tables.
- Section 4. Generalized linear models.
- Section 5. Regression Models for Counts.
- Section 6. Logistic Regression
- Part 1 The basics
SAS programs used in these notes:
- Part 2 Added complexities of multiple explanatory & qualitative variables. (complete)
- Section 7. Loglinear models for contingency tables
- Section 8. Model building for logit and loglinear models.
- Section 9. Logit models for multicategory variables
Extra reading:
This site:
- Anderson & Rutkowski (2008). Multinomial Logistic Regression. In Osborne "Best
Practices in Quantitative Methods".
- Anderson (in press). Categorical Data Analysis with a Psychometric Twist. In
Millsap & Maydeu-Olivares "The Sage Handbook of Quantitative Psychology". 311-336.
- Section 9. Models for matched pairs
Homework and Exams
Note: The SAS programs are the ones that I used in creating the answer
keys; that is, there may be extra things in them that were not needed, and
I didn't write them with the intention of posting them on the web.
- Homework 1
- Homework 2
- Homework 3
- Homework 4
- Homework 5
- Midterm
- Homework 6
- Homework 7
- Homework 8
- Homework 9
- Final Exam or Project
Example SAS Programs (most are in ASCII/text format):
- Linear, logit, and probit models of probabilities: model the probability
of having attended an academic program as a function of achievement test
scores.
- Dealing with overdispersion (SAS options and negative binomial):
Poisson regression example of number of deaths due to AIDS.
- Multiple logistic regression
- Log-linear models and SAS:
- Computing the dissimilarity index
- Log-linear/logit model connection
- SAS program for 4-way table
(Marital status x Gender x PMS x EMS). Two log-linear models are computed
including one that is equivalent to a logit model, which is also fit directly
as a logit model.
- SAS output.
- Example of linear-by-linear, uniform, and nominal-by-ordinal
association models using SAS (High School and Beyond, SES x HSP).
- Example of effect of sampling zeros used in lecture.
- Generalized CMH tests for ordinal x ordinal, nominal x ordinal,
and general association.
SAS input and output.
(in text format)
- Example of baseline/multinomial logistic response model
(output is in PostScript format).
- Likelihood ratio tests of the equality of parameters over response
options in the multinomial response model
(output is in PostScript format).
- Examples of conditional logit response model
(output is in PostScript format).
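For readers without SAS, the dissimilarity index listed among the log-linear examples above can be sketched in a few lines of Python. This is only an illustration, not the posted SAS program: it assumes the usual definition D = sum(|n_i - mu_i|) / (2n), where n_i are observed cell counts, mu_i are fitted counts from some model, and n is the total sample size. The function name and the counts below are hypothetical.

```python
def dissimilarity_index(observed, fitted):
    """Return D = sum(|n_i - mu_i|) / (2n) for observed and fitted cell counts.

    D is the proportion of observations that would have to move to a
    different cell for the model to fit the table perfectly.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same number of cells")
    total = sum(observed)
    if total <= 0:
        raise ValueError("the table must contain at least one observation")
    return sum(abs(n - mu) for n, mu in zip(observed, fitted)) / (2 * total)


if __name__ == "__main__":
    # Hypothetical counts for a 2 x 2 table, flattened row by row.
    observed = [30, 20, 10, 40]
    fitted = [25.0, 25.0, 15.0, 35.0]  # made-up fitted values, not from any course data set
    print(f"D = {dissimilarity_index(observed, fitted):.3f}")
```

Values of D near 0 indicate that the fitted counts are close to the observed counts in a practical sense, even when a large sample makes a goodness-of-fit test significant.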
Handy Programs and Links:
- CIforP.f:
A FORTRAN program that computes large-sample confidence intervals for a
proportion. For PCs, an executable
version of CIforP (i.e., already compiled) is also available.
- pvalue.f:
A FORTRAN program that computes p-values and (Bonferroni) critical values
for the standard normal, chi-squared, t, and F distributions (and for correlations).
For PC users,
pvalue.exe
is an executable (i.e., already compiled) program.
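As a rough illustration of the kind of computation these two programs perform, here is a minimal Python sketch using only the standard library. It is not a port of CIforP.f or pvalue.f, whose exact methods are not described here: it assumes the Wald form of the large-sample interval for a proportion and a two-sided Bonferroni-adjusted critical value for the standard normal only. The function names and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(successes, n, conf_level=0.95):
    """Large-sample (Wald) confidence interval for a binomial proportion.

    Assumption: this is the simple p_hat +/- z * SE interval, truncated to [0, 1].
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def bonferroni_z_critical(alpha, n_tests):
    """Two-sided standard normal critical value with a Bonferroni adjustment.

    Each of the n_tests comparisons is carried out at level alpha / n_tests.
    """
    if not 0 < alpha < 1 or n_tests < 1:
        raise ValueError("need 0 < alpha < 1 and n_tests >= 1")
    return NormalDist().inv_cdf(1 - alpha / (2 * n_tests))


if __name__ == "__main__":
    # Hypothetical data: 40 successes out of 100 trials.
    low, high = wald_ci(successes=40, n=100)
    print(f"95% Wald CI: ({low:.3f}, {high:.3f})")
    print(f"Bonferroni z, 5 tests at alpha = 0.05: {bonferroni_z_critical(0.05, 5):.3f}")
```

The Wald interval is known to behave poorly when the proportion is near 0 or 1 or when n is small, so a score-type interval may be preferable in those cases.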