PLEASE SEE UPDATES ON THE PAPER (AND ASSIGNMENT 3...)

ACCESS THE PAPER GUIDE FOR THE UPDATED DUE DATES. Thank you!


WATCH THESE SPACES FOR ANNOUNCEMENTS!

PRESENTATION SITE

 

GUIDE 1: ISSUES IN MODELING
GUIDE 2: TERMINOLOGY
GUIDE 3: THE LOWLY 2 X 2 TABLE
GUIDE 4: BASICS ON FITTING MODELS
GUIDE 5: SOME REVIEW, EXTENSIONS, LOGITS
GUIDE 6: LOGLINEAR & LOGIT MODELS
GUIDE 7: LOG-ODDS AND MEASURES OF FIT
GUIDE 8: LOGITS, LAMBDAS & OTHER GENERAL THOUGHTS
READINGS

PLEASE TURN ALL CELL PHONES OFF DURING CLASS. THANKS!
(I will also try to remember)


 
 
 
EDF 6937-03       SPRING 2009
G158  Stone Building (new classroom)
Thursdays  2:15-4:40 PM
THE MULTIVARIATE ANALYSIS OF CATEGORICAL DATA
Ref # 06161
Susan Carol Losh
Department of Educational Psychology and Learning Systems
Florida State University

 
 
 LINKS TO EXERCISES AND FEEDBACK GENERALLY GO HERE AT THE TOP.

 
WATCH FOR ANNOUNCEMENTS ABOUT:

SCHEDULE CHANGES

PAPER INFORMATION

FEEDBACK

ETC.


 
COURSE OVERVIEW
REQUIRED MATERIALS
ASSIGNMENTS
WEB-ASSISTED INFO
COURSE TOPICS

 
MY OFFICE: 307K Stone Building
PHONE: 850-644-8778
OFFICE HOURS: 1:00-2:05 P.M. Tuesday & Thursday; typically Wednesdays & by appointment. Any exceptions to be announced.
EMAIL: slosh@fsu.edu


FSU OLD Stone Building 

INSTRUCTOR: Professor Susan Carol Losh
006  Stone Building
Thursdays 2:15-4:40 PM
CLICK HERE  to find the Stone Building

PLEASE INFORM ME IMMEDIATELY IF YOU REQUIRE ANY ASSISTANCE WITH DISABILITIES!
 
 

COURSE OVERVIEW

The Multivariate Analysis of Non-numeric Data addresses models with categorical dependent variables. Analyses with these variables do not lend themselves to some of the more traditional statistics used in education and the behavioral or social sciences. Part of our course centers on causality and testing causal models. Along the way we will encounter maximum likelihood estimators, Likelihood-Ratio Chi-Squares, and diverse measures of fit. This material requires at least one course beyond the basic introductory statistics class (e.g., multiple regression, ANOVA, the General Linear Model or structural equation models).

By the end of the semester, you should be able to set up, test and interpret:

   --loglinear and logit models
   --logistic regression equations for both binary and multinomial dependent variables

While it seems elementary, you MUST read assigned material on time to thoroughly understand the topics. Skim material first, then read it again. Readings will overlap slightly so that you will receive different presentations of the same material. Course lectures clarify issues in your readings, provide some of the missing "steps" taken by the authors, reiterate some problems faced by researchers using these techniques, and give examples and applications. I will typically begin with a selection from Gilbert's classic book. Nigel Gilbert has the gift of taking highly mathematical materials and turning them into highly intelligible words. (Unfortunately, he is now Chancellor at the University of Surrey, UK, and is not planning a new edition of the book.)

Alas! Simply reading materials does not confer peace of mind, let alone a healthy respect for everything that can go wrong. Until you try basic applications, these techniques are difficult to understand or evaluate. Once you work an exercise, the materials become much clearer. You will complete four short exercises designed to familiarize you with terminology, basic concepts, and computer techniques. Finally, you will analyze data and write a short paper, as well as give a short presentation based on your paper. A handout will set the fundamental parameters for the analysis paper. Several datasets are available to you, or (encouraged) you may use your own data.

It is insufficient to simply report results; results must be interpreted. Did your original substantive hypotheses receive any support? Were they unequivocally rejected? Did they make sense in your causal system? We will address causal issues several times throughout the semester.
 
 
 
ELEMENTARY CAUSAL EXAMPLES 101:

Did you know? 
Looking across months of the year, ice cream consumption and criminal assaults rise and fall together. 
Students with lower grades tend to smoke more cigarettes.
The storks return to nest in Sweden just as the number of births rises during the year.

Does eating ice cream cause criminals to assault others?
Does smoking cigarettes literally stupefy you?
Do storks really bring babies?
Stay tuned.

Statistics never, ever "prove causality." However, certain patterns of results will be more consistent with hypotheses than others. And, you will discover, certain numeric patterns of results can be the same for "real" or substantive findings--and for fake or "spurious" findings.
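To make the idea of a "spurious" pattern concrete, here is a minimal sketch in Python. The counts are entirely hypothetical (invented for illustration, not real data), and the function name `odds_ratio` and the variable names are assumptions of this example. Days are cross-classified by ice cream consumption (high/low) and assaults (high/low), separately for summer and winter.

```python
def odds_ratio(table):
    """Odds ratio for a 2 x 2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# HYPOTHETICAL counts of days (illustration only, not real data).
# Rows: ice cream consumption high / low. Columns: assaults high / low.
summer = [[60, 20],
          [15, 5]]
winter = [[5, 15],
          [20, 60]]

# Collapse over season by adding the two tables cell by cell.
marginal = [[s + w for s, w in zip(s_row, w_row)]
            for s_row, w_row in zip(summer, winter)]

print("Odds ratio within summer:", odds_ratio(summer))
print("Odds ratio within winter:", odds_ratio(winter))
print("Odds ratio ignoring season:", round(odds_ratio(marginal), 2))
```

Within each season the odds ratio is exactly 1 (no association), yet the collapsed table gives an odds ratio of about 3.45. In this made-up example the apparent link between ice cream and assaults comes entirely from the season.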

REQUIRED COURSE TEXTS & LECTURES

MAIN TEXT: Alan Agresti, An Introduction to Categorical Data Analysis. SECOND EDITION! (Wiley, 2007; ISBN = 0471226185)
 
 
 
NOTE: A more expanded version of Agresti, which is more mathematical, can be found at:

Alan Agresti, Categorical Data Analysis. SECOND edition, 2002. ISBN = 0471360937
 

Other readings from: Nigel Gilbert, Analyzing Tabular Data: Loglinear and Logistic Models (1993; ISBN = 1857280903) and the Online Course Guides

The Agresti book has been ordered for Bill's and the FSU Bookstore. You can quite possibly find used copies online at Amazon and other sites for a good price. I will make copies of the Gilbert chapters for you.

ALL MY COURSE LECTURES will be placed on the Internet and linked in with each course topic.

Course guides will be keyed to the readings. See the top of each Guide as it is posted.
The lecture urls have the general form of:

http://mailer.fsu.edu/~slosh/CatDataGuide1.html

Please type in course urls EXACTLY. There is no "www" in these urls.

Some material in the guides will be covered during class. However, we will also use class time for instruction related to each exercise, demonstrations, presentations and feedback.

COURSE ASSIGNMENTS

Here is information about assignments, due dates, and course weights.

There will be four short, equally weighted exercises. While each exercise will focus on the immediately prior units, please be advised that this material is cumulative in nature.

Exercises have several purposes:

To familiarize you with terminology, basic operations, and associated computer programs.

To help practice your basic analysis and results interpretation skills. For example, logistic regression coefficients are often used and equally often misinterpreted.

To alert you to common problems that occur with different kinds of analyses and ways to solve these problems.
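As one illustration of how logistic regression coefficients get misinterpreted, here is a minimal sketch in Python using only the standard library. It assumes the simplest possible case, a single binary predictor, where the maximum likelihood estimates can be read directly from the 2 x 2 table. The counts and the names `logit_coefficients` and `inverse_logit` are hypothetical, chosen for this example only.

```python
import math

def logit_coefficients(table):
    """Closed-form logistic regression estimates for ONE binary predictor.

    table = [[a, b], [c, d]] with rows X=1, X=0 and columns Y=1, Y=0.
    """
    (a, b), (c, d) = table
    intercept = math.log(c / d)          # log-odds of Y=1 when X=0
    slope = math.log((a / b) / (c / d))  # log of the odds ratio
    return intercept, slope

def inverse_logit(z):
    """Convert a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-z))

# HYPOTHETICAL counts (illustration only, not real data).
table = [[30, 10],   # X=1: 30 with Y=1, 10 with Y=0
         [20, 40]]   # X=0: 20 with Y=1, 40 with Y=0

intercept, slope = logit_coefficients(table)
p0 = inverse_logit(intercept)          # P(Y=1 | X=0)
p1 = inverse_logit(intercept + slope)  # P(Y=1 | X=1)

print("slope (log-odds scale):", round(slope, 3))
print("exp(slope) = odds ratio:", round(math.exp(slope), 3))
print("P(Y=1 | X=0):", round(p0, 3))
print("P(Y=1 | X=1):", round(p1, 3))
print("ratio of probabilities:", round(p1 / p0, 3))
```

Here exp(slope) is 6, but that is a ratio of odds: the probability itself moves from about 0.33 to 0.75, a ratio of 2.25. Reading the coefficient, or its exponent, as a change in probability is the kind of misinterpretation the exercises are meant to catch.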

All four exercises put together will count a total of 40% toward your final grade.

Details on each assignment are posted to our course WEB site prior to the due date.

As exercises and exercise feedback sites are created and posted, watch the space at the top of the Guides for information and links.

An analytic paper will count 40% toward your final grade.

A presentation based on your analytic paper will count 20% toward your final grade.

In the paper, you will analyze data using course material and interpret your findings. The general format of the paper will resemble a journal article.

You may use your own data, data from your major professor, or one of the many data sets I have available. There are many databases online.


I use plus and minus grading throughout and for final grades.

If I think you are having trouble with the material, I will alert you immediately. I expect you will seek help as quickly as possible. If you receive such an alert, please take it very seriously. Please do not tell me that you "really understand the material" and fail to seek help. I issue such alerts when the work makes it appear that the student DOES NOT understand the material.


 
 
EXERCISE | DUE DATE | COURSE WEIGHT
1. Terminology and purpose | February 5 | 10 percent
2. Using a hierarchical loglinear program | February 26 | 10 percent
3. Exercise on general loglinear models | April 2 (originally March 26) | 10 percent
4. Loglinear to logit transformations (includes program exercise) and logistic regression | April 9 | 10 percent
PRESENTATION ON PAPER TOPIC & ANALYSIS | April 9-23 | 20 percent
COURSE PAPER | April 28 by 5 PM (originally April 24 by NOON) | 40 percent

 
IMPORTANT NOTE!
EXERCISE DUE DATES
TURNING IN EXERCISES

 

We are on a tight schedule so exercises must reach me BY THE DUE DATE. Because of the intensive nature of this course, late exercises are not accepted. Because of our schedule, I try to return assignments quickly. If you are late, I just might hand them back before you turn yours in.

 

GETTING EXERCISES TO ME: see the SUBMITTING ASSIGNMENTS webpage here!

I ACCEPT HARD COPY EXERCISES ONLY.

DO NOT SEND ME ANY EMAIL ATTACHMENTS. THEY WILL NOT BE OPENED!

ONE MORE NOTE ON EMAIL: Widespread viruses spread through email use subject lines such as "hi," "hello," "hi there," "thanks," or "my test," or no subject heading at all. Even when they are not viruses, some of these subject lines are used to camouflage advertisements for products I neither use nor want (I received over 75 of these over Break!). PLEASE USE SOME OTHER SUBJECT LINE. I will delete without opening any emails that have subject lines such as "hi" or "my assignment." ("my edf6937 assignment" works fine.)

INFORMATIVE SUBJECT LINES INCLUDE:

  • EDF6937 Exercise 1 question
  • Loglinear exercise question
  • Computer exercise question
  • Question about paper
 


WE’RE ONLINE!

Our course is WEB assisted through Blackboard at FSU. You MUST be registered for edf6937-03 to access our Blackboard site. To access our course through Blackboard, here is what to do. Go online to:

http://campus.fsu.edu

(You will be forwarded to the new, more complicated url <http://campus.fsu.edu/webapps>. The above works and is easy to remember.) Enter your FSU username (USERNAME ONLY!) and password to log in. For example, I would enter "slosh" ONLY and omit the "@fsu.edu" part. Then click on:

MULTVAR ANAL CATEG D

If you DON'T have an FSU account, you need to get one NOW. Go to the Computing Services website (address below) and follow the links to register online for your FSU account and email. (Of course you can set up your FSU email account to forward to the email account of your choice.)

http://www.ucs.fsu.edu

You need an FSU account to log into Blackboard.

Our course can also be accessed directly through the Internet and FSU's mailer system. Go to:

http://mailer.fsu.edu/~slosh/CatDataOverview.html

(That's THIS WEB address.)

You can link to nearly all the course sites from this central location.
 
 

 

NOTES ON THE NET

I created our navigation system and guides, so there's only one copy of each site. 

Thus, each url is CASE SENSITIVE so you must copy capital and small letters EXACTLY.

There is NO "www" in course WEB addresses so don't insert one. 

The number "0" is different from the letter "O". Don't confuse them!

Don't add any spaces to the web address. 

If you want, everything can be accessed through Blackboard so not to worry.
 

I will use WEB-assist for several course features:

Each Guide (lecture) will have links posted at the top to the Course Overview, Syllabus, and all other course Guides. Watch the top of each Guide for announcements about assignments, generic feedback, and any schedule changes. It's easy to navigate from one site to another. 


 
BASIC COURSE TOPICS, READINGS, AND IMPORTANT DATES

There may be some variations from this syllabus. Please check back weekly and watch Blackboard for any announcements.
DATES | TOPICS TO BE COVERED | ISSUES AND OBJECTIVES
January 8-15 | Introduction: Issues in Modeling. Causal issues in experimental, nonexperimental and observational data and their implications for models. Navigating our course web sites. | What are the fundamental analytic problems? What (to anticipate) are some solutions? Basic causal issues. Review of types of data.
January 15-22 | Review General Linear Model; issues of basic terminology. Introduce terminology (Odds-Ratios; MLEs; iteration; Chi-square as goodness of fit). The basic two-way (2 X 2) cross-tabulation table recast. | What are important aspects of the GLM? What are the building blocks and terms of multivariate analyses of non-numeric data?
January 22-29 | Basic loglinear models. A probabilistic model for table cell counts. Poisson and multinomial distributions. Begin with two-way table, extend initially to three-way (2 dichotomous, 1 not, then to multinomial). Frequency equations, transformed to log-linear equations. | What do loglinear models "look like"? What is a general cell frequency model? Show me an example!
February 5 | EXERCISE 1 DUE: Basic terminology exercise
February 5-12 | N-way tables. Model construction and model testing in the loglinear model. Extending the model to "n dimensional" cross-tabulation tables. | How do I test my loglinear model? Which parameters can I drop?
February 12-19 | Introduction to programs
February 26 | EXERCISE 2 DUE: Program 1 and writing a loglinear equation
February 26-March 5 | The logit model. Transformation from the loglinear to the logit model. Transform equations to logit model. | What is a logit model? What are the advantages to a logit model? What's the relationship between a loglinear model and a logit model?
March 9-13 | No class--Spring Break
March 19 | Issues in testing logit models and demonstrations.
April 2 (originally March 26) | EXERCISE 3 DUE: Nonhierarchical Loglinear models
March 26 - April 2 | The special case of logistic regression. Dichotomous and polytomous dependent variables. Poisson, binomial and multinomial regression. More programs: Multinomial logistic regression. | What's the relationship between the GCF and logit models and logistic regression? How do I handle dependent variables with more than two categories?
April 9 | EXERCISE 4 DUE: Loglinear to logit transformation. Running a logit model program.
April 9 | First draft analytic paper (allows me to get it back to you in time to do revisions). Presentations. PRESENTATION INFO
April 9-23 | Various extensions and special cases: Model Fitting; Ordinal response variables versus nominal models; Quasi independence; Structural versus sampling zeros. Presentations. | When Chi-square just isn't enough. Many many measures of fit. Are there advantages to an ordinal dependent variable? What about cells with no cases? What are some other extensions? A variety of odds and ends and review.
April 28, 5 PM (originally April 24, NOON) | Analytic paper due. FINAL DUE DATE!! Paper Guide

A LECTURE (AND ASSOCIATED MATERIALS) WILL BE LINKED WITH EACH TOPIC AS THE SEMESTER PROGRESSES.
 
 
READINGS AND ASSIGNMENTS

There may be some minor changes as the semester progresses.
Your patience is appreciated.
Susan Carol Losh
Update January 5 2009
Update March 18 2009