Stat Resources

Resources in Statistical Programming

This page describes tools, documentation, and research papers I have written in the area of statistics and statistical programming.

Paper: Maximum Likelihood Estimation of Logistic Regression Models

Built upon the theory of generalized linear models, logistic regression has proven to be one of the most versatile techniques used to model the outcomes of a categorical dependent variable. This paper presents a mathematical introduction to the logistic regression model, beginning with the binomial case and then generalizing to the multinomial case. This is followed by a discussion of maximum likelihood estimation and a generic implementation of an algorithm used to estimate logistic regression models.
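To give a flavor of the estimation problem, here is a minimal sketch in Python of Newton-Raphson maximum likelihood estimation for the simplest binomial case: one independent variable plus an intercept. It is an illustration only and is not the code from the paper or from mlelr; the function name fit_logistic and the toy data are made up for this example.

```python
import math

def fit_logistic(xs, ys, iterations=25, tol=1e-10):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    Hypothetical helper for illustration; assumes the data are not
    perfectly separable, otherwise the estimates diverge.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the gradient of the log-likelihood (g0, g1) and the
        # entries of the 2x2 information matrix X'WX (h00, h01, h11).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve (X'WX) d = gradient using the closed-form
        # inverse of a symmetric 2x2 matrix.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Toy data, invented for this example.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print("intercept:", b0, "slope:", b1)
```

At the maximum, the gradient of the log-likelihood is zero, so the sum of the residuals y - p vanishes. That makes a handy sanity check on any implementation.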

Update! I have completed and released the source code for mlelr, along with a detailed description of how the code works.

SAS Code to Import Netflix Prize data

Netflix is offering one million dollars to anyone who can build a recommender system that is 10% better than theirs. Register at netflixprize.com. I'm rooting for the SAS user community here, so I'm sharing code to import the Netflix data files into SAS datasets. The code is well-tuned and correct as far as I know. If anyone spots a bug or has a problem, feel free to write to me or post to the prize forum. I'm off to the ACM Library to catch up on about 15 years of research. Good luck!

A SAS Macro to automate the evaluation of logistic regression equations

This is a SAS Macro that automatically evaluates the predicted probabilities for all independent variable value combinations resulting from a model estimated using proc catmod. This is free software, and I apologize that it is not neatly packaged. It will likely require some modifications to run on your system. Soon, I plan to more fully document its features and provide a textbook example showing its use.
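Until that example is ready, the basic idea can be sketched in a few lines of Python. This is not the macro itself: the function name, the variable names, and the coefficient values are all hypothetical, and it assumes a simple binary logit with numerically coded variables, which may differ from the parameterization proc catmod uses.

```python
import itertools
import math

def predicted_probabilities(intercept, coefficients, levels):
    """Yield (combination, probability) for every combination of values.

    coefficients: dict mapping variable name -> estimated coefficient
    levels:       dict mapping variable name -> list of values to evaluate
    """
    names = list(levels)
    # itertools.product enumerates the full cross of all variable levels.
    for values in itertools.product(*(levels[name] for name in names)):
        # Linear predictor, then the inverse logit to get a probability.
        eta = intercept + sum(coefficients[n] * v for n, v in zip(names, values))
        yield dict(zip(names, values)), 1.0 / (1.0 + math.exp(-eta))

# Made-up estimates for two hypothetical indicator variables.
coefficients = {"smoker": 1.0, "treated": -1.0}
levels = {"smoker": [0, 1], "treated": [0, 1]}
for combo, prob in predicted_probabilities(0.0, coefficients, levels):
    print(combo, round(prob, 4))
```

The number of rows produced is the product of the number of levels of each variable, so the table grows quickly as variables are added.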