Today I am extremely happy to announce the release of version 1.0 of mlelr (pronounced ‘mealer’, of course), a C program for Maximum Likelihood Estimation of Logistic Regression Models that has been in sporadic development for over 12 years. For everything you need to know, read this guided tour of mlelr which accompanies the source code.
The journey began in 2002 when I was dangerously entrusted to build a large chunk of the functionality of the CATMOD procedure from SAS using nothing but C, linux, my grad school logistic regression notes, a lot of coffee, a patient partner, and several very eager and very impatient clients. The end result was a fantastic success that gleefully enriched the company I worked for as well as the chiropractor who was subsequently charged with tending to the consequences of my absolute failure to maintain any decent posture while sitting at the computer 18 hours a day. (Proof positive that programming can indeed be back-breaking work!)
As assurance to our benefactors that I knew what I was doing, I produced a research paper Maximum Likelihood Estimation of Logistic Regression Models: Theory and Implementation which provides all the necessary theory involved to perform logistic regression as well as a brief code outline for how to accomplish it in C. In the years since, I have received a great deal of positive feedback about the article and many inquiries about expanding on the short implementation section.
Writing your own logistic regression procedure from scratch is not something I typically suggest as an efficient use of time. The corner cases are extremely tricky and the numerous existing robust implementations, both proprietary and open-source, should be one’s first recourse before attempting to roll this by hand. Still, this caveat notwithstanding, since I had actually done so once, long ago, I felt obliged to return to this project after many years and develop a fully functional and well documented C program which could, if nothing else, serve as a learning resource for students interested in how logistic regression works in all its painstaking glory.
And thus today I can announce that this project has finally been released. The project page for mlelr is rather sparse but the links will lead to all the information you could ever dream of. The code is open-source and you can download it either directly from the project page on my website, or from the GitHub repo for mlelr.
I think I spent more time writing the documentation than I did writing the actual code, so I highly urge you to read, in addition to the original paper, the accompanying documentation for mlelr. This is a beautiful 58 page paper that thoroughly describes how the code works as well as some general thoughts about C programming.