Full text search with Postgres and Django

Search is one of the most fundamental problems in computer science, occupying the second half of Volume 3 of Donald Knuth’s classic work The Art of Computer Programming, so it would seem fair to assume that by 2017 it would be mainly a done deal. But search is still hard. General purpose algorithms may perform decently in the average case, but what constitutes decent performance may not scale well enough to meet user demands. Unless you’re building a search engine, you are probably not designing your data structures specifically to take advantage of search algorithms. When web application developers work on adding a search feature, they are not going to start evaluating different algorithms with Big-O notation. And they’re not going to ask whether quicksort or merge sort would obtain the best performance on their specific dataset. Instead, most likely, they will defer to whatever features are available in their chosen database and application framework, and simply go from there.

Because search is hard, many sites either defer to whatever vanilla features are available in their framework or outsource to a third-party library such as the Lucene-based document retrieval system Elasticsearch, or to a search-as-a-service provider like Algolia. While the cost in terms of both time and money of integrating third-party libraries or services can be prohibitive for a small development team, the negative impacts of providing a poor search interface cannot be overstated. In this post I will walk through the process of building decent search functionality for a small to medium-sized website using Django and Postgres.

Legion of lobotomized unices

Over the past two decades, changes have been underway with profound consequences for both social organization and system design. Virtual machines, cloud computing, and containers are reducing the need for general purpose multi-user systems and the stewardship that maintenance of such systems requires. At the same time, while we live in an age in which we are more connected than ever, we are increasingly cut off from one another because the systems we use are isolated clients. The loss of centralized loci of computing has changed the way we work and communicate online, in many ways making it more difficult to collaborate and causing more isolation by removing the shared spaces that brought us together in the past. Unix systems, which used to be the durable social centers of computing, have been replaced by a disposable legion of lobotomized unices.

Let's privatize Air Traffic Control!

Because the private sector can do things so much better.

Tech and Occupational Prestige

In my post on Age and Tech, I explored earnings trends by age for various technical occupations using data from the 2015 American Community Survey. The analysis suggests that technical occupations offer a bright future for people of all ages. Income in tech jobs is consistently well above national averages and the earnings curve by age remains strong. Older workers in tech remain employed at higher rates than in other occupations. In this post I will explore another dimension of tech occupations: social status or occupational prestige. How do Americans perceive technical occupations on the ladder of social status?

My little rant about systemd

The real issue with systemd isn’t technical, it’s sociological. How did this system achieve widespread adoption despite widespread opposition and admission even among its proponents that it wasn’t nearly stable? Understanding the social organization of open source development should be the primary goal of the community if we hope to learn from and prevent such mistakes in the future.

Age and Tech

Whether and to what extent age is a prevalent factor in the technical job market are questions that are frequently raised in press articles, blog posts, and discussion boards. The usual, but by no means sole, concern is that tech companies discriminate against older workers in favor of younger ones. The sources of this tendency are multi-faceted. In this post we will explore the Age Question in tech by examining the latest American Community Survey.

Generating post-hoc session ids in SQL

This is a short demonstration of the power of analytic functions in SQL to generate session ids for raw event data.
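
As a rough illustration of the idea, here is a minimal sketch that uses Python's built-in sqlite3 module to run such a query. The events table, its user_id and ts columns, the sample rows, and the 30-minute inactivity threshold are all assumptions made for this example, not details from the post, and the window functions require SQLite 3.25 or newer. LAG finds the gap to each user's previous event, and a running SUM over the "new session" flags numbers the sessions per user.

```python
import sqlite3

# Hypothetical raw events: (user_id, unix timestamp in seconds).
EVENTS = [
    ("alice", 0),
    ("alice", 600),
    ("alice", 5000),  # more than 30 minutes after alice's previous event
    ("bob", 100),
    ("bob", 200),
]

SESSION_GAP = 1800  # assumed inactivity threshold: 30 minutes

QUERY = """
WITH flagged AS (
    SELECT user_id, ts,
           -- 1 when the gap to the user's previous event exceeds the threshold.
           -- The first event has no predecessor (LAG is NULL), so it gets 0.
           CASE
               WHEN ts - LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) > :gap
               THEN 1 ELSE 0
           END AS new_session
    FROM events
)
SELECT user_id, ts,
       -- Running total of session starts gives a per-user session number.
       SUM(new_session) OVER (PARTITION BY user_id ORDER BY ts) + 1 AS session_id
FROM flagged
ORDER BY user_id, ts
"""


def sessionize(events, gap=SESSION_GAP):
    """Load events into an in-memory database and return (user_id, ts, session_id) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id TEXT, ts INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)
    rows = conn.execute(QUERY, {"gap": gap}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in sessionize(EVENTS):
        print(row)
```

The session ids here are only unique within a user; pairing them with user_id gives a globally unique key.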

Deploying Django projects

It takes 5 minutes for anyone with a passing familiarity with web development to get to Django’s “It Worked!” page after starting from scratch in a clean development environment. And then it takes 5 days to figure out how to deploy your simple ‘Hello World’ application to a production server. In this post I will describe a recipe for deployment of Django projects so you can focus on application development rather than wrestling with 500s.

Svelte Apache

The agenda for this post is to strip down an Apache install to the minimal configuration needed to feasibly run the server. We’ll make it slender and elegant; in a word, svelte. This can serve as a starting point for incrementally adding functionality as it is required for your particular installation. We will also turn on some monitoring modules to provide some helpful diagnostics about a running Apache server. In addition, we will install a Python script modeled on PHP’s phpinfo() function to quickly show a lot of detail about the environment in which Apache is running. Finally, since all the things should be encrypted these days, we will set ourselves up with a certificate courtesy of Let’s Encrypt, which will allow us to serve our site over HTTPS and be accepted by modern browsers.

Linux From Scratch in 2016

The Linux From Scratch project is very much alive and well in 2016. What began in the late 1990s as an educational process for building a completely customized GNU/Linux system from source code is still very much relevant today. What you will learn from going through the LFS book will augment your Linux knowledge like nothing else. Give it a try; you won’t be disappointed!
