When I first heard the term ‘analytics’ used to describe the work that I do, I bristled at the shiny new euphemism dripping with corporate white-paper newspeak. “What’s wrong with ‘data analysis’?” I asked, with the naiveté of a recent graduate trained to be narrowly exact and precise in both language and thought. I resisted the term for several years, dismissing it as a buzzword that would eventually begin to sound dated, like IS or Sys Ops. Little did I know how much impact this buzzword would have. Today, analytics is being used to revolutionize almost every field of human endeavor, and is responsible for a significant chunk of innovation in the global economy. Yet there remains a lot of confusion and disagreement about what the word means, which limits its usefulness and keeps people from using analytics to their advantage.
Since I describe myself as an Analytics Professional on my LinkedIn profile, I ought to have some idea what it means, and have a good reason for saying so. Earlier this year I was asked to give a talk about analytics to an audience split between engineers and marketing managers. This was a wonderful challenge because both groups have very reasonable and well-formed impressions about what analytics means, but these impressions usually turn out to be completely wrong. My definition of analytics aims to clear up the misconceptions in both audiences, bringing not only a clearer understanding of the work that analysts do, but also some fresh insight into how analytics, engineering, and marketing together can form a tight synergistic circle.
The term analytics originated in financial circles to describe the work of the quants: programmers, usually with a math or statistics background, who designed trading systems and invented ever more complex ways for investment banks to make money with financial data. It later gained currency in market research to describe any type of data analysis that went beyond basic reporting. The work usually required some expertise in statistics as well as programming acumen to collect and prepare the data for analysis. Today there are analytics departments in almost every industry that uses data. They may share a lot of the same functions as engineering departments, and they may also share a lot of the same functions as marketing departments. Analytics is a specialized hybrid that combines engineering, marketing, finance, and statistics, all in the service of making sense of an organization’s core data.
The simplified definition is that analytics is just data analysis. However, like most simplified definitions, this one leaves a lot of room for ambiguity. Not all data analysis should be considered analytics, and analytics incorporates a lot of ancillary activities that are tangential to the actual data. What forms the core of analytics is the principle of using data for competitive advantage. What can we learn from data that will give us an edge over our competitors? This distinguishes analytics from reporting, whose purpose is to provide basic metrics. While essential, reporting is mainly focused on “keeping the lights on”. Analytics goes beyond the basics to discover what is not yet known and what can be used to transform the business, optimize the service, or generate an entirely new market. Today’s analytical projects will become tomorrow’s dashboard reports.
My definition of analytics is:
The practice of using data to build models that explain, optimize, or predict some event in order to provide a business advantage.
I find this to be a very compact definition that still incorporates the most essential aspects of what analytics means. There are three very important pieces to my definition.
- Analytics is the practice of using data. Note the word 'practice'. This implies some planning and design of an infrastructure to organize and process the data. Most data warehouses are built to specs written by finance teams to serve reporting needs. Analytics requires its own infrastructure in order to generate an efficient data pipeline for analysis. Don't hire a team of Stats PhDs and expect them to crank out the next great recommendations engine on their Dell laptops.
- The purpose of analytics is to provide a business advantage. I did not include this simply to get props from the business folks in the audience. Far too often the business problem gets buried behind implementation details, or is de-prioritized in favor of a "sexier" problem that the analysts want to work on. Let's be clear: analytics is not about building cool stuff. It's about building cool stuff that solves an important business problem or need. Anything else is a waste of time and money, and will ultimately lead the business to question the value of analytics.
- Analytics involves building models. What separates analytics from traditional old-school product development is the use of data, rather than human intuition, to guide decision-making. Anytime you use a dataset to explain existing patterns or predict future behavior, you are building some kind of model that maps the dataset to reality. It also means that analytics needs clearly defined deliverables, without which there would be no way to measure success and iterate.
Analytics is more than just building models, in the same way that software engineering is more than just programming. It is also about more than just data analysis, in the same way that marketing is about more than just writing ad copy.
To conclude, the state transition diagram of analytics work looks roughly like this:
Business problem → Data → Model → System → $
At the beginning of the chain is a business problem or need. The first task of analytics is to help the business to articulate that need and formulate a research project that will address it. The next step is to collect, organize, recode, reshape, aggregate, and/or otherwise munge the data to produce the input to the model building process. A model is then built to explain, optimize, or predict an event or process. The output of the model must inform, if not drive, decision-making, and must provide a clear means to measure the outcome. This output defines the success criteria and includes a system for refinement, iteration, and continued learning of the model over time. Finally, if all goes well, money is made and everyone goes home happy.
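As a rough illustration of that chain, here is a minimal Python sketch. Everything in it is invented for the example: the business problem (spotting customers likely to churn), the toy records, the function names, and the deliberately crude one-parameter "model". It is meant only to show where the data, model, and measurement steps sit relative to one another, not how any real analytics system is built.

```python
from statistics import mean

# Business problem (hypothetical): which customers are likely to churn?
# Toy records, made up for illustration: (customer_id, monthly_visits, churned)
RAW = [
    ("a", "12", "no"),
    ("b", "2", "yes"),
    ("c", "9", "no"),
    ("d", "1", "yes"),
    ("e", "", "no"),    # missing visit count; dropped during munging
    ("f", "3", "yes"),
    ("g", "10", "no"),
    ("h", "4", "no"),
]


def munge(raw):
    """Data step: drop incomplete rows and recode strings as int / bool."""
    return [(int(visits), churned == "yes")
            for _, visits, churned in raw if visits != ""]


def fit_threshold(rows):
    """Model step: 'fewer visits than t predicts churn', where t is the
    midpoint between the average visits of churners and non-churners."""
    churn_avg = mean(v for v, churned in rows if churned)
    stay_avg = mean(v for v, churned in rows if not churned)
    return (churn_avg + stay_avg) / 2


def predict(threshold, visits):
    """Return True when the model expects this customer to churn."""
    return visits < threshold


def accuracy(threshold, rows):
    """System step: a success measure that can be tracked as the model is
    refined. It is scored here on the same rows used for fitting, which a
    real project would avoid by holding data out."""
    hits = sum(predict(threshold, v) == churned for v, churned in rows)
    return hits / len(rows)


if __name__ == "__main__":
    rows = munge(RAW)
    threshold = fit_threshold(rows)
    print(f"threshold: {threshold}, accuracy: {accuracy(threshold, rows):.2f}")
```

The final arrow in the chain, turning the prediction into money, is not something code can show: it depends on what the business actually does with the list of at-risk customers.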
This is of course highly generalized, and many of these parts will overlap and feed back on one another, making the reality of analytics work a lot more complicated in practice than this abstract example. Nonetheless, the three key points in my definition of analytics can serve as a solid framework for how analytics is used to capitalize on data assets and add value to an organization.