When people think about data mining, a lot of different roles come to mind: data scientists typing away in giant data centers, DBAs sitting in cubicles processing large amounts of corporate data, or an analyst building a spreadsheet for an annual report.
Maybe it’s something far more physical, bringing up images of pickaxes and hard hats and a big block of data (however that’s visualized, probably with 1s and 0s, all Matrix-like). Regardless of the image that comes to mind, it’s probably hard to imagine every business professional becoming adept at data mining in some form, and treating it as a critical competency to keep in their professional toolbox in the years to come. Yet when we explore the topic, we can easily see how data mining could become one of the preeminent skills that set folks apart in an era where it’s harder and harder to stand out in an increasingly noisy and competitive work climate. Let’s start by looking at the six tasks that make up data mining (as defined by Wikipedia):
- Anomaly detection (Outlier/change/deviation detection) – The identification of unusual data records that might be interesting, or data errors that require further investigation.
- Association rule learning (Dependency modeling) – Searches for relationships between variables. This is sometimes referred to as market basket analysis.
- Clustering – is the task of discovering groups and structures in the data that are in some way or another “similar”, without using known structures in the data.
- Classification – is the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as “legitimate” or as “spam”.
- Regression – attempts to find a function which models the data with the least error.
- Summarization – providing a more compact representation of the data set, including visualization and report generation.
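To make the first item on that list a little more concrete, here is a minimal Python sketch of anomaly detection. It uses a simple z-score rule, which is only one of many possible approaches, and the order totals, the `find_outliers` name, and the 2.0 threshold are all made up for illustration.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Hypothetical helper: the z-score rule and the default threshold
    are illustrative assumptions, not a standard everyone agrees on.
    """
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # every value is identical, so nothing stands out
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / spread > threshold
    ]

# Made-up daily order totals with one suspicious spike.
daily_orders = [102, 98, 105, 97, 101, 99, 350, 103]
print(find_outliers(daily_orders))
```

A flagged record is not automatically an error. It is just a prompt to go look, which is exactly the "requires further investigation" part of the definition.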
Though the definitions seem somewhat dense, think about how you could apply them in almost any job, from using regression analysis to build a real estate data model that improves pricing predictions, to using summarization to build a better financial report that helps your senior leaders interpret how great a quarter you had.
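As a rough illustration of the real estate idea, the sketch below fits a straight line to a handful of home sales using ordinary least squares. The square footage and price figures are invented, `fit_line` and `predict` are hypothetical helper names, and a real pricing model would need far more data and more than one input variable.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Apply the fitted line to a new input value."""
    return slope * x + intercept

# Invented example data: square footage and sale price.
square_feet = [1000, 1500, 2000, 2500]
sale_price = [200_000, 290_000, 410_000, 500_000]

slope, intercept = fit_line(square_feet, sale_price)
estimate = predict(slope, intercept, 1800)
print(f"Estimated price for 1,800 sq ft: ${estimate:,.0f}")
```

The slope here reads as "dollars per additional square foot", which is the kind of plain-language takeaway that makes a regression useful to someone who never sees the code.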
Some methods of data mining are harder than others, and you can quickly get out over your skis without proper learning. Still, knowing how to sift through data and pull out the useful stuff will give you a better sense of the world you work in, because you’ll understand the data that matters. And it’s so easy these days to learn data mining techniques online!
Just typing “data mining classes online” into a search engine produces hundreds of leads, from Coursera to MIT OpenCourseWare. Though some options go into areas like data science, which is a much deeper level of analysis, it all starts with understanding data and how best to derive meaning from it, regardless of how deep into the weeds you want to go.
This in turn gives you a big leg up on your competitors, who are largely relying on other services or people to hand them processed data and conclusions to act on. Going from a commodity to a distinct competitive advantage means going in a direction others aren’t, and just having a nicely worded dictionary isn’t enough these days. You need to be able to turn that dictionary into a novel, and tell a story with the data that reveals things about your business or your industry and drives better decisions through unique insights.