Data mining and predictive modeling tools can enhance BPM projects, but they must be used with care.
Good buzzwords are difficult for marketers to resist, and this seems especially true when software companies want to mark their territory by using new acronyms and feature descriptors — or revamping old ones — in the overlapping areas of business intelligence (BI), business performance management (BPM), and the product space recently branded as predictive analytics and data mining (PA/DM). If, like me, you’ve been observing the evolution of these software categories from a strictly BPM perspective, you’ll have noticed how the meanings of certain terms shift and slide, and how, as a result, the dividing lines between these sectors have become a bit blurred.
Take “predictive,” for example. Five years ago, some BPM product literature used the word rather loosely to describe any software capability that enabled the visualization of meaningful metrics, or any functionality that helped the decision-maker manage by exception. There wasn’t much actual prediction involved.
Currently, though, when a BPM product touts predictive ability, it’s the real deal. Traditional data mining and predictive modeling methods, often now together referred to as “predictive analytics,” have become much more readily available in BPM vendors’ product lines. Marketing messages and related articles hammer away at their utility, their necessity, and even their unassailable superiority over more traditional approaches.
There’s no question that predictive analytics can powerfully enhance a BPM project. However, as with any tool, in determining how and when it should be used, it’s sometimes a good idea to take a step back and think about what you need, what you’d like to accomplish, and how you intend to proceed. Here are some key points to keep in mind:
It’s an Evolution, Not a Revolution
Let’s take that step back. What has really changed in predictive technologies? Traditional data mining methods, which focus on the discovery of new relationships and patterns in large data sets, aren’t new, although they have benefited from increasing computational capabilities and new approaches. It can also be argued that there haven’t been any recent, major breakthroughs in the areas of information visualization or predictive modeling.
On the other hand, extract, transform, and load (ETL) products, middleware integration, and related deployment tools have steadily improved, and integrated data preparation tools are now more widely available in data warehousing solutions. In addition, storing very large data sets has become inexpensive, which comes in handy now that extensive data is available from sources such as detailed, real-time customer activity measurement and social media.
BPM decision-makers might find this interesting, but many will still regard predictive tools as much more useful for focused tasks such as detecting fraud, analyzing marketing campaigns, or generating customer recommendations than for, say, plotting a company’s overall direction. However, as these computing trends continue, statistical analysis and related practices for analyzing large data sets are becoming more mainstream. This is good for BPM; one can expect users to have more arrows in their quiver to predict, measure, and hit their targets.
The question then should be: How can these tools help improve the efficiency of processes and the accuracy of BPM-related decisions? Real competitive advantage will result from learning how to use each tool, knowing when to use it, and understanding how much reliance to place on its results.
You Can’t Predict Everything
Pick up any book or article on data mining or predictive analytics, and you’ll be bombarded with success stories. They’re typically very selective, and they posit — using seemingly unassailable statistical measures — that given a large enough data set, nearly everything and everyone is wholly predictable.
Just in case you were predicting that this article would take a similar direction, don’t worry — it won’t! It’s no fun being so predictable.
Current product literature in the PA/DM space often launches immediately into regression, clustering, rule induction, and other subjects that can be daunting for those of us with only an average familiarity with statistical methods. At its core, a product’s predictive ability rests on these traditional statistical methods. They can yield impressive results; however, they’re so often misused, and their results so commonly misunderstood, that business decision makers are constantly challenged to determine how much reliance to place on their recommendations.
And as if that weren’t problematic enough, the methods themselves also have potential weaknesses. It can be dangerous to base important decisions on a poorly designed statistical study or to rely too heavily on analysis results that aren’t fully understood. It’s no wonder that BPM practitioners tend to be slow to adopt these new tools. To a degree, their hesitation is justified.
Nothing Trumps Human Judgment and Common Sense
Take heart, and don’t let a statistician tell you that your decision-making days are over! If you feel that you’re still smarter than your computer, you’re not crazy. Human beings, despite their lack of raw computational power, have amazing pattern recognition abilities that no computer can come close to matching.
On the downside, our brains are limited in the amount of data they can process. In addition, our decision-making can be impaired by certain experiential patterns, emotions, social pressures, and similar factors. But a well-tuned critical reasoning process can deal with these limitations and make use of the information needed for good decisions, including the results of well-designed statistical studies.
Getting to that point isn’t easy; there’s a lot to consider when using these computational tools. The dangers include over-relying on the results of a poorly constructed study and misunderstanding the results of a well-crafted one. Even regarding the statistical methods themselves as flawless can get you in trouble. For example, it can be a mistake to hang your hat on the results of a single study just because they’re “statistically significant”; factors the study didn’t measure, however improbable they seem, may be the real cause of the effect under investigation.
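To make the point about unmeasured factors concrete, here is a minimal Python sketch built entirely on invented data. The scenario (a hypothetical "season" effect that drives both advertising spend and sales) and every name in it are assumptions made for illustration, not something taken from a real study. In this made-up world, ad spend has no effect on sales at all, yet the two move together until the hidden factor is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing its best straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - mean_y - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate(n=2000, seed=42):
    """Hypothetical data: season drives both series; ad spend never affects sales."""
    rng = random.Random(seed)
    season = [rng.gauss(0, 1) for _ in range(n)]       # the unmeasured factor
    ad_spend = [s + rng.gauss(0, 1) for s in season]   # follows the season
    sales = [s + rng.gauss(0, 1) for s in season]      # also follows the season
    return season, ad_spend, sales


if __name__ == "__main__":
    season, ad_spend, sales = simulate()
    raw = pearson(ad_spend, sales)
    adjusted = pearson(residuals(ad_spend, season), residuals(sales, season))
    print(f"ad spend vs. sales, season ignored: r = {raw:.2f}")
    print(f"ad spend vs. sales, season removed: r = {adjusted:.2f}")
```

With this many observations, the first correlation would pass any conventional significance test, while the second should hover near zero. A study that never recorded the season would see only the first number.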
Biases in human decision-making often turn up in predictive studies, for example when sampling is less than random or when Bayesian approaches are incorporated into a computational design. (A Bayesian approach or inference may include prior studies’ results or other informed guesses about certain hypotheses, which may introduce biases into the study.)
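As a rough illustration of how a prior belief can color a Bayesian result, the Python sketch below uses the textbook Beta-Binomial update for a success rate. The campaign scenario, the numbers, and the function name are hypothetical, chosen only to show the mechanism; real PA/DM products may use quite different models.

```python
def posterior_mean(prior_alpha, prior_beta, successes, trials):
    """Posterior mean of a success rate under a Beta(prior_alpha, prior_beta) prior.

    Standard Beta-Binomial result: the prior acts like prior_alpha earlier
    successes and prior_beta earlier failures added to the observed data.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return (prior_alpha + successes) / (prior_alpha + prior_beta + trials)


if __name__ == "__main__":
    # Hypothetical campaign data: 12 conversions out of 40 contacts (30%).
    successes, trials = 12, 40

    # An analyst with no strong opinion (uniform prior, Beta(1, 1)).
    neutral = posterior_mean(1, 1, successes, trials)

    # An analyst who is already convinced the rate is about 75% (Beta(30, 10)).
    optimistic = posterior_mean(30, 10, successes, trials)

    print(f"neutral prior:    {neutral:.3f}")
    print(f"optimistic prior: {optimistic:.3f}")
```

Both analysts see the same 40 observations, but the one who started with a strong expectation ends up with a noticeably higher estimate. That is not a flaw in the arithmetic; it is a reminder to ask what prior assumptions went into a result before relying on it.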
Not to worry — the risks can be managed, and the results of such studies can be measured against our informed common sense. We regularly decide whether or not to incorporate the recommendations of our computing tools into our final decisions. Anyone who intuitively knows when to ignore their car’s GPS or when they’ve wasted enough time on the phone talking to a voice recognition program understands the difference between computational intelligence that complements their own know-how and a system that seems to insult their IQ.
The key to leveraging the benefits of any PA/DM tool is to understand how it arrived at its recommendation; without this, any predictive calculation becomes a mysterious black box. It would be irresponsible to surrender your decision-making ability to such a gadget. This is a primary challenge to successfully leveraging predictive BPM and predictive analytics as a whole.
Understanding predictive analytics shouldn’t require you to forgo common sense; just the opposite, in fact. The next time you’re told that the economy is doing well because jobless claims increased by X percent less than expected, ask yourself: What was expected? What economic measurements were these growth figures based on? What were these numbers five, 10, or 20 years ago? What has changed with those factors since then?
It’s tempting to accept statistical measurements at face value without questioning them. Within the world of PA/DM, it’s equally easy to avoid questioning, say, the value of a certain standard deviation measurement when it’s one of dozens of factors to consider in your next important business decision. If you can picture yourself tossing such an interpretation task to your fresh-out-of-school MBA employee rather than tackling it yourself, you’re probably placing too much reliance on the results your PA/DM tool set gives you.
The key to utilizing predictive BPM tools for competitive success is to improve your understanding of these methods and their application to your industry. The competitive advantage that can be gained by using these tools is, of course, heavily reliant on the quality of your organization’s systems and data governance, its ability to adapt, and other foundational factors. The challenges are real, but they’re not insurmountable — and there’s no good reason to ignore these new tools when you’re measuring, managing, and designing business performance efficiencies.
Chris Iervolino holds a doctorate in computing studies and is a senior managing director at ITEC Consulting Inc., a BPM consulting organization specializing in all aspects of corporate performance application design and implementation.