Data vs. Information:
Raw data by itself is not very interesting. However, by applying purposeful processing with analytic algorithms to your data, you can turn unattractive bundles of characters and numbers into information and find patterns, exposing the beauty and value of the data. But wait, there is more to it. Information that is merely explanatory is less valuable than information that leads to action. So the promise of data science is that the value of your data is in the analytics, and that the highest value is in the analytics that lead to action. Do you agree? Let us examine the value chain of data in terms of analytics together.
The first question is: what are the different phases of analytics? Analytics can be:
- Descriptive
- Predictive
- Prescriptive
Descriptive Analytics is widely used in many applications today. It is what we know and love. It usually lives in a traditional business intelligence system built on a SQL database or data mart, or deployed as an OLAP cube. The presentation of descriptive analytics is usually handled by reporting tools such as Business Objects, Cognos, Excel and/or Crystal Reports. Many business intelligence systems can show trends and statistics and tell an organization what happened in the past. While descriptive analytics is useful, the main problem is that it is reactive, not proactive.
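To make the "what happened" idea concrete, here is a minimal sketch in Python of the kind of roll-up a reporting tool performs. The sales records, the field layout and the function names are all made up for illustration; a real BI system would run this as a SQL query or cube aggregation rather than in application code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical historical records: (month, region, revenue).
sales = [
    ("2024-01", "east", 120.0),
    ("2024-01", "west", 80.0),
    ("2024-02", "east", 150.0),
    ("2024-02", "west", 90.0),
    ("2024-03", "east", 130.0),
    ("2024-03", "west", 110.0),
]


def revenue_by_month(records):
    """Sum revenue per month, returned in chronological order."""
    totals = defaultdict(float)
    for month, _region, revenue in records:
        totals[month] += revenue
    return dict(sorted(totals.items()))


def month_over_month_change(totals):
    """Difference between each month's total and the previous month's."""
    months = list(totals)
    return {cur: totals[cur] - totals[prev] for prev, cur in zip(months, months[1:])}


monthly = revenue_by_month(sales)          # what happened, per month
trend = month_over_month_change(monthly)   # how it changed over time
average = mean(monthly.values())           # a simple summary statistic

print(monthly)
print(trend)
print(average)
```

Everything here looks backward: the output summarizes the past but says nothing about next month.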
Predictive Analytics: The next step up from descriptive analytics is predictive analytics. Predictive analytics is driven mainly by mathematical computations and statistical models that usually involve some kind of regression analysis. Today you can also find machine learning algorithms in open source libraries such as Apache Mahout or MLlib from UC Berkeley’s AMPLab. The way it works is that you combine your historical data with these algorithms and rules to estimate, with statistical significance, the probability that an opportunity (or problem) will present itself. It is a challenging exercise, and the model is only as good as the predictors that are chosen as inputs into the model. The key for me is to allow human intervention to tune the model, and to allow a feedback loop to enhance the model. In my experience a supervised model will outperform an unsupervised model, all things being equal. The other problem with predictive analytics is that it only goes so far as to predict a situation; it does not recommend action(s).
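As a rough illustration of "historical data plus a regression model yields a probability", the sketch below fits a one-variable logistic regression with plain gradient descent. The data set (days since a customer's last purchase, and whether that customer later churned), the learning rate and the function names are assumptions chosen for the example, not a recommendation. In practice you would reach for a library such as the ones named above.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y   # prediction error for one record
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical history: days since last purchase -> churned (1) or stayed (0).
days = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(days, churned)


def churn_probability(days_since_purchase):
    """Estimated probability that a customer with this recency will churn."""
    return sigmoid(w * days_since_purchase + b)


print(churn_probability(1.0), churn_probability(9.0))
```

The single predictor is where human judgment enters: pick an input that does not relate to the outcome and the fitted probabilities are meaningless, however well the algorithm runs.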
Prescriptive Analytics: The next level up is prescriptive analytics. Prescriptive analytics builds upon the “what” of descriptive analytics and the “when” of predictive analytics. It says why and recommends actions to take. These are suggestions, usually based on business rules or conditions that support the organization's business models. A key component of prescriptive analytics is the feedback loop. It is crucial to knowing whether the prescribed action worked. When it works, it can create immeasurable opportunities for an organization. One problem with prescriptive analytics is that rules engines are complex to build, test and maintain. The major challenge for prescriptive analytics is that it has to align the timing of the prediction (the “when”) with the recommended action.
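A toy version of the rules-plus-feedback idea might look like the following Python sketch. The rule names, thresholds and actions are invented for illustration, and a production rules engine would be far more involved. The point is only to show a predicted probability being mapped to a recommended action, and the outcome of that action being recorded so the rule can be judged later.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    threshold: float          # fire when the predicted probability is at least this
    action: str               # the recommended action
    outcomes: list = field(default_factory=list)  # feedback: True if the action worked

    def success_rate(self):
        """Share of recorded outcomes where the action worked, or None if no feedback yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)


def recommend(rules, probability):
    """Return the most specific rule (highest threshold) that the probability satisfies."""
    for rule in sorted(rules, key=lambda r: r.threshold, reverse=True):
        if probability >= rule.threshold:
            return rule
    return None


def record_outcome(rule, worked):
    """Feedback loop: remember whether the prescribed action worked."""
    rule.outcomes.append(bool(worked))


# Hypothetical business rules for a predicted churn probability.
rules = [
    Rule("high risk", 0.8, "call customer"),
    Rule("medium risk", 0.5, "send discount offer"),
]

chosen = recommend(rules, 0.9)
print(chosen.action)
record_outcome(chosen, True)
record_outcome(chosen, False)
print(chosen.success_rate())
```

Even in this tiny form you can see the maintenance burden: every threshold and action is a business decision that someone has to test and revisit as the feedback comes in.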