Data Mining is a set of processes for analyzing large data stores or data sets and discovering the useful, actionable knowledge buried within them. This knowledge discovery involves finding patterns or behaviors in the data that lead to some profitable business action. Data Mining generally requires large volumes of data, both historical and current, from which to extract this knowledge.
Once the required amount of data has been accumulated from various sources, it is cleaned, validated and prepared for storage in a data warehouse or data mart. BI reporting tools capture the required facts from this data for use by the knowledge discovery process. Data Mining can be accomplished with one or more of the traditional knowledge discovery techniques, such as Market Basket Analysis, Clustering, Memory Based Reasoning, Link Analysis and Neural Networks.
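To make one of these techniques concrete, here is a minimal sketch of Market Basket Analysis in Python. It counts how often pairs of items are bought together and derives simple "if A then B" rules with their support and confidence. The transactions, the `pair_rules` function name and the `min_support` threshold are all assumptions made up for illustration; real tools use more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_rules(transactions, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be interesting
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Strongest rules (highest confidence) first.
    return sorted(rules, key=lambda r: r[3], reverse=True)

rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Support is the share of all transactions that contain both items, and confidence is the share of transactions containing the first item that also contain the second. A rule with high confidence suggests a pattern that could drive an action such as shelf placement or a bundled offer.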
Data Mining Life Cycle:
- Find out the Business Problem: Suppose a company’s current-year sales dropped by some percentage compared to the previous year. Using OLAP tools, the exact sales figures can be determined across several dimensions such as region and time.
- Knowledge Discovery: Given this business problem, the possible reasons for the decrease in sales have to be analyzed using one or more Data Mining techniques. Causes may include poor product quality or service, flaws in marketing schemes, lower demand for the product, seasonal changes, government regulations, or pressure from competitors. The exact solutions needed to resolve this sales drop have to be found; this is what we call Knowledge Discovery here.
- Implement the Knowledge: Based on the above discovery, proper actions should be taken to overcome the business problem.
- Analyze the Results: Once the actions have been implemented, the results need to be monitored and measured to determine the outcome of those actions.
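The first step of the life cycle, measuring the sales drop across a dimension, can be sketched as follows. This is only an illustration in plain Python, not how an OLAP tool works internally; the sales figures, region names and the `yoy_change_by_region` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts as (year, region, amount); all values are invented.
sales = [
    (2022, "North", 120_000), (2022, "South", 95_000), (2022, "West", 80_000),
    (2023, "North", 118_000), (2023, "South", 70_000), (2023, "West", 82_000),
]

def yoy_change_by_region(facts, previous_year, current_year):
    """Return the percentage change in sales per region between two years."""
    # Roll the facts up to one total per (region, year).
    totals = defaultdict(lambda: defaultdict(float))
    for year, region, amount in facts:
        totals[region][year] += amount
    changes = {}
    for region, by_year in totals.items():
        prev, curr = by_year[previous_year], by_year[current_year]
        if prev:  # skip regions with no sales in the earlier year
            changes[region] = (curr - prev) / prev * 100
    return changes

changes = yoy_change_by_region(sales, 2022, 2023)
# List the regions with the largest drop first.
for region, pct in sorted(changes.items(), key=lambda kv: kv[1]):
    print(f"{region}: {pct:+.1f}%")
```

A breakdown like this shows where the drop is concentrated, which narrows down where the Knowledge Discovery step should look for causes.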
OLAP vs Data Mining:
OLAP helps organizations determine measures such as sales drop, productivity, service response time and inventory on hand. Simply put, OLAP tells us ‘What has happened’ and Data Mining helps to find out ‘Why it has happened’ in the first place. Data Mining can also be used to predict ‘What will happen in the future’ with the help of data patterns available within the organization and publicly available data.
For example, if a borrower with a bad credit and employment history applies for a mortgage loan, the application may be denied by the mortgage lender, since the borrower may default on the loan if it is approved. The lender would have come to this decision based on previously mined historical data in which borrowers with a similar profile followed a similar pattern.
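The mortgage example can be sketched with Memory Based Reasoning, here in the form of a simple nearest-neighbor vote: a new applicant is compared with the most similar past borrowers, and the majority outcome among them is used as the prediction. The historical records, the two features, the scaling ranges and the `predict_default` function are all assumptions for illustration; an actual lender would use far more data and a validated model.

```python
import math

# Hypothetical historical loans: (credit_score, years_employed, defaulted).
history = [
    (580, 0.5, True), (600, 1.0, True), (620, 2.0, True),
    (590, 3.0, True), (640, 1.5, False), (700, 5.0, False),
    (720, 8.0, False), (750, 10.0, False),
]

def predict_default(applicant, history, k=3):
    """Predict default by majority vote among the k most similar past borrowers."""
    def distance(record):
        # Scale each feature to roughly 0..1 so neither dominates.
        # The ranges below are assumptions: scores 300..850, employment 0..40 years.
        d_score = (record[0] - applicant[0]) / 550
        d_years = (record[1] - applicant[1]) / 40
        return math.hypot(d_score, d_years)

    nearest = sorted(history, key=distance)[:k]
    defaults = sum(1 for record in nearest if record[2])
    return defaults > k / 2

risky = predict_default((595, 1.0), history)   # weak credit, short employment
safe = predict_default((730, 9.0), history)    # strong credit, long employment
print("risky applicant predicted to default:", risky)
print("safe applicant predicted to default:", safe)
```

With this invented data, the first applicant's nearest neighbors all defaulted and the second applicant's did not, which mirrors the lender's reasoning in the example above.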