Data mining is an interdisciplinary field comprising various technologies that enable the discovery of interesting patterns in huge collections of data. It emphasizes the understandability of the discovered patterns. Many researchers are attracted to data mining because it offers relevant practical problems together with elegant, efficient, generic and scalable solutions. The field encompasses fundamental concepts, recent trends and a variety of applications. Data mining technology automates the process of pattern discovery. Discovered patterns are considered useful if they are actionable, i.e. if they can be used for decision making that improves some utility. In business applications, these patterns support decisions that add value to the business.
For example, consider a supermarket chain in which the following patterns are discovered:
• Total sales at a particular location have dropped by 20% since 2017.
• The overall sales of groceries improved by 10% last month.
• 60% of customers who buy Milk also buy Sugar.
On further investigation, it is found that another supermarket opened at a nearby location in 2017. This additional information may suggest several strategies, such as launching an aggressive advertising campaign or reducing prices.
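The third observation in the list above, that 60% of customers who buy Milk also buy Sugar, is what association rule mining calls the confidence of the rule Milk -> Sugar: the number of baskets containing both items divided by the number of baskets containing Milk. The following is a minimal sketch of that calculation. The baskets are made-up illustration data, not figures from any real supermarket, and the function names `count_containing` and `confidence` are hypothetical.

```python
def count_containing(transactions, items):
    """Count the transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in transactions if items <= set(basket))


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Defined as count(antecedent and consequent) / count(antecedent).
    Returns 0.0 when the antecedent never occurs.
    """
    antecedent_count = count_containing(transactions, antecedent)
    if antecedent_count == 0:
        return 0.0
    both_count = count_containing(transactions, set(antecedent) | set(consequent))
    return both_count / antecedent_count


# Made-up baskets for illustration only (assumption, not real data).
baskets = [
    {"Milk", "Sugar", "Bread"},
    {"Milk", "Sugar"},
    {"Milk", "Sugar", "Tea"},
    {"Milk", "Bread"},
    {"Milk"},
    {"Bread", "Tea"},
]

rule_confidence = confidence(baskets, {"Milk"}, {"Sugar"})
print(f"{rule_confidence:.0%}")  # prints 60%
```

Here Milk appears in five baskets and Milk together with Sugar in three, so the confidence is 3/5. Counting is done with plain set operations to keep the sketch dependency-free; a real system would typically use a dedicated frequent-pattern algorithm on much larger data.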
Data mining is also called Knowledge Discovery in Databases (KDD). KDD is the automatic extraction of novel, understandable and useful patterns from large amounts of data. It is a multidisciplinary field involving AI, statistics, database technologies, high-performance computing, information retrieval and data visualization.
Data Mining Process:
The data mining process involves the following steps:
1. Problem definition: In this step, the scope and objectives of the problem are determined.
2. Data collection: This step involves the collection and exploration of data.
3. Data preprocessing: In this step, the collected data is cleaned and organized for further processing.
4. Data modeling: In this step, a model that helps solve the given problem is created using data mining techniques.
5. Interpretation and result evaluation: In this step, conclusions are drawn and validated, and the results are translated into business decisions.
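The five steps above can be sketched as a tiny pipeline. This is only an illustrative outline under stated assumptions: the problem (flagging stores whose sales fell by at least a chosen threshold), the records, the threshold and the function names `preprocess`, `build_model` and `evaluate` are all hypothetical, and the "model" is a simple relative-change calculation standing in for a real data mining technique.

```python
# Step 1: problem definition (hypothetical): flag stores whose sales fell
# by at least DROP_THRESHOLD between two periods.
DROP_THRESHOLD = 0.20

# Step 2: data collection. Made-up records: (store, sales_before, sales_after).
raw_records = [
    ("Store A", 1000.0, 750.0),
    ("Store B", 500.0, 550.0),
    ("Store C", None, 300.0),   # missing value, removed during preprocessing
    ("Store D", 400.0, 390.0),
]


def preprocess(records):
    """Step 3: keep only records with both values present and a positive baseline."""
    return [
        (name, before, after)
        for name, before, after in records
        if before is not None and after is not None and before > 0
    ]


def build_model(records):
    """Step 4: a stand-in 'model' mapping each store to its relative sales change."""
    return {name: (after - before) / before for name, before, after in records}


def evaluate(changes, threshold):
    """Step 5: interpret the result as a sorted list of stores needing attention."""
    return sorted(name for name, change in changes.items() if change <= -threshold)


clean_records = preprocess(raw_records)
changes = build_model(clean_records)
flagged_stores = evaluate(changes, DROP_THRESHOLD)
print(flagged_stores)  # prints ['Store A']
```

Keeping each step in its own function mirrors the process description: the preprocessing rule or the model can be replaced without touching the other steps.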
Data Mining Techniques:
The following data mining techniques are applied to discover interesting patterns in large amounts of data:
1. Frequent Pattern Mining
2. Association Rule Mining
3. Clustering
4. Classification
5. Regression Analysis
6. Outlier Analysis
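To make one of the techniques above concrete, here is a minimal sketch of outlier analysis using z-scores: a value is flagged when it lies more than a chosen number of standard deviations from the mean. The z-score rule is only one simple approach among many, and the data, the threshold of 2.0 and the function name `find_outliers` are assumptions made for illustration.

```python
import statistics


def find_outliers(values, z_threshold=2.0):
    """Return the values whose absolute z-score exceeds `z_threshold`.

    Uses the population standard deviation. If all values are identical,
    the deviation is zero and nothing is flagged.
    """
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)
    if std_dev == 0:
        return []
    return [v for v in values if abs(v - mean) / std_dev > z_threshold]


# Made-up daily transaction amounts (assumption, not real data).
amounts = [10, 12, 11, 13, 12, 11, 95]
print(find_outliers(amounts))  # prints [95]
```

A single extreme value inflates both the mean and the standard deviation, so on small samples a lower threshold or a median-based rule may be more appropriate.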
Applications:
Data mining has several applications in business. Some of them are listed below:
1. Market Basket Analysis
2. Future healthcare
3. Education
4. Manufacturing Engineering
5. Fraud detection
6. Intrusion detection
7. Customer segmentation
8. Bioinformatics
The above list is not exhaustive, as there are many more potential areas where data mining can be applied effectively to discover interesting patterns. The results of pattern discovery provide insights that help business managers and leaders make effective decisions.