What is Data Mining?
In practice, the term data mining is used in several different ways. It arose from the need to search large amounts of data automatically for new knowledge. Often it refers only to the techniques used in data analysis, and it is therefore also applied to the software that implements those techniques.
It is more accurate, however, to regard data mining as an entire process:
Data mining is the process of automatically finding previously unknown, statistically sound, interesting and interpretable relationships in large amounts of data and using them to support important business decisions.
Data Mining Process
The data mining process consists of a recurring cycle of the following steps:
- Data preparation covers inspecting, aggregating, and preparing the data for analysis. Careful preparation is an absolute prerequisite for obtaining meaningful results in data analysis.
- Data analysis applies techniques from statistics and machine learning, among others, to recognize and extract patterns and relationships in the data.
- Once the analysis is complete, the results are interpreted in close cooperation with the company's domain experts. The evaluation includes, among other things, identifying the findings that can be used to meet the company's objectives.
- From the results of the interpretation, measures are derived that put the data mining results into practice. These measures in turn directly influence the data considered earlier and are therefore the source of feedback in the cycle. The extended data form the basis for renewed data preparation, and the data mining cycle begins again.
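
The four steps above can be sketched as a small loop in Python. This is only an illustration under stated assumptions, not a real data mining system: the records, the segment names, the threshold of 40.0, and all function names are hypothetical, and a simple average per group stands in for genuine statistical or machine learning techniques.

```python
from statistics import mean

# Hypothetical raw records: (customer segment, order value); None marks a missing value.
RAW = [
    ("new", 20.0),
    ("returning", 55.0),
    ("new", None),
    ("returning", 65.0),
    ("new", 30.0),
]


def prepare(records):
    """Data preparation: drop incomplete rows and group the values by segment."""
    grouped = {}
    for segment, value in records:
        if value is None:
            continue  # skip rows with a missing order value
        grouped.setdefault(segment, []).append(value)
    return grouped


def analyze(grouped):
    """Data analysis: a deliberately simple pattern, the mean order value per segment."""
    return {segment: mean(values) for segment, values in grouped.items()}


def interpret(pattern, threshold=40.0):
    """Interpretation: keep the segments whose mean falls below an assumed business threshold."""
    return [segment for segment, average in pattern.items() if average < threshold]


def derive_measures(findings):
    """Implementation: turn each finding into a concrete (hypothetical) measure."""
    return [f"send discount to '{segment}' customers" for segment in findings]


def mining_cycle(records, new_records):
    """Run one pass of the cycle and return the measures plus the extended data."""
    measures = derive_measures(interpret(analyze(prepare(records))))
    # Feedback: the measures lead to new observations, which extend the data
    # that the next pass of the cycle starts from.
    return measures, records + new_records
```

Calling `mining_cycle(RAW, [("new", 45.0)])` performs one pass and returns the extended data set; passing that data set back into `mining_cycle` starts the next pass, which mirrors the feedback described in the last step.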