Data mining uses data analysis techniques to explore data and uncover patterns, relationships, trends, and anomalies that are not immediately obvious; in short, it is the process of discovering hidden insights in data. This 3-hour course introduces data mining by explaining what it is, how it relates to statistics and machine learning, and where supervised and unsupervised methods fit. It walks through the overall data mining process, with emphasis on exploratory data analysis, profiling, descriptive statistics, visualization, and identifying meaningful patterns and variables in data. It also provides an accessible overview of major model types such as classification, clustering, and association analysis, along with the basic ideas behind validation and model selection. The course concludes with a brief look at deployment so that learners understand how data mining moves from analysis into practical business use.
You will learn:
- The definitions of data mining and data science
- The role of statistics in data mining
- Machine learning concepts
- How to differentiate between supervised and unsupervised learning
- The data mining process
- How to conduct exploratory data analysis
- How to identify data mining models and algorithms
- How to match the problem with the model
- Model validation techniques
- How to deploy data mining models
This course is geared towards:
- Analysts looking to gain foundational data mining knowledge
- Analysts looking to understand data mining models
- Analysts looking to apply the right data mining models to the right problem
Attendees should have a basic understanding of undergraduate statistics, data types, databases, and data management concepts.
BA-06 Data Mining Concepts and Techniques
Module 0. About the Course (3 min)
Module 1. Introduction to Data Mining (25 min)
- Overview
- Module Overview
- What is Data Mining
- Statistics in Data Mining
- Machine Learning
- Supervised Learning
- Unsupervised Learning
- Summary
Module 2. The Data Mining Process (24 min)
- Module Overview
- Data Mining Framework
- Data Mining Approaches
- Data Mining Techniques
- Data Mining Process
- Summary
Module 3. Exploratory Data Analysis (29 min)
- Overview
- Exploratory Data Analysis
- Data Profiling: Uncovering Structure
- Data Profiling: Types of Profiling
- Descriptive Statistics
- Results of Data Profiling and Descriptive Statistics
- Data Relationships
- Findings – Important Variables
- Visualization Techniques
- Outcomes and Interpretations
- Sampling Size
- Sample Quality
- Big Data Considerations
- Feature Selection
- EDA Checklist
- Summary
Module 4. Data Mining Models and Algorithms (71 min)
- Overview
- Build the Model
- Anatomy of a Model
- What is a Classification Problem
- Classification
- Ensemble Methods
- Clustering
- Clustering Uses
- Association – Market Basket
- Association Uses
- Application of Data Mining Models
- Model Selection
- Summary
Module 5. Model Validation Techniques (18 min)
- Overview
- Module Overview
- The Validation Process
- Fitting a Model
- Bias/Variance Tradeoff
- Regression – Mean Squared Error
- Linear Regression – Confidence and Prediction Intervals
- Logistic Regression – Significance Test
- Classification Accuracy
- Classification Accuracy – Other Measures
- Prediction Error Methods
- Hold-Out Cross Validation
- K-Fold Cross Validation Method
- Summary
Module 6. Deploying Data Mining Tools (9 min)
- Overview
- Deploying Data Mining Models
- Course Summary Parts 1 & 2
- References
Click here to download a more detailed outline of this course.
This exam tests knowledge and understanding of basic concepts, principles, techniques, and terminology of data mining.
You will be tested in these areas:
- Data mining definition and concepts
- The roles of statistics and machine learning in data mining
- Descriptive modeling, decision modeling, and predictive modeling
- Data mining techniques – classification, association, sequencing, and predicting
- The process of data mining
- Exploratory data mining and the use of data visualization
- Data mining models and algorithms
- Model validation and deployment
Additional Information
Number of Questions: 20
Time Limit: 40 Minutes
Passing Score: 70%
Once you pass the exam, you will receive a Certificate of Education documenting that you have demonstrated mastery of the topic. Course exams count towards eLC certification programs. Visit our Certification page for more information about our various programs.
We recommend that you take detailed notes and review the course material multiple times before taking this exam.
Click here to learn more about CIMP exams.