## Overview
The ML (Machine Learning) module provides classical machine learning algorithms for:

- Classification
- Regression
- Clustering
- Statistical modeling
This module implements traditional ML algorithms. For deep learning, see the DNN Module.
## Key Concepts
### StatModel Base Class
All ML algorithms inherit from `StatModel`, which provides a uniform interface for training, prediction, and model serialization:
### TrainData Class
Encapsulates training data: the samples matrix, the responses, and the layout (one sample per row or per column).

## Classification Algorithms
### Support Vector Machines (SVM)
SVM formulation types:

- `C_SVC`: C-Support Vector Classification
- `NU_SVC`: Nu-Support Vector Classification
- `ONE_CLASS`: one-class SVM
- `EPS_SVR`: epsilon-Support Vector Regression
- `NU_SVR`: Nu-Support Vector Regression
Kernel types:

- `LINEAR`: linear kernel
- `POLY`: polynomial kernel
- `RBF`: Radial Basis Function (Gaussian)
- `SIGMOID`: sigmoid kernel
### K-Nearest Neighbors (KNN)
### Decision Trees
### Random Forest
### Naive Bayes
### Logistic Regression
## Neural Networks
### ANN_MLP (Multi-Layer Perceptron)
## Clustering
### K-Means
### EM (Expectation Maximization)
## Complete Example: SVM Classification
## Model Persistence
### Save Model
### Load Model
## Cross-Validation
## Algorithm Selection Guide
| Algorithm | Type | Pros | Cons | Best For |
|---|---|---|---|---|
| SVM | Classification/Regression | Effective in high dimensions | Slow on large datasets | Small to medium data |
| KNN | Classification/Regression | Simple, no training | Slow prediction, memory intensive | Small datasets |
| Random Forest | Classification/Regression | Robust, handles non-linear | Can overfit | General purpose |
| Naive Bayes | Classification | Fast, simple | Assumes independence | Text classification |
| ANN_MLP | Classification/Regression | Powerful | Needs tuning | Complex patterns |
| K-Means | Clustering | Fast, simple | Needs K specified | Data segmentation |
## Best Practices
**Normalize Features**: scale features to similar ranges (e.g. zero mean, unit variance) so that distance-based algorithms such as SVM, KNN, and K-Means are not dominated by large-valued features.
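Standardization can be done directly with NumPy before handing data to `cv::ml`; a minimal sketch:

```python
import numpy as np

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 100.0]], dtype=np.float32)

# Standardize each feature to zero mean and unit variance
mean, std = X.mean(axis=0), X.std(axis=0)
X_norm = (X - mean) / std
```

The same `mean` and `std` must be reused to transform any test or query samples.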
**Cross-Validate**: use a train/test split (or k-fold cross-validation) to estimate generalization error and detect overfitting.
**Tune Parameters**: use grid search to find good hyperparameter values rather than relying on defaults.
**Save Models**: persist trained models to disk so they can be reused without retraining.
## See Also
- DNN Module - Deep learning
- Core Module - Matrix operations
- ML Tutorial
