Scaling transforms numerical features to a similar range, preventing features with larger values from dominating the model. Different scaling methods are appropriate for different algorithms and data distributions.
Our form provides four scaling options for your numerical features:
Min-Max Scaling: Transforms features to a specific range (typically 0 to 1). Preserves the shape of the distribution while ensuring all features share the same scale.
Standard Scaling: Transforms features to have zero mean and unit variance. Best for algorithms that assume roughly normally distributed data, such as linear regression and neural networks.
Normalizer: Scales each individual sample to unit norm (typically L1, L2, or max norm). Useful when the scale of each sample matters more than the feature distributions.
None: Skips scaling entirely. Choose this when your features are already on a similar scale or when using algorithms that are invariant to feature scaling, such as tree-based methods.
To choose a scaling method in the form, simply click on one of the scaling option cards. The selected option will be highlighted with the primary color.
Scaling Method | Best For | Compatible Algorithms |
---|---|---|
Min-Max Scaling | Bounded features; preserving the shape of the original distribution | Neural networks, K-nearest neighbors, algorithms using distance metrics |
Standard Scaling | Roughly normally distributed features | Linear/logistic regression, SVMs, PCA, neural networks |
Normalizer | Data where the scale of each sample matters more than the feature distributions | Text classification, clustering, recommendation systems |
None | Features already on a similar scale | Decision trees, Random Forest, Gradient Boosting |
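To make these pairings concrete, here is a minimal sketch assuming a scikit-learn style workflow (an assumption; the platform's internals may differ):

```python
# Pair each model family with a scaler from the table above.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Distance-based model: Min-Max Scaling keeps every feature in [0, 1].
knn = make_pipeline(MinMaxScaler(), KNeighborsClassifier())

# Linear model: Standard Scaling centers each feature and gives it unit variance.
logreg = make_pipeline(StandardScaler(), LogisticRegression())

# Tree ensemble: no scaler needed; splits depend only on the ordering of values.
forest = RandomForestClassifier()
```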
Min-Max Scaling transforms features to the range [0, 1] using the formula:
X_scaled = (X - X_min) / (X_max - X_min)
Original | Min-Max Scaled |
---|---|
10 | 0.0 |
30 | 0.5 |
50 | 1.0 |
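The sketch below reproduces the table above, assuming scikit-learn's MinMaxScaler (which implements this formula) is available:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10.0], [30.0], [50.0]])     # one feature, three samples
scaled = MinMaxScaler().fit_transform(X)   # (X - X_min) / (X_max - X_min)
print(scaled.ravel())                      # [0.  0.5 1. ]
```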
Standard Scaling transforms features to have zero mean and unit variance using the formula:
X_scaled = (X - X_mean) / X_std
Original | Standard Scaled |
---|---|
10 | -1.22 |
30 | 0.0 |
50 | 1.22 |
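A similar sketch for Standard Scaling, again assuming scikit-learn; note that StandardScaler divides by the population standard deviation, which is what yields the ±1.22 values above:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[10.0], [30.0], [50.0]])
scaled = StandardScaler().fit_transform(X)   # (X - X_mean) / X_std
print(scaled.ravel().round(2))               # [-1.22  0.    1.22]
```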
The Normalizer scales each sample (row) to unit norm. With L2 normalization, the formula is:
X_normalized = X / ||X||
(where ||X|| is the L2 norm of the sample)
This transforms each sample so that its Euclidean distance from the origin equals 1. Unlike the other methods, normalization operates on rows (samples) rather than columns (features).
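The sketch below, assuming scikit-learn's Normalizer and an illustrative two-feature matrix, shows this row-wise behavior:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],     # L2 norm = 5
              [1.0, 0.0]])    # L2 norm = 1
normalized = Normalizer(norm="l2").fit_transform(X)   # each row divided by its own norm
print(normalized)                                     # [[0.6 0.8]
                                                      #  [1.  0. ]]
print(np.linalg.norm(normalized, axis=1))             # every row now has norm 1.0
```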
Outlier Sensitivity: Both Min-Max and Standard Scaling are affected by extreme values, and Min-Max Scaling is especially sensitive because a single outlier compresses all other values into a narrow portion of the [0, 1] range. If your data contains extreme values, consider robust scaling techniques (for example, centering on the median and scaling by the interquartile range).
Tree-Based Models: Decision trees, Random Forests, and other tree-based algorithms are invariant to feature scaling because their splits depend only on the ordering of values, not on their magnitudes (see the sketch after these notes).
Data Leakage: Always fit your scaler on the training data only, then apply the same transformation to the test data. This prevents information from the test set from leaking into the scaling parameters; a sketch at the end of this section illustrates this.
Interpretability: Scaling affects model coefficient magnitudes, which can impact the interpretation of feature importance in linear models. Consider this when analyzing your models.
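To illustrate that invariance, here is a minimal sketch (assuming scikit-learn and its bundled iris dataset) showing that a decision tree fitted on Min-Max-scaled features makes the same predictions as one fitted on the raw features:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_scaled = MinMaxScaler().fit_transform(X)

tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_scaled = DecisionTreeClassifier(random_state=0).fit(X_scaled, y)

# The rescaling is monotonic, so the learned splits are equivalent and the
# predictions agree.
print(np.array_equal(tree_raw.predict(X), tree_scaled.predict(X_scaled)))  # True
```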
When applying your model to new data, you must use the same scaling parameters (mean, standard deviation, min, max) from the training data. Our platform handles this automatically when you deploy your models.
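As a sketch of leakage-free scaling and of reusing the fitted parameters at prediction time, assuming scikit-learn and a joblib save step (the dataset and file name are illustrative):

```python
import joblib
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_train)      # learn mean/std from the training data only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)    # reuse the same parameters; never refit on test data

joblib.dump(scaler, "scaler.joblib")        # persist the fitted scaler for scoring new data
```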