
Train Your Regression and Classification Models

Welcome to the ML Clever model training portal—your one-stop solution to build robust regression and classification models without writing any code. Whether you prefer fine-tuning parameters manually or leveraging our intelligent AutoML engine, this guide will help you turn your preprocessed data into high-performance models.

Classification Models

Powerful predictive modeling algorithms for your no-code machine learning applications. Select from industry-standard classification techniques with sensible default configurations.

Machine Learning Made Simple

Our no-code platform leverages the power of scikit-learn, TensorFlow, and PyTorch to provide state-of-the-art predictive modeling capabilities with minimal setup. Each model is carefully documented with implementation guidance, parameter configurations, and performance characteristics to help you choose the right tool for your analytical needs without writing a single line of code.

Available Classification Models

When to use Random Forest Classifier: ideal for handling mixed data types, capturing non-linear patterns, and providing robust performance with minimal tuning.
Limitations of Random Forest Classifier: can be computationally intensive and lacks the interpretability of simpler models.

Statistical Foundation

Random Forest Classifier is an ensemble method: it trains many decision trees on bootstrapped samples with random feature selection at each split and combines them by majority vote, which reduces variance relative to any single tree.

Computational Efficiency

Training cost grows roughly linearly with the number of trees and close to n log n in the number of samples per tree; trees are independent, so fitting and prediction parallelize well across CPU cores.

Data Requirements

Handles continuous and categorical features (after encoding) without requiring normalization, and is robust to outliers and to moderately correlated features.
Configure these parameters to optimize the Random Forest Classifier for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
n_estimators | Integer | 100 | Number of trees in the forest.
max_depth | Integer | None | Maximum depth of the tree.
min_samples_split | Integer | 2 | Minimum number of samples required to split an internal node.
min_samples_leaf | Integer | 1 | Minimum number of samples required to be at a leaf node.
max_features | String | sqrt | Number of features to consider when looking for the best split.
bootstrap | Boolean | True | Whether bootstrap samples are used when building trees.
oob_score | Boolean | False | Whether to use out-of-bag samples to estimate generalization accuracy.
class_weight | String or Dict | None | Weights associated with classes in the form {class_label: weight}.
criterion | String | gini | Function to measure the quality of a split.
random_state | Integer | None | Controls the randomness of the estimator.
ccp_alpha | Float | 0.0 | Complexity parameter used for Minimal Cost-Complexity Pruning.
max_leaf_nodes | Integer | None | Grow trees with max_leaf_nodes in best-first fashion.
max_samples | Float or Integer | None | Number of samples to draw from X to train each base estimator.
min_impurity_decrease | Float | 0.0 | A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
min_weight_fraction_leaf | Float | 0.0 | Minimum weighted fraction of the sum total of weights required to be at a leaf node.
monotonic_cst | Array of Integers | None | Constraint to enforce monotonicity in the predictions with respect to certain features.
n_jobs | Integer | None | Number of jobs to run in parallel.
verbose | Integer | 0 | Controls the verbosity when fitting and predicting.
warm_start | Boolean | False | When True, reuse the solution of the previous call to fit and add more estimators to the ensemble.
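For readers who want to see how these settings translate into code, here is a minimal sketch assuming the scikit-learn backend mentioned above (RandomForestClassifier); the dataset and the specific values are illustrative only, and on the platform the same parameters are set through the interface rather than in code:

```python
# Illustrative sketch only: assumes scikit-learn's RandomForestClassifier
# is the underlying implementation; values are examples, not recommendations.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

model = RandomForestClassifier(
    n_estimators=200,         # more trees improve accuracy with diminishing returns
    max_depth=10,             # cap tree depth to limit overfitting
    min_samples_leaf=2,       # require at least 2 samples per leaf
    max_features="sqrt",      # consider sqrt(n_features) at each split
    class_weight="balanced",  # reweight classes for imbalanced targets
    random_state=42,          # fixed seed for reproducibility
    n_jobs=-1,                # use all CPU cores
)
model.fit(X, y)
print("Training accuracy:", model.score(X, y))
```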

Detailed Parameter Reference

Number of trees in the forest.
Technical Implementation Guidance
100-300 for most datasets, higher for more complex problems
Effect on Model Performance
More trees generally lead to better accuracy but with diminishing returns and increased computation time
Valid Numerical Range
10-500
Maximum depth of the tree.
Technical Implementation Guidance
None for unrestricted growth, or 5-15 to control model complexity
Effect on Model Performance
Controls how deep trees can grow; deeper trees capture more specific patterns but risk overfitting
Valid Numerical Range
1-50
Minimum number of samples required to split an internal node.
Technical Implementation Guidance
2 for maximum model growth, 5-10 to control overfitting
Effect on Model Performance
Higher values prevent creating nodes that represent very few samples, reducing overfitting
Valid Numerical Range
2-20
Minimum number of samples required to be at a leaf node.
Technical Implementation Guidance
1-5 for most cases, higher values (10+) for noisier datasets
Effect on Model Performance
Controls the minimum size of terminal nodes; higher values create more generalized models
Valid Numerical Range
1-50
Number of features to consider when looking for the best split.
Technical Implementation Guidance
sqrt for classification problems, log2 for very high-dimensional data
Effect on Model Performance
Introduces randomness by limiting features considered at each split; helps prevent trees from becoming too correlated
Valid Numerical Range
auto, sqrt, log2, or specific integer/float
Whether bootstrap samples are used when building trees.
Technical Implementation Guidance
True for standard random forest implementation
Effect on Model Performance
When True, trees are built with bootstrapped samples; when False, the whole dataset is used for each tree
Valid Numerical Range
True/False
Whether to use out-of-bag samples to estimate generalization accuracy.
Technical Implementation Guidance
True when you want built-in cross-validation metrics (see the sketch at the end of this parameter reference)
Effect on Model Performance
Provides an unbiased estimate of model performance without requiring a separate validation set
Valid Numerical Range
True/False
Weights associated with classes in the form {class_label: weight}.
Technical Implementation Guidance
balanced for imbalanced datasets to adjust weights inversely proportional to class frequencies
Effect on Model Performance
Helps model perform better on imbalanced datasets by penalizing misclassifications of minority class more heavily
Valid Numerical Range
None, balanced, balanced_subsample, or custom dictionary
Function to measure the quality of a split.
Technical Implementation Guidance
gini for faster computation, entropy for slightly different split behavior
Effect on Model Performance
Gini impurity tends to isolate the most frequent class in its own leaf, while entropy might create more balanced trees
Valid Numerical Range
gini, entropy
Controls the randomness of the estimator.
Technical Implementation Guidance
Any fixed value for reproducibility
Effect on Model Performance
Setting a constant seed ensures reproducible results across model runs
Valid Numerical Range
Any integer value
Complexity parameter used for Minimal Cost-Complexity Pruning.
Technical Implementation Guidance
0.0 for no pruning, or small positive values for effective pruning
Effect on Model Performance
Helps control the complexity of the model by penalizing overly complex trees
Valid Numerical Range
>=0
Grow trees with max_leaf_nodes in best-first fashion.
Technical Implementation Guidance
None for unlimited nodes, or a positive integer to limit growth
Effect on Model Performance
Restricts the maximum number of terminal nodes, which can help in reducing overfitting
Valid Numerical Range
None or positive integer
Number of samples to draw from X to train each base estimator.
Technical Implementation Guidance
None for using the full dataset, or a fraction/number for sub-sampling
Effect on Model Performance
Can speed up training and introduce additional randomness when less than the full dataset is used
Valid Numerical Range
None or positive number
A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
Technical Implementation Guidance
0.0 for no threshold, or a small positive value to avoid insignificant splits
Effect on Model Performance
Prevents splits that do not significantly reduce impurity, helping to control overfitting
Valid Numerical Range
>=0
Minimum weighted fraction of the sum total of weights required to be at a leaf node.
Technical Implementation Guidance
0.0 for standard cases, higher for datasets with weighted samples
Effect on Model Performance
Ensures that leaves represent a minimum proportion of the total sample weight, useful in imbalanced datasets
Valid Numerical Range
0.0-0.5
Constraint to enforce monotonicity in the predictions with respect to certain features.
Technical Implementation Guidance
None unless specific monotonic constraints are needed
Effect on Model Performance
Enforces monotonic relationships, which can be important for regulatory or business requirements
Valid Numerical Range
None or list of integers
Number of jobs to run in parallel.
Technical Implementation Guidance
Use -1 to utilize all processors
Effect on Model Performance
Speeds up training by parallelizing computations
Valid Numerical Range
Any integer value or None
Controls the verbosity when fitting and predicting.
Technical Implementation Guidance
0 for silent mode, higher values for more detailed logs
Effect on Model Performance
Helps monitor training progress and debug issues
Valid Numerical Range
Non-negative integers
When True, reuse the solution of the previous call to fit and add more estimators to the ensemble.
Technical Implementation Guidance
False unless you plan to incrementally add trees
Effect on Model Performance
Allows for continued training of an existing model without starting over
Valid Numerical Range
True/False
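As a closing example for this model, here is a hedged sketch of the oob_score option described above, again assuming a scikit-learn backend; the out-of-bag estimate gives a rough generalization score without a separate validation split:

```python
# Illustrative sketch: out-of-bag (OOB) evaluation with RandomForestClassifier.
# Assumes scikit-learn; oob_score=True requires bootstrap=True (the default).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

forest = RandomForestClassifier(
    n_estimators=300,
    oob_score=True,   # score each sample with trees that did not see it
    random_state=0,
    n_jobs=-1,
)
forest.fit(X, y)
print(f"OOB accuracy estimate: {forest.oob_score_:.3f}")
```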
When to use Logistic Regression: ideal for binary classification problems with a linear decision boundary and when interpretability is a priority.
Limitations of Logistic Regression: may underperform on complex datasets that require modeling non-linear relationships.

Statistical Foundation

Logistic Regression models the log-odds of class membership as a linear function of the input features and fits its coefficients by maximizing the (optionally regularized) likelihood.

Computational Efficiency

Training complexity scales with input dimensionality and sample size, with efficient matrix operations for production deployment.

Data Requirements

Requires normalized, non-collinear input features for optimal performance. Handles continuous and categorical variables with appropriate preprocessing.
Configure these parameters to optimize the Logistic Regression for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
penalty | String | l2 | Used to specify the norm used in the penalization.
C | Float | 1.0 | Inverse of regularization strength; must be a positive float.
class_weight | String or Dict | None | Weights associated with classes in the form {class_label: weight}.
dual | Boolean | False | Dual or primal formulation. Dual formulation is only implemented for l2 penalty.
fit_intercept | Boolean | True | Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
intercept_scaling | Float | 1.0 | Useful only when the solver 'liblinear' is used and fit_intercept is set to True.
l1_ratio | Float or None | None | The Elastic-Net mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty is 'elasticnet'.
max_iter | Integer | 1000 | Maximum number of iterations taken for the solvers to converge.
multi_class | String | auto | If 'ovr', a binary problem is fit for each label; otherwise, a multinomial loss is minimized.
n_jobs | Integer | None | Number of CPU cores used when parallelizing over classes.
random_state | Integer | None | Controls the randomness of the estimator.
solver | String | liblinear | Algorithm to use in the optimization problem.
tol | Float | 0.0001 | Tolerance for stopping criteria.
verbose | Integer | 0 | For the liblinear and lbfgs solvers, set verbose to any positive number for increased logging.
warm_start | Boolean | False | When set to True, reuse the solution of the previous call to fit as initialization; otherwise, start fresh.
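As with the other models, a minimal sketch follows, assuming a scikit-learn backend (LogisticRegression); the scaling step and the chosen values are illustrative:

```python
# Illustrative sketch: LogisticRegression behind a scaling step.
# Assumes scikit-learn; values are examples, not tuned recommendations.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

clf = make_pipeline(
    StandardScaler(),              # solvers converge faster on scaled features
    LogisticRegression(
        penalty="l2",
        C=1.0,                     # lower C = stronger regularization
        solver="liblinear",        # matches the default listed above
        max_iter=1000,
        class_weight="balanced",   # helpful for imbalanced classes
        random_state=0,
    ),
)
clf.fit(X, y)
print("Training accuracy:", clf.score(X, y))
```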

Detailed Parameter Reference

Used to specify the norm used in the penalization.
Technical Implementation Guidance
l2 is common; use l1 for sparsity, elasticnet for a balance, or none to disable regularization
Effect on Model Performance
Regularization controls overfitting; the choice of penalty affects sparsity and model performance
Valid Numerical Range
Options: l1, l2, elasticnet, none
Inverse of regularization strength; must be a positive float.
Technical Implementation Guidance
Typically between 0.1 and 10.0, depending on dataset complexity
Effect on Model Performance
Lower values imply stronger regularization, helping to prevent overfitting
Valid Numerical Range
0.1-100.0
Weights associated with classes in the form {class_label: weight}.
Technical Implementation Guidance
Use 'balanced' for imbalanced datasets to adjust class weights inversely proportional to frequencies
Effect on Model Performance
Helps the model adjust its focus towards minority classes, improving performance on imbalanced data
Valid Numerical Range
None, balanced, or custom dictionary
Dual or primal formulation. Dual formulation is only implemented for l2 penalty.
Technical Implementation Guidance
False for most cases
Effect on Model Performance
Dual formulation can be beneficial for high-dimensional data, but is generally less efficient
Valid Numerical Range
True/False
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
Technical Implementation Guidance
True
Effect on Model Performance
Ensures the decision boundary is not forced to pass through the origin
Valid Numerical Range
True/False
Useful only when the solver 'liblinear' is used and fit_intercept is set to True.
Technical Implementation Guidance
1.0
Effect on Model Performance
Scales the intercept term when using the liblinear solver
Valid Numerical Range
Any positive float
The Elastic-Net mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty is 'elasticnet'.
Technical Implementation Guidance
Typically between 0.1 and 0.5 when using elasticnet
Effect on Model Performance
Balances the L1 and L2 regularization in elasticnet, affecting model sparsity and performance
Valid Numerical Range
0-1 or None
Maximum number of iterations taken for the solvers to converge.
Technical Implementation Guidance
1000 is standard; increase if convergence warnings occur
Effect on Model Performance
Ensures that the solver has enough iterations to reach convergence
Valid Numerical Range
Positive integers
If 'ovr', a binary problem is fit for each label; otherwise, a multinomial loss is minimized.
Technical Implementation Guidance
auto for automatic handling, 'ovr' for one-vs-rest, or 'multinomial' for multi-class optimization
Effect on Model Performance
Determines the strategy used for multi-class classification, affecting model performance and computational cost
Valid Numerical Range
Options: auto, ovr, multinomial
Number of CPU cores used when parallelizing over classes.
Technical Implementation Guidance
Set to -1 to use all available cores
Effect on Model Performance
Speeds up computation by leveraging parallel processing
Valid Numerical Range
Any integer or None
Controls the randomness of the estimator.
Technical Implementation Guidance
Any fixed integer value to ensure reproducibility
Effect on Model Performance
Ensures consistent results across multiple runs
Valid Numerical Range
Any integer value
Algorithm to use in the optimization problem.
Technical Implementation Guidance
liblinear is common for small datasets; lbfgs or saga for larger ones
Effect on Model Performance
Different solvers offer varying trade-offs between speed and accuracy
Valid Numerical Range
Options: newton-cg, lbfgs, liblinear, sag, saga
Tolerance for stopping criteria.
Technical Implementation Guidance
0.0001 is standard, but may be adjusted for convergence issues
Effect on Model Performance
Determines the threshold at which the optimization stops, affecting convergence
Valid Numerical Range
Any positive float
For the liblinear and lbfgs solvers, set verbose to any positive number for increased logging.
Technical Implementation Guidance
0 for no verbosity, increase for debugging purposes
Effect on Model Performance
Controls the amount of logging output during model training
Valid Numerical Range
Non-negative integers
When set to True, reuse the solution of the previous call to fit as initialization, otherwise, start fresh.
Technical Implementation Guidance
False unless incremental training is required
Effect on Model Performance
Can save computation if multiple fits are performed sequentially
Valid Numerical Range
True/False
When to use KNN: ideal for instance-based learning where similar instances drive the prediction, especially on small-to-medium datasets.
Limitations of KNN: can be computationally expensive with large datasets and is sensitive to feature scaling and noisy data.

Statistical Foundation

KNN is a non-parametric, instance-based method: it stores the training data and classifies a new point by an (optionally distance-weighted) majority vote among its k nearest neighbors.

Computational Efficiency

Training amounts to storing the data; the cost is paid at prediction time, where the neighbor search grows with the number of stored samples and the dimensionality (BallTree or KDTree structures speed this up for lower-dimensional data).

Data Requirements

Requires scaled (standardized or normalized) features, since predictions are driven by distances; categorical variables must be encoded numerically.
Configure these parameters to optimize the KNN for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
n_neighbors | Integer | 5 | Number of neighbors to use.
weights | String | uniform | Weight function used in prediction.
algorithm | String | auto | Algorithm used to compute the nearest neighbors.
leaf_size | Integer | 30 | Leaf size passed to BallTree or KDTree; affects the speed of construction and query, as well as memory usage.
metric | String | minkowski | The distance metric to use for the tree.
metric_params | Dict | None | Additional keyword arguments for the metric function.
n_jobs | Integer | None | Number of parallel jobs to run for neighbors search.
p | Integer | 2 | Power parameter for the Minkowski metric.
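A minimal sketch, assuming scikit-learn's KNeighborsClassifier; the scaling step matters because KNN is distance-based, and the parameter values are illustrative:

```python
# Illustrative sketch: KNeighborsClassifier with standardized features.
# Assumes scikit-learn; values are examples only.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

knn = make_pipeline(
    StandardScaler(),             # distances are meaningless on unscaled features
    KNeighborsClassifier(
        n_neighbors=5,
        weights="distance",       # closer neighbors get more influence
        metric="minkowski",
        p=2,                      # p=2 corresponds to Euclidean distance
        n_jobs=-1,
    ),
)
knn.fit(X, y)
print("Training accuracy:", knn.score(X, y))
```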

Detailed Parameter Reference

Number of neighbors to use.
Technical Implementation Guidance
Typically between 3 and 10 depending on the dataset's density
Effect on Model Performance
Determines how many nearest data points influence the prediction; too few may be noisy, too many may smooth out class distinctions
Valid Numerical Range
1-20
Weight function used in prediction.
Technical Implementation Guidance
Use 'uniform' for equal weighting or 'distance' to give closer neighbors more influence
Effect on Model Performance
Affects how neighbor distances contribute to the decision, influencing model sensitivity and performance
Valid Numerical Range
Options: uniform, distance
Algorithm used to compute the nearest neighbors.
Technical Implementation Guidance
Set to 'auto' to let the model choose the optimal algorithm based on data characteristics
Effect on Model Performance
Determines the efficiency and performance of the neighbor search, with some algorithms better for high-dimensional data
Valid Numerical Range
Options: auto, ball_tree, kd_tree, brute
Leaf size passed to BallTree or KDTree; affects the speed of construction and query, as well as memory usage.
Technical Implementation Guidance
30 is standard; adjust based on available memory and desired query speed
Effect on Model Performance
Impacts the trade-off between tree construction speed and query performance
Valid Numerical Range
Any positive integer
The distance metric to use for the tree.
Technical Implementation Guidance
Use 'minkowski' for general purposes; 'euclidean' or 'manhattan' if the specific distance type is required
Effect on Model Performance
Different metrics can capture various notions of similarity, affecting neighbor selection and prediction accuracy
Valid Numerical Range
Options: euclidean, manhattan, chebyshev, minkowski
Additional keyword arguments for the metric function.
Technical Implementation Guidance
None unless custom adjustments to the metric are needed
Effect on Model Performance
Allows fine-tuning of the distance metric's behavior to better suit specific datasets
Valid Numerical Range
Dictionary or None
Number of parallel jobs to run for neighbors search.
Technical Implementation Guidance
Set to -1 to utilize all available CPU cores for faster computation
Effect on Model Performance
Enables parallel processing to speed up the neighbor search, especially useful for larger datasets
Valid Numerical Range
Any integer or None
Power parameter for the Minkowski metric.
Technical Implementation Guidance
2 for Euclidean distance; 1 for Manhattan distance
Effect on Model Performance
Determines the type of distance calculation; different values change the metric's sensitivity to feature differences
Valid Numerical Range
Any positive integer
When to use SVM: ideal for datasets with clear margins of separation and for non-linear classification tasks.
Limitations of SVM: can be computationally expensive for large datasets and sensitive to the choice of kernel.

Statistical Foundation

SVM finds the maximum-margin decision boundary by minimizing a regularized hinge loss; kernel functions let it model non-linear boundaries without explicitly transforming the features.

Computational Efficiency

Training cost grows quickly with the number of samples (roughly quadratic to cubic for kernel SVMs), so large datasets can be slow to fit; prediction cost depends on the number of support vectors.

Data Requirements

Requires normalized, non-collinear input features for optimal performance. Handles continuous and categorical variables with appropriate preprocessing.
Configure these parameters to optimize the SVM for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
kernel | String | rbf | Specifies the kernel type to be used in the algorithm.
C | Float | 1.0 | Regularization parameter.
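A minimal sketch, assuming scikit-learn's SVC; feature scaling is included because SVMs are sensitive to feature magnitudes, and the values are illustrative:

```python
# Illustrative sketch: SVC with an RBF kernel on standardized features.
# Assumes scikit-learn; values are examples only.
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

svm = make_pipeline(
    StandardScaler(),              # SVMs assume comparably scaled features
    SVC(kernel="rbf", C=1.0),      # rbf for non-linear boundaries; larger C fits harder
)
svm.fit(X, y)
print("Training accuracy:", svm.score(X, y))
```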

Detailed Parameter Reference

Specifies the kernel type to be used in the algorithm.
Technical Implementation Guidance
rbf is common for non-linear data; use linear for linearly separable data or poly for polynomial relationships
Effect on Model Performance
The kernel choice determines how data is transformed, directly influencing the ability to capture complex, non-linear patterns
Valid Numerical Range
Options: linear, poly, rbf, sigmoid
Regularization parameter.
Technical Implementation Guidance
Typically 1.0; adjust based on the trade-off between margin width and classification error
Effect on Model Performance
Controls the trade-off between maximizing the margin and minimizing misclassification, with lower values providing stronger regularization
Valid Numerical Range
0.1-100.0
When to use Decision Tree: best for scenarios where interpretability is key and the dataset is manageable in size.
Limitations of Decision Tree: prone to overfitting without proper pruning and can be sensitive to small variations in the data.

Statistical Foundation

Decision Tree recursively partitions the feature space, choosing at each node the split that most reduces impurity (Gini or entropy); predictions are made by the majority class in each leaf.

Computational Efficiency

Training complexity scales with input dimensionality and sample size, with efficient matrix operations for production deployment.

Data Requirements

Handles continuous and categorical features (after encoding) without requiring normalization, and is insensitive to monotonic transformations of the inputs.
Configure these parameters to optimize the Decision Tree for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
criterion | String | gini | Function to measure the quality of a split.
splitter | String | best | Strategy used to choose the split at each node.
max_depth | Integer | None | The maximum depth of the tree.
min_samples_split | Integer | 2 | The minimum number of samples required to split an internal node.
min_samples_leaf | Integer | 1 | The minimum number of samples required to be at a leaf node.
min_weight_fraction_leaf | Float | 0.0 | The minimum weighted fraction of the sum total of weights required to be at a leaf node.
max_features | String | None | The number of features to consider when looking for the best split.
max_leaf_nodes | Integer | None | Grow trees with max_leaf_nodes in best-first fashion.
min_impurity_decrease | Float | 0.0 | A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
class_weight | String or Dict | None | Weights associated with classes in the form {class_label: weight}.
ccp_alpha | Float | 0.0 | Complexity parameter used for Minimal Cost-Complexity Pruning.
monotonic_cst | Array of Integers | None | Constraint to enforce monotonicity in the predictions with respect to certain features.
random_state | Integer | None | Controls the randomness of the estimator.
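A minimal sketch, assuming scikit-learn's DecisionTreeClassifier; the depth, leaf, and pruning values are illustrative:

```python
# Illustrative sketch: DecisionTreeClassifier with light pruning.
# Assumes scikit-learn; values are examples only.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=8, random_state=0)

tree = DecisionTreeClassifier(
    criterion="gini",
    max_depth=8,           # limit depth to curb overfitting
    min_samples_leaf=5,    # avoid leaves that memorize single samples
    ccp_alpha=0.001,       # light cost-complexity pruning
    random_state=0,
)
tree.fit(X, y)
print("Depth:", tree.get_depth(), "Training accuracy:", tree.score(X, y))
```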

Detailed Parameter Reference

Function to measure the quality of a split.
Technical Implementation Guidance
Use 'gini' for faster computation or 'entropy' for potentially better splits.
Effect on Model Performance
Determines the measure of impurity, directly affecting the quality of splits in the tree.
Valid Numerical Range
Options: gini, entropy
Strategy used to choose the split at each node.
Technical Implementation Guidance
Use 'best' for optimal splits or 'random' to introduce randomness and possibly improve generalization.
Effect on Model Performance
Influences the tree structure; 'best' yields deterministic splits while 'random' may reduce overfitting.
Valid Numerical Range
Options: best, random
The maximum depth of the tree.
Technical Implementation Guidance
None for full growth or a value between 5 and 15 to prevent overfitting.
Effect on Model Performance
Controls the complexity of the tree; deeper trees can capture more nuances but may overfit.
Valid Numerical Range
1-50 or None
The minimum number of samples required to split an internal node.
Technical Implementation Guidance
2 for maximum growth; higher values (5-10) to reduce overfitting.
Effect on Model Performance
Ensures that splits are statistically significant, reducing the chance of overfitting on noise.
Valid Numerical Range
2-20
The minimum number of samples required to be at a leaf node.
Technical Implementation Guidance
Typically between 1 and 5 to ensure leaves are not too specific.
Effect on Model Performance
Helps to smooth the model by ensuring that each leaf node contains a minimum number of samples.
Valid Numerical Range
1-50
The minimum weighted fraction of the sum total of weights required to be at a leaf node.
Technical Implementation Guidance
0.0 for standard datasets; adjust for datasets with weighted samples.
Effect on Model Performance
Prevents the creation of leaves that do not represent a significant portion of the total weight.
Valid Numerical Range
0.0-0.5
The number of features to consider when looking for the best split.
Technical Implementation Guidance
None for using all features; consider 'sqrt' or 'log2' for high-dimensional data.
Effect on Model Performance
Limits the number of features, which can reduce overfitting and improve computational efficiency.
Valid Numerical Range
Options: auto, sqrt, log2, None
Grow trees with max_leaf_nodes in best-first fashion.
Technical Implementation Guidance
None for no limit; set a positive integer to restrict the tree size and reduce overfitting.
Effect on Model Performance
Restricts the number of terminal nodes, controlling the complexity of the model.
Valid Numerical Range
None or positive integer
A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
Technical Implementation Guidance
0.0 for standard splitting; increase slightly to avoid insignificant splits.
Effect on Model Performance
Prevents splits that do not significantly reduce impurity, thereby controlling overfitting.
Valid Numerical Range
>=0
Weights associated with classes in the form {class_label: weight}.
Technical Implementation Guidance
Use 'balanced' to automatically adjust weights for imbalanced datasets.
Effect on Model Performance
Helps the model focus on minority classes by assigning higher weights to underrepresented classes.
Valid Numerical Range
None, balanced, or custom dictionary
Complexity parameter used for Minimal Cost-Complexity Pruning.
Technical Implementation Guidance
0.0 for no pruning; use small positive values to effectively prune the tree.
Effect on Model Performance
Controls the trade-off between tree complexity and training accuracy, helping to prevent overfitting.
Valid Numerical Range
>=0
Constraint to enforce monotonicity in the predictions with respect to certain features.
Technical Implementation Guidance
None unless monotonicity is a required constraint for the application.
Effect on Model Performance
Ensures that predictions change monotonically with certain features, which is critical for some regulatory or business cases.
Valid Numerical Range
None or list of integers
Controls the randomness of the estimator.
Technical Implementation Guidance
Set to a fixed integer to ensure reproducibility of results.
Effect on Model Performance
Ensures that the results are consistent across different runs by controlling random elements in the model.
Valid Numerical Range
Any integer or None
When to use Gradient Boosting: ideal for complex datasets where boosting can iteratively improve weak learners to achieve high accuracy.
Limitations of Gradient Boosting: can be sensitive to noisy data and outliers; training may be time-consuming with many boosting stages.

Statistical Foundation

Gradient Boosting builds an ensemble sequentially, fitting each new tree to the gradient of the loss (log loss for classification) with respect to the current predictions, so each stage corrects the errors of the previous ones.

Computational Efficiency

Training complexity scales with input dimensionality and sample size, with efficient matrix operations for production deployment.

Data Requirements

Handles continuous and categorical features (after encoding) without requiring normalization; noisy labels and outliers deserve attention because boosting can amplify their influence.
Configure these parameters to optimize the Gradient Boosting for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
n_estimators | Integer | 100 | Number of boosting stages to perform.
learning_rate | Float | 0.1 | Learning rate shrinks the contribution of each tree by the learning_rate.
loss | String | deviance | Loss function to be optimized.
subsample | Float | 1.0 | The fraction of samples used for fitting the individual base learners.
criterion | String | friedman_mse | The function to measure the quality of a split.
min_samples_split | Integer | 2 | The minimum number of samples required to split an internal node.
min_samples_leaf | Integer | 1 | The minimum number of samples required to be at a leaf node.
min_weight_fraction_leaf | Float | 0.0 | The minimum weighted fraction of the sum total of weights required to be at a leaf node.
max_depth | Integer | 3 | The maximum depth of the individual regression estimators.
min_impurity_decrease | Float | 0.0 | A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
init | String or Estimator | None | An estimator object that is used to compute the initial predictions.
random_state | Integer | None | Controls the randomness of the estimator.
max_features | String or Integer | None | The number of features to consider when looking for the best split.
verbose | Integer | 0 | Enable verbose output.
max_leaf_nodes | Integer | None | Grow trees with max_leaf_nodes in best-first fashion.
warm_start | Boolean | False | When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble.
validation_fraction | Float | 0.1 | Proportion of training data to set aside as validation set for early stopping.
n_iter_no_change | Integer | None | Number of iterations with no improvement to wait before early stopping.
tol | Float | 1e-4 | Tolerance for the early stopping.
ccp_alpha | Float | 0.0 | Complexity parameter used for Minimal Cost-Complexity Pruning.
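A minimal sketch, assuming scikit-learn's GradientBoostingClassifier; the values below are illustrative starting points, not tuned settings:

```python
# Illustrative sketch: GradientBoostingClassifier with core table parameters.
# Assumes scikit-learn; values are examples only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)

gbm = GradientBoostingClassifier(
    n_estimators=200,
    learning_rate=0.1,     # lower rates usually need more estimators
    max_depth=3,           # shallow trees are typical for boosting
    subsample=0.9,         # <1.0 adds randomness and reduces variance
    random_state=0,
)
gbm.fit(X, y)
print("Training accuracy:", gbm.score(X, y))
```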

Detailed Parameter Reference

Number of boosting stages to perform.
Technical Implementation Guidance
Typically between 100 and 300; more stages can improve performance but may overfit.
Effect on Model Performance
More boosting stages generally improve model performance until diminishing returns or overfitting occur.
Valid Numerical Range
10-500
Learning rate shrinks the contribution of each tree by the learning_rate.
Technical Implementation Guidance
Commonly set to 0.1; lower values require more estimators, while higher values can lead to overfitting.
Effect on Model Performance
Controls the trade-off between the learning rate and the number of estimators; lower values reduce the risk of overfitting.
Valid Numerical Range
0.01-1.0
Loss function to be optimized.
Technical Implementation Guidance
'deviance' is standard for classification tasks; 'exponential' can be used for alternative boosting behavior.
Effect on Model Performance
Determines the optimization objective, influencing how the model penalizes errors.
Valid Numerical Range
Options: deviance, exponential
The fraction of samples used for fitting the individual base learners.
Technical Implementation Guidance
Often set to 1.0 for full sampling; lower values can introduce bias to reduce variance.
Effect on Model Performance
Reduces overfitting by using a subset of data for each estimator, though too low a value may underfit.
Valid Numerical Range
0.1-1.0
The function to measure the quality of a split.
Technical Implementation Guidance
'friedman_mse' is generally preferred for its performance improvements over mse and mae.
Effect on Model Performance
Influences the tree split quality, affecting the overall performance of the boosting process.
Valid Numerical Range
Options: friedman_mse, mse, mae
The minimum number of samples required to split an internal node.
Technical Implementation Guidance
2 for maximum flexibility; increase to 5-10 to reduce overfitting on small datasets.
Effect on Model Performance
Controls the granularity of the splits, affecting the model’s complexity and risk of overfitting.
Valid Numerical Range
2-20
The minimum number of samples required to be at a leaf node.
Technical Implementation Guidance
Typically between 1 and 5; higher values can smooth the model and reduce variance.
Effect on Model Performance
Ensures that leaf nodes have enough samples to provide stable estimates, reducing overfitting.
Valid Numerical Range
1-50
The minimum weighted fraction of the sum total of weights required to be at a leaf node.
Technical Implementation Guidance
0.0 is standard; adjust if dealing with weighted samples or imbalanced data.
Effect on Model Performance
Prevents splits that result in leaves representing an insignificant portion of the data.
Valid Numerical Range
0.0-0.5
The maximum depth of the individual regression estimators.
Technical Implementation Guidance
A depth of 3 is common for balancing model complexity and performance.
Effect on Model Performance
Limits the complexity of individual trees, preventing them from overfitting to the training data.
Valid Numerical Range
1-50
A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
Technical Implementation Guidance
0.0 is standard; small positive values can be set to prevent insignificant splits.
Effect on Model Performance
Helps in controlling overfitting by ensuring that only splits with sufficient impurity reduction are considered.
Valid Numerical Range
>=0
An estimator object that is used to compute the initial predictions.
Technical Implementation Guidance
None for default behavior; provide a custom estimator if prior predictions are available.
Effect on Model Performance
Can improve convergence speed if a good initial estimator is provided.
Valid Numerical Range
None or valid estimator
Controls the randomness of the estimator.
Technical Implementation Guidance
Set to a fixed integer for reproducibility.
Effect on Model Performance
Ensures consistency of the results across multiple runs by controlling random aspects of the model.
Valid Numerical Range
Any integer or None
The number of features to consider when looking for the best split.
Technical Implementation Guidance
None for using all features; 'sqrt' or 'log2' can be used for high-dimensional data.
Effect on Model Performance
Limits the number of features for split consideration, potentially reducing overfitting and speeding up computation.
Valid Numerical Range
Options: auto, sqrt, log2, None, or positive integer
Enable verbose output.
Technical Implementation Guidance
0 for no output; increase for detailed logging during training.
Effect on Model Performance
Helps monitor the training process and debug issues by providing detailed output.
Valid Numerical Range
Non-negative integers
Grow trees with max_leaf_nodes in best-first fashion.
Technical Implementation Guidance
None for unlimited nodes; specify a number to control tree complexity.
Effect on Model Performance
Restricts the number of terminal nodes in each tree, helping to reduce overfitting.
Valid Numerical Range
None or positive integer
When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble.
Technical Implementation Guidance
False unless incremental learning is desired.
Effect on Model Performance
Allows the model to continue training from a previous state, potentially saving computation time.
Valid Numerical Range
True/False
Proportion of training data to set aside as validation set for early stopping.
Technical Implementation Guidance
0.1 is common; adjust based on the size of your training set.
Effect on Model Performance
Determines the amount of data used for early stopping, influencing the model’s ability to prevent overfitting.
Valid Numerical Range
0.0-1.0
Number of iterations with no improvement to wait before early stopping.
Technical Implementation Guidance
Set based on desired patience; None to disable early stopping (see the early-stopping sketch at the end of this parameter reference)
Effect on Model Performance
Enables early stopping if the model stops improving, potentially saving computation time and preventing overfitting.
Valid Numerical Range
Any positive integer or None
Tolerance for the early stopping.
Technical Implementation Guidance
1e-4 is standard; adjust if convergence issues are observed.
Effect on Model Performance
Defines the threshold for stopping the training process early, affecting convergence behavior.
Valid Numerical Range
Any positive float
Complexity parameter used for Minimal Cost-Complexity Pruning.
Technical Implementation Guidance
0.0 for no pruning; small positive values can help reduce overfitting.
Effect on Model Performance
Regularizes the complexity of the trees, ensuring that overly complex structures are pruned away.
Valid Numerical Range
>=0
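To close this model's reference, here is a hedged sketch of the early-stopping options (validation_fraction, n_iter_no_change, tol) described above, again assuming a scikit-learn backend:

```python
# Illustrative sketch: early stopping in GradientBoostingClassifier.
# Assumes scikit-learn; values are examples only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)

gbm = GradientBoostingClassifier(
    n_estimators=1000,        # upper bound; early stopping may halt sooner
    validation_fraction=0.1,  # hold out 10% of training data internally
    n_iter_no_change=10,      # stop after 10 stages without improvement
    tol=1e-4,                 # minimum improvement that counts
    random_state=1,
)
gbm.fit(X, y)
print("Boosting stages actually fitted:", gbm.n_estimators_)
```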
When to use XGBoost: ideal for large-scale classification tasks, especially when dealing with structured and sparse data.
Limitations of XGBoost: sensitive to hyperparameter tuning and may overfit if not regularized properly.

Statistical Foundation

XGBoost implements regularized gradient boosting: it uses second-order (gradient and Hessian) approximations of the loss, applies L1/L2 penalties to leaf weights, and handles sparse inputs efficiently.

Computational Efficiency

Training complexity scales with input dimensionality and sample size, with efficient matrix operations for production deployment.

Data Requirements

Handles continuous features and sparse inputs natively without normalization; categorical variables must be encoded numerically before training.
Configure these parameters to optimize the XGBoost for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
n_estimators | Integer | 100 | Number of boosting rounds.
learning_rate | Float | 0.1 | Learning rate shrinks the contribution of each tree.
max_depth | Integer | 6 | Maximum depth of a tree.
min_child_weight | Float | 1 | Minimum sum of instance weight needed in a child.
subsample | Float | 1.0 | Subsample ratio of the training instance.
colsample_bytree | Float | 1.0 | Subsample ratio of columns when constructing each tree.
gamma | Float | 0 | Minimum loss reduction required to make a further partition.
alpha | Float | 0 | L1 regularization term on weights.
lambda | Float | 1 | L2 regularization term on weights.
random_state | Integer | None | Seed for random number generator.
objective | String | binary:logistic | Specify the learning task and objective.
booster | String | gbtree | Type of boosting model to use.
verbosity | Integer | 1 | Verbosity of printing messages.
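A minimal sketch, assuming the standard xgboost Python package (XGBClassifier) as the backend; parameter names such as reg_alpha and reg_lambda map to the alpha and lambda entries above, and the values are illustrative:

```python
# Illustrative sketch: XGBClassifier via the scikit-learn-style wrapper.
# Assumes the xgboost package; values are examples only.
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=600, n_features=12, random_state=0)

xgb = XGBClassifier(
    n_estimators=300,
    learning_rate=0.1,
    max_depth=6,
    subsample=0.9,
    colsample_bytree=0.8,
    reg_alpha=0.0,               # L1 term (alpha in the table)
    reg_lambda=1.0,              # L2 term (lambda in the table)
    objective="binary:logistic",
    random_state=0,
)
xgb.fit(X, y)
print("Training accuracy:", xgb.score(X, y))
```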

Detailed Parameter Reference

Number of boosting rounds.
Technical Implementation Guidance
Typically between 100 and 500 for balanced performance.
Effect on Model Performance
More boosting rounds can improve performance but also increase the risk of overfitting.
Valid Numerical Range
10-1000
Learning rate shrinks the contribution of each tree.
Technical Implementation Guidance
Usually set between 0.01 and 0.3.
Effect on Model Performance
Lower values improve model robustness but require more estimators to converge.
Valid Numerical Range
0.01-1.0
Maximum depth of a tree.
Technical Implementation Guidance
Commonly between 3 and 10, depending on dataset complexity.
Effect on Model Performance
Controls model complexity; deeper trees capture more details but may overfit.
Valid Numerical Range
1-50
Minimum sum of instance weight needed in a child.
Technical Implementation Guidance
Typically 1 or greater; higher values can help prevent overfitting.
Effect on Model Performance
Ensures that nodes have sufficient information before further splitting, reducing overfitting.
Valid Numerical Range
0-10
Subsample ratio of the training instance.
Technical Implementation Guidance
Usually set around 0.8 to 1.0.
Effect on Model Performance
Reduces overfitting by using a fraction of the training data for each tree.
Valid Numerical Range
0.1-1.0
Subsample ratio of columns when constructing each tree.
Technical Implementation Guidance
Commonly between 0.7 and 1.0.
Effect on Model Performance
Helps prevent overfitting by randomly sampling features for each tree.
Valid Numerical Range
0.1-1.0
Minimum loss reduction required to make a further partition.
Technical Implementation Guidance
Set to 0 for default behavior; adjust upward to make the algorithm more conservative.
Effect on Model Performance
Controls the complexity by requiring a minimum loss reduction to split nodes.
Valid Numerical Range
0-5
L1 regularization term on weights.
Technical Implementation Guidance
0 for default; higher values can induce sparsity.
Effect on Model Performance
Adds a penalty for large coefficients, encouraging a simpler model.
Valid Numerical Range
0-5
L2 regularization term on weights.
Technical Implementation Guidance
Typically 1; adjust to reduce overfitting.
Effect on Model Performance
Helps stabilize the learning process and reduce overfitting by penalizing large weights.
Valid Numerical Range
0-5
Seed for random number generator.
Technical Implementation Guidance
Set to a fixed integer to ensure reproducibility.
Effect on Model Performance
Ensures consistent results across runs by controlling randomness.
Valid Numerical Range
Any integer or None
Specify the learning task and objective.
Technical Implementation Guidance
Choose based on the task: binary classification, multi-class classification, or regression.
Effect on Model Performance
Determines the prediction problem type and influences the loss function used.
Valid Numerical Range
Options: binary:logistic, multi:softmax, reg:squarederror
Type of boosting model to use.
Technical Implementation Guidance
gbtree is common; consider gblinear for linear models or dart for dropout boosting.
Effect on Model Performance
Impacts model performance and training speed depending on the boosting algorithm chosen.
Valid Numerical Range
Options: gbtree, gblinear, dart
Verbosity of printing messages.
Technical Implementation Guidance
1 for moderate output; adjust based on desired level of logging.
Effect on Model Performance
Controls the amount of training information printed, aiding in debugging and monitoring.
Valid Numerical Range
Non-negative integers
When to use CatBoost: ideal for datasets with a high proportion of categorical features and when minimal preprocessing is desired.
Limitations of CatBoost: may require careful tuning on numerical features and can be computationally intensive with large datasets.

Statistical Foundation

CatBoost is a gradient boosting implementation with ordered boosting and built-in target-based encoding of categorical features, which reduces prediction shift and the amount of manual preprocessing required.

Computational Efficiency

Training complexity scales with input dimensionality and sample size, with efficient matrix operations for production deployment.

Data Requirements

Accepts categorical features natively alongside continuous ones, with no one-hot encoding or normalization required; you only need to declare which columns are categorical.
Configure these parameters to optimize the CatBoost for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
iterations | Integer | 1000 | The number of boosting iterations.
learning_rate | Float | 0.03 | Learning rate.
depth | Integer | 6 | Depth of the tree.
l2_leaf_reg | Float | 3 | L2 regularization term on weights.
bagging_temperature | Float | 1.0 | Controls the Bayesian bootstrap sampling.
random_strength | Float | 1.0 | Score regularization coefficient.
border_count | Integer | 254 | The number of splits for numerical features.
random_seed | Integer | None | Random number generator seed.
boosting_type | String | Plain | Type of boosting used.
verbose | Integer | 0 | Verbosity level.
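A minimal sketch, assuming the catboost Python package (CatBoostClassifier) as the backend; categorical columns would normally be declared via the cat_features argument, and the values shown are illustrative:

```python
# Illustrative sketch: CatBoostClassifier with core table parameters.
# Assumes the catboost package; values are examples only.
from catboost import CatBoostClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=600, n_features=12, random_state=0)

cat = CatBoostClassifier(
    iterations=500,
    learning_rate=0.03,
    depth=6,
    l2_leaf_reg=3,
    random_seed=0,
    verbose=0,          # silence per-iteration logging
)
# For real data with categorical columns, pass their indices or names
# via cat_features=... so CatBoost encodes them natively.
cat.fit(X, y)
print("Training accuracy:", cat.score(X, y))
```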

Detailed Parameter Reference

The number of boosting iterations.
Technical Implementation Guidance
Typically between 500 and 1500, depending on dataset size and complexity.
Effect on Model Performance
More iterations can lead to better performance but increase the risk of overfitting and training time.
Valid Numerical Range
10-10000
Learning rate.
Technical Implementation Guidance
Commonly set between 0.01 and 0.1 to balance between convergence speed and overfitting.
Effect on Model Performance
Lower learning rates provide more robust models at the expense of longer training times.
Valid Numerical Range
0.01-1.0
Depth of the tree.
Technical Implementation Guidance
Typically between 4 and 10; deeper trees capture more complex patterns but may overfit.
Effect on Model Performance
Controls the complexity of the model; deeper trees can capture more interactions but increase overfitting risk.
Valid Numerical Range
1-16
L2 regularization term on weights.
Technical Implementation Guidance
Commonly set to 3; adjust upward to reduce overfitting in complex models.
Effect on Model Performance
Helps regularize the model by penalizing large weights, reducing overfitting.
Valid Numerical Range
0-10
Controls the Bayesian bootstrap sampling.
Technical Implementation Guidance
Typically set around 1.0; higher values introduce more randomness, which may help with overfitting.
Effect on Model Performance
Affects the randomness of sample selection, influencing model robustness and generalization.
Valid Numerical Range
0-10
Score regularization coefficient.
Technical Implementation Guidance
Usually set to 1.0; increasing can add robustness to the model by smoothing predictions.
Effect on Model Performance
Regulates the model’s sensitivity to changes in the data, potentially improving generalization.
Valid Numerical Range
0-20
The number of splits for numerical features.
Technical Implementation Guidance
Default of 254 works well in many cases; lower values may speed up training with a slight trade-off in accuracy.
Effect on Model Performance
Determines how numerical features are binned, impacting both model performance and training efficiency.
Valid Numerical Range
1-255
Random number generator seed.
Technical Implementation Guidance
Set to a fixed integer for reproducibility.
Effect on Model Performance
Ensures consistency in results across different runs by controlling randomness.
Valid Numerical Range
Any integer or None
Type of boosting used.
Technical Implementation Guidance
'Plain' is common; 'Ordered' can help reduce prediction shift in some cases.
Effect on Model Performance
Determines the boosting algorithm, which affects both model performance and training behavior.
Valid Numerical Range
Options: Ordered, Plain
Verbosity level.
Technical Implementation Guidance
0 for minimal output; increase for more detailed training logs.
Effect on Model Performance
Controls the level of logging during training, aiding in debugging and performance monitoring.
Valid Numerical Range
Non-negative integers
When to use LightGBM: ideal for large-scale data and high-dimensional feature spaces where fast training and low memory usage are crucial.
Limitations of LightGBM: may require careful tuning and can be sensitive to overfitting on small datasets.

Statistical Foundation

LightGBM grows trees leaf-wise on histogram-binned features, fitting each new tree to the gradient of the loss; leaf-wise growth and feature binning make it fast and memory-efficient on large, high-dimensional data.

Computational Efficiency

Training complexity scales with input dimensionality and sample size, with efficient matrix operations for production deployment.

Data Requirements

Handles continuous and categorical features (categoricals can be used natively or encoded) without normalization; histogram binning also makes it fairly robust to outliers.
Configure these parameters to optimize the LightGBM for your specific use case. Each parameter affects model training, performance, and prediction accuracy.
Parameter | Type | Default | Description
num_leaves | Integer | 31 | Maximum number of leaves in one tree.
learning_rate | Float | 0.1 | Learning rate.
n_estimators | Integer | 100 | Number of boosting rounds.
max_depth | Integer | -1 | Maximum depth of a tree.
min_data_in_leaf | Integer | 20 | Minimum number of data needed in a leaf.
feature_fraction | Float | 1.0 | Subsample ratio of features.
bagging_fraction | Float | 1.0 | Subsample ratio of training data.
bagging_freq | Integer | 0 | Frequency of bagging.
lambda_l1 | Float | 0.0 | L1 regularization term.
lambda_l2 | Float | 0.0 | L2 regularization term.
min_split_gain | Float | 0.0 | Minimum gain to make a split.
random_state | Integer | None | Random number generator seed.
objective | String | binary | Learning objective.
verbose | Integer | 1 | Verbosity of output.
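A minimal sketch, assuming the lightgbm Python package (LGBMClassifier) as the backend; the scikit-learn-style wrapper uses slightly different names (min_child_samples for min_data_in_leaf, subsample/subsample_freq for bagging_fraction/bagging_freq, colsample_bytree for feature_fraction), and the values are illustrative:

```python
# Illustrative sketch: LGBMClassifier via the scikit-learn-style wrapper.
# Assumes the lightgbm package; values are examples only.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=800, n_features=15, random_state=0)

lgbm = LGBMClassifier(
    num_leaves=31,
    learning_rate=0.1,
    n_estimators=200,
    max_depth=-1,           # -1 means no depth limit
    min_child_samples=20,   # min_data_in_leaf in the table above
    subsample=0.9,          # bagging_fraction
    subsample_freq=1,       # bagging_freq (must be >=1 for subsample to apply)
    colsample_bytree=0.9,   # feature_fraction
    random_state=0,
)
lgbm.fit(X, y)
print("Training accuracy:", lgbm.score(X, y))
```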

Detailed Parameter Reference

Maximum number of leaves in one tree.
Technical Implementation Guidance
Typically between 20 and 100 for balanced models.
Effect on Model Performance
Affects the complexity of the tree; more leaves can capture more complex patterns but risk overfitting.
Valid Numerical Range
2-131072
Learning rate.
Technical Implementation Guidance
Commonly set between 0.05 and 0.2.
Effect on Model Performance
Determines the contribution of each tree; lower rates require more boosting rounds.
Valid Numerical Range
0.01-1.0
Number of boosting rounds.
Technical Implementation Guidance
Typically between 100 and 500.
Effect on Model Performance
More boosting rounds can improve performance but increase training time.
Valid Numerical Range
10-1000
Maximum depth of a tree.
Technical Implementation Guidance
Set to -1 for no limit or a positive integer to restrict depth.
Effect on Model Performance
Controls the complexity of individual trees; limiting depth can reduce overfitting.
Valid Numerical Range
-1 (no limit), or 1-50
Minimum number of data needed in a leaf.
Technical Implementation Guidance
Typically set to 20 or higher to ensure stable splits.
Effect on Model Performance
Ensures that leaves have sufficient samples to provide robust predictions.
Valid Numerical Range
1-10000
Subsample ratio of features.
Technical Implementation Guidance
Usually set around 0.8 to 1.0.
Effect on Model Performance
Helps reduce overfitting by randomly selecting a subset of features for each tree.
Valid Numerical Range
0.1-1.0
Subsample ratio of training data.
Technical Implementation Guidance
Often between 0.8 and 1.0.
Effect on Model Performance
Improves model generalization by using a fraction of data for each iteration.
Valid Numerical Range
0.1-1.0
Frequency of bagging.
Technical Implementation Guidance
Set to 0 to disable bagging, or a positive integer to specify frequency.
Effect on Model Performance
Determines how often bagging is performed; more frequent bagging can reduce overfitting.
Valid Numerical Range
0-10
L1 regularization term.
Technical Implementation Guidance
Typically 0.0; increase to add sparsity.
Effect on Model Performance
Helps reduce overfitting by penalizing large coefficients.
Valid Numerical Range
0.0-10.0
L2 regularization term.
Technical Implementation Guidance
Typically 0.0; adjust to prevent overfitting.
Effect on Model Performance
Adds stability to the model by penalizing large weights.
Valid Numerical Range
0.0-10.0
Minimum gain to make a split.
Technical Implementation Guidance
Usually set to 0.0; increase to require more significant splits.
Effect on Model Performance
Prevents insignificant splits, which can help reduce overfitting.
Valid Numerical Range
0.0-10.0
Random number generator seed.
Technical Implementation Guidance
Set to a fixed integer for reproducibility.
Effect on Model Performance
Ensures consistent results across different runs.
Valid Numerical Range
Any integer or None
Learning objective.
Technical Implementation Guidance
Choose based on task: 'binary' for binary classification, etc.
Effect on Model Performance
Determines the type of prediction problem the model will solve.
Valid Numerical Range
Options: binary, multiclass, regression
Verbosity of output.
Technical Implementation Guidance
1 for moderate output; adjust as needed.
Effect on Model Performance
Controls the level of logging during training.
Valid Numerical Range
Non-negative integers


Last updated: 3/22/2025
