AutoML (Automated Machine Learning) enables you to build, optimize, and deploy high-quality machine learning models with minimal effort. This powerful feature automatically discovers the best algorithms, hyperparameters, and preprocessing steps for your specific dataset, making machine learning accessible to users of all experience levels.
Reduce the time spent on model selection, hyperparameter tuning, and preprocessing by automating the entire process.
Discover model combinations and parameters that you might not have tried manually, leading to better overall model performance.
Create production-ready machine learning pipelines without writing a single line of code or understanding complex algorithms.
Leverage industry best practices for preprocessing, cross-validation, feature selection, and model evaluation automatically.
ML Clever provides three different approaches to start your AutoML process, depending on your workflow preferences and requirements:
Launch AutoML directly from your dataset view with minimal configuration using the AutoML button.
Ideal for: Quick exploration of your dataset's potential or when you want a completely automated solution with minimal input.
Manually preprocess your data first to apply domain-specific transformations, then use AutoML for model selection and optimization.
Ideal for: Users who want to apply their domain knowledge to data preparation while letting AutoML handle the model selection and optimization.
Create a complete end-to-end pipeline that includes preprocessing steps and AutoML in a reusable workflow.
Ideal for: Production workflows, recurring tasks, or when you need to standardize your ML process across multiple datasets.
Pro Tip: Check the "ML Readiness" tab on your dataset view page before starting AutoML. Addressing any data quality issues first can significantly improve your AutoML results.
ML Clever offers three levels of automation to match your specific needs and time constraints:
Quick Exploration: A rapid analysis that tries several common models with basic optimization to give you quick insights.
Ideal for initial dataset exploration or when you need quick results.
Balanced: A well-rounded approach that balances thoroughness with reasonable time constraints.
Suitable for most use cases and production-ready models.
Thorough Optimization: An extensive search for the best-performing model with comprehensive optimization.
Best for critical applications where model performance is paramount.
When you start an AutoML training job, ML Clever systematically works through the following stages:
AutoML begins by analyzing your dataset's characteristics, such as its size, the number and types of features, and the presence of missing values.
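As a rough illustration of what such a dataset analysis can look like, the sketch below profiles a small list-of-dicts dataset in plain Python. It is not ML Clever's implementation: the `profile_columns` helper, the sample rows, and the use of `None` for missing values are all assumptions made for this example.

```python
def profile_columns(rows):
    """Summarize each column of a list-of-dicts dataset.

    Hypothetical helper for illustration; None marks a missing value.
    """
    # Collect column names in first-seen order.
    names = list(dict.fromkeys(name for row in rows for name in row))
    profile = {}
    for name in names:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        # Treat a column as numeric only if every observed value is a number.
        numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        profile[name] = {
            "kind": "numeric" if numeric else "categorical",
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
    return profile


# Toy data, invented for this example.
rows = [
    {"age": 34, "plan": "basic", "churned": "no"},
    {"age": None, "plan": "pro", "churned": "yes"},
    {"age": 51, "plan": "basic", "churned": "no"},
    {"age": 29, "plan": None, "churned": "yes"},
]

summary = profile_columns(rows)
for name, stats in summary.items():
    print(name, stats)
```

A summary like this is the kind of information that can drive later choices, for example which columns need imputation and which need encoding.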
Based on the data analysis, AutoML selects appropriate preprocessing techniques, such as imputation, encoding, scaling, and feature transformations.
AutoML selects and trains multiple model types suited to your task, ranging from linear models to tree-based ensembles and neural networks.
For promising models, AutoML tunes hyperparameters to improve performance further.
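The search strategy ML Clever uses for tuning is not described here. To make the general idea concrete, the following sketch uses a plain grid search scored by k-fold cross-validation, applied to a toy one-dimensional k-nearest-neighbors classifier. The functions, the data, and the candidate values of `k` are hypothetical and chosen only for illustration.

```python
def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def k_fold_indices(n, folds):
    """Yield (train_idx, test_idx); fold i tests every index j with j % folds == i."""
    for i in range(folds):
        test_idx = [j for j in range(n) if j % folds == i]
        train_idx = [j for j in range(n) if j % folds != i]
        yield train_idx, test_idx


def cv_accuracy(data, k, folds=3):
    """Fraction of points classified correctly when each is held out once."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(data), folds):
        train = [data[j] for j in train_idx]
        for j in test_idx:
            x, label = data[j]
            correct += int(knn_predict(train, x, k) == label)
    return correct / len(data)


def grid_search(data, candidates, folds=3):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: cv_accuracy(data, k, folds) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy (feature, label) pairs, invented for this example.
data = [
    (1.0, 0), (1.2, 0), (1.9, 0), (2.1, 0), (2.4, 0), (2.8, 1),
    (3.0, 0), (3.3, 1), (3.9, 1), (4.2, 1), (4.6, 1), (5.0, 1),
]

# Odd values of k avoid tied votes with two classes.
best_k, scores = grid_search(data, candidates=[1, 3, 5])
print("best k:", best_k, "scores:", scores)
```

Scoring each candidate on held-out folds, rather than on the data it was trained on, is what keeps the search from simply rewarding the most overfitted setting.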
AutoML builds ensemble models that combine top-performing individual models.
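How ML Clever combines models is not specified in this guide. One common and simple approach is to average the predicted probabilities of several models, optionally with weights. The sketch below assumes that approach; `average_ensemble` and the three prediction lists are hypothetical.

```python
def average_ensemble(predictions, weights=None):
    """Combine per-model probability lists into one weighted-average list."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Predicted probabilities of the positive class from three
# hypothetical models, for the same three samples.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.4]
model_c = [0.8, 0.3, 0.8]

blended = average_ensemble([model_a, model_b, model_c])
labels = [int(p >= 0.5) for p in blended]
print(blended, labels)
```

In this toy case the third sample is the interesting one: `model_b` alone would predict the negative class, but the blended probability is pulled above 0.5 by the other two models.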
AutoML performs a comprehensive evaluation to select the best model.
Once your AutoML process completes, ML Clever presents the results in an organized dashboard:
The Top Models section displays all models created during your AutoML run, sorted by performance.
Note: Clicking on any model card will take you to the detailed model page where you can explore performance metrics, feature importance, confusion matrices (for classification), and other in-depth analyses.
Follow these guidelines to get the most out of ML Clever's AutoML capabilities:
While AutoML handles preprocessing, addressing obvious data quality issues (extreme outliers, duplicate records, irrelevant columns) beforehand can improve results significantly.
Match the AutoML level to your use case. For initial exploration, choose Quick Exploration; for critical applications with high performance requirements, choose Thorough Optimization.
Thorough optimization requires significant computing power. For large datasets, consider running AutoML during off-hours or upgrading your compute resources.
While AutoML selects the best overall model, review the top 3-5 models. Sometimes the second-best model might be preferable due to explainability, inference speed, or other factors.
For critical applications, use AutoML as a starting point, then fine-tune the best model with your domain expertise using the manual model configuration options.
When you find an effective combination of preprocessing steps and AutoML settings, save it as a pipeline template for future use with similar datasets.
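The data-cleaning guideline above mentions duplicate records and extreme outliers. If you clean data outside ML Clever before uploading it, a check along these lines is one option. This is a minimal sketch using only the Python standard library; the helper names, the sample values, and the outlier threshold of 3 times the interquartile range are assumptions, not ML Clever settings.

```python
import statistics


def drop_duplicates(rows):
    """Remove exact duplicate records while keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_outliers(values, factor=3.0):
    """Return values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [v for v in values if v < low or v > high]


# Toy records, invented for this example.
records = [
    {"id": 1, "age": 30},
    {"id": 2, "age": 32},
    {"id": 1, "age": 30},  # exact duplicate of the first record
]
ages = [30, 32, 35, 31, 29, 33, 34, 900]  # 900 is a likely entry error

cleaned = drop_duplicates(records)
suspects = iqr_outliers(ages)
print(len(cleaned), suspects)
```

Flagged values are worth reviewing by hand before removal, since an extreme value can be a data-entry error or a genuine rare case.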
Occasionally, the AutoML process might stop unexpectedly or fail to complete.
If all AutoML models show unexpectedly poor performance, there might be underlying data issues.
AutoML can sometimes take longer than expected, especially with complex datasets.
Very large datasets or complex models might exceed available memory.
Runtime varies significantly based on dataset size, number of features, and the selected AutoML level. Quick Exploration usually completes in 2-5 minutes, Balanced takes 10-20 minutes, and Thorough Optimization can take 30-60+ minutes for complex datasets. The progress indicators will show estimated completion time.
You can stop an AutoML run at any time using the "Stop Task" button. ML Clever saves all models completed up to that point, so you can review partial results. This is useful if you've already found a satisfactory model or need to adjust your approach.
For classification tasks, AutoML evaluates models including Logistic Regression, Decision Trees, Random Forests, Gradient Boosting, Support Vector Machines, K-Nearest Neighbors, and Neural Networks. For regression, it tries Linear Regression, Ridge/Lasso Regression, Decision Trees, Random Forests, Gradient Boosting (XGBoost, LightGBM), and Neural Networks. The specific set varies based on your data characteristics and chosen AutoML level.
AutoML automatically detects and handles missing values using imputation techniques appropriate for your data. For numerical features, it may use mean, median, or model-based imputation. For categorical features, it typically uses mode imputation or creates a dedicated "missing" category. Categorical features are automatically encoded using appropriate methods like one-hot, label, or target encoding based on cardinality and relationship with the target.
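To show what the techniques named above do to the data, here is a small standard-library sketch of median imputation for a numerical column, a dedicated "missing" category for a categorical column, and one-hot encoding. The function names and sample values are hypothetical, and ML Clever may choose different methods for your data.

```python
import statistics


def impute_numeric(values):
    """Replace None with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]


def impute_categorical(values, fill="missing"):
    """Replace None with a dedicated category label."""
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Encode categories as 0/1 indicator rows; columns follow sorted order."""
    categories = sorted(set(values))
    encoded = [[int(v == c) for c in categories] for v in values]
    return encoded, categories


# Toy columns, invented for this example.
ages = [34, None, 51, 29]
plans = ["basic", "pro", None, "basic"]

ages_filled = impute_numeric(ages)
plans_filled = impute_categorical(plans)
plans_encoded, plan_columns = one_hot(plans_filled)

print(ages_filled)
print(plan_columns, plans_encoded)
```

One-hot encoding adds one column per category, which is why high-cardinality features are often better served by label or target encoding.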
You can see which preprocessing steps AutoML applied in the detailed model view (accessible by clicking on any model card). Its "Preprocessing Pipeline" section shows all preprocessing steps applied to your data, including imputation methods, encoding techniques, scaling approaches, and feature transformations.
When choosing among the top models, consider these factors beyond the primary score: (1) Secondary metrics specific to your problem (e.g., precision vs. recall for imbalanced classification), (2) Model explainability if interpretability is important, (3) Inference speed for real-time applications, (4) Memory footprint for deployment constraints, and (5) Robustness to data drift for long-term stability.
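A selection rule of this kind can be written down explicitly. The sketch below computes precision and recall from confusion-matrix counts, then picks the highest-scoring model that also meets a minimum recall and a maximum inference latency. All model names, scores, counts, and latencies are invented for illustration and are not ML Clever results.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_model(candidates, min_recall, max_latency_ms):
    """Return the best-scoring candidate that satisfies both constraints."""
    eligible = []
    for model in candidates:
        _, recall = precision_recall(model["tp"], model["fp"], model["fn"])
        if recall >= min_recall and model["latency_ms"] <= max_latency_ms:
            eligible.append(model)
    return max(eligible, key=lambda m: m["score"]) if eligible else None


# Hypothetical candidates with made-up numbers.
candidates = [
    {"name": "gradient_boosting", "score": 0.91,
     "tp": 80, "fp": 10, "fn": 20, "latency_ms": 45},
    {"name": "random_forest", "score": 0.90,
     "tp": 90, "fp": 30, "fn": 10, "latency_ms": 30},
    {"name": "logistic_regression", "score": 0.87,
     "tp": 85, "fp": 25, "fn": 15, "latency_ms": 2},
]

choice = pick_model(candidates, min_recall=0.85, max_latency_ms=40)
print(choice["name"] if choice else "no model meets the constraints")
```

With these numbers the top-scoring model is rejected because its recall is 0.80 and its latency exceeds the limit, so the second-best model by primary score is selected instead.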
Detailed guide to regression model types, metrics, and use cases
In-depth explanation of classification model options and evaluation
Learn about manual preprocessing options for your datasets
Understand how to interpret model performance metrics and charts