Recently, Professors Risheng Liu of Dalian University of Technology and Zhouchen Lin of Peking University collaborated on an opinion article published in the National Science Review (NSR). Their article examines automated machine learning (AutoML) from the perspective of bilevel optimization, modeling various AutoML tasks in a unified way while exploring the challenges and opportunities. The article will be included in the NSR’s special topic on “Automating Machine Learning.”
Generally, AutoML requires automating three key tasks: meta-feature learning, neural network architecture search, and hyperparameter optimization. Bilevel optimization (BLO) is an effective mathematical tool for modeling these tasks, and it provides a unified AutoML framework. This framework serves the core objective of AutoML: constructing high-performance models with minimal human intervention.
Specifically, the core variables in the upper-level optimization are “meta-parameters” (such as meta-features, network structures, and hyperparameters). The upper level seeks the optimal “methodology,” meaning the choice of meta-parameters that gives the machine learning model its best performance on the validation set. The core variables in the lower-level optimization, on the other hand, are “model parameters,” and the lower level focuses on optimizing model performance on the training set.
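The two levels described above can be written compactly in the standard bilevel form. This is a generic sketch, not notation taken from the NSR article: the symbols $\mathbf{x}$ (meta-parameters), $\mathbf{y}$ (model parameters), $F$ (validation objective), and $f$ (training objective) are assumptions introduced here for illustration.

```latex
\min_{\mathbf{x}} \; F\bigl(\mathbf{x}, \mathbf{y}^{*}\bigr)
\quad \text{s.t.} \quad
\mathbf{y}^{*} \in \operatorname*{arg\,min}_{\mathbf{y}} \; f(\mathbf{x}, \mathbf{y})
```

In hyperparameter optimization, for example, $\mathbf{x}$ would be a regularization weight, $f$ the regularized training loss, and $F$ the validation loss evaluated at the trained parameters $\mathbf{y}^{*}$.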
Currently, ML/AutoML technologies, represented by gradient-based BLO algorithms, have gradually gained prominence. However, they still face numerous challenges in practical applications.
For instance, some algorithms rely heavily on the lower-level problem being convex and having a unique solution (the “singleton” condition), which limits their practicality in real-world scenarios. Additionally, when approximations are substituted in practical applications, rigorous theoretical analysis of the algorithms’ convergence is lacking.
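To make the gradient-based approach concrete, here is a minimal sketch of a toy bilevel problem in plain Python. It is not the method from the NSR article. It tunes a ridge penalty `lam` (upper level) for a one-parameter linear model `w` (lower level), and the function names and the `TRAIN`/`VAL` data are invented for illustration. The lower-level problem here is convex with a unique solution, which is exactly the favorable case the assumptions above describe; that is why the gradient with respect to `lam` can be computed exactly.

```python
# Toy bilevel problem (illustrative assumptions throughout):
#   upper level: choose lam >= 0 to minimize validation error
#   lower level: w*(lam) = argmin_w sum((x*w - y)^2) + lam * w^2

TRAIN = [(1.0, 2.4), (2.0, 4.6), (3.0, 7.5)]  # hypothetical noisy training pairs (x, y)
VAL = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]    # hypothetical validation pairs (x, y)


def solve_lower(lam, train):
    """Lower level: closed-form minimizer of the ridge-regularized training loss."""
    sxx = sum(x * x for x, _ in train)
    sxy = sum(x * y for x, y in train)
    return sxy / (sxx + lam)


def upper_loss(w, val):
    """Upper-level objective: squared error on the validation set."""
    return sum((x * w - y) ** 2 for x, y in val)


def hypergradient(lam, train, val):
    """dF/dlam via the chain rule through the lower-level solution w*(lam)."""
    sxx = sum(x * x for x, _ in train)
    w = solve_lower(lam, train)
    dF_dw = sum(2 * x * (x * w - y) for x, y in val)
    # Differentiating the stationarity condition (sxx + lam) * w = sxy
    # with respect to lam gives dw/dlam = -w / (sxx + lam).
    dw_dlam = -w / (sxx + lam)
    return dF_dw * dw_dlam


def tune(train, val, lam=1.0, lr=0.5, steps=200):
    """Projected gradient descent on lam, clipping to keep lam >= 0."""
    for _ in range(steps):
        lam = max(0.0, lam - lr * hypergradient(lam, train, val))
    return lam, solve_lower(lam, train)


if __name__ == "__main__":
    best_lam, best_w = tune(TRAIN, VAL)
    print("lam:", best_lam, "w:", best_w)
```

When the lower-level problem is non-convex or has several minimizers, `solve_lower` no longer returns a single well-defined answer and the chain-rule step in `hypergradient` breaks down, which is the practical limitation described above.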
Looking ahead, the main challenges and promising research directions for BLO in AutoML include the following:
- Compute Acceleration: As datasets expand and task complexity grows, there is an urgent need to speed up BLO algorithms on large-scale, high-dimensional AutoML tasks. Parallel/distributed computing technologies could serve as an effective approach to address this issue.
- Theoretical Breakthroughs: Presently, gradient-based BLO methods rely heavily on stringent theoretical assumptions, such as convexity of the lower-level problem and uniqueness of its solution. To meet the demands of real-world applications, new theoretical analysis frameworks and efficient computational methods are needed to better handle more challenging practical scenarios involving non-convexity and discreteness.
- Optimization-Derived Learning: From the new perspective of bilevel optimization, we can explore disruptive AutoML technologies that incorporate the Simulating Learning Methodology (SLeM), especially in combination with large models. This exploration involves delving deeper into the underlying logic of AutoML to design more efficient and precise learning strategies.
In summary, the article models different AutoML tasks in a unified way from the perspective of BLO. It extensively analyzes the current state and future directions of AutoML, centered on the development of BLO algorithms. The viewpoints it presents contribute to advancing AutoML and to making artificial intelligence technology more intelligent and efficient.
More information:
Risheng Liu et al, Bilevel optimization for automated machine learning: a new perspective on framework and algorithm, National Science Review (2023). DOI: 10.1093/nsr/nwad292
Science China Press
Citation:
In-depth analysis: Automated machine learning from the perspective of bilevel optimization (2024, February 21)
retrieved 21 February 2024
from https://techxplore.com/news/2024-02-depth-analysis-automated-machine-perspective.html