Computer Science - Learning Mathematics - Optimization and Control Statistics - Machine Learning
In this work, we focus on the study of stochastic zeroth-order (ZO)
optimization which does not require first-order gradient information and uses
only function evaluations. The problem of ZO optimization has emerged in many
recent machine learning applications, where the gradient of the objective
function is either unavailable or difficult to compute. In such cases, we can
approximate the full gradients or stochastic gradients through function value
based gradient estimates. Here, we propose a novel hybrid gradient estimator
(HGE), which takes advantage of the query-efficiency of random gradient
estimates as well as the variance-reduction of coordinate-wise gradient
estimates. We show that with a graceful design in coordinate importance
sampling, the proposed HGE-based ZO optimization method is efficient both in
terms of iteration complexity as well as function query cost. We provide a
thorough theoretical analysis of the convergence of our proposed method for
non-convex, convex, and strongly-convex optimization. We show that the
convergence rate that we derive generalizes the results for some prominent
existing methods in the nonconvex case, and matches the optimal result in the
convex case. We also corroborate the theory with a real-world black-box attack
generation application to demonstrate the empirical advantage of our method
over state-of-the-art ZO optimization approaches.
Metrics
6 Record Views
Details
Title
Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework
Creators
Pranay Sharma
Kaidi Xu
Sijia Liu
Pin-Yu Chen
Xue Lin
Pramod K Varshney
Publication Details
arXiv.org
Resource Type
Preprint
Language
English
Academic Unit
Computer Science (Computing)
Other Identifier
991021871361504721
Research Home Page
Browse by research and academic units
Learn about the ETD submission process at Drexel
Learn about the Libraries’ research data management services