Publications
Publications by category, in reverse chronological order, generated by jekyll-scholar.
See also my Google Scholar and GitHub pages. You can also browse by choosing a category below.
2024
- Preprint: A Guide to Stochastic Optimisation for Large-Scale Inverse Problems. arXiv preprint arXiv:2406.06342, 2024
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs, while still ensuring significant progress towards the solution. Driven by the need to solve large-scale optimisation problems as efficiently as possible, the last decade has witnessed an explosion of research in this area. Leveraging the parallels between machine learning and inverse problems has allowed harnessing the power of this research wave for solving inverse problems. In this survey, we provide a comprehensive account of the state-of-the-art in stochastic optimisation from the viewpoint of inverse problems. We present algorithms with diverse modalities of problem randomisation and discuss the roles of variance reduction, acceleration, higher-order methods, and other algorithmic modifications, and compare theoretical results with practical behaviour. We focus on the potential and the challenges for stochastic optimisation that are unique to inverse imaging problems and are not commonly encountered in machine learning. We conclude the survey with illustrative examples from imaging problems to examine the advantages and disadvantages that this new generation of algorithms bring to the field of inverse problems.
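To fix ideas, here is a minimal sketch (my own toy illustration, not an algorithm from the survey) of mini-batch stochastic gradient descent applied to a least-squares inverse problem min_x (1/2m)||Ax − b||², where each iteration touches only a random subset of the measurements; the problem sizes and step size below are arbitrary placeholders.

```python
import numpy as np

def sgd_least_squares(A, b, step=0.05, batch=32, epochs=50, seed=0):
    """Toy mini-batch SGD for min_x 1/(2m) ||Ax - b||^2.

    Each update uses only `batch` randomly sampled rows of A, so the
    per-iteration cost is independent of the total number of measurements m.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(epochs):
        for _ in range(m // batch):
            idx = rng.choice(m, size=batch, replace=False)
            # unbiased estimate of the full gradient A^T (A x - b) / m
            grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch
            x -= step * grad
    return x

# Tiny synthetic example
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 100))
x_true = rng.standard_normal(100)
b = A @ x_true + 0.01 * rng.standard_normal(2000)
print(np.linalg.norm(sgd_least_squares(A, b) - x_true) / np.linalg.norm(x_true))
```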
- Journal: Federated Primal-Dual Fixed Point Algorithm. Ya-Nan Zhu, Jingwei Liang, and Xiaoqun Zhang. SIAM Journal on Mathematics of Data Science, accepted, 2024
Federated learning (FL) is a distributed learning paradigm that allows several clients to learn a global model without sharing their private data. In this paper, we generalize the primal-dual fixed point (PDFP) method to the federated learning setting and propose an algorithm called Federated PDFP (FPDFP) for solving composite optimization problems. In addition, a quantization scheme is applied to reduce the communication overhead during the learning process. An O(1/k) convergence rate (where k is the communication round) of the proposed FPDFP is provided. Numerical experiments, including graph-guided logistic regression and 3D computed tomography (CT) reconstruction, are conducted to evaluate the proposed algorithm.
- Journal: A Scalable Sphere-Constrained Magnitude-Sparse SAR Imaging. Ming Jiang, Jiaxuan Qu, Jinshan Ding, and Jingwei Liang. Journal of Nonlinear & Variational Analysis, 2024
The classical synthetic aperture radar (SAR) imaging techniques based on matched filters are limited by data bandwidth, resulting in limited imaging performance with side lobes and speckles present. To address the high-resolution SAR imaging problem, sparse reconstruction has been extensively investigated. However, the state-of-the-art sparse recovery methods seldom consider the complex-valued reflectivity of the scene and only recover an approximated real-valued scene instead. Furthermore, iterative schemes associated with the sparse recovery methods demand a high computational cost, which limits the practical applications of these methods. In this paper, we establish a sphere-constrained magnitude-sparsity SAR imaging model, aiming at enhancing the SAR imaging quality with high efficiency. We propose a non-convex non-smooth optimization method, which can be accelerated by stochastic average gradient acceleration to be scalable with large-scale problems. Numerical experiments are conducted with point-target and extended-target simulations. On the one hand, the point-target simulation showcases the superiority of our proposed method over the classical methods in terms of resolution. On the other hand, the extended-target simulation with random phases is considered to be in line with the practical scenario, and the results demonstrate that our method outperforms the classical SAR imaging methods and sparse recovery without phase prior in terms of PSNR. Meanwhile, owing to the stochastic acceleration, our method is faster than the existing sparse recovery methods by orders of magnitude.
2023
- Journal: A Distance Function Based Cascaded Neural Network for Accurate Polyps Segmentation and Classification. Yuanhong Jiang, Jingwei Liang, Weiqi Xiong, Qiwei Li, Yijue Zhang, Tao Chen, and 1 more author. Inverse Problems and Imaging, 2023
In clinical practice, accurately locating polyps and measuring their size is in general a difficult task. In this work, based on the position constraint between the primary organ and the polyps boundary, we propose a U-Net based cascaded neural network for the joint segmentation of the organ of interest and the polyps. The position constraint is enforced by incorporating a narrow-band distance function and a complementary Dice function into the loss function. Through a series of comparisons and an ablation study, we evaluate the performance of the proposed cascaded network architecture and the additional loss functions on an in-house dataset for gallbladder polyps segmentation and classification. Numerical results indicate that the proposed method achieves a significant improvement over conventional U-Net, U-Net++, etc. Lastly, pathological type classification based on the segmented polyps shows 30% higher accuracy compared to conventional ResNet-based results.
- Conference: Robust Graph Representation Learning for Local Corruption Recovery. Bingxin Zhou, Yuanhong Jiang, Yuguang Wang, Jingwei Liang, Junbin Gao, Shirui Pan, and 1 more author. In Proceedings of the ACM Web Conference, 2023
The performance of graph representation learning is affected by the quality of the graph input. While existing research usually pursues a globally smoothed graph embedding, we argue that rarely observed anomalies are also harmful to accurate prediction. This work establishes a graph learning scheme that automatically detects (locally) corrupted feature attributes and recovers a robust embedding for prediction tasks. The detection operation leverages a graph autoencoder, which does not make any assumptions about the distribution of the local corruptions. It pinpoints the positions of the anomalous node attributes in an unbiased mask matrix, on which robust estimations are recovered with a sparsity-promoting regularizer. The optimizer approaches a new embedding that is sparse in the framelet domain and conditionally close to the input observations. Extensive experiments validate that the proposed model can recover a robust graph representation from black-box poisoning and achieve excellent performance.
2022
- Journal: Partial Smoothness and Constant Rank. Adrian S. Lewis, Jingwei Liang, and Tonghua Tian. SIAM Journal on Optimization, 2022
In optimization, the notion of a partly smooth objective function is powerful for applications in algorithmic convergence and postoptimality analysis, and yet is complex to define. A shift in focus to the first-order optimality conditions reduces the concept to a simple constant-rank condition. In this view, partial smoothness extends to more general variational systems, encompassing in particular the saddlepoint operators underlying popular primal-dual splitting algorithms. For a broad class of semi-algebraic generalized equations, partial smoothness holds generically.
- Journal: Variable Screening for Sparse Online Regression. Jingwei Liang, and Clarice Poon. Journal of Computational and Graphical Statistics, 2022
Sparsity-promoting regularizers are widely used to impose low-complexity structure (e.g., the l1-norm for sparsity) on the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequences generated by iterative algorithms (such as proximal gradient descent) exhibit the “finite activity identification” property, that is, they identify the low-complexity structure of the solution in a finite number of iterations. However, many online algorithms (such as proximal stochastic gradient descent) do not have this property, owing to the vanishing step-size and non-vanishing variance. In this article, by combining with a screening rule, we show how to eliminate useless features from the iterates generated by online algorithms, and thereby enforce finite sparsity identification. One advantage of our scheme is that, when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited to improve computational efficiency. Numerically, significant acceleration can be obtained.
- Journal: On Biased Stochastic Gradient Estimation. Derek Driggs, Jingwei Liang, and Carola-Bibiane Schönlieb. Journal of Machine Learning Research, 2022
We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we present allows us to extend proximal support to biased algorithms, including SAG and SARAH, for the first time in the convex setting. We also use our framework to develop a new algorithm, Stochastic Average Recursive GradiEnt (SARGE), that achieves the oracle complexity lower-bound for non-convex, finite-sum objectives and requires strictly fewer calls to a stochastic gradient oracle per iteration than SVRG and SARAH. We support our theoretical results with numerical experiments that demonstrate the benefits of certain biased gradient estimators.
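As a point of reference for the kind of estimators analysed here, the sketch below implements a generic SARAH-style recursive (and hence biased) gradient estimator for a finite-sum least-squares problem; it is a simplified illustration under my own assumptions (placeholder step size, least-squares components), not the SARGE algorithm introduced in the paper.

```python
import numpy as np

def sarah_least_squares(A, b, step=0.05, inner=None, outer=20, seed=0):
    """SARAH-type recursive gradient method for min_x 1/(2n) ||Ax - b||^2.

    The inner-loop estimator v = grad_i(x) - grad_i(x_prev) + v is biased,
    but its variance vanishes as the iterates converge.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    inner = inner or n
    x = np.zeros(d)
    for _ in range(outer):
        v = A.T @ (A @ x - b) / n            # full gradient at the start of each outer loop
        x_prev, x = x, x - step * v
        for _ in range(inner):
            i = rng.integers(n)
            g_new = A[i] * (A[i] @ x - b[i])       # component gradient at x
            g_old = A[i] * (A[i] @ x_prev - b[i])  # component gradient at x_prev
            v = g_new - g_old + v                  # recursive (biased) estimator
            x_prev, x = x, x - step * v
    return x
```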
- Journal: TFPNP: Tuning-free Plug-and-Play Proximal Algorithms with Applications to Inverse Imaging Problems. Kaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Hua Huang, and Carola-Bibiane Schönlieb. Journal of Machine Learning Research, 2022
Plug-and-Play (PnP) is a non-convex optimization framework that combines proximal algorithms, for example, the alternating direction method of multipliers (ADMM), with advanced denoising priors. Over the past few years, great empirical success has been obtained by PnP algorithms, especially for the ones that integrate deep learning-based denoisers. However, a key problem of PnP approaches is the need for manual parameter tweaking which is essential to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a class of tuning-free PnP proximal algorithms that can determine parameters such as denoising strength, termination time, and other optimization-specific parameters automatically. A core part of our approach is a policy network for automated parameter search which can be effectively learned via a mixture of model-free and model-based deep reinforcement learning strategies. We demonstrate, through rigorous numerical and visual experiments, that the learned policy can customize parameters to different settings, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss several practical considerations of PnP denoisers, which together with our learned policy yield state-of-the-art results. This advanced performance is prevalent on both linear and nonlinear exemplar inverse imaging problems, and in particular shows promising results on compressed sensing MRI, sparse-view CT, single-photon imaging, and phase retrieval.
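A schematic of the PnP template underlying this line of work, written in PnP-ADMM form with a generic denoiser standing in for the proximal operator of the prior. The fixed `rho` and `sigma` below are exactly the kind of hand-set parameters that the learned policy replaces, and all names here are illustrative placeholders rather than the paper's implementation.

```python
import numpy as np

def pnp_admm(y, forward, adjoint, denoise, rho=1.0, sigma=0.05, iters=30):
    """Schematic Plug-and-Play ADMM for 1/2 ||forward(x) - y||^2 + prior(x).

    The prior's proximal step is replaced by `denoise(v, sigma)`, e.g. a CNN
    denoiser. The x-subproblem is solved approximately by gradient steps,
    assuming the forward operator has (roughly) unit norm.
    """
    x = adjoint(y)
    z, u = x.copy(), np.zeros_like(x)
    for _ in range(iters):
        for _ in range(5):  # inexact x-step on data fidelity + coupling term
            grad = adjoint(forward(x) - y) + rho * (x - z + u)
            x = x - grad / (1.0 + rho)
        z = denoise(x + u, sigma)   # z-step: denoiser in place of a prox
        u = u + x - z               # dual update
    return z

# Usage sketch with an identity forward operator and a trivial "denoiser"
img = np.random.rand(64, 64)
recon = pnp_admm(img, forward=lambda v: v, adjoint=lambda v: v,
                 denoise=lambda v, s: np.clip(v, 0.0, 1.0))
```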
- Journal: Improving “Fast Iterative Shrinkage-Thresholding Algorithm”: Faster, Smarter, and Greedier. Jingwei Liang, Tao Luo, and Carola-Bibiane Schönlieb. SIAM Journal on Scientific Computing, 2022
The “fast iterative shrinkage-thresholding algorithm,” a.k.a. FISTA, is one of the most well-known first-order optimization schemes in the literature, as it achieves the worst-case O(1/k^2) optimal convergence rate in terms of objective function value. However, despite this optimal theoretical convergence rate, in practice the (local) oscillatory behavior of FISTA often damps its efficiency. Over the past years, various efforts have been made in the literature to improve the practical performance of FISTA, such as monotone FISTA, restarting FISTA, and backtracking strategies. In this paper, we propose a simple yet effective modification to the original FISTA scheme which has two advantages: it allows us to (1) prove the convergence of the generated sequence and (2) design a so-called lazy-start strategy which can be up to an order of magnitude faster than the original scheme. Moreover, we propose novel adaptive and greedy strategies which further improve the practical performance. The advantages of the proposed schemes are tested on problems arising from inverse problems, machine learning, and signal/image processing.
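For context, the sketch below is a plain FISTA implementation for the Lasso with an optional gradient-based restart (one of the existing fixes mentioned above); it is a baseline template under my own choices of parameters, not the lazy-start or greedy schemes proposed in the paper.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, iters=500, restart=True):
    """FISTA for min_x 1/2 ||Ax - b||^2 + lam * ||x||_1, with optional restart."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(iters):
        x_new = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y_new = x_new + (t - 1.0) / t_new * (x_new - x)
        if restart and (y - x_new) @ (x_new - x) > 0:
            t_new, y_new = 1.0, x_new.copy()   # reset momentum when it points "uphill"
        x, y, t = x_new, y_new, t_new
    return x
```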
2021
- Journal: A Stochastic Alternating Direction Method of Multipliers for Non-smooth and Non-convex Optimization. Fengmiao Bian, Jingwei Liang, and Xiaoqun Zhang. Inverse Problems, 2021
The alternating direction method of multipliers (ADMM) is a popular first-order method owing to its simplicity and efficiency. However, similar to other proximal splitting methods, the performance of ADMM degrades significantly when the scale of the optimization problem becomes large. In this paper, we consider combining ADMM with a class of variance-reduced stochastic gradient estimators for solving large-scale non-convex and non-smooth optimization problems. Global convergence of the generated sequence is established under the additional assumption that the objective function satisfies the Kurdyka-Łojasiewicz property. Numerical experiments on graph-guided fused lasso and computed tomography are presented to demonstrate the performance of the proposed methods.
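For reference, the deterministic ADMM iteration for $\min_{x,z} f(x) + g(z)$ subject to $Ax + Bz = c$ (with penalty $\rho > 0$ and scaled dual variable $u$) reads as follows; roughly speaking, the stochastic variant studied in the paper replaces the expensive subproblem with steps built from variance-reduced gradient estimators.

```latex
\begin{aligned}
x^{k+1} &= \operatorname*{arg\,min}_{x}\; f(x) + \tfrac{\rho}{2}\,\|Ax + Bz^{k} - c + u^{k}\|^{2},\\
z^{k+1} &= \operatorname*{arg\,min}_{z}\; g(z) + \tfrac{\rho}{2}\,\|Ax^{k+1} + Bz - c + u^{k}\|^{2},\\
u^{k+1} &= u^{k} + Ax^{k+1} + Bz^{k+1} - c.
\end{aligned}
```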
- Journal: A Stochastic Proximal Alternating Minimization for Nonsmooth and Nonconvex Optimization. SIAM Journal on Imaging Sciences, 2021
In this work, we introduce a novel stochastic proximal alternating linearized minimization algorithm [J. Bolte, S. Sabach, and M. Teboulle, Math. Program., 146 (2014), pp. 459–494] for solving a class of nonsmooth and nonconvex optimization problems. Large-scale imaging problems are becoming increasingly prevalent due to the advances in data acquisition and computational capabilities. Motivated by the success of stochastic optimization methods, we propose a stochastic variant of proximal alternating linearized minimization. We provide global convergence guarantees, demonstrating that our proposed method with variance-reduced stochastic gradient estimators, such as SAGA [A. Defazio, F. Bach, and S. Lacoste-Julien, Advances in Neural Information Processing Systems, 2014, pp. 1646–1654] and SARAH [L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáĉ, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, 2017, pp. 2613–2621], achieves state-of-the-art oracle complexities. We also demonstrate the efficacy of our algorithm via several numerical examples including sparse nonnegative matrix factorization, sparse principal component analysis, and blind image-deconvolution.
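The deterministic PALM iteration of Bolte, Sabach, and Teboulle, which this work randomises, alternates proximal-linearised steps on $\min_{x,y} f(x) + g(y) + H(x,y)$:

```latex
\begin{aligned}
x^{k+1} &\in \operatorname{prox}_{\frac{1}{c_k} f}\Big(x^{k} - \tfrac{1}{c_k}\,\nabla_x H(x^{k}, y^{k})\Big),\\
y^{k+1} &\in \operatorname{prox}_{\frac{1}{d_k} g}\Big(y^{k} - \tfrac{1}{d_k}\,\nabla_y H(x^{k+1}, y^{k})\Big),
\end{aligned}
```

with $c_k, d_k$ tied to the blockwise Lipschitz constants of $\nabla H$; in the stochastic variant the partial gradients $\nabla_x H$ and $\nabla_y H$ are replaced by variance-reduced estimators such as SAGA or SARAH.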
- Preprint: An Adaptive Rank Continuation Algorithm for General Weighted Low-rank Recovery. Aritra Dutta, Jingwei Liang, and Xin Li. arXiv preprint arXiv:2101.00749, 2021
This paper is devoted to proposing a general weighted low-rank recovery model and designing a fast SVD-free computational scheme to solve it. First, our generic weighted low-rank recovery model unifies several existing approaches in the literature. Moreover, our model readily extends to the non-convex setting. Algorithm-wise, most first-order proximal algorithms in the literature for low-rank recoveries require computing singular value decomposition (SVD). As SVD does not scale appropriately with the dimension of the matrices, these algorithms become slower when the problem size becomes larger. By incorporating the variational formulation of the nuclear norm into the sub-problem of proximal gradient descent, we avoid computing SVD, which results in significant speed-up. Moreover, our algorithm preserves the rank identification property of nuclear norm [33] which further allows us to design a rank continuation scheme that asymptotically achieves the minimal iteration complexity. Numerical experiments on both toy examples and real-world problems, including structure from motion (SfM) and photometric stereo, background estimation, and matrix completion, demonstrate the superiority of our proposed algorithm.
2020
- Conference: Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems. Kaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Carola-Bibiane Schönlieb, and Hua Huang. In International Conference on Machine Learning (ICML), 2020
This paper received the Outstanding Paper Award at ICML 2020.
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP-based approaches is that they require manual parameter tweaking to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a tuning-free PnP proximal algorithm which can automatically determine the internal parameters, including the penalty parameter, the denoising strength, and the termination time. A key part of our approach is a policy network for the automatic search of parameters, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize parameters for different states, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This advantage is prevalent on both linear and nonlinear exemplary inverse imaging problems; in particular, we show promising results on compressed sensing MRI and phase retrieval.
- Preprint: Geometry of First-Order Methods and Adaptive Acceleration. Clarice Poon, and Jingwei Liang. arXiv preprint arXiv:2003.03910, 2020
First-order operator splitting methods are ubiquitous across many fields of science and engineering, such as inverse problems, signal/image processing, statistics, data science and machine learning, to name a few. In this paper, we study a geometric property of first-order methods when applied to non-smooth optimization problems. With the tool of "partial smoothness", we design a framework to analyze the trajectory of the fixed-point sequence generated by first-order methods and show that, locally, the fixed-point sequence settles onto a regular trajectory such as a straight line or a spiral. Based on this finding, we discuss the limitations of the widely used "inertial acceleration" technique and propose a trajectory-following adaptive acceleration algorithm. Global convergence is established for the proposed acceleration scheme based on the perturbation of the fixed-point iteration. Locally, we first build connections between the acceleration scheme and the well-studied "vector extrapolation techniques" in the field of numerical analysis, and then discuss local acceleration guarantees of the proposed scheme. Moreover, our result provides a geometric interpretation of these vector extrapolation techniques. Numerical experiments on various first-order methods are provided to demonstrate the advantage of the proposed adaptive acceleration scheme.
- Journal: The Fun is Finite: Douglas-Rachford and Sudoku Puzzle – Finite Termination and Local Linear Convergence. Robert Tovey, and Jingwei Liang. Journal of Applied and Numerical Optimization, 2020
In recent years, the Douglas-Rachford splitting method has been shown to be effective at solving many non-convex optimization problems. In this paper we present a local convergence analysis for non-convex feasibility problems and show that both finite termination and local linear convergence are obtained. For a generalization of the Sudoku puzzle, we prove that the local linear rate of convergence of Douglas-Rachford is exactly $\frac{\sqrt{5}}{5}$ and independent of puzzle size. For the $s$-queens problem we prove that Douglas-Rachford converges after a finite number of iterations. Numerical results on solving Sudoku puzzles and $s$-queens puzzles are provided to support our theoretical findings.
- Journal: Best Pair Formulation & Accelerated Scheme for Non-convex Principal Component Pursuit. Aritra Dutta, Filip Hanzely, Jingwei Liang, and Peter Richtárik. IEEE Transactions on Signal Processing, 2020
Given two disjoint sets, the best pair problem aims to find a point in one set and another point in the other set with minimal distance between them. In this paper, we formulate the classical robust principal component analysis (RPCA) problem as a best pair problem and design an accelerated proximal gradient algorithm to solve it. We prove that the method enjoys global convergence with a local linear rate. Our extensive numerical experiments on both real and synthetic data sets suggest that our proposed algorithm outperforms relevant baseline algorithms in the literature.
2019
- Conference: Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration. Clarice Poon, and Jingwei Liang. In Advances in Neural Information Processing Systems (NeurIPS), 2019
The alternating direction method of multipliers (ADMM) is one of the most widely used first-order optimisation methods in the literature owing to its simplicity, flexibility and efficiency. Over the years, numerous efforts have been made to improve the performance of the method, such as the inertial technique. By studying the geometric properties of ADMM, we discuss the limitations of current inertial accelerated ADMM and then present and analyze an adaptive acceleration scheme for the method. Numerical experiments on problems arising from image processing, statistics and machine learning demonstrate the advantages of the proposed acceleration approach.
- Journal: Convergence Rates of Forward–Douglas–Rachford Splitting Method. Cesare Molinari, Jingwei Liang, and Jalal Fadili. Journal of Optimization Theory and Applications, 2019
Over the past decades, operator splitting methods have become ubiquitous for non-smooth optimization owing to their simplicity and efficiency. In this paper, we consider the Forward-Douglas-Rachford splitting method and study both global and local convergence rates of this method. For the global rate, we establish a sublinear convergence rate in terms of a Bregman divergence suitably designed for the objective function. Moreover, when specializing to Forward-Backward splitting, we prove a stronger convergence rate result for the objective function value. Locally, based on the assumption that the non-smooth part of the optimization problem is partly smooth, we establish local linear convergence of the method. More precisely, we show that the sequence generated by Forward-Douglas-Rachford first (i) identifies a smooth manifold in a finite number of iterations and then (ii) enters a local linear convergence regime, which is characterized in terms of the structure of the underlying active smooth manifold. To exemplify the usefulness of the obtained results, we consider several concrete numerical experiments arising from applicative fields including, for instance, signal/image processing, inverse problems and machine learning.
2018
- Conference: Local Convergence Properties of SAGA/Prox-SVRG and Acceleration. Clarice Poon, Jingwei Liang, and Carola-Bibiane Schönlieb. In International Conference on Machine Learning (ICML), 2018
In this paper, we present a local convergence analysis for a class of stochastic optimisation methods: the proximal variance-reduced stochastic gradient methods, and mainly focus on SAGA (Defazio et al., 2014) and Prox-SVRG (Xiao & Zhang, 2014). Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of SAGA/Prox-SVRG: (i) the sequences generated by the methods are able to identify the smooth manifold in a finite number of iterations; (ii) then the sequence enters a local linear convergence regime. Furthermore, we discuss various possibilities for accelerating these algorithms, including adapting to better local parameters, and applying higher-order deterministic/stochastic optimisation methods which can achieve super-linear convergence. Several concrete examples arising from machine learning are considered to demonstrate the obtained result.
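To make the setting concrete, here is a proximal SAGA sketch for a sparse least-squares problem (a memory-efficient variant for linear models, written under my own assumptions about step size and problem form); the finite manifold identification discussed above corresponds to the support of the iterates freezing after finitely many passes.

```python
import numpy as np

def prox_saga_lasso(A, b, lam, step=None, passes=20, seed=0):
    """Proximal SAGA for min_x 1/n sum_i 1/2 (a_i^T x - b_i)^2 + lam * ||x||_1."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    step = step or 1.0 / (3.0 * np.max(np.sum(A * A, axis=1)))  # ~ 1/(3 L_max)
    x = np.zeros(d)
    stored = A @ x - b            # stored residual a_i^T x_old - b_i per component
    avg = A.T @ stored / n        # average of the stored component gradients
    for _ in range(passes * n):
        i = rng.integers(n)
        r_new = A[i] @ x - b[i]
        g = A[i] * (r_new - stored[i]) + avg      # SAGA gradient estimator (unbiased)
        avg += A[i] * (r_new - stored[i]) / n     # keep the running average in sync
        stored[i] = r_new
        v = x - step * g
        x = np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)  # proximal (soft-threshold) step
    return x
```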
- Journal: Local Linear Convergence Analysis of Primal–Dual Splitting Methods. Jingwei Liang, Jalal Fadili, and Gabriel Peyré. Optimization, 2018
In this paper, we study the local linear convergence properties of a versatile class of Primal-Dual splitting methods for minimizing composite non-smooth convex optimization problems. Under the assumption that the non-smooth components of the problem are partly smooth relative to smooth manifolds, we present a unified local convergence analysis framework for these methods. More precisely, in our framework, we first show that (i) the sequences generated by Primal-Dual splitting methods identify a pair of primal and dual smooth manifolds in a finite number of iterations, and then (ii) enter a local linear convergence regime, which is characterized based on the structure of the underlying active smooth manifolds. We also show how our results for Primal-Dual splitting can be specialized to cover existing ones on Forward-Backward splitting and Douglas-Rachford splitting/ADMM (alternating direction method of multipliers). Moreover, based on the obtained local convergence results, several practical acceleration techniques are discussed. To exemplify the usefulness of the obtained results, we consider several concrete numerical experiments arising from fields including signal/image processing, inverse problems and machine learning. The demonstration not only verifies the local linear convergence behaviour of Primal-Dual splitting methods, but also provides insights into how to accelerate them in practice.
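A representative member of this class, written here in the Chambolle-Pock form for $\min_x G(x) + F(Kx)$ (an illustration of the template, not the most general scheme covered by the paper):

```latex
\begin{aligned}
x^{k+1} &= \operatorname{prox}_{\tau G}\big(x^{k} - \tau K^{*} y^{k}\big),\\
y^{k+1} &= \operatorname{prox}_{\sigma F^{*}}\big(y^{k} + \sigma K (2x^{k+1} - x^{k})\big),
\end{aligned}
\qquad \text{with } \tau\sigma\|K\|^{2} \le 1.
```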
2017
- Journal: Activity Identification and Local Linear Convergence of Forward–Backward-type Methods. Jingwei Liang, Jalal Fadili, and Gabriel Peyré. SIAM Journal on Optimization, 2017
In this paper, we consider a class of Forward–Backward (FB) splitting methods that includes several variants (e.g., inertial schemes, FISTA) for minimizing the sum of two proper convex and lower semicontinuous functions, one of which has a Lipschitz continuous gradient, and the other is partly smooth relative to a smooth active manifold M. We propose a unified framework, under which we show that this class of FB-type algorithms (i) correctly identifies the active manifold in a finite number of iterations (finite activity identification), and (ii) then enters a local linear convergence regime, which we characterize precisely in terms of the structure of the underlying active manifold. We also establish and explain why FISTA (with convergent sequences) locally oscillates and can be locally slower than FB. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning. Indeed, the obtained results explain the typical behaviour that has been observed numerically for many problems in these fields such as the Lasso, the group Lasso, the fused Lasso and the nuclear norm minimization to name only a few.
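The finite activity identification described above is easy to observe numerically. The sketch below (my own toy experiment, not the paper's) runs plain Forward-Backward (ISTA) on a Lasso instance and records the iteration after which the support of the iterates stops changing within the run.

```python
import numpy as np

def ista_supports(A, b, lam, iters=1000):
    """Forward-Backward (ISTA) on the Lasso; returns the support at each iteration."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the data term
    x = np.zeros(A.shape[1])
    supports = []
    for _ in range(iters):
        g = x - A.T @ (A @ x - b) / L             # forward (gradient) step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # backward (prox) step
        supports.append(frozenset(np.flatnonzero(x)))
    return supports

rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200); x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(80)
supp = ista_supports(A, b, lam=0.1 * np.max(np.abs(A.T @ b)))
stable = next(k for k in range(len(supp)) if all(s == supp[k] for s in supp[k:]))
print("support stopped changing at iteration", stable)
```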
- Journal: Local Convergence Properties of Douglas–Rachford and Alternating Direction Method of Multipliers. Jingwei Liang, Jalal Fadili, and Gabriel Peyré. Journal of Optimization Theory and Applications, 2017
The Douglas-Rachford and alternating direction method of multipliers are two proximal splitting algorithms designed to minimize the sum of two proper lower semi-continuous convex functions whose proximity operators are easy to compute. The goal of this work is to understand the local linear convergence behaviour of Douglas-Rachford (resp. alternating direction method of multipliers) when the involved functions (resp. their Legendre-Fenchel conjugates) are moreover partly smooth. More precisely, when the two functions (resp. their conjugates) are partly smooth relative to their respective smooth submanifolds, we show that Douglas-Rachford (resp. alternating direction method of multipliers) (i) identifies these manifolds in finite time; (ii) enters a local linear convergence regime. When both functions are locally polyhedral, we show that the optimal convergence radius is given in terms of the cosine of the Friedrichs angle between the tangent spaces of the identified submanifolds. Under polyhedrality of both functions, we also provide conditions sufficient for finite convergence. The obtained results are illustrated by several concrete examples and supported by numerical experiments.
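For reference, the Douglas-Rachford iteration for $\min_x f(x) + g(x)$ with step size $\gamma > 0$ reads:

```latex
\begin{aligned}
x^{k+1} &= \operatorname{prox}_{\gamma f}\big(z^{k}\big),\\
z^{k+1} &= z^{k} + \operatorname{prox}_{\gamma g}\big(2x^{k+1} - z^{k}\big) - x^{k+1},
\end{aligned}
```

and ADMM can be viewed as Douglas-Rachford applied to a dual formulation, which is why the two methods are analysed together here.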
2016
- Thesis: Convergence Rates of First-Order Operator Splitting Methods. Jingwei Liang. Normandie Université; GREYC CNRS UMR 6072, 2016
This manuscript is concerned with convergence analysis of first-order operator splitting methods that are ubiquitous in modern non-smooth optimization. It consists of three main theoretical advances on this class of methods, namely global convergence rates, novel operator splitting schemes and local linear convergence. First, we propose global (sub-linear) and local (linear) convergence rates for the inexact Krasnosel’skii-Mann iteration built from non-expansive operators, and its application to a variety of monotone operator splitting schemes. Then we design two novel multi-step inertial operator splitting algorithms, both in the convex and non-convex settings, and prove their global convergence. Finally, building on the key concept of partial smoothness, we present a unified and sharp local linear convergence analysis for the class of first-order proximal splitting methods for optimization. We show that for all these algorithms, under appropriate non-degeneracy conditions, the iterates generated by each of these methods will (i) identify the involved partial smooth manifolds in finite time, and then (ii) enter a local linear convergence regime. The linear convergence rates are characterized precisely based on the structure of the optimization problems, that of the proximal splitting scheme, and the geometry of the identified active manifolds. Our theoretical findings are systematically illustrated on applications arising from inverse problems, signal/image processing and machine learning.
- Journal: Convergence Rates with Inexact Non-expansive Operators. Jingwei Liang, Jalal Fadili, and Gabriel Peyré. Mathematical Programming, 2016
In this paper, we present a convergence rate analysis for the inexact Krasnosel’skiĭ-Mann iteration built from non-expansive operators. The presented results include two main parts: we first establish the global pointwise and ergodic iteration-complexity bounds; then, under a metric sub-regularity assumption, we establish a local linear convergence for the distance of the iterates to the set of fixed points. The obtained results can be applied to analyze the convergence rate of various monotone operator splitting methods in the literature, including the Forward-Backward splitting, the Generalized Forward-Backward, the Douglas-Rachford splitting, alternating direction method of multipliers and Primal-Dual splitting methods. For these methods, we also develop easily verifiable termination criteria for finding an approximate solution, which can be seen as a generalization of the termination criterion for the classical gradient descent method. We finally develop a parallel analysis for the non-stationary Krasnosel’skiĭ-Mann iteration.
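The iteration in question is, up to the exact placement of the error term, of the form below, with a non-expansive operator $T$, relaxation parameters $\lambda_k \in (0,1)$ and errors $\varepsilon_k$; the splitting methods listed above are recovered by appropriate choices of $T$, for instance $T = \operatorname{prox}_{\gamma g} \circ (\mathrm{Id} - \gamma \nabla f)$ for Forward-Backward.

```latex
z^{k+1} = (1 - \lambda_k)\, z^{k} + \lambda_k \big(T(z^{k}) + \varepsilon_k\big).
```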
- Conference: A Multi-step Inertial Forward-Backward Splitting Method for Non-convex Optimization. Jingwei Liang, Jalal Fadili, and Gabriel Peyré. In Advances in Neural Information Processing Systems (NeurIPS), 2016
In this paper, we propose a multi-step inertial Forward–Backward splitting algorithm for minimizing the sum of two non-necessarily convex functions, one of which is proper lower semi-continuous while the other is differentiable with a Lipschitz continuous gradient. We first prove global convergence of the scheme with the help of the Kurdyka-Łojasiewicz property. Then, when the non-smooth part is also partly smooth relative to a smooth submanifold, we establish finite identification of the latter and provide sharp local linear convergence analysis. The proposed method is illustrated on a few problems arising from statistics and machine learning.
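Schematically, a two-step instance of the multi-step inertial scheme for $\min_x F(x) + R(x)$ looks as follows (the inertial coefficients, step size and the precise points at which the gradient is evaluated are governed by the conditions in the paper; this is only a sketch of the structure):

```latex
\begin{aligned}
y^{k} &= x^{k} + a_{1,k}\,(x^{k} - x^{k-1}) + a_{2,k}\,(x^{k-1} - x^{k-2}),\\
x^{k+1} &\in \operatorname{prox}_{\gamma R}\big(y^{k} - \gamma \nabla F(y^{k})\big).
\end{aligned}
```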
2015
- Conference: Activity Identification and Local Linear Convergence of Douglas–Rachford/ADMM under Partial Smoothness. In International Conference on Scale Space and Variational Methods in Computer Vision (SSVM), 2015
Convex optimization has become ubiquitous in most quantitative disciplines of science, including variational image processing. Proximal splitting algorithms are becoming popular to solve such structured convex optimization problems. Within this class of algorithms, Douglas-Rachford (DR) and ADMM are designed to minimize the sum of two proper lower semi-continuous convex functions whose proximity operators are easy to compute. The goal of this work is to understand the local convergence behaviour of DR (resp. ADMM) when the involved functions (resp. their Legendre-Fenchel conjugates) are moreover partly smooth. More precisely, when both of the two functions (resp. their conjugates) are partly smooth relative to their respective manifolds, we show that DR (resp. ADMM) identifies these manifolds in finite time. Moreover, when these manifolds are affine or linear, we prove that DR/ADMM is locally linearly convergent with a rate in terms of the cosine of the Friedrichs angle between the tangent spaces of the identified manifolds. This is illustrated by several concrete examples and supported by numerical experiments.
- Journal: Retinex by Higher Order Total Variation L1 Decomposition. Jingwei Liang, and Xiaoqun Zhang. Journal of Mathematical Imaging and Vision, 2015
In this paper, we propose a reflectance and illumination decomposition model for the Retinex problem via high-order total variation and L^1 decomposition. Based on the observation that illumination varies more smoothly than reflectance, we propose a convex variational model which can effectively decompose the gradient field of images into salient edges and a relatively smoother illumination field through first- and second-order total variation regularizations. The proposed model can be efficiently solved by a primal-dual splitting method. Numerical experiments on both grayscale and color images show the strength of the proposed model with applications to Retinex illusions, medical image bias field removal, and color image shadow correction.
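Schematically, models of this kind split a (log-domain) image $s$ into reflectance $r$ and illumination $l$ by penalising first-order variation of the reflectance and second-order variation of the illumination, for instance

```latex
\min_{r,\,l}\; \|\nabla r\|_{1} + \alpha\,\|\nabla^{2} l\|_{1} + \tfrac{\beta}{2}\,\|r + l - s\|_{2}^{2},
```

where the exact functional, variables and constraints used in the paper differ in the details, but the resulting problem is convex and is handled by a primal-dual splitting method as stated above.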
2014
- Conference: Local Linear Convergence of Forward–Backward under Partial Smoothness. Jingwei Liang, Jalal Fadili, and Gabriel Peyré. In Advances in Neural Information Processing Systems (NeurIPS), 2014
In this paper, we consider the Forward–Backward proximal splitting algorithm for minimizing the sum of two proper closed convex functions, one of which has a Lipschitz continuous gradient and the other is partly smooth relative to an active manifold M. We propose a generic framework in which we show that the Forward–Backward (i) correctly identifies the active manifold M in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation of the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and nuclear norm regularization, to name a few. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning.
- Journal: Seismic Data Restoration via Data-driven Tight Frame. Jingwei Liang, Jianwei Ma, and Xiaoqun Zhang. Geophysics, 2014
Restoration/interpolation of missing traces plays a crucial role in the seismic data processing pipeline. Efficient restoration methods have been proposed based on sparse signal representation in a transform domain such as Fourier, wavelet, curvelet, and shearlet transforms. Most existing methods are based on transforms with a fixed basis. We considered an adaptive sparse transform for the restoration of data with complex structures. In particular, we evaluated a data-driven tight-frame-based sparse regularization method for seismic data restoration. The main idea of the data-driven tight frame (TF) is to adaptively learn a set of framelet filters from the currently interpolated data, under which the data can be more sparsely represented; hence, sparsity-promoting l1-norm (SPL1) minimization methods can produce better restoration results by using the learned filters. A split inexact Uzawa algorithm, which can be viewed as a generalization of the alternating direction method of multipliers (ADMM), was applied to solve the presented SPL1 model. Numerical tests were performed on synthetic and real seismic data for the restoration of randomly missing traces over a regular data grid. Our experiments showed that the proposed method obtains state-of-the-art restoration results in comparison with the traditional Fourier-based projection onto convex sets, the tight-frame-based method, and the recent shearlet regularization ADMM method.
2013
- Journal: Wavelet Frame Based Color Image Demosaicing. Inverse Problems and Imaging, 2013
Color image demosaicing consists in recovering full-resolution color information from color-filter-array (CFA) samples with 66.7% of the data missing. Most existing color demosaicing methods [14, 25, 16, 2, 26] are based on interpolation from inter-channel correlation and local geometry, which are not robust to highly saturated color images with small geometric features. In this paper, we introduce wavelet frame based methods that use a sparse wavelet [8, 22, 9, 23] approximation of individual color channels and color differences to recover both geometric features and color information. The proposed models can be efficiently solved by the Bregmanized operator splitting algorithm [27]. Numerical simulations on two datasets, McM and Kodak PhotoCD, show that our method outperforms other existing methods in terms of PSNR and visual quality.
2024
- PreprintA Guide to Stochastic Optimisation for Large-Scale Inverse ProblemsarXiv preprint arXiv:2406.06342, 2024
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs, while still ensuring significant progress towards the solution. Driven by the need to solve large-scale optimisation problems as efficiently as possible, the last decade has witnessed an explosion of research in this area. Leveraging the parallels between machine learning and inverse problems has allowed harnessing the power of this research wave for solving inverse problems. In this survey, we provide a comprehensive account of the state-of-the-art in stochastic optimisation from the viewpoint of inverse problems. We present algorithms with diverse modalities of problem randomisation and discuss the roles of variance reduction, acceleration, higher-order methods, and other algorithmic modifications, and compare theoretical results with practical behaviour. We focus on the potential and the challenges for stochastic optimisation that are unique to inverse imaging problems and are not commonly encountered in machine learning. We conclude the survey with illustrative examples from imaging problems to examine the advantages and disadvantages that this new generation of algorithms bring to the field of inverse problems.
2022
- JournalPartial Smoothness and Constant RankAdrian S. Lewis, Jingwei Liang, and Tonghua TianSIAM Journal on Optimization, 2022
In optimization, the notion of a partly smooth objective function is powerful for applications in algorithmic convergence and postoptimality analysis, and yet is complex to define. A shift in focus to the first-order optimality conditions reduces the concept to a simple constant-rank condition. In this view, partial smoothness extends to more general variational systems, encompassing in particular the saddlepoint operators underlying popular primal-dual splitting algorithms. For a broad class of semi-algebraic generalized equations, partial smoothness holds generically.
- JournalOn Biased Stochastic Gradient EstimationDerek Driggs, Jingwei Liang, and Carola-Bibiane SchönliebJournal of Machine Learning Research, 2022
We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we present allows us to extend proximal support to biased algorithms, including SAG and SARAH, for the first time in the convex setting. We also use our framework to develop a new algorithm, Stochastic Average Recursive GradiEnt (SARGE), that achieves the oracle complexity lower-bound for non-convex, finite-sum objectives and requires strictly fewer calls to a stochastic gradient oracle per iteration than SVRG and SARAH. We support our theoretical results with numerical experiments that demonstrate the benefits of certain biased gradient estimators.
- JournalImproving “Fast Iterative Shrinkage-Thresholding Algorithm”: Faster, Smarter, and GreedierJingwei Liang, Tao Luo, and Carola-Bibiane SchönliebSIAM Journal on Scientific Computing, 2022
The “fast iterative shrinkage-thresholding algorithm,” a.k.a. FISTA, is one of the most well known first-order optimization scheme in the literature, as it achieves the worst-case O(1/k^2) optimal convergence rate for objective function value. However, despite such an optimal theoretical convergence rate, in practice the (local) oscillatory behavior of FISTA often damps its efficiency. Over the past years, various efforts have been made in the literature to improve the practical performance of FISTA, such as monotone FISTA, restarting FISTA, and backtracking strategies. In this paper, we propose a simple yet effective modification to the original FISTA scheme which has two advantages: It allows us to (1) prove the convergence of generated sequence and (2) design a so-called lazy-start strategy which can be up to an order faster than the original scheme. Moreover, we propose novel adaptive and greedy strategies which further improve the practical performance. The advantages of the proposed schemes are tested through problems arising from inverse problems, machine learning, and signal/image processing.
2021
- JournalA Stochastic Proximal Alternating Minimization for Nonsmooth and Nonconvex OptimizationSIAM Journal on Imaging Sciences, 2021
In this work, we introduce a novel stochastic proximal alternating linearized minimization algorithm [J. Bolte, S. Sabach, and M. Teboulle, Math. Program., 146 (2014), pp. 459–494] for solving a class of nonsmooth and nonconvex optimization problems. Large-scale imaging problems are becoming increasingly prevalent due to the advances in data acquisition and computational capabilities. Motivated by the success of stochastic optimization methods, we propose a stochastic variant of proximal alternating linearized minimization. We provide global convergence guarantees, demonstrating that our proposed method with variance-reduced stochastic gradient estimators, such as SAGA [A. Defazio, F. Bach, and S. Lacoste-Julien, Advances in Neural Information Processing Systems, 2014, pp. 1646–1654] and SARAH [L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáĉ, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, 2017, pp. 2613–2621], achieves state-of-the-art oracle complexities. We also demonstrate the efficacy of our algorithm via several numerical examples including sparse nonnegative matrix factorization, sparse principal component analysis, and blind image-deconvolution.
2020
- ConferenceTuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging ProblemsKaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Carola-Bibiane Schönlieb, and Hua HuangIn International Conference on Machine Learning (ICML), 2020
This paper received the Outstanding Paper Award from ICML2020.
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP based approaches is that they require manual parameter tweaking. It is necessary to obtain high-quality results across the high discrepancy in terms of imaging conditions and varying scene content. In this work, we present a tuning-free PnP proximal algorithm, which can automatically determine the internal parameters including the penalty parameter, the denoising strength and the terminal time. A key part of our approach is to develop a policy network for automatic search of parameters, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize different parameters for different states, and often more efficient and effective than existing handcrafted criteria. Moreover, we discuss the practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This is prevalent on both linear and nonlinear exemplary inverse imaging problems, and in particular, we show promising results on Compressed Sensing MRI and phase retrieval.
2019
- ConferenceTrajectory of Alternating Direction Method of Multipliers and Adaptive AccelerationClarice Poon, and Jingwei LiangIn Advances in Neural Information Processing Systems (NeurIPS), 2019
The alternating direction method of multipliers (ADMM) is one of the most widely used first-order optimisation methods in the literature owing to its simplicity, flexibility and efficiency. Over the years, numerous efforts are made to improve the performance of the method, such as the inertial technique. By studying the geometric properties of ADMM, we discuss the limitations of current inertial accelerated ADMM and then present and analyze an adaptive acceleration scheme for the method. Numerical experiments on problems arising from image processing, statistics and machine learning demonstrate the advantages of the proposed acceleration approach.
2018
- ConferenceLocal Convergence Properties of SAGA/Prox-SVRG and AccelerationClarice Poon, Jingwei Liang, and Carola-Bibiane SchönliebIn International Conference on Machine Learning (ICML), 2018
In this paper, we present a local convergence anal- ysis for a class of stochastic optimisation meth- ods: the proximal variance reduced stochastic gradient methods, and mainly focus on SAGA (Defazio et al., 2014) and Prox-SVRG (Xiao & Zhang, 2014). Under the assumption that the non-smooth component of the optimisation prob- lem is partly smooth relative to a smooth mani- fold, we present a unified framework for the local convergence analysis of SAGA/Prox-SVRG: (i) the sequences generated by the methods are able to identify the smooth manifold in a finite num- ber of iterations; (ii) then the sequence enters a local linear convergence regime. Furthermore, we discuss various possibilities for accelerating these algorithms, including adapting to better lo- cal parameters, and applying higher-order deter- ministic/stochastic optimisation methods which can achieve super-linear convergence. Several concrete examples arising from machine learning are considered to demonstrate the obtained result.
2017
- JournalActivity Identification and Local Linear Convergence of Forward–Backward-type MethodsJingwei Liang, Jalal Fadili, and Gabriel PeyréSIAM Journal on Optimization, 2017
In this paper, we consider a class of Forward–Backward (FB) splitting methods that includes several variants (e.g., inertial schemes, FISTA) for minimizing the sum of two proper convex and lower semicontinuous functions, one of which has a Lipschitz continuous gradient, and the other is partly smooth relative to a smooth active manifold M. We propose a unified framework, under which we show that this class of FB-type algorithms (i) correctly identifies the active manifold in a finite number of iterations (finite activity identification), and (ii) then enters a local linear convergence regime, which we characterize precisely in terms of the structure of the underlying active manifold. We also establish and explain why FISTA (with convergent sequences) locally oscillates and can be locally slower than FB. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning. Indeed, the obtained results explain the typical behaviour that has been observed numerically for many problems in these fields such as the Lasso, the group Lasso, the fused Lasso and the nuclear norm minimization to name only a few.
2016
- ThesisConvergence Rates of First-Order Operator Splitting MethodsJingwei LiangNormandie Université; GREYC CNRS UMR 6072, 2016
This manuscript is concerned with convergence analysis of first-order operator splitting methods that are ubiquitous in modern non-smooth optimization. It consists of three main theoretical advances on this class of methods, namely global convergence rates, novel operator splitting schemes and local linear convergence. First, we propose global (sub-linear) and local (linear) convergence rates for the inexact Krasnosel’skii-Mann iteration built from non-expansive operators, and its application to a variety of monotone operator splitting schemes. Then we design two novel multi-step inertial operator splitting algorithms, both in the convex and non-convex settings, and prove their global convergence. Finally, building on the key concept of partial smoothness, we present a unified and sharp local linear convergence analysis for the class of first-order proximal splitting methods for optimization. We show that for all these algorithms, under appropriate non-degeneracy conditions, the iterates generated by each of these methods will (i) identify the involved partial smooth manifolds in finite time, and then (ii) enter a local linear convergence regime. The linear convergence rates are characterized precisely based on the structure of the optimization problems, that of the proximal splitting scheme, and the geometry of the identified active manifolds. Our theoretical findings are systematically illustrated on applications arising from inverse problems, signal/image processing and machine learning.
- JournalConvergence Rates with Inexact Non-expansive OperatorsJingwei Liang, Jalal Fadili, and Gabriel PeyréMathematical Programming, 2016
In this paper, we present a convergence rate analysis for the inexact Krasnosel’skiĭ-Mann iteration built from non-expansive operators. The presented results include two main parts: we first establish the global pointwise and ergodic iteration-complexity bounds; then, under a metric sub-regularity assumption, we establish a local linear convergence for the distance of the iterates to the set of fixed points. The obtained results can be applied to analyze the convergence rate of various monotone operator splitting methods in the literature, including the Forward-Backward splitting, the Generalized Forward-Backward, the Douglas-Rachford splitting, alternating direction method of multipliers and Primal-Dual splitting methods. For these methods, we also develop easily verifiable termination criteria for finding an approximate solution, which can be seen as a generalization of the termination criterion for the classical gradient descent method. We finally develop a parallel analysis for the non-stationary Krasnosel’skiĭ-Mann iteration.
- ConferenceA Multi-step Inertial Forward-Backward Splitting Method for Non-convex OptimizationJingwei Liang, Jalal Fadili, and Gabriel PeyréIn Advances in Neural Information Processing Systems (NeurIPS), 2016
In this paper, we propose a multi-step inertial Forward–Backward splitting algorithm for minimizing the sum of two non-necessarily convex functions, one of which is proper lower semi-continuous while the other is differentiable with a Lipschitz continuous gradient. We first prove global convergence of the scheme with the help of the Kurdyka-Łojasiewicz property. Then, when the non-smooth part is also partly smooth relative to a smooth submanifold, we establish finite identification of the latter and provide sharp local linear convergence analysis. The proposed method is illustrated on a few problems arising from statistics and machine learning.
2015
- JournalRetinex by Higher Order Total Variation L1 DecompositionJingwei Liang, and Xiaoqun ZhangJournal of Mathematical Imaging and Vision, 2015
In this paper, we propose a reflectance and illumination decomposition model for the Retinex problem via high-order total variation and L^1 decomposition. Based on the observation that illumination varies smoother than reflectance, we propose a convex variational model which can effectively decompose the gradient field of images into salient edges and relatively smoother illumination field through the first- and second-order total variation regularizations. The proposed model can be efficiently solved by a primal-dual splitting method. Numerical experiments on both grayscale and color images show the strength of the proposed model with applications to Retinex illusions, medical image bias field removal, and color image shadow correction.
2024
- PreprintA Guide to Stochastic Optimisation for Large-Scale Inverse ProblemsarXiv preprint arXiv:2406.06342, 2024
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs, while still ensuring significant progress towards the solution. Driven by the need to solve large-scale optimisation problems as efficiently as possible, the last decade has witnessed an explosion of research in this area. Leveraging the parallels between machine learning and inverse problems has allowed harnessing the power of this research wave for solving inverse problems. In this survey, we provide a comprehensive account of the state-of-the-art in stochastic optimisation from the viewpoint of inverse problems. We present algorithms with diverse modalities of problem randomisation and discuss the roles of variance reduction, acceleration, higher-order methods, and other algorithmic modifications, and compare theoretical results with practical behaviour. We focus on the potential and the challenges for stochastic optimisation that are unique to inverse imaging problems and are not commonly encountered in machine learning. We conclude the survey with illustrative examples from imaging problems to examine the advantages and disadvantages that this new generation of algorithms bring to the field of inverse problems.
- JournalFederated Primal-Dual Fixed Point AlgorithmYa-Nan Zhu, Jingwei Liang, and Xiaoqun ZhangSIAM Journal on Mathematics of Data Science, accepted, 2024
Federated learning (FL) is a distributed learning paradigm that allows several clients to learn a global model without sharing their private data. In this paper, we generalize a primal dual fixed point (PDFP) \citePDFP method to federated learning setting and propose an algorithm called Federated PDFP (FPDFP) for solving composite optimization problems. In addition, a quantization scheme is applied to reduce the communication overhead during the learning process. An O(1/k) convergence rate (where is the communication round) of the proposed FPDFP is provided. Numerical experiments, including graph-guided logistic regression, 3D Computed Tomography (CT) reconstruction are considered to evaluate the proposed algorithm.
2022
- JournalPartial Smoothness and Constant RankAdrian S. Lewis, Jingwei Liang, and Tonghua TianSIAM Journal on Optimization, 2022
In optimization, the notion of a partly smooth objective function is powerful for applications in algorithmic convergence and postoptimality analysis, and yet is complex to define. A shift in focus to the first-order optimality conditions reduces the concept to a simple constant-rank condition. In this view, partial smoothness extends to more general variational systems, encompassing in particular the saddlepoint operators underlying popular primal-dual splitting algorithms. For a broad class of semi-algebraic generalized equations, partial smoothness holds generically.
- JournalVariable Screening for Sparse Online RegressionJingwei Liang, and Clarice PoonJournal of Computational and Graphical Statistics, 2022
Sparsity-promoting regularizers are widely used to impose low-complexity structure (e.g., the l1-norm for sparsity) on the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequences generated by iterative algorithms (such as proximal gradient descent) exhibit the “finite activity identification” property, that is, they can identify the low-complexity structure of the solution in a finite number of iterations. However, many online algorithms (such as proximal stochastic gradient descent) do not have this property owing to the vanishing step-size and nonvanishing variance. In this article, by combining with a screening rule, we show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite sparsity identification. One advantage of our scheme is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited to improve computational efficiency. Numerically, significant acceleration can be obtained.
- JournalOn Biased Stochastic Gradient EstimationDerek Driggs, Jingwei Liang, and Carola-Bibiane SchönliebJournal of Machine Learning Research, 2022
We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we present allows us to extend proximal support to biased algorithms, including SAG and SARAH, for the first time in the convex setting. We also use our framework to develop a new algorithm, Stochastic Average Recursive GradiEnt (SARGE), that achieves the oracle complexity lower-bound for non-convex, finite-sum objectives and requires strictly fewer calls to a stochastic gradient oracle per iteration than SVRG and SARAH. We support our theoretical results with numerical experiments that demonstrate the benefits of certain biased gradient estimators.
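For orientation, the two estimators compared against in the abstract can be written in a few lines; the snippets below are the textbook forms of SAGA (unbiased, with a gradient table) and SARAH (biased, recursive), stated for a finite sum of per-sample gradients and not taken from this paper.

```python
import numpy as np

def saga_estimator(grad_i, x, i, table):
    """SAGA (unbiased): uses a table of the most recently stored per-sample
    gradients; `table` is an (n, d) array that is updated in place."""
    g_new = grad_i(x, i)
    g = g_new - table[i] + table.mean(axis=0)
    table[i] = g_new
    return g

def sarah_estimator(grad_i, x, x_prev, i, v_prev):
    """SARAH (biased, recursive): v_k = grad_i(x_k) - grad_i(x_{k-1}) + v_{k-1},
    restarted periodically from a full gradient."""
    return grad_i(x, i) - grad_i(x_prev, i) + v_prev
```

Per iteration, SAGA costs one stochastic gradient evaluation and SARAH costs two; this bookkeeping is what the oracle-complexity comparisons above refer to.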
- JournalImproving “Fast Iterative Shrinkage-Thresholding Algorithm”: Faster, Smarter, and GreedierJingwei Liang, Tao Luo, and Carola-Bibiane SchönliebSIAM Journal on Scientific Computing, 2022
The “fast iterative shrinkage-thresholding algorithm,” a.k.a. FISTA, is one of the most well-known first-order optimization schemes in the literature, as it achieves the worst-case O(1/k^2) optimal convergence rate for the objective function value. However, despite such an optimal theoretical convergence rate, in practice the (local) oscillatory behavior of FISTA often damps its efficiency. Over the past years, various efforts have been made in the literature to improve the practical performance of FISTA, such as monotone FISTA, restarting FISTA, and backtracking strategies. In this paper, we propose a simple yet effective modification to the original FISTA scheme which has two advantages: it allows us to (1) prove the convergence of the generated sequence and (2) design a so-called lazy-start strategy which can be up to an order of magnitude faster than the original scheme. Moreover, we propose novel adaptive and greedy strategies which further improve the practical performance. The advantages of the proposed schemes are tested on problems arising from inverse problems, machine learning, and signal/image processing.
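As a point of reference, the unmodified FISTA iteration that the paper sets out to improve can be sketched as follows for min_x f(x) + g(x) with an L-Lipschitz gradient of f; the lazy-start, adaptive, and greedy variants proposed in the paper are not reproduced here.

```python
import numpy as np

def fista(grad_f, prox_g, x0, L, iters=500):
    """Standard FISTA: a forward-backward step at the extrapolated point y,
    followed by the Nesterov-type momentum update."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = prox_g(y - grad_f(y) / L, 1.0 / L)          # forward-backward step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0    # momentum parameter
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)     # extrapolation
        x, t = x_next, t_next
    return x

# example: Lasso with f = 0.5 * ||Ax - b||^2 and g = lam * ||x||_1
rng = np.random.default_rng(1)
A, b, lam = rng.standard_normal((50, 100)), rng.standard_normal(50), 0.1
L = np.linalg.norm(A, 2) ** 2
x_hat = fista(lambda x: A.T @ (A @ x - b),
              lambda z, tau: np.sign(z) * np.maximum(np.abs(z) - lam * tau, 0.0),
              np.zeros(100), L)
```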
2021
- JournalA Stochastic Alternating Direction Method of Multipliers for Non-smooth and Non-convex OptimizationFengmiao Bian, Jingwei Liang, and Xiaoqun ZhangInverse Problems, 2021
Alternating direction method of multipliers (ADMM) is a popular first-order method owing to its simplicity and efficiency. However, similar to other proximal splitting methods, the performance of ADMM degrades significantly when the scale of the optimization problem to solve becomes large. In this paper, we consider combining ADMM with a class of variance-reduced stochastic gradient estimators for solving large-scale non-convex and non-smooth optimization problems. Global convergence of the generated sequence is established under the additional assumption that the objective function satisfies the Kurdyka-Łojasiewicz property. Numerical experiments on graph-guided fused lasso and computed tomography are presented to demonstrate the performance of the proposed methods.
- JournalA Stochastic Proximal Alternating Minimization for Nonsmooth and Nonconvex OptimizationSIAM Journal on Imaging Sciences, 2021
In this work, we introduce a novel stochastic proximal alternating linearized minimization algorithm [J. Bolte, S. Sabach, and M. Teboulle, Math. Program., 146 (2014), pp. 459–494] for solving a class of nonsmooth and nonconvex optimization problems. Large-scale imaging problems are becoming increasingly prevalent due to the advances in data acquisition and computational capabilities. Motivated by the success of stochastic optimization methods, we propose a stochastic variant of proximal alternating linearized minimization. We provide global convergence guarantees, demonstrating that our proposed method with variance-reduced stochastic gradient estimators, such as SAGA [A. Defazio, F. Bach, and S. Lacoste-Julien, Advances in Neural Information Processing Systems, 2014, pp. 1646–1654] and SARAH [L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáĉ, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, 2017, pp. 2613–2621], achieves state-of-the-art oracle complexities. We also demonstrate the efficacy of our algorithm via several numerical examples including sparse nonnegative matrix factorization, sparse principal component analysis, and blind image-deconvolution.
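The deterministic scheme being randomised here, proximal alternating linearized minimization (PALM), alternates proximal-gradient steps over two blocks of variables. The sketch below shows that baseline with full gradients; the proposed stochastic variant would replace grad_x and grad_y with mini-batch variance-reduced estimators such as SAGA or SARAH.

```python
def palm(grad_x, grad_y, prox_f, prox_g, x, y, Lx, Ly, iters=200):
    """Proximal alternating linearized minimization for
    min_{x,y} f(x) + g(y) + H(x, y), where H has block-Lipschitz gradients.
    For simplicity the block Lipschitz constants Lx, Ly are kept fixed here."""
    for _ in range(iters):
        x = prox_f(x - grad_x(x, y) / Lx, 1.0 / Lx)   # prox-gradient step in x
        y = prox_g(y - grad_y(x, y) / Ly, 1.0 / Ly)   # prox-gradient step in y (uses the new x)
    return x, y
```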
- PreprintAn Adaptive Rank Continuation Algorithm for General Weighted Low-rank RecoveryAritra Dutta, Jingwei Liang, and Xin LiarXiv preprint arXiv:2101.00749, 2021
This paper is devoted to proposing a general weighted low-rank recovery model and designing a fast SVD-free computational scheme to solve it. First, our generic weighted low-rank recovery model unifies several existing approaches in the literature. Moreover, our model readily extends to the non-convex setting. Algorithm-wise, most first-order proximal algorithms in the literature for low-rank recoveries require computing singular value decomposition (SVD). As SVD does not scale appropriately with the dimension of the matrices, these algorithms become slower when the problem size becomes larger. By incorporating the variational formulation of the nuclear norm into the sub-problem of proximal gradient descent, we avoid computing SVD, which results in significant speed-up. Moreover, our algorithm preserves the rank identification property of nuclear norm [33] which further allows us to design a rank continuation scheme that asymptotically achieves the minimal iteration complexity. Numerical experiments on both toy examples and real-world problems, including structure from motion (SfM) and photometric stereo, background estimation, and matrix completion, demonstrate the superiority of our proposed algorithm.
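The “variational formulation of the nuclear norm” invoked above commonly refers to the factorization identity below, which is what allows the SVD to be bypassed once the factor width r is capped at the current rank estimate; this is standard background, not a result of the paper.

```latex
\|X\|_{*} \;=\; \min_{U \in \mathbb{R}^{m\times r},\; V \in \mathbb{R}^{n\times r} \,:\, X = UV^{\top}} \tfrac{1}{2}\bigl(\|U\|_F^{2} + \|V\|_F^{2}\bigr),
\qquad r \ge \operatorname{rank}(X).
```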
2020
- PreprintGeometry of First-Order Methods and Adaptive AccelerationClarice Poon, and Jingwei LiangarXiv preprint arXiv:2003.03910, 2020
First-order operator splitting methods are ubiquitous across many fields of science and engineering, such as inverse problems, signal/image processing, statistics, data science and machine learning, to name a few. In this paper, we study a geometric property of first-order methods when applied to non-smooth optimization problems. With the tool of "partial smoothness", we design a framework to analyze the trajectory of the fixed-point sequence generated by first-order methods and show that locally, the fixed-point sequence settles onto a regular trajectory such as a straight line or a spiral. Based on this finding, we discuss the limitation of the widely used "inertial acceleration" technique, and propose a trajectory-following adaptive acceleration algorithm. Global convergence is established for the proposed acceleration scheme based on the perturbation of fixed-point iterations. Locally, we first build connections between the acceleration scheme and the well-studied "vector extrapolation techniques" in the field of numerical analysis, and then discuss local acceleration guarantees of the proposed scheme. Moreover, our result provides a geometric interpretation of these vector extrapolation techniques. Numerical experiments on various first-order methods are provided to demonstrate the advantage of the proposed adaptive acceleration scheme.
- JournalThe Fun is Finite: Douglas-Rachford and Sudoku Puzzle–Finite Termination and Local Linear ConvergenceRobert Tovey, and Jingwei LiangJournal of Applied and Numerical Optimization, 2020
In recent years, the Douglas-Rachford splitting method has been shown to be effective at solving many non-convex optimization problems. In this paper we present a local convergence analysis for non-convex feasibility problems and show that both finite termination and local linear convergence are obtained. For a generalization of the Sudoku puzzle, we prove that the local linear rate of convergence of Douglas-Rachford is exactly √5/5 and independent of puzzle size. For the s-queens problem we prove that Douglas-Rachford converges after a finite number of iterations. Numerical results on solving Sudoku puzzles and s-queens puzzles are provided to support our theoretical findings.
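For concreteness, the plain Douglas-Rachford iteration for a two-set feasibility problem (find a point in A ∩ B) only requires the two projectors. The generic sketch below uses toy projections rather than the Sudoku or s-queens encodings analysed in the paper.

```python
import numpy as np

def douglas_rachford(proj_A, proj_B, z0, iters=1000, tol=1e-10):
    """Douglas-Rachford for feasibility: z <- z + P_B(2 P_A(z) - z) - P_A(z);
    the shadow sequence P_A(z) is the candidate solution."""
    z = z0.copy()
    for _ in range(iters):
        x = proj_A(z)
        y = proj_B(2.0 * x - z)
        z = z + y - x
        if np.linalg.norm(y - x) < tol:
            break
    return proj_A(z)

# toy usage: intersect the line x1 + x2 = 1 with the box [0, 0.8]^2
proj_line = lambda z: z - (z.sum() - 1.0) / 2.0
proj_box = lambda z: np.clip(z, 0.0, 0.8)
print(douglas_rachford(proj_line, proj_box, np.array([5.0, -3.0])))
```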
- JournalBest Pair Formulation & Accelerated Scheme for Non-convex Principal Component PursuitAritra Dutta, Filip Hanzely, Jingwei Liang, and Peter RichtárikIEEE Transactions on Signal Processing, 2020
Given two disjoint sets, the best pair problem aims to find a point in one set and another point in the other set with minimal distance between them. In this paper, we formulate the classical robust principal component analysis (RPCA) problem as a best pair problem and design an accelerated proximal gradient algorithm to solve it. We prove that the method enjoys global convergence with a local linear rate. Our extensive numerical experiments on both real and synthetic data sets suggest that our proposed algorithm outperforms relevant baseline algorithms in the literature.
2019
- ConferenceTrajectory of Alternating Direction Method of Multipliers and Adaptive AccelerationClarice Poon, and Jingwei LiangIn Advances in Neural Information Processing Systems (NeurIPS), 2019
The alternating direction method of multipliers (ADMM) is one of the most widely used first-order optimisation methods in the literature owing to its simplicity, flexibility and efficiency. Over the years, numerous efforts have been made to improve the performance of the method, such as the inertial technique. By studying the geometric properties of ADMM, we discuss the limitations of current inertial accelerated ADMM and then present and analyze an adaptive acceleration scheme for the method. Numerical experiments on problems arising from image processing, statistics and machine learning demonstrate the advantages of the proposed acceleration approach.
- JournalConvergence Rates of Forward–Douglas–Rachford Splitting MethodCesare Molinari, Jingwei Liang, and Jalal FadiliJournal of Optimization Theory and Applications, 2019
Over the past decades, operator splitting methods have become ubiquitous for non-smooth optimization owing to their simplicity and efficiency. In this paper, we consider the Forward-Douglas-Rachford splitting method and study both global and local convergence rates of this method. For the global rate, we establish a sublinear convergence rate in terms of a Bregman divergence suitably designed for the objective function. Moreover, when specializing to Forward-Backward splitting, we prove a stronger convergence rate result for the objective function value. Then locally, under the assumption that the non-smooth part of the optimization problem is partly smooth, we establish local linear convergence of the method. More precisely, we show that the sequence generated by Forward-Douglas-Rachford first (i) identifies a smooth manifold in a finite number of iterations and then (ii) enters a local linear convergence regime, which is characterized, for instance, in terms of the structure of the underlying active smooth manifold. To exemplify the usefulness of the obtained result, we consider several concrete numerical experiments arising from applicative fields including signal/image processing, inverse problems and machine learning.
2018
- ConferenceLocal Convergence Properties of SAGA/Prox-SVRG and AccelerationClarice Poon, Jingwei Liang, and Carola-Bibiane SchönliebIn International Conference on Machine Learning (ICML), 2018
In this paper, we present a local convergence analysis for a class of stochastic optimisation methods: the proximal variance-reduced stochastic gradient methods, and mainly focus on SAGA (Defazio et al., 2014) and Prox-SVRG (Xiao & Zhang, 2014). Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of SAGA/Prox-SVRG: (i) the sequences generated by the methods are able to identify the smooth manifold in a finite number of iterations; (ii) then the sequence enters a local linear convergence regime. Furthermore, we discuss various possibilities for accelerating these algorithms, including adapting to better local parameters, and applying higher-order deterministic/stochastic optimisation methods which can achieve super-linear convergence. Several concrete examples arising from machine learning are considered to demonstrate the obtained result.
- JournalLocal Linear Convergence Analysis of Primal–Dual Splitting MethodsJingwei Liang, Jalal Fadili, and Gabriel PeyréOptimization, 2018
In this paper, we study the local linear convergence properties of a versatile class of Primal-Dual splitting methods for minimizing composite non-smooth convex optimization problems. Under the assumption that the non-smooth components of the problem are partly smooth relative to smooth manifolds, we present a unified local convergence analysis framework for these methods. More precisely, in our framework, we first show that (i) the sequences generated by Primal-Dual splitting methods identify a pair of primal and dual smooth manifolds in a finite number of iterations, and then (ii) enter a local linear convergence regime, which is characterized based on the structure of the underlying active smooth manifolds. We also show how our results for Primal-Dual splitting can be specialized to cover existing ones on Forward-Backward splitting and Douglas-Rachford splitting/ADMM (alternating direction method of multipliers). Moreover, based on these local convergence results, several practical acceleration techniques are discussed. To exemplify the usefulness of the obtained results, we consider several concrete numerical experiments arising from fields including signal/image processing, inverse problems and machine learning. The demonstration not only verifies the local linear convergence behaviour of Primal-Dual splitting methods, but also illustrates how to accelerate them in practice.
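A representative member of this class of Primal-Dual splitting methods is the Chambolle-Pock-type iteration for min_x f(x) + g(Kx); its plain form is sketched below, with the proximal maps of f and the conjugate g* supplied by the user. The local behaviour analysed in the paper concerns sequences of exactly this kind.

```python
def chambolle_pock(K, prox_tau_f, prox_sigma_gstar, x0, y0, tau, sigma, theta=1.0, iters=500):
    """Plain primal-dual (Chambolle-Pock) iteration for min_x f(x) + g(Kx);
    convergence requires tau * sigma * ||K||^2 <= 1."""
    x, y, x_bar = x0.copy(), y0.copy(), x0.copy()
    for _ in range(iters):
        y = prox_sigma_gstar(y + sigma * (K @ x_bar), sigma)   # dual step on g*
        x_new = prox_tau_f(x - tau * (K.T @ y), tau)           # primal step on f
        x_bar = x_new + theta * (x_new - x)                    # extrapolation
        x = x_new
    return x, y
```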
2017
- JournalActivity Identification and Local Linear Convergence of Forward–Backward-type MethodsJingwei Liang, Jalal Fadili, and Gabriel PeyréSIAM Journal on Optimization, 2017
In this paper, we consider a class of Forward–Backward (FB) splitting methods that includes several variants (e.g., inertial schemes, FISTA) for minimizing the sum of two proper convex and lower semicontinuous functions, one of which has a Lipschitz continuous gradient, and the other is partly smooth relative to a smooth active manifold M. We propose a unified framework, under which we show that this class of FB-type algorithms (i) correctly identifies the active manifold in a finite number of iterations (finite activity identification), and (ii) then enters a local linear convergence regime, which we characterize precisely in terms of the structure of the underlying active manifold. We also establish and explain why FISTA (with convergent sequences) locally oscillates and can be locally slower than FB. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning. Indeed, the obtained results explain the typical behaviour that has been observed numerically for many problems in these fields such as the Lasso, the group Lasso, the fused Lasso and the nuclear norm minimization to name only a few.
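The finite activity identification described in (i) is easy to observe numerically: running the plain Forward-Backward (ISTA) iteration on a small synthetic Lasso problem, the support of the iterates stops changing after finitely many iterations. The toy script below is illustrative only and not taken from the paper; it prints the support size each time the support changes.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 80))
x_true = np.zeros(80)
x_true[:5] = rng.standard_normal(5)                 # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(40)
lam = 0.1 * np.max(np.abs(A.T @ b))                 # a common heuristic scaling
L = np.linalg.norm(A, 2) ** 2

x = np.zeros(80)
prev_support = None
for k in range(2000):
    # Forward-Backward step: gradient step on the smooth part, then soft-thresholding
    z = x - A.T @ (A @ x - b) / L
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    support = tuple(np.nonzero(x)[0])
    if support != prev_support:                     # the support changes only finitely often
        print(f"iteration {k}: support size {len(support)}")
        prev_support = support
```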
- JournalLocal Convergence Properties of Douglas–Rachford and Alternating Direction Method of MultipliersJingwei Liang, Jalal Fadili, and Gabriel PeyréJournal of Optimization Theory and Applications, 2017
The Douglas-Rachford and alternating direction method of multipliers are two proximal splitting algorithms designed to minimize the sum of two proper lower semi-continuous convex functions whose proximity operators are easy to compute. The goal of this work is to understand the local linear convergence behaviour of Douglas-Rachford (resp. alternating direction method of multipliers) when the involved functions (resp. their Legendre-Fenchel conjugates) are moreover partly smooth. More precisely, when the two functions (resp. their conjugates) are partly smooth relative to their respective smooth submanifolds, we show that Douglas-Rachford (resp. alternating direction method of multipliers) (i) identifies these manifolds in finite time; (ii) enters a local linear convergence regime. When both functions are locally polyhedral, we show that the optimal convergence radius is given in terms of the cosine of the Friedrichs angle between the tangent spaces of the identified submanifolds. Under polyhedrality of both functions, we also provide conditions sufficient for finite convergence. The obtained results are illustrated by several concrete examples and supported by numerical experiments.
2016
- ThesisConvergence Rates of First-Order Operator Splitting MethodsJingwei LiangNormandie Université; GREYC CNRS UMR 6072, 2016
This manuscript is concerned with convergence analysis of first-order operator splitting methods that are ubiquitous in modern non-smooth optimization. It consists of three main theoretical advances on this class of methods, namely global convergence rates, novel operator splitting schemes and local linear convergence. First, we propose global (sub-linear) and local (linear) convergence rates for the inexact Krasnosel’skii-Mann iteration built from non-expansive operators, and its application to a variety of monotone operator splitting schemes. Then we design two novel multi-step inertial operator splitting algorithms, both in the convex and non-convex settings, and prove their global convergence. Finally, building on the key concept of partial smoothness, we present a unified and sharp local linear convergence analysis for the class of first-order proximal splitting methods for optimization. We show that for all these algorithms, under appropriate non-degeneracy conditions, the iterates generated by each of these methods will (i) identify the involved partial smooth manifolds in finite time, and then (ii) enter a local linear convergence regime. The linear convergence rates are characterized precisely based on the structure of the optimization problems, that of the proximal splitting scheme, and the geometry of the identified active manifolds. Our theoretical findings are systematically illustrated on applications arising from inverse problems, signal/image processing and machine learning.
- JournalConvergence Rates with Inexact Non-expansive OperatorsJingwei Liang, Jalal Fadili, and Gabriel PeyréMathematical Programming, 2016
In this paper, we present a convergence rate analysis for the inexact Krasnosel’skiĭ-Mann iteration built from non-expansive operators. The presented results include two main parts: we first establish the global pointwise and ergodic iteration-complexity bounds; then, under a metric sub-regularity assumption, we establish a local linear convergence for the distance of the iterates to the set of fixed points. The obtained results can be applied to analyze the convergence rate of various monotone operator splitting methods in the literature, including the Forward-Backward splitting, the Generalized Forward-Backward, the Douglas-Rachford splitting, alternating direction method of multipliers and Primal-Dual splitting methods. For these methods, we also develop easily verifiable termination criteria for finding an approximate solution, which can be seen as a generalization of the termination criterion for the classical gradient descent method. We finally develop a parallel analysis for the non-stationary Krasnosel’skiĭ-Mann iteration.
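The iteration analysed here has a one-line update: given a non-expansive operator T and relaxation parameters in (0, 1), the exact Krasnosel'skiĭ-Mann scheme reads as in the sketch below (in the inexact setting studied in the paper, an error term is added to the evaluation of T). The rotation operator in the usage example is only a toy non-expansive map.

```python
import numpy as np

def krasnoselskii_mann(T, x0, lam=0.5, iters=1000):
    """Exact KM iteration x_{k+1} = x_k + lam * (T(x_k) - x_k) for a
    non-expansive operator T; its limit points are fixed points of T."""
    x = x0
    for _ in range(iters):
        x = x + lam * (T(x) - x)
    return x

# toy usage: a rotation is non-expansive with the origin as its only fixed point
theta = 0.5
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
print(krasnoselskii_mann(lambda v: R @ v, np.array([1.0, 2.0])))   # tends to [0, 0]
```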
- ConferenceA Multi-step Inertial Forward-Backward Splitting Method for Non-convex OptimizationJingwei Liang, Jalal Fadili, and Gabriel PeyréIn Advances in Neural Information Processing Systems (NeurIPS), 2016
In this paper, we propose a multi-step inertial Forward–Backward splitting algorithm for minimizing the sum of two not necessarily convex functions, one of which is proper lower semi-continuous while the other is differentiable with a Lipschitz continuous gradient. We first prove global convergence of the scheme with the help of the Kurdyka-Łojasiewicz property. Then, when the non-smooth part is also partly smooth relative to a smooth submanifold, we establish finite identification of the latter and provide sharp local linear convergence analysis. The proposed method is illustrated on a few problems arising from statistics and machine learning.
2015
- ConferenceActivity Identification and Local Linear Convergence of Douglas–Rachford/ADMM under Partial SmoothnessIn International Conference on Scale Space and Variational Methods in Computer Vision (SSVM), 2015
Convex optimization has become ubiquitous in most quantitative disciplines of science, including variational image processing. Proximal splitting algorithms are becoming popular to solve such structured convex optimization problems. Within this class of algorithms, Douglas-Rachford (DR) and ADMM are designed to minimize the sum of two proper lower semi-continuous convex functions whose proximity operators are easy to compute. The goal of this work is to understand the local convergence behaviour of DR (resp. ADMM) when the involved functions (resp. their Legendre-Fenchel conjugates) are moreover partly smooth. More precisely, when both of the two functions (resp. their conjugates) are partly smooth relative to their respective manifolds, we show that DR (resp. ADMM) identifies these manifolds in finite time. Moreover, when these manifolds are affine or linear, we prove that DR/ADMM is locally linearly convergent with a rate in terms of the cosine of the Friedrichs angle between the tangent spaces of the identified manifolds. This is illustrated by several concrete examples and supported by numerical experiments.
2014
- ConferenceLocal Linear Convergence of Forward–Backward under Partial SmoothnessJingwei Liang, Jalal Fadili, and Gabriel PeyréAdvances in Neural Information Processing Systems (NeurIPS), 2014
In this paper, we consider the Forward–Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relative to an active manifold M. We propose a generic framework in which we show that the Forward–Backward (i) correctly identifies the active manifold M in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning.
2024
- PreprintA Guide to Stochastic Optimisation for Large-Scale Inverse ProblemsarXiv preprint arXiv:2406.06342, 2024
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs, while still ensuring significant progress towards the solution. Driven by the need to solve large-scale optimisation problems as efficiently as possible, the last decade has witnessed an explosion of research in this area. Leveraging the parallels between machine learning and inverse problems has allowed harnessing the power of this research wave for solving inverse problems. In this survey, we provide a comprehensive account of the state-of-the-art in stochastic optimisation from the viewpoint of inverse problems. We present algorithms with diverse modalities of problem randomisation and discuss the roles of variance reduction, acceleration, higher-order methods, and other algorithmic modifications, and compare theoretical results with practical behaviour. We focus on the potential and the challenges for stochastic optimisation that are unique to inverse imaging problems and are not commonly encountered in machine learning. We conclude the survey with illustrative examples from imaging problems to examine the advantages and disadvantages that this new generation of algorithms bring to the field of inverse problems.
- JournalA Scalable Sphere-Constrained Magnitude-Sparse SAR ImagingMing Jiang, Jiaxuan Qu, Jinshan Ding, and Jingwei LiangJournal of Nonlinear & Variational Analysis, 2024
The classical synthetic aperture radar (SAR) imaging techniques based on matched filters are limited by data bandwidth, resulting in limited imaging performance with side lobes and speckles present. To address the high-resolution SAR imaging problem, sparse reconstruction has been extensively investigated. However, the state-of-the-art sparse recovery methods seldom consider the complex-valued reflectivity of the scene and only recover an approximated real-valued scene instead. Furthermore, iterative schemes associated with the sparse recovery methods demand a high computational cost, which limits the practical applications of these methods. In this paper, we establish a sphere-constrained magnitude-sparsity SAR imaging model, aiming at enhancing the SAR imaging quality with high efficiency. We propose a non-convex non-smooth optimization method, which can be accelerated by stochastic average gradient acceleration to be scalable with large-scale problems. Numerical experiments are conducted with point-target and extended-target simulations. On the one hand, the point-target simulation showcases the superiority of our proposed method over the classical methods in terms of resolution. On the other hand, the extended-target simulation with random phases is considered to be in line with the practical scenario, and the results demonstrate that our method outperforms the classical SAR imaging methods and sparse recovery without phase prior in terms of PSNR. Meanwhile, owing to the stochastic acceleration, our method is faster than the existing sparse recovery methods by orders of magnitude.
2022
- JournalTFPNP: Tuning-free Plug-and-Play Proximal Algorithms with Applications to Inverse Imaging ProblemsKaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Hua Huang, and Carola-Bibiane SchönliebJournal of Machine Learning Research, 2022
Plug-and-Play (PnP) is a non-convex optimization framework that combines proximal algorithms, for example, the alternating direction method of multipliers (ADMM), with advanced denoising priors. Over the past few years, great empirical success has been obtained by PnP algorithms, especially for the ones that integrate deep learning-based denoisers. However, a key problem of PnP approaches is the need for manual parameter tweaking which is essential to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a class of tuning-free PnP proximal algorithms that can determine parameters such as denoising strength, termination time, and other optimization-specific parameters automatically. A core part of our approach is a policy network for automated parameter search which can be effectively learned via a mixture of model-free and model-based deep reinforcement learning strategies. We demonstrate, through rigorous numerical and visual experiments, that the learned policy can customize parameters to different settings, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss several practical considerations of PnP denoisers, which together with our learned policy yield state-of-the-art results. This advanced performance is prevalent on both linear and nonlinear exemplar inverse imaging problems, and in particular shows promising results on compressed sensing MRI, sparse-view CT, single-photon imaging, and phase retrieval.
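Schematically, the PnP proximal algorithms tuned here replace one proximal step of, e.g., ADMM with a call to an off-the-shelf denoiser. The skeleton below uses a placeholder denoiser and data-fidelity prox with fixed parameters; the comments indicate where the per-iteration choices made by the learned policy (denoising strength, penalty parameter, termination time) would enter.

```python
import numpy as np

def pnp_admm(prox_data, denoiser, x0, sigma=0.05, rho=1.0, iters=30):
    """Plug-and-Play ADMM skeleton: the prior's proximal map is replaced by a
    denoiser call. `prox_data(v, rho)` handles the data-fidelity term and
    `denoiser(v, sigma)` can be any off-the-shelf (e.g. learned) denoiser."""
    x, v, u = x0.copy(), x0.copy(), np.zeros_like(x0)
    for _ in range(iters):
        x = prox_data(v - u, rho)       # data-fidelity proximal step
        v = denoiser(x + u, sigma)      # denoiser in place of the prior's prox
        u = u + x - v                   # scaled dual (multiplier) update
        # a tuning-free scheme adapts sigma, rho and the stopping iteration here;
        # in the paper this is done by the learned policy network
    return v
```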
- JournalImproving “Fast Iterative Shrinkage-Thresholding Algorithm”: Faster, Smarter, and GreedierJingwei Liang, Tao Luo, and Carola-Bibiane SchönliebSIAM Journal on Scientific Computing, 2022
The “fast iterative shrinkage-thresholding algorithm,” a.k.a. FISTA, is one of the most well-known first-order optimization schemes in the literature, as it achieves the worst-case O(1/k^2) optimal convergence rate for the objective function value. However, despite such an optimal theoretical convergence rate, in practice the (local) oscillatory behavior of FISTA often damps its efficiency. Over the past years, various efforts have been made in the literature to improve the practical performance of FISTA, such as monotone FISTA, restarting FISTA, and backtracking strategies. In this paper, we propose a simple yet effective modification to the original FISTA scheme which has two advantages: it allows us to (1) prove the convergence of the generated sequence and (2) design a so-called lazy-start strategy which can be up to an order of magnitude faster than the original scheme. Moreover, we propose novel adaptive and greedy strategies which further improve the practical performance. The advantages of the proposed schemes are tested on problems arising from inverse problems, machine learning, and signal/image processing.
2021
- JournalA Stochastic Alternating Direction Method of Multipliers for Non-smooth and Non-convex OptimizationFengmiao Bian, Jingwei Liang, and Xiaoqun ZhangInverse Problems, 2021
Alternating direction method of multipliers (ADMM) is a popular first-order method owing to its simplicity and efficiency. However, similar to other proximal splitting methods, the performance of ADMM degrades significantly when the scale of the optimization problem to solve becomes large. In this paper, we consider combining ADMM with a class of variance-reduced stochastic gradient estimators for solving large-scale non-convex and non-smooth optimization problems. Global convergence of the generated sequence is established under the additional assumption that the objective function satisfies the Kurdyka-Łojasiewicz property. Numerical experiments on graph-guided fused lasso and computed tomography are presented to demonstrate the performance of the proposed methods.
- JournalA Stochastic Proximal Alternating Minimization for Nonsmooth and Nonconvex OptimizationSIAM Journal on Imaging Sciences, 2021
In this work, we introduce a novel stochastic proximal alternating linearized minimization algorithm [J. Bolte, S. Sabach, and M. Teboulle, Math. Program., 146 (2014), pp. 459–494] for solving a class of nonsmooth and nonconvex optimization problems. Large-scale imaging problems are becoming increasingly prevalent due to the advances in data acquisition and computational capabilities. Motivated by the success of stochastic optimization methods, we propose a stochastic variant of proximal alternating linearized minimization. We provide global convergence guarantees, demonstrating that our proposed method with variance-reduced stochastic gradient estimators, such as SAGA [A. Defazio, F. Bach, and S. Lacoste-Julien, Advances in Neural Information Processing Systems, 2014, pp. 1646–1654] and SARAH [L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáĉ, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, 2017, pp. 2613–2621], achieves state-of-the-art oracle complexities. We also demonstrate the efficacy of our algorithm via several numerical examples including sparse nonnegative matrix factorization, sparse principal component analysis, and blind image-deconvolution.
2020
- ConferenceTuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging ProblemsKaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Carola-Bibiane Schönlieb, and Hua HuangIn International Conference on Machine Learning (ICML), 2020
This paper received the Outstanding Paper Award from ICML 2020.
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP-based approaches is that they require manual parameter tweaking, which is necessary to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a tuning-free PnP proximal algorithm, which can automatically determine the internal parameters including the penalty parameter, the denoising strength and the termination time. A key part of our approach is to develop a policy network for automatic parameter search, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize different parameters for different states, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss the practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This is prevalent on both linear and nonlinear exemplary inverse imaging problems, and in particular, we show promising results on compressed sensing MRI and phase retrieval.
2019
- ConferenceTrajectory of Alternating Direction Method of Multipliers and Adaptive AccelerationClarice Poon, and Jingwei LiangIn Advances in Neural Information Processing Systems (NeurIPS), 2019
The alternating direction method of multipliers (ADMM) is one of the most widely used first-order optimisation methods in the literature owing to its simplicity, flexibility and efficiency. Over the years, numerous efforts have been made to improve the performance of the method, such as the inertial technique. By studying the geometric properties of ADMM, we discuss the limitations of current inertial accelerated ADMM and then present and analyze an adaptive acceleration scheme for the method. Numerical experiments on problems arising from image processing, statistics and machine learning demonstrate the advantages of the proposed acceleration approach.
2015
- JournalRetinex by Higher Order Total Variation L1 DecompositionJingwei Liang, and Xiaoqun ZhangJournal of Mathematical Imaging and Vision, 2015
In this paper, we propose a reflectance and illumination decomposition model for the Retinex problem via high-order total variation and L^1 decomposition. Based on the observation that illumination varies more smoothly than reflectance, we propose a convex variational model which can effectively decompose the gradient field of images into salient edges and a relatively smoother illumination field through the first- and second-order total variation regularizations. The proposed model can be efficiently solved by a primal-dual splitting method. Numerical experiments on both grayscale and color images show the strength of the proposed model with applications to Retinex illusions, medical image bias field removal, and color image shadow correction.
2014
- JournalSeismic Data Restoration via Data-driven Tight FrameJingwei Liang, Jianwei Ma, and Xiaoqun ZhangGeophysics, 2014
Restoration/interpolation of missing traces plays a crucial role in the seismic data processing pipeline. Efficient restoration methods have been proposed based on sparse signal representation in a transform domain such as Fourier, wavelet, curvelet, and shearlet transforms. Most existing methods are based on transforms with a fixed basis. We considered an adaptive sparse transform for restoration of data with complex structures. In particular, we evaluated a data-driven tight-frame-based sparse regularization method for seismic data restoration. The main idea of the data-driven tight frame (TF) is to adaptively learn a set of framelet filters from the currently interpolated data, under which the data can be more sparsely represented; hence, the sparsity-promoting l1-norm (SPL1) minimization methods can produce better restoration results by using the learned filters. A split inexact Uzawa algorithm, which can be viewed as a generalization of the alternating direction method of multipliers (ADMM), was applied to solve the presented SPL1 model. Numerical tests were performed on synthetic and real seismic data for restoration of randomly missing traces over a regular data grid. Our experiments showed that our proposed method obtains state-of-the-art restoration results in comparison with the traditional Fourier-based projection onto convex sets, the tight-frame-based method, and the recent shearlet regularization ADMM method.
2013
- JournalWavelet Frame Based Color Image DemosaicingInverse Problems and Imaging, 2013
Color image demosaicing consists in recovering full resolution color information from color-filter-array (CFA) samples with 66.7% of the data missing. Most of the existing color demosaicing methods [14, 25, 16, 2, 26] are based on interpolation from inter-channel correlation and local geometry, which are not robust to highly saturated color images with small geometric features. In this paper, we introduce wavelet frame based methods that use a sparse wavelet [8, 22, 9, 23] approximation of individual color channels and color differences to recover both geometric features and color information. The proposed models can be efficiently solved by the Bregmanized operator splitting algorithm [27]. Numerical simulations on two datasets, McM and Kodak PhotoCD, show that our method outperforms other existing methods in terms of PSNR and visual quality.
2024
- JournalFederated Primal-Dual Fixed Point AlgorithmYa-Nan Zhu, Jingwei Liang, and Xiaoqun ZhangSIAM Journal on Mathematics of Data Science, accepted, 2024
Federated learning (FL) is a distributed learning paradigm that allows several clients to learn a global model without sharing their private data. In this paper, we generalize the primal-dual fixed point (PDFP) method to the federated learning setting and propose an algorithm called Federated PDFP (FPDFP) for solving composite optimization problems. In addition, a quantization scheme is applied to reduce the communication overhead during the learning process. An O(1/k) convergence rate (where k is the communication round) of the proposed FPDFP is provided. Numerical experiments, including graph-guided logistic regression and 3D computed tomography (CT) reconstruction, are conducted to evaluate the proposed algorithm.
2023
- JournalA Distance Function Based Cascaded Neural Network for Accurate Polyps Segmentation and ClassificationYuanhong Jiang, Jingwei Liang, Weiqi Xiong, Qiwei Li, Yijue Zhang, Tao Chen, and 1 more authorInverse Problems and Imaging, 2023
In clinical practice, accurately locating and measuring the size of polyps is in general a difficult task. In this work, based on the position constraint between the primary organ and the polyps boundary, we propose a U-Net based cascaded neural network for the joint segmentation of the organ of interest and the polyps. The position constraint is enforced by considering a narrow-band distance function and a complementary Dice function in the loss function. Through a series of comparisons and an ablation study, we evaluate the performance of our proposed cascaded network architecture and the additional loss functions using an in-house dataset for gallbladder polyps segmentation and classification. Numerical results indicate that the proposed method achieves a significant improvement over conventional U-Net, U-Net++, etc. Lastly, the pathological type classification based on the segmented polyps shows 30% higher accuracy compared to conventional ResNet-based results.
- ConferenceRobust Graph Representation Learning for Local Corruption RecoveryBingxin Zhou, Yuanhong Jiang, Yuguang Wang, Jingwei Liang, Junbin Gao, Shirui Pan, and 1 more authorIn Proceedings of the ACM Web Conference, 2023
The performance of graph representation learning is affected by the quality of the graph input. While existing research usually pursues a globally smoothed graph embedding, we believe that rarely observed anomalies are also harmful to accurate prediction. This work establishes a graph learning scheme that automatically detects (locally) corrupted feature attributes and recovers a robust embedding for prediction tasks. The detection operation leverages a graph autoencoder, which does not make any assumptions about the distribution of the local corruptions. It pinpoints the positions of the anomalous node attributes in an unbiased mask matrix, from which robust estimates are recovered with a sparsity-promoting regularizer. The optimizer approaches a new embedding that is sparse in the framelet domain and conditionally close to the input observations. Extensive experiments validate that our proposed model can recover a robust graph representation from black-box poisoning and achieve excellent performance.
2022
- JournalVariable Screening for Sparse Online RegressionJingwei Liang, and Clarice PoonJournal of Computational and Graphical Statistics, 2022
Sparsity-promoting regularizers are widely used to impose low-complexity structure (e.g., the l1-norm for sparsity) on the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequences generated by iterative algorithms (such as proximal gradient descent) exhibit the “finite activity identification” property, that is, they can identify the low-complexity structure of the solution in a finite number of iterations. However, many online algorithms (such as proximal stochastic gradient descent) do not have this property owing to the vanishing step-size and nonvanishing variance. In this article, by combining with a screening rule, we show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite sparsity identification. One advantage of our scheme is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited to improve computational efficiency. Numerically, significant acceleration can be obtained.
- JournalOn Biased Stochastic Gradient EstimationDerek Driggs, Jingwei Liang, and Carola-Bibiane SchönliebJournal of Machine Learning Research, 2022
We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we present allows us to extend proximal support to biased algorithms, including SAG and SARAH, for the first time in the convex setting. We also use our framework to develop a new algorithm, Stochastic Average Recursive GradiEnt (SARGE), that achieves the oracle complexity lower-bound for non-convex, finite-sum objectives and requires strictly fewer calls to a stochastic gradient oracle per iteration than SVRG and SARAH. We support our theoretical results with numerical experiments that demonstrate the benefits of certain biased gradient estimators.
- JournalTFPNP: Tuning-free Plug-and-Play Proximal Algorithms with Applications to Inverse Imaging ProblemsKaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Hua Huang, and Carola-Bibiane SchönliebJournal of Machine Learning Research, 2022
Plug-and-Play (PnP) is a non-convex optimization framework that combines proximal algorithms, for example, the alternating direction method of multipliers (ADMM), with advanced denoising priors. Over the past few years, great empirical success has been obtained by PnP algorithms, especially for the ones that integrate deep learning-based denoisers. However, a key problem of PnP approaches is the need for manual parameter tweaking which is essential to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a class of tuning-free PnP proximal algorithms that can determine parameters such as denoising strength, termination time, and other optimization-specific parameters automatically. A core part of our approach is a policy network for automated parameter search which can be effectively learned via a mixture of model-free and model-based deep reinforcement learning strategies. We demonstrate, through rigorous numerical and visual experiments, that the learned policy can customize parameters to different settings, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss several practical considerations of PnP denoisers, which together with our learned policy yield state-of-the-art results. This advanced performance is prevalent on both linear and nonlinear exemplar inverse imaging problems, and in particular shows promising results on compressed sensing MRI, sparse-view CT, single-photon imaging, and phase retrieval.
2020
- ConferenceTuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging ProblemsKaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Carola-Bibiane Schönlieb, and Hua HuangIn International Conference on Machine Learning (ICML), 2020
This paper received the Outstanding Paper Award from ICML 2020.
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP-based approaches is that they require manual parameter tweaking, which is necessary to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a tuning-free PnP proximal algorithm, which can automatically determine the internal parameters including the penalty parameter, the denoising strength and the termination time. A key part of our approach is to develop a policy network for automatic parameter search, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize different parameters for different states, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss the practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This is prevalent on both linear and nonlinear exemplary inverse imaging problems, and in particular, we show promising results on compressed sensing MRI and phase retrieval.
2018
- ConferenceLocal Convergence Properties of SAGA/Prox-SVRG and AccelerationClarice Poon, Jingwei Liang, and Carola-Bibiane SchönliebIn International Conference on Machine Learning (ICML), 2018
In this paper, we present a local convergence analysis for a class of stochastic optimisation methods: the proximal variance-reduced stochastic gradient methods, and mainly focus on SAGA (Defazio et al., 2014) and Prox-SVRG (Xiao & Zhang, 2014). Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of SAGA/Prox-SVRG: (i) the sequences generated by the methods are able to identify the smooth manifold in a finite number of iterations; (ii) then the sequence enters a local linear convergence regime. Furthermore, we discuss various possibilities for accelerating these algorithms, including adapting to better local parameters, and applying higher-order deterministic/stochastic optimisation methods which can achieve super-linear convergence. Several concrete examples arising from machine learning are considered to demonstrate the obtained result.