A proximal-projection bundle method for convex nonsmooth optimization with on-demand accuracy oracles

For some practical problems, the exact computation of function and (sub)gradient values may be difficult. In this paper, we propose a proximal-projection bundle method for convex nonsmooth optimization problems with on-demand accuracy oracles. Our method essentially generalizes the work of Kiwiel (SIAM J Optim, 17:1015-1034, 2006) from exact and inexact oracles to a family of oracles, including exact, inexact, partially inexact, asymptotically exact and partially asymptotically exact oracles. At each iteration, a proximal subproblem is solved to generate a linear model of the objective function, and then a projection subproblem is solved to obtain a trial point. Finally, global convergence of the algorithm is established under the different types of inexactness.


Introduction
In this paper, we focus on solving problems of the form

min f(u)  s.t.  u ∈ C,   (1.1)

where C ⊆ R^n is a nonempty closed convex set and f : R^n → R is a convex, but not necessarily differentiable, function.
It is well known that bundle methods are among the most efficient methods for solving nonsmooth optimization problems. For the case where an exact oracle is available, i.e., there is a subroutine that can (in theory) exactly evaluate the function value f(u) and one arbitrary subgradient g(u) ∈ ∂f(u) at any point u, bundle methods are well studied [1,2]. However, for some practical problems, such as minimax problems, generalized assignment problems and two-stage stochastic programming problems (see, e.g., [3,4]), the exact computation of function values and subgradients is difficult. In order to solve such problems, a class of bundle methods based on inexact oracle information has been proposed [4-14]. In particular, Kiwiel [4] proposed a bundle method with a partially inexact oracle, which becomes exact when an objective target level for a descent step is reached, and applied it to solve generalized assignment problems. Oliveira et al. [5] proposed inexact bundle methods for solving two-stage stochastic programming problems. Fábián [6] presented an asymptotically exact level bundle method that extends the exact version in [15]. Kiwiel [7] proposed a proximal-projection bundle method for the constrained problem (1.1), in which a fixed error tolerance of inexactness is used. At each iteration of the algorithm in [7], two subproblems are solved: a proximal subproblem and a projection subproblem.

Preliminaries
The oracle with on-demand accuracy proposed by Oliveira and Sagastizábal [3] is described as follows. For given u ∈ C, a descent target γ_u and an error bound ε_u ≥ 0, the approximate function value f_u (≈ f(u)) and the approximate subgradient g_u (≈ g(u)) satisfy the following condition:

f_u = f(u) − η(γ_u) with η(γ_u) ≥ 0,  f(·) ≥ f_u + ⟨g_u, · − u⟩,  and whenever f_u ≤ γ_u (descent target reached), the relation η(γ_u) ≤ ε_u holds.   (2.1)

From the above relations, when the descent target is reached, the exact function value satisfies f_u ≤ f(u) ≤ f_u + ε_u. By suitably choosing the parameters γ_u and ε_u, the oracle (2.1) covers various oracles:
• Exact oracle: set γ_u = +∞ and ε_u = 0.
As in [3], an additional assumption is needed: there exists a positive constant η̄ such that η(γ_u) ≤ η̄ for all u ∈ C. We now provide one example, coming from stochastic optimization, for which the oracle (2.1) is well suited.
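To make the mechanism of (2.1) concrete, the following sketch simulates an on-demand accuracy oracle for a piecewise-linear convex function f(u) = max_i(⟨A_i, u⟩ + b_i). It is purely illustrative and is not the oracle implementation of [3]: the function names and the `coarse_err` parameter (the error of a cheap first evaluation) are our own assumptions. The point it shows is that a coarse answer is kept unless the returned value reaches the descent target, in which case the evaluation is refined until the error is at most ε_u.

```python
import numpy as np

def exact_oracle(u, A, b):
    """Exact oracle for f(u) = max_i (<A[i], u> + b[i]): value and one subgradient."""
    vals = A @ u + b
    i = int(np.argmax(vals))
    return vals[i], A[i]

def on_demand_oracle(u, gamma_u, eps_u, A, b, coarse_err=0.1):
    """Illustrative oracle in the spirit of (2.1): the returned value may
    under-estimate f(u) by some eta >= 0, but whenever the returned value
    reaches the descent target gamma_u, the error is reduced to at most eps_u."""
    f_exact, g = exact_oracle(u, A, b)
    eta = coarse_err                    # error of a cheap, coarse evaluation
    f_u = f_exact - eta
    if f_u <= gamma_u and eta > eps_u:  # target reached: refine on demand
        eta = eps_u
        f_u = f_exact - eta
    return f_u, g
```

With γ_u = +∞ and ε_u = 0 this behaves as an exact oracle (every answer is refined to zero error), while a finite γ_u lets the coarse answer pass whenever it stays above the target.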
Example 1 (On-demand accuracy oracles for stochastic programming [3]) Consider two-stage stochastic linear programming problems [5,17] with fixed recourse. By discretizing the uncertainty into N scenarios, we obtain the form of problem (1.1) with

f(u) = ⟨c, u⟩ + Σ_{i=1}^N p_i Q(u; h_i, T_i),

where u is the first-stage decision variable, c ∈ R^n, A ∈ R^{m1×n} and b ∈ R^{m1}. In addition, the recourse function

Q(u; h_i, T_i) = min_π { ⟨q, π⟩ : W π = h_i − T_i u, π ≥ 0 }

corresponds to the i-th scenario (h_i, T_i), with probability p_i > 0, h_i ∈ R^{m2} and T_i ∈ R^{m2×n}. Here π is the second-stage decision variable.

The above recourse function can be written in its dual form

Q(u; h_i, T_i) = max_λ { ⟨h_i − T_i u, λ⟩ : W^T λ ≤ q },

where q ∈ R^{n2} and W ∈ R^{m2×n2}. By solving this linear program to return a solution with precision up to a given tolerance, one can establish an inexact oracle of the form (2.1); see [3] for a more detailed description.
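A small sketch of such a first-stage oracle, using SciPy's `linprog` (HiGHS) to solve each scenario's recourse LP; the duals of the equality constraints (documented as `eqlin.marginals`, the sensitivity of the optimal value to the right-hand side) assemble a subgradient. The helper names and the tiny data layout are our own; a true on-demand oracle would additionally control the LP solver's tolerance.

```python
import numpy as np
from scipy.optimize import linprog

def recourse_value_and_dual(u, q, W, h, T):
    """Solve the recourse LP  min { <q, pi> : W pi = h - T u, pi >= 0 }
    and return its optimal value and the equality-constraint duals
    (the sensitivity of the value with respect to h - T u)."""
    res = linprog(c=q, A_eq=W, b_eq=h - T @ u, method="highs")
    assert res.status == 0, "recourse LP should be feasible and bounded here"
    return res.fun, res.eqlin.marginals

def first_stage_oracle(u, c, scenarios):
    """Approximate oracle for f(u) = <c,u> + sum_i p_i Q(u; h_i, T_i):
    value and a subgradient assembled from the scenario duals."""
    f_u = float(c @ u)
    g_u = np.array(c, dtype=float)
    for p, q, W, h, T in scenarios:
        Q, lam = recourse_value_and_dual(u, q, W, h, T)
        f_u += p * Q
        g_u += p * (-T.T @ lam)   # chain rule: dQ/du = -T^T * (dQ/d b_eq)
    return f_u, g_u
```

For a single scalar scenario with q = (1), W = (1), h = (4), T = (1) and c = (0.5), one has f(u) = 0.5u + (4 − u) for u ≤ 4, so the oracle at u = 1 should return the value 3.5 and the slope −0.5.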

The proximal-projection bundle method for oracles with on-demand accuracy
In this section, we present our proximal-projection bundle method for oracles with on-demand accuracy to solve problem (1.1). Firstly, problem (1.1) is equivalent to the unconstrained problem

min_{u ∈ R^n} f(u) + i_C(u),   (3.1)

where i_C is the indicator function of C, i.e., i_C(u) = 0 if u ∈ C and i_C(u) = ∞ otherwise. Let k be the current iteration index, let {u_j}_{j=1}^k ⊂ C be a sequence of trial points, and let the corresponding approximate values f_{u_j}/g_{u_j} be produced by the oracle (2.1). For simplicity, denote f_u^j := f_{u_j}, g_u^j := g_{u_j}, ε_u^j := ε_{u_j} and γ_u^j := γ_{u_j}; then the approximate linearizations of f at u_j are given by f_j(·) = f_u^j + ⟨g_u^j, · − u_j⟩. In addition, from (2.1) we conclude that f_j(·) ≤ f(·). Thus, a simple form of the approximate cutting-plane model of f at the k-th iteration can be defined by

f̌_k(·) = max_{j ∈ J_k} f_j(·),   (3.3)

where J_k ⊆ {1, ..., k} is some index set. Note that, in what follows, the choice of the model function f̌_k may differ from the form (3.3), since a subgradient aggregation strategy is adopted.
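The cutting-plane model (3.3) can be sketched as a small data structure that stores oracle answers and evaluates the max of the linearizations. The class name and interface are our own illustration, not part of the algorithm's specification.

```python
import numpy as np

class CuttingPlaneModel:
    """Piecewise-linear lower model f_check_k(u) = max_j { f_j + <g_j, u - u_j> }
    built from (possibly approximate) oracle answers, as in (3.3)."""
    def __init__(self):
        self.cuts = []   # list of (f_j, g_j, u_j) triples

    def add_cut(self, f_j, g_j, u_j):
        """Record the oracle answer (f_j, g_j) obtained at trial point u_j."""
        self.cuts.append((f_j, np.asarray(g_j, float), np.asarray(u_j, float)))

    def __call__(self, u):
        """Evaluate the model: the maximum of all stored linearizations at u."""
        u = np.asarray(u, float)
        return max(f + g @ (u - uj) for f, g, uj in self.cuts)
```

For instance, with the two exact cuts of f(u) = u² at u = 0 and u = 2, the model is max{0, 4u − 4}, which underestimates f everywhere, as the inexact cuts in (3.3) do.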
Based on the idea of proximal bundle methods (see, e.g., [1]) for solving problem (3.1), one may solve the following subproblem to obtain a new trial point u_{k+1}:

u_{k+1} ∈ arg min_{u ∈ R^n} { f̌_k(u) + i_C(u) + (1/(2 t_k)) ∥u − û_k∥² },   (3.4)

where t_k > 0 is a stepsize that controls the size of ∥u_{k+1} − û_k∥, and û_k (called the stability center) is the "best" point obtained so far; usually it is the trial point with the smallest approximate objective value. However, subproblem (3.4) is usually not easy to solve, so, making use of the proximal-projection idea of Kiwiel [7], we solve two easier subproblems instead. One is an unconstrained proximal subproblem, which is used to generate an aggregate linearization of f; the other subproblem, based on this linearization, is solved to produce a new trial point. The second subproblem is equivalent to projecting a certain point onto the feasible set C, which can have a closed-form solution if C has some special structure. Now we present the details of our algorithm, which is a generalized version of that in [7]. The main difference lies in Step 6, which incorporates the strategy of the on-demand accuracy oracle (2.1).
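The two-subproblem idea can be sketched for the simplest situation, in which the proximal subproblem has already produced an aggregate subgradient p_f of the model: the prox step of the (affine) aggregate linearization is explicit, and the second step is a projection onto C, here taken to be a box so that the projection is a single clip. This is a minimal sketch under these assumptions, not the full algorithm, which works with the whole cutting-plane model.

```python
import numpy as np

def proximal_projection_step(u_hat, t_k, p_f, lo, hi):
    """One simplified trial-point update in the proximal-projection spirit:
      step 1 (proximal):   u_check = u_hat - t_k * p_f
        (prox of the affine aggregate linearization, hence explicit);
      step 2 (projection): u_next = P_C(u_check) for the box C = [lo, hi]^n."""
    u_check = u_hat - t_k * np.asarray(p_f, float)
    return np.clip(u_check, lo, hi)
```

The split pays off exactly because step 2 is a cheap projection whenever C has special structure, instead of the constrained quadratic program (3.4).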

Remark 1
(i) The choice of the model function f̌_k is very flexible. The simplest choice f̌_k = max{f̄_{k−1}, f_k}, i.e., the previous aggregate linearization together with the newest cut, contains only two linear functions; for numerical stability, however, some further linearizations should be included.
(ii) Solving the two subproblems at Steps 2 and 3 can be viewed as the alternating linearization method (see, e.g., [18]) applied to problem (3.4).
(iii) Solving subproblem (3.8) is equivalent to projecting the point û_k − t_k p_f^k onto the feasible set C.
(iv) Inexactness is detected via v_k < −ϵ_k at Step 5, and the stepsize t_k is then increased until v_k ≥ −ϵ_k holds.
(v) The descent target and the error bound are updated at Step 6; the detailed rules ensuring convergence of the algorithm are given in the next section.
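Besides boxes, another feasible set with an essentially closed-form projection, as item (iii) exploits, is the unit simplex: its Euclidean projection is given by the classical sort-and-threshold rule. The helper name below is our own; this is a standard routine, not part of the paper's algorithm.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the unit simplex {u >= 0, sum(u) = 1}
    via the classical sort-and-threshold rule."""
    v = np.asarray(v, float)
    n = v.size
    s = np.sort(v)[::-1]              # coordinates in decreasing order
    css = np.cumsum(s)
    # largest index rho with s[rho] * (rho+1) > css[rho] - 1
    rho = np.max(np.nonzero(s * np.arange(1, n + 1) > (css - 1))[0])
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)
```

Such projections make the second subproblem of the method as cheap as a sort, which is the practical motivation for the proximal-projection splitting.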
Denote by V_k the optimality measure defined in (3.12); the convergence analysis below shows that the stability centers are approximately optimal when {û_k} is bounded and V_k → 0. The following lemma states some important properties of Algorithm 3.1; most of the results are borrowed from [7], but, for completeness, we present the whole proof.

Lemma 3.2
(i) The vectors p_f^k and p_C^k defined in (3.7) and (3.9) satisfy (3.14); the linearizations f̄_k, ī_C^k and f_C^k satisfy the inequalities f̄_k ≤ f̌_k and ī_C^k ≤ i_C.
(ii) The aggregate subgradient p^k of (3.10) and the above linearization f_C^k can be expressed in terms of p_f^k and p_C^k.
(iii) The predicted descent v_k and the aggregate linearization error ϵ_k of (3.10) satisfy (3.16).
(iv) The aggregate linearization f_C^k satisfies the stated lower-bound property.
(v) The optimality measure V_k of (3.12) satisfies (3.17) and (3.18).
(vi) The relations in (3.19) hold.

Proof (i) From the optimality condition of the proximal subproblem, we obtain the first relation in (3.14); furthermore, the relation f̄_k ≤ f̌_k follows from f̄_k(ǔ_{k+1}) = f̌_k(ǔ_{k+1}). Similarly, from the optimality condition of (3.8), we have p_C^k ∈ ∂i_C(u_{k+1}) and ī_C^k ≤ i_C. Hence, (3.14) holds.
(ii) By (3.9), we obtain the expression for p_C^k; using the linearity of f_C^k(·) and (3.7), we derive the stated expressions.
(iii) Combining (3.10) and (ii) yields the stated relations for v_k and ϵ_k.
(iv) By (iii), the aggregate linearization satisfies the stated lower bound.
(v) By the Cauchy-Schwarz inequality and (3.12), we obtain the first bound; from (3.16), the second inequality holds for all u.
(vi) From part (iii) above, (3.19) holds immediately. Furthermore, combining (3.15), (3.14) and (3.5) yields the remaining relations.

Global convergence
In this section, we establish the global convergence of Algorithm 3.1 under different types of oracles. The oracles depend on two parameters: the error bound ε_u and the descent target γ_u. Table 1 provides the choice of these parameters for the various instances described in Section 2, namely the Exact (Ex), Partially Inexact (PI), Inexact (IE), Asymptotically Exact (AE) and Partially Asymptotically Exact (PAE) oracles. The constants are selected as θ, κ ∈ (0, 1) and κ_ϵ ∈ (0, κ).
Table 1. The choices for the error bound and the descent target.

With these choices, the descent target is always reached at the stability centers.

Lemma 4.2
If either Algorithm 3.1 terminates at Step 4 at iteration k, or the number of loops between Steps 2 and 5 is infinite, then
(i) û_k is an optimal solution to problem (1.1) for instances Ex and PI;
(ii) û_k is an ε_u^k-optimal solution to problem (1.1) for instances AE and PAE.

Proof
First, suppose that Algorithm 3.1 terminates at Step 4. From (3.17), it follows that V_k = 0. Thus, û_k is optimal for instances Ex and PI, and ε_u^k-optimal for instances AE and PAE. Next, suppose that the loop between Steps 2 and 5 is infinite. From Lemma 4.1 and the condition at Step 5, we obtain (3.22), which in turn implies V_k → 0 since t_k ↑ ∞. Thus, f_û^k ≤ f_C(u) for all u by (3.18), and the claims follow by repeating the arguments of the first case.

By Lemma 4.2 above, we may now assume that Algorithm 3.1 neither terminates nor loops infinitely between Steps 2 and 5. As in [7], the model subgradients are assumed to be bounded. The following analysis is divided into two cases: finitely many descent steps and infinitely many descent steps. We first consider the former, which involves two subcases, according to whether the limiting stepsize t_∞ is infinite or finite.

Proof
Consider the last time t_k increases before Step 5 for k ∈ K; the stated bound then follows.

Next, we consider the case where t_∞ < ∞, which finishes the case of finitely many descent steps. The following lemma comes from [7, Lemma 3.3].

Lemma 4.5
Suppose that there exists k̄ such that only null steps occur for all k ≥ k̄.
Theorem 4.6 Suppose that finitely many descent steps occur, and let û_k be the last stability center. Then û_k is an ε_u^k-optimal solution to problem (1.1).

Remark 2
From Table 1, Theorem 4.6 shows that û_k is an optimal solution for instances Ex and PI; an ε-optimal solution for instance IE; and an ε_u^k-optimal solution for instances AE and PAE. We now analyze the case of infinitely many descent steps. The following lemma is borrowed from [7, Lemma 3.4].

Lemma 4.7
Suppose that infinitely many descent steps occur and f_û^∞ := lim_{k→∞} f_û^k > −∞. Then lim_{k∈K} V_k = 0, where K denotes the index set of descent steps. This, together with (4.1), shows claim (i).
(ii) We split the analysis for the other instances in Table 1: instance Ex has ε_u^{k+1} = 0, and the remaining instances follow from the update rules in Table 1.
(iii) If f_* > −∞, then f_û^∞ > −∞ by (i) and (ii) above. So, by Lemma 4.7, we have lim_{k∈K} V_k = 0. Passing to the limit in (3.18), we obtain f_û^∞ ≤ inf f_C = f_*.

Conclusions
In this paper, we have presented a proximal-projection bundle method with on-demand accuracy oracles for minimizing a convex function over a closed convex set. Our method extends the inexact oracle of [7] to oracles with on-demand accuracy. By suitably choosing the parameters of the oracle, the descent target is always reached at the stability centers, which is the key to establishing global convergence of the algorithm under the different types of inexactness.
We first analyze the case of t_∞ = ∞. Suppose that there exists a last descent index k̄ such that only null steps occur for all k ≥ k̄, and that t_∞ = ∞.

Lemma 4.4 Suppose that there exists k̄ such that û_k = û_k̄ and t_min ≤ t_{k+1} ≤ t_k for all k ≥ k̄. If the descent criterion (3.11) fails for all k ≥ k̄, then V_k → 0.

Proof For k ≥ k̄, combining t_k ≥ t_min, û_k = û_k̄ and the proof of [7, Lemma 3.2], we obtain V_k → 0.