Commit ef08702

Correctness fixes to tutorial (#37)
1 parent d612626 commit ef08702

1 file changed

Lines changed: 22 additions & 12 deletions

File tree

docs/src/tutorial.md

@@ -134,21 +134,32 @@ julia> @minimize ls(X1*X2-Y) st X1 >= 0., X2 >= 0.
134134

135135
## Limitations
136136

137-
Currently StructuredOptimization.jl supports only *proximal gradient algorithms* (i.e., *forward-backward splitting* base), which require specific properties of the nonsmooth functions and constraint to be applicable. In particular, the nonsmooth functions must have an *efficiently computable proximal mapping*.
137+
Currently StructuredOptimization.jl supports only *proximal gradient algorithms* (i.e., *forward-backward splitting* based), which require specific properties of the nonsmooth functions and constraints to be applicable. In particular, the nonsmooth function $g$ must have an *efficiently computable proximal mapping*:
138+
139+
```math
140+
\text{prox}_{g,\lambda}\left(x\right)=\arg\min_{y}g\left(y\right)+\frac{\lambda}{2}\left\Vert y-x\right\Vert ^{2}
141+
```
138142

143+
(we affectionately say such a function is prox-able).
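
As a concrete instance, the absolute value is prox-able: the minimizer over $y$ of $|y| + \frac{\lambda}{2}(y - x)^2$ is soft-thresholding with threshold $1/\lambda$. The sketch below is plain Julia and does not use the package's API; the names `prox_abs` and `prox_l1` are hypothetical and chosen only for illustration.

```julia
# Hypothetical helpers, not part of StructuredOptimization.jl.
# Minimizer over y of |y| + (λ/2)*(y - x)^2: soft-thresholding with threshold 1/λ.
prox_abs(x::Real, λ::Real) = sign(x) * max(abs(x) - 1/λ, zero(x))

# The ℓ1 norm is a sum of absolute values of the entries,
# so its proximal mapping acts elementwise.
prox_l1(x::AbstractVector, λ::Real) = prox_abs.(x, λ)

# With λ = 2 the threshold is 0.5: entries with magnitude below 0.5 are set to zero,
# the others are shrunk toward zero by 0.5.
prox_l1([3.0, -0.2, 1.5], 2.0)
```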
144+
139145
If we express the nonsmooth function $g$ as the composition of
140-
a function $\tilde{g}$ with a linear operator $A$:
146+
a prox-able function $\tilde{g}$ with a linear operator $A$:
141147

142148
```math
143149
g(\mathbf{x}) =
144150
\tilde{g}(A \mathbf{x})
145151
```
146152

147-
then the proximal mapping of $g$ is efficiently computable if either of the following hold:
153+
then $g$ is also prox-able if $A$ is a *tight frame*, namely it satisfies $A A^* = \mu Id$, where $\mu > 0$, $A^*$ is the adjoint of $A$, and $Id$ is the identity operator.
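
For reference, the tight-frame case admits a closed form. The identity below is a standard convex-analysis result rather than something stated in this tutorial; it assumes $\tilde{g}$ is proper, closed and convex and $\mu > 0$, and it is written with the quadratic weight $\lambda/2$ used in the definition of the proximal mapping:

```math
\text{prox}_{g,\lambda}(\mathbf{x}) =
\mathbf{x} + \frac{1}{\mu} A^* \left( \text{prox}_{\tilde{g},\lambda/\mu}(A \mathbf{x}) - A \mathbf{x} \right)
```

so evaluating the proximal mapping of $g$ costs one application of $A$, one of $A^*$, and one proximal mapping of $\tilde{g}$.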
148154

149-
1. Operator $A$ is a *tight frame*, namely it satisfies $A A^* = \mu Id$, where $\mu \geq 0$, $A^*$ is the adjoint of $A$, and $Id$ is the identity operator.
155+
More generally, a *separable sum* of prox-able functions $h_j$ is also prox-able:
150156

151-
2. Function $g$ is a *separable sum* $g(\mathbf{x}) = \sum_j h_j (B_j \mathbf{x}_j)$, where $\mathbf{x}_j$ are non-overlapping slices of $\mathbf{x}$, and $B_j$ are tight frames.
157+
```math
158+
g(\mathbf{x}) =
159+
\sum_j h_j (B_j \mathbf{x}_j)
160+
```
161+
162+
where $\mathbf{x}_j$ are non-overlapping slices of $\mathbf{x}$, and $B_j$ are tight frames.
152163

153164
Let us analyze these rules with a series of examples.
154165
The LASSO example above satisfies the first rule:
@@ -157,7 +168,7 @@ The LASSO example above satisfy the first rule:
157168
julia> @minimize ls( A*x - y ) + λ*norm(x, 1)
158169
```
159170

160-
since the nonsmooth function $\lambda \| \cdot \|_1$ is not composed with any operator (or equivalently is composed with $Id$ which is a tight frame).
171+
since the nonsmooth function $\lambda \| \cdot \|_1$ is separable and not composed with any operator (or equivalently is composed with $Id$ which is a tight frame).
161172
Also the following problem would be accepted by StructuredOptimization.jl:
162173

163174
```julia
@@ -170,21 +181,20 @@ since the discrete cosine transform (DCT) is orthogonal and is therefore a tight
170181
julia> @minimize ls( A*x - y ) + λ*norm(x, 1) st x >= 1.0
171182
```
172183

173-
cannot be solved through proximal gradient algorithms, since the second rule would be violated.
174-
Here the constraint would be converted into an indicator function and the nonsmooth function $g$ can be written as the sum:
184+
cannot be solved by this package. Here the constraint would be converted into an indicator function and the nonsmooth function $g$ can be written as the sum:
175185

176186
```math
177187
g(\mathbf{x}) =\lambda \| \mathbf{x} \|_1 + \delta_{\mathcal{S}} (\mathbf{x})
178188
```
179189

180-
which is not separable. On the other hand this problem would be accepted:
190+
which is separable, but not in a way that is obvious to the package: a simple sum of two prox-able functions is not always prox-able.
191+
This can be worked around by extending the library with a (simple) new ProximalOperator.
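
To illustrate why such an operator would be simple, here is a sketch of the computation it would perform, in plain Julia. It assumes a quadratic weight `ρ/2` (named `ρ` here only to avoid clashing with the $\lambda$ that scales the norm), and the function name is hypothetical. How to register a new operator with the package is not shown, since that depends on the package's interface.

```julia
# Hypothetical helper, not part of StructuredOptimization.jl or its dependencies.
# Elementwise minimizer over y of  λ*|y| + δ_{y ≥ 1}(y) + (ρ/2)*(y - x)^2.
# On the feasible set y ≥ 1 we have |y| = y, so the objective is a convex
# quadratic in y; its unconstrained minimizer x - λ/ρ is clipped from below at 1.
prox_l1_lowerbound(x::AbstractVector, λ::Real, ρ::Real) = max.(x .- λ/ρ, 1.0)

# Entries whose shifted value falls below 1 are moved to the bound.
prox_l1_lowerbound([0.3, 1.2, 4.0], 0.5, 1.0)
```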
192+
On the other hand, in this problem the sum is separable over non-overlapping slices of $\mathbf{x}$, and the problem is thus accepted:
181193

182194
```julia
183195
julia> @minimize ls( A*x - y ) + λ*norm(x[1:div(n,2)], 1) st x[div(n,2)+1:n] >= 1.0
184196
```
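
The reason the partitioned problem is easy can be sketched in plain Julia: each slice gets its own proximal mapping, and the results are stacked. The function name `prox_split` and the quadratic weight `ρ/2` are assumptions made for this illustration, not the package's internals.

```julia
# Hypothetical sketch; names are illustrative, not the package's API.
# Proximal mapping (quadratic weight ρ/2) of
#   λ*‖x[1:m]‖₁ + δ_{x[m+1:n] ≥ 1},  with m = div(n, 2).
function prox_split(x::AbstractVector, λ::Real, ρ::Real)
    n = length(x)
    m = div(n, 2)
    y = float.(x)  # fresh floating-point copy of the input
    # First slice: soft-thresholding with threshold λ/ρ.
    y[1:m] .= sign.(x[1:m]) .* max.(abs.(x[1:m]) .- λ/ρ, 0)
    # Second slice: projection onto the set {z : z ≥ 1}.
    y[m+1:n] .= max.(x[m+1:n], 1)
    return y
end

prox_split([2.0, -0.3, 0.4, 5.0], 0.5, 1.0)
```

Because the two slices do not overlap, neither step affects the other, which is what the separable-sum rule relies on.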
185197

186-
as not the optimization variables $\mathbf{x}$ are partitioned into non-overlapping groups.
187-
188198
!!! note
189199

190-
When the problem is not accepted it might be still possible to solve it: see [Smoothing](@ref) and [Duality](@ref).
200+
When the problem is not accepted it might still be possible to solve a variant: see [Smoothing](@ref) and [Duality](@ref).
