Components of an Operations Research Project
Formulation → Mathematical Modeling → Data Collection → Validation/Analysis → Interpretation/Implementation. An operations research course roughly covers the following topics:
Mathematical Optimization: Linear Programming (LP), Integer Linear Programming, Non-Linear Programming, Dynamic Programming, Graphs, Scheduling.

# Linear Programming

# Concepts of Linear Programming

# Basic Forms of LP

Linear programming (LP) is one of the most fundamental classes of optimization problems: it has an objective function $f(\boldsymbol{x})$ to be optimized and a set of constraints. By adjusting the signs of the objective and constraint coefficients, we can always make the objective a minimization, write every linear inequality as $\mathrm{LHS} \geq \mathrm{RHS}$, and require every variable to be nonnegative. An LP in this form is called the normal form of LP; it consists of one minimization objective $f(\boldsymbol{x})$ and several linear inequalities as constraints:

$$
\begin{aligned}
& \min && f(\boldsymbol{x}) = \sum_{i=1}^n c_i x_i = \boldsymbol{c}^\top\boldsymbol{x}, && \boldsymbol{c}, \boldsymbol{x} \in \mathbb{R}^n \\
& s.t. && A\boldsymbol{x} \geq \boldsymbol{b}, && A \in \mathbb{R}^{m\times n},\; \boldsymbol{b} \in \mathbb{R}^m \\
& && x_i \geq 0, && \forall i = 1, 2, \dots, n
\end{aligned}
$$
If we now introduce slack (surplus) variables

$$
x_{n+k} = \sum_{i=1}^n a_{k,i} x_i - b_k \geq 0, \quad k = 1, \dots, m,
$$

then $A\boldsymbol{x} \geq \boldsymbol{b}$ becomes $(A \; {-I_m})\,\boldsymbol{x}' = \boldsymbol{b}$, where $\boldsymbol{x}' = (\boldsymbol{x}^\top, x_{n+1}, \dots, x_{n+m})^\top$. (Subtracting the slack keeps $\boldsymbol{x}' \geq \boldsymbol{0}$; for constraints $A\boldsymbol{x} \leq \boldsymbol{b}$ one adds it instead, giving $(A \; I_m)$.) An LP after this transformation is called the standard form of LP, written as

$$
\begin{aligned}
& \min && \boldsymbol{c}^\top \boldsymbol{x} \\
& s.t. && A\boldsymbol{x} = \boldsymbol{b} \\
& && \boldsymbol{x} \geq \boldsymbol{0}
\end{aligned}
$$
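Not part of the original notes: a minimal sketch of solving a standard-form LP with `scipy.optimize.linprog`. The data below is an assumed toy instance, not taken from the notes.

```python
import numpy as np
from scipy.optimize import linprog

# Standard-form LP: min c^T x  s.t.  A x = b, x >= 0.
# Assumed toy instance: min x1 + 2*x2  s.t.  x1 + x2 = 1.
c = np.array([1.0, 2.0])
A_eq = np.array([[1.0, 1.0]])
b_eq = np.array([1.0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 2)
print(res.x, res.fun)   # optimum x = (1, 0), objective value 1
```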
Note that the system $A\boldsymbol{x} = \boldsymbol{b}$ has $n$ unknowns. Its equations are generally assumed to be mutually independent, so $m = \operatorname{rank} A \leq n$. Unless stated otherwise, we assume $A \in \mathbb{R}^{m\times n}$, $\boldsymbol{b} \in \mathbb{R}^m$, $\boldsymbol{c}, \boldsymbol{x} \in \mathbb{R}^n$.
# Solutions of a Linear Program

For a standard-form LP, let $B$ be an $m$-th order nonsingular square submatrix of $A$; then $B$ is called a basis of $A$. The $m$ linearly independent columns of $B$ are the basis vectors. The corresponding $m$ components of $\boldsymbol{x}$ are the basic variables, collected into $\boldsymbol{x}_B \in \mathbb{R}^m$; the remaining $n - m$ components are the nonbasic variables. A solution whose nonbasic variables are all $0$ is a basic solution. If in addition some basic variable equals $0$, it is a degenerate basic solution.

We may permute the columns of $A$ without affecting the solutions of the LP (only their order changes). Hence, in formal derivations we usually assume the first $m$ components of $\boldsymbol{x}$ are the basic variables and the rest are nonbasic, so that $\boldsymbol{x} = (\boldsymbol{x}_B^\top, \boldsymbol{0}^\top)^\top$.
# Solving a Linear Program

Fundamental Theorem of LP: if an LP has a feasible solution, then it has a basic feasible solution; if it has an optimal feasible solution, then it has an optimal basic feasible solution.

Proof. Suppose $\boldsymbol{x} = (x_1, x_2, \dots, x_n)^\top$ is a feasible solution; without loss of generality write it as $\boldsymbol{x} = (x_1, x_2, \dots, x_p, 0, \dots, 0)^\top$ with $x_i > 0$ for all $1 \leq i \leq p$. Write $A = (\boldsymbol{a}_1, \boldsymbol{a}_2, \dots, \boldsymbol{a}_n)$ by columns.
w.r.t.: with respect to.
# Simplex Method

# The Essence of the Simplex Method

Geometrically, the simplex method searches among the vertices of the feasible polyhedron, checking whether each visited vertex is optimal.

# Steps of the Simplex Method

The simplex algorithm proceeds as follows:
1. Find an initial basic feasible solution $x$ w.r.t. a basis $B$.
2. Transform $[A \; b]$ into $[I_m \; Y \; \boldsymbol{y}_0]$.
3. Compute $r_i = c_i - z_i$ for all $i \geq m+1$, where $z_i = \sum_{l=1}^m c_l y_{l,i}$.
4. If $r_i \geq 0$ for all $i$, then $x$ is optimal; stop. Otherwise, pick $q$ with $r_q < 0$ (conventionally the most negative one).
5. If $y_{i,q} \leq 0$ for all $1 \leq i \leq m$, stop and report "unbounded"; otherwise let
$$p = \argmin \left\{\left.\frac{y_{i,0}}{y_{i,q}} \right| y_{i,q} > 0\right\}.$$
6. Set $B' = \{a_q\} \cup B \setminus \{a_p\}$, update $[Y, \boldsymbol{y}_0]$, and go to step 3.

(Exercise: why is $B'$ still a basis?)
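Not from the notes: the steps above can be sketched as a small tableau implementation. This is a teaching sketch, not a production solver; it assumes the basis columns of $A$ already form an identity (as after adding slacks), $b \geq 0$, and no cycling (no anti-cycling rule is implemented).

```python
import numpy as np

def simplex(A, b, c, basis):
    """Tableau simplex for min c@x s.t. A@x = b, x >= 0.
    Assumes the given basis columns of A form an identity and b >= 0."""
    m, n = A.shape
    T = np.zeros((m + 1, n + 1))
    T[:m, :n], T[:m, n], T[m, :n] = A, b, c
    basis = list(basis)
    for i, j in enumerate(basis):        # price out: last row becomes reduced costs
        T[m] -= T[m, j] * T[i]
    while True:
        r = T[m, :n]
        if np.all(r >= -1e-9):           # all reduced costs >= 0: optimal
            x = np.zeros(n)
            x[basis] = T[:m, n]
            return x, -T[m, n]
        q = int(np.argmin(r))            # entering column: most negative r_q
        col = T[:m, q]
        if np.all(col <= 1e-9):
            raise ValueError("unbounded")
        ratio = np.full(m, np.inf)
        ratio[col > 1e-9] = T[:m, n][col > 1e-9] / col[col > 1e-9]
        p = int(np.argmin(ratio))        # leaving row: minimum-ratio test
        T[p] /= T[p, q]
        for i in range(m + 1):           # pivot on (p, q)
            if i != p:
                T[i] -= T[i, q] * T[p]
        basis[p] = q

# the worked example from the tableau section below, with slacks x3, x4
x, z = simplex(np.array([[2.0, 1, 1, 0], [1, 4, 0, 1]]),
               np.array([3.0, 4.0]),
               np.array([-7.0, -6.0, 0.0, 0.0]),
               basis=[2, 3])
print(x, z)   # x = (8/7, 5/7, 0, 0), z = -86/7
```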
# A Simplex Example

$$
\begin{aligned}
& \min && Z = -x_1 - 2x_2 - x_3 \\
& s.t. && 2x_1 + x_2 - x_3 \leq 2 \\
& && 2x_1 - x_2 + 5x_3 \leq 6 \\
& && 4x_1 + x_2 + x_3 \leq 6 \\
& && x_1, x_2, x_3 \geq 0
\end{aligned}
$$

The worked steps are omitted.
# Two-Phase Method

The simplex method consists of two main parts: finding an initial solution and iterating toward the optimum. For constraints of type $A\boldsymbol{x} \leq \boldsymbol{b}$, slack variables immediately yield a rather weak initial solution (typically $x_i = 0$ for all $i$). For LPs where an initial solution is hard to obtain, we turn to the two-phase method to find a suitable one.

The two-phase method is a refinement of the initial-solution step of the simplex method. Its algorithm has two parts:

1. Keeping the constraints unchanged, solve the LP whose objective is $\sum x_{art}$, where the $x_{art}$ are artificial variables.
2. Starting from the basic feasible solution obtained in phase 1, iterate to the optimal solution.

Note that an initial solution for phase 1 is easy to find, and the final solution of phase 1 is a basic feasible solution of the original problem; this simplifies the search for an initial solution in the simplex method.
Concretely, the two-phase method works as follows. For the constrained problem

$$
\begin{aligned}
& \min && \boldsymbol{c}^\top \boldsymbol{x} \\
& s.t. && A\boldsymbol{x} = \boldsymbol{b} \\
& && \boldsymbol{x} \geq \boldsymbol{0}
\end{aligned}
$$

introduce artificial variables $x_{n+1}, x_{n+2}, \dots, x_{n+m}$ and solve

$$
\begin{aligned}
& \min && g = \sum_{i=n+1}^{n+m} x_i \\
& s.t. && A\boldsymbol{x} + (x_{n+1}, x_{n+2}, \dots, x_{n+m})^\top = \boldsymbol{b} \\
& && \boldsymbol{x} \geq \boldsymbol{0}, \; x_{art} \geq 0
\end{aligned}
$$

If the minimum satisfies $g = \sum x_{art} = 0$, then the corresponding $\boldsymbol{x}$ is an initial solution of the original problem.
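Not from the notes: phase 1 can be sketched with `scipy.optimize.linprog` on assumed data — append one artificial variable per row and minimize their sum $g$.

```python
import numpy as np
from scipy.optimize import linprog

# Assumed system: x1 + x2 + x3 = 2, x1 - x2 = 0, x >= 0.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])
b = np.array([2.0, 0.0])
m, n = A.shape

A1 = np.hstack([A, np.eye(m)])                  # columns for the artificials
c1 = np.concatenate([np.zeros(n), np.ones(m)])  # objective g = sum of artificials
res = linprog(c1, A_eq=A1, b_eq=b)              # x >= 0 is linprog's default

x0 = res.x[:n]        # g* = 0, so this is a feasible start for phase 2
print(res.fun, x0)
```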
# The Simplex Tableau

Assumption: the first $m$ columns of $A$ form a basis $B$, so $A = [B, D]$, $c = [c_B^\top, c_D^\top]^\top$, $x = [x_B^\top, x_D^\top]^\top$. The problem becomes $\min c_B^\top x_B + c_D^\top x_D$, s.t. $B x_B + D x_D = b$ and $x_B, x_D \geq 0$.

For the basic solution $x = [x_B^\top, 0^\top]^\top$: $x_B = B^{-1} b$, i.e. $x = [(B^{-1}b)^\top, 0^\top]^\top$, with objective value $Z_0 = c_B^\top B^{-1} b$.

Tableau:
For the $(m+1) \times (n+1)$ matrix

$$
\left[\begin{matrix} A & b \\ c^\top & 0 \end{matrix}\right] = \left[\begin{matrix} B & D & b \\ c_B^\top & c_D^\top & 0 \end{matrix}\right]
$$

there exists an $(m+1) \times (m+1)$ matrix such that

$$
\left[\begin{matrix} B^{-1} & \boldsymbol{0} \\ \boldsymbol{0}^\top & 1 \end{matrix}\right] \times \left[\begin{matrix} B & D & b \\ c_B^\top & c_D^\top & 0 \end{matrix}\right] = \left[\begin{matrix} I_m & B^{-1}D = Y & B^{-1}b = \boldsymbol{y}_0 \\ c_B^\top & c_D^\top & 0 \end{matrix}\right]
$$
Transforming further,

$$
\left[\begin{matrix} I_m & \boldsymbol{0} \\ -c_B^\top & 1 \end{matrix}\right] \times \left[\begin{matrix} I_m & B^{-1}D & B^{-1}b \\ c_B^\top & c_D^\top & 0 \end{matrix}\right] = \left[\begin{matrix} I_m & B^{-1}D & B^{-1}b \\ \boldsymbol{0}^\top & c_D^\top - c_B^\top B^{-1} D & -c_B^\top B^{-1} b \end{matrix}\right] \quad (*)
$$

The bottom-right entry of $(*)$, $-c_B^\top B^{-1} b = -Z_0$, is the negated objective value, and $c_D^\top - c_B^\top B^{-1} D = r_D^\top$ is the vector of reduced costs.

So $r_D$ and $-Z_0$ come out as a by-product, provided the bottom-left block of $(*)$ is $\boldsymbol{0}^\top$.
A result left as an exercise: after pivoting on the element $y_{pq}$, the tableau updates as

$$
\left[\begin{matrix} Y & \boldsymbol{y}_0 \\ c_D^\top & 0 \end{matrix}\right] \to \left[\begin{matrix} Y' & \boldsymbol{y}_0' \\ {c_D'}^\top & Z' \end{matrix}\right]
$$

and for every row $1 \leq i \leq m+1$ and every column $j$ it suffices to take

$$
y_{ij}' = y_{ij} - \frac{y_{pj}}{y_{pq}} y_{iq} \;\; (i \neq p), \qquad y_{pj}' = \frac{y_{pj}}{y_{pq}}.
$$
Example:

$$
\begin{aligned}
& \max && 7x_1 + 6x_2 \\
& s.t. && 2x_1 + x_2 \leq 3 \\
& && x_1 + 4x_2 \leq 4 \\
& && x_1, x_2 \geq 0
\end{aligned}
$$

Converting to standard form gives

$$
\begin{aligned}
& \min && -7x_1 - 6x_2 \\
& s.t. && 2x_1 + x_2 + x_3 = 3 \\
& && x_1 + 4x_2 + x_4 = 4 \\
& && x_1, x_2, x_3, x_4 \geq 0
\end{aligned}
$$

Then,

$$
A = \left[\begin{matrix} 2 & 1 & 1 & 0 \\ 1 & 4 & 0 & 1 \end{matrix}\right]
$$

So,

$$
\left[\begin{matrix} A & b \\ c^\top & 0 \end{matrix}\right] = \left[\begin{matrix} 2 & 1 & 1 & 0 & 3 \\ 1 & 4 & 0 & 1 & 4 \\ -7 & -6 & 0 & 0 & 0 \end{matrix}\right]
$$

and $x = [x_1, x_2, x_3, x_4]^\top = [0, 0, 3, 4]^\top$.

So $r_1 = -7$, $r_2 = -6 \Rightarrow q = 1$. Pivoting on column $1$ gives

$$
\left[\begin{matrix} 1 & \frac{1}{2} & \frac{1}{2} & 0 & \frac{3}{2} \\ 0 & \frac{7}{2} & -\frac{1}{2} & 1 & \frac{5}{2} \\ 0 & -\frac{5}{2} & \frac{7}{2} & 0 & \frac{21}{2} \end{matrix}\right]
$$
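As a cross-check (not in the original notes), scipy's LP solver on the same problem reproduces the value obtained by finishing the pivots by hand:

```python
from scipy.optimize import linprog

# min -7x1 - 6x2  s.t.  2x1 + x2 <= 3,  x1 + 4x2 <= 4,  x >= 0
res = linprog([-7, -6], A_ub=[[2, 1], [1, 4]], b_ub=[3, 4])
print(res.x, res.fun)   # x = (8/7, 5/7), objective -86/7
```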
A counterexample for the simplex method (it cycles and never terminates):

$$
\begin{aligned}
& \min && -\frac{3}{4}x_1 + 20x_2 - \frac{1}{2}x_3 + 6x_4 \\
& s.t. && \frac{1}{4}x_1 - 8x_2 - x_3 + 9x_4 + x_5 = 0 \\
& && \frac{1}{2}x_1 - 12x_2 - \frac{1}{2}x_3 + 3x_4 + x_6 = 0 \\
& && x_3 + x_7 = 1 \\
& && x_1, \dots, x_7 \geq 0
\end{aligned}
$$
# Improvement of the Simplex Method

$$[A, b] \leadsto [I_m, Y, \boldsymbol{y}_0] \leadsto [I_m, Y', \boldsymbol{y}_0'] \leadsto \cdots$$

When $n \gg m$, the vast majority of columns never get a chance to enter the basis.

# Summary of the Revised Procedure

1. Form a revised tableau $[B^{-1}, \boldsymbol{y}_0 = B^{-1} b]$.
2. Calculate $r_D^\top = c_D^\top - \lambda^\top D$, where $\lambda^\top = c_B^\top B^{-1}$.
3. If $r_j \geq 0$ for all $j$, output $x$ and stop.
4. Select $q$ with $r_q < 0$ and compute $\boldsymbol{y}_q = B^{-1} \boldsymbol{a}_q$. If no $y_{iq} > 0$, stop (unbounded).
5. $p = \argmin \{ y_{i0} / y_{iq} : y_{iq} > 0 \}$.
6. $B^{-1}_{\text{new}} = E \times B^{-1}$ for an elementary pivot matrix $E$; go to step 2.

The amount of computation per step is reduced.
Example:

$$
\left\{\begin{aligned}
& \min && 3x_1 + 5x_2 \\
& s.t. && x_1 + x_2 \leq 4 \\
& && 5x_1 + 3x_2 \geq 8 \\
& && x_1, x_2 \geq 0
\end{aligned}\right.
$$
# Duality in Linear Programming

Duality is applied in approximation algorithms, LP, and many other areas.

# The Dual Problem

For a normal-form linear program

$$
\begin{aligned}
& \min && \boldsymbol{c}^\top \boldsymbol{x} \\
& s.t. && A \boldsymbol{x} \geq \boldsymbol{b} \\
& && \boldsymbol{x} \geq \boldsymbol{0}
\end{aligned}
$$

its dual problem is defined as

$$
\begin{aligned}
& \max && \boldsymbol{b}^\top \boldsymbol{y} \\
& s.t. && \boldsymbol{y}^\top A \leq \boldsymbol{c}^\top \\
& && \boldsymbol{y} \geq \boldsymbol{0}
\end{aligned}
$$
From this one can derive that for the problem

$$
\begin{aligned}
& \min && \boldsymbol{c}^\top \boldsymbol{x} \\
& s.t. && A \boldsymbol{x} = \boldsymbol{b} \\
& && \boldsymbol{x} \geq \boldsymbol{0}
\end{aligned}
$$

the dual problem is

$$
\begin{aligned}
& \max && \boldsymbol{b}^\top \boldsymbol{y} \\
& s.t. && \boldsymbol{y}^\top A \leq \boldsymbol{c}^\top
\end{aligned}
$$
# Properties of the Dual Problem

Claim: the dual of the dual is the primal.

Lemma (weak duality): suppose $x$ and $y$ are feasible solutions to the primal and dual LPs respectively. Then $c^\top x \geq y^\top b$; in a primal-dual pair, the minimization side always attains the larger value.

Lemma: if one of the two problems is unbounded, then the other has no feasible solution.

Theorem: suppose $x_0, y_0$ are feasible solutions to the primal and dual respectively. If $c^\top x_0 = b^\top y_0$, then $x_0$ and $y_0$ are both optimal: whenever the two objective values meet, the solutions must be optimal.

Proof: omitted.
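Not from the notes: the duality relations can be checked numerically on an assumed small normal-form pair — scipy solves both problems and the two optimal values coincide.

```python
import numpy as np
from scipy.optimize import linprog

# primal: min c^T x  s.t. A x >= b, x >= 0
# dual:   max b^T y  s.t. A^T y <= c, y >= 0   (assumed toy data)
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([2.0, 3.0])

primal = linprog(c, A_ub=-A, b_ub=-b)     # rewrite A x >= b as -A x <= -b
dual = linprog(-b, A_ub=A.T, b_ub=c)      # max b^T y as min -b^T y
print(primal.fun, -dual.fun)              # the two optimal values coincide
```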
# Other Fast Methods for Solving LPs

# The Ellipsoid Algorithm

Due to Khachiyan.

Definition of an ellipsoid:

$E = E(A, a) = \{x \in \mathbb{R}^n \mid (x - a)^\top A^{-1} (x - a) \leq 1\}$. Its width is $\sqrt{\alpha}$ and its radius $\sqrt{\beta}$, where $\alpha$ and $\beta$ are the smallest and largest eigenvalues of $A$, respectively.
For an LP problem:

$$
\left\{\begin{aligned}
& \min && c^\top x \\
& s.t. && A x \geq b
\end{aligned}\right.
$$
# Karmarkar Algorithm

Reference (in Chinese): 【线性规划 (七)】内点法 - 王源的文章 - 知乎.

Definition (canonical form of LP):

$$
\begin{aligned}
& \min && c^\top x \\
& s.t. && A x = 0 \\
& && \textstyle\sum_i x_i = 1 \\
& && x_i \geq 0
\end{aligned}
$$

Define $\Omega = \{x : Ax = 0\}$ and $\Delta = \{x : e^\top x = 1\}$.

The nullspace of a matrix $M$ is $\{x \mid Mx = 0\}$. The center of $\Delta$ is $\alpha_0 = \frac{e}{n} = [1/n, 1/n, \dots, 1/n]^\top$.

$$
\Omega \cap \Delta = \left\{ x : \begin{bmatrix} A \\ e^\top \end{bmatrix} x = \begin{bmatrix} \boldsymbol{0} \\ 1 \end{bmatrix} \right\}
$$
Assumptions:

1. $\alpha_0 \in \Omega$.
2. The optimal value of $c^\top x$ is $0$.
3. $\operatorname{rank}\begin{bmatrix} A \\ e^\top \end{bmatrix} = m + 1$.
4. There exists $q \in \mathbb{R}^+$ such that if $\dfrac{c^\top x}{c^\top x_0} \leq 2^{-q}$, then $x$ is accepted as optimal.

(These notes are still to be organized.)
# Integer Linear Programming

# Concepts

An integer linear programming (ILP) problem has the following form:

$$
\begin{aligned}
& \min && \boldsymbol{c}^\top \boldsymbol{x} \\
& s.t. && A\boldsymbol{x} = \boldsymbol{b} \\
& && \boldsymbol{x} \geq \boldsymbol{0} \\
& && \boldsymbol{x} \in \mathbb{Z}^n
\end{aligned}
$$

We first give a few examples of ILP problems.
# The Traveling Salesman Problem (TSP)

# Gomory's Cutting-Plane Method

# Convex Optimization

# Basic Concepts

If the feasible set $\Omega$ is a convex set and the objective function $f$ is a convex function, the problem is called a convex optimization problem.

# Convex Sets

$S \subset \mathbb{R}^n$ is convex if and only if for all $a, b \in S$ and all $\alpha \in (0, 1)$, we have $\alpha a + (1 - \alpha) b \in S$.

# Convex Functions

Let $S \subset \mathbb{R}^n$ be convex. If for all $a, b \in S$ and all $\alpha \in (0, 1)$,

$$
f(\alpha a + (1 - \alpha) b) \leq \alpha f(a) + (1 - \alpha) f(b),
$$

then $f$ is called a convex function.
# Properties and Examples

If $f_1, f_2$ are convex functions, then

- $\alpha f_1$ is convex for any $\alpha > 0$;
- $f_1 + f_2$ is convex.

# Example 1

$f(x) = \|x\|$ is convex. This follows from the triangle inequality, which the Cauchy-Schwarz inequality yields:

$$
\|x + y\|^2 = \|x\|^2 + 2\langle x, y\rangle + \|y\|^2 \leq \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2
$$
# Example 2

Given a convex set $S \subset \mathbb{R}^n$, a convex function $f : S \to \mathbb{R}$ on it, and a constant $c \in \mathbb{R}$, the level set $H_S(f, c) = \{x \in S : f(x) \leq c\}$ is convex.
Proof

For all $x_1, x_2 \in H_S$ and all $\alpha \in (0, 1)$, convexity of $f$ gives $f(\alpha x_1 + (1 - \alpha) x_2) \leq \alpha f(x_1) + (1 - \alpha) f(x_2)$.

Since $x_1, x_2 \in H_S$, we further have $\alpha f(x_1) + (1 - \alpha) f(x_2) \leq \alpha c + (1 - \alpha) c = c$.

Hence $\alpha x_1 + (1 - \alpha) x_2 \in H_S(f, c)$.
# Example 3

For $f : S \to \mathbb{R}$, define the epigraph of $f$ as

$$
\mathrm{epi}(f) = \left\{ \begin{bmatrix} x \\ c \end{bmatrix} : x \in S, \; c \in \mathbb{R}, \; f(x) \leq c \right\}
$$

($f$ is convex if and only if $\mathrm{epi}(f)$ is a convex set.)
# Theorem

Let $S \subset \mathbb{R}^n$ be a nonempty open convex set and let $f : S \to \mathbb{R}$ with $f \in C^1$. Then $f$ is convex on $S$ $\iff$ for all $x, y \in S$, $f(y) \geq f(x) + \nabla f^\top(x)(y - x)$.
Proof:

"$\Rightarrow$":

If $f \in C^1$ is convex, then

$$
\forall x, y \in S, \; \alpha \in (0, 1): \quad f(\alpha y + (1 - \alpha) x) \leq \alpha f(y) + (1 - \alpha) f(x)
$$

So

$$
f\big(x + \alpha(y - x)\big) - f(x) \leq \alpha \big(f(y) - f(x)\big)
$$

By Taylor's theorem:

$$
f\big(x + \alpha(y - x)\big) - f(x) = \alpha \nabla f^\top(x)(y - x) + o(\|\alpha(y - x)\|)
$$

Dividing both sides by $\alpha > 0$ and letting $\alpha \to 0$,

$$
\lim_{\alpha \to 0} \frac{o(\|\alpha(y - x)\|)}{\alpha} = 0,
$$

so the remainder term vanishes in the limit and

$$
\nabla f^\top(x)(y - x) \leq f(y) - f(x)
$$
"⇐ \Leftarrow ⇐ ":
Def z = α u + ( 1 − α ) v z = \alpha u + (1-\alpha)v z = α u + ( 1 − α ) v , for u , v ∈ S , α ∈ ( 0 , 1 ) u,v \in S, \alpha\in(0,1) u , v ∈ S , α ∈ ( 0 , 1 ) .
So z ∈ S z \in S z ∈ S , and
f ( u ) ≥ f ( z ) + ∇ f ⊤ ( z ) ( u − z ) f ( v ) ≥ f ( z ) + ∇ f ⊤ ( z ) ( v − z ) f(u) \geq f(z) + \nabla f^\top(z)(u-z)\\ f(v) \geq f(z) + \nabla f^\top(z)(v-z) f ( u ) ≥ f ( z ) + ∇ f ⊤ ( z ) ( u − z ) f ( v ) ≥ f ( z ) + ∇ f ⊤ ( z ) ( v − z )
Then
α f ( u ) + ( 1 − α ) f ( v ) ≥ f ( z ) + ∇ f ⊤ ( z ) ( α ( u − z ) + ( 1 − α ) ( v − z ) ) ≥ f ( z ) \alpha f(u) + (1-\alpha)f(v) \geq f(z) + \nabla f^\top(z)\big(\alpha(u-z) + (1-\alpha)(v-z)\big) \geq f(z) α f ( u ) + ( 1 − α ) f ( v ) ≥ f ( z ) + ∇ f ⊤ ( z ) ( α ( u − z ) + ( 1 − α ) ( v − z ) ) ≥ f ( z )
# Thm

Let $S \subset \mathbb{R}^n$ be open and convex, $f : S \to \mathbb{R}$, $f \in C^2$. Then $f$ is convex $\iff$ for all $x \in S$, the Hessian $H(x)$ of $f$ at $x$ is positive semi-definite ($d^\top H(x) d \geq 0$ for all $d$).
Proof:

"$\Leftarrow$":

By Taylor's theorem, there exists $\alpha \in (0, 1)$ such that $f(y) = f(x) + \nabla f^\top(x)(y - x) + \frac{1}{2}(y - x)^\top H\big(x + \alpha(y - x)\big)(y - x)$. If $H$ is positive semi-definite everywhere, the quadratic term is nonnegative, so $f(y) \geq f(x) + \nabla f^\top(x)(y - x)$ and convexity follows from the previous theorem.

"$\Rightarrow$":

By contraposition: suppose there exist $x \in S$ and $d \in \mathbb{R}^n$ such that $d^\top H(x) d < 0$.

$S$ is open, so there exists $t \in \mathbb{R}$ with $y = x + t d \in S$. Because $H$ is continuous, taking $t$ small enough gives $d^\top H(z) d < 0$ for every $z = \alpha x + (1 - \alpha) y$, $\alpha \in (0, 1)$.

Then, for some such $z$ on the segment,

$$
f(y) = f(x) + \nabla f^\top(x)(y - x) + \frac{1}{2}(y - x)^\top H(z)(y - x) = f(x) + \nabla f^\top(x)(y - x) + \frac{t^2}{2}\, d^\top H(z)\, d < f(x) + \nabla f^\top(x)(y - x),
$$

contradicting the first-order characterization of convexity.
# Example

$f(x) = \frac{1}{2} x^\top A x + b^\top x + c$, where $A \in \mathbb{R}^{n \times n}$ with $A = A^\top$, $x, b \in \mathbb{R}^n$, $c \in \mathbb{R}$.

Then $\nabla f(x) = A x + b$, and the Hessian is $F(x) = A$.

So $f$ is convex $\iff$ $A$ is positive semi-definite.
# Convex Programs

$$
\begin{aligned}
\min \; & f(x) && f \ \text{convex} \\
s.t. \; & x \in \Omega && \Omega \ \text{convex}
\end{aligned}
$$

Then $x^\star$ is a global optimizer $\iff$ $x^\star$ is a local optimizer.
Proof : "⇐ \Leftarrow ⇐ ":
x ⋆ x^\star x ⋆ is global ⇒ y ∈ Ω : f ( y ) < f ( x ⋆ ) \Rightarrow y \in \Omega: f(y) < f(x^\star) ⇒ y ∈ Ω : f ( y ) < f ( x ⋆ ) .
Prop:

If $f$ is convex on $\Omega$, then the set of all optimizers of $f$ is convex.

Prop 2:

Let $Q \subset \mathbb{R}^n$ be open and convex and $f \in C^1$. If there exists $x^\star \in Q$ such that for all $x \in Q$ with $x \neq x^\star$ we have $\nabla f^\top(x)(x - x^\star) \geq 0$, then $x^\star$ is a global optimizer.

Note: $\nabla f(x^\star) = 0 \Rightarrow x^\star$ is an optimizer.
So how do we actually find a local optimizer?

# One-Dimensional Optimization

The problem is simply to find a minimum, $\min f(x)$.
# Golden Section Method

Assumption: $f$ is unimodal on $[a, b]$, i.e. $[a, b]$ contains exactly one local minimizer.

We seek $\min f(x)$.

The number of iterations can be computed as follows (the stopping condition is $b_k - a_k < \varepsilon$):

since $b_i - a_i = (1 - \rho)(b_{i-1} - a_{i-1})$, we need $(b_0 - a_0)(1 - \rho)^n \leq \varepsilon$.
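Not from the notes: a minimal golden-section search, assuming a unimodal objective. With $\rho = (3 - \sqrt{5})/2$, the interval shrinks by the factor $1 - \rho \approx 0.618$ per iteration while reusing one function evaluation.

```python
import math

def golden_section(f, a, b, eps=1e-8):
    """Golden-section search for the minimizer of a unimodal f on [a, b]."""
    rho = (3 - math.sqrt(5)) / 2          # 1 - rho = 0.618...
    x1, x2 = a + rho * (b - a), b - rho * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > eps:
        if f1 < f2:                       # minimizer lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = a + rho * (b - a)
            f1 = f(x1)
        else:                             # minimizer lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = b - rho * (b - a)
            f2 = f(x2)
    return (a + b) / 2

xm = golden_section(lambda x: (x - 1.3) ** 2, 0.0, 2.0)
print(xm)   # approx 1.3
```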
# Fibonacci Method

Is there a scheme that saves more evaluations than golden section? Consider using a different $\rho_k$ in each step. We require

$$
\begin{aligned}
& \min && \prod_{i=1}^N (1 - \rho_i) \\
& s.t. && \rho_{k+1}(1 - \rho_k) = 1 - 2\rho_k, && \forall k \\
& && \rho_k \in (0, 1/2)
\end{aligned}
$$

We state the answer to this optimization problem directly; it involves the Fibonacci sequence $\{F_n\}$: $\rho_k = 1 - F_{N+1-k}/F_{N+2-k}$, and in particular $\rho_1 = 1 - F_N/F_{N+1}$.

Then $\min \prod_{i=1}^N (1 - \rho_i) = \dfrac{1}{F_{N+1}}$, and $N$ is chosen so that $F_{N+1} > 1/\varepsilon$.
# Newton's Method

Requires $f \in C^2$.

Newton's method for root finding iterates

$$
x_i = x_{i-1} - \frac{f(x_{i-1})}{f'(x_{i-1})}
$$

To find a minimum we seek $x^\star$ such that $f'(x^\star) = 0$, so we simply apply Newton's method to $f'$, iterating $x_i = x_{i-1} - f'(x_{i-1})/f''(x_{i-1})$.
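Not from the notes: a minimal sketch of Newton's method applied to $f'$, on an assumed objective $f(x) = e^x - 2x$, whose minimizer is $x^\star = \ln 2$.

```python
import math

def newton_min(df, d2f, x0, tol=1e-12, max_iter=100):
    """Newton iteration on f' (solves f'(x) = 0), a 1-D minimization sketch."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)   # Newton step for the root of f'
        x -= step
        if abs(step) < tol:
            break
    return x

# f(x) = e^x - 2x:  f'(x) = e^x - 2 vanishes at x = ln 2
xs = newton_min(lambda x: math.exp(x) - 2, lambda x: math.exp(x), x0=1.0)
print(xs)   # approx 0.6931
```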
# Unconstrained Optimization

# Optimality Conditions for Unconstrained Problems

$\min f(x)$, $x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}$.

- Necessary condition: used to show a point is not optimal.
- Sufficient condition: used to certify a point as optimal.

Here we introduce only one necessary condition:

# FONC (First-Order Necessary Condition)

Let $f \in C^1$. If $x^\star$ is a local minimizer, then $\nabla f(x^\star) = 0$: a local optimum must be a stationary point.
Proof:

Claim: for any feasible direction $d$ at $x^\star$, we have $d^\top \nabla f(x^\star) = 0$.

Define $x(\alpha) := x^\star + \alpha d$ and $\phi(\alpha) = f(x(\alpha))$.

Then by Taylor's theorem:

$$
f(x^\star + \alpha d) - f(x^\star) = \phi(\alpha) - \phi(0) = \phi'(0)\alpha + o(\alpha) = \alpha\, d^\top \nabla f(x^\star) + o(\alpha)
$$

Since $x^\star$ is a local minimizer, $f(x^\star + \alpha d) \geq f(x^\star)$ for small $\alpha > 0$, so $d^\top \nabla f(x^\star) \geq 0$. Replacing $d$ with $-d$ gives $-d^\top \nabla f(x^\star) \geq 0$. Hence $d^\top \nabla f(x^\star) = 0$ for every $d$, and therefore $\nabla f(x^\star) = 0$.
# SONC (Second-Order Necessary Condition)

Skipped; we go directly to the sufficient condition.

# SOSC (Second-Order Sufficient Condition)

Let $f \in C^2$. If $\nabla f(x^\star) = 0$ and $F(x^\star) > 0$ (the Hessian is positive definite), then $x^\star$ is a strict local minimizer.

Research so far only provides sufficient conditions for strict local optimality, not sufficient conditions for (non-strict) local optimality.
Proof:

Apply Taylor's theorem. With $\nabla f(x^\star) = 0$,

$$
f(x^\star + \alpha d) - f(x^\star) = \frac{\alpha^2}{2}\, d^\top F(x^\star)\, d + o(\|\alpha d\|^2) > 0
$$

for every unit direction $d$ and all sufficiently small $\alpha > 0$.
Example:

$f(x) = x_1^2 + x_2^2$.

$$
\nabla f(x) = \begin{bmatrix} 2x_1 \\ 2x_2 \end{bmatrix}
$$

(The rest of the computation is omitted.)
# Unconstrained Optimization Algorithms

# Steepest Descent

$x_{k+1} = x_k + \alpha_k d_k$, with $\alpha_k \in \mathbb{R}$, $d_k \in \mathbb{R}^n$.

$$
f(x_{k+1}) - f(x_k) = \alpha_k \nabla f(x_k)^\top d_k + o(\|\alpha_k d_k\|)
$$

Here $d_k = -\dfrac{\nabla f(x_k)}{\|\nabla f(x_k)\|}$. The step $\alpha_k$ may be fixed or varying. If we take $\alpha_k = \argmin_{\alpha > 0} f(x_k - \alpha \nabla f(x_k))$, a one-dimensional optimization problem in which $\alpha$ is the only unknown, the method is called the steepest descent method.

Note that steepest descent differs from plain gradient descent: in unrefined gradient descent the step size (learning rate) $\alpha$ is fixed in advance, whereas here $\alpha$ is obtained by solving $\argmin_\alpha f(x_k - \alpha \nabla f(x_k))$.
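Not from the notes: a sketch of steepest descent on an assumed quadratic $f(x) = \frac{1}{2}x^\top Q x - b^\top x$ with symmetric positive definite $Q$, where the exact line-search step has the closed form $\alpha_k = g_k^\top g_k / (g_k^\top Q g_k)$.

```python
import numpy as np

def steepest_descent(Q, b, x0, tol=1e-10, max_iter=500):
    """Steepest descent for f(x) = 0.5 x^T Q x - b^T x, Q SPD;
    each step uses the exact line-search alpha in closed form."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g = Q @ x - b                     # gradient at x
        if np.linalg.norm(g) < tol:
            break
        alpha = (g @ g) / (g @ Q @ g)     # argmin of f along -g
        x = x - alpha * g
    return x

Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = steepest_descent(Q, b, np.zeros(2))
print(x)                                  # solves Q x = b
```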
# Properties of Steepest Descent

Property 1: $x_{k+1} - x_k$ is orthogonal to $x_k - x_{k-1}$; any two consecutive steps of steepest descent are mutually orthogonal.

Proof:

$$
\langle x_{k+1} - x_k, x_k - x_{k-1} \rangle = \alpha_k \alpha_{k-1} \langle \nabla f(x_k), \nabla f(x_{k-1}) \rangle
$$

By the FONC applied to $\phi_k(\alpha) = f(x_k - \alpha \nabla f(x_k))$:

$$
\phi_k'(\alpha_k) = 0
$$

So $\nabla f\big(x_k - \alpha_k \nabla f(x_k)\big)^\top \big(-\nabla f(x_k)\big) = 0$, which means $\nabla f(x_{k+1})^\top \nabla f(x_k) = 0$.
Property 2: if $\nabla f(x_k) \neq 0$, then $f(x_{k+1}) < f(x_k)$: the function value strictly decreases.

Proof:

$x_{k+1} = x_k - \alpha_k \nabla f(x_k)$, where $\alpha_k = \argmin_{\alpha \geq 0} f(x_k - \alpha \nabla f(x_k))$.

So for all $\alpha \neq \alpha_k$, $\phi_k(\alpha) = f(x_k - \alpha \nabla f(x_k)) \geq \phi_k(\alpha_k)$.

$\phi_k'(0) = -\big(\nabla f(x_k - 0 \cdot \nabla f(x_k))\big)^\top \nabla f(x_k) = -\|\nabla f(x_k)\|^2 < 0$.

So there exists $\omega > 0$ such that $\phi_k(\bar{\alpha}) < \phi_k(0)$ for all $\bar{\alpha} \in (0, \omega]$, and hence $f(x_{k+1}) = \phi_k(\alpha_k) \leq \phi_k(\bar{\alpha}) < \phi_k(0) = f(x_k)$.
# Conjugate Direction Method

# Concepts and Properties

First, the definition of conjugacy of two vectors: $p, q \in \mathbb{R}^n$ are $Q$-conjugate if $p^\top Q q = 0$. If $Q = I_n$, $Q$-conjugacy reduces to orthogonality.

Let $Q$ be a symmetric positive definite matrix. If $d^0, d^1, \dots, d^k$ are nonzero and pairwise $Q$-conjugate, then these vectors are linearly independent.

# Properties

For a quadratic objective, the conjugate direction method reaches the optimum in at most $n$ steps.

# Improvement (Conjugate Gradient)

1. $k := 0$; initialize $x_0$ and $g_0 = \nabla f(x_0)$. If $g_0 = 0$, stop; else set $d_0 = -g_0$.
2. $\alpha_k = -\dfrac{g_k^\top d_k}{d_k^\top Q d_k}$, $x_{k+1} = x_k + \alpha_k d_k$.
3. $g_{k+1} = \nabla f(x_{k+1})$. If $g_{k+1} = 0$, stop.
4. $\beta_k = \dfrac{g_{k+1}^\top Q d_k}{d_k^\top Q d_k}$.
5. $d_{k+1} = -g_{k+1} + \beta_k d_k$.
6. $k := k + 1$; go to step 2.

The improved method does not require the conjugate directions to be given in advance; it constructs them on the fly, in a manner similar to Gram-Schmidt orthogonalization. To prove its correctness, it suffices to show that the vectors it generates are indeed pairwise $Q$-conjugate.
# Constrained Optimization

# Lagrange's Theorem (FONC)

If $x^\star$ is a local minimizer of $f$ subject to $h(x) = 0$ and is a regular point of these constraints, then there exists $\lambda^\star \in \mathbb{R}^m$ such that $\nabla f(x^\star) + \nabla h(x^\star)^\top \lambda^\star = 0$.

# Inequality Constraints

$$
\begin{aligned}
& \min && f(x) && f : \mathbb{R}^n \to \mathbb{R} \\
& s.t. && h(x) = 0 && h : \mathbb{R}^n \to \mathbb{R}^m, \; m \leq n \\
& && g(x) \leq 0 && g : \mathbb{R}^n \to \mathbb{R}^p
\end{aligned}
$$
# Concepts

If an inequality constraint $g_i(x) \leq 0$ holds with equality at $x^\star$, the constraint is called active there; otherwise it is inactive.

Suppose $x^\star$ satisfies the constraints, i.e. $h(x^\star) = 0$ and $g(x^\star) \leq 0$. Let $J(x^\star) := \{i \mid g_i(x^\star) = 0\}$. If the gradients $\nabla h_i(x^\star)$ together with $\nabla g_j(x^\star)$, $j \in J(x^\star)$, are linearly independent, then $x^\star$ is called regular.
# KKT Theorem (FONC)

Let $f, h, g \in C^1$, and let $x^\star$ be a regular point and a local minimizer subject to the constraints above. Then there exist $\lambda^\star \in \mathbb{R}^m$ and $\mu^\star \in \mathbb{R}^p$ satisfying:

1. $\dfrac{\partial L}{\partial x_i} = 0$ for all $i$, where $L = f(x) + {\lambda^\star}^\top h(x) + {\mu^\star}^\top g(x)$ (the Lagrangian stationarity equation);
2. $h(x^\star) = 0$, $g(x^\star) \leq 0$ (the constraints themselves);
3. $\mu^\star \geq 0$ (dual feasibility);
4. ${\mu^\star}^\top g(x^\star) = 0$ (complementary slackness).

Together these four are called the KKT conditions.

Solving this system of conditions directly is far from easy; one usually starts from the equality conditions and reduces the number of unknowns by case analysis on which constraints are active, then solves.
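Not from the notes: the KKT conditions checked on an assumed toy problem, $\min x_1^2 + x_2^2$ s.t. $1 - x_1 - x_2 \leq 0$. By hand, stationarity $2x - \mu(1,1)^\top = 0$ together with the active constraint $x_1 + x_2 = 1$ gives $x^\star = (1/2, 1/2)$ and $\mu^\star = 1$; scipy's constrained solver reproduces this.

```python
import numpy as np
from scipy.optimize import minimize

# min x1^2 + x2^2  s.t.  x1 + x2 >= 1  (scipy "ineq" means fun(x) >= 0)
res = minimize(lambda x: x @ x, x0=[0.0, 0.0],
               constraints=[{"type": "ineq", "fun": lambda x: x[0] + x[1] - 1}])
x = res.x
mu = 2 * x[0]            # multiplier recovered from the stationarity equation
print(x, mu)             # approx (0.5, 0.5) and 1.0
```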
# Dynamic Programming

"You have all learned this before, so we won't cover it."