Primal-Dual Interior Point Algorithm

A.4 Primal-Dual Interior Point Algorithm

This section provides some details on the primal-dual interior point algorithm used by MIPS and described in [8, 43].

A.4.1 Notation

For a scalar function $n f : ℝ → ℝ$ of a real vector $[ ]T X = x1 x2 ⋅⋅⋅ xn$ , we use the following notation for the ﬁrst derivatives (transpose of the gradient):

∂f-- [ ∂f- ∂f- -∂f ] fX = ∂X = ∂x1 ∂x2 ⋅⋅⋅ ∂xn .

(A.14)

The matrix of second partial derivatives, the Hessian of $f$ , is:

⌊ ⌋ ( ) ∂2f2 ⋅⋅⋅ -∂2f-- ∂2f ∂ ∂f T | ∂x.1 . ∂x1∂.xn | fXX = ∂X2--= ∂X-- ∂X-- = |⌈ .. .. .. |⌉ . --∂2f- ⋅⋅⋅ ∂2f ∂xn∂x1 ∂x2n

(A.15)

For a vector function $F : ℝn → ℝm$ of a vector $X$ , where

[ ]T F (X ) = f1(X ) f2(X ) ⋅⋅⋅ fm (X )

(A.16)

the ﬁrst derivatives form the Jacobian matrix, where row $i$ is the transpose of the gradient of $fi$

⌊ ⌋ ∂∂fx1 ⋅⋅⋅ ∂∂fx1 ∂F-- | ..1 .. ..n | FX = ∂X = ⌈ . . . ⌉ . ∂∂fxm- ⋅⋅⋅ ∂∂fxm- 1 n

(A.17)

In these derivations, the full 3-dimensional set of second partial derivatives of $F$ will not be computed. Instead a matrix of partial derivatives will be formed by computing the Jacobian of the vector function obtained by multiplying the transpose of the Jacobian of $F$ by a vector $λ$ , using the following notation

( ) FXX (λ ) = -∂-- FX Tλ . ∂X

(A.18)

Please note also that $[A]$ is used to denote a diagonal matrix with vector $A$ on the diagonal and $e$ is a vector of all ones.

A.4.2 Problem Formulation and Lagrangian

The primal-dual interior point method used by MIPS solves a problem of the form:

min f (X ) X

(A.19)

subject to

where the linear constraints and variable bounds from (A.4) and (A.5) have been incorporated into $G (X )$ and $H (X )$ . The approach taken involves converting the $ni$ inequality constraints into equality constraints using a barrier function and vector of positive slack variables $Z$ .

[ ∑ni ] min f(X ) − γ ln(Zm ) X m=1

(A.22)

subject to

As the parameter of perturbation $γ$ approaches zero, the solution to this problem approaches that of the original problem.

For a given value of $γ$ , the Lagrangian for this equality constrained problem is

ni γ T T ∑ ℒ (X, Z,λ,μ ) = f(X ) + λ G (X ) + μ (H (X ) + Z ) − γ ln(Zm ). m=1

(A.26)

Taking the partial derivatives with respect to each of the variables yields:

ℒ γX(X, Z, λ,μ) = fX + λTGX + μTHX (A.27 ) γ T T −1 ℒ Z(X, Z, λ,μ) = μ − γe [Z] (A.28 ) ℒγλ(X, Z, λ,μ) = GT (X ) (A.29 ) γ T T ℒμ(X, Z, λ,μ) = H (X ) + Z . (A.30 )

And the Hessian of the Lagrangian with respect to $X$ is given by

ℒγ (X, Z, λ,μ) = f + G (λ) + H (μ ). XX XX XX XX

(A.31)

A.4.3 First Order Optimality Conditions

The ﬁrst order optimality (Karush-Kuhn-Tucker) conditions for this problem are satisﬁed when the partial derivatives of the Lagrangian above are all set to zero:

F(X, Z, λ,μ) = 0 (A.32 ) Z > 0 (A.33 ) μ > 0 (A.34 )

where

⌊ γ T ⌋ ⌊ T T T ⌋ ℒ X fX + GX λ + HX μ F (X, Z,λ, μ) = || [μ ]Z − γe || = || [μ]Z − γe || . ⌈ G (X ) ⌉ ⌈ G (X ) ⌉ H (X ) + Z H (X ) + Z

(A.35)

A.4.4 Newton Step

The ﬁrst order optimality conditions are solved using Newton’s method. The Newton update step can be written as follows:

⌊ ⌋ ΔX [ ]| ΔZ | FX FZ Fλ F μ |⌈ |⌉ = − F(X, Z, λ,μ) Δ λ Δ μ

(A.36)

⌊ γ T T ⌋ ⌊ ⌋ ⌊ γ T ⌋ ℒXX 0 GX HX ΔX ℒX || 0 [μ] 0 [Z ] || || ΔZ || || [μ]Z − γe || ⌈ GX 0 0 0 ⌉ ⌈ Δ λ ⌉ = − ⌈ G (X ) ⌉ . H I 0 0 Δ μ X H (X ) + Z

(A.37)

This set of equations can be simpliﬁed and reduced to a smaller set of equations by solving explicitly for $Δ μ$ in terms of $ΔZ$ and for $ΔZ$ in terms of $ΔX$ . Taking the 2^nd row of (A.37) and solving for $Δ μ$ we get

Solving the 4^th row of (A.37) for $ΔZ$ yields

Then, substituting (A.38) and (A.39) into the 1^st row of (A.37) results in

where

and

Combining (A.40) and the 3^rd row of (A.37) results in a system of equations of reduced size:

[ T ][ ] [ ] M GX ΔX = − N . GX 0 Δλ − G (X )

(A.45)

The Newton update can then be computed in the following 3 steps:

1.: Compute $ΔX$ and $Δ λ$ from (A.45).
2.: Compute $ΔZ$ from (A.39).
3.: Compute $Δμ$ from (A.38).

In order to maintain strict feasibility of the trial solution, the algorithm truncates the Newton step by scaling the primal and dual variables by $αp$ and $αd$ , respectively, where these scale factors are computed as follows:

resulting in the variable updates below.

The parameter $ξ$ is a constant scalar with a value slightly less than one. In MIPS, $ξ$ is set to 0.99995.

In this method, during the Newton-like iterations, the perturbation parameter $γ$ must converge to zero in order to satisfy the ﬁrst order optimality conditions of the original problem. MIPS uses the following rule to update $γ$ at each iteration, after updating $Z$ and $μ$ :

ZT-μ- γ ← σ ni

(A.52)

where $σ$ is a scalar constant between 0 and 1. In MIPS, $σ$ is set to 0.1.

[prev] [prev-tail] [front] [up]