A.4 Primal-Dual Interior Point Algorithm

This section provides some details on the primal-dual interior point algorithm used by MIPS and described in [843].

A.4.1 Notation

For a scalar function      n
f : ℝ →  ℝ  of a real vector      [                 ]T
X =    x1  x2  ⋅⋅⋅ xn   , we use the following notation for the first derivatives (transpose of the gradient):

      ∂f--   [ ∂f-  ∂f-      -∂f ]
fX =  ∂X  =    ∂x1  ∂x2  ⋅⋅⋅ ∂xn   .
(A.14)

The matrix of second partial derivatives, the Hessian of f  , is:

                              ⌊                    ⌋
                  (     )         ∂2f2   ⋅⋅⋅  -∂2f--
       ∂2f      ∂    ∂f  T    |   ∂x.1   .    ∂x1∂.xn |
fXX =  ∂X2--=  ∂X-- ∂X--   =  |⌈    ..     ..    ..   |⌉ .
                                --∂2f-  ⋅⋅⋅   ∂2f
                                ∂xn∂x1        ∂x2n
(A.15)

For a vector function F : ℝn →  ℝm  of a vector X  , where

        [                             ]T
F (X  ) =  f1(X )  f2(X )  ⋅⋅⋅ fm (X )
(A.16)

the first derivatives form the Jacobian matrix, where row i  is the transpose of the gradient of fi

            ⌊                ⌋
               ∂∂fx1  ⋅⋅⋅  ∂∂fx1
      ∂F--  |   ..1   ..    ..n |
FX =  ∂X  = ⌈   .     .   .  ⌉ .
               ∂∂fxm- ⋅⋅⋅  ∂∂fxm-
                 1         n
(A.17)

In these derivations, the full 3-dimensional set of second partial derivatives of F  will not be computed. Instead a matrix of partial derivatives will be formed by computing the Jacobian of the vector function obtained by multiplying the transpose of the Jacobian of F  by a vector λ  , using the following notation

               (     )
FXX  (λ ) = -∂-- FX Tλ  .
           ∂X
(A.18)

Please note also that [A]  is used to denote a diagonal matrix with vector A  on the diagonal and e  is a vector of all ones.

A.4.2 Problem Formulation and Lagrangian

The primal-dual interior point method used by MIPS solves a problem of the form:

min f (X )
 X
(A.19)

subject to

pict

where the linear constraints and variable bounds from (A.4) and (A.5) have been incorporated into G (X )  and H (X )  . The approach taken involves converting the ni  inequality constraints into equality constraints using a barrier function and vector of positive slack variables Z  .

    [          ∑ni        ]
min   f(X ) − γ    ln(Zm )
  X            m=1
(A.22)

subject to

pict

As the parameter of perturbation γ  approaches zero, the solution to this problem approaches that of the original problem.

For a given value of γ  , the Lagrangian for this equality constrained problem is

                                                       ni
  γ                       T          T                ∑
ℒ  (X, Z,λ,μ ) = f(X ) + λ G (X ) + μ (H (X ) + Z ) − γ   ln(Zm ).
                                                      m=1
(A.26)

Taking the partial derivatives with respect to each of the variables yields:

ℒ γX(X, Z, λ,μ)  =   fX + λTGX  +  μTHX                (A.27 )
  γ                  T     T    −1
ℒ Z(X, Z, λ,μ)  =   μ  − γe  [Z]                      (A.28 )
 ℒγλ(X, Z, λ,μ)  =   GT (X  )                           (A.29 )
  γ                   T        T
 ℒμ(X, Z, λ,μ)  =   H  (X ) + Z .                     (A.30 )
And the Hessian of the Lagrangian with respect to X  is given by
ℒγ  (X, Z, λ,μ) = f    + G    (λ) + H   (μ ).
 XX                XX      XX        XX
(A.31)

A.4.3 First Order Optimality Conditions

The first order optimality (Karush-Kuhn-Tucker) conditions for this problem are satisfied when the partial derivatives of the Lagrangian above are all set to zero:

F(X, Z, λ,μ) = 0                            (A.32 )

          Z  > 0                            (A.33 )
           μ > 0                            (A.34 )
where
                ⌊      γ T    ⌋   ⌊    T       T       T   ⌋
                     ℒ X             fX  + GX   λ + HX  μ
F (X, Z,λ, μ) = ||  [μ ]Z − γe  || = ||       [μ]Z  − γe       || .
                ⌈    G (X  )   ⌉   ⌈         G (X )         ⌉
                  H  (X ) + Z              H (X ) + Z
(A.35)

A.4.4 Newton Step

The first order optimality conditions are solved using Newton’s method. The Newton update step can be written as follows:

                     ⌊      ⌋
                       ΔX
[                   ]|  ΔZ  |
  FX   FZ   Fλ  F μ  |⌈      |⌉ =  − F(X, Z, λ,μ)
                        Δ λ
                        Δ μ
(A.36)

⌊   γ            T     T ⌋ ⌊       ⌋     ⌊      γ T    ⌋
   ℒXX    0  GX     HX        ΔX               ℒX
||   0    [μ]   0     [Z ] || ||  ΔZ   ||     ||  [μ]Z −  γe ||
⌈  GX     0    0      0  ⌉ ⌈  Δ λ  ⌉ = − ⌈    G (X )   ⌉ .
   H      I    0      0       Δ μ
     X                                      H (X ) + Z
(A.37)

This set of equations can be simplified and reduced to a smaller set of equations by solving explicitly for Δ μ  in terms of ΔZ  and for ΔZ  in terms of ΔX  . Taking the 2nd row of (A.37) and solving for Δ μ  we get

pict

Solving the 4th row of (A.37) for ΔZ  yields

pict

Then, substituting (A.38) and (A.39) into the 1st row of (A.37) results in

pict

where

pict

and

pict

Combining (A.40) and the 3rd row of (A.37) results in a system of equations of reduced size:

[         T  ][      ]   [          ]
  M    GX       ΔX     =     − N     .
  GX     0       Δλ        − G (X )
(A.45)

The Newton update can then be computed in the following 3 steps:

1.
Compute ΔX  and Δ λ  from (A.45).
2.
Compute ΔZ  from (A.39).
3.
Compute Δμ  from (A.38).

In order to maintain strict feasibility of the trial solution, the algorithm truncates the Newton step by scaling the primal and dual variables by αp  and αd  , respectively, where these scale factors are computed as follows:

pict

resulting in the variable updates below.

pict

The parameter ξ  is a constant scalar with a value slightly less than one. In MIPS, ξ  is set to 0.99995.

In this method, during the Newton-like iterations, the perturbation parameter γ  must converge to zero in order to satisfy the first order optimality conditions of the original problem. MIPS uses the following rule to update γ  at each iteration, after updating Z  and μ  :

       ZT-μ-
γ ←  σ  ni
(A.52)

where σ  is a scalar constant between 0 and 1. In MIPS, σ  is set to 0.1.