Empirical Algorithmics (Spring 2008)
ICT International Doctorate School, Università degli Studi di Trento
Notes by Holger H. Hoos, University of British Columbia

---------------------------------------------------------------------------------
Module 5: Algorithms for optimisation problems
---------------------------------------------------------------------------------

5.1 Introduction

Many real-world tasks involve finding solutions to a given problem that
optimise certain criteria.

Here: assume that there is only one optimisation criterion
  (the more general case of multi-criteria or multi-objective optimisation will be briefly mentioned
  in Module 6)

Def: Optimisation problem
Given: Input data (e.g., graph G) and objective function f (e.g., size of a given clique in G)
Objective: Output optimal objective function value (e.g., maximum size of any clique in G)
(Equivalently: Output a solution with optimal objective function value.)

Examples:
- Travelling Salesperson Problem (finding shortest round-trips in graphs)
- Vehicle Routing (and other logistics problems)
- scheduling: given a set of resources R, a set of tasks T with resource requirements, 
	what is the shortest time in which all tasks in T can be accomplished?

Note: Minimisation vs. maximisation problems - can be easily translated into each other
[ask students: how?]
In the following, consider minimisation problems (without loss of generality)
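
One standard translation is to negate the objective function: maximising f is
equivalent to minimising -f. A minimal Python sketch; the solution names and
objective values are made up for illustration:

```python
# Hypothetical objective values f(s) for four candidate solutions
# of a maximisation problem (e.g., clique sizes).
clique_sizes = {"s1": 3, "s2": 5, "s3": 4, "s4": 2}

# Maximising f ...
best_max = max(clique_sizes, key=lambda s: clique_sizes[s])

# ... is equivalent to minimising g = -f.
negated = {s: -f for s, f in clique_sizes.items()}
best_min = min(negated, key=lambda s: negated[s])

# Both formulations select the same solution; the optimal
# objective values differ only in sign.
print(best_max, best_min, clique_sizes[best_max], -negated[best_min])
```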

The objective function value of a given solution s is also called _solution quality_ of s

(Note: for minimisation problems, solution quality values are minimised - somewhat counterintuitive)


Def: Optimisation algorithm
An optimisation algorithm is an algorithm that takes as an input an instance of a given optimisation
problem and returns a solution quality (and in practice, typically also the corresponding solution)

Examples:
- branch & bound algorithms for integer programming
- stochastic local search algorithms for the TSP
- efficient algorithms for finding shortest paths, minimum spanning trees in graphs
...

An optimisation algorithm A is called
- exact (or complete) iff it returns for each problem instance within bounded time 
	the optimal solution quality
- an r-approximation algorithm iff it returns for each problem instance a solution quality within 
	a constant factor r of the optimum

Some important concepts related to solution quality:
- approximation ratio r = max{q/q*, q*/q}, where q is the solution quality achieved for a given 
	problem instance and q* is the optimal solution quality of that instance
	(note: r >= 1 always, with r = 1 iff q = q*)
- relative solution quality  q' :=(r-1)*100 (= percent deviation from optimum) where
	r is the approximation ratio
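
The two quantities above can be computed directly from q and q*. The following
Python sketch assumes strictly positive solution qualities (otherwise the ratios
are not meaningful); the function names and the example values are illustrative:

```python
def approximation_ratio(q: float, q_opt: float) -> float:
    """r = max(q/q*, q*/q); assumes q and q* are both positive."""
    if q <= 0 or q_opt <= 0:
        raise ValueError("solution qualities must be positive")
    return max(q / q_opt, q_opt / q)


def relative_solution_quality(q: float, q_opt: float) -> float:
    """q' = (r - 1) * 100, the percent deviation from the optimum."""
    return (approximation_ratio(q, q_opt) - 1.0) * 100.0


# Hypothetical TSP instance: optimal tour length 100, tour found has length 105.
r = approximation_ratio(105.0, 100.0)
q_rel = relative_solution_quality(105.0, 100.0)
print(r, q_rel)  # r is about 1.05, q' is about 5 percent
```

Taking the maximum of both ratios makes the same definition usable for
minimisation and maximisation problems.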

Every optimisation problem has associated decision problems:
given bound on objective function value, decide whether for given problem instance
that bound can be met (or exceeded)

Note:
- associated decision problems are useful for analysing an algorithm’s ability 
  to find optimal or close-to-optimal solutions, or solutions deemed feasible
  / good enough in a given application context 
- can be studied using exactly the same techniques as those
  used for decision algorithms.

General issue for the analysis of optimisation algorithms: 
trade-off between run-time and solution quality

---
5.2 Deterministic optimisation algorithms without error

--
Single algorithm, single problem instance:

simple approach: solution quality (sq) for fixed run-time (rt) and/or rt for fixed sq bound
better: SQT curves = plot of best (rel) solution quality seen at any time t (y axis) vs. t

An SQT curve completely characterises the behaviour of an optimisation algorithm
on a given problem instance.
In particular: shows trade-off between solution quality and run-time 

[draw illustration]


How to measure SQT curves:

Throughout run of the algorithm, record (q,t) at any time t when a solution has been 
found that is better than any other solution seen so far in this run
(= new incumbent solution), i.e., track improvements in incumbent solution quality
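
A minimal Python sketch of this recording scheme. It assumes minimisation and,
to keep the example deterministic, measures run-time in search steps rather than
CPU time; the function name and the sequence of visited qualities are illustrative:

```python
from typing import Iterable, List, Tuple


def record_sqt(qualities: Iterable[float]) -> List[Tuple[int, float]]:
    """Return the (t, q) pairs at which the incumbent improved.

    `qualities` yields the quality of the candidate solution visited at
    each step of one run (minimisation); t is the step index, starting at 1.
    """
    trace: List[Tuple[int, float]] = []
    incumbent = float("inf")
    for t, q in enumerate(qualities, start=1):
        if q < incumbent:          # strictly better than anything seen so far
            incumbent = q
            trace.append((t, q))   # new incumbent: record (t, q)
    return trace


# Hypothetical sequence of solution qualities visited during one run.
visited = [12.0, 15.0, 11.0, 11.0, 9.0, 10.0, 7.0]
trace = record_sqt(visited)
print(trace)
```

Only improvements are stored, so the trace stays small even for long runs; the
SQT curve is the step function through the recorded points.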

--
Multiple algorithms, single problem instance:

Compare SQT curves

Def: alg A dominates alg B on instance i iff 
(1) \forall t: sq_A(t) <= sq_B(t) 
(2) \exists t: sq_A(t) < sq_B(t) 

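Can be easily checked graphically:
A dominates B if the SQT curve of A never lies above that of B and lies strictly below it
for at least one t (for minimisation problems, lower solution quality values are better)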
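
The two conditions of the definition can also be checked programmatically. The
sketch below assumes each SQT curve is given as a list of (time, incumbent
quality) pairs with increasing times, and treats the quality before the first
recorded solution as infinite; the names `sq_at` and `dominates` and the example
traces are illustrative:

```python
import math
from typing import List, Tuple

Trace = List[Tuple[float, float]]  # (time, incumbent quality), time increasing


def sq_at(trace: Trace, t: float) -> float:
    """Best solution quality seen up to time t (inf if none yet)."""
    best = math.inf
    for time, q in trace:
        if time > t:
            break
        best = q
    return best


def dominates(trace_a: Trace, trace_b: Trace) -> bool:
    """True iff A is never worse than B and strictly better at some time.

    SQT curves are step functions, so it suffices to compare them at the
    times where either curve changes.
    """
    times = sorted({t for t, _ in trace_a} | {t for t, _ in trace_b})
    never_worse = all(sq_at(trace_a, t) <= sq_at(trace_b, t) for t in times)
    sometimes_better = any(sq_at(trace_a, t) < sq_at(trace_b, t) for t in times)
    return never_worse and sometimes_better


# Hypothetical SQT traces of three algorithms on the same instance.
a = [(1, 10.0), (3, 6.0), (8, 5.0)]
b = [(2, 10.0), (5, 7.0), (9, 5.0)]
c = [(1, 8.0), (10, 7.0)]
print(dominates(a, b), dominates(b, a))  # A dominates B, not vice versa
print(dominates(a, c), dominates(c, a))  # crossing curves: neither dominates
```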

[draw illustration]

Crossing SQTs -> ?
[ask students; A: preferable algorithm depends on run-time]

--
Empirical analysis on instance sets:

general approach:
(somewhat analogous to RTD-based comparative analysis of LVAs:)

- analyse SCDs for multiple solution quality bounds 
  and/or distribution of (rel) solution quality over inst set for multiple run-time bounds 
  + more detailed analysis  for carefully selected individual instances (SQT curves)
- correlation analysis: 
  - run-time vs. instance properties / parameter settings: 
    as in the empirical analysis of decision algorithms, for multiple solution quality bounds
  - solution quality vs. instance properties / parameter settings
    for multiple run-time bounds
  - performance of multiple algorithms w.r.t. solution quality (for fixed RT)
    or run-time (for fixed SQ bound)
- domination relation between two algorithms A,B on given instance set
    partition into three subsets (analogous to case of LVAs) according 
    to domination relation between A,B

Note: 
- use relative solution qualities (for comparability across sets)
- typically, the solution quality bounds used are optimal + slightly suboptimal (per instance)

General issue: How to deal with instances of optimisation problems for which (provably)
  optimal solution qualities are not known?

[ask students]

1. use theoretically determined bounds (e.g., Held-Karp bound for TSP)
2. use empirically determined bounds (often obtained from high-performance heuristic algorithms)

Important: ensure that bounds used in lieu of provably optimal solution qualities are 
  as tight as possible
  -> when using heuristic algorithms to determine these, use state-of-the-art methods
      with very high run-times 
      (often: better solution found -> allow same run-time again)
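
One possible reading of this rule of thumb: whenever extending the run yields a
better solution, grant as much run-time again as has been used so far, and stop
once an extension brings no improvement. The Python sketch below assumes a
hypothetical interface `run_for(total_steps)` that returns the best solution
quality found within the first `total_steps` steps; the simulated run is made up
for illustration:

```python
from typing import Callable


def tighten_bound(run_for: Callable[[int], float], initial_steps: int,
                  max_steps: int) -> float:
    """Keep extending the run while the incumbent keeps improving.

    `run_for(total_steps)` is a hypothetical interface that returns the best
    solution quality found within the first `total_steps` steps of a run.
    """
    total = initial_steps
    best = run_for(total)
    while total * 2 <= max_steps:
        candidate = run_for(total * 2)   # allow the same run-time again
        total *= 2
        if candidate >= best:            # no improvement: stop extending
            break
        best = candidate
    return best


# Simulated run: quality of the incumbent after each of 16 steps.
incumbent_after = [20, 18, 18, 15, 15, 15, 15, 14] + [14] * 8
bound = tighten_bound(lambda n: min(incumbent_after[:n]), 1, 64)
print(bound)
```

The bound obtained this way is only an empirical upper bound on the optimum
(for minimisation); it is not a proof of optimality.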

---
5.3 Randomised optimisation algorithms 

Many high-performance heuristic optimisation algorithms are randomised.

Examples:
- stochastic local search algorithms for the TSP, MAX-CLIQUE, scheduling problems, ...
- efficient randomised algorithms for finding shortest paths, minimum spanning trees in graphs, ...

How to analyse these empirically?
[ask students]

Key idea: extend methodology for empirically analysing 
  deterministic optimisation algorithms / LVAs

Note: For a given randomised optimisation algorithm on problem instance i,
- run-time for reaching (or exceeding) given solution quality bound is a random variable
- likewise, solution quality reached within given run-time bound.

=> The behaviour of a randomised opt alg (without error) on a given problem instance i
is completely and uniquely characterised by its bivariate run-time distribution (RTD),
i.e., joint probability of reaching or exceeding a given solution quality bound within
a given run-time.

-> bivariate RTD plots = 2-dim surface rtd(t,q) = P_s(RT <= t & SQ <= q) = CDF 
  of bivariate random variable (RT,SQ)

[slide]

qualified RTDs (QRTDs) = RTDs for given solution quality bound:
  qrtd_q(t) = P_s(RT <= t & SQ <= q)

[slides]

solution quality distribution (SQD) = distribution of solution quality for given run-time bound:
  sqd_t(q) = P_s(RT <= t & SQ <= q)

[slides]

Note: QRTDs and SQDs correspond to orthogonal vertical cuts through the respective bivariate RTD

Notes: 
- especially for relatively short runs on instances of hard combinatorial optimisation  problems: 
	SQDs are often approx. normally distributed
	-> can use specialised tests for normal distributions
	This does not hold for RTDs / QRTDs.
- for sufficiently long run-times, improvement in mean solution
	quality is often accompanied by decrease in solution quality
	variability.

SQT curves: SQD stats (such as median, quantiles, ...) as a function of run-time
  -> these correspond to horizontal cuts through a bivariate RTD (contour lines)
  and characterise the trade-off between solution quality and run-time for a given problem instance
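
A minimal Python sketch of an SQT curve based on the median of the SQD at each
time point. It assumes every run is given as a solution quality trace of (time,
incumbent quality) pairs, and that a run has infinite solution quality before it
finds its first solution; all names and the three example traces are illustrative:

```python
import math
from statistics import median
from typing import List, Tuple

Trace = List[Tuple[float, float]]  # (time, incumbent quality), time increasing


def sq_at(trace: Trace, t: float) -> float:
    """Best solution quality seen in one run up to time t (inf if none yet)."""
    best = math.inf
    for time, q in trace:
        if time > t:
            break
        best = q
    return best


def median_sqt(traces: List[Trace], times: List[float]) -> List[Tuple[float, float]]:
    """Median over runs of the incumbent solution quality, for each time in `times`."""
    return [(t, median(sq_at(tr, t) for tr in traces)) for t in times]


# Hypothetical solution quality traces from three independent runs.
runs = [
    [(1, 10.0), (4, 7.0), (9, 5.0)],
    [(2, 12.0), (5, 6.0)],
    [(1, 9.0), (8, 8.0)],
]
curve = median_sqt(runs, [1, 2, 5, 10])
print(curve)
```

Other SQD statistics (quantiles, mean) can be substituted for the median in the
same way to obtain a family of SQT curves.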

[slides]

Note:
- SQT curves are widely used to illustrate the trade-off between
	run-time and solution quality for a given OLVA.
- but: Important aspects of an algorithm’s run-time behaviour
	may be easily missed when basing an analysis solely on
	a single SQT curve.

--
How to measure bivariate RTD for algorithm A on problem instance i:

- Perform a number of independent runs of A on i (with some cut-off time t')
- For each run of the algorithm, track improvements in incumbent solution quality
  -> solution quality traces
- let sq(t,j) denote the best solution quality seen in run j up to time t;
  cumulative empirical RTD of A on i is defined by rtd(t,q) = #{j | sq(t,j) <= q}/k,
  where k is the number of runs performed
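
The definition of the empirical bivariate RTD can be sketched in Python as
follows. The sketch assumes the solution quality traces are lists of (time,
incumbent quality) pairs and that sq(t,j) is infinite before run j finds its
first solution; the function names and the three example traces are illustrative:

```python
import math
from typing import List, Tuple

Trace = List[Tuple[float, float]]  # (time, incumbent quality), time increasing


def sq_at(trace: Trace, t: float) -> float:
    """sq(t, j): best solution quality seen in one run up to time t."""
    best = math.inf
    for time, q in trace:
        if time > t:
            break
        best = q
    return best


def empirical_rtd(traces: List[Trace], t: float, q: float) -> float:
    """rtd(t, q) = #{j | sq(t, j) <= q} / k, with k = number of runs."""
    k = len(traces)
    return sum(1 for tr in traces if sq_at(tr, t) <= q) / k


# Hypothetical solution quality traces from k = 3 independent runs.
runs = [
    [(1, 10.0), (4, 7.0), (9, 5.0)],
    [(2, 12.0), (5, 6.0)],
    [(1, 9.0), (8, 8.0)],
]

# QRTD for quality bound q = 7: fix q, vary t.
qrtd = [empirical_rtd(runs, t, 7.0) for t in (1, 5, 10)]
# SQD for run-time bound t = 10: fix t, vary q.
sqd = [empirical_rtd(runs, 10, q) for q in (5.0, 6.0, 8.0)]
print(qrtd, sqd)
```

Fixing q or t in `empirical_rtd` gives the empirical QRTD and SQD, respectively.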

--
Comparative analysis: Can be based on concept of probabilistic domination
- simple generalisation from same concept for LVAs; intuitively, A dominates B
  on problem instance i iff the bivariate RTD surface of A is above that of B (in a CDF plot) 
- intersecting RTD surfaces -> relative performance depends on run-time / solution quality 

Note: (so far) bivariate RTDs are rarely analysed in practice; instead, use combination
  of QRTDs, SQDs, SQTs.

Key issues:
- study trade-off between solution quality and run-time
- do not base comparison on one fixed run-time bound or one fixed solution quality bound only
  (unless the bound comes directly from a real-world application)
- do not confuse sources of variation (randomisation within algorithm, variation between problem instances)

---
5.4 Essentially incomplete (= inexact) optimisation algorithms

The notions of completeness, probabilistic approximate completeness and essential
incompleteness (introduced for LVAs in Module 3) can be analogously defined
for randomised optimisation algorithms, where they refer to the ability of 
an algorithm to find optimal solutions of a given problem instance.

Many randomised optimisation algorithms are complete or PAC,
but some are essentially incomplete. 

Bivariate RTDs capture the behaviour of the algorithm in all cases,
but are typically difficult to analyse directly.

Def: asymptotic SQD of A on i = distribution approached by the SQDs of A on i
	in the limit as run-time -> infinity

Note: 
- For complete and PAC algorithms, the asymptotic SQDs are degenerate 
	distributions that concentrate all probability on the optimal solution quality.
- For an essentially incomplete algorithm A (such as Iterative Improvement), the asymptotic 
	SQDs are non-degenerate distributions.
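
A toy illustration in Python of a non-degenerate asymptotic SQD. It assumes a
made-up one-dimensional landscape in which the neighbours of position p are p-1
and p+1, and it runs Iterative Improvement once from every start position as a
stand-in for uniform random initialisation; none of these choices come from the
notes:

```python
from collections import Counter
from typing import Dict, List


def iterative_improvement(values: List[int], start: int) -> int:
    """Descend to a local minimum; neighbours of position p are p-1 and p+1."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(values)]
        if not neighbours:
            return values[pos]
        best = min(neighbours, key=lambda p: values[p])
        if values[best] >= values[pos]:
            return values[pos]      # no improving neighbour: local minimum
        pos = best


# Hypothetical objective values over a line of 8 candidate solutions.
landscape = [5, 3, 4, 6, 2, 7, 1, 8]

# One run from every start position stands in for uniform random initialisation.
finals = [iterative_improvement(landscape, s) for s in range(len(landscape))]
asymptotic_sqd: Dict[int, float] = {
    q: n / len(finals) for q, n in sorted(Counter(finals).items())
}
print(asymptotic_sqd)
```

Since each run stops in a local minimum, additional run-time changes nothing:
the optimal quality 1 is reached in only some of the runs, and the remaining
probability mass stays on the suboptimal qualities 2 and 3.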

When analysing essentially incomplete optimisation algorithms
- SQD analysis still works
- asymptotic SQD analysis can be very useful
- SQT analysis still works
- for algorithms with termination criteria that lead to different termination times between runs
  on the same problem instance,  it can be also useful to study distribution of termination times 
  over multiple runs.

Note: Analysis of complete/PAC algorithms with prematurely terminated runs
is mostly analogous to that of essentially incomplete algorithms (as for LVAs).

Note: Independently of essential incompleteness, considerations related to stagnation,
  restarts, and parallelisation by means of multiple independent runs apply as in the case
  of LVAs.


---
learning goals:
- be able to explain the concept of optimisation problems and deterministic optimisation algorithms
- be able to explain and apply appropriate methods for the empirical analysis of trade-offs between
  run-time and solution quality
- be able to explain the following concepts and the way they are related to each other: 
  objective function, (absolute) solution quality, relative solution quality, approximation ratio
- be able to explain two general methods for dealing with problem instances whose optimal 
  solution quality is unknown
- be able to explain the concepts of domination and probabilistic domination and apply them
  in the comparative analysis of optimisation algorithms
- be able to explain the concept of decision problems associated with an optimisation problem 
  and how this can be used in the empirical analysis of optimisation algorithms
- be able to explain the following concepts and the way they are related to each other: 
  bivariate RTD, SQT, QRTD, SQD
- be able to measure empirical bivariate RTDs
- be able to explain the concept of an asymptotic SQD and its application to 
  PAC and essentially incomplete optimisation algorithms
- be able to outline the general approach for empirically analysing essentially incomplete optimisation
  algorithms

<eof>