算法设计与分析笔记0x07 —— 动态规划(1)

Divide and Conquer 小问题中独立解决

  • Dynamic Programming 小问题是有重叠的
    • 分隔时,父问题共享子问题

Weighted Interval Scheduling

Notation. 将工作按结束时间排序 $f_1 \leq f_2 \leq … \leq f_n$.
Def. p(j) = largest index i < j such that job i is compatible with j.
Goal: 求最大化工作权重之和

OPT(j)的公式
OPT(j) = value of optimal solution to the problem consisting
of job requests 1, 2, …, j.

  • Case 1: OPT selects job j.
    • can’t use incompatible jobs {p(j) + 1, p(j) + 2, …, j – 1}
    • must include optimal solution to problem consisting of remaining compatible jobs 1, 2, …, p(j)
  • Case 2: OPT does not select job j.
    • must include optimal solution to problem consisting of remaining compatible jobs 1, 2, …, j-1

Brute Force

Memoization

Store results of each sub-problem in a cache; lookup as needed.

Memoized version of algorithm takes O(nlogn) time.

  • Sort by finish time: O(n log n).
  • Computing p(.) : O(n) after sorting by start time

M-Compute-Opt(j) : O(n)

  • Each entry M[j] is computed only once
  • The computation of M[j] invokes M-Compute-Opt twice

Find Solution

Top-down vs. Bottom-up

Bottom-Up Version:

Top-down 有时可跳过一些不需要计算的OPT(j)
Bottom-up 省去递归额外的消耗

Segmented Least Squares

  • Given n points in the plane: $(x_1, y_1)$, $(x_2, y_2)$, …, $(x_n, y_n)$.
  • Find a line y = ax + b that minimizes the sum of the squared error(和方差)
    $$SSE=\sum_{i=1}^{n}(y_i -ax_i -b)^2$$
  • Solution, min error is achieved when
    • $$a=\frac{n\sum_i x_i y_i – \sum_i x_i \sum_i j_i}{n\sum_i x_{i}^{2} – (\sum_i x_i)^2}$$
    • $$b=\frac{\sum_i j_i – a\sum_i x_i}{n}$$

  • 用多条线段拟合, 平衡accuracy和parsimony
  • 每条线段都有误差平方和E
  • L为线段数量, c为常数, 控制二者平衡
  • 找到一系列线段,最小化E+cL

OPT(j) = minimum cost for points $p_1$, $p_2$, …, $p_j$.
e(i, j) = minimum sum of squares for points $p_i$, $p_{i+1}$, …, $p_j$.

$O(n^2)$ pairs, and $O(n)$ to calculate SSE per pair. So the total time is $O(n^3)$.

Knapsack Problem

Goal: 重量最轻(不大于限制值),价值最大

Greedy Algorithm

  • repeatedly add item with maximum value vi .
  • repeatedly add item with maximum weight wi .
  • repeatedly add item with maximum ratio vi / wi .
  • Not so good.
  • Greedy thief won’t get the most. 🙂

OPT(i,w), 只考虑前i个物品,在容量上限为w的情况下的最优解

  • Case 1: OPT does not select item i.
    • OPT selects best of { 1, 2, …, i-1 }
  • Case 2: OPT selects item i.
    • new weight limit = w – wi
    • OPT selects best of { 1, 2, …, i–1 } using this new weight limit


Running time: $\Theta(nW)$

  • exponential
  • Pseudo-polynomial
  • NP-complete — 目前未发现polynomial的算法

RNA Secondary Structure

配对碱基数目越多,能量越低
Goal: 给定碱基序列,找到配对数最大的二级结构
Second structure A set of pairs S = {(bi, bj)} that satisfy:

  • [Watson-Crick.] S is a matching and each pair in S is a Watson-Crick complement: A-U, U-A, C-G, or G-C.
  • [No sharp turns.] The ends of each pair are separated by at least 4 intervening bases. If (bi, bj)∈S, then i < j – 4.
  • [Non-crossing.] If (bi, bj) and (bk, bl) are two pairs in S, then we cannot have i < k < j < l.

Sharp turn:

Crossing:

OPT(i,j) 第i个到第j个碱基最大配对数

  • If i ≥ j – 4.
    • OPT(i, j) = 0 by no-sharp turns condition.
  • If i < j-4: take max of two cases
    • Case 1. Base bj is not involved in a pair.
      • OPT(i, j-1)
  • Case 2. Base bj pairs with bt for some i x ≤ t < j – 4.
    • Non-crossing constraint decouples resulting sub-problems
      • $1 + max_t \{ OPT(i, t-1) + OPT(t+1, j-1) \}$

Running time: $O(n^3)$.

Sequence Alignment

For two strings:

  • ocurrance
  • occurrence
    Different alignment may result different mismatches

Edit distance

  • Gap penalty $\delta$; mismatch penalty $\alpha_{pq}$.
  • Cost = sum of gap and mismatch penalties.

Application

  • Basis for Unix diff
  • Speech recognition
  • Computational biology

Goal: 给定两个字符串X=x1 x2 … xm, Y=y1 y2 … yn,找到使得cost最小的对齐方式

  • 不能出现前后交叉

Problem Structure

OPT(i,j) = min cost of strings ‘x1 x2 … xi’ and ‘y1 y2 … yj’

  • Base Case:
    • i=0, OPT(i,j)=j$\delta$
    • j=0, OPT(i,j)=i$\delta$
  • Case 1: OPT mastches xi-yj
    • 只看 OPT(i-1,j-1)
  • Case 2: OPT leaves xi unmatched
    • gap penalty for xi and OPT(i-1,j)
  • Case 3: OPT leaves yj unmatched
    • gap penalty for yj and OPT(i,j-1)

Algorithm

回溯路径树

Analysis $\Theta(mn)$ time and space

Leave a Reply

Your email address will not be published.