算法基础笔记0x03 —— 线性排序

Comparison sorts
在插入排序、归并排序和快速排序中,输出序列仅仅决定于输入数值间的比较运算。
任何比较排序算法都有一个Ω(nlogn)的下限(决策树)。

Beyond comparison sorts
为了快于Ω(nlogn),我们需要运用除了比较以外的运算。

Counting Sort

  • Assumption: 输入的值为0~k的整数
  • The algorithm runs in Θ(n) time when k is small.
  • 用大小为k的数组C存储每个数值出现的次数,eg:C[i] = c means there a c inputs with value i.
  • 值i出现C[i]次
    • i出现在0,1,…,i-1后
    • There are $\sum_{j=0}^{i-1}C[j]$ values 0,1,…,i-1.
    • So the first i occurs at $\sum_{j=0}^{i-1}C[j]+1$ , last i occurs at $\sum_{j=0}^{i}C[j]$.
  • 计算完C后,再计算前缀和/prefix sumC’
    • $C'[i]=\sum_{j=0}^{i}C[j]$. $C'[-1] = 0$.
    • Value i occurs in indices $[C'[i-1]+1, C'[i]]$ of output.
  • Eg: If C = [2, 1, 3, 0, 0, 1], then output is 0,0,1,2,2,2,5.

Pseudocode and Complexity

  • reset all counts: O(k)
  • counts occurrences of each value: O(n)
  • computes prefix sum: O(k)
  • scatter inputs to output positions: O(n)
  • overall: Θ(n+k)
  • If k = O(n), the complexity is Θ(n).

Stability

  • If two inputs are equal, then their order in the output is the same as their order in the input.
  • This is why we iterated through A in reverse order when producing B.
  • Stability is necessary when counting sort is used as a subroutine in other sorts, e.g. radix sort.

Radix Sort

  • Sort digit by digit.
  • Two types
    • MSD = Most Significant Digit first, then second most significant, etc.
    • LSD = Least Significant Digit first, then second least significant, etc.
    • We focus on LSD, which is easier to implement.
  • Take list of input values, sort them on the singles digit.
  • Take the new list, sort values on the tens digit.
  • Take the new list, sort values on hundreds digit. Etc.
  • The sorting algorithm for each digit must be stable.
  • We will use counting sort.
    • It’s stable.
    • Since we sort a digit at a time, the values being sorted are between 0 and 9.
    • Sorting n inputs takes O(n) time.

Correctness

  • Lemma 1 Let x and y be two inputs to LSD radix sort.
    • Let k be the most significant digit on which they differ.
    • Suppose k’th digit of x is less than k’th digit of y.
  • After sorting the k’th digit (in nondecreasing order), x will come before y in the remainder of the execution.

Complexity

  • Lemma 2 Suppose we sort n d-digit numbers, where each digit is between 0 to k-1. Then radix sort takes Θ(d(n+k)) time.
    • Sort each digit by counting sort, Θ(n+k), so the total time is Θ(d(n+k)).
    • n d-digit binary number: Θ(dn).
  • Lemma 3 Given n b-bit binary numbers and r <= b. Radix sort takes $\Theta(\frac{b}{r}(n+2^r))$ time.
    • Break the b bits into blocks of r digits, having values between 0 and $2^r − 1$.
    • There are $d=\lceil b/r\rceil$ such blocks.
    • We can think of each number as a d digit number, where each digit has value between 0 and $2^r-1$.
    • The lemma follows from Lemma 2.
  • Lemma 4 Setting $r=min(\lfloor logn \rfloor, b)$ minimizes the running time $\Theta(\frac{b}{r}(n+2^r))$.
    • If $b<\lfloor logn \rfloor$, then for any $r\leq b$, we have $n+2^r=\Theta(n)$. So we set $r=b$ to minimize $b/r$.
    • If $b\geq\lfloor logn \rfloor$, then setting $r=\lfloor logn \rfloor$ makes the running time $\Theta(\frac{bn}{logn})$.
      • If $r>\lfloor logn \rfloor$, then $2^r > n$, and $2^r$ in numerator increases faster than r in denominator, so running time increases.
      • If $r<\lfloor logn \rfloor$, then $b/r$ term increases, but $n+2^r$ remains $\Theta(n)$.

Bucket Sort

  • Bucket sort works for real value inputs in [0, 1).
    • Can shift and scale inputs so they always fall in this range.
    • Ex: If inputs all between 0 to 10, can divide them by 11.
  • If inputs are randomly distributed, then bucket sort runs in Θ(n).
    • If inputs are skewed, e.g. mostly very small, then bucket sort can be much slower.
  • Create n buckets, where i’th bucket holds values in [i/n, (i+1)/n).
    • Distribute each input value to corresponding bucket.
    • Each bucket is a linked list, so can hold multiple values.
    • Sort values in each bucket, using e.g. insertion sort.
    • Iterate through the buckets to produce sorted output.

Pseudocode


Complexity

  • Distributing input values to buckets: $\Theta(n)$.
  • Suppose the i’th bucket has $n_i$ items. Then sorting all buckets using insertion sort use $\sum_{i=1}^{n}O(n_{i}^2)$.
  • Total time: $T(n)=\Theta(n)+\sum_{i=1}^{n}O(n_{i}^2)$.
  • Assume the input values are uniformly random in [0,1). Then we look at the expected running time.

Expected running time

  • Linearity of expectations If $X_1$,…,$X_k$ are random variables, then $E[\sum_{i=1}^{k}X_i]=\sum_{i=1}^{k}E[X_i]$.
  • Expectation of product distribution If X,Y are independent random variables, then E[XY] = E[X]E[Y].
  • Lemma For any i, we have $E[n_{i}^2]=2-1/n$. Then we have expected running time is $\Theta(n)$.
    • Let $X_{ij}=1$ if j’th input value falls in i’th bucket, and 0 otherwise.
    • $n_i=\sum_{j=1}^{n}X_{ij}$ is number of inputs in i’th bucket.
    • For $j \neq k$, $X_{ij}$ and $X_{ik}$ are independent.

Leave a Reply

Your email address will not be published.