# 算法基础笔记0x03 —— 线性排序

Comparison sorts

Beyond comparison sorts

## Counting Sort

• Assumption: 输入的值为0~k的整数
• The algorithm runs in Θ(n) time when k is small.
• 用大小为k的数组C存储每个数值出现的次数，eg:C[i] = c means there a c inputs with value i.
• 值i出现C[i]次
• i出现在0,1,…,i-1后
• There are $\sum_{j=0}^{i-1}C[j]$ values 0,1,…,i-1.
• So the first i occurs at $\sum_{j=0}^{i-1}C[j]+1$ , last i occurs at $\sum_{j=0}^{i}C[j]$.
• 计算完C后，再计算前缀和/prefix sumC’
• $C'[i]=\sum_{j=0}^{i}C[j]$. $C'[-1] = 0$.
• Value i occurs in indices $[C'[i-1]+1, C'[i]]$ of output.
• Eg: If C = [2, 1, 3, 0, 0, 1], then output is 0,0,1,2,2,2,5.

### Pseudocode and Complexity

• reset all counts: O(k)
• counts occurrences of each value: O(n)
• computes prefix sum: O(k)
• scatter inputs to output positions: O(n)
• overall: Θ(n+k)
• If k = O(n), the complexity is Θ(n).

### Stability

• If two inputs are equal, then their order in the output is the same as their order in the input.
• This is why we iterated through A in reverse order when producing B.
• Stability is necessary when counting sort is used as a subroutine in other sorts, e.g. radix sort.

• Sort digit by digit.
• Two types
• MSD = Most Significant Digit first, then second most significant, etc.
• LSD = Least Significant Digit first, then second least significant, etc.
• We focus on LSD, which is easier to implement.
• Take list of input values, sort them on the singles digit.
• Take the new list, sort values on the tens digit.
• Take the new list, sort values on hundreds digit. Etc.
• The sorting algorithm for each digit must be stable.
• We will use counting sort.
• It’s stable.
• Since we sort a digit at a time, the values being sorted are between 0 and 9.
• Sorting n inputs takes O(n) time.

### Correctness

• Lemma 1 Let x and y be two inputs to LSD radix sort.
• Let k be the most significant digit on which they differ.
• Suppose k’th digit of x is less than k’th digit of y.
• After sorting the k’th digit (in nondecreasing order), x will come before y in the remainder of the execution.

### Complexity

• Lemma 2 Suppose we sort n d-digit numbers, where each digit is between 0 to k-1. Then radix sort takes Θ(d(n+k)) time.
• Sort each digit by counting sort, Θ(n+k), so the total time is Θ(d(n+k)).
• n d-digit binary number: Θ(dn).
• Lemma 3 Given n b-bit binary numbers and r <= b. Radix sort takes $\Theta(\frac{b}{r}(n+2^r))$ time.
• Break the b bits into blocks of r digits, having values between 0 and $2^r − 1$.
• There are $d=\lceil b/r\rceil$ such blocks.
• We can think of each number as a d digit number, where each digit has value between 0 and $2^r-1$.
• The lemma follows from Lemma 2.
• Lemma 4 Setting $r=min(\lfloor logn \rfloor, b)$ minimizes the running time $\Theta(\frac{b}{r}(n+2^r))$.
• If $b<\lfloor logn \rfloor$, then for any $r\leq b$, we have $n+2^r=\Theta(n)$. So we set $r=b$ to minimize $b/r$.
• If $b\geq\lfloor logn \rfloor$, then setting $r=\lfloor logn \rfloor$ makes the running time $\Theta(\frac{bn}{logn})$.
• If $r>\lfloor logn \rfloor$, then $2^r > n$, and $2^r$ in numerator increases faster than r in denominator, so running time increases.
• If $r<\lfloor logn \rfloor$, then $b/r$ term increases, but $n+2^r$ remains $\Theta(n)$.

## Bucket Sort

• Bucket sort works for real value inputs in [0, 1).
• Can shift and scale inputs so they always fall in this range.
• Ex: If inputs all between 0 to 10, can divide them by 11.
• If inputs are randomly distributed, then bucket sort runs in Θ(n).
• If inputs are skewed, e.g. mostly very small, then bucket sort can be much slower.
• Create n buckets, where i’th bucket holds values in [i/n, (i+1)/n).
• Distribute each input value to corresponding bucket.
• Each bucket is a linked list, so can hold multiple values.
• Sort values in each bucket, using e.g. insertion sort.
• Iterate through the buckets to produce sorted output.

### Complexity

• Distributing input values to buckets: $\Theta(n)$.
• Suppose the i’th bucket has $n_i$ items. Then sorting all buckets using insertion sort use $\sum_{i=1}^{n}O(n_{i}^2)$.
• Total time: $T(n)=\Theta(n)+\sum_{i=1}^{n}O(n_{i}^2)$.
• Assume the input values are uniformly random in [0,1). Then we look at the expected running time.

Expected running time

• Linearity of expectations If $X_1$,…,$X_k$ are random variables, then $E[\sum_{i=1}^{k}X_i]=\sum_{i=1}^{k}E[X_i]$.
• Expectation of product distribution If X,Y are independent random variables, then E[XY] = E[X]E[Y].
• Lemma For any i, we have $E[n_{i}^2]=2-1/n$. Then we have expected running time is $\Theta(n)$.
• Let $X_{ij}=1$ if j’th input value falls in i’th bucket, and 0 otherwise.
• $n_i=\sum_{j=1}^{n}X_{ij}$ is number of inputs in i’th bucket.
• For $j \neq k$, $X_{ij}$ and $X_{ik}$ are independent.