QuickSort推导 - The Science of Programming
@(学习笔记)[读书笔记, Programming]
由于近日开发单片机程序的时候,写到一个小循环,不能确定其正确性,进而想到这本书的内容,算是一个小小的回顾。本书的读书笔记参考这篇The Science of Programming。
正所谓温故而知新,这次重新摘读又有些新的收获。本想以QuickSort作为一个练习来运用本书的知识,结果尝试了许久,发现越陷越深。最后还是找了书上的例子(书中有QuickSort的例子)。本文最终目标是阐述如何推导QuickSort程序本身。先摘取一些重要的知识点,再利用这些知识点来推导QuickSort。
知识点梳理
1. Quantification(量化)[原书4.2节]
- There exists
(E i: m\(\leq\)i<n: Ei) - For all
(A i: m\(\leq\)i<n: Ei) - Numerial
(N i: m\(\leq\)i<n: Ei)
他们有以下特性
(E i: m\(\leq\)i<k+1: Ei) = (E i: m\(\leq\)i<k: Ei) V Ek+1
(A i: m\(\leq\)i<n: Ei) = \(\daleth\)(E i: m\(\leq\)i<n: \(\daleth\)Ei)
(E i: m\(\leq\)i<n: Ei) = (N i: m\(\leq\)i<n: Ei) \(\geq\) 1
(A i: m\(\leq\)i<n: Ei) = (N i: m\(\leq\)i<n: Ei) = n-m
2. 何为wp?[原书第7章]
原书的定义:
The Predicate Transformer wp
The set of all states such that execution of S begun in any one of them is guaranteed to terminate in a finite amount of time in a state satisfying R.
wp通常被用来表达一段程序
\[\{Q\}S\{R\}\]
或者这样写
\[Q\Rightarrow wp(S,R)\]
- Q:前条件
- S:执行代码(原文称为commands)
- R:后条件
wp的特性
1) wp(S,F) = F
2) wp(S ^ Q) ^ wp(S,R) = wp(S,Q ^ R )
3) if Q then wp(S, Q ) \(\Rightarrow\) wp(S, R)
4) wp(S wp(S, R)) \(\Rightarrow\) wp(S ,Q v R)
5) wp(S, Q) V wp(S, R) = wp(S, Q V R) (for deterministic S)
3. 循环检查 (原书第11章)
定义
\[ wp(DO, R) \Rightarrow (E k: 0\leq k: H_k(R))\]
用数学语言标记一个循环如下:
{Q}
{inv P: the invariant}
{bound t: the bound function}
{init IS: initialization command}
do _B_1 \(\rightarrow\) _S_1
\(\square\ \cdots\)
\(\square\ \) _B_n \(\rightarrow\) _S_n
od
Invariant
A predicate P that is true before and after each iteration of a loop.
这就是说P在任何情况下都要成立,于是有
- Q \(\Rightarrow\) wp(IS, P)
- P \(\Rightarrow\) wp(DO, P ^ \(\daleth\)BB)
Bound
又叫bound function
The function changes at each iteration; the relation remains invariantly true.
each iteration decreases t by at least one,so that termination is guaranteed to occur.
检查bound函数确保循环能结束。
Checklist
非常重要!!!
1. Q \(\Rightarrow\) wp(IS, P)
2. P ^ Bi \(\Rightarrow\) wp(Si, P)
3. P ^ \(\daleth\)BB \(\Rightarrow\) R
4. P ^ BB \(\Rightarrow\) (t > 0)
5. P ^ Bi \(\Rightarrow\) wp(t1:=t; Si, t < t1)
依次验证这5条,就能证明一个循环的正确性了
4. 如何运用所有的知识来开发一段程序(原书PART III)
推测Command
通过后条件R,来推测执行代码(Command)。这就是所谓的Programming as a Goal-Oriented Activity。详情参看原书第14章。
Strategy for developing an alternative command: To invent a guarded command,find a command C whose execution will establish postcondition R in at least some cases; find a Boolean B satisfying B\(\Rightarrow\)wp(C,R) and put them together to form B \(\rightarrow\)C. Continue to invent guarded commands until the precondition of the construct implies that at least one guard is true. - 原文(14.7)
获得Guard(原书第15章)
所有的编程都应该是结果导向的(这与测试驱动开发TDD的思想暗合)。所以后条件R是最先有的。
其次,通过P \(\wedge\ \daleth\)BB \(\Rightarrow\) R能得到BB(即guard)。虽然有公式A\(\Rightarrow\)B <=> \(\daleth\)A V B=T,但是B的获得感觉更多是依靠分析而不是推导。具体例子会在本文最后推导Quick Sort中提及。
其中
\[ \mathbf B \mathbf B = B_1 \wedge \ B\_2 \wedge \ldots \ B\_n\]
不过由上式可知,此时的guard还是个总的guard,还需要将之分解为各个Bi。
获得不变量Invariant(原书第16章)
书中给了4种方法:
1. Delete a conjunct. For example, predicate A ^ B ^ C can be weakened to A ^ C
2. Replace a constant by a variable. For example, predicate x \(\leq\)b[1:10], where x is a simple variable,can be weakened to x \(\leq\) b[1:i] ^ 1\(\leq i \leq\)10,where i is a fresh variable. Since a new
variable has been introduced, its possible set of values must be precisely stated.
3. Enlarge the range of a variable. For example, predicate 5 \(\leq\) i < 10 can be weakened to 0 \(\leq\) i < 10.
4. Add a disjunct. For example, predicate A can be weakened to A v B, for some other predicate B.
书中推荐用前三个,根据阅读本书后面部分的经验,最常用的还是第二条。
推导Quick Sort (原书18.2节)
本文以推导快速排序作为结束。
快速排序的核心思想就是不断分割,在小数组中递归的作排序,直到小数组的元素个数少于或等于2个,即可进行直接排序了。所以分成Partition和Sort两部分。
Partition (原书练习16.5)
一次分割的后条件如下
R: \(m\leq p<n \wedge perm(b, B) \wedge b\)
由R获得P
通过上一节中的方法二Replace a constant by a variable.来得到P。
P: \(m<q \leq p+1 \leq n \wedge x=B[m] \wedge \)
由P \(\wedge \daleth\)BB \(\Rightarrow\)R推算BB
注意是推算,不是严格的逻辑推导
\(\because\ \)P \(\wedge\) (p-q+1=0) \(\Leftrightarrow\) R
\(\therefore\ \daleth\)BB = (p-q+1=0)
\(\therefore\ \)BB = (p\(\neq\)q-1)
Bound function
t = p-q+1
推测command
由QuickSort的基本算法可知,有三个可能的command
1. q++
2. p—
3. swap(b[p], b[q])
所以现在程序形式如下:
do p\(\neq\)q-1
if__ \(\rightarrow\) p:=p-1
\(\square\) __ \(\rightarrow\) q:=q+1
\(\square\) __ \(\rightarrow\) swap(b[p], b[q])
fi
od
由command+P+BB获得Guard
P\(\wedge\)BB\(\wedge B_i \Rightarrow\)wp(S,P)
1. S = p:=p-1
wp(S,P) = \((mx\]
2. S = q:=q+1
过程类似
wp(S,P)=\(m<q+1\leq p+1 \leq n \wedge\)
对于前半部分的证明如下:
P中有\(mx\]
依次分割程序写作:
do p\(\neq\)q-1
if b[p]>x \(\rightarrow\) p:=p-1
\(\square\) b[q]\(\leq\) x \(\rightarrow\) q:=q+1
\(\square\) (b[q]\(\leq\) x) \(\wedge\) (b[p]>x)\(\rightarrow\) swap(b[p], b[q]); p,q:=p-1,q+1
fi
od
验证
Q: T
R: \(m\leq p \leq n \wedge perm(b, B) \wedge b\)
P: \(m0)
BB = (p\(\neq\)q-1) \(\Leftrightarrow\) p-q+1\(\neq\)0
又由P有
q\(\leq\)p+1 \(\Leftrightarrow\) p-q+1\(\geq\)0
\(\therefore\) t=p-q+1>0
得证
5. P\(\wedge B_i \Rightarrow\) wp(\(t_1=t;S_i\), t<t1)
对于三种\(S_i\),t都显然会变小,所以满足
综上,5条check point都满足,程序正确性可保证
Sort
推导
根据快速排序算法可知,对数组排序,即进行若干次分割,直至子区间包含\(\leq\)2个元素,即可以进行直接排序。
* P: s is a set of pairs (i, j) representing disjoint array sections b[i:j] of b. Further,b[0:n—1] is ordered iff all the disjoint partitions given by set s are.
* IS: s:={(0,n-1)}
* R: b[0:n-1] are ordered
* t: n - ordered
快速排序程序如下:
s:={(0,n-1)}
{Invariant: 如上P}
do s\(\neq\){} \(\rightarrow\) Choose((i,j),s); s:=s-{(i,j)};
if j-i < 2 \(\rightarrow\) Sort b[i:j] directly
\(\square\) j-i \(\geq\)2 \(\rightarrow\) Partition(b,i,j,p;); \(s:=s\cup {(i,p-1)} \cup {(p+1,j)}\)
fi
od
C代码
最终Quick Sort C代码实现如下:
#define N 50
#define MAX_SPLIT 50 // This should be logN
void swap(int* b, int p, int q)
{
b[p]=b[p]+b[q];
b[q]=b[p]-b[q];
b[p]=b[p]-b[q];
}
int Partition (int* b, int m, int n)
{
assert(n>m);
int p=n-1, q=m+1;
while (p != q-1)
{
if (b[p]>x)
{
p--;
}
else if (b[q]<=x)
{
q++;
}
else if (b[q]<=x && b[p]>x)
{
swap(b, p ,q);
p--;
q++;
}
}
return p;
}
void QuickSort(int* b, int n)
{
typedef struct
{
int start;
int end;
}Pair s[MAX_SPLIT] = {0};
int count = 1; // how many pairs in s
s[0].start = 0;
s[0].end = n-1;
while (count > 0)
{
Pair p;
memcpy(&p, &s[count-1], sizeof(Pair));
if(p.end - p.start < 2)
{
if (p.start > p.end)
{
swap (b, p.start, p.end);
count --;
}
else
{
int m = p.start;
int n = p.end - p.start + 1;
int x = Partition(b, m, n);
s[count-1].start = p.start;
s[count-1].end = x-1;
s[count].start = x+1;
s[count].end = p.end;
count ++;
}
}
}
}