QuickSort推导 - The Science of Programming

@(学习笔记)[读书笔记, Programming]

由于近日开发单片机程序的时候,写到一个小循环,不能确定其正确性,进而想到这本书的内容,算是一个小小的回顾。本书的读书笔记参考这篇The Science of Programming
正所谓温故而知新,这次重新摘读又有些新的收获。本想以QuickSort作为一个练习来运用本书的知识,结果尝试了许久,发现越陷越深。最后还是找了书上的例子(书中有QuickSort的例子)。本文最终目标是阐述如何推导QuickSort程序本身。先摘取一些重要的知识点,再利用这些知识点来推导QuickSort。

知识点梳理

1. Quantification(量化)[原书4.2节]

  • There exists
    (E i: m\(\leq\)i<n: Ei)
  • For all
    (A i: m\(\leq\)i<n: Ei)
  • Numerial
    (N i: m\(\leq\)i<n: Ei)

他们有以下特性

(E i: m\(\leq\)i<k+1: Ei) = (E i: m\(\leq\)i<k: Ei) V Ek+1
(A i: m\(\leq\)i<n: Ei) = \(\daleth\)(E i: m\(\leq\)i<n: \(\daleth\)Ei)
(E i: m\(\leq\)i<n: Ei) = (N i: m\(\leq\)i<n: Ei) \(\geq\) 1
(A i: m\(\leq\)i<n: Ei) = (N i: m\(\leq\)i<n: Ei) = n-m

2. 何为wp?[原书第7章]

原书的定义:

The Predicate Transformer wp

The set of all states such that execution of S begun in any one of them is guaranteed to terminate in a finite amount of time in a state satisfying R.

wp通常被用来表达一段程序
\[\{Q\}S\{R\}\]
或者这样写
\[Q\Rightarrow wp(S,R)\]
- Q:前条件

- S:执行代码(原文称为commands)

- R:后条件

wp的特性

1) wp(S,F) = F

2) wp(S ^ Q) ^ wp(S,R) = wp(S,Q ^ R )

3) if Q then wp(S, Q ) \(\Rightarrow\) wp(S, R)

4) wp(S wp(S, R)) \(\Rightarrow\) wp(S ,Q v R)

5) wp(S, Q) V wp(S, R) = wp(S, Q V R) (for deterministic S)

3. 循环检查 (原书第11章)

定义

\[ wp(DO, R) \Rightarrow (E k: 0\leq k: H_k(R))\]

用数学语言标记一个循环如下:

{Q}

{inv P: the invariant}

{bound t: the bound function}

{init IS: initialization command}

do _B_1 \(\rightarrow\) _S_1

\(\square\ \cdots\)

\(\square\ \) _B_n \(\rightarrow\) _S_n

od

Invariant

A predicate P that is true before and after each iteration of a loop.

这就是说P在任何情况下都要成立,于是有

- Q \(\Rightarrow\) wp(IS, P)

- P \(\Rightarrow\) wp(DO, P ^ \(\daleth\)BB)

Bound

又叫bound function

The function changes at each iteration; the relation remains invariantly true.
each iteration decreases t by at least one,so that termination is guaranteed to occur.

检查bound函数确保循环能结束。

Checklist

非常重要!!!

1. Q \(\Rightarrow\) wp(IS, P)
2. P ^ Bi \(\Rightarrow\) wp(Si, P)
3. P ^ \(\daleth\)BB \(\Rightarrow\) R
4. P ^ BB \(\Rightarrow\) (t > 0)
5. P ^ Bi \(\Rightarrow\) wp(t1:=t; Si, t < t1)

依次验证这5条,就能证明一个循环的正确性了

4. 如何运用所有的知识来开发一段程序(原书PART III)

推测Command

通过后条件R,来推测执行代码(Command)。这就是所谓的Programming as a Goal-Oriented Activity。详情参看原书第14章。

Strategy for developing an alternative command: To invent a guarded command,find a command C whose execution will establish postcondition R in at least some cases; find a Boolean B satisfying B\(\Rightarrow\)wp(C,R) and put them together to form B \(\rightarrow\)C. Continue to invent guarded commands until the precondition of the construct implies that at least one guard is true. - 原文(14.7)

获得Guard(原书第15章)

所有的编程都应该是结果导向的(这与测试驱动开发TDD的思想暗合)。所以后条件R是最先有的。
其次,通过P \(\wedge\ \daleth\)BB \(\Rightarrow\) R能得到BB(即guard)。虽然有公式A\(\Rightarrow\)B <=> \(\daleth\)A V B=T,但是B的获得感觉更多是依靠分析而不是推导。具体例子会在本文最后推导Quick Sort中提及。
其中
\[ \mathbf B \mathbf B = B_1 \wedge \ B\_2 \wedge \ldots \ B\_n\]
不过由上式可知,此时的guard还是个总的guard,还需要将之分解为各个Bi。

获得不变量Invariant(原书第16章)

书中给了4种方法:

1. Delete a conjunct. For example, predicate A ^ B ^ C can be weakened to A ^ C

2. Replace a constant by a variable. For example, predicate x \(\leq\)b[1:10], where x is a simple variable,can be weakened to x \(\leq\) b[1:i] ^ 1\(\leq i \leq\)10,where i is a fresh variable. Since a new
variable has been introduced, its possible set of values must be precisely stated.

3. Enlarge the range of a variable. For example, predicate 5 \(\leq\) i < 10 can be weakened to 0 \(\leq\) i < 10.

4. Add a disjunct. For example, predicate A can be weakened to A v B, for some other predicate B.

书中推荐用前三个,根据阅读本书后面部分的经验,最常用的还是第二条。

推导Quick Sort (原书18.2节)

本文以推导快速排序作为结束。
快速排序的核心思想就是不断分割,在小数组中递归的作排序,直到小数组的元素个数少于或等于2个,即可进行直接排序了。所以分成Partition和Sort两部分。

Partition (原书练习16.5)

一次分割的后条件如下
R: \(m\leq p<n \wedge perm(b, B) \wedge b\)

由R获得P

通过上一节中的方法二Replace a constant by a variable.来得到P。
P: \(m<q \leq p+1 \leq n \wedge x=B[m] \wedge \)

由P \(\wedge \daleth\)BB \(\Rightarrow\)R推算BB

注意是推算,不是严格的逻辑推导
\(\because\ \)P \(\wedge\) (p-q+1=0) \(\Leftrightarrow\) R
\(\therefore\ \daleth\)BB = (p-q+1=0)
\(\therefore\ \)BB = (p\(\neq\)q-1)

Bound function

t = p-q+1

推测command

由QuickSort的基本算法可知,有三个可能的command
1. q++
2. p—
3. swap(b[p], b[q])
所以现在程序形式如下:
do p\(\neq\)q-1
 if__ \(\rightarrow\) p:=p-1
 \(\square\) __ \(\rightarrow\) q:=q+1
 \(\square\) __ \(\rightarrow\) swap(b[p], b[q])
 fi
od

由command+P+BB获得Guard

P\(\wedge\)BB\(\wedge B_i \Rightarrow\)wp(S,P)
1. S = p:=p-1
  wp(S,P) = \((mx\]
2. S = q:=q+1
过程类似
  wp(S,P)=\(m<q+1\leq p+1 \leq n \wedge\)

对于前半部分的证明如下:
P中有\(mx\]
依次分割程序写作:
do p\(\neq\)q-1
 if b[p]>x \(\rightarrow\) p:=p-1
 \(\square\) b[q]\(\leq\) x \(\rightarrow\) q:=q+1
 \(\square\) (b[q]\(\leq\) x) \(\wedge\) (b[p]>x)\(\rightarrow\) swap(b[p], b[q]); p,q:=p-1,q+1
 fi
od

验证

Q: T
R: \(m\leq p \leq n \wedge perm(b, B) \wedge b\)
P: \(m0)
BB = (p\(\neq\)q-1) \(\Leftrightarrow\) p-q+1\(\neq\)0
又由P有
q\(\leq\)p+1 \(\Leftrightarrow\) p-q+1\(\geq\)0
\(\therefore\) t=p-q+1>0
得证
5. P\(\wedge B_i \Rightarrow\) wp(\(t_1=t;S_i\), t<t1)
对于三种\(S_i\),t都显然会变小,所以满足

综上,5条check point都满足,程序正确性可保证

Sort

推导

根据快速排序算法可知,对数组排序,即进行若干次分割,直至子区间包含\(\leq\)2个元素,即可以进行直接排序。
* P: s is a set of pairs (i, j) representing disjoint array sections b[i:j] of b. Further,b[0:n—1] is ordered iff all the disjoint partitions given by set s are.
* IS: s:={(0,n-1)}
* R: b[0:n-1] are ordered
* t: n - ordered

快速排序程序如下:
s:={(0,n-1)}
{Invariant: 如上P}
do s\(\neq\){} \(\rightarrow\) Choose((i,j),s); s:=s-{(i,j)};
   if j-i < 2 \(\rightarrow\) Sort b[i:j] directly
   \(\square\) j-i \(\geq\)2 \(\rightarrow\) Partition(b,i,j,p;); \(s:=s\cup {(i,p-1)} \cup {(p+1,j)}\)
   fi
od

C代码

最终Quick Sort C代码实现如下:

#define N           50
#define MAX_SPLIT   50 // This should be logN
void swap(int* b, int p, int q)
{
    b[p]=b[p]+b[q];
    b[q]=b[p]-b[q];
    b[p]=b[p]-b[q];
}
int Partition (int* b, int m, int n)
{
    assert(n>m);
    int p=n-1, q=m+1;
    while (p != q-1)
    {
        if (b[p]>x)
        {
            p--;
        }
        else if (b[q]<=x)
        {
            q++;
        }
        else if (b[q]<=x && b[p]>x)
        {
            swap(b, p ,q);
            p--;
            q++;
        }
    }
    return p;
}
void QuickSort(int* b, int n)
{
    typedef struct 
    {
        int start;
        int end;
    }Pair s[MAX_SPLIT] = {0};
    int count = 1; // how many pairs in s
    s[0].start = 0;
    s[0].end = n-1;
    while (count > 0)
    {
        Pair p;
        memcpy(&p, &s[count-1], sizeof(Pair));
        if(p.end - p.start < 2)
        {
            if (p.start > p.end)
            {
                swap (b, p.start, p.end);
                count --;
            }
            else
            {
                int m = p.start;
                int n = p.end - p.start + 1;
                int x = Partition(b, m, n);
                s[count-1].start = p.start;
                s[count-1].end = x-1;
                s[count].start = x+1;
                s[count].end = p.end;
                count ++;
            }
        } 
    }
}