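[Algorithm] Permutations and next_permutation

An arrangement takes M elements out of N distinct elements and places them in a definite order; the number of such arrangements is usually written A(M,N). When M=N it is called a full permutation (Permutation). Mathematically, the number of full permutations is A(N,N)=N*(N-1)*...*2*1=N!. From a programming point of view, though, how do we enumerate them all? We need a fixed order in which to produce the next permutation one at a time, and the usual choice is ascending (lexicographic) order.

For example, for the set A={1,2,3}, the first permutation is a1: 1,2,3; the next one is a2: 1,3,2; continuing in this order, all permutations of A are:

a1: 1,2,3;  a2: 1,3,2;  a3: 2,1,3;  a4: 2,3,1;  a5: 3,1,2;  a6: 3,2,1;  6 in total.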

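1) Next Permutation

Given any permutation, if we can compute the one that follows it, enumerating all permutations becomes easy. Fortunately, <algorithm> in the STL already provides a robust and efficient way to do this, which is described below.

Suppose the current permutation is A: 3,7,6,2,5,4,3,1. The next permutation is found as follows: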

/** Tips: next permutation based on ascending order
 * sketch :
 * current: 3   7  6  2  5  4  3  1  .
 *                    |  |     |     |
 *          find i----+  j     k     +----end
 * swap i and k :
 *          3   7  6  3  5  4  2  1  .
 *                    |  |     |     |
 *               i----+  j     k     +----end
 * reverse j to end :
 *          3   7  6  3  1  2  4  5  .
 *                    |  |     |     |
 *               i----+  j     k     +----end
 * */

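The procedure is:

a) Scanning from the back, find the first adjacent pair (i,j) with A[i] < A[j].

It follows that [j,end) is in descending (non-increasing) order: if some adjacent pair inside [j,end) were increasing, the backward scan would have stopped there first.

b) In [j,end), find the k whose A[k] is the smallest element still greater than A[i].

Since A[i] < A[j], at least one such k exists. Because [j,end) is descending, scanning from the back and stopping at the first k with A[i] < A[k] gives exactly that k.

c) Swap A[i] and A[k].

Position i now holds the smallest element greater than the old A[i]. The next permutation must be the immediate successor of the current one in ascending order, so the replacement for A[i] has to be as small as possible.

After the swap, [j,end) is still in descending order: every element in (k,end) is no greater than the old A[i], and every element in [j,k) is no less than A[k], hence greater than the old A[i].

d) Reverse [j,end).

[j,end) is in descending order at this point, so reversing it yields the smallest possible suffix, and the result is the next permutation.

Note: if step a) finds no such adjacent pair, i.e. the scan reaches i=begin, then [begin,end) is entirely in descending order and there is no next permutation. The STL handles this case by reversing the range back into ascending order.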
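Steps a) to d) can be sketched with plain indices on a std::vector<int>. This is only an illustration under stated assumptions, not the STL's code: the function name next_perm_steps is hypothetical, and std::swap / std::reverse stand in for the swap and reversal described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: applies steps a) to d) to v.
// Returns true if a next permutation exists; otherwise resets v to
// ascending order and returns false.
bool next_perm_steps(std::vector<int>& v) {
    if (v.size() < 2) return false;

    // a) scan from the back for the last adjacent pair (i, j) with v[i] < v[j]
    std::size_t j = v.size() - 1;
    while (j > 0 && !(v[j - 1] < v[j])) --j;
    if (j == 0) {                          // whole range is non-increasing
        std::reverse(v.begin(), v.end());
        return false;
    }
    std::size_t i = j - 1;

    // b) scan from the back for the last k with v[i] < v[k]
    std::size_t k = v.size() - 1;
    while (!(v[i] < v[k])) --k;
    assert(k >= j);                        // guaranteed because v[i] < v[j]

    // c) swap positions i and k
    std::swap(v[i], v[k]);

    // d) reverse the still non-increasing suffix [j, end)
    std::reverse(v.begin() + j, v.end());
    return true;
}
```

Applied to 3,7,6,2,5,4,3,1 this picks i at the 2, j at the 5 and k at the second 3, matching the sketch above.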

2) Next Permutation code

// STL next permutation base idea
// iter_swap and reverse are assumed to come from C++ <algorithm>
#include <algorithm>
using std::iter_swap;
using std::reverse;

int next_permutation(int *begin, int *end)
{
	int *i=begin, *j, *k;
	if (i==end || ++i==end) return 0;	// 0 or 1 element, no next permutation
	for (i=end-1; i!=begin;) {
		j = i--;	// scan backwards for the last increasing pair (i,j)
		if (!(*i < *j)) continue;
		// find the last k with *i < *k
		for (k=end; !(*i < *(--k)););
		iter_swap(i,k);
		// now the range [j,end) is in descending order
		reverse(j,end);
		return 1;
	}
	// the whole range is in descending order: reset it to ascending
	reverse(begin,end);
	return 0;
}

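The above is only the core idea of the STL next_permutation. The original is written with C++ iterators; it was changed to a C pointer version here to make it easier to follow.

A return value of 1 means the next permutation was found; 0 means there is none. Note that if [begin,end) is in descending order, the enumeration has finished, and the reversal restores the range to ascending order.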
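The return-value convention can be checked with a small sketch. It uses std::next_permutation from <algorithm>, which returns bool where the pointer version above returns 1 or 0; the helper name count_until_wrap is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: counts the permutations from the current state of v
// up to the last one (inclusive), and leaves v reset to ascending order.
int count_until_wrap(std::vector<int>& v) {
    int visited = 1;                        // the current permutation itself
    while (std::next_permutation(v.begin(), v.end())) {
        ++visited;                          // true: v now holds the next permutation
    }
    // false: v was the last (descending) permutation and has been reversed
    assert(std::is_sorted(v.begin(), v.end()));
    return visited;
}
```

Starting from 2,1,3 (a3 in the list at the top), the function visits a3 through a6, so it reports 4 and leaves the vector as 1,2,3.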

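3) Using next_permutation

How do we obtain all permutations? The STL design is neat: the return value of next_permutation tells us whether the enumeration has finished (without it the loop would never terminate). Printing all permutations of a given array takes only the following: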

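// Display all permutations
#include <algorithm>	// sort (and std::next_permutation, if the pointer version is not used)
#include <cstdio>	// printf
using namespace std;

void all_permutation(int arr[], int n)
{
	sort(arr,arr+n);	// start from ascending order, the first permutation
	do{
		for(int i=0; i<n; ++i) printf("%d ",arr[i]);
		printf("\n");
	}while(next_permutation(arr,arr+n));
}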

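Does next_permutation still work correctly when the array arr[] contains duplicate elements? Look at the two comparisons in the code of section 2, "!(*i < *j)" and "!(*i < *(--k))": equal elements fail the "<" test, so the search simply moves on past them. Duplicate elements are skipped this way, and so are duplicate permutations (for example, 1,2,2 is produced once rather than twice). Some would argue that "*i>=*j" is clearer, and for a basic type such as int that would be fine. But when the element is a user-defined type, such as a C struct or a C++ class, how are two elements compared? And, taking a step back, how are they sorted? The STL aims to be robust, efficient and elegant: to sort a user-defined type you supply a function pointer or a functor (function object), and all it needs is a way to evaluate "a<b" (such as less(a,b)). "a>b" can be rewritten as "b<a"; "a==b" as "!(a<b) && !(b<a)"; "a>=b" as "!(a<b)". So a custom comparator generally only has to provide less() (in C++, an overloaded operator<).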
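A minimal sketch of both points, assuming a made-up element type: Card, its fields and the helper names below are hypothetical and only serve the illustration. The type provides operator< and nothing else, the other comparisons are derived from it as described above, and std::next_permutation visits each distinct ordering once even when elements compare equal.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical user-defined type: only operator< is provided.
struct Card {
    int rank;
    char suit;   // ignored by the ordering, so equal ranks are "duplicates"
};
bool operator<(const Card& a, const Card& b) { return a.rank < b.rank; }

// Comparisons derived from operator< alone.
bool greater_than(const Card& a, const Card& b) { return b < a; }
bool equivalent(const Card& a, const Card& b)   { return !(a < b) && !(b < a); }
bool not_less(const Card& a, const Card& b)     { return !(a < b); }

// Counts the permutations visited when starting from sorted order.
int count_permutations(std::vector<Card> cards) {
    std::sort(cards.begin(), cards.end());
    int count = 0;
    do {
        ++count;
    } while (std::next_permutation(cards.begin(), cards.end()));
    assert(std::is_sorted(cards.begin(), cards.end()));  // wrapped back to the start
    return count;
}
```

For ranks 1,2,2 the count is 3 (1,2,2; 2,1,2; 2,2,1) rather than 3! = 6, because the two cards of rank 2 are equivalent under operator<.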

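With full permutations in hand, the arrangement problem A(M,N) is half solved: choose M elements from A, then generate every permutation of those M elements. The first step is a combination (Combination), written C(M,N); interested readers can work it out themselves.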
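One possible way to put the two steps together is sketched below. It is an assumption of this example, not something taken from the article's source: combinations are enumerated by permuting a 0/1 selection mask with std::prev_permutation, and each chosen subset is then run through std::next_permutation. The function name arrangements is hypothetical, and the input elements are assumed to be distinct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: all arrangements of m elements taken from items
// (items are assumed to be distinct).
std::vector<std::vector<int>> arrangements(std::vector<int> items, std::size_t m) {
    std::vector<std::vector<int>> result;
    if (m > items.size()) return result;
    std::sort(items.begin(), items.end());

    // Selection mask with m leading 1s; prev_permutation walks through
    // every way of placing the m ones, i.e. every m-element subset.
    std::vector<int> mask(items.size(), 0);
    std::fill(mask.begin(), mask.begin() + m, 1);
    do {
        std::vector<int> chosen;
        for (std::size_t i = 0; i < items.size(); ++i)
            if (mask[i]) chosen.push_back(items[i]);
        assert(chosen.size() == m);
        // chosen is ascending because items is sorted, so this loop
        // visits all m! orders of the subset.
        do {
            result.push_back(chosen);
        } while (std::next_permutation(chosen.begin(), chosen.end()));
    } while (std::prev_permutation(mask.begin(), mask.end()));
    return result;
}
```

For items 1,2,3,4 and m=2 this yields 4*3 = 12 arrangements, starting with 1,2 and 2,1.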

4) Previous permutation (prev_permutation)

The STL provides a counterpart to next_permutation:

// STL prev permutation base idea
int prev_permutation(int *begin, int *end)
{
	int *i=begin, *j, *k;
	if (i==end || ++i==end) return 0;	// 0 or 1 element, no prev permutation
	for (i=end-1; i!=begin;) {
		j = i--;	// scan backwards for the last decreasing pair (i,j)
		if (!(*i > *j)) continue;
		// find the last k with *i > *k
		for (k=end; !(*i > *(--k)););
		iter_swap(i,k);
		// now the range [j,end) is in ascending order
		reverse(j,end);
		return 1;
	}
	// current is in ascending order
	reverse(begin,end);
	return 0;
}

It is not discussed in detail here.
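As a brief usage sketch, prev_permutation can enumerate all permutations from the last one down to the first, mirroring the do/while loop of section 3. The example uses std::prev_permutation from <algorithm>; the helper name all_permutations_descending is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: lists every permutation of v, from the
// lexicographically last one down to the first.
std::vector<std::vector<int>> all_permutations_descending(std::vector<int> v) {
    // Start from descending order, which is the last permutation.
    std::sort(v.begin(), v.end(), std::greater<int>());
    std::vector<std::vector<int>> out;
    do {
        out.push_back(v);
    } while (std::prev_permutation(v.begin(), v.end()));
    // On false the range has been reversed back to descending order.
    assert(std::is_sorted(v.begin(), v.end(), std::greater<int>()));
    return out;
}
```

For 1,2,3 the order is a6, a5, ..., a1 from the list at the top, that is 3,2,1 first and 1,2,3 last.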

5) Analysis of the STL next_permutation source

As said above, the STL is robust, efficient and elegant. Let us take next_permutation as an example:

// STL next_permutation
template <class BidirectionalIterator>
bool next_permutation(
	BidirectionalIterator first,		// iterator, works like a C pointer
	BidirectionalIterator last
	)
{
	if(first == last) return false;		// no element

	BidirectionalIterator i = first;
	if(++i == last) return false;		// only one element

	i = last;
	--i;								// --i rather than i--, why?

	for(;;) {	// loop with no condition; why is the i == first test below not placed here?
		BidirectionalIterator j = i;	// not written as j=i--; why?
		--i;
		// find the last adjacent pair (i,j) with *i < *j
		if(*i < *j) {
			BidirectionalIterator k = last;
			while(!(*i < *--k));		// find the last k with *i < *k
			iter_swap(i, k);			// swap i and k
			reverse(j, last);			// reverse [j,last)
			return true;
		}

		if(i == first) {
			reverse(first, last);		// current is in descending order
			return false;
		}
	}
}

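The STL first checks whether the range is empty (first == last) and, if so, returns false at once, since there is no next permutation. Could this check be swapped with the single-element check "if(++i == last)"? Clearly not: ++i must not be applied to an empty range. Could it be swapped with the declaration "BidirectionalIterator i = first;"? That would not change the result, but for an empty range it would add the construction and destruction of an iterator object. It shows how intensely the STL pursues efficiency.

Next, right after "i = last;", the code uses "--i;" rather than "i--;". In short, the former decrements and then yields the value, while the latter yields the old value and then decrements. The result here is the same either way, but the two are implemented differently. For "i--", the value of i is first copied to a temporary, i is then decremented, and finally the temporary is returned; for "--i", i is decremented and then returned. Conceptually "--i" takes two operations where "i--" takes three, so prefer "--i" whenever it will do. (PS: compilers today are smart enough that, in a case like this, "i--" is usually compiled the same way as "--i". Still, do not count on every compiler version optimizing your code for you!)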
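The extra temporary is easiest to see with a class-type iterator. The sketch below uses a made-up type, CountingCursor, that counts its own copies; it is not an STL iterator, and the exact count for the post-decrement form may vary with copy elision, so only "at least one copy" is relied upon.

```cpp
#include <cassert>

// Hypothetical iterator-like type that counts how often it is copied.
struct CountingCursor {
    int pos;
    static int copies;

    explicit CountingCursor(int p) : pos(p) {}
    CountingCursor(const CountingCursor& other) : pos(other.pos) { ++copies; }
    CountingCursor& operator=(const CountingCursor&) = default;

    // Pre-decrement: modify in place and return a reference, no copy.
    CountingCursor& operator--() { --pos; return *this; }

    // Post-decrement: must keep the old value around to return it.
    CountingCursor operator--(int) {
        CountingCursor old(*this);   // the temporary described above
        --pos;
        return old;
    }
};
int CountingCursor::copies = 0;

// Number of copies made by one pre-decrement.
int copies_for_pre_decrement() {
    CountingCursor c(5);
    CountingCursor::copies = 0;
    --c;
    assert(c.pos == 4);
    return CountingCursor::copies;
}

// Number of copies made by one post-decrement whose result is discarded.
int copies_for_post_decrement() {
    CountingCursor c(5);
    CountingCursor::copies = 0;
    c--;
    assert(c.pos == 4);
    return CountingCursor::copies;
}
```

Because the copy constructor has a visible side effect here, the copy inside operator--(int) is still made even when the caller discards the result.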

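Note that the two statements "BidirectionalIterator j = i;" and "--i;" are not merged into the single statement "j = i--;". In the unoptimized build shown below, the two-statement form comes out one instruction shorter than the merged form: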

// C source  1                     |             2
int main(){                        |int main(){
    int i=0;                       |    int i=0;
    int j=i--;                     |    int j=i;
                                   |    --i;
    return 0;                      |    return 0;
}                                  |}
// assembly without optimization   |
_main:         1                   |_main:        2
    pushl   %ebp                   |    pushl   %ebp
    movl    %esp, %ebp             |    movl    %esp, %ebp
    andl    $-16, %esp             |    andl    $-16, %esp
    subl    $16, %esp              |    subl    $16, %esp
    call    ___main                |    call    ___main
    movl    $0, 12(%esp)           |    movl    $0, 12(%esp)
    movl    12(%esp), %eax         |    movl    12(%esp), %eax
    leal    -1(%eax), %edx         |
    movl    %edx, 12(%esp)         |    movl    %eax, 8(%esp)
    movl    %eax, 8(%esp)          |    subl    $1, 12(%esp)
    movl    $0, %eax               |    movl    $0, %eax
    leave                          |    leave
    ret                            |    ret
    .ident  "GCC: (GNU) 4.8.3"     |    .ident  "GCC: (GNU) 4.8.3"

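So, do not count on every compiler version optimizing your code for you!

Now look at the "for(;;)" statement: why not a while statement? Semantically, "while(1)" and "for(;;)" are the same, both infinite loops. But the latter is an unconditional jump that loops without testing anything, while the former carries a condition test; the test is always true, yet it costs one more machine instruction. (PS: compilers today are smart enough that both forms compile to an unconditional jump with no extra test. But again, do not count on every compiler version optimizing your code for you!)

Even so, the "if(i == first)" check at the bottom of the loop is still a condition test, so why not write it in the for statement? Leaving the unconditional jump aside, what difference does it make? On closer analysis, if the qualifying adjacent pair (i,j) is found on the 5th iteration, the "if(i == first)" check has been executed only 4 times; had it been written as the for condition, it would have been evaluated 5 times. This shows the STL source's pursuit of robustness, efficiency and elegance!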

In addition, the STL provides a version of next_permutation that takes a comparator:

template <class BidirectionalIterator,
	  class BinaryPredicate>
bool next_permutation(
	BidirectionalIterator _First,
	BidirectionalIterator _Last,
	BinaryPredicate _Comp
);

It is not analyzed here.
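A short usage sketch of the comparator overload, assuming std::next_permutation from <algorithm> with a caller-supplied ordering; the helper name permutations_in_order is hypothetical. The same comparator is used for the initial sort and for stepping, so that "first permutation" and "next" agree on what ascending means.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: lists every permutation of v in the order
// defined by comp (comp plays the role of "less").
template <class Compare>
std::vector<std::vector<int>> permutations_in_order(std::vector<int> v, Compare comp) {
    std::sort(v.begin(), v.end(), comp);   // first permutation under comp
    std::vector<std::vector<int>> out;
    do {
        out.push_back(v);
    } while (std::next_permutation(v.begin(), v.end(), comp));
    assert(std::is_sorted(v.begin(), v.end(), comp));  // wrapped back to the start
    return out;
}
```

With std::greater<int>() as the comparator, 1,2,3 is enumerated starting from 3,2,1 and ending with 1,2,3.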

Note: source code for this article: permutation : https://git.oschina.net/eudiwffe/codingstudy/tree/master/src/permutation/permutation.c

STL permutation : https://git.oschina.net/eudiwffe/codingstudy/tree/master/src/permutation/permutation_stl.cpp

posted @ 2017-01-08 01:25 eudiwffe
