Linear Lists

(*) Sorting arrays
The O(n*n) algorithms include:
selection sort
bubble sort
rank sort
These three algorithms are explained below:

selection sort

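Idea: first find the largest element and move it to the end (position a[n-1]); then find the largest of the remaining n-1 elements and move it to a[n-2]; continue this way until only one element is left.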

template <class T>
void    SelectionSort(T a[], int n)
{
    T    max;
    int    idxOfMax;
    for(int i = n - 1; i > 0; --i){
        // find max in a[0..i]; a[i] itself must take part in the search
        max = a[0];
        idxOfMax = 0;
        for(int j = 1; j <= i; ++j){
            if(a[j] > max){
                max = a[j];
                idxOfMax = j;
            }
        }
        // move max to the end of the array
        swap(a[i], a[idxOfMax]);
    }
}

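The code keeps the largest element in the temporary variable max rather than reading a[idxOfMax] each time, in order to reduce address computations, because a[idxOfMax] is ultimately interpreted as *(a + idxOfMax).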
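
To make the passes described above visible, here is a minimal sketch that records the array after each pass. The helper name SelectionSortTrace and the use of std::vector<int> are assumptions made for illustration only; for brevity it indexes a[idxOfMax] directly instead of keeping a separate max variable.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original code): selection sort
// that returns a snapshot of the array after every pass.
std::vector<std::vector<int> > SelectionSortTrace(std::vector<int> a)
{
    std::vector<std::vector<int> > passes;
    for (int i = static_cast<int>(a.size()) - 1; i > 0; --i) {
        // find the index of the largest element in a[0..i]
        int idxOfMax = 0;
        for (int j = 1; j <= i; ++j)
            if (a[j] > a[idxOfMax])
                idxOfMax = j;
        // move it to the end of the unsorted part
        std::swap(a[i], a[idxOfMax]);
        passes.push_back(a);
    }
    return passes;
}
```

For the input {4,3,9,3,7} the first pass moves 9 to the end, giving {4,3,7,3,9}, and the fourth pass leaves {3,3,4,7,9}.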

bubble sort

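bubble sort is similar to selection sort: each pass of the inner loop moves the largest remaining element to the end, and the complexity is the same. The only difference is when the swaps happen.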

template <class T>
void    BubbleSort(T a[], int n)
{
    for(int i = n - 1; i >= 0; --i)
        for(int j = 0; j < i; ++j)
            if(a[j] > a[j + 1])
                swap(a[j], a[j + 1]);
}

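Because bubble sort compares adjacent elements, the array easily becomes "nearly sorted" along the way, so the improved version below may be more efficient: it stops as soon as a pass makes no swap.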

template <class T>
void    BubbleSort(T a[], int n)
{
    bool quit;
    for(int i = n - 1; i >= 0; --i){
        quit = true;
        for(int j = 0; j < i; ++j)
            if(a[j] > a[j + 1]){
                quit = false;
                swap(a[j], a[j + 1]);
            }
        if(quit)
            return;
    }
}

After the improvement the number of swaps is unchanged; only the number of comparisons goes down.
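
The claim above can be checked with a small instrumented sketch. The Counts struct and the two function names are hypothetical, and int elements in a std::vector are assumed to keep the example short.

```cpp
#include <utility>
#include <vector>

// Hypothetical counters, introduced only for this illustration.
struct Counts { long comparisons; long swaps; };

// Plain bubble sort, counting comparisons and swaps.
Counts BubbleSortPlain(std::vector<int>& a)
{
    Counts c = {0, 0};
    for (int i = static_cast<int>(a.size()) - 1; i > 0; --i)
        for (int j = 0; j < i; ++j) {
            ++c.comparisons;
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                ++c.swaps;
            }
        }
    return c;
}

// Early-exit bubble sort: stops after a pass that made no swap.
Counts BubbleSortEarlyExit(std::vector<int>& a)
{
    Counts c = {0, 0};
    for (int i = static_cast<int>(a.size()) - 1; i > 0; --i) {
        bool quit = true;
        for (int j = 0; j < i; ++j) {
            ++c.comparisons;
            if (a[j] > a[j + 1]) {
                quit = false;
                std::swap(a[j], a[j + 1]);
                ++c.swaps;
            }
        }
        if (quit)
            break;
    }
    return c;
}
```

On {1,2,3,5,4} both versions make a single swap, but the plain version makes 10 comparisons while the early-exit version makes 7.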

rank sort

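rank sort trades space for time. Suppose the array to sort is a[]. First an array rank[] records the rank of every element of a[]; then each element is placed at its ranked position in a temporary array tmp[]. Once all elements are placed, tmp[] is sorted, and finally tmp[] is copied back into a[].
rank sort is slightly more efficient than selection sort and bubble sort because it needs n(n-1)/2 comparisons but only about n element assignments. The other two need the same n(n-1)/2 comparisons, but moving elements costs more: bubble sort performs up to n(n-1)/2 swaps in the worst case, and the selection sort above can assign to max up to n(n-1)/2 times.
I don't think this small gain justifies using rank sort, especially when the elements are class objects, because the temporary array then adds construction and destruction overhead.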

rank = number of elements smaller than it + number of equal elements to its left
For example, given the array a[] = {4,3,9,3,7}, rank[] = {2,0,4,1,3}.
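
As a quick check of this definition, the following sketch transcribes it literally. The name RankByDefinition and the std::vector<int> interface are assumptions for illustration; it is not the optimized loop used in RankSort.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: rank computed straight from the definition,
// rank[i] = (#elements smaller than a[i]) + (#equal elements left of i).
std::vector<int> RankByDefinition(const std::vector<int>& a)
{
    std::vector<int> rank(a.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a.size(); ++j) {
            if (a[j] < a[i])
                ++rank[i];                    // strictly smaller, anywhere
            else if (j < i && a[j] == a[i])
                ++rank[i];                    // equal and to the left
        }
    return rank;
}
```

Because equal elements are ranked by position, every rank is distinct, so each element gets its own slot in tmp[].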
The rank sort code is as follows:

template <class T>
void    RankSort(T a[], int n)
{
    // init rank: new int[n]() value-initializes every entry to zero
    int *rank = new int[n]();

    // calc rank
    for(int i = 0; i < n; ++i){
        for(int j = 0; j < i; ++j)
            if(a[j] <= a[i])
                rank[i]++;
            else
                rank[j]++;
        /*
        for(int j = 0; j < i; ++j)
            if(a[j] <= a[i])
                rank[i]++;

        for(int j = i; j < n; ++j)
            if(a[j] < a[i])
                rank[i]++;
        */
    }

    // assign temp array
    T *tmp = new T[n];
    for(int i = 0; i < n; ++i)
        tmp[rank[i]] = a[i];

    // copy back element by element; memcpy is only safe for trivially copyable T
    for(int i = 0; i < n; ++i)
        a[i] = tmp[i];
    delete []rank;
    delete []tmp;
}

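The commented-out code is what I first wrote after reading the algorithm description; the uncommented code is the optimized version, which roughly halves the number of comparisons. The optimized version wastes no comparison: whichever way the test goes, one of the two ranks is incremented. In my version, many comparisons end with the if condition false and nothing done.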
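
The difference in comparison counts can be measured with a sketch like the one below. The function names RankNaive and RankPaired are hypothetical, and int elements are assumed; each function fills rank and returns how many element comparisons it made.

```cpp
#include <cstddef>
#include <vector>

// First version (the commented-out loops): each element is compared
// against all n elements, so n*n comparisons in total.
long RankNaive(const std::vector<int>& a, std::vector<int>& rank)
{
    long comparisons = 0;
    rank.assign(a.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < i; ++j) {
            ++comparisons;
            if (a[j] <= a[i]) ++rank[i];
        }
        for (std::size_t j = i; j < a.size(); ++j) {
            ++comparisons;
            if (a[j] < a[i]) ++rank[i];
        }
    }
    return comparisons;
}

// Optimized version: each pair (j, i) with j < i is compared once and
// the result always increments one of the two ranks.
long RankPaired(const std::vector<int>& a, std::vector<int>& rank)
{
    long comparisons = 0;
    rank.assign(a.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < i; ++j) {
            ++comparisons;
            if (a[j] <= a[i]) ++rank[i];
            else              ++rank[j];
        }
    return comparisons;
}
```

For the five-element example {4,3,9,3,7} both produce the same ranks, with 25 comparisons for the first version and 10 for the optimized one.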

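(*) Inserting into and deleting from a sorted array

Note: in this subsection, "array" always means a sorted array.
Inserting into or deleting from an array requires shifting every element to the right of the insertion or deletion point, which is costly; this is its main drawback compared with a linked list.
If the array is not very long, a simple linear scan is enough to find the insertion or deletion position.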

template <class T>
void    Insert(T a[], int &len, const T addMe)
{
    int i;
    for(i = len - 1; i >= 0 && addMe < a[i]; --i)
        a[i + 1] = a[i];
    a[i + 1] = addMe;
    ++len;
}

template <class T>
void    Delete(T a[], int &len, const T delMe)
{
    int i;
    for(i = 0; i < len && a[i] != delMe; ++i)
        ;
    if(i == len)
        return;
    for(; i < len - 1; ++i)
        a[i] = a[i + 1];
    --len;
}
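
Note that Insert writes to a[len], so the caller has to guarantee that the buffer has room for one more element. A possible guarded variant is sketched below; the name InsertChecked and the extra capacity parameter are assumptions added for illustration.

```cpp
// Hypothetical variant of Insert that refuses to write past the buffer.
// capacity is the number of elements the underlying array can hold.
template <class T>
bool InsertChecked(T a[], int &len, int capacity, const T &addMe)
{
    if (len >= capacity)
        return false;              // no room: leave the array untouched
    int i;
    for (i = len - 1; i >= 0 && addMe < a[i]; --i)
        a[i + 1] = a[i];           // shift larger elements one slot right
    a[i + 1] = addMe;
    ++len;
    return true;
}
```

With int a[4] = {1, 3, 5, 0} and len = 3, inserting 4 succeeds and yields 1 3 4 5; a further insert returns false because the buffer is full.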

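If the array is long, using binary search to locate the insertion or deletion position is more efficient. Below is a binary search that returns the index of the element if found and -1 otherwise: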

template <class T>
int    BinarySearch(T a[], int len, const T target)
{
    int left = 0;
    int right = len - 1;
    int mid;

    do{
        mid = (left + right) / 2;
        if(a[mid] == target)
            return mid;
        if(target < a[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }while(left <= right);

    return -1;
}

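The code above has a logic error! When len = 0, mid = (left + right) / 2 = (0 - 1) / 2 = 0, so the access goes out of bounds: the array is empty and a[0] does not exist. Small details like this are easy to get wrong.
Changing the do{}while loop to a while loop fixes the problem. The corrected code: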

template <class T>
int    BinarySearch(T a[], int len, const T target)
{
    int left = 0;
    int right = len - 1;
    int mid;

    while(left <= right){
        mid = (left + right) / 2;
        if(a[mid] == target)
            return mid;
        if(target < a[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }

    return -1;
}
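
The empty-array case can be exercised directly. The sketch below is a variant of the corrected function under a hypothetical name, BinarySearchSafe; it additionally computes the midpoint as left + (right - left) / 2, which gives the same index but cannot overflow int for very large arrays. That extra change is my assumption, not something the original code does.

```cpp
// Hypothetical variant of the corrected BinarySearch.
template <class T>
int BinarySearchSafe(const T a[], int len, const T &target)
{
    int left = 0;
    int right = len - 1;

    while (left <= right) {        // never entered when len == 0
        int mid = left + (right - left) / 2;
        if (a[mid] == target)
            return mid;
        if (target < a[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return -1;
}
```

Calling it with len = 0 returns -1 without touching the array, whereas the do{}while version would read a[0] first.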

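To find the position of an element to delete, binary search can be used directly, since only an element that exists can be deleted. For insertion, binary search cannot be copied as is; the code below is adapted to find a suitable insertion position: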

template <class T>
int    FindInsertPosition(T a[], int len, const T target)
{
    if(len == 0)
        return 0;

    int left = 0;
    int right = len - 1;
    int mid;

    while(left <= right){
        mid = (left + right) / 2;
        if(a[mid] == target)
            return mid;
        if(target < a[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }

    // Everything above is the same as binary search; only the line below differs
    return target > a[mid] ? mid + 1 : mid;
}
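
For comparison, the standard library offers the same kind of search as std::lower_bound. The sketch below assumes a std::vector<int> instead of a raw array, and the helper name InsertSorted is hypothetical. One difference from FindInsertPosition: std::lower_bound returns the first position whose element is not less than the value, so duplicates are inserted in front of their equals.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: insert value into an ascending vector,
// locating the position with a binary search.
void InsertSorted(std::vector<int>& v, int value)
{
    std::vector<int>::iterator pos =
        std::lower_bound(v.begin(), v.end(), value);
    v.insert(pos, value);          // shifts the tail right, like Insert
}
```

The vector grows as needed, so no separate capacity bookkeeping is required, and the empty case needs no special handling.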

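(*) Linked lists
A head node is very useful in a linked list: it removes some tedious special-case handling, and it can also hold a special value. Once there is a head node, the list should also be made circular: that costs not a single extra byte of memory, and a circular list combined with a head node gives an efficient search algorithm, with the head node acting as a sentinel.
The advantage of a doubly linked list is that inserting or deleting in the middle of the list does not require tracking a separate previous pointer, so the algorithms are simpler to write; the cost is one extra pointer per node. As long as space is not a major concern, a circular doubly linked list with a head node is the best choice.
Below is an implementation of a circular doubly linked list that uses a few practical techniques: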

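// CNode is declared with struct rather than class here.
// Two points about struct versus class:
// 1 the only language difference is the default access: struct members are public by default
// 2 for C compatibility, a struct without user-declared constructors can be initialized like: A a = {1, NULL, NULL};
//   (CNode declares constructors below, so this form does not apply to it)
// struct is used here only for demonstration; in practice CNode's members could be private with CLinkList as a friend class, for slightly better encapsulation.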
template <class T>
struct CNode
{
    T        data;
    CNode    *next;
    CNode    *prev;

    CNode(){}
    CNode(T data):data(data)
    {}
};

template <class T>
class    CLinkList
{
    CNode<T>*    head;
    int            criterion; // sort criterion: ascend or descend
public:
    enum {ascend, descend}; // enumerators are compile-time constants and take no space in each object, so no static is needed

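    // Note the declaration of the friend function below.
    // The unusual part is the <T> in front of the parameter list.
    // Without this <T> the code compiles but fails at link time, for two reasons:
    // 1 the friend declaration is then taken to refer to a non-template function
    // 2 a function template and a non-template function with the same name can coexist
    // A standard-conforming compiler also needs the operator<< template declared before this class.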
    friend ostream& operator << <T>(ostream& out, const CLinkList<T>& list);

    CLinkList(int criterion = ascend):criterion(criterion)
    {
        head = (CNode<T>*)new CNode<T>;
        head->next = head;
        head->prev = head;
    }

    ~CLinkList()
    {
        CNode<T>*    cur = head->next;
        while(cur != head)
        {
            head->next = cur->next;
            delete cur;
            cur = head->next;
        }
        delete head;
    }

    // on the right side of head, as head's next
    void InsertAsMin(T data)
    {
        CNode<T>*    node = (CNode<T>*)new CNode<T>(data);
        node->next = head->next;
        node->prev = head;
        // mind the order of pointer assignments in a doubly linked list: head->next is updated last
        head->next->prev = node;
        head->next = node;
    }

    // on the left side of head, as head's prev
    void InsertAsMax(T data)
    {
        CNode<T>*    node = (CNode<T>*)new CNode<T>(data);
        node->next = head;
        node->prev = head->prev;
        head->prev->next = node;
        head->prev = node;
    }

    static void InsertBefore(T data, CNode<T>* beforeMe)
    {
        CNode<T>*    node = (CNode<T>*)new CNode<T>(data);
        node->next = beforeMe;
        node->prev = beforeMe->prev;
        beforeMe->prev->next = node;
        beforeMe->prev = node;
    }

    static void InsertBehind(T data, CNode<T>* behindMe)
    {
        CNode<T>*    node = (CNode<T>*)new CNode<T>(data);
        node->prev = behindMe;
        node->next = behindMe->next;
        behindMe->next->prev = node;
        behindMe->next = node;
    }

    void InsertWhenSorted(T data)
    {
        CNode<T>*    node = (CNode<T>*)new CNode<T>(data);
        CNode<T>*    p = head->next;
        while(p != head && data > p->data)
            p = p->next;

        // insert before p, whether or not p is head
        node->next = p;
        node->prev = p->prev;
        p->prev->next = node;
        p->prev = node;
    }

    void Delete(T data)
    {
        CNode<T>*    p = Find(data);
        if(p){
            p->next->prev = p->prev;
            p->prev->next = p->next;
            delete p;
        }
    }

    void setCriterion(int newCriterion)
    {
        criterion = newCriterion;
    }

    int getCriterion()
    {
        return criterion;
    }

    // Find uses the head node as a sentinel to save one comparison per loop iteration
    CNode<T>* Find(T data)
    {
        head->data = data;
        CNode<T>*    p = head->next;
        while(p->data != data) // no need to test p == head on every iteration
            p = p->next;
        if(p == head)
            return NULL;
        else
            return p;
    }

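    // selection sort
    // Treat the list like an array: only the data fields are swapped, the links stay put.
    // This works because the list is doubly linked. For a singly linked list it is better to build a new list and InsertWhenSorted each element.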
    void    Sort()
    {
        CNode<T>*    max;
        for(CNode<T>* end = head; end->prev != head; end = end->prev){
            max = end->prev;
            for(CNode<T>* p = head->next; p != end; p = p->next){
                if(p->data > max->data)
                    swap(p->data, max->data);
            }
        }
    }

    // A member function of a class template needs no <T> before its parameter list; only the friend function does
    CLinkList<T>&    operator += (const CLinkList<T>& list)
    {
        CNode<T>*    self = head->next;
        CNode<T>*    other = list.head->next;
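        while(self != head && other != list.head){
            if(self->data < other->data)
                self = self->next;
            else{
                // other->data <= self->data, so it belongs in front of self
                InsertBefore(other->data, self);
                other = other->next;
            }
        }
        // if anything is left in list, self == head, so inserting before it appends at the tail
        while(other != list.head){
            InsertBefore(other->data, self);
            other = other->next;
        }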
        return *this;
    }
};
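
For reference, the standard library has a ready-made merge of sorted lists, std::list::merge. The sketch below mirrors what operator += is meant to do: combine two ascending lists into one ascending list while leaving the right-hand list intact. The helper name MergeSorted is hypothetical and int elements are assumed.

```cpp
#include <list>

// Hypothetical helper mirroring operator +=.
// Both lists must already be sorted in ascending order.
void MergeSorted(std::list<int>& self, const std::list<int>& other)
{
    std::list<int> copy(other);    // merge() empties its argument, so work on a copy
    self.merge(copy);              // relinks nodes of copy into self, keeping order
}
```

Unlike the hand-written loop, std::list::merge moves nodes instead of allocating new ones, which is why the copy is made first when the source list must survive.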

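// operator << is a non-member function because its left operand is an ostream; it is a friend so that it can read CLinkList's private members.
// One more detail: a binary operator overloaded as a member function takes one parameter, whereas a non-member one, as here, takes two.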
template <class T>
ostream& operator << (ostream& out, const CLinkList<T>& list)
{
    CNode<T>*    p;
    if(list.criterion == CLinkList<T>::ascend){
        p = list.head->next;
        while(p != list.head){
            out << p->data << " --> ";
            p = p->next;
        }
    }
    else{
        p = list.head->prev;
        while(p != list.head){
            out << p->data << " --> ";
            p = p->prev;
        }
    }
    out << endl;
    return out;
}
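
The friend declaration pattern used above can be tried in isolation. The sketch below is a minimal example under assumptions: the class Box and the helper ToString are hypothetical names, and the forward declarations at the top are what a standard-conforming compiler expects before a friend declaration of the form operator<< <T>.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Forward declarations, so the friend declaration can name the template.
template <class T> class Box;
template <class T>
std::ostream& operator<<(std::ostream& out, const Box<T>& box);

// Hypothetical class, used only to illustrate the friend syntax.
template <class T>
class Box
{
    T value;
public:
    explicit Box(const T& v) : value(v) {}
    // <T> makes this the matching specialization of the template above,
    // not a separate non-template function.
    friend std::ostream& operator<< <T>(std::ostream& out, const Box<T>& box);
};

template <class T>
std::ostream& operator<<(std::ostream& out, const Box<T>& box)
{
    return out << "Box(" << box.value << ")";   // may read the private member
}

// Small helper so the output can be inspected as a string.
template <class T>
std::string ToString(const Box<T>& box)
{
    std::ostringstream s;
    s << box;
    return s.str();
}
```

Dropping the <T> from the friend line reproduces the situation described in the class comments: the declaration then names a non-template function that is never defined.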

posted on 2008-07-04 00:56 MainTao