7-7 Topological Sort (Using DFS)

A topological sort arranges all vertices of a directed acyclic graph (DAG) into a linear sequence such that, for every directed edge (u, v), u appears before v. Topological sorting is widely used in task scheduling, build dependency analysis, course prerequisite planning, and similar settings.
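
The definition can be checked mechanically. The sketch below is a hypothetical helper (is_topological_order is not part of this article's code) that tests whether a sequence satisfies the edge condition. The edge list is that of the example graph introduced below.

```python
def is_topological_order(order, edges, n):
    """Return True if `order` is a valid topological order of vertices 0..n-1."""
    # The sequence must contain every vertex exactly once.
    if sorted(order) != list(range(n)):
        return False
    # position[v] = index of vertex v in the sequence
    position = {v: i for i, v in enumerate(order)}
    # Every edge (u, v) must have u placed before v.
    return all(position[u] < position[v] for u, v in edges)


edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
print(is_topological_order([5, 4, 2, 3, 1, 0], edges, 6))  # True
print(is_topological_order([0, 1, 2, 3, 4, 5], edges, 6))  # False: 5 must precede 2
```

A checker like this is handy for testing any topological sort implementation, because a DAG usually has several valid orders and comparing against one fixed answer would be too strict.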

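The DFS-based method works as follows: run a depth-first search over the graph, and push a node onto a stack once all of its neighbors have been fully processed (that is, when its DFS call finishes). Reading the stack from top to bottom then gives a topological order. In other words, the earlier a node finishes its DFS, the later it appears in the order.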

This article uses the following directed acyclic graph (6 vertices):

Edges: (5,2), (5,0), (4,0), (4,1), (2,3), (3,1)

Adjacency list (neighbors listed in ascending order):
0: []
1: []
2: [3]
3: [1]
4: [0, 1]
5: [0, 2]

Graph structure:
    5 → 2 → 3
    ↓       ↓
    0 ← 4 → 1

One valid topological order: 5 4 2 3 1 0
(The order is not unique; it depends on the DFS start nodes and the order in which neighbors are visited.)
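
To make the non-uniqueness concrete, here is a minimal sketch that runs the same DFS twice with different start orders. The start_order parameter is an illustrative addition and does not appear in the implementations later in this article.

```python
def topological_sort(adj, start_order):
    """DFS topological sort that tries start nodes in the given order."""
    visited = set()
    finished = []  # nodes in the order their DFS call completes

    def dfs(node):
        visited.add(node)
        for neighbor in adj[node]:
            if neighbor not in visited:
                dfs(neighbor)
        finished.append(node)  # all neighbors are done

    for node in start_order:
        if node not in visited:
            dfs(node)
    return finished[::-1]  # reverse of the finish order


adj = {0: [], 1: [], 2: [3], 3: [1], 4: [0, 1], 5: [0, 2]}
print(topological_sort(adj, range(6)))          # [5, 4, 2, 3, 1, 0]
print(topological_sort(adj, range(5, -1, -1)))  # [4, 5, 2, 3, 1, 0]
```

Both results respect every edge of the example graph; they differ only in the relative position of nodes 4 and 5, which do not depend on each other.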

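Graph Representation

Topological sorting operates on a directed graph, so the adjacency list stores each edge in one direction only. The representation differs by language: C++ uses vector<vector<int>>, C simulates it with a fixed-size 2D array plus a per-node neighbor count, Python uses a dictionary that maps each node to a list of neighbors, and Go uses a [][]int slice of slices.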
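
When the graph arrives as an edge list rather than a hand-written literal, the adjacency list can be derived from it. The following is a small sketch; build_adjacency is a hypothetical helper name, and sorting the neighbors is only done here so that the result matches the adjacency list shown earlier.

```python
def build_adjacency(n, edges):
    """Build a directed adjacency list for vertices 0..n-1 from (u, v) pairs."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)  # store u -> v only; nothing is added to adj[v]
    for neighbors in adj:
        neighbors.sort()  # ascending neighbor order, as in the listing above
    return adj


edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
adj = build_adjacency(6, edges)
for node, neighbors in enumerate(adj):
    print(f"{node}: {neighbors}")
```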

C++ implementation

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // Build adjacency list for the DAG
    // 6 nodes: 0-5
    int n = 6;
    vector<vector<int>> adj(n);

    // Add directed edges (node 5's neighbors in ascending order,
    // so the printed list matches the adjacency list above)
    adj[5].push_back(0);
    adj[5].push_back(2);
    adj[4].push_back(0);
    adj[4].push_back(1);
    adj[2].push_back(3);
    adj[3].push_back(1);

    // Print adjacency list
    for (int i = 0; i < n; i++) {
        cout << i << ": ";
        for (int neighbor : adj[i]) {
            cout << neighbor << " ";
        }
        cout << endl;
    }

    return 0;
}

C implementation

#include <stdio.h>

#define MAX_NODES 6
#define MAX_NEIGHBORS 2

int main() {
    // Use 2D array to represent adjacency list
    int adj[MAX_NODES][MAX_NEIGHBORS] = {0};
    int adjCount[MAX_NODES] = {0};

    // Macro to add directed edge (u -> v only)
    #define ADD_DIR_EDGE(u, v) do { \
        adj[u][adjCount[u]++] = v; \
    } while(0)

    // Build the DAG (node 5's neighbors in ascending order,
    // so the printed list matches the adjacency list above)
    ADD_DIR_EDGE(5, 0);
    ADD_DIR_EDGE(5, 2);
    ADD_DIR_EDGE(4, 0);
    ADD_DIR_EDGE(4, 1);
    ADD_DIR_EDGE(2, 3);
    ADD_DIR_EDGE(3, 1);

    // Print adjacency list
    for (int i = 0; i < MAX_NODES; i++) {
        printf("%d: ", i);
        for (int j = 0; j < adjCount[i]; j++) {
            printf("%d ", adj[i][j]);
        }
        printf("\n");
    }

    return 0;
}

Python implementation

def main():
    # Build adjacency list for the DAG
    adj = {
        0: [],
        1: [],
        2: [3],
        3: [1],
        4: [0, 1],
        5: [0, 2],
    }

    # Print adjacency list
    for node in sorted(adj.keys()):
        print(f"{node}: {adj[node]}")

if __name__ == "__main__":
    main()

Go implementation

package main

import "fmt"

func main() {
    // Build adjacency list for the DAG using slice of slices
    adj := [][]int{
        {},    // node 0
        {},    // node 1
        {3},   // node 2
        {1},   // node 3
        {0, 1}, // node 4
        {0, 2}, // node 5
    }

    // Print adjacency list
    for i, neighbors := range adj {
        fmt.Printf("%d: %v\n", i, neighbors)
    }
}

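The code above builds the adjacency-list representation of the example DAG. Because the graph is directed, each edge is stored in one direction only. For example, the edge (5, 2) points from node 5 to node 2, so 2 is added to adj[5] but 5 is not added to adj[2]. Nodes 0 and 1 have no outgoing edges, so their adjacency lists are empty.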

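Running the program prints the adjacency list (shown here in plain form; the Python and Go versions wrap each list in brackets):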

0:
1:
2: 3
3: 1
4: 0 1
5: 0 2

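DFS Topological Sort Implementation

The DFS-based topological sort proceeds as follows:

  1. Maintain a visited array that marks nodes already visited
  2. Maintain a stack that stores the topological order
  3. Start a DFS from every node that has not been visited yet:
    • Mark the current node as visited
    • Recursively visit every unvisited neighbor
    • Once all neighbors of a node have been processed, push the node onto the stack
  4. Finally, pop the elements off the stack one by one; the pop order is the topological order

DFS topological sort trace (start nodes are tried in ascending order of node number; stacks are written top first):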

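Start at node 0: visited[0]=true, no neighbors → push 0, stack=[0]
Start at node 1: visited[1]=true, no neighbors → push 1, stack=[1,0]
Start at node 2: visited[2]=true
  neighbor 3 unvisited → visited[3]=true
    neighbor 1 already visited → skip
  push 3, stack=[3,1,0]
  push 2, stack=[2,3,1,0]
Start at node 3: already visited, skip
Start at node 4: visited[4]=true
  neighbor 0 already visited → skip
  neighbor 1 already visited → skip
  push 4, stack=[4,2,3,1,0]
Start at node 5: visited[5]=true
  neighbor 0 already visited → skip
  neighbor 2 already visited → skip
  push 5, stack=[5,4,2,3,1,0]

Pop the stack: 5 4 2 3 1 0 ← topological order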
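
A trace like the one above can be generated rather than written by hand. The sketch below is an instrumented variant of the algorithm; trace_topological_sort and its message wording are illustrative choices, not part of the implementations that follow.

```python
def trace_topological_sort(adj):
    """Return (trace_lines, stack) for a DFS topological sort; stack top is last."""
    n = len(adj)
    visited = [False] * n
    stack = []
    lines = []

    def dfs(node, depth):
        indent = "  " * depth
        visited[node] = True
        lines.append(f"{indent}visit {node}")
        for neighbor in adj[node]:
            if visited[neighbor]:
                lines.append(f"{indent}  neighbor {neighbor} already visited, skip")
            else:
                dfs(neighbor, depth + 1)
        stack.append(node)  # all neighbors processed
        # Print the stack top first, as in the hand-written trace.
        lines.append(f"{indent}push {node}, stack={stack[::-1]}")

    for start in range(n):
        if visited[start]:
            lines.append(f"start {start}: already visited, skip")
        else:
            dfs(start, 0)
    return lines, stack


lines, stack = trace_topological_sort([[], [], [3], [1], [0, 1], [0, 2]])
print("\n".join(lines))
```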

C++ implementation

#include <iostream>
#include <vector>
#include <stack>
using namespace std;

// DFS helper: visit node, then push to stack after all neighbors done
void topoDFS(const vector<vector<int>>& adj, int node,
             vector<bool>& visited, stack<int>& stk) {
    visited[node] = true;

    // Recurse on all unvisited neighbors
    for (int neighbor : adj[node]) {
        if (!visited[neighbor]) {
            topoDFS(adj, neighbor, visited, stk);
        }
    }

    // All neighbors processed → push this node onto stack
    stk.push(node);
}

// Topological sort using DFS
void topologicalSort(const vector<vector<int>>& adj) {
    int n = adj.size();
    vector<bool> visited(n, false);
    stack<int> stk;

    // Try each node as potential DFS start
    for (int i = 0; i < n; i++) {
        if (!visited[i]) {
            topoDFS(adj, i, visited, stk);
        }
    }

    // Pop stack to get topological order
    cout << "Topological Sort: ";
    while (!stk.empty()) {
        cout << stk.top() << " ";
        stk.pop();
    }
    cout << endl;
}

int main() {
    vector<vector<int>> adj = {
        {},      // node 0
        {},      // node 1
        {3},     // node 2
        {1},     // node 3
        {0, 1},  // node 4
        {0, 2},  // node 5
    };

    topologicalSort(adj);

    return 0;
}

C implementation

#include <stdio.h>
#include <stdbool.h>

#define MAX_NODES 6
#define MAX_NEIGHBORS 2

// Simple stack implementation
typedef struct {
    int data[MAX_NODES];
    int top;
} Stack;

void stackInit(Stack* s) { s->top = -1; }
void stackPush(Stack* s, int val) { s->data[++s->top] = val; }
int stackPop(Stack* s) { return s->data[s->top--]; }
bool stackEmpty(Stack* s) { return s->top == -1; }

void topoDFS(int adj[][MAX_NEIGHBORS], int adjCount[], int node,
             bool visited[], Stack* stk) {
    visited[node] = true;

    // Recurse on all unvisited neighbors
    for (int i = 0; i < adjCount[node]; i++) {
        int neighbor = adj[node][i];
        if (!visited[neighbor]) {
            topoDFS(adj, adjCount, neighbor, visited, stk);
        }
    }

    // All neighbors processed → push this node onto stack
    stackPush(stk, node);
}

void topologicalSort(int adj[][MAX_NEIGHBORS], int adjCount[], int n) {
    bool visited[MAX_NODES] = {false};
    Stack stk;
    stackInit(&stk);

    // Try each node as potential DFS start
    for (int i = 0; i < n; i++) {
        if (!visited[i]) {
            topoDFS(adj, adjCount, i, visited, &stk);
        }
    }

    // Pop stack to get topological order
    printf("Topological Sort: ");
    while (!stackEmpty(&stk)) {
        printf("%d ", stackPop(&stk));
    }
    printf("\n");
}

int main() {
    int adj[MAX_NODES][MAX_NEIGHBORS] = {0};
    int adjCount[MAX_NODES] = {0};

    #define ADD_DIR_EDGE(u, v) do { \
        adj[u][adjCount[u]++] = v; \
    } while(0)

    ADD_DIR_EDGE(5, 2);
    ADD_DIR_EDGE(5, 0);
    ADD_DIR_EDGE(4, 0);
    ADD_DIR_EDGE(4, 1);
    ADD_DIR_EDGE(2, 3);
    ADD_DIR_EDGE(3, 1);

    topologicalSort(adj, adjCount, MAX_NODES);

    return 0;
}

Python implementation

def topo_dfs(adj, node, visited, stack):
    """DFS helper: visit node, push to stack after all neighbors done."""
    visited[node] = True

    # Recurse on all unvisited neighbors
    for neighbor in adj[node]:
        if not visited[neighbor]:
            topo_dfs(adj, neighbor, visited, stack)

    # All neighbors processed → push this node onto stack
    stack.append(node)

def topological_sort(adj):
    """Topological sort using DFS."""
    n = len(adj)
    visited = [False] * n
    stack = []

    # Try each node as potential DFS start
    for i in range(n):
        if not visited[i]:
            topo_dfs(adj, i, visited, stack)

    # Reverse stack to get topological order
    stack.reverse()
    return stack

if __name__ == "__main__":
    adj = {
        0: [],
        1: [],
        2: [3],
        3: [1],
        4: [0, 1],
        5: [0, 2],
    }

    result = topological_sort(adj)
    print("Topological Sort:", " ".join(map(str, result)))

Go implementation

package main

import "fmt"

// DFS helper: visit node, push to stack after all neighbors done
func topoDFS(adj [][]int, node int, visited []bool, stack *[]int) {
    visited[node] = true

    // Recurse on all unvisited neighbors
    for _, neighbor := range adj[node] {
        if !visited[neighbor] {
            topoDFS(adj, neighbor, visited, stack)
        }
    }

    // All neighbors processed → push this node onto stack
    *stack = append(*stack, node)
}

// Topological sort using DFS
func topologicalSort(adj [][]int) {
    n := len(adj)
    visited := make([]bool, n)
    var stack []int

    // Try each node as potential DFS start
    for i := 0; i < n; i++ {
        if !visited[i] {
            topoDFS(adj, i, visited, &stack)
        }
    }

    // Pop stack in reverse to get topological order
    fmt.Print("Topological Sort: ")
    for i := len(stack) - 1; i >= 0; i-- {
        fmt.Print(stack[i], " ")
    }
    fmt.Println()
}

func main() {
    adj := [][]int{
        {},      // node 0
        {},      // node 1
        {3},     // node 2
        {1},     // node 3
        {0, 1},  // node 4
        {0, 2},  // node 5
    }

    topologicalSort(adj)
}

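The code above implements DFS-based topological sort. The core function topoDFS pushes the current node onto the stack only after it has recursively processed all of that node's neighbors. In a DAG this guarantees that if node u has an edge to node v, then u is pushed after v, so u comes before v in the topological order. C++ uses the STL stack; C implements an array-based stack by hand; Python appends to a list with append() and then calls reverse(); Go uses a slice and walks it from the end to the front to simulate popping.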

Running any of these programs prints:

Topological Sort: 5 4 2 3 1 0
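
The recursive versions can run out of call stack on very deep graphs (Python, for instance, typically limits recursion to about 1000 frames by default). As a sketch of one way around this, the variant below replaces recursion with an explicit work stack; the function name and frame layout are assumptions made for illustration.

```python
def topological_sort_iterative(adj):
    """DFS topological sort with an explicit stack instead of recursion."""
    n = len(adj)
    visited = [False] * n
    finished = []  # nodes in order of DFS completion

    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        # Each frame is (node, index of the next neighbor to examine).
        work = [(start, 0)]
        while work:
            node, i = work.pop()
            if i < len(adj[node]):
                work.append((node, i + 1))  # come back to this node later
                neighbor = adj[node][i]
                if not visited[neighbor]:
                    visited[neighbor] = True
                    work.append((neighbor, 0))
            else:
                finished.append(node)  # all neighbors processed
    return finished[::-1]


print(topological_sort_iterative([[], [], [3], [1], [0, 1], [0, 2]]))  # [5, 4, 2, 3, 1, 0]
```

Storing the next neighbor index in each frame lets the loop resume a node exactly where the recursive call would have returned, so the finish order is the same as in the recursive version.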

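Cycle Detection

Topological sorting requires the graph to be a directed acyclic graph (DAG). If the graph contains a cycle, no valid topological order exists, because the nodes on the cycle depend on one another and cannot be put in sequence. Note that the topoDFS routine above does not check for cycles: on a cyclic graph it still produces a sequence, but that sequence is not a valid topological order. In practice, therefore, the graph is usually checked for cycles before sorting.

DFS-based cycle detection uses three-color marking:

  • WHITE: the node has not been visited yet
  • GRAY: the node is being explored (it is on the current DFS path)
  • BLACK: all of the node's neighbors have been fully explored

If the DFS reaches a GRAY node, there is a back edge: the current path points back to one of its own ancestors, which means the graph has a cycle.

Acyclic graph (DAG):
5 → 2 → 3 → 1, 5 → 0, 4 → 0, 4 → 1
The DFS never reaches a GRAY node → no cycle, topological sorting is possible

Cyclic graph:
0 → 1 → 2 → 0 (forms a cycle)
DFS trace: dfs(0) marks GRAY → neighbor 1 (WHITE) → dfs(1) marks GRAY → neighbor 2 (WHITE) → dfs(2) marks GRAY → neighbor 0 (GRAY!) → cycle detected!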
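
Because the GRAY nodes are exactly the nodes on the current DFS path, the same idea can also report which nodes form the cycle. The sketch below is an extension for illustration (find_cycle and the explicit path list are not part of the implementations that follow).

```python
WHITE, GRAY, BLACK = 0, 1, 2


def find_cycle(adj):
    """Return one cycle as a list of nodes (first == last), or None if acyclic."""
    n = len(adj)
    color = [WHITE] * n
    path = []  # nodes on the current DFS path, i.e. the GRAY nodes

    def dfs(node):
        color[node] = GRAY
        path.append(node)
        for neighbor in adj[node]:
            if color[neighbor] == GRAY:
                # Back edge node -> neighbor: the cycle is the part of the
                # path starting at neighbor, closed by returning to neighbor.
                return path[path.index(neighbor):] + [neighbor]
            if color[neighbor] == WHITE:
                cycle = dfs(neighbor)
                if cycle is not None:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in range(n):
        if color[start] == WHITE:
            cycle = dfs(start)
            if cycle is not None:
                return cycle
    return None


print(find_cycle([[1], [2], [0]]))                     # [0, 1, 2, 0]
print(find_cycle([[], [], [3], [1], [0, 1], [0, 2]]))  # None
```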

C++ implementation

#include <iostream>
#include <vector>
using namespace std;

const int WHITE = 0;  // Unvisited
const int GRAY  = 1;  // In current DFS path (being explored)
const int BLACK = 2;  // Fully explored

// Returns true if a cycle is found
bool hasCycleDFS(const vector<vector<int>>& adj, int node, vector<int>& color) {
    color[node] = GRAY;

    for (int neighbor : adj[node]) {
        if (color[neighbor] == GRAY) {
            // Back edge found → cycle detected
            return true;
        }
        if (color[neighbor] == WHITE) {
            if (hasCycleDFS(adj, neighbor, color)) {
                return true;
            }
        }
    }

    color[node] = BLACK;
    return false;
}

bool hasCycle(const vector<vector<int>>& adj) {
    int n = adj.size();
    vector<int> color(n, WHITE);

    for (int i = 0; i < n; i++) {
        if (color[i] == WHITE) {
            if (hasCycleDFS(adj, i, color)) {
                return true;
            }
        }
    }
    return false;
}

int main() {
    // DAG (no cycle)
    vector<vector<int>> dag = {
        {},      // 0
        {},      // 1
        {3},     // 2
        {1},     // 3
        {0, 1},  // 4
        {0, 2},  // 5
    };
    cout << "DAG: " << (hasCycle(dag) ? "has cycle" : "no cycle") << endl;

    // Graph with cycle: 0 → 1 → 2 → 0
    vector<vector<int>> cyclic = {
        {1},    // 0 → 1
        {2},    // 1 → 2
        {0},    // 2 → 0 (back edge creates cycle)
    };
    cout << "Cyclic graph: " << (hasCycle(cyclic) ? "has cycle" : "no cycle") << endl;

    return 0;
}

C implementation

#include <stdio.h>
#include <stdbool.h>

#define MAX_NODES 6
#define MAX_NEIGHBORS 2

#define WHITE 0
#define GRAY  1
#define BLACK 2

bool hasCycleDFS(int adj[][MAX_NEIGHBORS], int adjCount[], int node, int color[]) {
    color[node] = GRAY;

    for (int i = 0; i < adjCount[node]; i++) {
        int neighbor = adj[node][i];
        if (color[neighbor] == GRAY) {
            // Back edge found → cycle detected
            return true;
        }
        if (color[neighbor] == WHITE) {
            if (hasCycleDFS(adj, adjCount, neighbor, color)) {
                return true;
            }
        }
    }

    color[node] = BLACK;
    return false;
}

bool hasCycle(int adj[][MAX_NEIGHBORS], int adjCount[], int n) {
    int color[MAX_NODES] = {0};  // All WHITE initially

    for (int i = 0; i < n; i++) {
        if (color[i] == WHITE) {
            if (hasCycleDFS(adj, adjCount, i, color)) {
                return true;
            }
        }
    }
    return false;
}

int main() {
    // DAG (no cycle)
    int dag[MAX_NODES][MAX_NEIGHBORS] = {0};
    int dagCount[MAX_NODES] = {0};

    #define ADD_DAG(u, v) do { dag[u][dagCount[u]++] = v; } while(0)

    ADD_DAG(5, 2);
    ADD_DAG(5, 0);
    ADD_DAG(4, 0);
    ADD_DAG(4, 1);
    ADD_DAG(2, 3);
    ADD_DAG(3, 1);

    printf("DAG: %s\n", hasCycle(dag, dagCount, MAX_NODES) ? "has cycle" : "no cycle");

    // Graph with cycle: 0 → 1 → 2 → 0
    int cyc[3][MAX_NEIGHBORS] = {0};
    int cycCount[3] = {0};
    cyc[0][cycCount[0]++] = 1;
    cyc[1][cycCount[1]++] = 2;
    cyc[2][cycCount[2]++] = 0;

    printf("Cyclic graph: %s\n", hasCycle(cyc, cycCount, 3) ? "has cycle" : "no cycle");

    return 0;
}

Python implementation

WHITE, GRAY, BLACK = 0, 1, 2

def has_cycle_dfs(adj, node, color):
    """DFS helper for cycle detection using three-color marking."""
    color[node] = GRAY

    for neighbor in adj[node]:
        if color[neighbor] == GRAY:
            # Back edge found → cycle detected
            return True
        if color[neighbor] == WHITE:
            if has_cycle_dfs(adj, neighbor, color):
                return True

    color[node] = BLACK
    return False

def has_cycle(adj):
    """Check if a directed graph contains a cycle."""
    n = len(adj)
    color = [WHITE] * n

    for i in range(n):
        if color[i] == WHITE:
            if has_cycle_dfs(adj, i, color):
                return True
    return False

if __name__ == "__main__":
    # DAG (no cycle)
    dag = {
        0: [],
        1: [],
        2: [3],
        3: [1],
        4: [0, 1],
        5: [0, 2],
    }
    print("DAG:", "has cycle" if has_cycle(dag) else "no cycle")

    # Graph with cycle: 0 → 1 → 2 → 0
    cyclic = {
        0: [1],
        1: [2],
        2: [0],
    }
    print("Cyclic graph:", "has cycle" if has_cycle(cyclic) else "no cycle")

Go implementation

package main

import "fmt"

const (
    White = 0 // Unvisited
    Gray  = 1 // In current DFS path
    Black = 2 // Fully explored
)

func hasCycleDFS(adj [][]int, node int, color []int) bool {
    color[node] = Gray

    for _, neighbor := range adj[node] {
        if color[neighbor] == Gray {
            // Back edge found → cycle detected
            return true
        }
        if color[neighbor] == White {
            if hasCycleDFS(adj, neighbor, color) {
                return true
            }
        }
    }

    color[node] = Black
    return false
}

func hasCycle(adj [][]int) bool {
    n := len(adj)
    color := make([]int, n) // All White (0) initially

    for i := 0; i < n; i++ {
        if color[i] == White {
            if hasCycleDFS(adj, i, color) {
                return true
            }
        }
    }
    return false
}

func main() {
    // DAG (no cycle)
    dag := [][]int{
        {},      // 0
        {},      // 1
        {3},     // 2
        {1},     // 3
        {0, 1},  // 4
        {0, 2},  // 5
    }
    dagResult := "no cycle"
    if hasCycle(dag) {
        dagResult = "has cycle"
    }
    fmt.Println("DAG:", dagResult)

    // Graph with cycle: 0 → 1 → 2 → 0
    cyclic := [][]int{
        {1}, // 0
        {2}, // 1
        {0}, // 2
    }
    cycResult := "no cycle"
    if hasCycle(cyclic) {
        cycResult = "has cycle"
    }
    fmt.Println("Cyclic graph:", cycResult)
}

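The code above uses three-color marking to detect cycles in a directed graph. hasCycleDFS marks a node GRAY (being explored) on entry and then recursively processes all of its neighbors. Reaching a GRAY neighbor means the current DFS path points back into itself, i.e. a back edge exists and the graph has a cycle. After all neighbors have been processed, the node is marked BLACK (finished). Two graphs are tested: the example DAG, which has no cycle, and 0 → 1 → 2 → 0, which does.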

Running any of these programs prints:

DAG: no cycle
Cyclic graph: has cycle
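
Cycle detection and sorting can share a single DFS pass, since a node turns BLACK at exactly the moment it would be pushed onto the result stack. The following is a minimal sketch of that combination; the function name and the choice of ValueError are assumptions made for illustration.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def topological_sort_checked(adj):
    """Return a topological order, or raise ValueError if the graph has a cycle."""
    n = len(adj)
    color = [WHITE] * n
    finished = []

    def dfs(node):
        color[node] = GRAY
        for neighbor in adj[node]:
            if color[neighbor] == GRAY:
                raise ValueError(f"cycle detected via edge {node} -> {neighbor}")
            if color[neighbor] == WHITE:
                dfs(neighbor)
        color[node] = BLACK
        finished.append(node)  # BLACK nodes are exactly the finished ones

    for start in range(n):
        if color[start] == WHITE:
            dfs(start)
    return finished[::-1]


print(topological_sort_checked([[], [], [3], [1], [0, 1], [0, 2]]))  # [5, 4, 2, 3, 1, 0]
```

Calling it on the cyclic graph [[1], [2], [0]] raises ValueError instead of returning a sequence that only looks like a topological order.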

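Properties of DFS Topological Sort

The table below summarizes the time and space complexity of DFS-based topological sort:

Metric            Complexity  Notes
Time complexity   O(V + E)    Each vertex is visited once (O(V)) and each edge is examined once (O(E))
Space complexity  O(V)        Auxiliary space beyond the graph itself: visited array O(V), recursion depth at most O(V), result stack O(V)

Here V is the number of vertices and E is the number of edges.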
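
The O(V + E) bound can be observed directly by counting the work done. The sketch below instruments the DFS with two counters; count_dfs_work and the counter names are hypothetical and exist only for this illustration.

```python
def count_dfs_work(adj):
    """Run the DFS used for topological sort and count node visits and edge checks."""
    n = len(adj)
    visited = [False] * n
    counts = {"node_visits": 0, "edge_checks": 0}

    def dfs(node):
        visited[node] = True
        counts["node_visits"] += 1
        for neighbor in adj[node]:
            counts["edge_checks"] += 1  # each outgoing edge is looked at once
            if not visited[neighbor]:
                dfs(neighbor)

    for start in range(n):
        if not visited[start]:
            dfs(start)
    return counts


adj = [[], [], [3], [1], [0, 1], [0, 2]]
print(count_dfs_work(adj))  # {'node_visits': 6, 'edge_checks': 6}
```

For the example DAG, V = 6 and E = 6, and the counters match those values exactly.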

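Key properties of DFS topological sort:

  • DAG only: topological sorting requires a directed acyclic graph. If the graph contains a cycle, no valid topological order exists.
  • The order is not unique: a DAG may have several valid topological orders; which one DFS produces depends on the start nodes and the neighbor visiting order.
  • Reverse finish time: in essence, DFS topological sort lists nodes by decreasing DFS finish time. The later a node finishes, the earlier it appears in the order.
  • Contrast with Kahn's algorithm: Kahn's algorithm (BFS-based) builds the order by repeatedly removing nodes whose in-degree is 0, and it detects cycles naturally (if the final sequence is shorter than the number of vertices, a cycle exists). The DFS method detects cycles with three-color marking instead.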
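
For comparison, here is a minimal sketch of Kahn's algorithm as described in the last point. It is not part of this article's DFS code; the function name and the choice to return None on a cycle are assumptions.

```python
from collections import deque


def kahn_topological_sort(adj):
    """Kahn's algorithm: return a topological order, or None if a cycle exists."""
    n = len(adj)
    indegree = [0] * n
    for u in range(n):
        for v in adj[u]:
            indegree[v] += 1

    # Start with every node that has no incoming edges.
    queue = deque(u for u in range(n) if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1  # "remove" the edge u -> v
            if indegree[v] == 0:
                queue.append(v)

    # Nodes on a cycle never reach in-degree 0, so the order comes up short.
    return order if len(order) == n else None


print(kahn_topological_sort([[], [], [3], [1], [0, 1], [0, 2]]))  # [4, 5, 0, 2, 3, 1]
print(kahn_topological_sort([[1], [2], [0]]))                      # None
```

On the example DAG this yields 4 5 0 2 3 1, a different but equally valid order from the DFS result 5 4 2 3 1 0.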

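Typical applications of topological sorting:

Application               Description
Task scheduling           Determine the execution order of tasks that depend on one another
Build dependencies        Build systems such as Makefile-based ones determine the compilation order
Course prerequisites      Schedule courses according to their prerequisite relationships
Spreadsheet calculation   Determine the order in which cells are evaluated
Symbolic link resolution  Detect circular links in a file system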
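
As an illustration of the build-dependency case, Python's standard library (3.9 and later) ships graphlib.TopologicalSorter, so application code does not have to hand-roll the algorithm. The step names below are made up for the example. Note that graphlib expects each key to map to the nodes it depends on, which is the reverse direction of the adjacency lists used in this article.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical build steps: each key maps to the set of steps it depends on.
dependencies = {
    "configure": set(),
    "compile": {"configure"},
    "link": {"compile"},
    "test": {"link"},
    "package": {"link", "test"},
}

order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['configure', 'compile', 'link', 'test', 'package']

# A circular dependency is reported as CycleError when the order is computed.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
    cycle_found = False
except CycleError as exc:
    cycle_found = True
    print("cycle:", exc.args[1])
```
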
posted @ 2026-04-17 07:58  游翔