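18-2 Introduction to iterators

Iterating through an array (or other data structure) is a very common operation in programming. So far, we've covered several ways to do it: with loops and an index (for-loops and while-loops), with pointers and pointer arithmetic, and with range-based for-loops: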

#include <array>
#include <cstddef>
#include <iostream>

int main()
{
    // In C++17, the type of variable arr is deduced to std::array<int, 7>
    // If you get an error compiling this example, see the warning below
    std::array arr{ 0, 1, 2, 3, 4, 5, 6 };
    std::size_t length{ std::size(arr) };

    // while-loop with explicit index
    std::size_t index{ 0 };
    while (index < length)
    {
        std::cout << arr[index] << ' ';
        ++index;
    }
    std::cout << '\n';

    // for-loop with explicit index
    for (index = 0; index < length; ++index)
    {
        std::cout << arr[index] << ' ';
    }
    std::cout << '\n';

    // for-loop with pointer (Note: ptr can't be const, because we increment it)
    for (auto ptr{ &arr[0] }; ptr != (&arr[0] + length); ++ptr)
    {
        std::cout << *ptr << ' ';
    }
    std::cout << '\n';

    // range-based for loop
    for (int i : arr)
    {
        std::cout << i << ' ';
    }
    std::cout << '\n';

    return 0;
}

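Warning:
The examples in this lesson use a C++17 feature called class template argument deduction, which deduces the template arguments of a template variable from its initializer. In the example above, when the compiler sees std::array arr{ 0, 1, 2, 3, 4, 5, 6 };, it deduces that we want std::array<int, 7> arr { 0, 1, 2, 3, 4, 5, 6 };.

If your compiler is not C++17 enabled, you'll get an error along the lines of "missing template arguments before 'arr'". In that case, the best solution is to enable C++17, as described in lesson 0.12 -- Configuring your compiler: Choosing a language standard. If you cannot, replace the lines that use class template argument deduction with lines that have explicit template arguments (e.g. replace std::array arr{ 0, 1, 2, 3, 4, 5, 6 }; with std::array<int, 7> arr { 0, 1, 2, 3, 4, 5, 6 };).

Looping with an index is more typing than necessary if we only use the index to access elements. It also only works if the container (e.g. the array) provides direct access to elements (which arrays do, but some other types of containers, such as lists, do not).

Looping with pointers and pointer arithmetic is verbose, and can be confusing to readers who don't know the rules of pointer arithmetic. Pointer arithmetic also only works if elements are consecutive in memory (which is true for arrays, but not for other types of containers, such as lists, trees, and maps).

For advanced readers:
Pointers (without pointer arithmetic) can also be used to iterate through some non-sequential structures. In a linked list, each element is connected to the prior element by a pointer. We can iterate through the list by following the chain of pointers.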

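Range-based for-loops are a little more interesting: the mechanism for iterating through the container is hidden, and yet they still work for all kinds of structures (arrays, lists, trees, maps, etc.). How do they work? They use iterators.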

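Iterators

An iterator is an object designed to traverse a container (e.g. the values in an array, or the characters in a string), providing access to each element along the way.

A container may provide different kinds of iterators. For example, an array container might offer a forward iterator that walks through the array in order, and a reverse iterator that walks through the array in reverse order.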

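Once the appropriate type of iterator is created, the programmer can use the interface provided by the iterator to traverse and access elements without having to worry about what kind of traversal is being done or how the data is stored in the container. And because C++ iterators typically use the same interface for traversal (operator++ to move to the next element) and access (operator* to access the current element), we can iterate through a wide variety of container types using a consistent method.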

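Pointers as an iterator

The simplest kind of iterator is a pointer, which (using pointer arithmetic) works for data stored sequentially in memory. Let's revisit a simple array traversal using a pointer and pointer arithmetic: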

#include <array>
#include <iostream>

int main()
{
    std::array arr{ 0, 1, 2, 3, 4, 5, 6 };

    auto begin{ &arr[0] };
    // note that this points to one spot beyond the last element
    auto end{ begin + std::size(arr) };

    // for-loop with pointer
    for (auto ptr{ begin }; ptr != end; ++ptr) // ++ to move to next element
    {
        std::cout << *ptr << ' '; // Indirection to get value of current element
    }
    std::cout << '\n';

    return 0;
}

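Output:

0 1 2 3 4 5 6

In the above, we defined two variables: begin (which points to the beginning of our container) and end (which marks an end point). For arrays, the end marker is typically the place in memory where the next element would be if the container held one more element.

The pointer then iterates between begin and end, and the current element can be accessed by dereferencing the pointer.

Warning:
You might be tempted to calculate the end marker using the address-of operator and array syntax, like so:
int* end{ &arr[std::size(arr)] };
But this causes undefined behavior, because arr[std::size(arr)] implicitly dereferences an element that is off the end of the array.
Instead, use:
int* end{ arr.data() + std::size(arr) }; // data() returns a pointer to the first element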

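Standard library iterators

Iterating is such a common operation that all standard library containers offer direct support for it. Instead of calculating our own begin and end points, we can simply ask the container for them via the aptly named member functions begin() and end():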

#include <array>
#include <iostream>

int main()
{
    std::array array{ 1, 2, 3 };

    // Ask our array for the begin and end points (via the begin and end member functions).
    auto begin{ array.begin() };
    auto end{ array.end() };

    for (auto p{ begin }; p != end; ++p) // ++ to move to next element.
    {
        std::cout << *p << ' '; // Indirection to get value of current element.
    }
    std::cout << '\n';

    return 0;
}

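This prints:

1 2 3

The <iterator> header also contains two generic functions (std::begin and std::end) that can be used instead.

Tip:
std::begin and std::end for C-style arrays are defined in the <iterator> header.

std::begin and std::end for containers that support iterators are defined in the headers of those containers (e.g. <array>, <vector>).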

#include <array>    // includes <iterator>
#include <iostream>

int main()
{
    std::array array{ 1, 2, 3 };

    // Use std::begin and std::end to get the begin and end points.
    auto begin{ std::begin(array) };
    auto end{ std::end(array) };

    for (auto p{ begin }; p != end; ++p) // ++ to move to next element
    {
        std::cout << *p << ' '; // Indirection to get value of current element
    }
    std::cout << '\n';

    return 0;
}

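This also prints:

1 2 3

Don't worry about the types of the iterators for now; we'll revisit iterators in a later chapter. The important thing is that the iterator takes care of the details of iterating through the container. All we need are four things: the begin point, the end point, operator++ to move the iterator to the next element (or the end), and operator* to get the value of the current element.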

operator< vs operator!= for iterators

In lesson 8.10 -- For statements, we noted that using operator< is preferred over operator!= when doing numeric comparisons in the loop condition:

for (index = 0; index < length; ++index)

With iterators, it is conventional to use operator!= to test whether the iterator has reached the end:

for (auto p{ begin }; p != end; ++p)

This is because some iterator types do not support relational comparison. operator!= works with all iterator types.

Back to range-based for loops

All types that have both begin() and end() member functions, or that can be used with std::begin() and std::end(), are usable in range-based for-loops.

#include <array>
#include <iostream>

int main()
{
    std::array array{ 1, 2, 3 };

    // This does exactly the same as the loop we used before.
    for (int i : array)
    {
        std::cout << i << ' ';
    }
    std::cout << '\n';

    return 0;
}

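Behind the scenes, the range-based for-loop calls begin() and end() of the type being iterated over. std::array has begin and end member functions, so we can use it in a range-based loop. C-style fixed arrays can be used with the std::begin and std::end functions, so we can loop through them with a range-based loop as well. Dynamic C-style arrays (or decayed C-style arrays) don't work though, because there is no std::end function for them (the type information doesn't contain the array's length).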

A later lesson will show you how to add these functions to your own types, so that they can be used with range-based for-loops too.

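Range-based for-loops aren't the only thing that makes use of iterators. They're also used in std::sort and other algorithms. Now that you know what they are, you'll notice they're used quite a bit in the standard library.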

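Iterator invalidation (dangling iterators)

Much like pointers and references, iterators can be left "dangling" if the elements being iterated over change address or are destroyed. When this happens, we say the iterator has been invalidated. Accessing an invalidated iterator produces undefined behavior.

Some operations that modify containers (such as adding an element to a std::vector) can have the side effect of causing the elements in the container to change addresses. When this happens, existing iterators to those elements are invalidated. Good C++ reference documentation should note which container operations may or will invalidate iterators. As an example, see the "Iterator invalidation" section of std::vector on cppreference.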


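Since range-based for-loops use iterators behind the scenes, we must be careful not to do anything that invalidates the iterators of the container we are actively traversing: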

#include <vector>

int main()
{
    std::vector v { 0, 1, 2, 3 };

    for (auto num : v) // implicitly iterates over v
    {
        if (num % 2 == 0)
            v.push_back(num + 1); // when this invalidates the iterators of v, undefined behavior will result
    }

    return 0;
}

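When I wrote this, clangd flagged no error, and clang built and ran the program normally, but an error occurred when I tried to print the elements of the vector.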

Here's another example of iterator invalidation:

#include <iostream>
#include <vector>

int main()
{
	std::vector v{ 1, 2, 3, 4, 5, 6, 7 };

	auto it{ v.begin() };

	++it; // move to second element
	std::cout << *it << '\n'; // ok: prints 2

	v.erase(it); // erase the element currently being iterated over

	// erase() invalidates iterators to the erased element (and subsequent elements)
	// so iterator "it" is now invalidated

	++it; // undefined behavior
	std::cout << *it << '\n'; // undefined behavior

	return 0;
}

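I typed this example by hand and it built and ran successfully. To rule out a typo I also copied the original code, and that built and ran too. clang does not catch this error, so take it as a cautionary tale.
clang version 21.1.8 (Fedora 21.1.8-1.fc43)

Invalidated iterators can be revalidated by assigning a valid iterator to them (e.g. begin(), end(), or the return value of some other function that returns an iterator).
The erase() function returns an iterator to the element one past the erased element (or end() if the last element was removed). Therefore, we can fix the above code like this: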

#include <iostream>
#include <vector>

int main()
{
	std::vector v{ 1, 2, 3, 4, 5, 6, 7 };

	auto it{ v.begin() };

	++it; // move to second element
	std::cout << *it << '\n';

	it = v.erase(it); // erase the element currently being iterated over, set `it` to next element

	std::cout << *it << '\n'; // now ok, prints 3

	return 0;
}

Posted 2026-01-15 16:15 by 游翔