C++14指北：花里胡哨的C++

类型！

在最经典的 C++ 代码中，我们使用类似 类型名变量名 = 表达式; 的形式声明并初始化变量，例如

int x = 1;
int y = x;

在上面代码中，我们知道 y 理应与 x 的类型相同，但是在上面代码中，如果我们后来把 x 的类型修改为 int64_t，而忘记对应地修改 y 的类型，则可能导致灾难性的后果。对此，我们最简单的做法是使用类型别名。在 C++ 中，可以使用 using 声明类型别名：

using my_int = int;
my_int x = 1;
my_int y = x;

我们也可以使用 decltype 说明符，这是在 C++11 中添加的新关键字：

int x = 1;
decltype(x) y = x;

decltype 关键字接受一个表达式，得到该表达式的类型。与 sizeof 类似，decltype 的操作数仅作为类型推导使用，并不会在运行时真正被求值。

验证 `decltype` 不会进行求值

考虑下面代码：

#include <cstdio>

int func() {
    std::puts("called!");
    return 0;
}

int main() {
    decltype(func()) a = 1;
    return 0;
}

执行后无输出，说明 decltype 的操作数不会在运行时进行求值。

`decltype` 可以保留 `void`、指针、值类型和类型限定符

查看类型推导结果

我们可以使用声明但未定义的模板函数来查看类型推导结果：

template<typename T>
void func();

该函数会阻止程序正确链接，链接器会按顺序汇报所缺失的函数特化，根据其报错信息即可得到 T 的类型。

考虑如下代码：

#include <utility>

template<typename T>
void func();

void void_func();

int main() {
	int y;
	int &y2 = y;
	int *y3;
	const int * volatile *y4;
	func<decltype(y)>();
	func<decltype(std::move(y))>();
	func<decltype(y2)>();
	func<decltype(y3)>();
	func<decltype(y4)>();
	return 0;
}

链接时，你会得到类似于如下的报错信息：

a.cpp:(.text+0x16): undefined reference to `void func<int>()'
a.cpp:(.text+0x1b): undefined reference to `void func<int&&>()'
a.cpp:(.text+0x20): undefined reference to `void func<int&>()'
a.cpp:(.text+0x25): undefined reference to `void func<int*>()'
a.cpp:(.text+0x2a): undefined reference to `void func<int const* volatile*>()'
a.cpp:(.text+0x2f): undefined reference to `void func<void>()'
collect2: error: ld returned 1 exit status

这说明，decltype 推导可以正确保留 void、指针、引用和类型限定符（const 和 volatile）。

值得注意的是，被括号包裹的标识符表达式或类成员访问表达式将被推导为左值：

template<typename T>
void func();

int main() {
	int y0 = 1;
	func<decltype(y0)>();
	func<decltype((y0))>();
	return 0;
}

这将产生类似于如下的链接错误：

a.cpp:(.text+0x15): undefined reference to `void func<int>()'
a.cpp:(.text+0x1a): undefined reference to `void func<int&>()'
collect2: error: ld returned 1 exit status

同时使用类型别名和 decltype 推导，我们可以把一些较长的类型名称存储下来，例如：

std::vector<int> a;
using It = decltype(a.begin());

不经过构造函数直接使用成员函数

std::declval 可以将任意类型 T 转换成引用类型，使得在 decltype 的操作数中不必经过构造函数就能使用成员函数。考虑下面代码，由于结构体 A 没有构造函数，于是不能通过编译：

struct A {
	A() = delete;
	int foo();
};

int main() {
	decltype(A().foo()) x;
	return 0;
}

使用 std::declval 即可解决这个问题：

#include <utility>

struct A {
	A() = delete;
	int foo();
};

int main() {
	decltype(std::declval<A>().foo()) x;
	return 0;
}

当然，许多时候 decltype 也是不方便的，大部分时候所需的类型是与初始化表达式一致的，而使用 decltype 会导致大量重复，考虑下面代码：

int x = 1;
decltype(func(x) * 2LL) y = func(x) * 2LL;

这时，我们就可以使用大名鼎鼎的占位类型说明符 auto 来简化代码：

int x = 1;
auto y = func(x) * 2LL;

自 C++14 起，函数的返回值（如果可以自动推导）也可以是 auto：

auto add(int x, int y) {
	return x + y;
}

不能进行返回类型推导的几种常见情况

函数在形式上有多个 return，且这些 return 返回的类型不相同（即使可以隐式类型转换）：

auto func(int x) {
	if (x) return 1LL;
	return 0;
}

函数返回初始化列表：

auto func() {
	return {1, 2};
}

要求 auto*，但是返回值不是指针：

auto* func(int x) {
	return x;
}

在递归前不存在可推导的返回语句：

auto func(int x) {
   return x == 0 ? 0 : func(x - 1);
}

作为对比，下面写法是可以的：

auto func(int x) {
	if (x == 0) return 0;
	return func(x - 1);
}

虚函数不能使用返回类型推导：

struct F {
	virtual auto f() {
		return 2;
	}
};

占位符 auto 本身会丢失类型限定符，因此占位符 auto 可伴随如 const 或 & 这样的修饰符，它们参与类型推导。考虑下面代码：

template<typename T>
void func();

int main() {
	int y0 = 1;
	const int &y1 = y0;
	auto y2 = y1;
	func<decltype(y1)>();
	func<decltype(y2)>();
	return 0;
}

链接时得到的报错信息类似于：

a.cpp:(.text+0x26): undefined reference to `void func<int const&>()'
a.cpp:(.text+0x2b): undefined reference to `void func<int>()'
collect2: error: ld returned 1 exit status

使用在 C++14 中加入的 decltype(auto) 即可避免这个问题，decltype(auto) 会完全忠实于初始化表达式右侧的类型，如同将其带入 decltype 说明符一般：

template<typename T>
void func();

int main() {
	int y0 = 1;
	const int &y1 = y0;
	decltype(auto) y2 = y1;
	func<decltype(y2)>(); // undefined reference to `void func<int const&>()'
	return 0;
}

迭代器！

迭代器是一种广义化的指针，它使得 C++ 程序可以通过统一的方式处理不同的数据结构。迭代器库提供了迭代器的定义，同时还提供了迭代器特征、适配器及相关的工具函数。

因为迭代器是指针的抽象，所以它们的语义是 C++ 的指针的大多数语义的泛化。这确保指针能够用于所有接受迭代器的函数模板。

最基本的迭代器用法看起来是这样的：

#include <vector>
#include <cstdio>
 
int main() {
	std::vector<int> a{1, 2, 3, 4};
	for (auto it = a.begin(); it < a.end(); ++it) std::printf("%d\n", *it);
	return 0;
}

我们的代码并不需要修改这个 std::vector，因此可以使用不可变迭代器，这在需要遍历的数组是不可变时格外有用：

#include <vector>
#include <cstdio>
 
int main() {
	const std::vector<int> a{1, 2, 3, 4};
	for (auto it = a.cbegin(); it < a.cend(); ++it) std::printf("%d\n", *it);
	return 0;
}

使用范围 for 循环的语法糖，上面代码可以改写成这样：

#include <vector>
#include <cstdio>
 
int main() {
	std::vector<int> a{1, 2, 3, 4};
	for (int x : a) std::printf("%d\n", x);
	return 0;
}

如果需要使用范围 for 循环的语法糖的同时强调其不可变性，则可以使用 std::as_const：

#include <vector>
#include <cstdio>

// std::as_const 需要 C++17，下面是一个可用的低版本替代实现
// template<typename T> struct add_const_t { using type = const T; };
// template<typename T> typename add_const_t<T>::type& as_const(T& t) noexcept { return t; }
// template<typename T> void as_const(const T&&) = delete;

using std::as_const;

int main() {
	std::vector<int> a{1, 2, 3, 4};
	for (int x : as_const(a)) std::printf("%d\n", x);
	return 0;
}

如果我们想要反向遍历容器，可以使用反向迭代器（可以使用 rcbegin() 和 rcend 表示反向不可变迭代器）：

#include <vector>
#include <cstdio>
 
int main() {
	std::vector<int> a{1, 2, 3, 4};
	for (auto it = a.rbegin(); it < a.rend(); ++it) std::printf("%d\n", *it);
	return 0;
}

通过上面初步认识，我们意识到迭代器和指针很相像。但是其比指针更加灵活，例如我们可以用迭代器遍历内部是树形结构而非顺序结构的 std::set：

#include <set>
#include <cstdio>

int main() {
	std::set<int> a{1, 2, 3, 4};
	for (int x : a) std::printf("%d\n", x);
	return 0;
}

std::vector 允许随机访问，因此写出 a.begin() + 5 是合理的，时间复杂度为 \(O(1)\)，但是 std::set 并不支持这一点，它只支持迭代器自增和自减，且时间复杂度为 \(O(\log n)\)。这说明，迭代器有不同的类型，有些可以支持更多的功能。

标准库中 std::advance、std::distance、std::next、std::prev 几个函数提供了对迭代器的基本操作。具体而言，

std::advance：接受一个迭代器 it 和一个距离 d（可为负值）作为参数，将增加给定的迭代器 it 以 d 个元素的步长；
std::distance：接受两个迭代器 first 和 last 作为参数，返回从 first 到 last 的路程（若 last 不可从 first 通过若干次自增 first 抵达，则行为未定义）；
std::next：接受一个迭代器 it 和一个距离 n （默认值为 \(1\)）作为参数，返回迭代器 it 的第 n 个后继；
std::prev：接受一个迭代器 it 和一个距离 n （默认值为 \(1\)）作为参数，返回迭代器 it 的第 n 个前驱。

标准库中有大量的函数接受迭代器作为参数，例如我们耳熟能详的 std::sort、std::reverse、std::unique 、std::shuffle 等。下面介绍几个使用迭代器的标准库函数。

std::fill、std:count、std::iota 三个函数接受两个迭代器 first 和 last 以及一个值 value 作为其参数，它们的作用分别如下：

std::fill：赋值给定的 value 给范围 [first, last) 中的元素；
std::count：计数在范围范围 [first, last) 中等于 value 的元素；
std::iota：以始于 value 并重复地求值 ++value 的顺序递增值填充范围 [first, last) 。

std::copy、std::copy_backward、std::move、std::move_backward 四个函数接受三个迭代器 s_first、s_last 和 d_first 作为其参数，它们的作用分别如下：

std::copy：复制 [s_first, s_last) 所定义的范围中的元素到始于 d_first 的另一范围（若 d_first 包含于 [s_first, s_last) 则行为未定义）；
std::copy_backward：从后向前复制 [s_first, s_last) 所定义的范围中的元素到始于 d_first 的另一范围；
std::move：移动 [s_first, s_last) 所定义的范围中的元素到始于 d_first 的另一范围（若 d_first 包含于 [s_first, s_last) 则行为未定义）；
std::move_backward：从后向前移动 [s_first, s_last) 所定义的范围中的元素到始于 d_first 的另一范围；

std::fill_n 函数接受一个迭代器 first、一个长度 n 以及一个值 value 作为其参数，其作用是将给定的 value 赋给从 first 开始的范围的首 n 个元素，若 \(n\leq 0\) 则不做任何事。

std::copy_n 函数接受一个迭代器 first、一个长度 n 以及另一个迭代器 res 作为其参数，其作用是准确复制来自始于 first 的范围的 n 个值到始于 res 的范围，若 \(n\leq 0\) 则不做任何事。

除了用于访问元素的迭代器，一些容器还支持插入迭代器 std::inserter_iterator，下面是一个使用 std::back_inserter 的例子（需要容器支持 push_back 操作）：

#include <vector>
#include <cstdio>

int main() {
	std::vector<int> a{1, 2, 3, 4};
	auto it = std::back_inserter(a);
	*it = 1;
	*it = 2;
	for (int x : a) {
		printf("%d ", x);
	}
	return 0;
}

上面的代码输出：

1 2 3 4 1 2

搭配使用需要输出迭代器的函数会很方便：

#include <bits/stdc++.h>

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::fill_n(std::back_inserter(a), 3, 5);
	for (int x : a) {
		printf("%d ", x);
	}
	return 0;
}

上面的代码输出：

1 2 3 4 5 5 5

类似的还有 std::front_inserter 用于在容器开头插入（需要支持 push_front）以及 std::inserter 用于在容器的特定位置插入（需要支持 insert），下面例子指出了 std::inserter 可以很好地搭配 std::set：

#include <bits/stdc++.h>

int main() {
	std::set<int> a{1, 2, 3, 4};
	std::vector<int> b{7, 8, 9};
	std::copy(b.begin(), b.end(), std::inserter(a, a.end()));
	for (int x : a) {
		printf("%d ", x);
	}
	return 0;
}

上面的代码输出：

1 2 3 4 7 8 9

输入输出迭代器（std::istream_iterator 和 std::ostream_iterator）赋予了将输入输出流当做迭代器的能力：

#include <bits/stdc++.h>

int main() {
	std::istringstream in("5 1 2 3 4 5");
	int n;
	in >> n;
	std::vector<int> a;
	std::copy_n(std::istream_iterator<int>(in), n, std::back_inserter(a));
	std::copy(a.begin(), a.end(), std::ostream_iterator<int>(std::cout, " "));
	return 0;
}

上面的代码输出：

1 2 3 4 5

也可以将上面代码的 in 改为 std::cin，这将改为由标准输入流输入数据。

函数！

函数作为在内存中的代码片段，我们可以取其地址存放在变量中，这个过程是隐含的：

#include <cstdio>

int add(int x, int y) {
	return x + y;
}

int main() {
	using add_type = int (*)(int, int);
	add_type a = add; // 隐式取址，等价于 &add
	std::printf("%d\n", a(1, 2));
	return 0;
}

当然，上面代码也可以直接使用 auto 而不必显式地定义 add_type 类型，这里只是为了说明一个函数指针的类型是：

返回类型 (*) (参数类型列表)

一个具有实践意义的例子是 std::sort 的比较函数：

#include <cstdio>
#include <algorithm>

bool cmp(int x, int y) {
	return x > y;
}

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::sort(a.begin(), a.end(), cmp);
	for (auto x : a) std::printf("%d\n", x);
	return 0;
}

我们给 std::sort 传入作为第三个参数的比较函数就是使用函数指针的方式传入的。当然，你可能知道标准库中的 std::greater，这将大幅简化代码：

#include <cstdio>
#include <functional>
#include <algorithm>

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::sort(a.begin(), a.end(), std::greater<>());	// C++ 14 允许省略可推导的模板参数
														// 在此之前则需要在尖括号中指明 int
	for (auto x : a) std::printf("%d\n", x);
	return 0;
}

那么 std::greater<>() 到底是什么？事实上，std::greater 是一个仿函数，它本质上是一个实现了 operator() 的结构体：

#include <cstdio>
#include <algorithm>

struct greater {
	bool operator()(int x, int y) {
		return x > y;
	}
};

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::sort(a.begin(), a.end(), greater());
	for (auto x : a) std::printf("%d\n", x);
	return 0;
}

在 C++11 中，我们迎来了一个新的语法（糖），使用 Lambda 表达式来创建一个“函数”：

#include <cstdio>
#include <algorithm>

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::sort(a.begin(), a.end(), [](int x, int y) -> bool {
		return x > y;
	});
	for (auto x : a) std::printf("%d\n", x);
	return 0;
}

你可能意识到了，Lambda 表达式本质上就是创建仿函数结构体的一个语法糖而已，因此即使两个 Lambda 拥有相同的返回类型和参数列表，它们的类型也不是相同的，下面代码验证了这一点：

#include <cstdio>

template<typename T, typename U>
struct is_same_helper {};

template<typename T>
struct is_same_helper<T, T> {
	using type = int;
};

template<typename T, typename U>
void _is_same(char x) {
	std::puts("not same!");
}

template<typename T, typename U>
void _is_same(typename is_same_helper<T, U>::type x) {
	std::puts("same!");
}

template<typename T, typename U>
void is_same() {
	_is_same<T, U>(0);
}

int main() {
	auto func1 = [](int, int) {};
	auto func2 = [](int, int) {};
	is_same<decltype(func1), decltype(func2)>(); // 输出 not same!
}

一般而言，Lambda 表达式可以省略其不需要的部分，下面均为合法的 lambda 表达式：

[] {}
[] (int x) {}
[sum = 0] (int x) mutable noexcept -> void {}

Lambda 表达式最开始的方括号是不可省略的，它被称为 Lambda 表达式的捕获列表，它可以以一个默认捕获符开始：

&（以引用隐式捕获被使用的自动变量）；
=（以复制隐式捕获被使用的自动变量）。

随即是要捕获的变量的列表，若变量名以 & 作为前缀，则其是引用捕获的，否者是值捕获的。这两种捕获模式的区别在于：

引用捕获可用于修改外部变量，而值捕获却不能实现此操作；
引用捕获会反映外部变量的更新，而值捕获不会。

Lambda 默认不可变

Lambda 表达式的值捕获变量默认是不可变的，下面代码将产生编译错误：

int main() {
	int x = 1;
	auto func1 = [x] {
		x = 2;
	};
	return 0;
}

如果将 Lambda 修饰为 mutable，则允许修改这些变量的副本：

#include <cstdio>

int main() {
	int x = 1;
	auto func1 = [x]() mutable {
		x = 2;
		std::printf("%d\n", x);
	};
	func1();
	std::printf("%d\n", x);
	return 0;
}

上面代码输出为：

2
1

Lambda 与悬空引用

Lambda 表达式并不会延长所引用捕获的变量的声明周期：

#include <cstdio>

auto make() {
	int x = 1;
	return [&x](int y) {
		return x = x + y;
	};
}

int main() {
	auto func = make();
	printf("%d\n", func(1));
	return 0;
}

上面代码在 make 函数返回后，其中的 x 生命周期结束，但是其返回的 func 还保有对 x 的引用，这时这个引用变为悬空引用，调用 func 的行为是未定义的。

一个值得注意的例外是，如果 Lambda 使用了以引用捕获的引用，那么它使用原引用所指代的对象，而非被捕获的引用自身，因此下面代码是可行的：

> #include <cstdio>
 
auto make_function(int& x) {
    return [&]{ std::printf("%d\n", x); };
}
 
int main() {
    int i = 3;
    auto f = make_function(i); // f 中对 x 的使用直接绑定到 i
    i = 5;
    f(); // 输出 5
}

容易看出，Lambda 表达式的捕获列表，本质上是仿函数结构体中的成员变量，例如下面 Lambda 表达式：

#include <cstdio>

int main() {
	int x = 1, y = 2;
	auto func = [&x, y] {
		x += y;
	};
	func();
	std::printf("%d\n", x); // 输出 3
	return 0;
}

可以使用仿函数结构体重写为：

#include <cstdio>

int main() {
	int x = 1, y = 2;
	struct _lambda_func {
		int &x;
		const int y;
		_lambda_func(int &x, int y) : x(x), y(y) {}
		void operator()() {
			x += y;
		}
	} func(x, y);

	func();
	std::printf("%d\n", x); // 输出 3
	return 0;
}

在 C++14 后，我们可以进行带初始化器的捕获，即在捕获子句中引入和初始化新变量，而无需将这些变量声明于 Lambda 函数的封闭范围内。初始化可以任何任意表达式表示，它的行为如同它声明并显式捕获一个以类型 auto 声明的变量，该变量的声明区是 Lambda 表达式体，但：

如果以复制捕获，那么闭包对象的非静态数据成员是指代这个 auto 变量的另一种方式。
如果以引用捕获，那么引用变量的生存期在闭包对象的生存期结束时结束。

下面代码将输出 1：

#include <cstdio>

int main() {
	int x = 1;
	auto func = [x = x]() {};
	std::printf("%d\n", func.x);
}

在 C++14 中，如果参数类型是泛型，则可以使用 auto 关键字作为类型说明符。此关键字将告知编译器将函数调用运算符创建为模板。参数列表中的每个 auto 实例等效于一个不同的类型参数。例如：

#include <cstdio>
#include <algorithm>

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::sort(a.begin(), a.end(), [](auto x, auto y) {
		return x > y;
	});
	for (auto x : a) std::printf("%d\n", x);
	return 0;
}

从现在开始，除非特殊说明，我们提及“函数”这个词实际上是指任何可调用的东西，包括函数指针，仿函数和 Lambda。为了方便起见，我们把返回类型为 bool 的函数称为谓词，只接受一个参数的函数称为一元函数，只接受一个参数的谓词被称为一元谓词。

下面介绍几个需要一元谓词的标准库函数：

std::all_of、std::any_of、std::none_of、std::count_if、std::find_if、std::find_if_not 六个函数接受两个迭代器 first 和 last 以及一个一元谓词 p 作为参数，它们的功能分别是：

std::all_of：检查一元谓词 p 是否对范围 [first, last) 中所有元素返回 true；
std::any_of：检查一元谓词 p 是否对范围 [first, last) 中至少一个元素返回 true；
std::none_of：检查一元谓词 p 是否不对范围 [first, last) 中任何元素返回 true；
std::count_if：计数一元谓词 p 对范围 [first, last) 中多少个元素返回 true；
std::find_if：返回一元谓词 p 对范围 [first, last) 中首个返回 true 的元素的迭代器，未找到则返回 last；
std::find_if_not：返回一元谓词 p 对范围 [first, last) 中首个不返回 true 的元素的迭代器，未找到则返回 last。

std::for_each 函数接受两个迭代器 first 和 last 以及一个一元函数 f 作为参数，它按顺序应用给定的函数对象 f 到解引用范围 [first,last) 中每个迭代器，返回 f 的右值引用。特别地，如果迭代器类型是可变的，那么 f 可以通过解引用后的迭代器修改范围的元素。忽略 f 返回的结果。

我们可以使用 std::for_each 来将一个函数应用到一个序列的每个元素上，下面代码会把 a 中的每个元素翻倍：

#include <cstdio>
#include <vector>
#include <algorithm>
 
int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::for_each(a.begin(), a.end(), [](int &x) {
		x *= 2;
	});
	for (int x : a) std::printf("%d\n", x);
	return 0;
}

由于 std::for_each 返回 f 的右值引用，结合带有有初始化器的捕获的 Lambda，可以完成一些复杂的统计，下面代码计算 a 中元素的平方和：

#include <cstdio>
#include <vector>
#include <algorithm>
 
int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::printf("%d\n", std::for_each(a.begin(), a.end(), [sum = 0](int &x) mutable {
		sum += x * x;
	}).sum);
	return 0;
}

std::transform 函数接受三个迭代器 s_first、s_last 和 d_first 以及一个一元函数 f 作为参数，应用给定的函数到范围 [s_first, s_last) 并将结果存储到始于 d_first 的范围。下面代码将字符串 a 转化为大写字母：

#include <bits/stdc++.h>
 
int main() {
	std::string a{"hello!"};
	std::transform(a.cbegin(), a.cend(), a.begin(), [](auto x) { return std::toupper(x); });
	std::printf("%s\n", a.c_str());
	return 0;
}

std::copy_if 函数接受三个迭代器 s_first、s_last 和 d_first 以及一个一元谓词 p 作为参数，复制范围 [s_first, s_last) 中的所有谓词 p 返回 true 的元素到始于 d_first 的范围。下面代码将输出 a 中所有的偶数：

#include <bits/stdc++.h>

int main() {
	std::vector<int> a{1, 2, 3, 4};
	std::copy_if(a.begin(), a.end(), std::ostream_iterator<int>(std::cout, " "), [](int x) {
		return x % 2 == 0;
	});
	return 0;
}

std::generate 函数接受两个迭代器 first 和 last 以及一个无参数的函数 f 作为参数，以给定函数对象 f 所生成的值赋值范围 [first, last) 中的每个元素。

std::generate_n 函数接受一个迭代器 first 和长度 n 以及一个无参数的函数 f 作为参数，赋值给定函数对象 f 所生成的值给始于 first 的范围的首 n 个元素，若 \(n\leq0\) 则不做任何事。下面代码将向标准输出流输出 5 个随机数：

#include <bits/stdc++.h>

int main() {
	std::mt19937 rng(std::random_device{}());
	std::generate_n(std::ostream_iterator<int>(std::cout, " "), 5, std::ref(rng));
	return 0;
}

posted @ 2023-07-17 21:24 方而静阅读(234) 评论(0) 收藏举报

刷新页面返回顶部

方而静的博客

C++14指北：花里胡哨的C++

类型！

迭代器！

函数！

公告