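A Collection of Black-Magic Tricks

This post mainly records (black-magic) tricks that can be used during a contest.

If you have anything to add, feel free to leave a comment.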

I/O Optimization

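ios::sync_with_stdio(false), cin.tie(nullptr), cout.tie(nullptr) turns off synchronization with C stdio and unties cin from cout, which is sometimes faster than hand-written fast read/write. (cout.tie(nullptr) is harmless but has no effect, since cout is not tied to another stream by default.) After this call, do not mix cin/cout with scanf/printf. When used together with file I/O, add cout.flush() at the end of the program.
(Linux and other POSIX systems only) getchar_unlocked() and putchar_unlocked() can replace getchar() and putchar() respectively as an optimization.
In general, printf() is faster than a hand-written fast-write routine.
A fast read/write template can also be used.
When printing a single character with cout, use single quotes (a char), not double quotes (a string).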
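
The fast read/write idea from the list above can be sketched roughly as follows. This is a minimal sketch, not the template the post refers to: the names read_int and write_int are hypothetical, and the code uses the portable fgetc/fputc so that it compiles everywhere. On Linux, the calls could be swapped for getchar_unlocked() / putchar_unlocked() when working on stdin / stdout.

```cpp
#include <cstdio>

// Hypothetical helper: reads one (possibly negative) decimal integer,
// skipping any leading non-numeric characters. Returns 0 at end of input.
inline long long read_int(std::FILE* in = stdin) {
    int c = std::fgetc(in);
    while (c != '-' && (c < '0' || c > '9')) {
        if (c == EOF) return 0;
        c = std::fgetc(in);
    }
    bool negative = false;
    if (c == '-') {
        negative = true;
        c = std::fgetc(in);
    }
    long long x = 0;
    while (c >= '0' && c <= '9') {
        x = x * 10 + (c - '0');
        c = std::fgetc(in);
    }
    return negative ? -x : x;
}

// Hypothetical helper: writes one integer digit by digit.
// The magnitude is kept in an unsigned type so that the most negative
// value does not overflow when negated.
inline void write_int(long long x, std::FILE* out = stdout) {
    unsigned long long u = static_cast<unsigned long long>(x);
    if (x < 0) {
        std::fputc('-', out);
        u = 0 - u;
    }
    char buf[20];
    int len = 0;
    do {
        buf[len++] = static_cast<char>('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (len > 0) std::fputc(buf[--len], out);
}
```

A typical use would be `long long n = read_int();` followed by `write_int(n); std::fputc('\n', stdout);`. Whether this beats cin with synchronization turned off depends on the judge and the input size, so it is worth measuring.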

Bit Operations

All variables in this section are integers.

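x << k can be used for \(x \cdot 2^k\), provided the result does not overflow.
x >> k can be used for \(\left \lfloor \dfrac{x}{2^k} \right \rfloor\). When \(k\) is greater than or equal to the bit width of the type of \(x\) (e.g. shifting an int \(x\) right by \(35\) bits), the behavior is undefined; with GCC 9.4.0 the observed result was that \(k\) is reduced modulo the bit width, but this must not be relied on.
(lowbit) x & -x gives the number formed by the lowest \(1\) bit in the binary representation of \(x\) together with the \(0\)s after it.
For \(x > 0\), (x & (x - 1)) == 0 tests whether \(x\) is an integer power of \(2\). Note that the expression is also true for \(x = 0\).
(int division optimization) For a non-negative int a and a positive int x, a / x can be rewritten as a * m >> n, where m = ((2147483648LL << log2(x)) + x - 1) / x, n = 31 + log2(x), log2(x) means \(\lfloor \log_2 x \rfloor\), and the product a * m is computed in 64 bits. This is worthwhile when the divisor is fixed or the division is executed a very large number of times. With these constants the result is not exact for every divisor over the whole int range (for x = 7 and a = 2147483645 it is one too large), so check it against the range of a actually used. Compilers already apply a similar transformation when the divisor is a compile-time constant.
- Note: 2147483648 = \(2^{31}\), i.e. 1LL << 31.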
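
The bit tricks above can be sketched as small helpers. This is an illustrative sketch only: the names lowbit, is_pow2 and MagicDiv are hypothetical, `__builtin_clz` assumes GCC or Clang, and MagicDiv uses exactly the constants from the list, including their limitation.

```cpp
// Lowest set bit of x together with the zeros below it.
// Assumes x is not the most negative int, for which -x overflows.
constexpr int lowbit(int x) { return x & -x; }

// True only for 1, 2, 4, 8, ...; the x > 0 guard excludes 0 and negatives.
constexpr bool is_pow2(int x) { return x > 0 && (x & (x - 1)) == 0; }

// Division of a non-negative int by a fixed positive divisor x,
// done as one multiplication and one shift.
struct MagicDiv {
    long long m;  // multiplier: ceil(2^n / x)
    int n;        // shift amount: 31 + floor(log2(x))

    explicit MagicDiv(int x) {
        int k = 31 - __builtin_clz(x);  // floor(log2(x)); requires x > 0
        n = 31 + k;
        m = ((2147483648LL << k) + x - 1) / x;
    }

    // Requires a >= 0. The product a * m is formed in 64 bits.
    // Not exact for every (a, x) pair: MagicDiv(7).div(2147483645)
    // returns 306783378, while 2147483645 / 7 is 306783377.
    int div(int a) const { return static_cast<int>(a * m >> n); }
};

static_assert(lowbit(12) == 4, "12 = 0b1100, lowest set bit is 4");
static_assert(is_pow2(64) && !is_pow2(0), "0 is not a power of two");
```

For example, `MagicDiv d(10); d.div(12345)` gives 1234. Because of the inexact cases, a brute-force comparison against a / x over the intended input range is a sensible safeguard before using this in a solution.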

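Built-in Functions

The functions in this section are GCC/Clang builtins (or libstdc++ internals) that usually compile down to one or a few machine instructions, so they are extremely fast.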

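__builtin_ctz(unsigned int x) / __builtin_ctzll(unsigned long long x): returns the number of trailing \(0\)s in the binary representation of x. The result is undefined for x = 0.
__builtin_clz(unsigned int x) / __builtin_clzll(unsigned long long x): returns the number of leading \(0\)s in the binary representation of x. The result is undefined for x = 0.
__builtin_popcount(unsigned int x) / __builtin_popcountll(unsigned long long x): returns the number of \(1\)s in the binary representation of x.
__builtin_parity(unsigned int x) / __builtin_parityll(unsigned long long x): returns the parity of the number of \(1\)s in the binary representation of x: \(0\) for even, \(1\) for odd.
__builtin_ffs(int x) / __builtin_ffsll(long long x): returns the 1-based position of the lowest \(1\) in the binary representation of x, or \(0\) if x is \(0\).
__lg(int x): returns \(\lfloor \log_2(x) \rfloor\) for x > 0. This is an internal helper of libstdc++ rather than a compiler builtin, so it may not be available with other standard libraries.
__builtin_sqrt(double x) / __builtin_sqrtf(float x) / __builtin_sqrtl(long double x): returns the square root of x.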
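
A few thin wrappers show how the builtins listed above are typically used. This is a sketch assuming GCC or Clang and a 32-bit unsigned int; the wrapper names are hypothetical and exist only to make the zero-input cases explicit.

```cpp
// Number of trailing zero bits. x == 0 is handled separately because
// __builtin_ctz(0) is undefined.
inline int trailing_zeros(unsigned x) { return x ? __builtin_ctz(x) : 32; }

// floor(log2(x)) for x > 0, derived from the leading-zero count.
// Calling it with x == 0 is undefined, as for __builtin_clz itself.
inline int floor_log2(unsigned x) { return 31 - __builtin_clz(x); }

// Number of set bits in a 64-bit value.
inline int bit_count(unsigned long long x) { return __builtin_popcountll(x); }

// 1-based position of the lowest set bit, or 0 when x == 0.
inline int lowest_bit_pos(int x) { return __builtin_ffs(x); }
```

For instance, `floor_log2(10u)` is 3 and `lowest_bit_pos(12)` is 3, since 12 is 1100 in binary.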

Policy-Based Data Structures

Abbreviation: pb_ds, ~~definitely not a flat-panel TV (平板电视)~~.
Header: <bits/extc++.h>.
For details, see the linked introduction.
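
As one example of what pb_ds offers, the sketch below sets up an order-statistics tree. It assumes GCC with libstdc++ and includes the two specific pb_ds headers instead of <bits/extc++.h>; the alias ordered_set and the helpers rank_of and kth are hypothetical names. Keys must be distinct in this form; storing pairs of (value, unique id) is a common way to allow duplicates.

```cpp
#include <cstddef>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <functional>

// Red-black tree of ints that additionally maintains subtree sizes,
// which enables order_of_key and find_by_order in O(log n).
using ordered_set = __gnu_pbds::tree<
    int, __gnu_pbds::null_type, std::less<int>,
    __gnu_pbds::rb_tree_tag,
    __gnu_pbds::tree_order_statistics_node_update>;

// Number of stored keys strictly smaller than x (a 0-based rank).
inline std::size_t rank_of(ordered_set& s, int x) { return s.order_of_key(x); }

// The k-th smallest key, 0-based. Requires k < s.size().
inline int kth(ordered_set& s, std::size_t k) { return *s.find_by_order(k); }
```

After inserting 5, 1, 3 and 9, `rank_of(s, 3)` is 1 and `kth(s, 0)` is 1.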

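Pseudo Balanced Tree

A vector combined with lower_bound and upper_bound can implement a pseudo balanced tree with \(O(n^2)\) total time in the worst case, since each insertion or deletion shifts \(O(n)\) elements. Thanks to a mysteriously small constant factor, it still passes the template problem and runs very fast.

Below is the AC code for the template problem (it requires C++20 for std::ranges), together with the AC record.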

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int main()
{
    int n;
    vector<int> M;

    cin >> n;

    while (n-- > 0) {
        int opt, x;
        cin >> opt >> x;

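        if (opt == 1)
            // Insert x, keeping the vector sorted.
            M.insert(ranges::lower_bound(M, x), x);
        else if (opt == 2)
            // Erase one occurrence of x (assumed to be present).
            M.erase(ranges::lower_bound(M, x));
        else if (opt == 3)
            // Rank of x: number of elements smaller than x, plus 1.
            cout << ranges::lower_bound(M, x) - M.begin() + 1 << '\n';
        else if (opt == 4)
            // The x-th smallest element (1-based).
            cout << M[x - 1] << '\n';
        else if (opt == 5)
            // Predecessor: the largest element smaller than x.
            cout << *(ranges::lower_bound(M, x) - 1) << '\n';
        else if (opt == 6)
            // Successor: the smallest element greater than x.
            cout << *ranges::upper_bound(M, x) << '\n';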
    }

    return 0;
}

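Other Optimizations

Laying out memory that is accessed in sequence contiguously speeds things up.

Example: when writing binary lifting, the array f[N][32] can be declared as f[32][N], so that the memory accessed is closer together.
Using bitset instead of a bool array can reduce the time of whole-array operations to \(1 / \omega\) of the original (because bitset packs bits into machine words), where \(\omega\) is usually \(32\) or \(64\) (depending on the machine).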

Prefer bitset's member functions, such as .set(), over assignment with =.

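Example: in practice, the sieve of Eratosthenes with bitset is often faster than the linear sieve.
Values that are expensive to compute and used many times can be stored in a local variable.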

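Example: when c % 114514 is used many times, storing it once as tmp = c % 114514 can speed things up.
Some STL containers have a large constant factor, such as vector, list, and any container adaptor that uses deque as its underlying container (stack and queue by default). For some containers, reserve() can be used to allocate space in advance.
When defining constants, use constexpr / #define rather than const.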

Example: const int N = 114514 \(\to\) constexpr int N = 114514 / #define N 114514.
Reduce nested array indexing.

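Example: a[b[c[i]]] \(\to\) ci = c[i], bci = b[ci], abci = a[bci].
When a / m is very small (with a non-negative and m positive), a loop can replace the modulo: a % m \(\to\) while (a >= m) { a -= m; }.
Because && and || short-circuit, put the condition that is more likely to be false before &&, and the condition that is more likely to be true before ||.
Make array sizes slightly smaller than a power of \(2\) rather than exactly a power of \(2\).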
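
The bitset sieve mentioned in the list above can be sketched as follows. This is a minimal sketch under stated assumptions: the bound LIMIT and the function name sieve_primes are hypothetical, and no claim is made here about how it compares with a linear sieve on any particular judge.

```cpp
#include <bitset>
#include <vector>

// Hypothetical upper bound for the sieve.
constexpr int LIMIT = 1000000;

// One bit per number; a set bit means "known to be composite".
// Kept at namespace scope so it does not live on the stack.
std::bitset<LIMIT + 1> composite;

// Sieve of Eratosthenes: returns all primes in [2, LIMIT] in increasing order.
std::vector<int> sieve_primes() {
    composite.reset();  // member function clears every bit at once
    std::vector<int> primes;
    for (int i = 2; i <= LIMIT; ++i) {
        if (composite[i]) continue;
        primes.push_back(i);
        // Start at i * i: smaller multiples were crossed out by smaller primes.
        for (long long j = 1LL * i * i; j <= LIMIT; j += i) {
            composite[j] = true;
        }
    }
    return primes;
}
```

The packed bitset for this bound takes about 125 KB, compared with about 1 MB for a bool array of the same length, which helps it stay in cache.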

posted @ 2025-08-02 16:12 David9006
