矩阵快速幂

讲矩阵快速幂之前，先讲讲它的一些前置知识。

一.矩阵

矩阵是一个按照矩形排列的二维数表，由行和列组成，通常用大写字母$A、B、C$表示。

一般来说，一个$n \times m$ 的矩阵$A$可以表示为:

其中$n、m$分别表示行数和列数，$a_{i,j}$表示第$i$行第$j$列的元素。

二、矩阵加法

对于两个同型（都为$n \times m$的矩阵）的矩阵$A、B$而言，它们相加的和、矩阵$C$也是$ n \times m$的，即：

$ A=\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,m} \\ a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & a_{n,3} & \cdots & a_{n,m} \\ \end{bmatrix} $ $+$ $ B=\begin{bmatrix} b_{1,1} & b_{1,2} & b_{1,3} & \cdots & b_{1,m} \\ b_{2,1} & b_{2,2} & b_{2,3} & \cdots & b_{2,m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_{n,1} & b_{n,2} & b_{n,3} & \cdots & b_{n,m} \\ \end{bmatrix} $ $=$ $ C=\begin{bmatrix} c_{1,1} & c_{1,2} & c_{1,3} & \cdots & c_{1,m} \\ c_{2,1} & c_{2,2} & c_{2,3} & \cdots & c_{2,m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ c_{n,1} & c_{n,2} & c_{n,3} & \cdots & c_{n,m} \\ \end{bmatrix} $

其中$\forall c_{i,j}，c_{i,j}=a_{i,j}+b_{i,j}$。

易证得矩阵加法满足交换律和结合律，即$A+B=B+A$，$A+(B+C)=(A+B)+C$。

例.B2104

红的不能再红的板子题，不讲了，看代码。

点击查看代码

#include<iostream>
#include<cstdio>
using namespace std;
const int N=520;
const int M=520;
int n,m,a[N][M],b[N][M],c[N][M];
int main(){
	cin>>n>>m;
	for(int i=1;i<=n;i++){
		for(int j=1;j<=m;j++){
			cin>>a[i][j];
		}
	}
	for(int i=1;i<=n;i++){
		for(int j=1;j<=m;j++){
			cin>>b[i][j];
		}
	}
	for(int i=1;i<=n;i++){
		for(int j=1;j<=m;j++){
			c[i][j]=a[i][j]+b[i][j];
		}
	}
	for(int i=1;i<=n;i++){
		for(int j=1;j<=m;j++){
			cout<<c[i][j]<<" ";
		}
		cout<<endl;
	}
	return 0;
}

三、矩阵乘法

什么是矩阵乘法呢？对于一个$n \times m$的矩阵$A$，和一个$m \times p$的矩阵$B$，它们相乘会得到一个$n \times p$的矩阵$C=A \times B$，且

$\forall c_{i,j}，c_{i,j}=\sum\limits_{k=1}^{m} {a_{i,k} \times b_{k,j} }$

举个栗子，
$ A=\begin{bmatrix} 6 & 8 & 5 \\ 3 & 4 & 1 \\ \end{bmatrix} $，$ B=\begin{bmatrix} 1 & 3 & 6 & 6 \\ 2 & 8 & 4 & 4 \\ 5 & 6 & 1 & 8 \\ \end{bmatrix} $ 则 $ C=\begin{bmatrix} 47 & 112 & 73 & 108 \\ 16 & 47 & 35 & 42 \\ \end{bmatrix} $。

矩阵乘法满足结合律和分配律，不满足交换律。

※数字乘里有单位1，矩阵乘里也有单位矩阵，一个$n \times n$的单位矩阵$A$形如：

$ A=\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ \end{bmatrix} $

也就是$\forall i \in [1,n],s_{i,i}=1$，而其他位置是0。

显然，根据上面的矩阵乘法规则，任意$m \times n$的矩阵$B$与之相乘，结果仍然是$B$。

例：B2105

板子题没什么好说的。

点击查看代码

#include<iostream>
#include<cstdio>
using namespace std;
const int N=520;
const int M=520;
const int K=520;
int n,m,k,a[N][M],b[M][K],c[N][K];
int main(){
	cin>>n>>m>>k;
	for(int i=1;i<=n;i++){
		for(int j=1;j<=m;j++){
			cin>>a[i][j];
		}
	}
	for(int i=1;i<=m;i++){
		for(int j=1;j<=k;j++){
			cin>>b[i][j];
		}
	}
    //矩阵乘法，套公式就行
	for(int i=1;i<=n;i++){
		for(int j=1;j<=k;j++){
			for(int h=1;h<=m;h++){
				c[i][j]+=a[i][h]*b[h][j];
			}
		}
	}
	for(int i=1;i<=n;i++){
		for(int j=1;j<=k;j++){
			cout<<c[i][j]<<" ";
		}
		cout<<endl;
	}
	return 0;
}

四、龟速乘

龟速乘是一种解决大整数相乘问题的算法，其核心思想为：

$a \times b$等价于$\underbrace{a + a + a + \cdots + a}_{b\text{ 个 a}}$

而$\forall b \in N，b=2^{k1} + 2^{k2} + \cdots + 2^{kn} $。

所以$\underbrace{a + a + a + \cdots + a}_{b\text{ 个 a}}=a \times 2^{k1} + a \times 2^{k2} + \cdots + a \times 2^{kn}$。

龟速乘就是将这样的$2^i$次方的a取模后相加，这样就不会爆long long了。

具体来说，先将b末位1取出，再加上对应个数的a，然后把这个1除掉，如此循环直到b=0。

而在找b末位1的过程中，a每次都要乘2，这样等到找到末位1时，a的值就是要累加的值了。

例.P10446

刚才讲过了，不说思路了。

点击查看代码

#include<iostream>
#include<cstdio>
using namespace std;
long long a,b,p,ans;
int main(){
	scanf("%lld%lld%lld",&a,&b,&p);
	while(b){//b等于0就结束了
		if(b&1){
			ans=(ans+a)%p;//如果b这个二进制位是1就累加进去
		}
		a=(a+a)%p;//a的更新
		b>>=1;//清除掉这个二进制位
	}
	printf("%lld",ans);
	return 0;
}

五.快速幂

快速幂的思想和龟速乘如出一辙，都是将b分解为二进制数位（$b=2^{k1} + 2^{k2} + \cdots + 2^{kn} $）。龟速乘是a个$2^{ki}$累加，而快速幂，由于$a^{k1 + k2 + \cdots +kn}=a^{k1} \times a^{k2} \times \cdots \times a^{kn}$（这条性质不知道的找xm补课去），所以改成a个$2^{ki}$累乘就行了。

例.P1226

刚才讲过了，不说思路了。

点击查看代码

#include<iostream>
#include<cstdio>
using namespace std;
long long a,b,p,ans=1;
int main(){
	scanf("%lld%lld%lld",&a,&b,&p);
	printf("%lld^%lld mod %lld=",a,b,p);
	while(b){
		if(b&1){
			ans=(ans*a)%p;
		}
		a=(a*a)%p;
		b>>=1;
	}
	printf("%lld",ans);
	return 0;
}

----------------------------------------------------------并不华丽的分割线----------------------------------------------------------

好了，前置知识终于讲完了，终于可以讲矩阵快速幂了。

六.矩阵快速幂

什么是个矩阵快速幂呢？其实就是将矩阵乘法和快速幂融合出的产物。

具体地说，就是把快速幂里的数字乘改成矩阵乘。然后它就是矩阵快速幂了。

矩阵快速幂有一个非常典型的应用，就是矩阵加速——可以将一些递推式转换为矩阵幂，通过构造矩阵、矩阵快速幂的方式，把时间复杂度降到$O(logn)$。

当然，这还是很考验数学功底的。而且并不是所有递推式都可以用矩阵幂表示。

总之，我们还是先看它最基本的例题吧。

例1.P3390

刚才已经讲过思想了，现在咱讲讲代码。

如何将数字乘变成矩阵乘呢？一种常用的办法是将$*$号重载成矩阵乘法，这样把矩阵放进结构体就好了。具体可以参考一下代码：

点击查看代码

#include<iostream>
#include<cstdio>
#include<cstring>
using namespace std;
const int N=100;
const int MOD=1e9+7;
long long n,k;
struct sw{
	long long mx[N][N];
}a,b,s;
sw operator *(const sw a,const sw b){//这里就是运算符重载了，里面就是普通的矩阵乘
	sw c;
	memset(c.mx,0,sizeof c.mx);
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			for(int p=1;p<=n;p++){
				c.mx[i][j]=(c.mx[i][j]+(a.mx[i][p]*b.mx[p][j])%MOD)%MOD;
			}
		}
	}
	return c;
}
int main(){
	scanf("%lld%lld",&n,&k);
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			scanf("%d",&a.mx[i][j]);
		}
	}
    //数字乘里s是单位1，这里s得变成单位矩阵
	for(int i=1;i<=n;i++) s.mx[i][i]=1;
    //快速幂
	while(k){
		if(k&1) s=s*a;
		a=a*a;
		k>>=1;
	}
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			printf("%d ",s.mx[i][j]);
		}
		printf("\n");
	}
	return 0;
}

例2.P10502

我们规定$solve(k)=A^1+A^2+ \cdots +A^k$。

如果正常一个个求大抵会超时。这里我们注意到：

当k=1时显然结果为$A$。

当k为奇数时，把后面的高次幂项提出一个$A^{(k-1)/2}$：

$solve(k)=A^1+A^2+ \cdots +A^k=（A^1+A^2+ \cdots +A^{(k-1)/2})+A^{(k-1)/2} \times （A^1+A^2+ \cdots +A^{(k-1)/2})+A^k=solve((k-1)/2) + solve((k-1)/2) \times A^{(k-1)/2} + A^k$。

当k为偶数时，同样把后面的高次幂项提出一个$A^{k/2}$：

$solve(k)=A^1+A^2+ \cdots +A^k=（A^1+A^2+ \cdots +A^{k/2})+A^{k/2}（A^1+A^2+ \cdots +A^{k/2})=solve(k/2) + solve(k/2) \times A^{k/2}$

此时它就被转化成了一个分治+矩阵快速幂的题目。

代码：

点击查看代码

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;

const int N=35;
ll n,k,mod;
struct sw{
	ll mx[N][N];
}a;

sw operator *(const sw a,const sw b){//矩阵乘 
	sw c;
	for(int i=1;i<=n;i++){//不使用memset来减小时间复杂度 
		for(int j=1;j<=n;j++){
			c.mx[i][j]=0;
		}
	}
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			for(int k=1;k<=n;k++){
				c.mx[i][j]=(c.mx[i][j]+(a.mx[i][k]*b.mx[k][j])%mod)%mod;
			}
		}
	}
	return c;
}

sw operator +(const sw a,const sw b){//矩阵加 
	sw c;
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			c.mx[i][j]=(a.mx[i][j]+b.mx[i][j])%mod;
		}
	}
	return c;
}

sw qpow(sw a,ll b){//矩阵快速幂 
	sw s;
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			if(i==j){
				s.mx[i][j]=1;
			}
			else{
				s.mx[i][j]=0;
			}
		}
	}
	while(b){
		if(b&1){
			s=s*a;
		}
		a=a*a;b>>=1;
	}
	return s;
}

sw solve(sw a,ll x){//分治部分，和文字部分基本一致 
	if(x==1){
		return a;
	}
	sw mid=solve(a,x/2);sw pw=qpow(a,x/2);//solve(a,x/2)、qpow(a,x/2)只算了一遍，大大减小时间复杂度 
	if(x&1){
		return mid*pw+mid+qpow(a,x);
	}
	else{
		return mid*pw+mid;
	}
}

int main(){
	scanf("%lld%lld%lld",&n,&k,&mod);
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			scanf("%lld",&a.mx[i][j]);
		}
	}
	sw ans=solve(a,k);
	for(int i=1;i<=n;i++){
		for(int j=1;j<=n;j++){
			printf("%lld ",ans.mx[i][j]);
		}
		printf("\n");
	}
	return 0;
}

----------------------------------------------------------并不华丽的分割线----------------------------------------------------------

接下来就是它的重要应用了——矩阵加速，就是我们刚才说过的。它的难点在于构造矩阵，很考验一个OIer的数学功底。

update on 2025.08.09:

重新学习了一遍矩阵快速幂，矩阵加速的适用式为齐次线性式，也就是形如

$f_i=a_1f_1+a_2f_2+ \cdots +a_{i-1}f_{i-1}$

其中$a_1,a_2,\cdots,a_{i-1}$均为常数，$f_i$只与前$i-1$项有关。

例3.P1962

算是最最简单的矩阵加速了（少有的绿题），重点说说构造矩阵。

答案求$F_n$，而$F_n=F_{n-1}+F_{n-2}$，所以只要有两项就能递推出下一项，所以基础矩阵只需要两个元素（一开始是$F_1、F_2$），递推矩阵是$2 \times 2$的。

那么当我们已知$F_{n-1}、F_{n-2}$时如何转移出来$F_n、F_{n-1}$呢？

$ \begin{bmatrix} F_{n-1} & F_{n-2} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} ? & ? \\ ? & ? \\ \end{bmatrix} $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} \\ \end{bmatrix} $

首先根据刚才的公式（$F_n=F_{n-1}+F_{n-2}$），第一行第一列和第二行第一列应该填1，否则无法转移出$F_n$。

$ \begin{bmatrix} F_{n-1} & F_{n-2} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & ? \\ 1 & ? \\ \end{bmatrix} $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} \\ \end{bmatrix} $

其次$F_{n-1}=1 \times F_{n-1} +0 \times F_{n-2}$，所以第一行第二列填1、第二行第二列应该填0，才能转移出$F_{n-1}$。

$ \begin{bmatrix} F_{n-1} & F_{n-2} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix} $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} \\ \end{bmatrix} $

这样转移矩阵就是$ \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix} $。

而同理，$ \begin{bmatrix} F_{n-2} & F_{n-3} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix} $ $=$ $ \begin{bmatrix} F_{n-1} & F_{n-2} \\ \end{bmatrix} $，所以$ \begin{bmatrix} F_{n-2} & F_{n-3} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix}^2 $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} \\ \end{bmatrix} $

以此类推，$ \begin{bmatrix} F_{2} & F_{1} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix}^{n-2} $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} \\ \end{bmatrix} $。
而$ \begin{bmatrix} F_{2} & F_{1} \\ \end{bmatrix} $ $=$ $ \begin{bmatrix} 1 & 1 \\ \end{bmatrix} $，所以只要将转移矩阵^(n-2)，再乘上初始的基础矩阵，所得矩阵的第一项就是答案。

代码：

点击查看代码

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;

const int mod=1e9+7;
ll n;
struct sw{
	ll mx[3][3];
}a,ans,qwq;
//重载乘号（矩阵乘） 
sw operator *(const sw &a,const sw &b){
	sw c;
	for(int i=1;i<=2;i++){
		for(int j=1;j<=2;j++){
			c.mx[i][j]=0;
		}
	}
	for(int i=1;i<=2;i++){
		for(int j=1;j<=2;j++){
			for(int k=1;k<=2;k++){
				c.mx[i][j]=(c.mx[i][j]+(a.mx[i][k]*b.mx[k][j])%mod)%mod;
			}
		}
	}
	return c;
}

int main(){
	scanf("%lld",&n);
	ll b=n-2;//别忘了这里开ll 
	//初始化 
	a.mx[1][1]=a.mx[1][2]=a.mx[2][1]=1;
	ans.mx[1][1]=ans.mx[2][2]=1;
	qwq.mx[1][1]=qwq.mx[1][2]=1;
	if(n<=2){ 
		printf("1");
		return 0;
	}
	//矩阵快速幂 
	while(b){
		if(b&1){
			ans=ans*a;
		}
		a=a*a;b>>=1;
	}
	qwq=qwq*ans;
	printf("%lld ",qwq.mx[1][1]%mod);
	return 0;
}

例4.P1349

和刚才的题大同小异，只给最终结论：$ \begin{bmatrix} F_{2} & F_{1} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} p & 1 \\ q & 0 \\ \end{bmatrix}^{n-2} $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} \\ \end{bmatrix} $

代码：

点击查看代码

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;

ll p,q,a1,a2,n,mod;
struct sw{
	ll mx[3][3];
}a,ans,qwq;
//重载乘号（矩阵乘） 
sw operator *(const sw &a,const sw &b){
	sw c;
	for(int i=1;i<=2;i++){
		for(int j=1;j<=2;j++){
			c.mx[i][j]=0;
		}
	}
	for(int i=1;i<=2;i++){
		for(int j=1;j<=2;j++){
			for(int k=1;k<=2;k++){
				c.mx[i][j]=(c.mx[i][j]+(a.mx[i][k]*b.mx[k][j])%mod)%mod;
			}
		}
	}
	return c;
}

int main(){
	scanf("%lld%lld%lld%lld%lld%lld",&p,&q,&a1,&a2,&n,&mod);
	ll b=n-2;//别忘了这里开ll 
	//初始化 
	a.mx[1][1]=p,a.mx[1][2]=1,a.mx[2][1]=q;
	ans.mx[1][1]=ans.mx[2][2]=1;
	qwq.mx[1][1]=a2,qwq.mx[1][2]=a1;
	if(n==1){ 
		printf("%lld",a1);
		return 0;
	}
	if(n==2){ 
		printf("%lld",a2);
		return 0;
	}
	//矩阵快速幂 
	while(b){
		if(b&1){
			ans=ans*a;
		}
		a=a*a;b>>=1;
	}
	qwq=qwq*ans;
	printf("%lld ",qwq.mx[1][1]%mod);
	return 0;
}

例5.P1939

由于第n项由第n-3和n-1项推得，所以矩阵只有两项的话会有些难办。

所以我们构造$\begin{bmatrix} F_{n} & F_{n-1} & F_{n-2} \\ \end{bmatrix}$的矩阵，由$\begin{bmatrix} F_{n-1} & F_{n-2} & F_{n-3} \\ \end{bmatrix}$转移来，那么易得转移矩阵为：
$ \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ \end{bmatrix} $
而最终结论为：
$ \begin{bmatrix} F_{3} & F_{2} & F_{1} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ \end{bmatrix}^{n-3} $ $=$ $ \begin{bmatrix} F_{n} & F_{n-1} & F_{n-2} \\ \end{bmatrix} $

代码：

点击查看代码

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;

const ll mod=1e9+7;
ll T,n;
struct sw{
	ll mx[5][5];
}s,a,qwq;
sw operator *(const sw a,const sw b){//矩阵乘 
	sw c;
	for(int i=1;i<=3;i++){
		for(int j=1;j<=3;j++){
			c.mx[i][j]=0;
		}
	}
	for(int i=1;i<=3;i++){
		for(int j=1;j<=3;j++){
			for(int k=1;k<=3;k++){
				c.mx[i][j]=(c.mx[i][j]+(a.mx[i][k]*b.mx[k][j])%mod)%mod;
			}
		}
	}
	return c;
}

int main(){
	scanf("%lld",&T);
	while(T--){
		scanf("%lld",&n);
		if(n<=3){
			printf("1\n");
			continue;
		}
		//多测记得清空 
		memset(s.mx,0,sizeof s.mx);
		memset(a.mx,0,sizeof a.mx);
		//初始化 
		qwq.mx[1][1]=1,qwq.mx[1][2]=1,qwq.mx[1][3]=1;
		a.mx[1][1]=1;a.mx[1][2]=1;a.mx[2][3]=1;a.mx[3][1]=1;
		s.mx[1][1]=s.mx[2][2]=s.mx[3][3]=1;
		ll b=n-3;
		while(b){//快速幂 
			if(b&1){
				s=s*a;
			}
			a=a*a;b>>=1;
		}
		qwq=qwq*s;
		printf("%lld\n",qwq.mx[1][1]%mod);	
	}
	return 0;
}

例6.P5175

难点也在于构造矩阵。这里要用到我们小学二年级就学过的和的平方公式：$(x+y)^2=x^2+2xy+y^2$。

这样$a_n^2=x^2a_{n-1}^2+y^2a_{n-2}^2+2xya_{n-1}a_{n-2}$。

由于题目求$\sum\limits_{i=1}^{n} {a_{i}^2}$，所以我们定义$s_i=\sum\limits_{i=1}^{n} {a_{i}^2}$。

则我们的初始矩阵设置成$\begin{bmatrix} s_{n} & a_{n}^2 & a_{n-1}^2 & a_na_{n-1} \\ \end{bmatrix}$，转移关系式也就呼之欲出：
$ \begin{bmatrix} s_{n-1} & a_{n-1}^2 & a_{n-2}^2 & a_{n-1}a_{n-2} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 0 & 0 & 0 \\ x^2 & x^2 & 1 & x \\ y^2 & y^2 & 0 & 0 \\ 2xy & 2xy & 0 & y \\ \end{bmatrix} $ $=$ $ \begin{bmatrix} s_{n} & a_{n}^2 & a_{n-1}^2 & a_na_{n-1} \\ \end{bmatrix} $

（这里由于$a_n=x \times a_{n-1} + y \times a_{n-2}$，所以$a_na_{n-1}=x \times a_{n-1}^2 + y \times a_{n-1}a_{n-2}$）

所以倒推到$a_1、a_2$就是这样：

$ \begin{bmatrix} s_{2} & a_{2}^2 & a_{1}^2 & a_{2}a_{1} \\ \end{bmatrix} $ $\times$ $ \begin{bmatrix} 1 & 0 & 0 & 0 \\ x^2 & x^2 & 1 & x \\ y^2 & y^2 & 0 & 0 \\ 2xy & 2xy & 0 & y \\ \end{bmatrix}^{n-2} $ $=$ $ \begin{bmatrix} s_{n} & a_{n}^2 & a_{n-1}^2 & a_na_{n-1} \\ \end{bmatrix} $

代码：

点击查看代码

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;

const ll mod=1e9+7;
int T;
ll n,a1,a2,x,y;
struct sw{
	ll mx[5][5];
}s,a,qwq;
sw operator *(const sw a,const sw b){
	sw c;
	for(int i=1;i<=4;i++){
		for(int j=1;j<=4;j++){
			c.mx[i][j]=0;
		}
	}
	for(int i=1;i<=4;i++){
		for(int j=1;j<=4;j++){
			for(int k=1;k<=4;k++){
				c.mx[i][j]=(c.mx[i][j]+(a.mx[i][k]*b.mx[k][j])%mod)%mod;
			}
		}
	} 
	return c;
}

int main(){
	scanf("%d",&T);
	while(T--){
		scanf("%lld%lld%lld%lld%lld",&n,&a1,&a2,&x,&y);
		if(n==1){
			printf("%lld\n",a1*a1%mod);
			continue;
		}
		if(n==2){
			printf("%lld\n",(a1*a1%mod+a2*a2%mod)%mod);
			continue;
		}
		memset(s.mx,0,sizeof s.mx);memset(a.mx,0,sizeof a.mx);
		s.mx[1][1]=s.mx[2][2]=s.mx[3][3]=s.mx[4][4]=1;
		qwq.mx[1][1]=(a1*a1%mod+a2*a2%mod)%mod;
		qwq.mx[1][2]=(a2*a2)%mod;qwq.mx[1][3]=(a1*a1)%mod;qwq.mx[1][4]=(a1*a2)%mod;
		a.mx[1][1]=a.mx[2][3]=1;
		a.mx[2][1]=a.mx[2][2]=(x*x)%mod;
		a.mx[3][1]=a.mx[3][2]=(y*y)%mod;
		a.mx[4][1]=a.mx[4][2]=((2*x%mod)*y)%mod;
		a.mx[2][4]=x%mod;a.mx[4][4]=y%mod;
		ll b=n-2;
		while(b){
			if(b&1){
				s=s*a;
			}
			a=a*a;b>>=1;
		}
		qwq=qwq*s;
		printf("%lld\n",qwq.mx[1][1]%mod);
	}
	return 0;
}

例7.P9510

几乎是纯数学题……

拿出草稿本和铅笔，这个题得好好推导一下。（不懂的可以找xm、szj或者其他数学dalao。）

由于过程有亿些复杂，所以不写Markdown，用图片表示推理过程。

先放结论：原式=$fib(n)^2fib(n+1)+fib(n)fib(n+1)$

证明过程见下图。

而fib(n)、fib(n+1)只需要一次矩阵快速幂（幂次为n-1）就能一次性求出来。

然后根据推导式计算即可。

注意这个题很卡常，记得常数优化一下。

代码：

点击查看代码

#include<cstdio>
#define gc() (p1==p2&&(p2=(p1=buf)+fread(buf,1,1000096,stdin),p1==p2)?EOF:*p1++)
using namespace std;

long long T,n,mod;
char *p1,*p2,buf[1000100];
struct sw{
	long long mx[3][3];
}a,ans,qwq;
inline sw operator *(const sw &a,const sw &b){
	sw c;
	for(int i=1;i<=2;i++){
		for(int j=1;j<=2;j++){
			c.mx[i][j]=0;
		}
	}
	c.mx[1][1]=(a.mx[1][1]*b.mx[1][1]+a.mx[1][2]*b.mx[2][1])%mod;
	c.mx[2][1]=(a.mx[2][1]*b.mx[1][1]+a.mx[2][2]*b.mx[2][1])%mod;
	c.mx[1][2]=(a.mx[1][1]*b.mx[1][2]+a.mx[1][2]*b.mx[2][2])%mod;
	c.mx[2][2]=(a.mx[2][1]*b.mx[1][2]+a.mx[2][2]*b.mx[2][2])%mod;
	return c;
}

inline long long read(){
    long long x=0;char c=gc();
    while(c<48)c=gc();
    while(c>47)x=(x<<3)+(x<<1)+(c&15),c=gc();
    return x;
}
inline void write(long long x){
	if(x<0) x=-x,putchar('-');
	if(x>9) write(x/10);
	putchar(x%10+'0');
}

int main(){
	T=read();
	while(T--){
		n=read();mod=read();
		if(n==1){
			write(2%mod);//mod是2~1e9+7的，所以千万记得取模！！！否则喜提80pts 
			printf("\n");
			continue;
		}
		if(n==2){ 
			write(4%mod);
			printf("\n");
			continue;
		}
		long long b=n-1;
		a.mx[1][1]=a.mx[1][2]=a.mx[2][1]=1;a.mx[2][2]=0;
		ans.mx[1][1]=ans.mx[2][2]=1;ans.mx[1][2]=ans.mx[2][1]=0;
		qwq.mx[1][1]=qwq.mx[1][2]=1;
		while(b){
			if(b&1){
				ans=ans*a;
			}
			a=a*a;b>>=1;
		}
		qwq=qwq*ans;
		write((qwq.mx[1][1]*qwq.mx[1][2])%mod*(qwq.mx[1][2]+1)%mod);
		printf("\n");
	}
	return 0;
}

课外拓展：P1357

是个状压DP+矩阵加速优化，感兴趣的可以看看。

参考资料：
1.各题题解（不一一展开）
2.快速幂课件（不展开，应该都有）
3.b站视频（声音很小，记得先调大音量）

posted @ 2025-07-15 14:47 qwqSW 阅读(37) 评论(0) 收藏举报

刷新页面返回顶部

qwqSW

好好学习，天天向上

矩阵快速幂

一.矩阵

二、矩阵加法

例.B2104

三、矩阵乘法

例：B2105

四、龟速乘

例.P10446

五.快速幂

例.P1226

六.矩阵快速幂

例1.P3390

例2.P10502

例3.P1962

例4.P1349

例5.P1939

例6.P5175

例7.P9510

公告