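Constant-Factor Optimization of Naive High-Precision Multiplication

The warm-up round of the 2015 Liaoning Provincial Contest included a high-precision multiplication problem:

1574: A * B

Time limit: 10 Sec  Memory limit: 128 MB

Problem Description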

Calculate $a \times b$.

Input

Your program will be tested on one or more test cases. Each test case contains two integers $a$ and $b$ ($0\le a,b\le 10^{100000}$).

Output

For each test case, print the following line:

answer

where answer is $a\times b$.

Sample Input

1000000000000000 2000000000000000

Sample Output

2000000000000000000000000000000

Hint

Source

2015 Provincial Contest Warm-up Round

A typical implementation of naive high-precision multiplication is:

    #include<cstdio>
    #include<cstring>
    using namespace std;
    const int MAX_N=1e5+10;
    char sa[MAX_N], sb[MAX_N];
    int res[MAX_N<<1];    // res[i+j] collects digit products; at most 81*1e5, so int suffices
    int main(){
        //freopen("in", "r", stdin);
        while(scanf("%s%s", sa, sb)==2){
            int i, j;
            memset(res, 0, sizeof(res));
            // Schoolbook multiplication; index 0 is the most significant digit.
            for(i=0; sa[i]; i++){
                 for(j=0; sb[j]; j++){
                     res[i+j]+=(sa[i]-'0')*(sb[j]-'0');
                 }
            }
            int tot=i+j-2;    // index of the last digit: i and j now hold the two lengths
            // Propagate carries from the least significant digit upward.
            for(i=tot; i; i--){
                res[i-1]+=res[i]/10;
                res[i]%=10;
            }
            // res[0] is not reduced and may hold two digits.
            // Skip leading zeros, but always keep the last digit so that 0 prints as "0".
            for(i=0; i<tot&&!res[i]; i++);
            for(; i<=tot; i++) printf("%d", res[i]);
            puts("");
        }
        return 0;
    }

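However, this version exceeds the time limit (TLE): with operands of up to about $10^5$ digits, the double loop performs on the order of $10^{10}$ digit multiplications per test case. The following describes a constant-factor optimization: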

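Split the big integer into groups of $6$ digits, going from the least significant digit to the most significant (if the last group has fewer than $6$ digits, it is implicitly padded with zeros on the high side). This naturally yields the base-$10^6$ representation of the integer, in which the decimal value of each group is treated as a single base-$10^6$ "digit". Multiplying two numbers below one million needs no high-precision arithmetic, so it can be regarded as $O(1)$; in fact, for small integers that do not overflow, the machine's multiply instruction is much faster than simulating long multiplication. Compared with naive multiplication on the decimal representation, the number of "digit" products drops by a factor of about $6^2=36$, which is a constant-factor optimization. Optimizations of this kind are often effective.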
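To make the grouping concrete, here is a minimal sketch of the splitting step. The helper name split_groups and the use of std::vector are assumptions made for illustration only; they do not appear in the original solution, which works on global arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original solution): split a decimal
// string into groups of `width` digits, most significant group first.
// Only the first group may be shorter than `width`, which is equivalent
// to padding the number with zeros on the high side.
std::vector<long long> split_groups(const std::string& s, int width) {
    assert(width >= 1 && width <= 9 && !s.empty());
    const std::size_t w = static_cast<std::size_t>(width);
    std::size_t first = s.size() % w;      // length of the short leading group
    if (first == 0) first = w;             // no short group: the first one is full
    std::vector<long long> groups;
    std::size_t pos = 0;
    while (pos < s.size()) {
        const std::size_t len = (pos == 0) ? first : w;
        long long value = 0;
        for (std::size_t k = 0; k < len; ++k)
            value = value * 10 + (s[pos + k] - '0');
        groups.push_back(value);
        pos += len;
    }
    return groups;
}
```

For example, with width 6 the string "1234567890123" is split as 1 | 234567 | 890123, so the 13 decimal digits become 3 base-$10^6$ "digits".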

    #include<cstdio>
    #include<cstring>
    #define set0(a) memset(a, 0, sizeof(a))
    using namespace std;
    const int MAX_N=1e5+10;
    typedef long long ll;
    char sa[MAX_N], sb[MAX_N];
    ll a[MAX_N], b[MAX_N], res[MAX_N<<1];
    const int base=6;    // decimal digits per group
    const ll mod=1e6;    // 10^base
    // Convert the decimal string s into groups of `base` digits, most significant
    // group first. The caller must zero a beforehand. Returns the number of groups.
    int trans(char *s, ll *a){
        // Do not memset(a, 0, sizeof(a)) here: a is a pointer passed by value,
        // so sizeof(a) is the size of the pointer, not of the array.
        int ls=strlen(s), len=(ls+base-1)/base;
        int now=0, rem=ls%base;    // rem: length of the short leading group, if any
        int l, r;
        if(rem){
            l=0; r=rem;
            for(int i=r-1, p=1; i>=l; i--, p*=10){
                a[0]+=(s[i]-'0')*p;
            }
            now++;
        }
        for(int i=0; now<len; now++, i++){
            l=rem+base*i, r=l+base;    // digits [l, r) form the next full group
            for(int j=r-1, p=1; j>=l; j--, p*=10){
                a[now]+=(s[j]-'0')*p;
            }
        }
        return len;
    }

    int main(){
        //freopen("in", "r", stdin);
        while(scanf("%s%s", sa, sb)==2){
            set0(a); set0(b); set0(res);
            int la=trans(sa, a), lb=trans(sb, b);
            // Schoolbook multiplication on base-10^base "digits".
            for(int i=0; i<la; i++){
                for(int j=0; j<lb; j++){
                    res[i+j]+=a[i]*b[j];
                }
            }
            int tot=la+lb-2;    // index of the least significant group
            for(int i=tot; i; i--){
                res[i-1]+=res[i]/mod;
                res[i]%=mod;
            }
            int i;
            for(i=0; i<=tot&&!res[i]; i++);    // skip leading zero groups
            if(i>tot) putchar('0');
            else{
                printf("%lld", res[i++]);    // leading group: no zero padding
                for(; i<=tot; i++) printf("%06lld", res[i]);    // other groups: pad to 6 digits
            }
            puts("");
        }
        return 0;
    }

Time: 4158MS
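The output step is the part of this approach that is easiest to get wrong: the leading group is printed as is, while every later group must be zero-padded to the full group width. The following sketch isolates that step. The helper to_decimal is a hypothetical name introduced here for illustration; it assumes the groups are stored most significant first, as in the listing above, and that every group after the first is already reduced below $10^{\text{width}}$.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not from the original solution): turn groups of
// `width` decimal digits, most significant group first, back into a
// decimal string.
std::string to_decimal(const std::vector<long long>& groups, int width) {
    assert(width >= 1 && width <= 18);
    std::size_t i = 0;
    while (i < groups.size() && groups[i] == 0) ++i;   // skip leading zero groups
    if (i == groups.size()) return "0";                // the whole number is zero
    std::string out = std::to_string(groups[i++]);     // leading group: no padding
    char buf[32];
    for (; i < groups.size(); ++i) {
        // "%0*lld" pads with zeros to `width` characters, like "%06lld" for width 6.
        std::snprintf(buf, sizeof buf, "%0*lld", width, groups[i]);
        out += buf;
    }
    return out;
}
```

With width 6, the groups {1, 5, 0} become "1000005000000"; without the padding, the same groups would print as "150", which is the kind of silent error the padded format avoids.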

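Groups of $6$ digits are chosen to guarantee that the computation never overflows long long. For the data range of this problem, groups of $7$ digits also work and give an even better constant. Only three lines of the code above need to change (base, mod, and the %07lld output format):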

    #include<cstdio>
    #include<cstring>
    #define set0(a) memset(a, 0, sizeof(a))
    using namespace std;
    const int MAX_N=1e5+10;
    typedef long long ll;
    char sa[MAX_N], sb[MAX_N];
    ll a[MAX_N], b[MAX_N], res[MAX_N<<1];
    const int base=7;    // changed: decimal digits per group
    const ll mod=1e7;    // changed: 10^base
    // Convert the decimal string s into groups of `base` digits, most significant
    // group first. The caller must zero a beforehand. Returns the number of groups.
    int trans(char *s, ll *a){
        // Do not memset(a, 0, sizeof(a)) here: a is a pointer passed by value,
        // so sizeof(a) is the size of the pointer, not of the array.
        int ls=strlen(s), len=(ls+base-1)/base;
        int now=0, rem=ls%base;    // rem: length of the short leading group, if any
        int l, r;
        if(rem){
            l=0; r=rem;
            for(int i=r-1, p=1; i>=l; i--, p*=10){
                a[0]+=(s[i]-'0')*p;
            }
            now++;
        }
        for(int i=0; now<len; now++, i++){
            l=rem+base*i, r=l+base;    // digits [l, r) form the next full group
            for(int j=r-1, p=1; j>=l; j--, p*=10){
                a[now]+=(s[j]-'0')*p;
            }
        }
        return len;
    }

    int main(){
        //freopen("in", "r", stdin);
        while(scanf("%s%s", sa, sb)==2){
            set0(a); set0(b); set0(res);
            int la=trans(sa, a), lb=trans(sb, b);
            // Schoolbook multiplication on base-10^base "digits".
            for(int i=0; i<la; i++){
                for(int j=0; j<lb; j++){
                    res[i+j]+=a[i]*b[j];
                }
            }
            int tot=la+lb-2;    // index of the least significant group
            for(int i=tot; i; i--){
                res[i-1]+=res[i]/mod;
                res[i]%=mod;
            }
            int i;
            for(i=0; i<=tot&&!res[i]; i++);    // skip leading zero groups
            if(i>tot) putchar('0');
            else{
                printf("%lld", res[i++]);    // leading group: no zero padding
                for(; i<=tot; i++) printf("%07lld", res[i]);    // changed: pad to 7 digits
            }
            puts("");
        }
        return 0;
    }

Time: 2937MS (among the best of all accepted submissions)
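Why $7$ digits per group is safe, and why $8$ would not be, can be checked with a short calculation. Each cell of res accumulates at most as many products as there are groups in one operand, and each product is at most $(10^{w}-1)^2$ for group width $w$. The sketch below performs that check; the helper name group_width_is_safe is an assumption made for illustration and is not part of the original solution. It bounds only the accumulated products; the carries added afterwards are far smaller and fit in the remaining headroom for the widths discussed here.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not from the original solution): returns true if
// summing up to `groups` products of two `width`-digit groups cannot
// overflow a signed 64-bit long long, where `groups` is the number of
// groups in an operand of at most `max_digits` decimal digits.
bool group_width_is_safe(int width, long long max_digits) {
    assert(width >= 1 && width <= 9 && max_digits >= 1);
    long long max_group = 1;
    for (int k = 0; k < width; ++k) max_group *= 10;
    max_group -= 1;                                        // 10^width - 1
    const long long max_product = max_group * max_group;   // below 10^18 for width <= 9
    const long long groups = (max_digits + width - 1) / width;
    // groups * max_product <= LLONG_MAX, tested without overflowing.
    return groups <= LLONG_MAX / max_product;
}
```

Since $10^{100000}$ has $100001$ digits, the relevant call is group_width_is_safe(w, 100001). For $w=7$ there are $14286$ groups and the bound is roughly $1.4\times10^{18}$, below the long long maximum of about $9.2\times10^{18}$; for $w=8$ the bound is roughly $1.25\times10^{20}$, which overflows.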

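Of course, high-precision multiplication has algorithms with better asymptotic complexity, such as the fast Fourier transform (FFT). But as this problem shows, the grouping optimization, which takes very little code, is still quite effective when $N$ is not too large and the time limit is generous.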

posted @ 2015-07-06 20:49 Pat
