HDU-4920-Matrix multiplication

临江仙卜算子

Category: Algorithm Analysis and Design. Tags: c++

Article link: https://blog.csdn.net/Y_liuzhenyuan/article/details/39007221


Matrix multiplication

Time Limit: 4000/2000 MS (Java/Others) Memory Limit: 131072/131072 K (Java/Others)
Total Submission(s): 2812 Accepted Submission(s): 1228

Problem Description

Given two matrices A and B of size n×n, find the product of them.

bobo hates big integers. So you are only asked to find the result modulo 3.

Input

The input consists of several tests. For each test:

The first line contains n (1≤n≤800). Each of the following n lines contains n integers -- the description of the matrix A. The j-th integer in the i-th line equals A_ij. The next n lines describe the matrix B in similar format (0≤A_ij,B_ij≤10⁹).

Output

For each test:

Print n lines. Each of them contains n integers -- the matrix A×B in similar format.

Sample Input

Sample Output

Author

Xiaoxu Guo (ftiasch)

Source

2014 Multi-University Training Contest 5

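The matrices are 800*800 and there are multiple test cases, so a naive multiplication will time out badly. The only way out is to exploit the special structure of working modulo 3.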

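After reading the input, reduce every element modulo 3, so the matrices contain only 0, 1 and 2. For random data, about one third of the entries are zero. That is why the multiplication should keep the column loop innermost (k-i-j order, or the equivalent i-k-j order used in the code below): in the second loop level, as soon as A[i][k] == 0 is detected, continue skips the entire innermost loop, which happens with probability about 1/3. This is the key step of the problem. (Without it, other optimizations seem to make little difference.) That said, even without the zero check, merely switching the loops to k-i-j order is enough to get accepted.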

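The reason is that in k-i-j order the factor a[i][k] in the innermost statement ans[i][j] += a[i][k] * b[k][j] does not change while j runs, so it can stay in a register or in the fastest cache. That is much faster than i-j-k order, which has to fetch an operand from main memory or a slower cache level on every iteration.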

In addition, a table cal[i][j][k] can hold ((i*j)+k)%3 precomputed for all i, j, k in {0, 1, 2}, so the multiplication can look the result up directly.
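The lookup-table idea could be sketched roughly as follows. This is only an illustration, not the submitted solution: the names build_cal, multiply_mod3 and Matrix are made up here, indices are 0-based, and the inputs are assumed to be already reduced modulo 3.

```cpp
#include <cassert>
#include <vector>

// cal[x][y][z] = (x * y + z) % 3 for x, y, z in {0, 1, 2}.
static int cal[3][3][3];

// Fill the 27-entry table once, before any multiplication.
void build_cal() {
    for (int x = 0; x < 3; x++)
        for (int y = 0; y < 3; y++)
            for (int z = 0; z < 3; z++)
                cal[x][y][z] = (x * y + z) % 3;
}

using Matrix = std::vector<std::vector<int>>;

// Multiply two n x n matrices whose entries are already in {0, 1, 2}.
// Every entry of the result stays in {0, 1, 2} at all times,
// so no final "% 3" pass is needed.
Matrix multiply_mod3(const Matrix& a, const Matrix& b) {
    int n = (int)a.size();
    assert((int)b.size() == n);  // both matrices must have the same size
    Matrix c(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            int x = a[i][k];
            if (x == 0) continue;  // a zero factor contributes nothing
            for (int j = 0; j < n; j++)
                c[i][j] = cal[x][b[k][j]][c[i][j]];
        }
    return c;
}
```

Call build_cal() once at program start and then multiply_mod3(a, b) for each test case. Whether the table actually beats plain multiply-and-add depends on the compiler and the machine, so it is worth measuring before relying on it.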

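Recall that a two-dimensional array is stored in memory contiguously, row by row, so walking down a column jumps a full row on every access (1000*4 bytes for an int array with 1000 columns). Given how the CPU cache loads and replaces its lines, this leads to a large number of cache misses.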
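The size of that jump can be checked directly. The sketch below uses a hypothetical array grid with 1000 columns (the names grid, ROWS, COLS and the three functions are made up for illustration) and measures the byte distance between neighbouring elements; on a platform where int is 4 bytes, the column step comes out as 4000 bytes.

```cpp
#include <cassert>
#include <cstddef>

const int ROWS = 4;     // hypothetical sizes, chosen only for illustration
const int COLS = 1000;
static int grid[ROWS][COLS];

// Byte distance between two horizontally adjacent elements.
std::ptrdiff_t step_along_row() {
    return reinterpret_cast<char*>(&grid[0][1]) -
           reinterpret_cast<char*>(&grid[0][0]);
}

// Byte distance between two vertically adjacent elements.
std::ptrdiff_t step_along_column() {
    return reinterpret_cast<char*>(&grid[1][0]) -
           reinterpret_cast<char*>(&grid[0][0]);
}

// Row-major layout: one step right moves sizeof(int) bytes,
// one step down moves a whole row of COLS ints.
void check_layout() {
    assert(step_along_row() == (std::ptrdiff_t)sizeof(int));
    assert(step_along_column() == (std::ptrdiff_t)(COLS * sizeof(int)));
}
```

Each column step therefore lands far outside the cache line just loaded, while a row step usually stays inside it.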

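By contrast, when j is the innermost loop, the statement

c[i][j] += a[i][k] * b[k][j];

accesses memory contiguously, which raises the cache hit rate and greatly speeds up the program.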
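To make the two loop orders concrete, here is a small side-by-side sketch. The function names multiply_ijk and multiply_ikj and the Matrix alias are hypothetical, indices are 0-based, and no modulo is applied. Both functions compute the same product; only the order in which memory is touched differs.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// i-j-k order: the innermost loop varies k, so b[k][j] is read from a
// different row of b on every iteration.
Matrix multiply_ijk(const Matrix& a, const Matrix& b) {
    int n = (int)a.size();
    assert((int)b.size() == n);
    Matrix c(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// i-k-j order: the innermost loop varies j, so c[i][j] and b[k][j] are
// both read left to right within one row, and a[i][k] stays fixed.
Matrix multiply_ikj(const Matrix& a, const Matrix& b) {
    int n = (int)a.size();
    assert((int)b.size() == n);
    Matrix c(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            int x = a[i][k];
            if (x == 0) continue;  // skip the whole inner loop for a zero factor
            for (int j = 0; j < n; j++)
                c[i][j] += x * b[k][j];
        }
    return c;
}
```

Timing both functions on an 800 x 800 input is a simple way to see the effect on a given machine; the exact speedup will vary.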

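This shows that making good use of the CPU cache is a skill well worth mastering.

In everyday programming, too, we should try to keep the CPU's memory accesses as contiguous as possible.

(To be honest, I don't fully understand this part myself ~(^-^)~ and will study it later.)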

------------------------------Other------------------------------------

By the way, a joke: why didn't Dijkstra invent Floyd's algorithm? Because he is ijk, not kij...

Submitted code:

#include <cstdio>  // scanf, printf

int a[808][808];    // n is at most 800; indices 1..n are used
int b[808][808];
int ans[808][808];

int main()
{
    int n;
    while (scanf("%d", &n) == 1)  // keep reading test cases until input ends
    {
        for (int i = 1; i <= n; i++)
            for (int t = 1; t <= n; t++)
            {
                scanf("%d", &a[i][t]);
                a[i][t] %= 3;  // reduce modulo 3 while reading
            }
        for (int i = 1; i <= n; i++)
            for (int t = 1; t <= n; t++)
            {
                scanf("%d", &b[i][t]);
                b[i][t] %= 3;  // likewise, reduce modulo 3 while reading
            }
        for (int i = 1; i <= n; i++)
            for (int t = 1; t <= n; t++)
                ans[i][t] = 0;  // clear the result matrix
        for (int i = 1; i <= n; i++)
            for (int k = 1; k <= n; k++)
            {
                if (a[i][k] == 0)  // a zero factor adds nothing: skip the inner loop
                    continue;
                for (int t = 1; t <= n; t++)
                    ans[i][t] += a[i][k] * b[k][t];  // core step; each sum is at most 800 * 4
            }

        for (int i = 1; i <= n; i++)
        {
            printf("%d", ans[i][1] % 3);  // reduce modulo 3 once more for the final answer
            for (int t = 2; t <= n; t++)
                printf(" %d", ans[i][t] % 3);
            printf("\n");
        }
    }
    return 0;
}