Python FLOP counting: understanding how to count FLOPs

I am having a hard time grasping how to count FLOPs. One moment I think I get it, and the next it makes no sense to me. Some help explaining this would be greatly appreciated. I have looked at all the other posts about this topic, and none has explained it completely in a programming language I am familiar with (I know some MATLAB and FORTRAN).

Here is an example, from one of my books, of what I am trying to do.

For the following piece of code, the total number of FLOPs can be written as n*(n-1)/2 + n*(n+1)/2, which simplifies to exactly n^2, i.e. n^2 + O(n).

```matlab
[m,n] = size(A);
nb = n+1;
Aug = [A b];
x = zeros(n,1);
x(n) = Aug(n,nb)/Aug(n,n);
for i = n-1:-1:1
    x(i) = (Aug(i,nb) - Aug(i,i+1:n)*x(i+1:n))/Aug(i,i);
end
```
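A quick way to see where the book's two terms come from is to tally the loop row by row: iteration i performs a dot product of length n-i (that is, n-i multiplications and n-i-1 additions), plus one subtraction and one division, and the line before the loop contributes a single division. The sketch below (plain Python rather than MATLAB, with a hypothetical helper name) checks that this tally reproduces n(n-1)/2 multiplications, n(n+1)/2 additions/subtractions/divisions, and n^2 in total.

```python
def count_back_substitution_flops(n):
    """Tally FLOPs in the back substitution above (a sketch; one FLOP
    per scalar multiply, divide, add, or subtract)."""
    mults = 0    # multiplications from the dot product Aug(i,i+1:n)*x(i+1:n)
    others = 1   # adds/subs/divs; starts at 1 for the division x(n) = Aug(n,nb)/Aug(n,n)
    for i in range(n - 1, 0, -1):  # MATLAB's i = n-1:-1:1
        length = n - i             # dot product over columns i+1..n
        mults += length            # one multiplication per column
        others += length - 1       # additions inside the dot product
        others += 2                # the subtraction and the final division
    return mults, others

for n in (3, 10, 100):
    mults, others = count_back_substitution_flops(n)
    assert mults == n * (n - 1) // 2    # the book's first term
    assert others == n * (n + 1) // 2   # the book's second term
    assert mults + others == n * n      # total: exactly n^2
```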

I am trying to apply the same principle as above to find the total number of FLOPs, as a function of the number of equations n, in the following MATLAB code.

```matlab
% e = subdiagonal vector
% f = diagonal vector
% g = superdiagonal vector
% r = right hand side vector
% x = solution vector
n = length(f);

% forward elimination
for k = 2:n
    factor = e(k)/f(k-1);
    f(k) = f(k) - factor*g(k-1);
    r(k) = r(k) - factor*r(k-1);
end

% back substitution
x(n) = r(n)/f(n);
for k = n-1:-1:1
    x(k) = (r(k) - g(k)*x(k+1))/f(k);
end
```

Solution

I'm by no means an expert at MATLAB, but I'll have a go.

I notice that none of the lines of your code index ranges of your vectors. Good, that means that every operation I see before me involves a single pair of numbers. So I think the first loop is 5 FLOPs per iteration, and the second is 3 per iteration. And then there's that single operation in the middle.
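On that reading the totals follow directly: the first loop runs n-1 times at 5 FLOPs each, the single division in the middle adds 1, and the second loop runs n-1 times at 3 FLOPs each, giving 8n - 7 overall, which is linear in n rather than the n^2 of the full back substitution in the book's example. A minimal sketch of that count (hypothetical function name):

```python
def tridiagonal_solver_flops(n):
    """FLOP count for the tridiagonal code in the question (a sketch;
    counts only the arithmetic on the vectors themselves)."""
    forward = 5 * (n - 1)    # per k: 1 division, 2 multiplications, 2 subtractions
    middle = 1               # x(n) = r(n)/f(n)
    backward = 3 * (n - 1)   # per k: 1 multiplication, 1 subtraction, 1 division
    return forward + middle + backward

# 5(n-1) + 1 + 3(n-1) = 8n - 7
for n in (2, 10, 1000):
    assert tridiagonal_solver_flops(n) == 8 * n - 7
```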

However, MATLAB stores everything as a double by default, so the loop variable k is itself being operated on once per loop, and again every time an index is calculated from it. That's an extra 4 FLOPs per iteration for the first loop and 2 for the second.

But wait - the first loop computes 'k-1' three times, so in theory one could optimise that a bit by calculating and storing it once, saving two FLOPs per iteration. The MATLAB interpreter is probably able to spot that sort of optimisation for itself. And for all I know it can work out that k could in fact be an integer and everything would still be okay.

So the answer to your question is that it depends. Do you want to know the number of FLOPs the CPU does, or the minimum number expressed in your code (ie the number of operations on your vectors alone), or the strict number of FLOPs that MATLAB would perform if it did no optimisation at all? MATLAB used to have a flops() function to count this sort of thing, but it's not there anymore. I'm not an expert in MATLAB by any means, but I suspect that flops() has gone because the interpreter has gotten too clever and does a lot of optimisation.
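One way to recover what flops() used to give you is to instrument the arithmetic yourself. The sketch below (plain Python rather than MATLAB, with a hypothetical CountedFloat wrapper) overloads the scalar operators so that every add, subtract, multiply, and divide bumps a counter, then runs the questioner's tridiagonal solve on a 3-equation system:

```python
class CountedFloat(float):
    """Float wrapper that counts arithmetic operations (a sketch; only
    the operators used by the solver below are overloaded)."""
    flops = 0  # shared counter across all instances

    def _op(self, other, fn):
        CountedFloat.flops += 1
        return CountedFloat(fn(float(self), float(other)))

    def __add__(self, other): return self._op(other, lambda a, b: a + b)
    def __sub__(self, other): return self._op(other, lambda a, b: a - b)
    def __mul__(self, other): return self._op(other, lambda a, b: a * b)
    def __truediv__(self, other): return self._op(other, lambda a, b: a / b)

# The question's solver (the Thomas algorithm), with MATLAB's 1-based
# indices shifted to Python's 0-based ones.
e = [CountedFloat(v) for v in (0.0, 1.0, 1.0)]  # subdiagonal (e[0] unused)
f = [CountedFloat(v) for v in (2.0, 2.0, 2.0)]  # diagonal
g = [CountedFloat(v) for v in (1.0, 1.0, 0.0)]  # superdiagonal (g[-1] unused)
r = [CountedFloat(v) for v in (3.0, 4.0, 3.0)]  # right hand side
n = len(f)
for k in range(1, n):             # forward elimination
    factor = e[k] / f[k - 1]
    f[k] = f[k] - factor * g[k - 1]
    r[k] = r[k] - factor * r[k - 1]
x = [CountedFloat(0.0)] * n
x[n - 1] = r[n - 1] / f[n - 1]    # back substitution
for k in range(n - 2, -1, -1):
    x[k] = (r[k] - g[k] * x[k + 1]) / f[k]

print(CountedFloat.flops)         # 8*3 - 7 = 17 for n = 3
```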

I'm slightly curious to know why you wish to know. I used to use flops() to count how many operations a piece of maths did, as a crude way of estimating how much computing grunt I'd need to make it work in real time when written in C.

Nowadays I look at the primitives themselves (eg there's a 1k complex FFT, that'll be 7us on that CPU according to the library datasheet, there's a 2k vector multiply, that'll be 2.5us, etc). It gets a bit tricky because one has to consider cache speeds, data set sizes, etc. The maths libraries (eg fftw) themselves are effectively opaque so that's all one can do.

So if you're counting the FLOPs for that reason you'll probably not get a very good answer.

In PyTorch (version 2.0 and later), the floating-point operations of a model's forward pass can be counted with `torch.utils.flop_counter.FlopCounterMode`. Here is an example program that uses it:

```python
import torch
import torch.nn as nn
from torch.utils.flop_counter import FlopCounterMode

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU(inplace=True)
        self.conv3 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.relu3 = nn.ReLU(inplace=True)
        self.conv4 = nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1)
        self.relu4 = nn.ReLU(inplace=True)
        self.conv5 = nn.Conv2d(512, 1024, kernel_size=3, stride=1, padding=1)
        self.relu5 = nn.ReLU(inplace=True)
        self.pool = nn.AdaptiveAvgPool2d(1)  # pool 32x32 maps to 1x1 so the flattened size matches fc
        self.fc = nn.Linear(1024, 10)

    def forward(self, x):
        x = self.relu1(self.conv1(x))
        x = self.relu2(self.conv2(x))
        x = self.relu3(self.conv3(x))
        x = self.relu4(self.conv4(x))
        x = self.relu5(self.conv5(x))
        x = self.pool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = MyModel()
input = torch.randn(1, 3, 32, 32)

flop_counter = FlopCounterMode(display=False)  # context manager that tallies FLOPs
with flop_counter:
    model(input)
print("FLOPs:", flop_counter.get_total_flops())
```

This example defines a model named `MyModel` and counts the FLOPs of a single forward pass on an input of size `(1, 3, 32, 32)`, then prints the result. Note that the counter only tallies floating-point operations; it does not report parameter counts, memory usage, or other metrics.
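If FlopCounterMode is unavailable (it requires a recent PyTorch), third-party counters do the same job. A sketch assuming the third-party `thop` package is installed, reusing the `model` and `input` defined above:

```python
# Assumes the third-party 'thop' package (pip install thop).
from thop import profile

macs, params = profile(model, inputs=(input,))
print("MACs:", macs, "parameters:", params)  # thop counts multiply-accumulates, roughly FLOPs/2
```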
