- Device : Tesla C2050
- OS : Windows 7 Enterprise
- IDE : VS 2012
Hello everyone. I'm using C++ AMP to do some volume calculations.
I have millions of tetrahedra, each with one vertex at (0,0,0), so I can get the volume of each tetrahedron in a simple way:
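For a tetrahedron with one vertex at the origin, the signed volume is one sixth of the scalar triple product of the other three vertices. A minimal sketch in plain C++ follows; the `Vec3` type and the function names are hypothetical and are not taken from the original post.

```cpp
#include <cmath>

// Hypothetical point type; the data layout used in the original post is not shown.
struct Vec3 {
    double x, y, z;
};

// Signed volume of the tetrahedron (origin, a, b, c):
// one sixth of the scalar triple product a . (b x c).
double signed_tetra_volume(const Vec3& a, const Vec3& b, const Vec3& c) {
    const double cx = b.y * c.z - b.z * c.y;  // (b x c).x
    const double cy = b.z * c.x - b.x * c.z;  // (b x c).y
    const double cz = b.x * c.y - b.y * c.x;  // (b x c).z
    return (a.x * cx + a.y * cy + a.z * cz) / 6.0;
}

// Unsigned volume, for callers that do not care about orientation.
double tetra_volume(const Vec3& a, const Vec3& b, const Vec3& c) {
    return std::fabs(signed_tetra_volume(a, b, c));
}
```

Swapping any two of the three vertices flips the sign of `signed_tetra_volume`, which is why the unsigned variant is kept separate.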
I want to speed up this calculation by using C++ AMP.
Here is the code.
And the main function is:
Everything works. The interesting thing is that it takes longer than the single-core CPU code.
The single-core CPU C++ code takes 0.085 seconds to process 1024 * 1024 * 2 triangles, but the C++ AMP code takes 0.530 seconds, much longer than the CPU code.
After searching online, I found a tip: if the device is warmed up first, the measurement shows the 'real' cost of the calculation.
So I first calculate 128 triangles to warm up the device (which costs about 0.2 seconds), then get the volume by calculating 1024 * 1024 * 2 triangles. It became much faster (about 0.091 seconds), but it is still slower than the single-core CPU code.
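The warm-up measurement described above can be sketched generically. This is a minimal illustration in standard C++, not the timing helper from any C++ AMP library; `time_after_warmup` is a hypothetical name, and it assumes the callable blocks until its results are available (otherwise only the launch would be timed).

```cpp
#include <chrono>
#include <functional>

// Runs `work` `warmups` times untimed, then times one further run.
// Returns the wall-clock duration of the timed run in seconds.
double time_after_warmup(const std::function<void()>& work, int warmups = 1) {
    for (int i = 0; i < warmups; ++i) {
        work();  // untimed: absorbs one-off costs such as lazy initialization
    }
    const auto start = std::chrono::steady_clock::now();
    work();  // timed run
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Passing the whole computation as a lambda, for example `time_after_warmup([&] { compute(); }, 1)` with a hypothetical `compute`, keeps the warm-up and the timed run on exactly the same code path.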
I'd like to know why, and whether anybody can help me speed up the calculation.
Thanks a lot.
Zavier Xu
2 Answers
Firstly, below is what I think is a slightly better implementation, with some comments. Your code is doing some things that can be avoided.
However, what you are really doing here is a reduction. This is an algorithm that has been very heavily researched and optimized. There is a C++ AMP implementation on the AMP Algorithms Codeplex site. It is implemented as an STL-style algorithm. Before concluding that C++ AMP does not meet your needs, I would try using this reduce implementation, as it will be trivial to do and may give you much better perf. I'd be interested to see how you get on.
The AMP Book Codeplex site contains a helper class for timing C++ AMP kernels. The accompanying book also discusses implementing reduction. It has an entire chapter on it.
Here's a further example that uses the AMP Algorithms Library to implement a solution to your problem using a map/reduce pattern.
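The shape of the map/reduce pattern can also be illustrated without a GPU. The following is a minimal CPU-only sketch in standard C++, not the AMP Algorithms Library code: the map step turns each tetrahedron into a signed volume, and the reduce step sums the results. The `Tetra` layout and the function names are assumptions made for illustration.

```cpp
#include <numeric>
#include <vector>

struct Vec3 {
    double x, y, z;
};

// Hypothetical layout: three vertices, with the fourth fixed at the origin.
struct Tetra {
    Vec3 a, b, c;
};

// Map step: one tetrahedron -> its signed volume, a . (b x c) / 6.
double signed_volume(const Tetra& t) {
    return (t.a.x * (t.b.y * t.c.z - t.b.z * t.c.y) +
            t.a.y * (t.b.z * t.c.x - t.b.x * t.c.z) +
            t.a.z * (t.b.x * t.c.y - t.b.y * t.c.x)) / 6.0;
}

// Reduce step: sum the mapped values over the whole input.
double total_volume(const std::vector<Tetra>& tetras) {
    return std::accumulate(
        tetras.begin(), tetras.end(), 0.0,
        [](double sum, const Tetra& t) { return sum + signed_volume(t); });
}
```

On a GPU the reduce step would typically be a parallel tree reduction rather than a sequential sum, but the split between the two steps is the same.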
Ade Miller
You should be able to speed it up a bit by factoring out common terms.
Note that your formula for tetrahedron volume:
is equivalent to:
The original formula has 12 multiplications, and the equivalent formula has 9 (25% fewer). It is hard to say how big the total improvement will be, but I would not be surprised if it gives you 20%.
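To make the multiplication counts concrete, here is a sketch of the two forms in plain C++. It assumes the original formula is the fully expanded six-term determinant, which would account for the 12 multiplications; the function names are hypothetical, and the final division by 6 is not counted in either version.

```cpp
// Fully expanded determinant: six terms of two multiplications each (12 total).
double volume_expanded(double x1, double y1, double z1,
                       double x2, double y2, double z2,
                       double x3, double y3, double z3) {
    return (x1 * y2 * z3 - x1 * y3 * z2
          - x2 * y1 * z3 + x2 * y3 * z1
          + x3 * y1 * z2 - x3 * y2 * z1) / 6.0;
}

// Same determinant with x1, x2, x3 factored out: 9 multiplications.
double volume_factored(double x1, double y1, double z1,
                       double x2, double y2, double z2,
                       double x3, double y3, double z3) {
    return (x1 * (y2 * z3 - y3 * z2)
          + x2 * (y3 * z1 - y1 * z3)
          + x3 * (y1 * z2 - y2 * z1)) / 6.0;
}
```

Both functions return the signed volume. Whether the saved multiplications show up in the measured time depends on how much of the run is arithmetic rather than memory traffic.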
mvp