
Originally Posted by pkohut
While technically true, there is no free lunch. "Easily parallelized" and "simple OpenCL kernel" are, again, dependent on the OP's capabilities and understanding of parallel programming issues.
It's not that bad. If you have a for loop such as the following:
for(int r = 0; r < rows; ++r) {
    for(int c = 0; c < cols; ++c) {
        doSomething(r, c);
    }
}
it can be rewritten as:
QFutureSynchronizer<void> waiter;
for(int r = 0; r < rows; ++r) {
    for(int c = 0; c < cols; ++c) {
        waiter.addFuture(QtConcurrent::run(doSomething, r, c));
    }
}
waiter.waitForFinished(); // or do something else in the meantime
This doesn't require any special skills, and provided that doSomething() does more than just add two ints together, you will get a decent speed improvement on a multicore system. Of course, there is a good chance the loop can be replaced with a single call to QtConcurrent::mappedReduced(). OpenCL is trickier, although I see the OP's calculations are mostly simple vector operations (additions and multiplications), so the code shouldn't need many changes.
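To give a feel for the map-then-reduce shape mentioned above, here is a rough sketch that deliberately uses only std::async from the standard library instead of QtConcurrent, so it compiles without Qt. The names cellValue(), rowSum() and parallelGridSum() are made up for illustration, and cellValue() is just a stand-in for a doSomething() that returns a result. It assumes the per-cell calls are independent of each other. With Qt, the per-row function and the accumulation step would roughly correspond to the map and reduce functors passed to QtConcurrent::mappedReduced().

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for doSomething(r, c) that returns a value,
// so that there is something to reduce. Assumed to be independent
// per cell (no shared mutable state).
long cellValue(int r, int c) {
    return static_cast<long>(r) * c;
}

// Map step: one task per row rather than one per cell, so each task
// does enough work to be worth scheduling.
long rowSum(int r, int cols) {
    long sum = 0;
    for (int c = 0; c < cols; ++c) {
        sum += cellValue(r, c);
    }
    return sum;
}

// Launch one asynchronous task per row, then reduce the per-row
// results on the calling thread.
long parallelGridSum(int rows, int cols) {
    std::vector<std::future<long>> futures;
    futures.reserve(rows > 0 ? static_cast<std::size_t>(rows) : 0);
    for (int r = 0; r < rows; ++r) {
        futures.push_back(std::async(std::launch::async, rowSum, r, cols));
    }

    long total = 0;
    for (auto &f : futures) {
        total += f.get(); // blocks until that row's task has finished
    }
    return total;
}
```

For example, parallelGridSum(3, 4) returns 18. Note that std::launch::async may start a separate thread per task, whereas QtConcurrent hands the work to a thread pool, so for a large number of rows the Qt version is likely the better fit.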