Thread: OpenMP on Windows with MinGW32: almost no acceleration in release mode

  1. #1
    Join Date
    Nov 2009
    Laval, France
    Thanked 2 Times in 2 Posts

    I created a console project to test OpenMP inside Qt Creator with MinGW32. My Qt version is 5.3 and the compiler is mingw482_32.
    Here is the project file:
    Qt Code:
    #-------------------------------------------------
    #
    # Project created by QtCreator 2021-06-18T17:05:54
    #
    #-------------------------------------------------

    QT += core
    QT -= gui

    TARGET = OpenMPTest
    CONFIG += console
    CONFIG -= app_bundle

    TEMPLATE = app

    win32:CONFIG(release, debug|release):QMAKE_CXXFLAGS += -std=c++11 -O3
    QMAKE_CXXFLAGS+= -openmp
    QMAKE_LFLAGS += -fopenmp

    SOURCES += main.cpp \
        testOpenMP.cpp

    HEADERS += \
        testOpenMP.h

    And here is the content of testOpenMP.cpp:
    Qt Code:
    #include <iostream>
    #include <cstdio>   // getchar
    #include <math.h>
    #include <time.h>
    #include <omp.h>

    #define SIZE_ARRAY 20000

    int array_floor_total[SIZE_ARRAY];

    void do_compute(int j)
    {
        double total = 0;
        for (int i = 0; i < SIZE_ARRAY; ++i)
            total += sqrt(i + j);
        int floor_total;
        floor_total = floor(total);
        array_floor_total[j] = floor_total % 2; // the various threads need to write to common memory
    }

    void test_accellerate_loop()
    {
        int end = SIZE_ARRAY;

        // Serial reference run
        clock_t t1 = clock();
        for (int i = 0; i < end; ++i)
            do_compute(i);
        clock_t t2 = clock();
        std::cout << "time taken (no acceleration)" << t2 - t1 << "\n";

        // Same loop with the OpenMP directive
        clock_t t3 = clock();
        #pragma omp parallel for // This OMP directive tells the compiler to parallelise the next loop
        for (int i = 0; i < end; ++i)
            do_compute(i);
        clock_t t4 = clock();
        std::cout << "time taken (with acceleration)" << t4 - t3 << "\n";

        std::cout << "Press return\n";
        getchar(); // pause
    }

    The main file is essentially a call to test_accellerate_loop();

    Now here's the unexpected result:
    the acceleration is considerable in Debug mode, but almost non-existent in Release mode.
    Here is one MinGW32 output (values are clock() ticks):
    time taken (no acceleration)840
    time taken (with acceleration)847

    I compiled the same testOpenMP.cpp in Visual Studio and there the difference is large. The program also runs considerably faster when launched outside Visual Studio.
    Output of the Visual Studio executable, run from the command line:
    time taken (no acceleration)1076
    time taken (with acceleration)203

    Any explanation? Do I have the wrong optimiser flags?

  2. #2
    Join Date
    Jan 2008
    Alameda, CA, USA
    Thanked 864 Times in 851 Posts

    This is not a Qt Programming question, so I have moved your thread to the General Programming section.

    There are many explanations I can think of:

    1 - g++ is very good at optimizing and parallelizing loops, so OpenMP doesn't add much advantage.
    2 - g++ has optimized and inlined your simple function.
    3 - The OpenMP implementation in g++ has significant overhead, which leaves little net gain for a small number of evaluations of a simple calculation.
    4 - MSVC is not very good at optimizing loops.
    5 - MSVC did not inline your function and left it as a function call.
    6 - MSVC produces slower serial code, so OpenMP has more room to give a performance boost.
    7 - Differences in the compilation flags for g++ vs. MSVC gave different degrees of optimization of the non-OpenMP code.
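On the flags point specifically, it may be worth confirming that the pragma is honoured at all in the MinGW build. GCC enables OpenMP with -fopenmp, and the project file above passes -openmp in QMAKE_CXXFLAGS (only the linker line has -fopenmp). A minimal sketch of a check, assuming nothing beyond the standard `_OPENMP` macro and `omp.h`; the helper names are hypothetical and not from the original post:

```cpp
#include <string>
#ifdef _OPENMP
#include <omp.h>
#endif

// True only if this translation unit was compiled with OpenMP enabled.
// Compilers that support OpenMP define _OPENMP when the feature is switched on.
bool openmp_enabled()
{
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Number of threads that actually execute a parallel region.
// If OpenMP is off, the pragma is ignored and the answer stays 1.
int threads_in_parallel_region()
{
    int count = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // One thread records the team size; the others wait at the implicit barrier.
        #pragma omp single
        count = omp_get_num_threads();
    }
#endif
    return count;
}

// Human-readable summary to print at program start (hypothetical helper).
std::string describe_openmp()
{
    return std::string("OpenMP ") + (openmp_enabled() ? "enabled" : "disabled") +
           ", threads in parallel region: " +
           std::to_string(threads_in_parallel_region());
}
```

Printing `describe_openmp()` at the top of the test in both the Debug and Release builds would show whether the two configurations are really being compiled with the same OpenMP setting.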

    Also, most compilers turn off optimization in debug mode, so using a debug mode build to test performance isn't really valid. Depending on the compiler, optimization can inline function calls, unroll loops, or make other changes so that the release and debug mode versions of the program aren't really the same.

    I think you need to first research the compiler flags to make sure your non-OpenMP code is built on as close to an apples-to-apples basis as possible so you really do have the same starting point for comparison. And second, you need to make your test calculation more difficult, so it takes longer to execute and can't be optimized by the compiler, and you need to evaluate it maybe millions of times, not just 20000.
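A rough sketch of what a heavier, harder-to-optimize test could look like. It assumes a workload modelled on do_compute from the question, with two changes that are my own: every result feeds a checksum so the compiler cannot discard the work, and elapsed time is measured with a wall clock (`std::chrono::steady_clock`) instead of clock(), which is specified to report processor time and on some platforms sums it over all threads. All function names are hypothetical.

```cpp
#include <chrono>
#include <cmath>

// Same style of calculation as do_compute, but the result is returned
// so the caller can accumulate it and the work cannot be optimized away.
int parity_of_sqrt_sum(int j, int inner)
{
    double total = 0;
    for (int i = 0; i < inner; ++i)
        total += std::sqrt(static_cast<double>(i) + j);
    return static_cast<int>(static_cast<long long>(std::floor(total)) % 2);
}

// Serial reference: sum of the parities over all outer iterations.
long long checksum_serial(int outer, int inner)
{
    long long sum = 0;
    for (int j = 0; j < outer; ++j)
        sum += parity_of_sqrt_sum(j, inner);
    return sum;
}

// Parallel version: the reduction clause gives each thread a private
// partial sum and combines them at the end, so no locking is needed.
long long checksum_parallel(int outer, int inner)
{
    long long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int j = 0; j < outer; ++j)
        sum += parity_of_sqrt_sum(j, inner);
    return sum;
}

// Wall-clock duration of f() in milliseconds.
template <typename F>
double wall_ms(F f)
{
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Timing `checksum_serial` and `checksum_parallel` through `wall_ms` with the same arguments, then checking that both checksums agree, gives a comparison where the serial and parallel paths do identical work. Raising `outer` and `inner` makes each run long enough for thread start-up cost to matter less.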

    I don't know if it is the case, but because you are accessing a global array in your compute function, there could also be some access control locking that basically defeats the parallelization in g++.
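One way to test the locking idea: OpenMP does not insert locks around ordinary writes to a shared array, and when each iteration writes only its own element there is no data race to protect against. The sketch below is an illustration under that assumption, with hypothetical names; it fills an array serially and in parallel and lets the caller compare the two. Neighbouring elements can still share a cache line, so some false-sharing cost is possible, but with one write per long-running iteration it is unlikely to dominate.

```cpp
#include <cmath>
#include <vector>

// Per-element work, modelled on do_compute from the question.
int parity_for(int j, int n)
{
    double total = 0;
    for (int i = 0; i < n; ++i)
        total += std::sqrt(static_cast<double>(i) + j);
    return static_cast<int>(static_cast<long long>(std::floor(total)) % 2);
}

// Serial fill: out[j] depends only on j.
std::vector<int> fill_serial(int n)
{
    std::vector<int> out(n);
    for (int j = 0; j < n; ++j)
        out[j] = parity_for(j, n);
    return out;
}

// Parallel fill: every iteration writes a distinct element of the shared
// vector, so no critical section, atomic, or lock is required.
std::vector<int> fill_parallel(int n)
{
    std::vector<int> out(n);
    #pragma omp parallel for
    for (int j = 0; j < n; ++j)
        out[j] = parity_for(j, n);
    return out;
}
```

If `fill_serial(n) == fill_parallel(n)` holds and the parallel version is still no faster, the shared array is probably not the bottleneck and the build flags are the more likely place to look.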