Mecway13 and Multi-threaded CCX

I've just installed v13 and am starting to migrate to it for some heavy (large models, lots of steps) nonlinear quasistatic analyses. It looks to be another very nice upgrade.

Two questions:

1. On p. 124 of the Mecway v13 user manual it lists some CCX environment variables (e.g. OMP_NUM_THREADS) that are now set by Mecway. How does a user specify within Mecway the values desired for these CCX environment variables? I've been working around this by having Mecway point to a .bat file that sets the environment variables and then starts CCX, but it would be cleaner if the environment variables could be set within Mecway.

2. I see that the Mecway v13 distribution includes source for the Pardiso solver, as well as some MKL and other library files. Does the ccx.exe in the distribution include any solver options beyond SPOOLES? If not, is there a place I can download a CCX executable with Pardiso for Windows 10? It's been years since I was deep enough into code development to do a complex build, and I don't have any such tools on my hardware here.

Thanks!

Comments

  • see this thread
    http://mecway.com/forum/discussion/750/propeller-hub/p1

    The method I suggest (see 3rd-4th entry) is not the only way, but it is simple and works. I have upgraded to 2.16 and it works fine.
  • edited November 2020
    Since that thread JohnM referred to is quite complicated and a bit outdated, here's a summary of the CCX options in increasing order of power and difficulty:

    1) SPOOLES. The slowest solver, and the one used by the default CCX included with Mecway.

    2) MKL CCX downloaded from http://www.dhondt.de/ where it says "For an update of the bconverged distribution replace the executables in the bconverged download by the following files .". Extract ccx_PARDISO.exe and put it in %ProgramFiles%\Mecway\Mecway13\ccx, or wherever Mecway was installed. Then select it in Mecway through Tools -> Options -> CalculiX -> Solver. This uses the MKL files included with Mecway.

    3) MKL CCX as in 2) but also install MKL from https://software.intel.com/en-us/mkl/choose-download/windows and copy all the DLL files from %ProgramFiles(x86)%\IntelSWTools\compilers_and_libraries_2019.4.245\windows\redist\intel64_win\mkl to the same location as ccx_pardiso_dynamic.exe. This takes advantage of CPU features like AVX2 and is significantly faster than 2).

    4) Compile CCX with MKL and a patch to enable Out-Of-Core (OOC) mode. The source code with the patch and instructions for compiling it are installed with Mecway in its ccx folder. After compiling, install it as with 2) or 3). This allows you to solve bigger models than will fit in RAM.

    For options 2), 3), and 4), you can also set the environment variable OMP_NUM_THREADS to the number of threads (e.g. 8) for multithreading on multiple cores.
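
The DLL copy in option 3) can also be scripted. This is only a minimal sketch, not part of Mecway or MKL: the function name `copy_mkl_dlls` is made up for illustration, and both folder paths are whatever applies on your own installation.

```python
import shutil
from pathlib import Path


def copy_mkl_dlls(redist_dir, ccx_dir):
    """Copy every *.dll from the MKL redist folder next to the CCX executable.

    Hypothetical helper for illustration; returns the copied file names.
    """
    redist = Path(redist_dir)
    target = Path(ccx_dir)
    if not redist.is_dir():
        raise FileNotFoundError(f"MKL redist folder not found: {redist}")
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in sorted(redist.glob("*.dll")):
        # copy2 keeps the file timestamps, which helps when comparing versions later
        shutil.copy2(dll, target / dll.name)
        copied.append(dll.name)
    return copied
```

It would be called with the redist folder from option 3) as the first argument and the folder holding the CCX executable as the second. Writing under %ProgramFiles% will probably need an elevated prompt.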
  • The manual for 13.0 is wrong about OMP_NUM_THREADS, sorry. It no longer sets that variable. Thanks for bringing this up.

    I set the environment variable globally in Windows (Control Panel\All Control Panel Items\System -> Advanced system settings -> Environment variables), but that's a good point that it would be easier done in Mecway.
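
For anyone who prefers not to set the variable globally, a small launcher can set it for a single CCX run, much like the .bat workaround from the original post. This is a rough sketch under assumptions: the names `ccx_environment` and `run_ccx` are invented here, and how Mecway passes arguments to the solver is not covered in this thread, so the launcher simply forwards whatever arguments it is given.

```python
import os
import subprocess


def ccx_environment(threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set.

    Hypothetical helper; the caller's environment is left untouched.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def run_ccx(ccx_exe, arguments, threads):
    """Start CCX with the given arguments and thread count; return its exit code."""
    completed = subprocess.run([ccx_exe, *arguments], env=ccx_environment(threads))
    return completed.returncode
```

Because the variable is only set in the child process, other programs that read OMP_NUM_THREADS are not affected.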
  • Is there a reason why an out-of-core MKL version of CCX is not readily available?
  • Its GPL license prohibits distribution of a binary compiled with proprietary libraries.
  • Licences :-(
  • Great summary - thanks Victor! This is a strong support community.
  • Hi Victor,
    I'm trying to follow the build instructions for compiling the large model ccx you describe in option 4 above, and have run into a couple of snags (I've not done this before):
    1. There is no ccx/src/etc/ directory, but there is a build.sh in ccx/src. Is that the correct one?
    2. Where exactly in build.sh is the path defined that must be changed to MINGW_HOME? There is a command cp -p $BUILD_HOME/$file. Do you change $BUILD_HOME to MINGW_HOME near the bottom of the file under Building CCX.....?
    Thanks.
  • 1. Yes. /etc in the path is a mistake, sorry.

    2. No, it's this line export MINGW_HOME=/c/msys64/mingw64
    Only change it if /c/msys64/mingw64 isn't the correct location of mingw64 on your installation, such as if it's not on drive C:.

    Thanks for bringing up these issues. I'll make them clearer in the next release.
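
To check which location build.sh currently points at before editing it, something like the following might help. It is only a sketch: `find_mingw_home` and `msys_to_windows` are illustrative names, and the only thing assumed about build.sh is the `export MINGW_HOME=...` line quoted above.

```python
import re


def find_mingw_home(build_script_text):
    """Return the value assigned by 'export MINGW_HOME=...', or None if absent."""
    match = re.search(r"^\s*export\s+MINGW_HOME=(\S+)", build_script_text, re.MULTILINE)
    return match.group(1) if match else None


def msys_to_windows(path):
    """Convert an MSYS2-style drive path such as /c/msys64 to C:\\msys64."""
    match = re.fullmatch(r"/([A-Za-z])(/.*)?", path)
    if not match:
        raise ValueError(f"not an MSYS drive path: {path}")
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return f"{drive}:" + rest.replace("/", "\\")
```

Feeding the text of build.sh to `find_mingw_home` and the result to `msys_to_windows` gives a Windows path that can be checked in Explorer. If that folder does not exist, the export line is the one to change.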
  • OK, thanks Victor. I also noticed that you want to copy some MKL files into ccx/mkl, but I show mkl in ccx/src/mkl.....correct? Please also verify the correct path for the final compile to be ccx/src
  • edited April 2020
    You're right - ccx/src/mkl. So many mistakes!

    ccx/src is where build.sh that you run to compile it is. The final output files end up in ccx/src/x64/install.
  • I managed to get the compiler to build something, and got the following in the ccx/src/x64/install folder:
    1. ccx.exe
    2. ccx_MKL.exe
    3. Various libxx.dll, mklxxx.dll, pthreadGC2.dll, and license.rtf

    I noticed there were no mkl_vml_avx files as suggested in 3 above, and no ccx_pardiso_dynamic as used previously.

    I ran the bolt file donated as a benchmark previously, and got roughly the same execution time for the previous pardiso version vs the ccx_MKL version.

    What exactly is the difference between the two ccx versions?

  • both methods you ask about use the same pardiso version in the intel mkl. the bolt assembly was something i made for a forum user, to try and show how to apply a bolt load. someone else started using it as a benchmark.
  • @tk1537, if I understood correctly, what you get by following option 4) is CCX with out-of-core capability. The CCX executable available from the CalculiX site works in-core (it keeps the whole stiffness matrix in RAM), so if your problem is too big for your RAM, you get an error and it cannot be solved. Out-of-core means that if the matrix is too big for the RAM, it will use the disk as well, so it will be a little slower, but at least it will let you solve that big model. The bolt assembly benchmark will not show any difference because the problem is small enough to fit entirely in RAM. You should try a bigger problem that doesn't run with the standard CCX and see if it can be solved with the new one.
  • prop_design, thanks for the clarification. I would have given you credit but I couldn't recall who originally generated the model. I asked about the compiled ccx_mkl because Victor indicated that if the instructions were followed, as I did, that it would generate an executable that would have a higher node/element count capability, whereas the other ccx_pardiso_dynamic from dhondt.de did not. Did I misinterpret this?
  • Regarding the extra MKL DLLs for option 4), you have to copy them from the MKL installation as in option 3).
  • edited April 2020
    i'm not sure about compiling it yourself. i would think it would generate the same thing. but victor would know more. using the ccx_pardiso_dynamic definitely allows for faster solves and much bigger models. that much i can attest to. i have never tried compiling it myself. however, both methods would use the same solver. that is the pardiso solver in the mkl. if you don't compile it yourself, it still uses the mkl. so i wouldn't think it would be much different either way. no worries about credit. i was just saying, it wasn't intended as a benchmark. i guess it turned out to be a good one.

    at one time, i asked victor about the latest version of pardiso. on the pardiso website, they claim it is even faster than the older version in the intel mkl. i don't know if intel had a falling out with the pardiso developers or what. it's odd to have different versions of the same solver. victor looked into licensing it directly. however, they wanted a huge amount of money to do that. the intel version you can use for free. victor isn't allowed to distribute it though. thus, all the end user confusion.
  • edited May 2020
    prop_design,
    in the screenshot above something is wrong, since the registration form cannot be filled out :(
    Is the site broken? (Sorry, this is for Parallel Studio XE.)
    I'm moving my question here.
    The problem is already solved: the web browser was the problem. With ChG I couldn't see the fields to fill in; with IE it's OK. Done, I have the download now :) Thanks.
  • Those mistakes in the build process mentioned above are corrected in the current version of Mecway (13.1). Please don't use the CCX build script or instructions in v13.0 because they're so bad.

    But there's a new problem. GCC and gfortran were updated to version 10.1 this month and they break the build with these errors:
    Error: Missing actual argument for argument '_formal_15' at (1)
    Error: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1)
    To get it working, add some extra compiler flags to 3 files like this:

    ccx/src/patches/ARPACK/ARmake.inc

    Replace
    FFLAGS = -O3
    with
    FFLAGS = -O3 -fallow-argument-mismatch


    ccx/src/patches/CalculiX/ccx_2.16/src/Makefile

    Replace
    FFLAGS = -Wall -O2 -fopenmp
    with
    FFLAGS = -Wall -O2 -fopenmp -fallow-argument-mismatch


    ccx/src/patches/CalculiX/ccx_2.16/src/Makefile_MKL

    Replace
    FFLAGS = -Wall -O2 -fopenmp
    with
    FFLAGS = -Wall -O2 -fopenmp -fallow-argument-mismatch

    and replace
    CFLAGS = -Wall -O2 -I ../../../SPOOLES.2.2 -I ../../../pthreads-w32-2-9-1-release/Pre-built.2/include -DARCH="Linux" -DSPOOLES -DPARDISO -DARPACK -DMATRIXSTORAGE -DNETWORKOUT -D_SC_NPROCESSORS_CONF=1 -posix
    with
    CFLAGS = -Wall -O2 -I ../../../SPOOLES.2.2 -I ../../../pthreads-w32-2-9-1-release/Pre-built.2/include -DARCH="Linux" -DSPOOLES -DPARDISO -DARPACK -DMATRIXSTORAGE -DNETWORKOUT -D_SC_NPROCESSORS_CONF=1 -posix -fcommon
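
The three edits above all append a flag to an existing `FFLAGS` or `CFLAGS` line, so they can be applied with a short script instead of by hand. This is a sketch only: `add_flag` is a made-up helper, and it assumes each file defines the variable on a single `NAME = ...` line as shown above and has been read in text mode (so line endings are plain `\n`).

```python
import re


def add_flag(makefile_text, variable, flag):
    """Append flag to every 'VARIABLE = ...' line that does not already contain it."""
    pattern = re.compile(rf"^({re.escape(variable)}[ \t]*=.*?)[ \t]*$", re.MULTILINE)

    def patch(match):
        line = match.group(1)
        if flag in line.split():
            return line  # already patched, leave the line alone
        return f"{line} {flag}"

    return pattern.sub(patch, makefile_text)
```

For example, `add_flag(text, "FFLAGS", "-fallow-argument-mismatch")` covers the two Fortran edits and `add_flag(text, "CFLAGS", "-fcommon")` covers the C one. Running it twice leaves the file unchanged, so it is safe to re-run after a partial edit.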

  • The source files included with Mecway 13.1 are also incompatible with newer MKL versions. Here is an update of the complete CCX source code package that incorporates the above changes as well as making it compatible with current and hopefully future MKL versions. It's also easier to use, omitting the requirement to download a separate file.

    https://mecway.com/download/ccx_win64_mkl_pardiso_source_2.16_2020-06-04.zip

  • "3) MKL CCX as in 2) but also install MKL from https://software.intel.com/en-us/mkl/choose-download/windows and copy all the DLL files from %ProgramFiles(x86)%\IntelSWTools\compilers_and_libraries_2019.4.245\windows\redist\intel64_win\mkl to the same location as ccx_pardiso_dynamic.exe. This takes advantage of CPU features like AVX2 and is significantly faster than 2)."

    Could we then uninstall MKL after completing this? Or does the installation need to stick around?
  • Yes, you can uninstall MKL afterwards.
  • Thanks Victor!
  • Hello,
    How can I get SPOOLES or PARDISO for multithreading?
    Thanks
  • Intel MKL Pardiso files are included with Mecway 13.1. You just need to get the CCX version that has been compiled for it, according to step 2 in this comment: https://mecway.com/forum/discussion/comment/4863#Comment_4255

    Or you can download MKL (now called oneMKL) from the link in step 3 there if you want to recompile or use the full library.
  • While we are on this subject, I will point out that PASTIX is a rocket of a solver, but I have found that you need to keep your PARDISO handy, because PASTIX will sometimes struggle to converge when PARDISO does not.