Hi all,
We keep getting the P-9 error code for large models.
Does this mean that there is not enough RAM available (we have 192 GB in our workstation), or could there be something else wrong with the model or the setup?
Is there a maximum number of nodes that the (internal) solver can handle, independent of the available RAM?
Comments
Sorry, bad quality, but maybe helpful nonetheless.
Approx. 1,500,000 nodes in the model.
I did solve a simple model with more than two million nodes in the meantime on my laptop.
@Victor, any idea what could be the limiting factor in the model? It does not seem to be the mere number of nodes that causes problems.
The same model that solves on the laptop will not solve on the workstation for some reason. It gets the P-9 error shortly after the "solving matrix" phase begins.
Laptop:
Workstation:
The workstation does have two different types of RAM installed. But the issue we have now was already there before we upgraded the RAM. At that point, all RAM modules were identical.
Does anyone have a clue?
I remember that there have been problems with the Intel Xeon before, but they were already fixed in v19, I think...
There is a parameter for the MKL solver which controls how much memory it can use. If this is set too low, it causes that P-9 error. I'm not sure what happens if it's set too high but it will likely also cause big models to fail. The default value is the minimum of the available memory and 14.4 GB. I've added an option to choose the maximum value under the Labs menu in the (released today) version 20. My guess is that on your laptop it might not have enough memory to use that 14.4 GB maximum but the workstation does. So maybe try reducing it?
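To make the default rule above concrete, here is a minimal sketch in C. The function name `default_limit_gb` is hypothetical and this is not Mecway's actual code; it only restates "minimum of the available memory and 14.4 GB" so the difference between two machines is easy to see.

```c
/* Cap described in the post above, in GB. */
#define DEFAULT_CAP_GB 14.4

/* Hypothetical helper: the default limit is the smaller of the
   available memory and the 14.4 GB cap. */
double default_limit_gb(double available_gb)
{
    return available_gb < DEFAULT_CAP_GB ? available_gb : DEFAULT_CAP_GB;
}
```

Under this rule, a machine with 8 GB available would default to 8 GB, while any machine with more than 14.4 GB available would default to 14.4 GB.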
To be honest, I don't quite understand what you mean. You are suggesting that reducing the value in Mecway with the Labs option might help, although the P-9 error is caused if it is set too low. If the workstation wants to use more than 14.4 GB, shouldn't the value be increased?
I'll give it a try anyway and see what happens.
I'll also try to find a stress test and see if we have a hardware issue.
Thanks for the advice!
From what I understand, OOC requires some RAM too, and it's not allowed to use more than this value. A 2 million node model might require more than 14.4 GB of RAM for OOC and fail if it hits the limit. But it might also fail if you allow it to use more than it can index with its 32-bit integers (hence OK on laptop if there isn't enough RAM to attempt that). Not sure which problem is happening or which direction to adjust the value.
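As a rough illustration of the 32-bit indexing concern, the sketch below computes how many bytes a signed 32-bit index can address. It assumes 8-byte entries, which is only an assumption for the arithmetic and not something confirmed about the MKL solver's internals. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest buffer size in bytes when entries are counted with a signed
   32-bit index and each entry occupies entry_size bytes. */
uint64_t max_bytes_with_int32_index(uint64_t entry_size)
{
    return (uint64_t)INT32_MAX * entry_size;
}

/* Returns 1 if a buffer of limit_bytes would hold more entries than a
   signed 32-bit index can count, 0 otherwise. */
int exceeds_int32_index(uint64_t limit_bytes, uint64_t entry_size)
{
    assert(entry_size > 0); /* avoid division by zero */
    return limit_bytes / entry_size > (uint64_t)INT32_MAX;
}
```

With 8-byte entries the ceiling is 17,179,869,176 bytes (about 17.2 GB), so under that assumption a 14.4 GB limit stays below it and a 20 GB limit does not. Whether this has anything to do with where the 14.4 GB default came from is a guess.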
The laptop has 64 GB, by the way, so it should have enough RAM available to use or try to exceed the specified value of 14.4 GB.
The allocation and assembly phases also seem quite a bit quicker in V20 compared to V20Beta. I don't know whether that is due to the above parameter (20 GB) or to the solver improvements.
Trying to set the value to 100 GB next, just for the sake of it...
V20Beta, parameter not set (default)
V20, parameter set to 20 GB
The time to the solving matrix phase goes down from approx. 6 h 15 min to 3 h 30 min.
Rocketship....
Trying 160 GB next ...
I wonder if it should always be set to ~infinity. I'm not even sure there was ever a good reason for having that 14.4GB limit.
@MikeMcMullen I agree that it's probably using less or no OOC speeding it up.
I'll try setting a value that is higher than the available RAM and see what happens.
I had the parameter set to 1000 GB, which is above the capacity of the system hard drive.
When I checked the value in Mecway after the solver failed, the parameter had been reset to zero, which had not been the case before. So maybe Mecway figured out that I don't have as much RAM as requested and reset the value to zero for some reason, which then again was not enough to solve...
I'll try something above the available RAM but below the available hard drive space and see what happens.
By the way, I checked the Task Manager: comparing RAM usage at idle and while solving, it appears that Mecway is using about 64 GB of RAM during the matrix solver phase.
Setting the parameter to 200 caused the P-9 error (the machine has 192 GB RAM) and the parameter was again reset to zero. I'll try 190 GB next, so just below the installed RAM.
So it seems that it does not hurt to set it as high as you like as long as you stay below the installed RAM.
Looks like it's time for another PC. Is there any way to know roughly how much RAM a model will take to solve? I'm importing STEP geometry, and the largest part is 100MB. That one part meshes to well over 1M nodes...and that was after some simplification so meshing would succeed. Of primary concern is buckling, and likely nonlinear 3D analysis is appropriate.
Without more RAM, you should compile MKL Pardiso CCX from the source included in Mecway to get the out-of-core (OOC) functionality. That's faster than letting it use Windows's paging and could prevent out-of-memory if the pagefile reaches its limit. I would never solve a model too big for RAM without using OOC.
Thank you!
You don't need to set the MKL environment variables since Mecway does that when it calls CCX.
The solver I have Mecway pointed to is "ccx_MKL.exe". The unnecessary/redundant environment variables previously mentioned that I set are no longer set, so maybe CCX deleted them, but I ran the solve a second time just in case the variables that I set were present the first run, and caused a problem. Same failure the second run.
I tried changing the analysis type from Nonlinear Static 3D to Static 3D, thinking that it might be less demanding on memory. It ran for 16 minutes and ended with the same error.
The compile for the MKL version appeared to go smoothly: I searched buildlog.txt for the string "Error:" and found none.
For those who can read C (whatever I knew is mostly forgotten), this is line 59 of insert.c, the line mentioned in the error message (line numbers added by me).
57 if(*ifree>=*nzs_){
58 *nzs_=(ITG)(1.1**nzs_);
59 RENEW(mast1,ITG,*nzs_);
60 RENEW(next,ITG,*nzs_);
61 }
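One thing worth checking in the snippet above is whether the 10% growth on line 58 can still be represented by the integer type. The sketch below is a standalone check, not CalculiX code. It assumes `ITG` is a 32-bit signed `int` in this build (CalculiX can also be compiled with 64-bit integers, in which case this concern would not apply), and `growth_fits_int32` is a hypothetical name. Whether this is what actually triggered the allocation error is only a hypothesis.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if 1.1 * count can still be stored in a signed 32-bit
   integer, 0 if the grown value would be out of range. The comparison
   is done in double precision, so nothing overflows in the check. */
int growth_fits_int32(int32_t count)
{
    assert(count >= 0); /* array sizes are never negative */
    double grown = 1.1 * (double)count;
    return grown <= (double)INT32_MAX;
}
```

For example, a count of 1,900,000,000 grows to 2,090,000,000 and still fits, while 2,000,000,000 grows to 2,200,000,000 and does not. Converting an out-of-range floating-point value to an integer is undefined behavior in C, so the result of the cast on line 58 would be unpredictable in that case.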
Any thoughts on how to move forward?
Entire CCX output follows.
************************************************************
CalculiX Version 2.19, Copyright(C) 1998-2021 Guido Dhondt
CalculiX comes with ABSOLUTELY NO WARRANTY. This is free
software, and you are welcome to redistribute it under
certain conditions, see gpl.htm
************************************************************
You are using an executable made on Mon Aug 14 18:47:44 PDT 2023
The numbers below are estimated upper bounds
number of:
nodes: 46574461
elements: 1059979
one-dimensional elements: 0
two-dimensional elements: 692928
integration points per element: 9
degrees of freedom per node: 3
layers per element: 1
distributed facial loads: 182016
distributed volumetric loads: 0
concentrated loads: 0
single point constraints: 49890816
multiple point constraints: 83151361
terms in all multiple point constraints: 532168705
tie constraints: 0
dependent nodes tied by cyclic constraints: 0
dependent nodes in pre-tension constraints: 0
sets: 7
terms in all sets: 2330845
materials: 2
constants per material and temperature: 2
temperature points per material: 1
plastic data points per material: 0
orientations: 692928
amplitudes: 1
data points in all amplitudes: 1
print requests: 0
transformations: 0
property cards: 0
*INFO reading *STEP: nonlinear geometric
effects are turned on
*WARNING reading *STATIC:
the minimum increment 0.0000000000000000
is smaller then 1.e-6 times the
step time;
the minimum increment is changed
to 9.9999999999999995E-007
which is the minimum of the initial
increment time and 1.e-6 times the step time
*WARNING in calinput: PEEQ-output requested
yet no (visco)plastic calculation
STEP 1
Static analysis was selected
Newton-Raphson iterative procedure is active
Nonlinear geometric effects are taken into account
Decascading the MPC's
Determining the structure of the matrix:
*ERROR in u_realloc: error allocating memory
variable=mast1, file=insert.c, line=59, size(bytes)=0, oldaddress=261025856
------------End of the CCX output