Bill Allombert on Mon, 16 Jun 2003 22:56:47 +0200 |
[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]
Re: memory management |
On Mon, Jun 16, 2003 at 01:18:57PM -0400, Igor Schein wrote: > Hi, > > I am trying to understand the general concept of memory management in > PARI. I am aware that pari uses both stack and heap memory, but I'd > like to hear what are strengths and weaknesses of each type of > allocation, and what are the considerations whenever one is chosen > over the other. My lame perspective is the following: > > - Stack memory helps you adopt to the size of physical memory of > machine. A huge drawback is that you may have a job that has been > running for a long period of time, and when you run out of stack, > that time has been completely wasted. To safeguard against that, I > usually start gp (32bit) with the stack size of about 90% of physical memory > of the machine or 2GB, whichever is smaller. > > - Heap memory has more flexibility, but can potentially bring a > machine to a halt on large memory jobs by eating up all virtual > memory. The above is just secondary effect on how things are implemented, for example you can implement stack to grow on demand, and you can limit the amount of heap memory to a maximum. So we need more background on how things are implemented: 1) The PARI stack is not the system stack, so it is in fact a part of the heap. 2) We use malloc(3) to allocate the PARI stack on the heap. In fact we do not have much choice, since PARI/GP is a library. Allocating memory without malloc would forbid programs linked with libpari to use malloc() which is not acceptable. This would be a real pain for GP already. 3) malloc(3) does not permit to extend a memory block in place, which is what is required to expend the PARI stack. Now what is the purpose of the PARI stack: 1) The PARI stack is very efficient way to allocate a lot of small memory block. This is very important if you have recursive object. Here the basic code to allocate memory on the PARI stack, without casts: GEN new_chunk(long x) { const GEN z = avma - x; if (x > avma-bot) err(errpile); avma = z; return z; } This is essentially 2 substractions and 1 comparison. A lot of system use a stack. 2) By contrast, malloc() is usually very inefficient for allocating/freeing lots of small block, but is much better for a few large blocks. So, to avoid overflowing the stack, when we know we have a few large block to allocate/free we use the heap. 3) The main drawback of the stack is that garbage collecting change addresses of object in the stack. To work-around this problem, we use clone which are temporary copy of GEN in the heap that we free later. This may make stack management easier in hard case. 4) GP use the heap to store contents of GP variable and history, so GP programs may sometimes use less stack that their C counterpart (but more heap). There are several ways to avoid stack overflow, but due to the point you raised, we never felt they were worth implementing. Also computer time is very cheap compared to computer RAM. 1) The good: (this is more or less what you do) We suppose the OS use copy on write for allocate memory. This allows us to start with a really large stack without in fact using that much memory. Only dirty pages really take up system resource. We let GP free()/malloc() the stack when it return, to refresh the pages. The drawback curently is that PARI will make no effort to not fill the stack, and thus will use more memory than required. This is easy to fix and I can eventually do it if I receive enough push :). This solution is transparent for system with poor VM engine. 2) The bad: mmap(2)ing /dev/zero with MAP_FIXED to allocate the stack at some specified address, so that we can reallocate it at the same place. This look clever, but not only it is not portable but also MAP_FIXED is not waranted to work at all. 3) The ugly: When the stack overflow we allocate a new stack twice as large as the current one and we copy the old one a the start of the new one. We keep the old one around until GP return, and make function that matter about stack connexity to promote operand in the old stack to the new one. This is slow and memory inefficient: for example with a 1Gb of stack, PARI will use 3Gb of stack after doubling once and 7Gb after doubling twice (to a 4Gb stack). Cheers, Bill.