Hans L on Mon, 22 Apr 2019 03:46:09 +0200


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: Help understanding generating functions, (partitioning n into k parts)


Oh one other thing I forgot to mention/ask in regards to deriving a polynomial p_k(n) for specific k, I am wondering if the pentagonal number theorem has some application here?

-Hans

On Sun, Apr 21, 2019 at 8:35 PM Hans L <thehans@gmail.com> wrote:
Thanks for the replies,

I am starting to understand the how the GFs work a bit better now.  I also didn't really understand the big-O notation before, but I think I get it more-or-less.  (I know it in terms of computer science algorithmic complexity, but hadn't seen it used in series like this).

I see now that the example GF for the k'th column, which Dirk illustrated, "directly" generates the column values as coefficients, and not as some polynomial equation for arbitrary n that I was hoping for.
I am still trying to understand how one would attempt to derive those type of polynomials specific k though.  I guess what I was looking for is almost like a polynomial regression for k'th columns?  I played a little with trying to use polinterpolate to help with this but without any success.  I did notice at least one small feature of these formula which is that the denominator seems to follow the sequence https://oeis.org/A010790 with an offset -1.
so for k=2, denom is (2-1)!2! = 2
k=3 => 2!3! = 12
k=4 => 3!4! = 144, etc.
But I still don't know how to approach determining those polynomial coefficients.

Also just a note that when computing T(n,k) I am mostly interested in computing values for specific n,k pairs, not necessarily whole rows or columns, but I can certainly use some of these techniques and grab the last value (it just feels a bit wasteful, but maybe unavoidable).
I found I can use polcoeff as follows for a simplified version to pluck a single value, although it doesn't seem substantially more efficient than just getting the whole column:
T(n,k) = polcoeff(prod(j=1,k,1/(1-x^j)+O(x^n)),n-k)

I also realized the significance of the x^k numerator in the GF, which is simply to shift the rows for the table, so a column starts at n=k (which is why i have to get the n-k coefficient above).
I think I might be able to do something analogous to this in FLINT using their polynomial interface (I'm completely new to using FLINT as well).

Mike, I tested your functions briefly and they are quite fast for large n and smallish k! (where my current approach falls short, unless k<=5).
I may need to rethink my recursive approach.  Maybe some hybrid could be even more efficient; using iterative evaluation in addition to some of the "shortcuts" I'm working on.  Though I'm not yet sure they are applicable since it seems that its computing from bottom-up rather than top-down.  I need to take some more time to look them over and think about it.
I definitely have a lot more to think about now.

Anyways here is the code I have been working on, which optimizes for specific cases of ak+b>=n and k<=5, and takes advantage of the builtin numbpart, as well as caching of the sums of numbpart via A000070

A70=[1]; A000070(n) = if(n<0,0,if(#A70<=n, for(n=#A70, n, A70=concat(A70, A70[#A70]+numbpart(n)))); A70[n+1])
A008284(n, k) = {
    local(result,n1,k2);
    result = if(k>n,0,if(k==n,1,
        if(k>5,
            n1 = n-k; 
            if(2*k>=n,
                numbpart(n1),
                if(3*k+2>=n,
                    numbpart(n1) - A000070(n1-k-1),
                    if(4*k+5>=n,
                        numbpart(n1)-A000070(n1-k-1)+sum(j=k+1,floor((n1-1)/2),A000070(n1-2*j-1)),
                        numbpart(n1)-sum(j=k+1,floor(n1/3)-1,A008284(n1,j))-A000070(n1-floor(n1/3))+sum(j=floor(n1/3),floor((n1-1)/2),A000070(n1-2*j-1))
                    )
                )
            ),
            if(k==5,floor((n^4+10*(n^3+n^2)-75*n-45*n*(-1)^n+1440)/2880),
                if(k==4,floor((n^3+3*n^2-9*n*(n%2)+72)/144),
                    if(k==3,floor((n^2+6)/12),
                        if(k==2,floor(n/2),
                            if(k==1,1,0)
                        )
                    )
                )
            )
        )
    ));
    result
}

I am actually interested in efficiently computing for values of n ranging up to around 10^8-10^9 if possible!

As an example, my function trivially computes n=10^9,k=10^9/2 in about a second (because it simplifies to a single numbpart call), but struggles with n=200,k=6 taking closer to a minute!
Also I'm not entirely satisfied with having to cache A000070 for large values, but its at least better in terms of memory complexity than caching the whole 2d (triangular) array.


On Sun, Apr 21, 2019 at 6:56 PM Mike Day <mike_liz.day@tiscali.co.uk> wrote:
I replied earlier, copied below - I'd thought I was sending to the forum, but see I only sent to Hans L.
As I offered in that message, here are two or three functions which are reasonably close translations of
the J functions mentioned below.
Here they are:

\\ numbers of k-partitions of k, k+1,..., n - more efficient as k/n approaches 1
pnkd(n,k)={my(s=1+n-k, p=vector(s,i,1));
   if(k==0, return(p*0));
   for(i=2, k,
      for(j=i + 1, s,
         p[j]+= p[j-i];
         );
      );
p;
}

NB. numbers of m-partitions of n for m = 0,1,2,...k - more efficient as k/n approaches 0
\\ simpler version - slowed down by vertical shift of whole matrix in each major cycle
pnkvslow(n,k) = {my(j=vector(k+1,i,[k+3-i,i]),
             t=matrix(1+k,1+k), tk=vector(1+k));
   t[k+1,2]=1;
   for(i=1, max(0, n-1),
      for(l=2,k+1,
         tk[l]=t[k+1,l-1]+t[k+3-l,l] ;
         );
      if(l==k+1, return(tk));
      for(r=1, k, t[r,]=t[r+1,]);
      t[k+1,]=tk;
   );
tk;
}

NB. numbers of m-partitions of n for m = 0,1,2,...k - more efficient as k/n approaches 0
\\ faster version - avoids vertical shift of whole matrix in each major cycle
pnkv(n,k) = {my(j=vector(k+1,i,[k+3-i,i]),
             t=matrix(1+k,1+k), tk=vector(1+k), offset=0, kx=k+1, ky);
   t[k+1,2]=1;
   for(i=1, max(0, n-1),
      for(l=2,k+1,
         ky=1+(k+2-l+offset)%(k+1);
         tk[l]=t[kx,l-1]+t[ky,l] ;
         );
      if(l==k+1, return(tk));
      offset++;
      kx=1+(k+offset)%(k+1);
      t[kx,]=tk;
   );
tk;
}

For example:
(00:44) gp > pnkd(20,3)
%317 = [1, 1, 2, 3, 4, 5, 7, 8, 10, 12, 14, 16, 19, 21, 24, 27, 30, 33]
(00:45) gp > pnkv(20,10)
time = 15 ms.
%319 = [0, 1, 10, 33, 64, 84, 90, 82, 70, 54, 42]
I believe the original J-language verbs were written by Roger Hui, as may be seen by
reference to
https://code.jsoftware.com/wiki/Essays/Partitions#Partitions_with_k_Parts
My version of pnkd may be compared with Dirk Laurie's implementation of a GF, which I
call pnk, here:

(00:52) gp > pnk
%323 = (n,k)->Vec(prod(j=1,k,1/(1-x^j)+O(x^n)))
(00:52) gp > pnk(20,3)
%324 = [1, 1, 2, 3, 4, 5, 7, 8, 10, 12, 14, 16, 19, 21, 24, 27, 30, 33, 37, 40]


Mike

On 21/04/2019 12:26, Mike Day wrote:

There are a couple of J (jsoftware.com) functions, not written by me(!),  which I could try

translating into PARI,  but I haven't got time right now. One, pnkv,  is more efficient for "low" k

than "high", the other, pnkd,  is said to better for k approaching n.

There's an essay here on the J Wiki pages on k-partitions,  using the recurrence relation/s you mention:

    https://code.jsoftware.com/wiki/Essays/Partitions#Partitions_with_k_Parts

Perhaps that will help a bit,  depending of course on the size of problem you're contemplating.  I'll
try to come back later with Pari versions when Easter lunch and crosswords are out of the way.

Mike

On 21/04/2019 02:53, Hans L wrote:
Hi, 
I'm interested in computing the number of partitions of n into exactly k parts ( https://oeis.org/A008284 ).  

This has the recurrence relation 
T(n,k) = T(n-1,k-1) + T(n-k,k)  OR
T(n,k) = sum(j=1,k,T(n-k,j))
With some initial/boundary conditions:
=1 when (k==1) || (k==n) || ((k==0) && (n==0))
=0 when (k>n) || (n != 0 && k < 1)

So I have some PARI code already which can do this(and indeed there is some listed in the link, which I started with), but I'm trying my best to make it more efficient for very large n values, without necessarily memoizing the entire table.  One thing that can help speed this up is to have polynomial equations for specific small k values.

I have adapted some equations for some small k values which are:
T(n,1)=1
T(n,2)=floor(n/2))
T(n,3)=floor((n^2+6)/12)
T(n,4)=floor((n^3+3*n^2-9*n*(n%2)+72)/144)
T(n,5)=floor((n^4+10*(n^3+n^2)-75*n-45*n*(-1)^n+1440)/2880)

I actually don't know how these equations are determined beyond k=2, but I found them from various OEIS pages.  My ultimate goal is to implement such a function in C using GMP or FLINT arbitrary precision libraries, but I'm using PARI as testing ground for getting the formulas right.
I use the floor function and add half the denominator for these(as opposed to calling round(...), which is how they were originally written), since this will translate better into integer division when I write the code for GMP.  I'm also slightly skeptical of the rounding in these formula and I'm not certain that they will hold true (with error < 1) for any arbitrarily large n.  Does the math always work out correctly that way?

The oeis page provides some generating functions for the sequence, but I just can't grasp how to make use of these:
G.f. for k-th column: x^k/(Product_{j=1..k} (1-x^j))
G.f.: A(x, y) = Product_{n>=1} 1/(1-x^n)^(P_n(y)/n), where P_n(y) = Sum_{d|n} eulerphi(n/d)*y^d
G.f.: G(t,x) = -1 + 1/Product_{j>=1} (1-t*x^j)
G.f.: -1 + e^(F(x,z)), where F(x,z) = Sum_{n >= 1} (x*z)^n/(n*(1 - z^n)) is a g.f. for A126988

Is there a way I can use PARI to help generate the formulas for specific values of k as above, by using these G.f.'s?

I've also been working on attacking the "other side" of this problem, using shortcuts for specific cases where ak+b>=n

Given the functions:
p(n) = numbpart(n) // total partitions of n
and
A000070(n)= sum(k=1,n,p(k))

And the equivalences:
p(n) == sum(k=1,n,T(n,k))
T(n,k) == sum(j=1,k,T(n-k,j))

It follows that:
T(n,k) == p(n-k) - sum(j=k+1,n, T(n-k,j))

I call the sum which is subtracted from p(n-k) the "anti-children" of T(n,k)
This can be applied to multiple times to get a formula which doesn't require recursively evaluating T(n,k).
Each successive step for ak+b>=n is defined such that the anti-children are guaranteed to satisfy the previous inequality:

k > n:       T(n,k) = 0
2k >= n:   T(n,k) = p(n-k)
3k+2>=n: T(n,k) = p(n-k) - A000070(n-2*k-1)
4k+5>=n: T(n,k) = p(n-k) - A000070(n-2*k-1)+Sum_{j=k+1..floor((n-k-1)/2)} A000070(n-k-2*j-1))
4k+5< n:  T(n,k) = p(n-k) - A000070(n-k-floor((n-k)/3)) + (Sum_{j=floor((n-k)/3)..floor((n-k-1)/2)} A000070(n-k-2*j-1) ) - Sum_{j=k+1..floor((n-k)/3)-1} T(n-k,j)

These cases were already on the oeis page up to 3k+2>=n, then I spent quite some time working out the formula for 4k+5>=n by hand, (and verifying against a printed table, and fixing mistakes).  I might try to continue this process (next would be 5k+9>=n, 6k+14>=n, 7k+20>=n, etc.) but it gets increasingly difficult for me.

Alternatively, if anyone understands how Hardy-Ramanujan-Rademacher equation works for p(n), could please tell me if its possible to apply this somehow to the restricted "n into k parts" problem, that would be amazing.  The FLINT library uses this HRR for partitions of n, and is apparently incredibly efficient, but doesn't provide a function for "n into k" parts 

Or if you have thoughts about any other ways to approach this problem, I would be interested to hear.

Thank you,
Hans Loeblich





Virus-free. www.avast.com