Karim Belabas on Fri, 21 Nov 2014 01:04:39 +0100


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: Growing ordered set


* Bill Allombert [2014-11-20 23:29]:
> On Thu, Nov 20, 2014 at 01:41:36PM -0500, Charles Greathouse wrote:
> > Fairly often I have a problem of the type 'generate a number by a given
> > process, and repeat until a duplicate is found'. I also have the related
> > problem 'generate a number by a given process, and repeat until some limit,
> > recording the number of distinct elements at each step'.
> > 
> > What's a good way to go about this in GP?
> > 
> > I can think of several ways, none satisfactory:
> > * concat a vector, search linearly. O(n) to add, O(n) to search, O(n)
> > overall.
> > * listput a list, search linearly. O(1) to add, O(n) to search, O(n)
> > overall.
> > * listput then listsort, setsearch. O(n log n) to add, O(log n) to search,
> > O(n log n) overall.
> > * setsearch(,,1) and listinsert. O(n) to add, O(log n) to search, O(n)
> > overall.
> 
> I am not quite sure what complexity mode you choose for your O(), but
> the last one is not that bad, compared to the cost induced by the GP memory
> manager...
> 
> Anyway, yes you can do better, using binary search tree!
> See the attached (rather crude) script.
> 
> binary search tree:
> 
> ? setrand(1);test(10^11)[1]
> %1 = 537777
> ? ##
>   ***   last result computed in 9,325 ms.
> 
> listinsert:
> 
> ? setrand(1);test2(10^11)[1]
> %2 = 537777
> ? ##
>   ***   last result computed in 43,584 ms.

That's beautiful, indeed, but...

  (00:48) gp > install(test3,lG)
  (00:48) gp > setrand(1); test3(10^11)
  time = 220 ms.
  %2 = 537777

  long
  test3(GEN N)
  {
    pari_sp av = avma;
    hashtable *h = hash_create(600000,
      (ulong (*)(void*))hash_GEN,
      (int (*)(void*,void*))gidentical, 1);
    long c = 0;

    while(1)
    {
      void *k = (void*)randomi(N);
      c++;
      if (hash_search(h, k)) { avma = av; return c; }
      hash_insert(h, k, NULL);
    }
  }

(N.B. my machine is a little slower than yours, I need about 11s for test() and
47s for test2.)

Cheers,

    K.B.
--
Karim Belabas, IMB (UMR 5251)  Tel: (+33) (0)5 40 00 26 17
Universite de Bordeaux         Fax: (+33) (0)5 40 00 69 50
351, cours de la Liberation    http://www.math.u-bordeaux1.fr/~kbelabas/
F-33405 Talence (France)       http://pari.math.u-bordeaux1.fr/  [PARI/GP]
`