Sierpinski Sieving

Background

Back in the mists of history, Paul Jobling wrote NewPGen. He added a "fixed k" sieve feature. Paul and I had fun optimising the inner loop, but in retrospect it was still not marvelous (better than anything anyone else had, though).

Recently Jack Brennen demonstrated that a "big steps baby steps" discrete logarithm would be more efficient than the drag-racer loop.

At the same time Payam Samidoost came up with some modular relationships that would prove, with very small overhead, that a prime would never remove any candidates, and didn't even need to enter the discrete log stage.

Within no time at all Paul implemented both of the above features. In order to feel I'd contributed something, I told Paul how to tweak the big steps baby steps for improved performance (big steps are expensive, baby steps are cheap, so don't use a square). And it truly was marvelous! One of the best team efforts for pooling ideas and getting an implementation out of the door in the history of prime numbers. Everyone deserves a big pat on the back.

And then I began sieving with it...

Intentions

After running round in ever tighter loops trying to find low-hanging prime records to beat, I finally got primes fatigue and decided that I'd retire from prime-hunting but still try to keep myself actively involved in some way.

At the turn of 2002 I began sieving for Yves Gallot's Generalised Fermat Number search, which has been most relaxing, and has swallowed up all of my fast machines. However, as my prime-finding tasks ran to an end on the machines I was using at work, I realised that if I could find the right kind of task (start it once, and leave it, with no mothering at all), then up to 15 slow machines (266MHz-450MHz) could be put to good use with very little effort. I was also looking for something that would take only a month, as I was approaching the end of a fixed-term contract (which has now finished).

It was just at this time that Payam started plugging the Sierpinski search, in particular his own range, 4847, which he's currently sub-letting. I was familiar with the project, and had a quick browse round the site, and noticed that there were 17 ranges still being searched, by 13 people (including Payam and "Seventeen or Bust", who are further subdividing the work between volunteers).

It was then obvious what to do with the 15 work machines.

Apologies

The plan was to ensure that each of the 13 hunters could have at least one range well sieved (my target was sieving to 60G in the time that I had). In theory, two lucky people would have both their ranges. However, it doesn't always work out perfectly, and it appears that some people where I worked actually wanted to use the machines I was sieving on! (I.e. there were continual loggings off and on, and the sieving had to be restarted by hand.) This effectively killed 2 of the machines, but worst of all they were both ranges that I had hoped to prioritise, being searcher's only ranges. So I must apologise to Ray Ballinger and Olivier Haeberlé, whose ranges never got anywhere. Sorry guys. I no longer have any access to these machines, which means I'm currently unable to perform the sieving on the missing ranges.

Full Credit

It's by no means just me who's done the sieving. My thanks to Michele and Manoj for sieving on their machines too, and more recently Joseph Arthur, who became interested in helping through Payam Samidoost's influence, and has put in some huge sieving effort.

No-one who's been involed as asked for any credit for the work, but in the style of Professor Caldwell's top 5000 primes database, the full breakdown of credit should probably be as follows:

Coding - Paul
Extra Algorithmics - Jack, Payam
Tweakage - Paul, Phil
Sieving - Phil (+Michele+Manoj), Joseph
It might make sense to create a "team" code for the above.

The Ranges

Each of the files contains a filter file suitable as input to Proth/PRP/PFGW, and also a list of all the non-trivial factors (>1m) found. They were all generated using NewPGen, and can be further sieved by NewPGen. They were generated with verification turned on, so in theory there should be no problems. I've doubly verified a few of the factors by hand, and of course they were all correct. Also included in more recently updated ranges are lists of factors sorted by the value they remove (file extension .byn (sorted "by n")).

k (file) Pmax sieved Status Current searcher
4847 96G Stopped, June Payam Samidoost
5359 179G* Stopped, September Dave Linton
10223 392G* Stopped, September Marc Thibeault
19249 208G* Stopped, September Tom Kuechler
21181 74G Stopped, June Daval Davis
22699 194G* Stopped, September Daval Davis
24737 - not sieved Payam Samidoost
27653 258G* Stopped, September S B
28433 61G Stopped, June Dave Linton
33661 - not sieved Marc Thibeault
44131 68G Stopped, June Didier Boivin
46157 158G* Stopped, September Ian Lowman
54767 - got nowhere Ray Ballinger
55459 75G# Stopped, June Janusz Szmidt
65567 - got nowhere Olivier Haeberlé
67607 178G* Stopped, September reserved for S B
69109 193G* Stopped, September Andy Penrose

Notes on Depth

Ranges marked with an asterisk have had some more sieving done to them by Joseph Arthur, who was introduced to the Sierpinski project by Payam Samidoost. My thanks to them both. Joseph in total has removed several hundred candidates from the 8 ranges. The range marked with a hash (#) has been sieved (entirely) by Michele Gualco - many thanks to him.

How far to sieve depends on how far you expect to be searching. The Sierpinski project has been predicted to be hard, and most k values will need to be searched up to n in the millions, or even milliards, or beyond. I chose to sieve a range up to n=3000000 because I reckoned that it would take quite a long time (years) for most ranges to exceed that value. The cost of sieving is the square root of the n range, so I didn't want n too high, or too low.

It has been noted that assuming all things are equal, one can expect to remove 3% more candidates by sieving twice as far. I.e. there's a quite harsh law of diminishing returns in sieving. If you think you'll be testing all the way to 3M, or beyond, then it's probably worth sieving further. If however, you think that you'll 'only' do a range of another 500000, then it might not make as much sense. Only you can decide.

Phil Carmody
Home page: http://fatphil.org/