Finding Primes

One of the most efficient ways to find smaller primes is to use a sieve. There are several different sieves but one of the easiest to code is the sieve of Eratosthenes.

Here is a simple example:


low_primes = [2,3,5,7]
low_primes += [x for x in range(9,100,2) if all(x % p for p in low_primes)]

low_primes = [2,3,5,7]

low_primes += [x for x in range(9,100,2) if all(x % p for p in low_primes)]

This starts with primes less than 10 and then finds primes between 10 and 100 by finding odd numbers that are not multiples of 2, 3, 5 or 7.

In order to test whether a number is prime, we only need to test whether it is a multiple of any prime up to the square root of the number since, if it has any integer factors, at least one of them must be at or below this limit. We could hence extend the above approach by using each new list to find primes in a larger interval – primes up to 10 allow the list to be extended to 100, primes up to 100 provide for extension up to 10,000 and so on.

But we really need a more general solution, which the following subroutine provides:


def prime_list(n):
  sieve = [False, False] + [True] * (n - 1)
  for i in range(2, int(n ** 0.5) + 1):
    if sieve[i]:
      m = n // i - i
      sieve[i * i : n + 1 : i] = [False] * (m + 1)
  return [i for i in range(n + 1) if sieve[i]]

def prime_list(n):

sieve = [False, False] + [True] * (n - 1)

for i in range(2, int(n ** 0.5) + 1):

if sieve[i]:

m = n // i - i

sieve[i * i : n + 1 : i] = [False] * (m + 1)

return [i for i in range(n + 1) if sieve[i]]

This is quite fast for an interpreted language such as Python but it uses a lot of memory when large primes are needed.An alternative that consumes less memory uses a Python generator:


def prime_gen(maxp = -1):
  d = {}
  q = 2
  while maxp < 0 or q < maxp:
    if q not in d:
      yield q
      d[q * q] = [q]
    else:
      for p in d[q]:
        d.setdefault(p + q, []).append(p)
      del d[q]
    q += 1

def prime_gen(maxp = -1):

d = {}

q = 2

while maxp < 0 or q < maxp:

if q not in d:

yield q

d[q * q] = [q]

else:

for p in d[q]:

d.setdefault(p + q, []).append(p)

del d[q]

q += 1

We can time these two approaches using the following example which counts the number of primes below 1,000,000:


from time import clock
class Timer() :
  def __enter__(self): 
    self.start = clock()
  def __exit__(self, *args): 
    print("Time", clock() - self.start)

with Timer() :
  x = prime_list(1000000)
print(len(x), x[-1])

with Timer() :
  x = [p for p in prime_gen(1000000)]
print(len(x), x[-1])

from time import clock

class Timer() :

def __enter__(self):

self.start = clock()

def __exit__(self, *args):

print("Time", clock() - self.start)

with Timer() :

x = prime_list(1000000)

print(len(x), x[-1])

with Timer() :

x = [p for p in prime_gen(1000000)]

print(len(x), x[-1])

This gives the output:


78498 999983
Time 0.637893911573
78498 999983
Time 7.35350640485

78498 999983

Time 0.637893911573

78498 999983

Time 7.35350640485

which shows that the prime list function is over ten times faster than the generator solution.

Nevertheless, the generator uses less memory and offers a lot of flexibility as can be seen in its use with a filter function such as:


def select(f, s):
  for x in s:
    if f(list(map(int, str(x)))):
      yield x

def select(f, s):

for x in s:

if f(list(map(int, str(x)))):

yield x

Here a list of digits in the input number is created and the function f is applied to this list. We can now define a property we want for the digits of a prime using the function f (which can be a lambda function, an ordinary function written in a compact form).

For example, suppose we want to list the primes below 1000 with repeated digits. This does the job in a single line of code:


print([p for p in select(lambda x : len(x) != len(set(x)), prime_gen(1000))])

[11, 101, 113, 131, 151, 181, 191, 199, 211, 223, 227, 229,
233, 277, 311, 313, 331, 337, 353, 373, 383, 433, 443, 449,
499, 557, 577, 599, 661, 677, 727, 733, 757, 773, 787, 797,
811, 877, 881, 883, 887, 911, 919, 929, 977, 991, 997]

print([p for p in select(lambda x : len(x) != len(set(x)), prime_gen(1000))])

[11, 101, 113, 131, 151, 181, 191, 199, 211, 223, 227, 229,

233, 277, 311, 313, 331, 337, 353, 373, 383, 433, 443, 449,

499, 557, 577, 599, 661, 677, 727, 733, 757, 773, 787, 797,

811, 877, 881, 883, 887, 911, 919, 929, 977, 991, 997]

In a similar way we can easily find primes whose digits sum to 32:


print([p for p in select(lambda x : sum(x) == 32, prime_gen(10000))])

[6899, 8699, 8969, 9689, 9887]

print([p for p in select(lambda x : sum(x) == 32, prime_gen(10000))])

[6899, 8699, 8969, 9689, 9887]

So far we have considered only primes where it is feasible to hold lists of primes in memory. But when numbers are very large, with hundreds or thousands of digits, it becomes impractical to find primes in this way. Although there are ways of determining whether a very large number is prime, in practice they are very slow. Fortunately, however, there are tests that can determine whether a given number is almost certain to be prime. Such tests are often known as probable prime tests but they would be more accurately called composite tests because they can determine with certainty that a number is composite (i.e. it is not a prime).

One such test is known as the Miller-Rabin test, which is as follows in Python:


from random import randrange
def miller_rabin(n, r = 10):
  if n < 10: 
    return n in (2, 3, 5, 7) 
  t = n - 1
  s = 0 
  while not t % 2:
    t >>= 1
    s += 1
  for i in range(r):
    a = randrange(2, n - 1)
    x = pow(a, t, n)
    if x != 1 and x != n - 1:
      for j in range(s - 1):
        x = (x * x) % n
      if x == 1:
        return False
      if x == n - 1:
        break
      else:
        return False
  return True

from random import randrange

def miller_rabin(n, r = 10):

if n < 10:

return n in (2, 3, 5, 7)

t = n - 1

s = 0

while not t % 2:

t >>= 1

s += 1

for i in range(r):

a = randrange(2, n - 1)

x = pow(a, t, n)

if x != 1 and x != n - 1:

for j in range(s - 1):

x = (x * x) % n

if x == 1:

return False

if x == n - 1:

break

else:

return False

return True

The first parameter is the number to test, the second parameter being the number of times the test is to be repeated (with a default of 10). Each time the test is run, the value returned will be False if the number is composite and True if the number is possibly prime.

For the Miller-Rabin test it is known that the probability of a test falsely indicating a prime is not higher than 1/4. Moreover, tests are independent of each other when run with different internal random numbers, which means that running N tests reduces the chance of finding a false prime to 1 in 4^N. So, if 10 tests all indicate that a value is prime, the probability that it is actually composite will be lower than 1 in 1,000,000. So, for example, if we want to find the largest primes below the powers of 10 between 10 and 30, we can use:


for n in range(10, 31):
  t = 10 ** n - 1
  while not miller_rabin(t):
    t -= 2
  print(n, t)

10 9999999967
11 99999999977
12 999999999989
13 9999999999971
14 99999999999973
15 999999999999989
16 9999999999999937
17 99999999999999997
18 999999999999999989
19 9999999999999999961
20 99999999999999999989
21 999999999999999999899
22 9999999999999999999973
23 99999999999999999999977
24 999999999999999999999743
25 9999999999999999999999877
26 99999999999999999999999859
27 999999999999999999999999901
28 9999999999999999999999999791
29 99999999999999999999999999973
30 999999999999999999999999999989

for n in range(10, 31):

t = 10 ** n - 1

while not miller_rabin(t):

t -= 2

print(n, t)

10 9999999967

11 99999999977

12 999999999989

13 9999999999971

14 99999999999973

15 999999999999989

16 9999999999999937

17 99999999999999997

18 999999999999999989

19 9999999999999999961

20 99999999999999999989

21 999999999999999999899

22 9999999999999999999973

23 99999999999999999999977

24 999999999999999999999743

25 9999999999999999999999877

26 99999999999999999999999859

27 999999999999999999999999901

28 9999999999999999999999999791

29 99999999999999999999999999973

30 999999999999999999999999999989

In practice this test is almost always far better than its theoretical probability suggests and running it 20 times will make the probability of falsely reporting a composite as a prime far lower than the probability that a random machine memory error will cause such a false report.

Finding Primes

Recent Posts

Recent Comments

Archives

Categories

Meta