Partition Unique

Jim Randell has developed a neat generic subroutine called filter_unique that can assist teasers of the form “if you knew X, then you could work out what Y is”. He has made this and other useful routines for solving teasers available in his enigma.py library, which is also available here. I call my version of Jim’s subroutine partition_unique because ‘filter’ implies for me that the output returns only a part of the input whereas ‘partition’ implies that the output divides the input into parts.

Update

My version of partition unique was recently updated after Frits took an interest in it and suggested that its performance could be significantly improved in situations where one of the functions involved is costly.

Here is the new version:


from collections import defaultdict
from inspect import signature

# Partition the input sequence (seq) into two sub-sequences
# such that the values of f(s) for items in the sequence do
# and do not imply unique values for g(s) respectively
# (credit: Jim Randell and Frits)

def partition_unique(seq, f=lambda x: x, g=lambda x: x, g_costly=False):

  unpack = lambda h: (lambda s: h(*s))
  if len(signature(f).parameters) &gt; 1:
    f = unpack(f)
  if len(signature(g).parameters) &gt; 1:
    g = unpack(g)

  d = defaultdict(list)
  for s in seq:
    d[f(s)].append(s)

  un, nu = [], []
  if not g_costly:
    
    for k, vs in d.items():
      gs = {g(v) for v in vs}
      (nu, un)[len(gs) == 1].extend(vs)    
  
  else:

    for k, vs in d.items():
      seen = []
      for v in vs:
        gv = g(v)
        if seen:
          if gv != seen[0]:
            nu.extend(vs)
            break
        else:    
          seen = [gv]
      else:
        un.extend(vs)
  
  return un, nu

from collections import defaultdict

from inspect import signature

# Partition the input sequence (seq) into two sub-sequences

# such that the values of f(s) for items in the sequence do

# and do not imply unique values for g(s) respectively

# (credit: Jim Randell and Frits)

def partition_unique(seq, f=lambda x: x, g=lambda x: x, g_costly=False):

unpack = lambda h: (lambda s: h(*s))

if len(signature(f).parameters) > 1:

f = unpack(f)

if len(signature(g).parameters) > 1:

g = unpack(g)

d = defaultdict(list)

for s in seq:

d[f(s)].append(s)

un, nu = [], []

if not g_costly:

for k, vs in d.items():

gs = {g(v) for v in vs}

(nu, un)[len(gs) == 1].extend(vs)

else:

for k, vs in d.items():

seen = []

for v in vs:

gv = g(v)

if seen:

if gv != seen[0]:

nu.extend(vs)

break

else:

seen = [gv]

else:

un.extend(vs)

return un, nu

It is available here (with a further change since the ‘g_costly’ parameter is unnecessary). Here is a test program that illustrates the reason for the update:


from number_theory import is_prime, factor
from partition_unique import partition_unique

from random import randrange
from timer import Timer

data = list(randrange(10 ** 6) for _ in range(100_000))

with Timer():
  un, nu = partition_unique(data, lambda n: is_prime(n), 
                            lambda n: len(factor(n)) == 1)
assert all(is_prime(n) for n in un) and not any(is_prime(n) for n in nu)

with Timer():
  un, nu = partition_unique(data, lambda n: is_prime(n), 
                            lambda n: len(factor(n)) == 1, g_costly=True)
assert all(is_prime(n) for n in un) and not any(is_prime(n) for n in nu)

from number_theory import is_prime, factor

from partition_unique import partition_unique

from random import randrange

from timer import Timer

data = list(randrange(10 ** 6) for _ in range(100_000))

with Timer():

un, nu = partition_unique(data, lambda n: is_prime(n),

lambda n: len(factor(n)) == 1)

assert all(is_prime(n) for n in un) and not any(is_prime(n) for n in nu)

with Timer():

un, nu = partition_unique(data, lambda n: is_prime(n),

lambda n: len(factor(n)) == 1, g_costly=True)

assert all(is_prime(n) for n in un) and not any(is_prime(n) for n in nu)

There are two functions in my number theory library that can test for primes: is_prime(), which is fast, and factor(), which is slow. The test program above uses 100,000 random numbers to show that these two ways of testing for primes give the same results. It does this twice, one using the usual approach and one using the feature of the new version to deal with costly second functions, in our case the prime test “len(factor(n)) == 1”.

The previous version runs this test in 1.822 seconds, compared with 390 milliseconds for the new version.

An Example of Use

Here is a teaser that illustrates the use of partition_unique:

—————————————————————————
Sunday Times Teaser 2857 by Danny Roth

Published June 25 2017

Significant Errors

Last year George and Martha were going to visit one of their daughters. Her
house number was a two-figure number but unfortunately Martha wrote the
number incorrectly by making the first digit less than it should be. When
George discovered the error he commented that the incorrect number was a
whole number percentage of the correct one. If you knew that percentage
then you should be able to work out their daughter’s house number.

Then the daughter moved to a different two-figure house number, Martha
again wrote the number incorrectly by making the first digit less than it
should be, and again the incorrect number was a whole number percentage of
the correct one. If you knew that percentage then you should be able to
work out their daughter’s new house number.

What was the daughter’s original house number, and what is the new one?
—————————————————————————

which can be solved as follows:


from partition_unique import partition_unique

nps = []
# consider the daughter's posible house numbers
for m in range(20, 100):
  # ... and the tens digit of the incorrect one
  for i in range(1, m // 10):
    n = 10 * i + m % 10
    # look for integer percentages 
    p, r = divmod(100 * n, m)
    if not r:
      nps.append((p, m, n))

# given an item (percentage, house_number, incorrect_number)
# return the percentage
f = lambda p, m, n: p

# given an item (percentage, house_number, incorrect_number)
# return the house number
g = lambda p, m, n: m

# find a percentage for which the associated house number is unique 
n1 = partition_unique(nps, f, g)[0]

# remove the first house number and repeat to find the second house number
nps = [x for x in nps if x[1] != n1[0][1]]
n2 = partition_unique(nps, f, g)[0]

for i in (0, 1):
  print(f'first: {n1[i][1]} ({n1[i][2]}), second: {n2[0][1]} ({n2[0][2]})')

from partition_unique import partition_unique

nps = []

# consider the daughter's posible house numbers

for m in range(20, 100):

# ... and the tens digit of the incorrect one

for i in range(1, m // 10):

n = 10 * i + m % 10

# look for integer percentages

p, r = divmod(100 * n, m)

if not r:

nps.append((p, m, n))

# given an item (percentage, house_number, incorrect_number)

# return the percentage

f = lambda p, m, n: p

# given an item (percentage, house_number, incorrect_number)

# return the house number

g = lambda p, m, n: m

# find a percentage for which the associated house number is unique

n1 = partition_unique(nps, f, g)[0]

# remove the first house number and repeat to find the second house number

nps = [x for x in nps if x[1] != n1[0][1]]

n2 = partition_unique(nps, f, g)[0]

for i in (0, 1):

print(f'first: {n1[i][1]} ({n1[i][2]}), second: {n2[0][1]} ({n2[0][2]})')

Partition Unique

Update

An Example of Use

Recent Posts

Recent Comments

Archives

Categories

Meta