Partition Unique
Jim Randell has developed a neat generic subroutine called filter_unique that can assist teasers of the form “if you knew X, then you could work out what Y is”. He has made this and other useful routines for solving teasers available in his enigma.py library, which is also available here. I call my version of Jim’s subroutine partition_unique because ‘filter’ implies for me that the output returns only a part of the input whereas ‘partition’ implies that the output divides the input into parts.
Update
My version of partition unique was recently updated after Frits took an interest in it and suggested that its performance could be significantly improved in situations where one of the functions involved is costly.
Here is the new version:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
from collections import defaultdict from inspect import signature # Partition the input sequence (seq) into two sub-sequences # such that the values of f(s) for items in the sequence do # and do not imply unique values for g(s) respectively # (credit: Jim Randell and Frits) def partition_unique(seq, f=lambda x: x, g=lambda x: x, g_costly=False): unpack = lambda h: (lambda s: h(*s)) if len(signature(f).parameters) > 1: f = unpack(f) if len(signature(g).parameters) > 1: g = unpack(g) d = defaultdict(list) for s in seq: d[f(s)].append(s) un, nu = [], [] if not g_costly: for k, vs in d.items(): gs = {g(v) for v in vs} (nu, un)[len(gs) == 1].extend(vs) else: for k, vs in d.items(): seen = [] for v in vs: gv = g(v) if seen: if gv != seen[0]: nu.extend(vs) break else: seen = [gv] else: un.extend(vs) return un, nu |
It is available here (with a further change since the ‘g_costly’ parameter is unnecessary). Here is a test program that illustrates the reason for the update:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
from number_theory import is_prime, factor from partition_unique import partition_unique from random import randrange from timer import Timer data = list(randrange(10 ** 6) for _ in range(100_000)) with Timer(): un, nu = partition_unique(data, lambda n: is_prime(n), lambda n: len(factor(n)) == 1) assert all(is_prime(n) for n in un) and not any(is_prime(n) for n in nu) with Timer(): un, nu = partition_unique(data, lambda n: is_prime(n), lambda n: len(factor(n)) == 1, g_costly=True) assert all(is_prime(n) for n in un) and not any(is_prime(n) for n in nu) |
There are two functions in my number theory library that can test for primes: is_prime(), which is fast, and factor(), which is slow. The test program above uses 100,000 random numbers to show that these two ways of testing for primes give the same results. It does this twice, one using the usual approach and one using the feature of the new version to deal with costly second functions, in our case the prime test “len(factor(n)) == 1”.
The previous version runs this test in 1.822 seconds, compared with 390 milliseconds for the new version.
An Example of Use
Here is a teaser that illustrates the use of partition_unique:
—————————————————————————
Sunday Times Teaser 2857 by Danny Roth
Published June 25 2017
Significant Errors
Last year George and Martha were going to visit one of their daughters. Her
house number was a two-figure number but unfortunately Martha wrote the
number incorrectly by making the first digit less than it should be. When
George discovered the error he commented that the incorrect number was a
whole number percentage of the correct one. If you knew that percentage
then you should be able to work out their daughter’s house number.
Then the daughter moved to a different two-figure house number, Martha
again wrote the number incorrectly by making the first digit less than it
should be, and again the incorrect number was a whole number percentage of
the correct one. If you knew that percentage then you should be able to
work out their daughter’s new house number.
What was the daughter’s original house number, and what is the new one?
—————————————————————————
which can be solved as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
from partition_unique import partition_unique nps = [] # consider the daughter's posible house numbers for m in range(20, 100): # ... and the tens digit of the incorrect one for i in range(1, m // 10): n = 10 * i + m % 10 # look for integer percentages p, r = divmod(100 * n, m) if not r: nps.append((p, m, n)) # given an item (percentage, house_number, incorrect_number) # return the percentage f = lambda p, m, n: p # given an item (percentage, house_number, incorrect_number) # return the house number g = lambda p, m, n: m # find a percentage for which the associated house number is unique n1 = partition_unique(nps, f, g)[0] # remove the first house number and repeat to find the second house number nps = [x for x in nps if x[1] != n1[0][1]] n2 = partition_unique(nps, f, g)[0] for i in (0, 1): print(f'first: {n1[i][1]} ({n1[i][2]}), second: {n2[0][1]} ({n2[0][2]})') |