def permutations(a, k):
# inefficient version that uses permutations code from class, unmodified,
# and then, assuming len(a)>=k, uses only first k elements of each permutation,
# ignoring the rest (which is not ok). This also produces some duplicates,
# so either have to remove all the duplicates at the end (which is also not ok),
# or add an expensive O(n) check inside our loop to check if a permutation
# is already in our list (which is also not ok).
result = [ ]
allPerms = allPermutations(a)
for fullLengthPerm in allPerms:
perm = fullLengthPerm[0:k] # not ok: essentially "removes" (n-k) elements
if perm not in result: # also not ok
result += [perm]
print "len(allPerms) = ", len(allPerms)
print "len(result) = ", len(result)
return result
def allPermutations(a):
# renamed permutations code from class notes
# returns a list of all permutations of the list a
if (len(a) == 0):
return [[]]
else:
allPerms = [ ]
for subPermutation in allPermutations(a[1:]):
for i in xrange(len(subPermutation)+1):
allPerms += [subPermutation[:i] + [a[0]] + subPermutation[i:]]
return allPerms
Why is this
unacceptable? Well, say we are finding permutations(range(9), 2).
The answer contains 72 permutations of two numbers each. However, the code
above will generate all 362,880 permutations of length 9 to find these 72
permutations. This huge amount of extra work makes it intractable to use
this code to compute permutations(range(30),2), let alone
permutations(range(50),3). Try it, if you want to wait a very, very long
time!
Big Hint: one way to solve this is to modify the
permutations code from class to take a second parameter, k, and to return all
the permutations of a that are no larger than length k. This still
requires a non-recursive wrapper function (like above) that drops all the
permutations that are not of length k, but since we never produce any
permutations larger than length k, this is much, much faster than the
code above.
To help test your code, here are some assertions for use with sampleFiles:
assert(countFiles("sampleFiles/folderB/folderF/folderG") == 0)
assert(countFiles("sampleFiles/folderB/folderF") == 0)
assert(countFiles("sampleFiles/folderB") == 2)
assert(countFiles("sampleFiles/folderA/folderC") == 4)
assert(countFiles("sampleFiles/folderA") == 6)
assert(countFiles("sampleFiles") == 10)
Note: regarding efficiency, it is
overtly inefficient to create a list of any kind in your solution to this
problem. In particular, you may not do this:
def countFiles(path):
return len(listFiles(path))
To help test
your code, here are some assertions for use with sampleFiles:
assert(findLargestFile("sampleFiles/folderA") ==
"sampleFiles/folderA/folderC/giftwrap.txt")
assert(findLargestFile("sampleFiles/folderB") ==
"sampleFiles/folderB/folderH/driving.txt")
assert(findLargestFile("sampleFiles/folderB/folderF") == "")
Note: regarding
efficiency, it is overtly inefficient to call os.path.getsize(path) more than
once for any given file. One way to manage this is to use a helper
function that not only returns the largest file for a given path, but returns it
in a tuple along with other information that may be of use (hint, hint).
Then findLargestFile is a non-recursive wrapper function that calls that helper
function.
carpe diem - carpe diem - carpe diem - carpe diem - carpe diem - carpe diem - carpe diem - carpe diem - carpe diem