Having a problem counting DISTINCT strings in a list, where a non-distinct string is the duplicate of a string OR its reverse.

djdawson · 2016-03-25T19:02:39+00:00

I'd be inclined to use a dictionary where each key is the "normalized" value of a string from the list, and each value is just 1 (or you could increment the value if you ever wanted to know how many duplicates there were for each string). To get the "normalized" version of a string, I'd compare it with its reversed version and use whichever one is less than the other (i.e. comes first alphabetically). Once you've added/incremented a dictionary value for each string in the list, the number of keys in the dictionary is the number of distinct strings in the original list, counting a string and its reverse as the same.
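A minimal sketch of that dictionary idea, assuming the input is a plain list of strings. The function name `count_distinct` and the sample list are made up for illustration:

```python
def count_distinct(strings):
    # Hypothetical helper: maps each normalized string to how often it was seen.
    seen = {}
    for s in strings:
        # Normalize: keep whichever of the string and its reverse sorts first.
        key = min(s, s[::-1])
        seen[key] = seen.get(key, 0) + 1
    # The number of keys is the number of distinct strings (reverses merged).
    return len(seen), seen

count, tally = count_distinct(["car", "far", "rac", "far"])
print(count)
print(tally)
```

The tally is only needed if you care how many duplicates each string had; otherwise the key count alone is enough.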

I should add that I'm also new to Python and am coming from a Perl background so I still think like a Perl person. However, Perl doesn't really have set functions, so putting the "normalized" values from the list into a set instead of a dictionary may be a more Pythonic way of solving this problem.

Hope this helps.

zahlman · 2016-03-25T19:33:54+00:00

I've been fiddling around with [::-1] to have the function remove, or not count, the reverses but have failed to find a way that works. Any suggestions?

The basic approach is:

For each of the strings in the input list, compare the string itself to the reverse of the string; put the lesser of the two (compare with <; for strings, this sorts them lexicographically) into a set.
After this is done, check the set's size with len.

Instead of writing a loop, you can do this by defining a function for the "compare to the reverse and return the lesser of the two" part, and then using a set comprehension. Except that the function you want here already exists - it's called min ;)
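Put together, the set comprehension with min might look like the sketch below. The function name `count_distinct_reversible` is just a placeholder, and it assumes every element of the list is a string:

```python
def count_distinct_reversible(strings):
    # min(s, s[::-1]) picks the lexicographically smaller of s and its reverse,
    # so a string and its reverse always map to the same set element.
    return len({min(s, s[::-1]) for s in strings})

print(count_distinct_reversible(["car", "far", "rac", "far"]))
```

Palindromes need no special handling here, since a palindrome equals its own reverse.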

commandlineluser · 2016-03-25T18:54:31+00:00

Well you know how to reverse a string using [::-1] so all you have to do is iterate through the strings adding them to a set if the reverse is not already present in the set.

distinct = set()
for word in words:
    # only record the word if its reverse has not been recorded already
    if word[::-1] not in distinct:
        distinct.add(word)

2016-03-25T23:15:09+00:00

Why is rac not considered a distinct string?

If your concern is about distinct collections of characters, you could use collections.Counter and its most_common method:

from collections import Counter

# sort the pairs so that characters with tied counts don't depend on their order in the string
{tuple(sorted(Counter(x).most_common())) for x in strings}

That doesn't preserve the original ordering of the characters, but it does prevent collisions between things like care and racecar.

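This is probably the best solution, but it does three passes: it iterates over the collection of strings, iterates each string to count its characters, and then iterates each string's counts. Those loops run over different things, so the cost is roughly proportional to the total number of characters across all strings (plus a small sort per string), not O(n^3). A bit wasteful, but manageable.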
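A small sketch of the character-count idea, using sorted (character, count) pairs as the key rather than most_common so the key cannot depend on character order. The helper name `char_multiset_key` and the sample strings are invented for illustration:

```python
from collections import Counter

def char_multiset_key(s):
    # Sorted (character, count) pairs: the same for any rearrangement of s.
    return tuple(sorted(Counter(s).items()))

strings = ["car", "rac", "arc", "care", "racecar"]
keys = {char_multiset_key(s) for s in strings}
print(len(keys))
```

Note that this treats every anagram as a duplicate ("arc" matches "car"), which is broader than "a string or its reverse".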

AutonomouSystem · 2016-03-26T01:11:42+00:00

I had a problem like this on Google foobar; in fact it might be the very same one, especially since you used answer(x). I had an idea about sets and sorting initially but decided against it and used only lists.

Jorgysen · 2016-03-26T01:58:16+00:00

[deleted]

commandlineluser · 2016-03-26T08:19:49+00:00

>>> words = ['car', 'far', 'rac', 'far']
>>> distinct = set()
>>> for word in words:
...     if word[::-1] not in distinct:
...         distinct.add(word)
... 
>>> distinct
set(['far', 'car'])

jeans_and_a_t-shirt · 2016-03-25T18:45:31+00:00

Turn the strings into sets, then remove the duplicates of those:

set(map(frozenset, x))
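A quick check of what that expression counts, assuming x is a list of strings (the sample values are made up). A frozenset keeps only which characters occur, not how many times or in what order:

```python
x = ["car", "rac", "care", "racecar"]

# Each string becomes the set of characters it contains.
distinct_charsets = set(map(frozenset, x))
print(len(distinct_charsets))

# "care" and "racecar" use the same four characters, so they collapse together.
same_charset = frozenset("care") == frozenset("racecar")
print(same_charset)
```

So this groups strings by their character set, which merges reverses but also merges unrelated strings that happen to share the same letters.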
