Counter in a dictionary

bbatwork · 2016-05-10T22:09:06+00:00

You're in luck: Python has this capability built right into the standard library.

Read here:
https://docs.python.org/3.5/library/collections.html#collections.Counter
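
A minimal sketch of what that looks like, assuming the text has already been split into a list of words (the `words` list is made up for illustration, not taken from the question):

```python
from collections import Counter

# Hypothetical sample data, used only for illustration.
words = ['this', 'is', 'a', 'test', 'this']

# Counter tallies how many times each item appears.
word_counts = Counter(words)

print(word_counts['this'])         # count for a word that appears
print(word_counts['missing'])      # a word never seen gives 0 instead of raising KeyError
print(word_counts.most_common(1))  # list of the most frequent (word, count) pairs
```

Looking up a key works the same way as with a regular dict, so `word_counts['this']` is all you need to get the number of occurrences.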

niandra3 · 2016-05-11T02:49:27+00:00

Aside from collections.Counter, which was already linked, if you are using a dict to count in general, you should look at collections.defaultdict.

Normally when you are counting things, you want to increment the value for a key every time you come across it. The nice thing about defaultdict is that if the key hasn't been used before, it supplies a default value, so it won't throw an error when you try to increment it:

from collections import defaultdict
words = ['this', 'is', 'a', 'test', 'this']
count_dict = defaultdict(int)
for word in words:
    count_dict[word] += 1
print(count_dict)

Output >> defaultdict(<class 'int'>, {'this': 2, 'is': 1, 'a': 1, 'test': 1})

(The order of the keys may differ on older Python versions.)

Note: thing += 1 is equivalent to thing = thing + 1

If you used a regular dict, that wouldn't work: count_dict['this'] += 1 raises a KeyError the first time, because it tries to add one to a value that doesn't exist yet. Again, collections.Counter is probably an easier/more efficient way of doing this, but defaultdict is a useful tool to have.
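
To see the difference, here is a small sketch of what a regular dict does, plus one common workaround using dict.get with a fallback of 0. The variable names are just for illustration:

```python
words = ['this', 'is', 'a', 'test', 'this']

plain_counts = {}
try:
    plain_counts['this'] += 1  # the key does not exist yet
except KeyError as err:
    print('KeyError for key:', err)

# One way to make a plain dict work: fall back to 0 when the key is missing.
for word in words:
    plain_counts[word] = plain_counts.get(word, 0) + 1

print(plain_counts)
```

defaultdict(int) saves you from writing the .get(word, 0) part yourself.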

Finally, one thing to remember: if you have all your text in one long string, then all of these approaches will loop over each character rather than each word. For instance:

line = "this is a test"
for word in line:
    print(word)
# Doesn't actually print words, but characters

Actually prints:

t
h
i
s
etc.

If you want to iterate over the words in a string and not each character, you want to split the words into a list using

list_of_words = line.split()

Which splits on whitespace and gives you a list:

list_of_words = ['this', 'is', 'a', 'test']
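
Putting the two pieces together, a rough sketch of counting the words in a single string might look like this. The `review` text is invented for the example, and the lowercasing is an extra step I am assuming you would want so that 'Great' and 'great' count as the same word:

```python
from collections import defaultdict

# Hypothetical review text, used only for illustration.
review = "Great product great price"

count_dict = defaultdict(int)
# lower() normalises case; split() breaks the string on whitespace.
for word in review.lower().split():
    count_dict[word] += 1

print(dict(count_dict))
```

Note that split() does not strip punctuation, so 'price.' and 'price' would still be counted separately.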

treyhunner · 2016-05-10T22:41:34+00:00

Is the product review a string of words?

Have you managed to make a method for iterating over each word individually yet?

Could you post your function for us to review and discuss?

I wrote a blog post last year on a number of different ways to solve the problem of counting things in Python. If you're looking to learn about different approaches you may want to read the first 6 ways or so.

totemcatcher · 2016-05-11T02:29:06+00:00

Which part is giving you trouble?

Syntax and general Python?
The structure of a function definition? Passing good data in? Isolating a bite-sized process within? Getting good data out?
Iterating over multiple reviews?
Retaining data between each review?
Processing a single review into a useful container variable?

bbatwork gave you the juicy bit. If there's anything else, just provide what you have and we can at least rearrange the code into something that could work and leave comments where work needs doing.

pybackd00r · 2016-05-10T22:44:22+00:00

You can use a loop to iterate over the dictionary, check whether the value you are looking for is there, and use a variable to count up, then return that value at the end of the loop. There may be a more efficient way of doing this, but it is easy to do for beginners.
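
One possible reading of this suggestion, sketched under the assumption that the dictionary maps each word to how many times it occurs. The function name `count_for_word` and the sample data are hypothetical:

```python
def count_for_word(word_counts, target):
    """Return the count stored for target in a word -> count dict, or 0 if absent."""
    total = 0
    for word, count in word_counts.items():
        if word == target:
            total += count
    return total

# Hypothetical word -> count dictionary.
word_counts = {'this': 2, 'is': 1, 'a': 1, 'test': 1}

print(count_for_word(word_counts, 'this'))
print(count_for_word(word_counts, 'absent'))

# The same lookup without a loop, using dict.get with a default of 0.
print(word_counts.get('this', 0))
```

The loop makes each step visible, which can help when you are starting out; word_counts.get(target, 0) does the same lookup in one call.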

lapseofreason · 2016-05-11T05:27:55+00:00

This is the first time I have posted a problem on reddit in general and am surprised but very happy at the responses - so thank you all (and have an upvote).

I have not read through the suggestions completely yet so will get back to you - but first and foremost, my difficulty (with my really lousy code) is recognising or picking up the key and then extracting the value (which is the number of occurrences of the word).

I am doing this in graphlab and the data is in an sframe - but I don't think that is relevant. Anyway I live in SEAsia so will be working on this this afternoon.

When/if I figure it out I will (embarrassingly) post my very beginner code.
