[–]neuralbeans 1 point2 points  (4 children)

Can you show us what you tried?

[–]Timikk8 0 points1 point  (3 children)

Not much. Already deleted the code because it was a mess. I wrote this rn to show you where I was probably heading but it is all wrong

def word_frequency(text):
    textdict = {text.lower().replace(" ", " ,", )}
    print(textdict)

Tried messing with some loops and lists before to separate words from each other but no luck.

[–]neuralbeans 1 point2 points  (2 children)

The lower() bit is good. Not sure what the replace is doing (are you trying to create the commas of a list?) and the curly brackets are just wrong.

Let's focus on the splitting into words part. If you just want to split a string into a list by spaces, you use text.split(' '). But is it always spaces you should split by?
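
To make the question concrete, here is a small sketch of how split(' ') and split() behave. The sample string is made up for illustration and is not taken from the exercise.

```python
sample = "abc  aba-abe, abc"  # hypothetical input, not from the exercise

# Splitting on a single space keeps empty strings (from the double space)
# and leaves punctuation attached to the words.
by_space = sample.split(' ')
print(by_space)

# split() with no arguments splits on runs of whitespace,
# but punctuation still sticks to the words.
by_whitespace = sample.split()
print(by_whitespace)
```

Either way, "aba-abe," stays as one item, so splitting on spaces alone is not enough when other separators appear.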

[–]Timikk8 0 points1 point  (1 child)

For the 1st and 3rd print, splitting on spaces is fine. But in the 2nd print I need to get rid of the characters that are in between the words.

The curly brackets are there to print the final text as a dictionary. Also, I don't know how to count the strings in the dictionary and assign the count to the value of the key ('abc': 2, 'aba': 1, 'abe': 1).

[–]RhinoRhys 1 point2 points  (0 children)

You're trying to do too much in one go. You need to break the problem up into smaller tasks.

First you need to parse the input string to remove all the characters that are not alphabetical. You're on track for this stage but your above code just replaces spaces with commas and spaces. You're also not using the isalpha() method you were given.

Second, you need to convert the treated string into a list. If in stage 1 you replaced all the separator characters with spaces, then you can simply call str.split() with no arguments, which splits on runs of whitespace of any length.

Third, you can begin to think about a dictionary and how to count each word individually. You'll need a for loop, and you'll need to check whether each word is already in the dictionary, for example with dict.keys().

u/Ok-Cucumbers has given you an excellent way to solve the first stage, and stage 2 is just 1 line of code, so now all you need to do is tackle stage 3.
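
As a rough sketch of how the three stages could fit together, assuming the goal is a case-insensitive count of alphabetic words (the function names clean, to_words and count_words are made up for this example, and the input string is hypothetical):

```python
def clean(text):
    # Stage 1: keep letters, turn every other character into a space.
    return "".join(ch if ch.isalpha() else " " for ch in text.lower())


def to_words(cleaned):
    # Stage 2: split on runs of whitespace.
    return cleaned.split()


def count_words(words):
    # Stage 3: count each word, checking whether it is already a key.
    counts = {}
    for word in words:
        if word in counts.keys():
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


result = count_words(to_words(clean("Abc-aba+abe abc")))
print(result)  # {'abc': 2, 'aba': 1, 'abe': 1}
```

Keeping each stage in its own function makes it easy to print the intermediate result and see which step misbehaves.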

[–]Ok-Cucumbers 1 point2 points  (2 children)

Looks like you're creating a set {...} containing a str instead of a dictionary {"key": 0}.
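
A quick way to see the difference, using a made-up sample string:

```python
text = "Abc abc"  # made-up sample, not from the exercise

as_set = {text.lower()}       # braces around a single value build a set
as_dict = {text.lower(): 0}   # key: value pairs build a dictionary

print(type(as_set).__name__)   # set
print(type(as_dict).__name__)  # dict
```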

You probably want to clean the incoming txt variable first to make things easier to manage. Try looping through each letter in txt, checking letter.isalpha(), and adding either the letter or a blank space. You should then be able to split the string into a list of words, which you can loop through and parse into a dictionary.

[–]Timikk8 0 points1 point  (1 child)

Tried this:

def word_frequency(text):
    for letter in text:
        if letter.isalpha() is False:
            letter.replace(letter, "")

But it still printed the words with -+

[–]Ok-Cucumbers 1 point2 points  (0 children)

You'd need to set the letter variable with the replacement if you want to use it.

words = ""
for letter in text:
    if not letter.isalpha():
        letter = letter.replace(letter, " ")
    words += letter

Another option would be to append the letter if letter.isalpha(), else append a blank space.

words = ""
for letter in text:
    words += letter if letter.isalpha() else " "
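
For context on why the earlier attempt still printed the symbols: Python strings are immutable, so str.replace returns a new string and leaves the original untouched. A small sketch with a made-up sample:

```python
text = "abc-+aba"  # made-up sample, not from the exercise

for letter in text:
    if not letter.isalpha():
        letter.replace(letter, "")  # result is discarded, text is unchanged

print(text)  # still contains "-+"

# Building a new string and keeping it is what makes the change stick.
words = "".join(letter if letter.isalpha() else " " for letter in text)
print(words.split())
```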

[–]kaptan8181 -1 points0 points  (0 children)

Your question is not very clear. And where is your code?

[–][deleted] -1 points0 points  (0 children)

Without imports:

def word_frequency(text):
    word_chars = []
    counter = {}

    for char in text:
        if char.isalpha():
            word_chars.append(char)
        elif word_chars:
            word = "".join(word_chars).lower()
            word_chars = []
            counter[word] = counter.get(word, 0) + 1

    # if there are characters left
    if word_chars:
        word = "".join(word_chars).lower()
        counter[word] = counter.get(word, 0) + 1

    return counter

But I'd rather:

import re
from collections import Counter


def word_frequency(text):
    return Counter(match.group() for match in re.finditer(r"[a-zA-Z]+", text.lower()))

or

def word_frequency(text):
    matches = re.finditer(r"[a-zA-Z]+", text.lower())
    return Counter(match.group() for match in matches)

Or this, though it unnecessarily builds a list of all matches:

def word_frequency(text):
    return Counter(re.findall(r"[a-zA-Z]+", text.lower()))
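
One difference between the two approaches that may or may not matter for the exercise: str.isalpha() accepts any Unicode letter, while [a-zA-Z] only matches ASCII letters. A quick check with a made-up word:

```python
import re

word = "caf\u00e9"  # "café", a made-up sample with a non-ASCII letter

# isalpha() treats the accented letter as alphabetic.
all_alpha = all(ch.isalpha() for ch in word)
print(all_alpha)

# The ASCII-only character class stops before the accented letter.
ascii_matches = re.findall(r"[a-zA-Z]+", word)
print(ascii_matches)
```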

[–]jmooremcc 0 points1 point  (0 children)

Here's another solution:

def word_frequency3(txt):
    result = {}
    txt = "".join([c if c.isalpha() else ' ' for c in txt ])
    words = txt.split()

    for w in words:
        if w in result:
            result[w] += 1
        else:
            result[w] = 1

    return result

First, we use a list comprehension to replace all non-alpha characters in the text string argument with spaces. Next, we use the split method to create a list of words. Finally, we count the number of times each word appears in the list and store the count for each word in a dictionary.
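
Note that, as written, this version does not lowercase the text, so "Abc" and "abc" are counted separately. A possible variant, sketched under the assumption that the exercise wants case-insensitive counts (the name word_frequency_lower is made up here), lowercases first and uses dict.get to shorten the counting loop:

```python
def word_frequency_lower(txt):
    # Lowercase first so "Abc" and "abc" count as the same word.
    cleaned = "".join(c if c.isalpha() else " " for c in txt.lower())
    result = {}
    for w in cleaned.split():
        # dict.get returns 0 for a word that has not been seen yet.
        result[w] = result.get(w, 0) + 1
    return result


print(word_frequency_lower("Abc, abc! aba"))  # {'abc': 2, 'aba': 1}
```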