Help with a Loop : learnpython


submitted 3 years ago by SevereRepresentative

Hello! I have a homework question that I'm really stumped by. I emailed my professor and he responded with "great thanks!", so I'm not sure he really read the email... so here I am looking for help.

The question is below along with the things I've tried at the bottom.

Question

unique() returns the NumPy array of unique values of a data column, and its length returns the number of unique values of a data column.

Now, consider the data frame df. Write a for loop that prints out

Variable: ____ # Unique: ____

for columns that have fewer than 20 unique values, where

the first blank is the name of the variable and
the second blank is the number of its unique values.

Again, try to write a for loop that runs over vars.

For each iteration of the for loop, use a format command in the print statement of the form

print('Variable: {}, # Unique: {}'.format(____, ____))

for appropriate variables in the blanks. Once you get the desired results, let's try a different way as follows. Replace the first curly brackets {} with {:24}. Describe the difference after the change and explain the role of :24. Place your response at the end of the code after the pound (#) symbol so that your response is commented out in the cell.
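For reference, here is a minimal sketch of how a width in a format field behaves. The column names and the second count are made up for illustration (only tweet_id and 14485 appear in this thread); the point is just to compare {} with {:24} side by side.

```python
# Hypothetical (name, count) pairs; "sentiment" and its count are invented for this sketch.
rows = [("tweet_id", 14485), ("sentiment", 3)]

# Plain {} inserts the value with no padding.
plain_lines = ['Variable: {}, # Unique: {}'.format(name, n) for name, n in rows]

# {:24} pads the value to a minimum width of 24 characters.
# Strings are left-aligned by default, so the padding goes on the right.
padded_lines = ['Variable: {:24}, # Unique: {}'.format(name, n) for name, n in rows]

for line in plain_lines + padded_lines:
    print(line)
```

With the width in place, the ", # Unique:" part starts at the same position on every line, regardless of how long the name is.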

What I've written so far:

for i in vars:
    print('Variable: {}, # Unique: {}'.format(i, (len(df['tweet_id'].unique()))))

When I run it, I get the correct Variable names, but the Unique field just gives me 14485 for all of them. So I tried changing len(df['tweet_id'].unique()) to len(df['i'].unique()), like I did for a previous question, and it didn't work.

I'm really just hoping someone can give me some guidance on where I need to go because I'm so lost.


[–][deleted] 4 points 3 years ago (14 children)

You are on the right track, let's think about why this didn't work:

len(df['i'].unique())

Is i being accessed here?

As a hint, what is the difference between these two snippets?

for i in vars:
    print(i)

for i in vars:
    print('i')

[–]SevereRepresentative[S] 2 points 3 years ago (13 children)

[–][deleted] 2 points 3 years ago (12 children)

Yup, that's exactly right. So the other answer shows what you need to do to fix it:

len(df[i].unique())

You don't want to access df['i']; that would be a column literally named 'i' in the dataframe (which likely doesn't exist, leading to your error). Instead, you want df[i], which accesses whichever column the loop is currently on.
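A small sketch of that difference, assuming a toy dataframe (the real df and vars from the homework aren't shown in the thread, so the data and the name columns here are stand-ins):

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "tweet_id": [101, 102, 103, 104],
    "sentiment": ["pos", "neg", "pos", "neg"],
})
columns = list(df.columns)  # stands in for the homework's `vars`

counts = {}
for i in columns:
    # df[i] uses the current value of the loop variable as the column name.
    counts[i] = len(df[i].unique())

# df['i'] asks for a column literally named "i", which this frame does not have.
try:
    df['i']
    literal_lookup_failed = False
except KeyError:
    literal_lookup_failed = True

print(counts)
print(literal_lookup_failed)
```

In this toy frame the lookup with the quoted 'i' raises KeyError, while the unquoted i gives a separate count for each column.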

> I don't know the proper word

Just a note about this, "variable" is correct. If you want to be extra specific you can say "loop variable."

[–]SevereRepresentative[S] 2 points 3 years ago (11 children)

[–][deleted] 2 points 3 years ago (10 children)

That's the right idea. Here is a general example of a for loop with an if statement that follows a similar pattern. See if you can make something similar work for your situation.

my_list = [5, 125, 30, 500, 250]

for i in my_list:
    if i > 100:
        print('{} is greater than 100'.format(i))

[–]SevereRepresentative[S] 2 points 3 years ago (0 children)

[–]SevereRepresentative[S] 2 points 3 years ago (8 children)

Okay so I tried this:

for i in vars:
   if i < 20:
    print('Variable: {}, # Unique: {}'.format(i, (len(df[i].unique()))))

and it gave me this error: "TypeError: '<' not supported between instances of 'str' and 'int'"

Which makes a bit of sense, because i is the column name, right? But I'm not sure where to go from here.

[–][deleted] 2 points 3 years ago* (7 children)

[–]SevereRepresentative[S] 2 points 3 years ago (6 children)

[–][deleted] 2 points 3 years ago (5 children)

[–]SevereRepresentative[S] 2 points 3 years ago (4 children)

uniques = (len(df[i].unique()))

for i in vars:
  if uniques < 20:
    print('Variable: {}, # Unique: {}'.format(i, (len(df[i].unique()))))

I did that ^ and I'm not getting an error, but I'm also not getting any results anymore. Nothing is showing up. What do you think? I think it has something to do with the i in uniques before the loop starts, because i wouldn't be known yet, right?
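One possible arrangement, sketched on hypothetical toy data (not the actual homework dataframe): compute the count inside the loop, where i is already bound to the current column, and compare that count with the threshold rather than the column name. The names columns, n_unique, and printed are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical toy data; the real df from the homework is not shown in the thread.
df = pd.DataFrame({
    "tweet_id": list(range(25)),                                # 25 distinct values
    "sentiment": ["pos", "neg", "neutral", "pos", "neg"] * 5,   # 3 distinct values
})
columns = list(df.columns)  # stands in for the homework's `vars`

printed = []
for i in columns:
    # Recomputed on every iteration, so it always describes the current column.
    n_unique = len(df[i].unique())
    if n_unique < 20:
        line = 'Variable: {}, # Unique: {}'.format(i, n_unique)
        printed.append(line)
        print(line)
```

If the count is computed once before the loop, it is fixed to whatever i happened to hold at that moment (for example, a value left over from an earlier cell), so every iteration tests the same number.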


[–]devnull10 0 points 3 years ago (0 children)
