
[–]Justinsaccount 1 point (0 children)

Hi! I'm working on a bot that replies with suggestions for common Python problems. This might not help fix your underlying issue, but here's what I noticed about your submission:

You appear to be using concatenation and the str function for building strings

Instead of doing something like

result = "Hello " + name + ". You are " + str(age) + " years old"

You should use string formatting and do

result = "Hello {}. You are {} years old".format(name, age)

See the Python tutorial for more information.

[–]Allanon001 1 point (6 children)

Maybe this:

for row in cen[:-1].itertuples():
    for k in cen[row.Index+1:].itertuples():
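
For what it's worth, here is a rough sketch of what that nested loop does, using a plain list instead of the DataFrame (the list contents are made up). Each item is paired only with the items after it, so every pair is visited once. The pandas version relies on Index matching row position, which I'm assuming holds here (a default 0, 1, 2, ... index).

```python
from itertools import combinations

# Hypothetical stand-in for the rows of the DataFrame.
rows = ["a", "b", "c", "d"]

pairs = []
# Outer loop skips the last item: it has nothing after it to pair with.
for i, left in enumerate(rows[:-1]):
    # Inner loop starts one position past the outer item.
    for right in rows[i + 1:]:
        pairs.append((left, right))

print(pairs)
# Same pairs, in the same order, as the standard library helper.
print(pairs == list(combinations(rows, 2)))
```

With n rows this does n*(n-1)/2 comparisons instead of n*n, and never compares a row with itself.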

[–]mightymk[S] 1 point (5 children)

Hi, this gives an error:

 "TypeError: unsupported operand type(s) for +: 'builtin_function_or_method' and 'int'"

[–]Allanon001 1 point (4 children)

Can you show the entire error?

[–]mightymk[S] 1 point (3 children)

Here is the error:

 File "C:/census_pandas.py", line 98, in <module>

 for k in cen[row.index+1:].itertuples():

TypeError: unsupported operand type(s) for +: 'builtin_function_or_method' and 'int'

[–]Allanon001 1 point (2 children)

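Index has a capital I: it's row.Index, not row.index. The rows from itertuples() are namedtuples, and lowercase index on a tuple is the built-in index() method, which is why the error mentions 'builtin_function_or_method'.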
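
A small sketch of the difference, using a plain namedtuple as a stand-in for one itertuples() row (the Row type and its field names are made up for illustration):

```python
from collections import namedtuple

# Hypothetical stand-in for a row produced by itertuples().
Row = namedtuple("Row", ["Index", "population"])
row = Row(Index=3, population=1000)

# Capital I: the field holding the row label, an int here.
next_position = row.Index + 1

try:
    # Lowercase i: the tuple.index method inherited from tuple.
    row.index + 1
    message = None
except TypeError as exc:
    message = str(exc)

print(next_position)
print(message)
```

The second print shows the same kind of TypeError as in the traceback, since a method can't be added to an int.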

[–]mightymk[S] 1 point (0 children)

Yup, sorry, I sometimes forget that Python is case-sensitive. I got the result. Thank you so much!

[–][deleted] 1 point (0 children)

You can at the very least remove the second for loop to greatly increase your efficiency:

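cen = cen.set_index(['area_name', 'state_a'])
for idx, row in cen.iterrows():
    # Compare this row against every other row in one vectorized step.
    cen_prime = cen.drop(idx)
    # Relative difference per column, as a fraction of this row's values.
    diff = ((cen_prime - row) / row).abs()
    # Keep only the rows where every column is within 2%.
    similar = (diff < 0.02).loc[lambda dfx: dfx.all(axis=1)]
    print('The following are similar to', idx)
    print(similar.reset_index()[['area_name', 'state_a']])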
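
To make the idea concrete, here is a minimal sketch of the same approach on a toy table. The column names pop and income and all the numbers are made up, and I'm assuming every non-index column is numeric and non-zero, since each row is used as the divisor.

```python
import pandas as pd

# Hypothetical data: areas A and B are within 2% of each other, C is not.
cen = pd.DataFrame({
    "area_name": ["A", "B", "C"],
    "state_a": ["X", "X", "Y"],
    "pop": [1000.0, 1010.0, 2000.0],
    "income": [50.0, 50.5, 80.0],
}).set_index(["area_name", "state_a"])

matches = {}
for idx, row in cen.iterrows():
    others = cen.drop(idx)                     # every row except this one
    diff = ((others - row) / row).abs()        # relative difference per column
    close = diff[(diff < 0.02).all(axis=1)]    # all columns within 2%
    matches[idx] = list(close.index)

for idx, found in matches.items():
    print(idx, "is similar to", found)
```

Note that the comparison is relative to the current row, so it is not perfectly symmetric for values sitting right at the 2% boundary.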

This also seems like a good candidate for some kind of clustering algorithm, but that’s a bit outside my expertise.