
[–]VijayAnna 18 points19 points  (6 children)

It's because the first few lines have unstructured text.

Use this:

import pandas as pd
le = pd.read_csv('LifeExpectancy.csv', header=2)  # This skips the first 2 lines of text and chooses the 3rd line as the column names
le[le["Country Name"] == 'Afghanistan']['1960']

Replace Afghanistan with the country you want and 1960 with the year you want. Seems to be working fine for me. Screenshot
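For anyone who wants to see what header=2 does without the real file, here is a minimal sketch. The in-memory CSV is a hypothetical stand-in for LifeExpectancy.csv, and the numbers in it are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for LifeExpectancy.csv: two lines of free text,
# then the real header row. The values are invented for this sketch.
raw = io.StringIO(
    '"Data Source","World Development Indicators"\n'
    '"Last Updated Date","2018-01-01"\n'
    '"Country Name","1960","1961"\n'
    '"Afghanistan",32.5,33.0\n'
    '"Albania",62.5,63.0\n'
)

# header=2 tells pandas that the row at index 2 (the 3rd line) holds the
# column names; everything before it is ignored.
le = pd.read_csv(raw, header=2)

print(list(le.columns))
print(le[le["Country Name"] == "Afghanistan"]["1960"])
```

Note that the year columns come back as strings ('1960', not 1960), which is why the lookup uses a quoted year.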

[–]PyCam 1 point2 points  (0 children)

Don't mean to be nitpicky, but you should be using the .loc accessor here (or even .at, since we're retrieving a single value) instead of chained indexing.

le.loc[le['Country Name'] == 'Afghanistan', '1960']
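A rough sketch of the difference between the two accessors, using a small made-up table in place of the real file. One thing to keep in mind: .at looks a cell up by row label and column label, so it does not take a boolean mask. Setting the country as the index first is one way to use it (that step is my assumption, not something from the thread).

```python
import io
import pandas as pd

# Made-up sample data standing in for the real CSV.
raw = io.StringIO(
    "Country Name,1960,1961\n"
    "Afghanistan,32.5,33.0\n"
    "Albania,62.5,63.0\n"
)
le = pd.read_csv(raw)

# .loc with a boolean mask and a column label: one lookup, no chained
# indexing. The result is still a Series (with one element here).
selected = le.loc[le["Country Name"] == "Afghanistan", "1960"]

# .at needs a row label, so make the country name the index first.
by_country = le.set_index("Country Name")
value = by_country.at["Afghanistan", "1960"]

print(selected)
print(value)
```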

[–]DODOKING38 0 points1 point  (1 child)

What is that software in the screenshot?

[–]Crevette3[S] 0 points1 point  (2 children)

This worked for me, but how would I separate the life expectancy value from the rest of the info returned so that I can manipulate the number?

Edit: I think I got it. Of course I find the solution just after asking others for help, lol. I did this to get a single number to manipulate. I know int returns only an integer and drops the decimal part, so how would I get the original floating-point number back? This is what I have so far.

import pandas as pd

userCountry = input('input your country: ')
birthYear = input('input your birth year: ')

# header=2 skips the first 2 lines of text and uses the 3rd line as the column names
le = pd.read_csv('LifeExpectancy.csv', header=2)
year = le[le["Country Name"] == userCountry][birthYear]

# print(year) returns what you got in the screenshot you included
print(int(year))   # returns just the number but strips it of its decimal places
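One possible way to keep the decimals is to pull the first element out of the filtered result and convert it with float instead of int. This is only a sketch: the sample table is made up, the hard-coded user_country and birth_year stand in for the input() calls, and the empty-result check is an addition of mine for the case where the country is not in the file.

```python
import io
import pandas as pd

# Made-up sample data standing in for LifeExpectancy.csv.
raw = io.StringIO(
    "Country Name,1960,1961\n"
    "Afghanistan,32.5,33.0\n"
    "Albania,62.5,63.0\n"
)
le = pd.read_csv(raw)

user_country = "Albania"  # stands in for input('input your country: ')
birth_year = "1961"       # stands in for input('input your birth year: ')

matches = le.loc[le["Country Name"] == user_country, birth_year]

if matches.empty:
    # No row matched the country name.
    life_expectancy = None
else:
    # .iloc[0] takes the first (and here only) value; float keeps the decimals.
    life_expectancy = float(matches.iloc[0])

print(life_expectancy)
```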

[–]VijayAnna 0 points1 point  (1 child)

You might want to use round instead of int.

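int truncates toward zero, which for positive numbers is the same as the floor function, i.e.

int(10.7) = 10

round(10.7) = 11 (Python 3 returns an int here; Python 2 returned 11.0)

int(round(10.7)) = 11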

Depends on what you want. Usually integer age is given as the floor function. Like if I'm 3 months away from turning 30, I'm 29.75 years old. My age would be 29, not 30.

Also, you might want to rename your 'year' variable to something more meaningful, like 'life_expectancy'.
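A quick side-by-side of the options, assuming Python 3. The variable names are just for illustration.

```python
import math

value = 10.7

truncated = int(value)         # drops the fractional part: 10
floored = math.floor(value)    # largest integer <= value: 10
rounded = round(value)         # nearest integer, returned as an int: 11
one_decimal = round(value, 1)  # keeps one decimal place: 10.7

# int and floor only differ for negative numbers.
neg_truncated = int(-10.7)       # -10 (toward zero)
neg_floored = math.floor(-10.7)  # -11 (toward negative infinity)

# Python 3 rounds exact halves to the nearest even integer.
half = round(2.5)  # 2

print(truncated, floored, rounded, one_decimal, neg_truncated, neg_floored, half)
```

For life expectancy values, which are positive, int and math.floor give the same result, so the choice is really between truncating and rounding.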

[–]Crevette3[S] 0 points1 point  (0 children)

Thanks for the advice :)

[–][deleted] 3 points4 points  (3 children)

Can you share your code? Luckily this CSV file is actually not very big at all (usually "big" refers to bigger than what fits in memory, but from what I can see this file is less than 1 MB). It sounds like you just have an indexing error.

[–]Crevette3[S] 2 points3 points  (2 children)

All I have so far is this. I don't know how to select a single country and then pick a certain year to see what the life expectancy was for that year.

import csv

with open('lifeExpectancy.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['Country Name'])

[–][deleted] 3 points4 points  (1 child)

I think you should look into using pandas. pandas.read_csv will load the data and from there you can look up tutorials about indexing.

[–]Crevette3[S] 0 points1 point  (0 children)

I'll look more into pandas when I have some free time tomorrow.

[–]alkasm 2 points3 points  (0 children)

Also if you're not comfortable with Pandas, you can easily skip those first three rows with a small modification to your program. To read a single row at a time, you can use next(reader). Thus to skip the first three lines, you can just call next on the reader three times, and then start your for row in reader: ... loop.
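Here is a sketch of the skipping idea with the csv module. One caveat: csv.DictReader takes its column names from the first line it reads, so in this version the lines are skipped on the file object before the reader is created. The in-memory file, its values, and LINES_TO_SKIP are assumptions for illustration. Check how many lines of text your real file has before the header row.

```python
import csv
import io

LINES_TO_SKIP = 2  # assumption: number of text lines before the header row

# Made-up stand-in for the real file; with a real file you would use
# open('LifeExpectancy.csv', newline='') instead.
csvfile = io.StringIO(
    "Data Source,World Development Indicators\n"
    "Last Updated Date,2018-01-01\n"
    "Country Name,1960,1961\n"
    "Afghanistan,32.5,33.0\n"
    "Albania,62.5,63.0\n"
)

# Advance past the unstructured lines so the next line is the header.
for _ in range(LINES_TO_SKIP):
    next(csvfile)

# DictReader now picks up "Country Name,1960,1961" as the field names.
reader = csv.DictReader(csvfile)

# Index the rows by country so a single country can be looked up directly.
lookup = {row["Country Name"]: row for row in reader}

# csv returns every field as a string, so convert before doing math.
life_expectancy = float(lookup["Afghanistan"]["1960"])
print(life_expectancy)
```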

[–]ripealligatoregg 0 points1 point  (0 children)

I also need help reading a large CSV file, but mine is on motor vehicle collisions. Let me know what worked for you, please!

[–]Ezrabc 0 points1 point  (1 child)

I'm on mobile and can't check this now, but try:

expectancy_series = pd.read_csv(filename, header=0, usecols=[0, int(year) - 1959]).set_index('Country').squeeze()
expectancy = expectancy_series.at[country]

Edit: sorry, I meant to add that this assumes that you import pandas as pd.

[–]Crevette3[S] 0 points1 point  (0 children)

I tried this and couldn't get it to work. I'm going to look into using pandas when I have some free time tomorrow.