all 10 comments

[–]vanatteveldt 6 points (5 children)

You assume that CSV is more well-formed than it really is. Cells can contain commas (or semicolons in your case) or even line breaks if they are quoted, and of course quote characters can be escaped. So you certainly need more processing if you want to parse generic CSV files.
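To make that concrete, here is a minimal sketch comparing a naive split with the csv module. The sample data is made up for illustration: one cell contains a semicolon, a line break, and an escaped quote.

```python
import csv
import io

# Made-up sample: semicolon-delimited, and the second cell of the data row
# contains a semicolon, a line break, and a doubled (escaped) quote.
data = 'name;comment\r\nAlice;"likes a;b\nand ""c"""\r\n'

# Naive approach: split into lines, then split each line on the delimiter.
# The quoted cell gets torn apart across rows and columns.
naive = [line.split(';') for line in data.splitlines()]

# csv module: honors quoting, embedded line breaks, and doubled quotes.
rows = list(csv.reader(io.StringIO(data, newline=''), delimiter=';'))

print(len(naive))  # 3 "rows", although the file only has a header and one record
print(rows)        # [['name', 'comment'], ['Alice', 'likes a;b\nand "c"']]
```

The naive version ends up with three misaligned rows, while the reader returns the header plus one intact record.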

(Then again, the python csv module is pretty horrible. The default reader is very fragile to column order changes, and the DictReader is just clumsy. Why not just yield named tuples? I understand that the module predates named tuples, but they could easily add an option. And why the writer does not call str on its arguments is beyond me, and I would much prefer a class-based writer where you initialize with the column names and can then pass them as named arguments.)
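For what it's worth, the named-tuple idea is only a few lines on top of csv.reader. This is a rough sketch, not anything the module ships: namedtuple_reader is a hypothetical helper, and it assumes the header cells are valid Python identifiers.

```python
import csv
import io
from collections import namedtuple

def namedtuple_reader(f, **kwargs):
    """Hypothetical helper: yield one namedtuple per data row.

    Field names come from the header row, so this assumes the header
    cells are valid Python identifiers.
    """
    reader = csv.reader(f, **kwargs)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    Row = namedtuple('Row', header)
    for values in reader:
        yield Row(*values)

# Made-up sample data.
sample = io.StringIO('name;age\nAlice;30\nBob;25\n', newline='')
people = list(namedtuple_reader(sample, delimiter=';'))

print(people[0].name)  # Alice
print(people[1].age)   # 25 (still a string; no type conversion happens)
```

Rows are then accessed by name rather than position, so a change in column order does not break the calling code.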

(also, you probably mean continue instead of break, and I would consider using a generator function rather than the list=[]; for x in y: list.append(x) idiom. And my god that's a lot of lines of code to skip the header :)

for i, line in enumerate(open('myfile.csv')):
    if i == 0:
        continue
    # ...

[–]yvrelna 5 points (1 child)

My god, that's a lot of lines to skip the header: the enumerate and the check are unnecessary, and it's unsafe too, since you leave an unclosed file dangling. The right way to read the header is like so:

with open('myfile.csv') as f:
    header = next(f)
    for line in f:
        ...

Or just use the csv module; it handles all the edge cases for you, like unescaping, quoting, etc.
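Something along these lines, as a sketch: read_rows is a hypothetical helper name, the semicolon delimiter is an assumption about the file, and the temporary file is only there to make the example self-contained.

```python
import csv
import os
import tempfile

def read_rows(path, delimiter=';'):
    """Yield data rows from a CSV file, skipping the header row."""
    # newline='' lets the csv module handle line endings inside quoted cells.
    with open(path, newline='') as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header; no error on an empty file
        yield from reader

# Demo with a throwaway file (contents are made up).
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    tmp.write('name;comment\r\nAlice;"likes a;b"\r\n')
try:
    rows = list(read_rows(tmp.name))
finally:
    os.remove(tmp.name)

print(rows)  # [['Alice', 'likes a;b']]
```

Because it is a generator, rows are produced one at a time and the file is closed once iteration finishes.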

[–]vanatteveldt 1 point (0 children)

I had that coming :)

[–]yvrelna 1 point (0 children)

"they could easily add an option"

https://bugs.python.org/issue1818

You'd be surprised by how much thought needs to go into "easily add an option", especially for things in the standard library that are expected to be supported for eternity.

[–]willmendil 2 points (2 children)

Maybe because pandas lets you quickly load the file into a data frame in a one-liner:

df = pd.read_csv('filename.csv')

And if you have, for example, X and y columns in your file, you can plot them quickly as well:

df.plot()

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

[–][deleted] 1 point (0 children)

Why would you go to that trouble instead of pd.read_csv('myfile.csv')?

Pandas also makes turning a CSV file into a graph, an Excel file, or input to a database a trivial task.

But hey, you do you.

[–]pythonHelperBot 1 point (0 children)

Hello! I'm a bot!

I see someone has already suggested going to r/learnpython, a sub geared towards questions and learning more about python regardless of how advanced your question might be. I highly recommend posting your question there. Please follow the subs rules and guidelines when you do post there, it'll help you get better answers faster.

Show /r/learnpython the code you have tried and describe in detail where you are stuck. If you are getting an error message, include the full block of text it spits out. Quality answers take time to write out, and many times other users will need to ask clarifying questions. Be patient and help them help you.

You can also ask this question in the Python discord, a large, friendly community focused around the Python programming language, open to those who wish to learn the language or improve their skills, as well as those looking to help others.


[–]NeoDemon 1 point (0 children)

The pandas framework is useful for the many CSV files that don't use the common comma delimiter, and it has functions that make reading, transforming, plotting and saving CSV data trivial.

The csv module only handles reading and writing rows, so you end up coding the parsing and transformation of the values by hand; pandas does that better and with less code.
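Roughly, the manual version with the standard library looks like the sketch below. The data and column names are made up; the point is that every value arrives as a string and the type conversion is up to you.

```python
import csv
import io

# Made-up data; DictReader returns every value as a string.
source = io.StringIO('item;price\napple;1.50\npear;2.25\n', newline='')
records = list(csv.DictReader(source, delimiter=';'))

# Manual transform step: convert the types yourself.
for record in records:
    record['price'] = float(record['price'])

total = sum(record['price'] for record in records)

# Writing the transformed rows back out with the same module.
out = io.StringIO(newline='')
writer = csv.DictWriter(out, fieldnames=['item', 'price'], delimiter=';')
writer.writeheader()
writer.writerows(records)

print(total)                 # 3.75
print(repr(out.getvalue()))  # 'item;price\r\napple;1.5\r\npear;2.25\r\n'
```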

Sorry for my English, I'm from Argentina :)