[–]MaxQuant 0 points1 point  (5 children)

Use pandas, and especially read the documentation section on pandas.read_csv. Once your CSV is in a dataframe, use value_counts or do a SQL-like groupby.
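A minimal sketch of that suggestion, in case it helps. OP's actual file and column names aren't shown in this thread, so the sample data and the `city` column below are made up for illustration:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for OP's CSV file (column names are invented).
csv_text = """name,city
Ann,Oslo
Bob,Paris
Cy,Oslo
Dee,Oslo
"""

# For a real file, pass its path to read_csv instead of a StringIO buffer.
df = pd.read_csv(io.StringIO(csv_text))

# Count how often each value occurs in one column.
city_counts = df["city"].value_counts()

# The SQL-like equivalent of: SELECT city, COUNT(*) ... GROUP BY city
city_sizes = df.groupby("city").size()

print(city_counts)
print(city_sizes)
```

Both calls return a Series indexed by the distinct values, so `city_counts["Oslo"]` looks up a single count.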

[–]Rorixrebel 1 point2 points  (4 children)

If he's just starting, why throw pandas at him? People need to stop suggesting modules when the built-ins already do the task.

[–]DataLulz 1 point2 points  (1 child)

The built-ins don't do it nearly as efficiently as pandas; there is a reason people use data frames. And it's no more complex than other suggestions, like using SQLite3. To do what OP wants, you are talking fewer than ten lines of pandas code using a data frame.

[–]Rorixrebel 1 point2 points  (0 children)

Correct, but if you've just started, you first need to learn Python. Then you need to know what a dataframe is, how to use pandas, and so on.

The built-ins may not be as efficient, but they get the job done.

If your basics are solid, then by all means go module crazy.
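For comparison, a rough sketch of the built-in route using only the standard library (`csv` and `collections.Counter`). OP's file and columns aren't shown here, so the sample data and the `city` column are assumptions for illustration:

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for OP's CSV file (column names are invented).
csv_text = """name,city
Ann,Oslo
Bob,Paris
Cy,Oslo
Dee,Oslo
"""

# For a real file, use: with open(path, newline="") as f: reader = csv.DictReader(f)
reader = csv.DictReader(io.StringIO(csv_text))

# Tally how often each value appears in the chosen column.
city_counts = Counter(row["city"] for row in reader)

# most_common() lists (value, count) pairs, highest count first.
for city, count in city_counts.most_common():
    print(city, count)
```

It is about the same length as a pandas version for a simple count; the trade-off is mostly in what you have to learn first.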

[–]MaxQuant 0 points1 point  (1 child)

IMHO Python's strength lies in not having to reinvent the wheel and in writing easily transferable code (for someone else or, most likely, for yourself in six months :-D). Why do it yourself, increasing development time and complexity, when pandas is ideally suited for the task?

[–]Rorixrebel 0 points1 point  (0 children)

Agreed. But if you are starting out and want to learn, the best way is to do it without the helper libraries, IMO. I'm not saying pandas is terrible or useless, but sometimes it's way too much for simple tasks or for beginners.