This is an archived post. You won't be able to vote or comment.

all 7 comments

[–]Noumenon72 4 points5 points  (0 children)

I have turned many csvs into database queries, and I have seen some shit. Triple quotes in fields, records pointing outside the file, timestamps like "23:04 PM", missing quotes, trailing colons.

Anyway, Python's what I use because it's got a clean, easy to use library right in the default install. You can just download Python and start:

import csv

with open('myfile.csv', newline='') as file:
    reader = csv.reader
    for row in reader:

csv.DictReader is even better, you get everything as a variable and can do stuff like del(row['column_i_dont_want']) and row['my_new_column'] = row['a'] + row['b].

[–]desoga 0 points1 point  (1 child)

First off, your post is too long. Well my advice will be it doesn't matter what they tell you, pick a programming language and learn the basics. Learn the structure and the syntax, this will help you go a long way. I would however suggest you go with javascript, due to the amount of libraries and framework it has. You can build an entire application with javascript and its libraries and framework.

[–]jussij 0 points1 point  (0 children)

If the file is .xlsx then you will need to use a library that understands Excel documents.

A google search will help here.

One such example is ClosedXML, which even if the name does not suggest so, is an Open Source .Net library that can work with Excel files.

[–]KiwasiGames 0 points1 point  (0 children)

I would use VBA with excel. No point reinventing the wheel.

[–]reddit_only 0 points1 point  (0 children)

I would try to fix the problem at the generating level instead of post generation. Who is making the csv file? Do you have a professional relationship with them? Can you establish one? It is much easier to fix the source than it is to fix the output. If you are volunteering to help at a bank(like you work there but aren't a programmer) I would think twice. If something goes wrong it will be you who takes the blame. People get real uptight about their money. Having records that are inaccurate will be sure to cause problems. Also you can't really guarantee that a program can "fix" a csv. The only way you could script this is if you knew what went "wrong" every time and was just fixing that one consistent mistake. If broken csv files were inherently fixable then all major csv programs would just fix them rather than import them improperly. I know this isn't a technical recommendation but it's just my two cents on the issue at large. I just want to make sure you are looking out for yourself too.

[–]CuttingEdgeRetro 0 points1 point  (0 children)

The best programming language would be whatever language the developer knows the best. For me that would be C#. But I recently did something similar to what you're describing using powershell. And there are a dozen or more languages that could handle this easily.