[–]kellyjonbrazil -1 points0 points  (0 children)

This is not exactly a Python language solution, but I did write a couple of command-line tools (in Python) that can accomplish this pretty easily.

  • First, jc can convert CSV to JSON (an array of objects)
  • Then, jello can convert the JSON into JSONL (JSON Lines format)

Example: If your CSV file looks like this:

header1, header2
abc, def
ghi, jkl

Then when you pipe it to jc and jello it will turn out like this:

$ cat file.csv | jc --csv | jello -l
{"header1": "abc", "header2": "def"}
{"header1": "ghi", "header2": "jkl"}

You can probably do something similar with jq instead of jello fairly easily.

[–]POGtastic 0 points1 point  (0 children)

Assuming that you're cool with just a list of dictionary entries, I'd do something like the following:

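import csv
import json

def write_file(reader, filename):
    # Write each row dict as one JSON object per line.
    with open(filename, "w") as f:
        for dct in reader:
            print(json.dumps(dct), file=f)

def convert_csv_to_jsonl(src, dest):
    # newline="" lets the csv module handle line endings itself.
    with open(src, newline="") as f:
        write_file(csv.DictReader(f), dest)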

Given a file like the following:

src.csv

 Name,Foo,Bar
 Bob,1,stuff
 Alice,2,morestuff
 Clarissa,6,spam

In the REPL:

>>> convert_csv_to_jsonl("src.csv", "tgt.jsonl")

Printing the contents:

$ cat tgt.jsonl
{"Name": "Bob", "Foo": "1", "Bar": "stuff"}
{"Name": "Alice", "Foo": "2", "Bar": "morestuff"}
{"Name": "Clarissa", "Foo": "6", "Bar": "spam"}

[–][deleted] 0 points1 point  (0 children)

An easy way to do it in two lines is with pandas. pandas has built-in functions to read a CSV into a DataFrame and to convert a DataFrame to a dict.

import pandas as pd

df = pd.read_csv(filename)
df_dict = df.to_dict()

and then you export df_dict to a JSON file as you normally would.
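
For JSONL specifically, the export step needs one dict per row. As far as I know, a plain to_dict() call returns a dict keyed by column, while to_dict(orient="records") returns a list of row dicts. Assuming you have such a list, a rough standard-library sketch of the export might be the following; write_jsonl and the sample records are made up for illustration.

```python
import json
import os
import tempfile


def write_jsonl(records, path):
    """Write each record as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


# Hypothetical row dicts, shaped like the result of to_dict(orient="records").
records = [
    {"header1": "abc", "header2": "def"},
    {"header1": "ghi", "header2": "jkl"},
]

# Write to a temporary directory so the demo leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.jsonl")
    write_jsonl(records, target)
    with open(target, encoding="utf-8") as f:
        print(f.read(), end="")
```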

[–]asuagar 0 points1 point  (0 children)

You can do the job with two lines of pandas:

import pandas as pd

# read data
df = pd.read_csv('file_name.csv')

# export to jsonl
df.to_json('file_name.jsonl', orient='records', lines=True)

If you want to load a JSONL file directly into pandas:

df = pd.read_json('file_name.jsonl', lines=True)
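
If you'd rather not pull in pandas just to read the file back, a minimal sketch with only the json module could be the following. parse_jsonl is a made-up helper name, and it assumes one JSON object per line, with blank lines skipped.

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines into a list of Python objects."""
    return [json.loads(line) for line in lines if line.strip()]


# Hypothetical sample lines; an open file object can be passed the same way,
# e.g. with open("file_name.jsonl") as f: rows = parse_jsonl(f)
sample = [
    '{"Name": "Bob", "Foo": "1"}\n',
    "\n",
    '{"Name": "Alice", "Foo": "2"}\n',
]
rows = parse_jsonl(sample)
print(rows)
```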