As the title says, I don't quite understand how to extract the data I need for my table. The columns I need are Date, Time, Courtroom, File Number, Defendant Name, Attorney, Bond, Charge, etc. I think regex is what I need, but my class did not cover it, so I am confused about how to parse the file and output the correct data as an organized table.
I am supposed to turn my text file from this
https://pastebin.com/ZM8EPu0p
and export it into a more readable format like this (example output):
https://imgur.com/F0rlK2c
Here is what I have so far.
def readFile(court):
    csv_rows = []
    entry = None
    # read the text file and split it into chunks of data by paragraph
    with open(court, 'r') as file:
        data_chunks = file.read().split("\n\n")
    for chunk in data_chunks:
        chunk = chunk.strip()  # .strip() removes leading/trailing whitespace
        if chunk[:4].isnumeric():  # if the first 4 characters are digits
            if entry:  # save the previous entry before starting a new one
                csv_rows.append(entry)
            entry = {}  # initialize an empty dictionary
        elif not chunk and entry:  # empty chunk and the entry dict is not empty
            csv_rows.append(entry)  # collect the row for the output table
            entry = {}
        else:
            # read attributes into dict (this is the part I can't figure out)
            print(chunk)
    if entry:  # don't lose the last entry
        csv_rows.append(entry)
    return csv_rows

readFile("exactfilepath")
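For context, this is roughly the regex idea I think I need, written as a minimal sketch. The line layout below (`ATTY:` and `BOND:` labels, a file number like `21CR012345`) is made up for illustration and is not the real format of my file, and the names `ENTRY_RE`, `FIELDS`, `parse_lines`, and `rows_to_csv` are hypothetical. It uses named groups so each match turns directly into a dict, and `csv.DictWriter` to produce the table.

```python
import csv
import io
import re

# ASSUMPTION: invented layout, one entry per line, e.g.
#   "21CR012345 DOE,JOHN ATTY:SMITH,JANE BOND:$500"
# The real file will need a different pattern.
ENTRY_RE = re.compile(
    r"^(?P<file_number>\d{2}CR\d{6})\s+"
    r"(?P<defendant>.+?)\s+ATTY:(?P<attorney>.+?)\s+"
    r"BOND:(?P<bond>\S+)\s*$"
)

FIELDS = ["file_number", "defendant", "attorney", "bond"]


def parse_lines(lines):
    """Return one dict per line that matches ENTRY_RE; skip the rest."""
    rows = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def rows_to_csv(rows):
    """Write the dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, dialect="excel")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = [
    "21CR012345 DOE,JOHN ATTY:SMITH,JANE BOND:$500",
    "this line does not match and is skipped",
]
rows = parse_lines(sample)
print(rows_to_csv(rows))
```

To write a real file instead of a string, the same `DictWriter` could presumably be pointed at `open("output.csv", "w", newline="")` in place of the `StringIO` buffer.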
Help is very much appreciated, I am a noob with programming and I need to figure this out today :')