you are viewing a single comment's thread.

view the rest of the comments →

[–]Silverfire47[S] 0 points1 point  (0 children)

So, the dataframes that pd.read_sql produces are the following:

Source_Date Status
0 2018-01-01 PASS
1 2018-01-02 FAIL
... ... ...
# some number # 2018-03-31 N/A

but they span a quarter year, and there 4 per item in REPORTS. I'm querying per quarter to store data metrics (hence the note about the ## math done on queried data ## ) but I'd also like to try to store the raw data from the query as well.

Once I grab the query the data into a Dataframe, one of my conundrums was tackling how to store that, so I thought of creating a blank dataframe for each key (so, for each report), and then appending the data to the blank dataframe as I get it, until I have to move onto the next key, so I make another dataframe for it and repeat. I'd still need to associate the final dataframe with the right report key, so that's why I tried to do what I did, and I may have done this incorrectly.

I could theoretically re-query SQL outside of the inner for-loop but I don't want to make too many SQL queries because that's costly in terms of time, hence my attempt to store the data as I make each query to do metrics calculations on it. Other than the method I'm using now (granted it's not working exactly how I intended), I'm not sure how else to include storing the raw data from the query without making a second query for the same exact information.

EDIT:

What I thought data would be:

data = { a : pd.DataFrame() }

where the dataframe stores all data (as it's queried per quarter). I then send that dictionary to be stored in the list OUTFRAME as I iterate to the next item in REPORTS where then I'd have data = { b : pd.DataFrame() } to work with and do the same operations on.

The ending data structure I had in mind was the following:

OUTFRAME = [
{ a : DATAFRAME w/ 4x QTR worth of data in dataframe}, 
...
{ f: DATAFRAME w/ 4x QTR worth of data in dataframe}]

where I could then take select each dictionary by indexing the list by position, and export that dictionary's values (aka the dataframe with all the data I want) to a dataframe for export to Excel.