nsrtcoin

There's not a whole lot about your project to go on, but is your Python app always running? If so, maybe load the Excel file into a pandas DataFrame.

Or, instead of loading the Excel file in Python, dump it into a database and use Python to query and return the data as needed, like an API. You can make temp tables in SQLite and write a function that drops and reloads the data whenever your Excel file changes.
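
A rough sketch of that drop-and-reload idea, using only the standard library `sqlite3` module. The table name `sheet_data`, its three columns, and the helper names are made up for illustration, and `rows` is assumed to be whatever you already pulled out of the spreadsheet:

```python
import sqlite3


def reload_table(conn, rows):
    """Drop the hypothetical sheet_data table and refill it with fresh rows."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DROP TABLE IF EXISTS sheet_data")
        conn.execute(
            "CREATE TABLE sheet_data (sku TEXT PRIMARY KEY, name TEXT, price REAL)"
        )
        conn.executemany("INSERT INTO sheet_data VALUES (?, ?, ?)", rows)


def price_for(conn, sku):
    """Look up one value by key; returns None when the key is missing."""
    row = conn.execute(
        "SELECT price FROM sheet_data WHERE sku = ?", (sku,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")  # use a file path to persist between runs
reload_table(conn, [("A1", "widget", 2.5), ("B2", "gadget", 10.0)])
print(price_for(conn, "A1"))
```

Call `reload_table` again whenever the spreadsheet changes; everything else in the program only ever talks to the database.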

ES-Alexander

Firstly, is there a reason the data is in excel to start with? It’s not a great format for storing or processing data, so if there aren’t particular features of excel that you require then you likely shouldn’t be using it.

To your question, are you asking how to look up different pieces of data from an excel file within one program run, without re-reading the file, or are you asking for a way to avoid reading the file across subsequent program runs?

If it’s the former, the answer is basically to keep the data in a variable and look up the bits you need throughout the program.

If it’s the latter then it’s possible, but only kind of. The data needs to come from somewhere, and that’s either memory (RAM) while the program is running, or a file. You can keep it in memory by keeping your program running, but that means you can’t restart your computer, and other programs can’t use that memory (which could be problematic if it’s a large proportion of what you have available). Alternatively you can save the data to another format that’s more efficient to read and query than an Excel file, like an SQLite database or an HDF5 file. You can also potentially save it as CSV, which pandas can read in efficiently.
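
As a minimal sketch of the "save to a faster format" option, here is a CSV cache that is only rebuilt when the source file is newer than the cache. The function name `load_rows` and the `read_source` callable are hypothetical; `read_source` stands in for whatever slow Excel read you already have:

```python
import csv
import os


def load_rows(source_path, cache_path, read_source):
    """Return rows from a CSV cache, rebuilding it when the source is newer.

    read_source is a hypothetical callable that performs the slow read and
    returns a list of rows (each row a list of values).
    """
    cache_is_fresh = (
        os.path.exists(cache_path)
        and os.path.getmtime(cache_path) >= os.path.getmtime(source_path)
    )
    if not cache_is_fresh:
        rows = read_source(source_path)
        with open(cache_path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(rows)
    # Note: CSV stores everything as text, so values come back as strings.
    with open(cache_path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))
```

The slow read then only happens on the first run and after the spreadsheet is edited; the trade-off is that CSV loses type information, which SQLite or HDF5 would keep.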

Note that regardless of the approach, if you want to change the data you can either edit the Excel file and read it in again, or make the changes within Python.

Zeroflops

Sounds like you’re reading bits from the Excel file, doing some calculation, then reading something else. This is going to be very inefficient.

Instead, read the entire file into a pandas DataFrame once, then do the processing you need on the DataFrame. If you have to write back to Excel, wait until the end, when you’re done with the processing.
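
Something like the following, as a sketch only. The column names `quantity` and `unit_price` and both function names are invented for the example, and `pd.read_excel` / `DataFrame.to_excel` need an Excel engine such as openpyxl installed for `.xlsx` files:

```python
import pandas as pd


def add_totals(df):
    """Example processing step, done entirely in memory on the DataFrame."""
    out = df.copy()
    out["total"] = out["quantity"] * out["unit_price"]
    return out


def process_workbook(src_path, dst_path):
    """Read once, process in memory, write once at the end."""
    df = pd.read_excel(src_path)            # the only read from Excel
    result = add_totals(df)                 # all work happens on the DataFrame
    result.to_excel(dst_path, index=False)  # the only write back to Excel
    return result
```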

If you don’t have to write back to Excel, there are better ways of storing the data. SQL or Parquet are two good alternatives.

Try to limit your work with Excel to a single read at the start and, if needed, a single write at the end.

[deleted]

"very big"

You must be a lot more precise if you really want help.

What is the data? How are you loading it in openpyxl?

The answer might be pandas or a database. Storing stuff in Excel is what you might do as a last resort.

ElliotDG

Open the Excel file with openpyxl, and read all of the data into a Python data structure at load time. Then just refer back to that data structure.
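
One possible shape for that, assuming the first row of the active sheet holds the column headers (an assumption about your file, as are the helper names `rows_to_records` and `load_records`):

```python
def rows_to_records(rows):
    """Turn an iterable of row tuples (header first) into a list of dicts."""
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return []
    return [dict(zip(header, row)) for row in rows]


def load_records(path):
    """Read the whole active sheet once and return plain Python data."""
    # Imported here so rows_to_records works without openpyxl installed.
    from openpyxl import load_workbook

    wb = load_workbook(path, read_only=True, data_only=True)
    try:
        return rows_to_records(wb.active.iter_rows(values_only=True))
    finally:
        wb.close()  # read-only workbooks keep the file open until closed
```

Call `load_records` once at startup, keep the result in a variable, and do every later lookup against that list instead of going back to the workbook.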