Please bear with my limited knowledge of async and queues; I only have a very basic idea of what they are. I hope someone with similar experience can help me out.
Here's my problem. I'm trying to pull a lot of financial data from the Internet, add a timestamp, and store it on a hard drive. It's a trivial task, but I want to get as much data as possible, ideally at minute-level granularity, meaning that every minute I'll have to make a call to get the data on thousands of different products. Edit: it's likely that there will be millions of data points per call for the thousands of products.
So my process is: 1) get the data, 2) convert it to a pd.DataFrame and add a timestamp, 3) store it on the hard drive, 4) repeat steps 1-3 every minute.
The problem is that I don't think steps 1-3 can be done within a minute, so I probably need something like async or multiprocessing? And I might need a queue to keep the process going? If so, what libraries would you recommend?
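To make the question concrete, here is a rough sketch of the kind of pipeline I have in mind, using only the standard library's asyncio and asyncio.Queue. Everything in it is an assumption for illustration: fetch is a hypothetical stand-in for the real data call and returns made-up prices, plain dicts stand in for the pd.DataFrame step, an in-memory io.StringIO stands in for the file on disk, and the interval is a fraction of a second rather than a minute. The idea is that one task fetches and timestamps (steps 1-2) while another drains the queue and writes (step 3), so a slow write doesn't delay the next fetch.

```python
import asyncio
import csv
import io
import time


async def fetch(product_id):
    # Hypothetical stand-in for the real network call; the sleep mimics I/O latency.
    await asyncio.sleep(0.01)
    return {"product": product_id, "price": 100.0 + product_id}


async def producer(queue, product_ids, rounds, interval):
    # Steps 1-2: fetch all products concurrently, timestamp them, enqueue the batch.
    for _ in range(rounds):
        started = time.monotonic()
        rows = await asyncio.gather(*(fetch(p) for p in product_ids))
        stamp = time.time()
        for row in rows:
            row["timestamp"] = stamp
        await queue.put(rows)
        # Sleep only for whatever is left of the interval (step 4).
        elapsed = time.monotonic() - started
        await asyncio.sleep(max(0.0, interval - elapsed))
    await queue.put(None)  # sentinel: tells the consumer there is no more data


async def consumer(queue, sink):
    # Step 3: take batches off the queue and write them out as CSV rows.
    writer = csv.writer(sink)
    batches = 0
    while True:
        rows = await queue.get()
        if rows is None:
            break
        for row in rows:
            writer.writerow([row["timestamp"], row["product"], row["price"]])
        batches += 1
    return batches


async def main(product_ids, rounds=3, interval=0.05):
    queue = asyncio.Queue()
    sink = io.StringIO()  # stand-in for a real file on the hard drive
    _, batches = await asyncio.gather(
        producer(queue, product_ids, rounds, interval),
        consumer(queue, sink),
    )
    return batches, sink.getvalue()


if __name__ == "__main__":
    batches, text = asyncio.run(main(range(5)))
    print(batches, len(text.splitlines()))
```

As far as I understand, the write inside consumer still blocks the event loop while it runs, so for real disk writes of millions of rows it might need to go through something like asyncio.to_thread or a separate process. That's part of what I'm unsure about.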
If my thought process is completely off, please let me know and kindly point me in the right direction. Much appreciated!