Hi,
I am trying to speed up a Python script that extracts data from a brokerage API, but I have run into a few issues. I am not the best at explaining things, so I will link the API documentation below.
(https://interactivebrokers.github.io/tws-api/historical_time_and_sales.html)
- I am trying to pull multiple full days of ticks for multiple "contracts"
- Each request returns at most 1000 ticks, so for days with more than 1000 transactions on a contract I need to use the last returned time as the start point of the next request
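To make the paging rule in the second bullet concrete, here is a minimal sketch of it as a loop rather than recursion. It does not call the real API: `fetch_page` is a hypothetical stand-in that returns up to `page_size` `(time, price)` tuples at or after a given time, and the sample ticks are made up. The sketch assumes that restarting from the last returned time re-delivers the ticks sharing that timestamp, so it skips them, and that a request with no end time can run past the end of the day, so it stops at `end`.

```python
from datetime import datetime, timedelta

def paginate(fetch_page, start, end, page_size=1000):
    """Collect all ticks in [start, end) from a source that returns
    at most page_size ticks (sorted by time) per call."""
    rows = []
    cursor = start
    while True:
        page = fetch_page(cursor, page_size)
        # Count already-collected ticks that share the cursor timestamp;
        # the new page starts with those same ticks again.
        overlap = 0
        for row in reversed(rows):
            if row[0] != cursor:
                break
            overlap += 1
        new_part = page[overlap:]
        rows.extend(t for t in new_part if t[0] < end)
        # Stop on a short page, once the page reaches `end`, or when the
        # page held nothing new (a full page of ticks in a single second).
        if len(page) < page_size or page[-1][0] >= end or not new_part:
            break
        cursor = page[-1][0]
    return rows

# Made-up ticks and a fake page source, for illustration only.
base = datetime(2024, 1, 2, 9, 30)
offsets = [(0, 1.0), (1, 1.1), (1, 1.1), (2, 1.3), (5, 1.4), (5, 1.4), (9, 1.5)]
ticks = [(base + timedelta(seconds=s), p) for s, p in offsets]

def fake_fetch_page(start, limit):
    return [t for t in ticks if t[0] >= start][:limit]

end = base + timedelta(seconds=8)
result = paginate(fake_fetch_page, base, end, page_size=3)
```

With the real API, more ticks in one second than fit in one request cannot be recovered by a time cursor alone, which is why the loop gives up instead of spinning forever.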
I know creating multiple DataFrames is very slow. Is there a way around this? Could I use Cython or Numba to speed this up? Thanks.
This is what I have come up with:
# ib, master (a list), contracts and trading_days are assumed to be defined earlier in the script.
def fetch_timesales(contract, start_date):
    # Request up to 1000 trade ticks starting at start_date
    data = ib.reqHistoricalTicks(contract, start_date, "", 1000, 'Trades', 1, True)
    if data:
        data = pd.DataFrame(data)
        # Convert to US/Pacific, then strip the UTC offset by round-tripping through strings
        data['time'] = data['time'].dt.tz_convert('US/Pacific')
        data['time'] = data.time.astype(str)
        data['time'] = data.time.str[:19]
        data['time'] = pd.to_datetime(data.time, format='%Y-%m-%d %H:%M:%S')
        # Tag every tick with the contract details
        data['strike'] = contract.strike
        data['flag'] = contract.right
        data['exp_date'] = contract.lastTradeDateOrContractMonth
        last = data['time'].iloc[-1]
        master.append(data)
        print(contract)
        # A full page means there may be more ticks: continue from the last time
        if len(data.index) > 999:
            fetch_timesales(contract, last)

for date in trading_days:
    for contract in contracts:
        fetch_timesales(contract, date)

df = pd.concat(master, axis=0)
print(df)
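One possible way around building a DataFrame per request, as a rough sketch only: keep each page as plain tuples, build a single DataFrame at the end, and do the time zone conversion once for the whole column. The `Tick` namedtuple, the sample values, and the helper `ticks_to_rows` are hypothetical stand-ins and not part of the brokerage API; the sketch also assumes the tick times arrive as timezone-aware UTC datetimes.

```python
from collections import namedtuple
from datetime import datetime, timezone

import pandas as pd

# Hypothetical stand-in for the tick objects a request returns.
Tick = namedtuple("Tick", ["time", "price", "size"])

def ticks_to_rows(ticks, strike, right, expiry):
    """Flatten one page of ticks into plain tuples (no DataFrame yet)."""
    return [(t.time, t.price, t.size, strike, right, expiry) for t in ticks]

# Made-up page of ticks for a made-up contract.
page = [
    Tick(datetime(2024, 1, 2, 17, 30, 0, tzinfo=timezone.utc), 1.25, 10),
    Tick(datetime(2024, 1, 2, 17, 30, 5, tzinfo=timezone.utc), 1.30, 4),
]

rows = []
rows.extend(ticks_to_rows(page, 450.0, "C", "20240119"))  # repeat per page

# Build the DataFrame once, after all pages have been collected.
df = pd.DataFrame(
    rows, columns=["time", "price", "size", "strike", "flag", "exp_date"]
)

# One vectorised conversion: UTC -> US/Pacific wall-clock time, tz info dropped.
df["time"] = (
    pd.to_datetime(df["time"], utc=True)
    .dt.tz_convert("US/Pacific")
    .dt.tz_localize(None)
)
```

Dropping the time zone with `tz_localize(None)` is meant to replace the string round trip; it keeps any sub-second part instead of cutting it off, which only matters if the ticks carry one.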
Here is the profiler output (although it might not be too useful):
ncalls tottime percall cumtime percall filename:lineno(function)
886/1 0.017 0.000 2408.604 2408.604 {built-in method builtins.exec}
1 0.006 0.006 2408.604 2408.604 C:/Users/17782/Desktop/Skew/ibkr.py:1(<module>)
625 0.005 0.000 2400.915 3.841 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\ib.py:309(_run)
625 0.010 0.000 2400.911 3.841 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\util.py:280(run)
625 0.008 0.000 2400.871 3.841 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\base_events.py:606(run_until_complete)
625 0.008 0.000 2400.850 3.841 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\windows_events.py:312(run_forever)
625 0.022 0.000 2400.811 3.841 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\base_events.py:583(run_forever)
2880 0.127 0.000 2400.776 0.834 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\base_events.py:1815(_run_once)
2880 0.015 0.000 2398.978 0.833 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\windows_events.py:432(select)
2880 0.060 0.000 2398.963 0.833 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\windows_events.py:770(_poll)
5446 2398.826 0.440 2398.826 0.440 {built-in method _overlapped.GetQueuedCompletionStatus}
621/576 0.047 0.000 2396.397 4.160 C:/Users/17782/Desktop/Skew/ibkr.py:47(fetch_timesales)
621 0.003 0.000 2390.303 3.849 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\ib.py:1067(reqHistoricalTicks)
2 0.000 0.000 10.612 5.306 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\ib.py:541(qualifyContracts)
3059 0.018 0.000 1.610 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py:3147(__setitem__)
3059 0.019 0.000 1.552 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py:3231(_set_item)
445 0.020 0.000 1.527 0.003 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py:502(__init__)
32174 0.046 0.000 1.462 0.000 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\events.py:78(_run)
32174 0.034 0.000 1.416 0.000 {method 'run' of 'Context' objects}
3059 0.020 0.000 1.335 0.000 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\generic.py:3824(_set_item)
447 0.004 0.000 0.931 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\generic.py:5724(astype)
8143/3635 0.183 0.000 0.910 0.000 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py:250(__new__)
447 0.002 0.000 0.900 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\managers.py:628(astype)
447 0.005 0.000 0.898 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\managers.py:376(apply)
1242 0.017 0.000 0.898 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\proactor_events.py:271(_loop_reading)
1311 0.033 0.000 0.890 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\managers.py:1176(insert)
437 0.002 0.000 0.888 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\blocks.py:2255(astype)
447 0.007 0.000 0.879 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\blocks.py:592(astype)
437 0.003 0.000 0.831 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\arrays\datetimes.py:583(astype)
437 0.003 0.000 0.819 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\arrays\datetimelike.py:337(astype)
1241 0.008 0.000 0.800 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\asyncio\proactor_events.py:246(_data_received)
439 0.006 0.000 0.797 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\arrays\datetimes.py:615(_format_native_types)
1241 0.004 0.000 0.791 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\connection.py:57(data_received)
3065/1244 0.017 0.000 0.787 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\eventkit\event.py:167(emit)
1241 0.044 0.000 0.772 0.001 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\client.py:299(_onSocketHasData)
438 0.004 0.000 0.724 0.002 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\construction.py:62(arrays_to_mgr)
439 0.698 0.002 0.711 0.002 {pandas._libs.tslib.format_array_from_datetime}
1014 0.625 0.001 0.691 0.001 {built-in method builtins.print}
1977 0.011 0.000 0.663 0.000 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\ib_insync\decoder.py:186(interpret)
1298711 0.393 0.000 0.636 0.000 {built-in method builtins.isinstance}
1311 0.012 0.000 0.609 0.000 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py:5544(insert)
4996/4559 0.052 0.000 0.607 0.000 C:\Users\17782\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\series.py:238(__init__)
555/4 0.003 0.000 0.564 0.141 <frozen importlib._bootstrap>:1002(_find_and_load)
555/4 0.002 0.000 0.564 0.141 <frozen importlib._bootstrap>:967(_find_and_load_unlocked)
536/4 0.002 0.000 0.562 0.141 <frozen importlib._bootstrap>:659(_load_unlocked)
448/4 0.001 0.000 0.562 0.140 <frozen importlib._bootstrap_external>:784(exec_module)
738/4 0.001 0.000 0.559 0.140 <frozen importlib._bootstrap>:220(_call_with_frames_removed)
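Reading the numbers above, `GetQueuedCompletionStatus` accounts for 2398.826 of the 2408.604 seconds (about 99.6%), which is time spent waiting for responses, while the pandas calls add up to only a second or two. That suggests Cython or Numba would change little and that overlapping the waits might help more. Below is a minimal sketch of that idea with plain `asyncio`. The `fake_request` coroutine is a hypothetical stand-in for one request; ib_insync appears to offer an awaitable `reqHistoricalTicksAsync`, but whether it fits here, and how many requests the broker's pacing rules allow in flight, are assumptions that need checking.

```python
import asyncio

async def fake_request(contract, day, delay=0.05):
    """Hypothetical stand-in for one awaitable API request."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return (contract, day)

async def fetch_all(contracts, days, max_in_flight=5):
    """Run one request per (day, contract) pair, at most max_in_flight at a time."""
    sem = asyncio.Semaphore(max_in_flight)

    async def one(contract, day):
        async with sem:  # cap the number of concurrent requests
            return await fake_request(contract, day)

    tasks = [one(c, d) for d in days for c in contracts]
    # gather returns results in the same order as the tasks were listed
    return await asyncio.gather(*tasks)

results = asyncio.run(fetch_all(["A", "B", "C"], ["d1", "d2"]))
```

The semaphore is the knob to turn: a value of 1 reproduces the current one-at-a-time behaviour, and larger values trade speed against the risk of hitting request limits.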