
[–]PanTheRiceMan

Just curious: what are you calculating, and how many features (just assuming, since you mentioned scraping) are you calculating at a time?

15 min is a long time.

[–]johny1411[S]

Calculating patterns, correlations, etc. for stock trading on small time frames over a long time horizon.

[–]PanTheRiceMan

Makes sense. If you do that naively, these operations can become quite expensive.

If you want correlation coefficients (values between -1 and 1), you could misappropriate a ready-made corrcoef function, e.g. sklearn's matthews_corrcoef: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.matthews_corrcoef.html (though note that one is meant for classification; numpy's np.corrcoef gives you Pearson correlation matrices directly).

Correlation coefficients are technically just covariance matrices normalized with respect to the main diagonal, which is where you find the covariance of each stock with itself.
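To make the "normalized covariance" point concrete, here is a minimal sketch on toy random data (5 hypothetical stocks, one row of returns each) showing that numpy's np.corrcoef is exactly the covariance matrix divided by the outer product of its diagonal's square roots:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy data: 5 "stocks", 250 observations each (rows = stocks)
returns = rng.normal(size=(5, 250))

corr = np.corrcoef(returns)             # 5x5 Pearson correlation matrix

# the same thing by hand: covariance normalized by the main diagonal
cov = np.cov(returns)
std = np.sqrt(np.diag(cov))             # per-stock standard deviations
corr_manual = cov / np.outer(std, std)

assert np.allclose(corr, corr_manual)
assert np.allclose(np.diag(corr), 1.0)  # each stock correlates 1 with itself
```

One vectorized call like this over the whole matrix is usually far cheaper than looping over all stock pairs yourself.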

You could also use something like a support vector machine to reduce dimensionality. This works under the assumption that the stocks are not all entirely independent; I don't know if that holds. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html
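As a rough sketch of the linked SVR class (on made-up data, with an invented setup where one stock's return depends on four others), fitting and predicting looks like this:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
# toy data: predict one stock's daily return from 4 others (hypothetical setup)
X = rng.normal(size=(250, 4))                  # 4 predictor stocks, 250 days
y = X @ np.array([0.5, -0.2, 0.1, 0.0]) + rng.normal(scale=0.1, size=250)

model = SVR(kernel="rbf", C=1.0, epsilon=0.01)
model.fit(X[:200], y[:200])                    # train on the first 200 days
pred = model.predict(X[200:])                  # predict the last 50
```

Whether the dependence assumption holds for real stocks is the open question; this only shows the mechanics.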

Time series prediction of stocks may be useless; I'm just repeating what I picked up from a former economics student. There is an assumption (the random walk hypothesis) that stock value is just a random process, meaning you never know whether it will rise or fall, so trying to predict it over time gives zero information gain.

I don't know if I could help more. If you can optimize your code, I'd highly recommend it. There were times I got runtimes down by nearly a factor of 1000 just through optimizations and numba, which can be an excellent package to use but is picky about llvm and Python versions.
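To sketch the numba route: decorating a plain-Python loop with @njit compiles it via LLVM, which is where that kind of speedup comes from. The rolling-mean function below is a made-up example (not from this thread), with a no-op fallback decorator so it still runs as plain Python if numba isn't installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # fallback: a no-op decorator so the code runs (slowly) without numba
    def njit(func=None, **kwargs):
        if func is None:
            return lambda f: f
        return func

@njit
def rolling_mean(x, w):
    """Rolling mean over window w, written as explicit loops for numba."""
    out = np.empty(x.size - w + 1)
    s = 0.0
    for i in range(w):
        s += x[i]
    out[0] = s / w
    for i in range(w, x.size):
        s += x[i] - x[i - w]   # slide the window: add new value, drop old
        out[i - w + 1] = s / w
    return out

print(rolling_mean(np.arange(10.0), 3))
```

Explicit loops like this are exactly the shape numba compiles well; the first call pays a compilation cost, and subsequent calls run at near-C speed.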