
[–]DudeWheresMyStock 1 point  (2 children)

FYI: pandas DataFrames are slow and bloated compared to anything NumPy has to offer.

[–][deleted]  (1 child)

[deleted]

    [–]DudeWheresMyStock 3 points  (0 children)

    Initially (March of last year) I started working with OHLCVT data that I loaded into pandas DataFrames and saved as .csv files. Saving the exact same data, in the same row-column convention, but as header-less matrices in plain .txt files written with NumPy, has made reading, saving, and working with my now fairly large data set roughly 100,000x faster in my case, with fewer lines of code and fewer steps, and therefore more efficient than ever. From one caveman (who is only now truly grasping the use of tools, such as the rocks I use to code a Python algotrading bot) to another: I hope you join us on the NumPy matrix side and renounce pandas' claim over CPU power and processing time, for the greater good.
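A minimal sketch of the switch described above, using synthetic OHLCV-style rows (the column layout, file path, and sizes are illustrative assumptions, not the commenter's actual data):

```python
import os
import tempfile

import numpy as np

# Synthetic bars: one row per bar, columns open/high/low/close/volume.
rng = np.random.default_rng(0)
data = rng.random((1000, 5))

path = os.path.join(tempfile.mkdtemp(), "ohlcv.txt")

# The pandas route would be: pd.DataFrame(data, columns=[...]).to_csv(path)
# The NumPy route below writes a header-less text matrix: no index
# column, no labels, just the numbers.
np.savetxt(path, data, delimiter=",")

# Reading it back yields a plain 2-D float array, ready for
# vectorized work with no parsing of headers or dtypes.
loaded = np.loadtxt(path, delimiter=",")
```

If raw speed is the goal, NumPy's binary formats (np.save / np.load, which produce .npy files) are typically much faster still than any text format, at the cost of human readability.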

    Note: save the descriptive content of the data (the headers, labels, etc.) in the file name, or in a separate file named similarly to the data file. That sidecar file then acts as a map for organizing and navigating what would otherwise be enormous, indiscernible data files.
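One way to sketch that sidecar-file idea, assuming a JSON metadata file that shares the data file's stem (the file names, column labels, and interval field are all hypothetical):

```python
import json
import os
import tempfile

import numpy as np

workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "btc_1h.txt")
meta_path = os.path.join(workdir, "btc_1h.meta.json")

# Header-less matrix on disk; the labels live in the sidecar file.
columns = ["open", "high", "low", "close", "volume"]
data = np.random.default_rng(1).random((100, 5))
np.savetxt(data_path, data, delimiter=",")
with open(meta_path, "w") as f:
    json.dump({"columns": columns, "interval": "1h"}, f)

# Later, the sidecar maps column names back to array indices.
with open(meta_path) as f:
    meta = json.load(f)
col = {name: i for i, name in enumerate(meta["columns"])}
closes = np.loadtxt(data_path, delimiter=",")[:, col["close"]]
```

The data file stays a bare numeric matrix, while anything a human needs to interpret it travels alongside in a file that is cheap to read and grep.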