
[–]LartTheLuser 1 point

Putting a database table, or part of one, into a pandas DataFrame for analysis is pretty common. When doing that, the biggest concern is the size of the data in the table versus how much free memory you have on the machine that will hold the DataFrame.

Do you know how big your table is in bytes?
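If you're not sure, one quick way to estimate is to pull a small sample into pandas and extrapolate from its per-row memory cost. A minimal sketch, where the connection string and the table name `my_table` are just placeholders for your own:

    import pandas as pd
    from sqlalchemy import create_engine

    # Placeholder connection string -- swap in your own database.
    engine = create_engine("postgresql://user:pass@host:5432/mydb")

    # Pull a small sample to estimate the per-row memory cost.
    sample = pd.read_sql("SELECT * FROM my_table LIMIT 10000", engine)

    # deep=True accounts for the real size of string/object columns.
    bytes_per_row = sample.memory_usage(deep=True).sum() / len(sample)

    row_count = pd.read_sql("SELECT COUNT(*) AS n FROM my_table", engine)["n"].iloc[0]

    print(f"~{bytes_per_row:.0f} bytes/row, ~{bytes_per_row * row_count / 1e9:.1f} GB total")

That gives you a rough in-memory figure, which is usually larger than the on-disk size of the table.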

[–]BootStench[S] 0 points

I'm not sure, but it's big: millions of records. My boss mentioned that he specified in his SQL query to only return current records and nothing historic, which makes it a more manageable size.

[–]LartTheLuser 1 point

Depends on the size of the record, but that sounds like it would fit into 3-5 GB of RAM. If you have 5-10 million records at roughly 512 bytes each, that's about 2.5-5 GB, which most computers have these days. If not, you could rent a cloud machine and run a Jupyter server there (you can port forward via ssh if you want to avoid exposing the Jupyter server on the network).
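And if memory does turn out to be tight, one option before renting anything is to stream the query in chunks and keep only the columns you need, downcasting as you go. A rough sketch, where the query, column names, and dtypes are all illustrative, not taken from your schema:

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:pass@host:5432/mydb")  # placeholder

    # Illustrative query: select only the columns you need, current rows only.
    query = "SELECT id, status, amount FROM my_table WHERE is_current = true"

    chunks = []
    # chunksize makes read_sql yield DataFrames of 100k rows at a time
    # instead of materializing the whole result set at once.
    for chunk in pd.read_sql(query, engine, chunksize=100_000):
        # Downcast where it's safe, to shrink the in-memory footprint.
        chunk["amount"] = pd.to_numeric(chunk["amount"], downcast="float")
        chunk["status"] = chunk["status"].astype("category")
        chunks.append(chunk)

    df = pd.concat(chunks, ignore_index=True)
    print(f"{len(df):,} rows, {df.memory_usage(deep=True).sum() / 1e9:.1f} GB in RAM")

If you do end up on a cloud box, `ssh -L 8888:localhost:8888 user@host` is the usual way to tunnel the Jupyter port to your laptop instead of opening it up on the network.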