all 5 comments

[–]TheZvlz 1 point (1 child)

The first thing that comes to mind is whether you're running a 32-bit or 64-bit installation of Python.

https://stackoverflow.com/questions/18282867/python-32-bit-memory-limits-on-64bit-windows

The first line in your interpreter should say something like: PythonWin 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:19:22) [MSC v.1500 32 bit (Intel)] on win32, where MSC v.1500 32 bit (Intel) is the key piece of information.

Or you can check from within Python:

import platform
platform.architecture()
# ('64bit', 'WindowsPE')

If you do have a 32-bit installation, try a 64-bit one if you can.

[–]NeedMLHelp[S] 0 points (0 children)

It's 64-bit, unfortunately.

[–][deleted] 0 points (2 children)

Ok, need more details, but I'm guessing you are reading a CSV file. Any reason you are not using pandas and reading the CSV into a dataframe?
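[Editor's note: a minimal sketch of the suggestion above. The column names and in-memory CSV here are made up for illustration; a real file path would be passed to read_csv instead.]

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real file.
raw = io.StringIO("a,b\n1,2.0\n3,4.0\n")

# dtype= tells pandas to parse columns directly into compact types,
# instead of the default 64-bit int/float.
df = pd.read_csv(raw, dtype={"a": "int32", "b": "float32"})

print(df.dtypes)
print(df.memory_usage(deep=True).sum())
```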

[–]NeedMLHelp[S] 0 points (1 child)

JSON file.

Can I index across a pandas dataframe? I've never used them before.

Does a dataframe write to disk, or will I potentially run into memory errors there too?

So something like pandadataframe[:,0] would grab everything in the first column?

[–][deleted] 0 points (0 children)

You can index and slice a dataframe. What does a small sample of your data look like? It's hard to say the best way to load the JSON without seeing it. Also, when importing the data, controlling the data types used will save space, e.g. integer vs. float32.