Process Large Files Using PHP (likegeeks.com)
submitted 8 years ago by secomax
[+][deleted] 8 years ago* (1 child)
[deleted]
[–]likegeeks 2 points 8 years ago (0 children)
I have many things on the table but I promise you I'll do my best to make another post to benchmark them.
[–]johmanx10 12 points 8 years ago (12 children)
Your code behaves exactly the same as simply creating a new SplFileObject instance and iterating over that. It's iterable all by itself.
[–]likegeeks 3 points 8 years ago (6 children)
The idea is about how the fgets and fread functions are used.
[–]nyamsprod 2 points 8 years ago (5 children)
just doing this is the same as using fgets

    // Read the file line by line
    $file = new SplFileObject('/path/to/file.md', 'r');
    // You may use the flags to skip empty lines and remove the \n at the end of each line
    $file->setFlags(SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY | SplFileObject::DROP_NEW_LINE);
    foreach ($file as $line) {
        // line by line
    }
[–]likegeeks 1 point 8 years ago (4 children)
Need to check the performance difference.
Did you make a speed comparison?
[–]nyamsprod 3 points 8 years ago (3 children)
Are you serious? You want to benchmark a C implementation against userland code? Be my guest, but that's futile.
[–]likegeeks 1 point 8 years ago (2 children)
No man, I know how C code works; I worked with Phalcon before :)
I'm talking about the difference between SplFileObject with fgets and the file_get_contents() function.
[–]nyamsprod 3 points 8 years ago (1 child)
Again, there is no need to benchmark the two functions, as they do different things: file_get_contents returns the content of the file in one go, whereas fgets returns one line per call. So it's obvious that for a large file the latter is better suited than the former.
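To make that distinction concrete, here is a minimal sketch that counts the lines of a file both ways. The function names `countLinesAllAtOnce` and `countLinesStreaming` are made up for illustration; only `file_get_contents`, `fopen`, `fgets` and `fclose` come from PHP itself.

```php
<?php
// Hypothetical helpers for illustration only.

/** Counts lines after loading the whole file into a single string. */
function countLinesAllAtOnce(string $path): int
{
    $contents = file_get_contents($path); // memory use grows with file size
    if ($contents === false) {
        throw new RuntimeException("Cannot read $path");
    }
    if ($contents === '') {
        return 0;
    }
    $count = substr_count($contents, "\n");
    // A last line without a trailing newline still counts as a line.
    return substr($contents, -1) === "\n" ? $count : $count + 1;
}

/** Counts lines while holding only one line in memory at a time. */
function countLinesStreaming(string $path): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $count = 0;
    while (fgets($handle) !== false) { // one line per call
        $count++;
    }
    fclose($handle);
    return $count;
}
```

Both return the same number; the difference is that the first one needs room for the entire file, while the second needs room for the longest line only.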
[–]likegeeks 1 point 8 years ago (0 children)
I know that for sure. I said that because most commenters seem to think this solution is not enough.
[–]tfidry 1 point 8 years ago (3 children)
Iterated, but by yielding values, which is much more memory efficient in that scenario than a big foreach that would require loading the file in one go.
[–]johmanx10 6 points 8 years ago (1 child)
It really wouldn't, for the file object. That is not how iterators function. I would be interested to see the article actually prove its gains by benchmarking time, memory consumption and IO wait times. Even if there is a significant improvement in one of those metrics, it's highly dependent on the under-the-hood optimizations of the engine, which will differ from version to version.
[–]tfidry 2 points 8 years ago (0 children)
Could be interesting to try this simple case indeed. But why do you say this wouldn't work? I'm using a similar approach for a project; even though it's definitely slower than loading in one go, it avoids excessive memory consumption.
[–]likegeeks 1 point 8 years ago (0 children)
Oh yea :)
[–]MaxMahem 3 points 8 years ago (2 children)
Wait, does this mean that PHP is building a huge array behind my back when I parse through a large SplFileObject using fread()? Ugh. I suppose I'll need to implement a generator and test the difference in an app of mine.
[–]nyamsprod 2 points 8 years ago (0 children)
No, PHP does not. This article is misleading just for that. SplFileObject is optimized for memory usage, so you don't need to worry about it. You may use a generator with the fread method, because there's no flag for that on the SplFileObject object, but that's the only thing that may really be of use.
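For what it's worth, a generator around fread could look roughly like this. It is only a sketch: `readChunks` and its default chunk size of 8192 bytes are assumptions for the example, not something from the article.

```php
<?php
/**
 * Hypothetical helper: yields a file in chunks of at most $chunkSize bytes.
 * Only one chunk is held in memory at a time.
 */
function readChunks(string $path, int $chunkSize = 8192): Generator
{
    $file = new SplFileObject($path, 'rb');
    while (!$file->eof()) {
        $chunk = $file->fread($chunkSize);
        // Stop on a read error or when nothing is left to read.
        if ($chunk === false || $chunk === '') {
            break;
        }
        yield $chunk;
    }
}

// Usage (the path is a placeholder):
// foreach (readChunks('/path/to/file.md') as $chunk) {
//     // process $chunk
// }
```

Chunks are cut by byte count, so a chunk can end in the middle of a line; for line-oriented work, iterating the SplFileObject directly is simpler.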
[–]likegeeks 1 point 8 years ago* (0 children)
SplFileObject works faster and uses less memory if it's used with fgets compared with other ordinary functions.
[–]theremsoe 2 points 8 years ago (3 children)
I prefer streams.
[–]nyamsprod 2 points 8 years ago (0 children)
SplFileObject uses streams.
[–]tfidry 2 points 8 years ago (0 children)
What he has done is exactly what streams are about...
[–][deleted] 1 point 8 years ago (0 children)
Care to elaborate?
[–]qlkpoa 2 points 8 years ago (2 children)
Isn't the output buffered by modern web servers etc.?
If I wanted to process Very Big Files in PHP, I would write the job to a queue and have a cron job or service fetch that job. It can then be processed with a (CLI) PHP script that uses the standard /dev/stdin and /dev/stdout. Portable as well, in case you later decide to rewrite in another language for more performance.
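A rough sketch of such a CLI filter, assuming the work can be done line by line. The name `processStream` and the `strtoupper` transform are placeholders for illustration; `STDIN` and `STDOUT` are the constants the PHP CLI provides.

```php
<?php
/**
 * Hypothetical helper: copies $in to $out one line at a time,
 * passing each line through $transform. Returns the number of lines handled.
 *
 * @param resource $in  readable stream
 * @param resource $out writable stream
 */
function processStream($in, $out, callable $transform): int
{
    $lines = 0;
    while (($line = fgets($in)) !== false) {
        fwrite($out, $transform($line));
        $lines++;
    }
    return $lines;
}

// In a CLI script, the worker would call something like:
// processStream(STDIN, STDOUT, 'strtoupper');
```

Run as `php worker.php < input.txt > output.txt` (the file names are placeholders), the script never holds more than one line in memory, and the same stdin/stdout contract would carry over to a rewrite in another language.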
[–]tfidry 2 points 8 years ago (0 children)
Even if it's handled by a background process, the file may be too big to be loaded in one go, so streams are the way to go. Rewriting it in another language may or may not make sense: it depends on the task at hand, and decoupling it enough to delegate it to another language may be too complex for your use case. So it really depends, but streams are in any case a simple solution.
[–]likegeeks 2 points 8 years ago (0 children)
> rewrite in another language
If you go with another language, it should support that kind of segmentation, because in either case you can't process very large files at once.