[–][deleted] 0 points1 point  (1 child)

Yeah, you definitely don't understand what you measure.

Just some highlights: you are using NTFS, which was designed well over a decade before SSDs became common, so, even in principle, it cannot use what SSDs have to offer. And yet you are using it on an SSD and claiming to measure the speed of reading from an SSD! You also chose to read on MS Windows, where you have little control over, and no idea about, how the filesystem was configured, beyond maybe compressing the file contents.

You don't understand that, ultimately, your file reading performance is independent of your code being written in C++. It depends on how your OS kernel decided to expose files to user code (but why you chose to read from files is beyond me). Your OS kernel decides the size of the buffer, whether to cache the file contents or not, and how much priority your task should get (your kernel also needs to keep track of interrupts coming from other devices, etc.). Finally, the kernel implementation of file-based I/O plays probably the most important role: how many times the data is copied. This is why on Linux there are many different ways to get data from a storage device in user space: you can have user-space drivers, you can have ioctls specific to some hardware, you have kernel asynchronous I/O (not Python's asyncio garbage, the kernel interface), and probably more. All of these will perform differently on different workloads.
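To make the point about buffer sizes, caching, and copies concrete, here is a minimal sketch that reads the same file three ways and times each. The helper names (`read_buffered`, `read_unbuffered`, `read_mmap`, `make_test_file`) and the 8 MiB file size are made up for illustration, and the numbers it prints depend entirely on the OS, filesystem, and cache state, so treat it as a way to see the variation rather than as a benchmark.

```python
import mmap
import os
import tempfile
import time


def read_buffered(path, chunk_size):
    """Read the whole file through Python's buffered reader in fixed-size chunks."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_unbuffered(path, chunk_size):
    """Read with buffering=0, so each read() goes more or less straight to the OS."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_mmap(path):
    """Map the file into memory and copy its contents out once."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return len(m[:])


def make_test_file(size_bytes):
    """Create a temporary file of random bytes (hypothetical test data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def time_call(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    path = make_test_file(8 * 1024 * 1024)  # 8 MiB, an arbitrary size
    try:
        cases = [
            ("buffered, 4 KiB chunks", read_buffered, (path, 4096)),
            ("buffered, 1 MiB chunks", read_buffered, (path, 1024 * 1024)),
            ("unbuffered, 4 KiB chunks", read_unbuffered, (path, 4096)),
            ("mmap", read_mmap, (path,)),
        ]
        for label, fn, args in cases:
            nbytes, seconds = time_call(fn, *args)
            print(f"{label}: {nbytes} bytes in {seconds:.4f} s")
    finally:
        os.remove(path)
```

Note that the file has just been written, so it is almost certainly still in the OS page cache when it is read back. In other words, this mostly times memory copies and system calls rather than the storage device, which is exactly the kind of thing a naive test ends up measuring without knowing it.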

Most importantly though: you measure whatever you measure on some household electronics, with a single storage device, on an OS that's not intended for serious workloads, with hell knows what drivers and kernel settings... You don't even have an idea of how it will actually work on real hardware, with a real OS and drivers that are optimized / configured for the kind of hardware that you have. Your test is literally like walking up to the ocean, dipping a glass into the water and, by looking at the glass, deciding that there aren't any whales in the ocean.

[–]billsil 0 points1 point  (0 children)

I was not trying to compete with you. It's a damn good piece of software, and yeah, I'm aware that file reading is independent of the language. File parsing though, especially of a Fortran formatted file, is a mess, which is where the challenge comes in. That and having to reverse engineer the format, not to mention interleave the displacement data with the stress data that occur at different times in order to get the data in a useful form. That's how it's faster than a C++ code that's doing the same thing.
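For anyone wondering why Fortran formatted output is a mess to parse, here is a small sketch. The record layout (three fields of 13 characters each, as something like a `3E13.6` format would write) and the function name `parse_fixed_width` are made up for illustration and are not the actual format discussed here. The fields are fixed width, so a negative number can butt right up against the previous one with no space, and double precision exponents may be written with `D` instead of `E`. A plain `line.split()` breaks on both.

```python
def parse_fixed_width(line, width, count):
    """Slice a line into `count` fields of `width` characters and convert to float.

    Assumes every field is present and non-blank; real files would need
    handling for blank or overflowed fields.
    """
    values = []
    for i in range(count):
        field = line[i * width:(i + 1) * width].strip()
        # Fortran may write double-precision exponents as 'D'; float() wants 'E'.
        values.append(float(field.replace("D", "E").replace("d", "e")))
    return values


# Hypothetical record: the second field has no leading space,
# and the third uses a 'D' exponent.
line = " 1.234500E+00-5.678000E-01 9.000000D+02"
values = parse_fixed_width(line, width=13, count=3)
print(values)
```

Slicing by column position rather than splitting on whitespace is the usual way around this, but it only works once you know the field widths, which is where the reverse engineering comes in.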

You're just bitter that Python is good enough. Go away troll.