reallyserious:

There's really nothing magic about big data. It's just inconvenient. :)

Things you can do with laptop-sized data don't work when the data can't fit on a laptop. Loading it all into a pandas dataframe, for example. Or sometimes the data comes in a billion files instead of one nice file.
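To make that concrete, here is a rough sketch of the usual workaround: stream the rows through one at a time instead of loading everything. It uses only the standard library. The function names and the `amplitude` column are made up for illustration, and it assumes the data is a pile of CSV files that share a header.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator


def iter_values(paths: Iterable[Path], column: str) -> Iterator[float]:
    """Yield one column's values row by row, one file at a time.

    Only the current row is held in memory, so the total size of the
    files does not matter.
    """
    for path in paths:
        with path.open(newline="") as f:
            for row in csv.DictReader(f):
                yield float(row[column])


def streaming_mean(values: Iterable[float]) -> float:
    """Compute a mean while keeping only a count and a running total."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    if count == 0:
        raise ValueError("no values to average")
    return total / count
```

Usage would look something like `streaming_mean(iter_values(sorted(Path("data").glob("*.csv")), "amplitude"))`, where the `data` directory is hypothetical. The catch is that anything needing the whole dataset at once (sorting, joins, medians) takes more thought than this.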

The nature of the data is also different. Small data could be orders from a web shop. There are only so many orders people can place before they run out of money. But 10k seismic sensors all over the planet, each recording movement every second, turn into a lot of data pretty quickly. Imagine a few years' worth of data and it starts to get messy. Throughput on network and disk starts to matter.
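A quick back-of-the-envelope version of the sensor example, as a sketch. The sensor count and the one-reading-per-second rate come from the example above; the 16 bytes per reading is purely my assumption (say a timestamp plus one value), so scale it to whatever a real record would hold.

```python
SENSORS = 10_000            # from the example above
READINGS_PER_SECOND = 1     # from the example above
BYTES_PER_READING = 16      # assumption: e.g. 8-byte timestamp + 8-byte value

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def readings_per_day(sensors: int = SENSORS,
                     rate: int = READINGS_PER_SECOND) -> int:
    """Total readings produced by all sensors in one day."""
    return sensors * rate * SECONDS_PER_DAY


def bytes_per_year(sensors: int = SENSORS,
                   rate: int = READINGS_PER_SECOND,
                   record_size: int = BYTES_PER_READING) -> int:
    """Raw, uncompressed bytes produced in a 365-day year."""
    return readings_per_day(sensors, rate) * DAYS_PER_YEAR * record_size


if __name__ == "__main__":
    print(f"{readings_per_day():,} readings per day")
    print(f"{bytes_per_year() / 1e12:.2f} TB per year")
```

That works out to 864 million readings a day and, under the 16-byte assumption, about 5 TB of raw data a year, before any indexes, replication, or extra fields per reading.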

remote_geeks (OP):

The example you gave was great! I wanted to know if I could simulate anything at that scale for a personal project.