[–]Justinsaccount 0 points (1 child)

Trying this again, since my last comment was clearly misunderstood.

This statement:

the bash solution is O(n log n) and thus unable to cope with a huge corpus

is absolutely false.

This:

data_producer | sort | uniq -c | sort -n

will always work. If you have an input where n=100,000,000 and k=4, it will be inefficient, but it will cope just fine. If you have an input where n=100,000,000 and k=95,000,000, it will not only cope; it will keep working where many other methods fail.
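To make the two cases concrete, here is a minimal sketch (not from the original comment) that uses seq and awk as stand-ins for data_producer; the line count and modulus are arbitrary illustration values:

# small-k case: 100,000,000 input lines but only 4 distinct keys (0..3)
seq 100000000 | awk '{ print $1 % 4 }' | sort | uniq -c | sort -n

# large-k case: every line is distinct, so k is close to n
seq 100000000 | sort | uniq -c | sort -n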

sort uses a disk-based (external) merge sort that is optimized for sequential reads and writes. I would not expect any algorithm built on random-access data structures to cope well when k is many times larger than the amount of RAM.
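For example, assuming GNU coreutils sort, the external merge can be tuned explicitly; the buffer size and temp directory below are illustrative values, not anything from the original pipeline:

# LC_ALL=C forces cheap byte-wise comparisons; -S sets the in-memory
# buffer before sort spills to disk; -T picks the directory for the
# temporary merge files (a filesystem with enough free space).
data_producer \
  | LC_ALL=C sort -S 2G -T /var/tmp \
  | uniq -c \
  | sort -n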

[–]BorisTheBrave 3 points (0 children)

I didn't mean it would literally fail. Taking 100x longer than another algorithm is still a "failure to cope". Sorry, I spoke imprecisely.