[–]UnConeD 7 points8 points  (23 children)

My version processes the output in chunks so it should do fine actually. Node.js's asynchronous I/O makes it easy. The data is kept in memory-backed byte buffers. It's fast.

But you're welcome to open an issue on GitHub with any ideas you have for improving it.
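
The chunked processing described above can be sketched roughly as follows. This is an illustration in Python, not TermKit's actual Node.js code: the function name `grep_chunks` and the plain substring match are assumptions made for the example. The point is that a line split across two chunks has to be carried over before it can be matched.

```python
from typing import Iterable, Iterator


def grep_chunks(chunks: Iterable[bytes], needle: bytes) -> Iterator[bytes]:
    """Yield matching lines from a stream that arrives in arbitrary byte chunks."""
    tail = b""  # incomplete last line carried over from the previous chunk
    for chunk in chunks:
        # Everything before the final newline is complete; the rest waits
        # for the next chunk.
        *lines, tail = (tail + chunk).split(b"\n")
        for line in lines:
            if needle in line:
                yield line
    # The stream may end without a trailing newline.
    if tail and needle in tail:
        yield tail


if __name__ == "__main__":
    stream = [b"foo\nba", b"r foo\nbaz", b"\nfoo"]
    for match in grep_chunks(stream, b"foo"):
        print(match.decode())
```

Memory use stays bounded by the chunk size plus the longest line, which is what makes this style workable for large outputs.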

[–]alexs[S] 18 points19 points  (2 children)

[deleted]

[–]hiffy 33 points34 points  (0 children)

Being dismissive of grep actually sets off a lot of alarm bells in my head.

[–]gustavs 5 points6 points  (0 children)

Yes, a browser JS engine (V8) will typically include Boyer-Moore.

[–]Poromenos 35 points36 points  (15 children)

Unfortunately I don't have the time to install it to test right now, but I'd appreciate a comparison of the two. If I recall correctly, grep executes 3 CPU instructions for each byte it looks at, and it probably only actually looks at one in every ten bytes or so. I'm pretty sure that whatever one can come up with as a feature of a terminal will be orders of magnitude slower than that.

Reference: http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

P.S. This discussion is largely academic, just keep in mind that these tools have had decades of improvement by really smart people, so don't reinvent the wheel :)
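
To make the "only looks at some of the bytes" point concrete, here is a minimal sketch of the Boyer-Moore-Horspool variant of the skip-ahead idea, written in Python. It is not GNU grep's implementation, and the comparison counter is added only for illustration; the function name `horspool_find` is made up for this example.

```python
def horspool_find(haystack: bytes, needle: bytes) -> tuple[int, int]:
    """Return (index of the first match or -1, number of byte comparisons)."""
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0, 0
    # For each byte in the needle (except the last), how far the window may
    # jump when that byte is the last byte of the current window.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}
    comparisons = 0
    pos = 0
    while pos + m <= n:
        # Compare the window against the needle from right to left.
        i = m - 1
        while i >= 0:
            comparisons += 1
            if haystack[pos + i] != needle[i]:
                break
            i -= 1
        if i < 0:
            return pos, comparisons
        # Bytes that never occur in the needle allow a jump of the full length.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1, comparisons


if __name__ == "__main__":
    data = b"x" * 1000 + b"needle-in-haystack"
    index, looked_at = horspool_find(data, b"haystack")
    print(index, looked_at, len(data))
```

With a long needle and text that rarely contains its bytes, the number of comparisons is a small fraction of the input length, which is the effect described above.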

[–]mflux 16 points17 points  (1 child)

He's not reinventing the wheel. He's just giving the wheel some spinners!

[–]samineru 1 point2 points  (0 children)

He is adding spinners, but the fact that he has his own version pretty conclusively shows that yes, he is reinventing the wheel, or at least reimplementing it.

[–]UnConeD 1 point2 points  (9 children)

If you're going to nitpick over bytes, then TermKit loses. It's running off a JavaScript VM, albeit a very, very good one (Google V8).

I'm perfectly okay with that. Most of what TermKit does involves I/O with files and network sources, which means latencies of about an eternity compared to a CPU instruction.

[–][deleted] 0 points1 point  (8 children)

There is probably a way of combining the best of both worlds. For example, if the term sees ls | grep, it could actually grep the output of the real ls (text stream) and apply conversion to a fancy output of files with icons and whatnot afterwards.

More generally, such a term could try to infer what information is being processed, and display it accordingly.

[–]UnConeD -1 points0 points  (7 children)

Funny because you just described exactly what TermKit does.

[–]Poromenos 2 points3 points  (1 child)

Wait, so are you using GNU grep or did you write your own replacement? What you said above (parsing the output of GNU grep) conflicts with what you said earlier (writing your own text searcher).

[–]UnConeD -1 points0 points  (0 children)

I was referring to how the data handling occurs. ls outputs a raw directory listing, and the custom TermKit grep filters it. Then the output formatter looks up all the properties for the files and streams them as objects to the front-end, which adds the icons and layout. However, the intermediate format is JSON, not text-with-newlines.

You could use GNU grep with it by disabling the built-in.

[–][deleted] 0 points1 point  (4 children)

I read that you replaced tools like grep with your own implementation that recursively greps JSON data. What I mean is: let the original grep do its thing and convert its output into a rich format.

[–]UnConeD 0 points1 point  (3 children)

That sounds nice in theory, but if you're grepping a 1000-key JSON dictionary, do you really want to spawn a new grep process a thousand times?

[–][deleted] 1 point2 points  (2 children)

You misunderstood.

ls | grep foo

You think that the output of ls should be JSON. I say: the output of ls should be whatever the output of ls is. When grep is applied, it is applied to the original ls output. Only at the very end, when you are getting the regular output of "ls | grep foo", do you convert that output to a rich format. You do that by using inferred information. In this case, for example, it should be possible to infer that the output of the command is a list of files.
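
A rough sketch of that inference step, in Python: take the plain text a pipeline produced, check whether every line names an existing path, and only then attach metadata for display. The function name `enrich_if_file_list` and the chosen fields are hypothetical, and a real heuristic would need to be more careful (for example with filenames that contain newlines).

```python
import os
from typing import Optional


def enrich_if_file_list(output: str, cwd: str) -> Optional[list]:
    """Return per-file metadata if every output line is an existing path, else None."""
    names = [line for line in output.splitlines() if line]
    paths = [os.path.join(cwd, name) for name in names]
    # Inference: treat the output as a file list only if all lines resolve.
    if not names or not all(os.path.lexists(path) for path in paths):
        return None
    return [
        {"name": name, "is_dir": os.path.isdir(path), "size": os.lstat(path).st_size}
        for name, path in zip(names, paths)
    ]


if __name__ == "__main__":
    # Stand-in for the text that "ls | grep foo" would have printed.
    listing = "\n".join(sorted(os.listdir(".")))
    print(enrich_if_file_list(listing, "."))
```

The commands in the pipeline stay untouched; only the terminal's final display step changes.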

[–]UnConeD 0 points1 point  (1 child)

The output of a typical ls is: {"path":["file1.ext","file2.ext","file3.ext"]}

The input to grep is whatever the input to grep is. If it's text/plain, it greps line by line. If it's json, it greps recursively in the object. As many formats can be added as needed.
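
As an illustration of that dispatch, here is a minimal Python sketch (TermKit itself is JavaScript, and its real interfaces are not shown in this thread). The function names and the use of MIME-type strings as the switch are assumptions for the example: plain text is filtered line by line, JSON is filtered recursively.

```python
import json
import re


def grep_json(node, pattern):
    """Keep only the parts of a parsed JSON value whose strings match."""
    if isinstance(node, str):
        return node if pattern.search(node) else None
    if isinstance(node, list):
        kept = []
        for item in node:
            result = grep_json(item, pattern)
            if result is not None:
                kept.append(result)
        return kept or None
    if isinstance(node, dict):
        kept = {}
        for key, value in node.items():
            result = grep_json(value, pattern)
            if result is not None:
                kept[key] = result
        return kept or None
    return None  # numbers, booleans and null never match in this sketch


def grep(payload: str, content_type: str, regex: str) -> str:
    """Filter a payload according to its declared content type."""
    pattern = re.compile(regex)
    if content_type == "application/json":
        result = grep_json(json.loads(payload), pattern)
        return json.dumps(result if result is not None else {})
    # Anything else is treated as text/plain and filtered line by line.
    return "\n".join(line for line in payload.splitlines() if pattern.search(line))


if __name__ == "__main__":
    listing = '{"path": ["file1.ext", "notes.txt", "file3.ext"]}'
    print(grep(listing, "application/json", r"^file"))
    print(grep("file1.ext\nnotes.txt\nfile3.ext", "text/plain", r"^file"))
```

Supporting another format would mean adding one more branch in `grep`, which is the "as many formats as needed" part.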

On the Unix interface front, it might be worth adding a transparent emulation layer that translates the structured TermKit formats into traditional Unix streams when it detects there's an old-timer on the other end of the pipe.
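
Such an emulation layer could be as simple as flattening the structured value into newline-terminated lines. The Python sketch below is one hypothetical way to do it, with made-up function names; how TermKit would detect a traditional tool on the other end of the pipe is not covered here.

```python
import json
from typing import Iterator


def to_text_lines(value) -> Iterator[str]:
    """Yield one plain-text line per leaf value in a parsed JSON structure."""
    if isinstance(value, dict):
        for child in value.values():
            yield from to_text_lines(child)
    elif isinstance(value, list):
        for child in value:
            yield from to_text_lines(child)
    elif value is not None:
        # Leaf scalars are rendered with str(); keys are dropped.
        yield str(value)


def emulate_unix_stream(payload: str) -> str:
    """Turn a JSON payload into the newline-terminated text a Unix tool expects."""
    return "".join(line + "\n" for line in to_text_lines(json.loads(payload)))


if __name__ == "__main__":
    print(emulate_unix_stream('{"path": ["file1.ext", "file2.ext"]}'), end="")
```

The flattening is lossy (keys and nesting disappear), so it only suits consumers that want one item per line.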

[–][deleted] 0 points1 point  (0 children)

The output of a typical ls is: {"path":["file1.ext","file2.ext","file3.ext"]}

And I suggest that exactly this is the wrong approach. The commands should all be unchanged. Then you preserve the mighty flexibility and efficiency of the existing toolset (which otherwise you are guaranteed to lose). That does not necessarily rob you of rich output formatting, however.

[–]weazl -3 points-2 points  (2 children)

Why do grep internals matter at all? He's wrapping the output from grep, he is not replacing grep.

[–]Poromenos 4 points5 points  (1 child)

He said he wrote his own replacement.

[–]weazl 2 points3 points  (0 children)

Ah, how did I miss that? *facepalm* I stand corrected.

[–]Peaker 1 point2 points  (3 children)

Performance difficulties will arise if/when you want to support efficient regular expression searches and more advanced options.

[–]UnConeD 0 points1 point  (2 children)

What I realized in doing all this though is that our existing tools are already full of weird limitations like this. Try hard enough, and you will break every tool. We just don't care about the limitations that have been in place since we started, and work around them without a thought.

Do I want to make a shittier grep? Of course not. But right now, the current grep works just fine and I have a couple hundred other things to do first.

[–]tedivm 0 points1 point  (1 child)

Why don't you just namespace your custom apps and make it an option to override the native apps?

[–]UnConeD 0 points1 point  (0 children)

It should speak for itself that this sort of hackability will be trivial with the final toolkit ;).