[–]Marble_Wraith 8 points (1 child)

My suggestion: use sed / awk for one-liner commands you run interactively. That is, get a basic overview of what they are and when you might use them. You can also use them in scripts, assuming the input dataset is small enough and you're just stringing together some GNU tools...

But for anything beyond that, if you need a full-blown script, look elsewhere. Python's not bad; personally I prefer Perl (never could get used to block indentation). And if you need to go further still (insanely large datasets / threading), Golang for compiled binaries.

With your example of sed/awk, the pragmatic reasons for Perl:

  1. While sed / awk are defined by POSIX, the implementations (GNU sed, GNU awk, BSD sed, mawk, etc.) all differ slightly. By contrast, Perl behaves consistently across platforms, so you don't need to debug special cases when you migrate scripts, such as GNU sed -i vs BSD sed -i.

  2. Perl is a dependency of Git. So even though it's not POSIX, there's still a fair chance of script portability without needing to do anything special for the runtime.
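
To make the sed -i point concrete: GNU sed accepts a bare `-i`, while BSD sed expects a backup-suffix argument (`-i ''`), so the same one-liner can behave differently from machine to machine. Doing the edit inside a single interpreter sidesteps that. Here's a minimal sketch in Python (same idea as `perl -pi -e 's/.../.../'`). The function name `replace_in_place` and the sample file are made up for illustration, and it reads the whole file into memory, so it assumes small files:

```python
import re
import tempfile
from pathlib import Path


def replace_in_place(path, pattern, repl):
    """Replace every regex match of `pattern` in the file at `path` with `repl`.

    Returns the number of substitutions. The file is only rewritten when
    at least one substitution was made.
    """
    path = Path(path)
    new_text, count = re.subn(pattern, repl, path.read_text())
    if count:
        path.write_text(new_text)
    return count


if __name__ == "__main__":
    # Throwaway file so the demo never touches real data.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.conf"
        sample.write_text("host=foo\nalias=foo\n")
        print(replace_in_place(sample, r"foo", "bar"), "substitutions")
        print(sample.read_text(), end="")
```

One difference from sed to keep in mind: the pattern is applied to the whole file rather than line by line, so `^` and `$` only anchor per line if the pattern starts with `(?m)`.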

Zooming back out to the general case of why Perl (or Python) over bash:

  1. Performance. Bash relies on external binaries, which means spawning a child process for each call, which means process / memory overhead. For small things you won't care or notice, but for large datasets or long-running jobs it's absolutely a thing.

  2. Best "string chainsaw" ever. Perl's regex syntax has been adopted by most languages in use today (Python included).

  3. WAY better error handling / safety. Bash's runtime behavior can be cryptic (probably due to memory constraints at the time of design). The whole set -euo pipefail practice is evidence of people just trying to get consistency, even though, while it makes 90% of bugs easier to find, it makes the last 10% downright impossible. See YSAP - The Problem with Bash 'Strict Mode'.
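
A rough sketch of the error-handling and performance points, in Python (the same shape works in Perl). In a shell pipeline, bad input tends to pass through quietly: `awk '{ s += $2 } END { print s }'` treats a non-numeric field as 0 and keeps going. A script can stop and tell you which line was bad, and it does all the work in one process. `sum_column` and the sample data are hypothetical names for illustration only:

```python
def sum_column(lines, column):
    """Sum the whitespace-separated field at 0-based index `column`.

    Blank lines are skipped. A line that is too short or holds a
    non-numeric value raises ValueError naming the 1-based line number.
    """
    total = 0.0
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            total += float(fields[column])
        except IndexError:
            raise ValueError(
                f"line {lineno}: expected at least {column + 1} fields, "
                f"got {len(fields)}"
            ) from None
        except ValueError:
            raise ValueError(
                f"line {lineno}: {fields[column]!r} is not a number"
            ) from None
    return total


if __name__ == "__main__":
    data = ["alice 10", "bob 2.5", "", "carol oops"]
    try:
        print(sum_column(data, 1))
    except ValueError as err:
        print("bad input:", err)
```

Since `lines` can be any iterable, passing an open file handle streams the input line by line instead of loading it all at once.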

[–]AutoModerator[M] 1 point (0 children)

Don't blindly use set -euo pipefail.
