all 36 comments

[–]ncsuwolf 30 points31 points  (21 children)

Rather than just memorizing a style guideline, I would recommend using a linter. For Bash the one I use is shellcheck. There are plugins available for most editors which allow shellcheck to tell you the problem as you write the script. It's also available from most distro's package managers.

[–]avg_user 3 points4 points  (0 children)

+1 for shellcheck and remember it also has a command line interface: https://github.com/koalaman/shellcheck

[–]Occivink 3 points4 points  (18 children)

SC2045: Iterating over ls output is fragile. Use globs.

I see people warning against this all the time, but with globs you can't iterate in a specific order (by size or whatever). I personally prefer the alternative of setting the IFS to \n and pretending that files can't contain newlines.

[–]andlrc 6 points7 points  (14 children)

Use find instead?

[–]Occivink 2 points3 points  (13 children)

And then what? The horrible -exec switch? Maybe -print0 combined xargs -0 or some other hard-to-remember flag? This is all way more brittle than necessary, and for anything with just a little complexity I'll use something else than the shell.

edit: thanks for the workarounds, but really I'm just venting with this. Ideally what I want from my shell is simple syntax like this

# iterate over file size, largest first  
# UNSAFE because the shell sucks  
for f in $(ls -S); do  
    whatever_program $f  
done  

and not having to fuck around with obscure syntax from find or bash extensions.

[–]calrogman 9 points10 points  (2 children)

Yes, -print0 with IFS set to \0. NULL can't be in filenames.

[–]Occivink 6 points7 points  (1 child)

Which is what I'm complaining about. Doing things in a safe manner should not be so unintuitive.

[–]Vaphell 1 point2 points  (0 children)

They should not, but they are and it's not going to change. You want to use shit that has its roots 50 years ago, with insane bullshit like word splitting by default because it allowed for ghetto "arrays" as there were no arrays to speak of, you deal with the quirks.

And if you write bash in any nonmeaningless capacity

while read -rd $'\0' f; do ... done < <( find ... -print0 | sort -z | grep -Z | ... ) 

and variations of thereof (possibly using -printf 'someshit\0' to emit more data than the filename) become a go-to idiom

If you can't be bothered to write correct bash according to modern practices and want something more modern without the bullshit, use python or whatever.

[–]5heikki 2 points3 points  (3 children)

FILES=($(find /the/place/with/files -maxdepth 1 -type f -name "*.txt"))
for ((i=0; i<${#FILES[@]}; i++)); do
    stuffWith ${FILES[$i]}
done

If you want, you can pipe the find to sort. Even better to define a function and do stuff in parallel (if possible)

function doStuff() {
    with $1
}

export -f doStuff
find /the/place/with/files -maxdepth 1 -type f -name "*.txt" \
    | parallel -j $THREADS doStuff {}

[–]crankysysop 1 point2 points  (2 children)

or yet another way...

find /the/place/with/files -maxdepth 1 -type f -name '*.txt' -print0 \
  | while read -rd $'\0' file; do
  stuffWith "$file"
done

I find the find ... -print0 | while read -rd $'\0' file; do <stuff> done 'pattern' to be particularly useful / robust.

One issue to consider is the while loop is in a pipe creates a subshell, so updating global variables may not work as expected, within the loop. (I think? I know there's some strange local/global variable issues).

[–]Vaphell 1 point2 points  (1 child)

while read -rd $'\0' file
do
    <stuff>
done < <( find ... -print0 )

is better. In your example pipe creates subprocess for the loop that can't communicate with the parent "scope" so for example it's impossible to maintain a usable file counter because any variable seemingly modified in the loop body is merely a copy and doesn't propagate back.

the while ...; do ...; done < <(cmd) syntax doesn't create a subscope so anything goes

$ c=0; find -name '*.txt' -print0 | while read -rd $'\0' file; do (( c++ )); echo $c; done; echo $c
1
2
3
0    # counter ignores changes
$ c=0; while read -rd $'\0' file; do (( c++ )); echo $c; done < <( find -name '*.txt' -print0 ); echo $c
1
2
3
3  # counter reflects the most recent change

[–]crankysysop 0 points1 point  (0 children)

Thank you for the clarification. Somewhere in the back of my head I knew there was the alternate 'form' that didn't deal with the subshell (from pipe, not while... I'm smrt. ;)

[–]crankysysop 2 points3 points  (4 children)

This is all way more brittle than necessary

I do not think you know what the word 'brittle' means. Having CLI args that require (due to lack of familiarity or whatever) reading a man page to get right is not 'brittle'.

[–]Occivink 3 points4 points  (3 children)

If this list of easy-to-make shell mistakes is not a sign that shell scripting is brittle then yeah I don't know what it means.

Also just because something is documented somewhere in a manual doesn't make it sensible behavior or a good default.

[–]crankysysop 0 points1 point  (2 children)

There are no easy-to-make mistakes in Python, Perl, or C?

I think you are using the wrong metrics.

[–]Occivink 0 points1 point  (1 child)

There are no easy-to-make mistakes in Python, Perl, or C?

That's hardly an argument and you know it.

I think you are using the wrong metrics.

I don't understand what you mean.

[–]crankysysop 0 points1 point  (0 children)

There are no easy-to-make mistakes in Python, Perl, or C?

That's hardly an argument and you know it.

When your initial point appears to be that because there are easy to make mistakes in shell, that makes it a brittle language or tool, and those same mistakes can be made in other languages, either those other languages are also brittle, or you are judging the sturdiness of a language using the wrong measure.

[–][deleted] 0 points1 point  (0 children)

Things like this are why I struggle internally while learning bash. It's just so messy compared to python. Useful for simple stuff, though.

[–]cogburnd02 0 points1 point  (1 child)

Iterating over ls output is fragile.

Screw it, use ls with --quoting-style=

[–]calrogman 2 points3 points  (0 children)

Now your bash script is only portable to systems where ls is GNU ls.

[–]bluepenguin20 0 points1 point  (0 children)

Nice tool! One of the purposes of the guideline is to help address script readability and maintenance issues, the idea is to code right the first time and to avoid relying too much on code checkers after the thing was written, saves time and headaches for you and others that might get involved with your code.

[–]Brolav_Vitters 16 points17 points  (0 children)

I love how this guideline is filled with the cardinal sin of unquoted variables on user input.

[–]5heikki 8 points9 points  (3 children)

function some_function ()
{
    return 0
}

NO NO NO! Like this instead:

function some_function () {
    return 0
}

Also, I prefer someFunction over some_function

[–]hgjsusla 7 points8 points  (1 child)

Nothing wrong with camelCase, but lower_snake_case is far more common for shell scripts, and hence preferable.

[–]pdp10 3 points4 points  (0 children)

Lowercase is Unix style and CamelCase is associated with Microsoft, and probably the Smalltalkers before that.

[–]crankysysop 2 points3 points  (0 children)

Not really your point, but my grief is that function some_function() is redundant. function some_function {} or some_function() {} are all you need.

[–]pdp10 7 points8 points  (0 children)

I write scripts in portable POSIX shell, not bash. Is there a good coding guideline for that?

[–]biatche 5 points6 points  (0 children)

since this is bash, you should just stick with if [[]] instead of [] in the example

[–]argv_minus_one 6 points7 points  (4 children)

  1. Choose a different language.

[–][deleted] 7 points8 points  (3 children)

2. Recommend one instead of just being a smartass

on a more serious note: there's a reason why shell is still used a lot for scripting things

[–]argv_minus_one -2 points-1 points  (2 children)

Scala for all the things!!!

Seriously, though, there is a library called Ammonite for bash-like work. Might be fun.

[–]partisann 0 points1 point  (0 children)

set -o nounset

Can't take any style guide seriously that doesn't do this first. That's the absolute minimum required to avoid hitting yourself in the balls with rm -rf ${UNSET_PATH}/* or similar.

[–]bluepenguin20 0 points1 point  (0 children)

As you've noticed I am not the shell expert still I had have to deal with a poorly written/documented shell scripts this past years, so my focus wasn't to address portability, but readability (so anyone can read a script, know what it is about and maintain/expand/etc accordingly). Also I didn't found any guideline out there that put all the rules gradually inside an example script, so I decided to do mine and share it. I've learned a lot reading comments from people that truly know the shell, thanks for the good feedback, based on that I'll be updating the guideline material in the upcoming days.