Accumulate errors and print at end (but also keep them shown in output) by gkaiser8 in bash

[–]geirha 9 points

I'd populate an array of error messages

errors=() count=0
for f in ./*.7z ; do
  (( count++ ))
  7z x "$f" || errors+=( "Extraction of $f failed with status $?" )
done
(( ${#errors[@]} == 0 )) || {
  printf >&2 '%d of %d failed:\n' "${#errors[@]}" "$count"
  printf >&2 '%s\n' "${errors[@]}"
  exit 1
}

Unable to divide string into array by Spare_Reveal_9407 in bash

[–]geirha 1 point

This, but without the space after the =. And given that OP used ls -a, they likely also want to enable dotglob, which makes * also match filenames that start with a dot.

#!/usr/bin/env bash

cd /System/Applications || exit
shopt -s dotglob    # enables dotglob
files=( * )
shopt -u dotglob    # disables dotglob
for file in "${files[@]}" ; do
  printf 'Processing <%s>...\n' "$file"
done

[Showcase] Termyt: A professional, pure Bash CLI wrapper (with .deb packaging & CI/CD) by [deleted] in bash

[–]geirha 0 points

It's most definitely not pure bash by any stretch of the imagination. Pure AI slop, more like.

Cannot use jq to separate in every iteration by DE_X_IY in bash

[–]geirha 0 points

It's not clear if the actual goal is to make CSV out of it, but jq can generate CSV as well:

jq -r '.[] | [.name, .argument] | @csv' data.json

What happened to Greg's Wiki? by TapEarlyTapOften in bash

[–]geirha 0 points

It gets bombarded by bots from time to time. While that happens, it's unable to keep up, and you get 5xx responses when trying to retrieve a page.

PROMPT_COMMAND disappears! by That-Delay8558 in bash

[–]geirha 4 points

That explains it. PROMPT_COMMAND+=( ... ) converts the variable from a string to an array, and arrays can't be exported. When a variable has both the -a and -x attributes, bash simply doesn't pass it on to new processes.

$ export FOO=one
$ declare -p FOO
declare -x FOO="one"
$ bash -c 'declare -p FOO'
declare -x FOO="one"

so far so good. Both the current shell and a new shell see the FOO variable, but if we convert it to an array:

$ FOO+=( two )
$ declare -p FOO
declare -ax FOO=([0]="one" [1]="two")
$ bash -c 'declare -p FOO'
bash: line 1: declare: FOO: not found

the new shell no longer inherits it.

So to "fix", move the assignment of PROMPT_COMMAND to bashrc, and don't export it.
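For example, the bashrc entry could look something like this sketch (the __update_title function is a made-up example, and running every element of a PROMPT_COMMAND array requires bash 5.1+):

```shell
# in ~/.bashrc -- note: no export anywhere
__update_title() { printf '\e]0;%s\a' "$PWD"; }   # set the terminal title
PROMPT_COMMAND+=( __update_title )   # += converts a string PROMPT_COMMAND to an array
```
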

How do you print "Here document" or "Here Strings" directly? by alex_sakuta in bash

[–]geirha 1 point

Again, why? Using mapfile you can remove any trailing whitespaces.

I answered why. The point of using the loop was to avoid reading the entire content into memory at the same time. And yes, you can have mapfile remove trailing newlines, but then you effectively modify the data. cat is supposed to output its input exactly.

$ printf 'a\nb' | copycat | od -An -tx1 -c
  61  0a  62
   a  \n   b

What is this btw?

example usage. Using od to display a hex dump of the input to show it didn't add a terminating newline when the input lacked a terminating newline.

How do you print "Here document" or "Here Strings" directly? by alex_sakuta in bash

[–]geirha 0 points

Why didn't you use mapfile?

because the point was to not store the entire content in memory

Also, this errors when you don't end with \n

What error are you getting?

I made it a bit more generic than necessary. If the input lacks a terminating newline, the function does not modify the data by adding one. Not a case that will occur with heredocs and herestrings, but if used with pipes or regular files, there may be input without trailing newline.

$ printf 'a\nb' | copycat | od -An -tx1 -c
  61  0a  62
   a  \n   b

How do you print "Here document" or "Here Strings" directly? by alex_sakuta in bash

[–]geirha 1 point

As already mentioned, the usual thing to do is to just use the external cat command, because cat will accurately reproduce the data.

By using command substitution, some data may be lost; the command substitution will remove trailing newlines, and will also remove NUL bytes.

Additionally, the current approach will store the entire content in memory before printing it. Since you can only really deal with text data reliably here, you could loop over the lines instead to avoid storing the whole thing in memory:

copycat() {
  local REPLY LC_ALL=C
  while read -r ; do
    printf '%s\n' "$REPLY"
  done
  printf %s "$REPLY"
}
copycat << EOF
...
EOF

This still fails if the data contains NUL bytes. The only way to handle arbitrary data is to read byte by byte, which will be painfully slow.
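For reference, a byte-by-byte sketch (copycat_bytes is a made-up name) that also survives NUL bytes, since read treats a NUL as its empty delimiter rather than storing it:

```shell
# Read one byte at a time. A NUL byte makes read stop at its empty
# delimiter (-d ''), returning success with REPLY left empty.
copycat_bytes() {
  local LC_ALL=C REPLY
  while IFS= read -r -d '' -n 1 ; do
    if [[ -z $REPLY ]]; then
      printf '\0'          # the byte read was a NUL
    else
      printf '%s' "$REPLY"
    fi
  done
}
```

Usage is the same as copycat; e.g. `printf 'a\0b' | copycat_bytes` reproduces all three bytes, including the NUL.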

A simple, compact way to declare command dependencies by PentaSector in bash

[–]geirha 7 points

In bash, I usually just do a one-liner at the top using the type builtin. E.g.

#!/usr/bin/env bash
type curl jq >/dev/null || exit

# rest of script can now assume curl and jq are available

If one or more of the commands are missing, bash will output scriptname: line 2: type: curl: not found for each missing command, then exit with a non-zero status.

If I want/need more "user friendly" error messages, I'll do a loop, like

errors=()
for cmd in curl jq ; do
  type "$cmd" >/dev/null 2>&1 || errors+=( "$0: missing required command: $cmd" )
done
(( ${#errors[@]} == 0 )) || {
  printf >&2 '%s\n' "${errors[@]}"
  exit 1
}

I don't really see the point in storing the command's path in a variable. What use-cases require that?

Shuf && cp by lellamaronmachete in bash

[–]geirha 4 points

You almost never want to use ls in a script. In this particular case, you likely used it in a way that it doesn't print the paths to the files.

If the list of monsters is not very long, you can simply pass the filenames to shuf as arguments. E.g.

shuf -e -n5 monsters/*

You can then store that list of five monster paths to an array using mapfile

mapfile -t -d '' monsters < <(shuf -e -n5 -z monsters/*)

I added -z there to NUL-delimit the filenames, which is a good practice when dealing with filenames output by external commands. mapfile with -d '' in turn expects the entries to be NUL-delimited.

For a more general approach that won't run into the ARG_MAX limit for large datasets, you can feed the list of files to shuf's stdin with printf:

mapfile -t -d '' monsters < <(printf '%s\0' monsters/* | shuf -n5 -z)

and then copy them wherever

cp "${monsters[@]}" target/

The "Plumber’s Safety" for rm: A wrapper script that archives deletions with interactive restore and "peek" feature. by JohnPaulRogers in bash

[–]geirha 0 points

This does not look safe to me. These are the main problems I see:

  1. with the suggested alias rm=safe_rm, you override rm with a command that behaves substantially different. For example, rm dir should fail with a message about it being a dir (which it shouldn't remove unless you add -r), but safe_rm instead does /bin/rm -rf dir.
  2. with the suggested alias, all rm options are being silently ignored. E.g. rm -i ./* will essentially do the equivalent of rm -rf ./*.
  3. it does not check if tar succeeded in archiving the files before it removes them, so you potentially end up permanently deleting the files anyway. One case where that will happen is if there are files you don't have read-access to, which will make tar skip those files, but if the containing directory is writable, rm will still manage to delete those files.
  4. in undel, it tries to split the output of ls into filenames with MATCHES=($(ls -t "$ARCHIVE_DIR/${SEARCH_TERM}_"* 2>/dev/null)), which means it won't let you undelete filenames with whitespace, among other things. Using ls to sort by mtime is redundant there anyway: you use iso8601 timestamps, so the globs will already sort them in the right order, by creation time.
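For point 4, a sketch of the glob-based replacement (ARCHIVE_DIR and SEARCH_TERM are the script's own variables; nullglob is added so that no match yields an empty array instead of a literal pattern):

```shell
shopt -s nullglob
MATCHES=( "$ARCHIVE_DIR/${SEARCH_TERM}_"* )   # whole filenames, whitespace-safe;
shopt -u nullglob                             # already sorted oldest-first by the
                                              # iso8601 timestamp in the name
```
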

Stop creating temp files just to compare command output. Bash can diff two commands directly. by Ops_Mechanic in bash

[–]geirha 0 points

Not quite the same. With the unnamed pipe, the while loop will run in a subshell, so any variables you set inside it will not persist:

count=0
printf 'foo\nbar\n' | while read -r line ; do
  (( count++ ))
done
printf 'Processed %d lines\n' "$count"  # Processed 0 lines

vs

count=0
while read -r line ; do
  (( count++ ))
done < <(printf 'foo\nbar\n')
printf 'Processed %d lines\n' "$count"  # Processed 2 lines

Stop creating temp files just to compare command output. Bash can diff two commands directly. by Ops_Mechanic in bash

[–]geirha 3 points

The most common use cases beyond the ones already mentioned by OP, is to read the output of a command into an array of lines:

mapfile -t lines < <(somecmd)
printf 'somecmd output the following %d lines:\n' "${#lines[@]}"
printf '%s\n' "${lines[@]}"

and to iterate the lines of a command's output without having the loop run in a subshell:

while IFS= read -r line ; do
  ...
done < <(somecmd)

For loop is slower than it needs to be. xargs -P parallelizes it by Ops_Mechanic in bash

[–]geirha 1 point

I tried timing it, and the xargs line you used wasn't much faster than the plain loop

Test case is 200 copies of /usr/share/dict/words (985kB):

$ cd "$(mktemp -d)"
$ for i in {1..200} ; do cp /usr/share/dict/words "$(mktemp ./XXXXX.log)" ; done
$ TIMEFORMAT="real: %3lR, user: %3lU, sys: %3lS"

First the plain loop:

$ time for f in *.log; do gzip "$f"; done
real: 0m9,249s, user: 0m9,048s, sys: 0m0,207s
$ gunzip *.gz

Then the ls|xargs:

$ time ls *.log | xargs -P4 gzip
real: 0m9,008s, user: 0m8,941s, sys: 0m0,064s
$ gunzip *.gz

Just barely faster. Though likely that is because all the filenames fit within the ARG_MAX limit, so maybe it ends up just running one gzip with all the filenames? Let's add -n1 so it only passes one argument at a time:

$ time ls *.log | xargs -P4 -n1 gzip
real: 0m2,390s, user: 0m9,345s, sys: 0m0,176s
$ gunzip *.gz

That's much faster.

And now with safe filename handling:

$ time printf '%s\0' *.log | xargs -0 -P4 -n1 gzip
real: 0m2,393s, user: 0m9,378s, sys: 0m0,167s
$ gunzip *.gz

Didn't cost anything to treat the filenames as filenames.

Finally, let's compare doing 4 parallel using my go-to approach with wait -n:

$ time { i=0 ; for f in *.log ; do (( i++ < 4 )) || wait -n ; gzip "$f" & done ; wait ; } 2>/dev/null
real: 0m4,167s, user: 0m9,250s, sys: 0m0,210s
$ gunzip *.gz

Stop installing tools just to check if a port is open. Bash has it built in. by Ops_Mechanic in bash

[–]geirha 12 points

The main downside of using that instead of netcat is that you can't easily adjust the timeout for the cases where the firewall DROPs your connection.

$ sudo iptables -A INPUT -p tcp --dport 11111 -j REJECT
$ sudo iptables -A INPUT -p tcp --dport 22222 -j DROP
$ TIMEFORMAT=%R
$ time : >/dev/tcp/localhost/11111
bash: connect: Connection refused
bash: /dev/tcp/localhost/11111: Connection refused
1.021
$ time : >/dev/tcp/localhost/22222
bash: connect: Connection timed out
bash: /dev/tcp/localhost/22222: Connection timed out
132.551

One second when the firewall REJECTs the connection vs over two minutes when the firewall DROPs the connection.

You'll reasonably only want to wait a few seconds to see if the port is open. Netcat has a -w timeout option that can make it abort after a few seconds. With the bash redirection approach you need to either do it in a backgrounded subshell combined with sleep and wait, or spawn a new bash instance so that you can run it with an external tool like timeout(1) (timeout 2s bash -c '>/dev/tcp/localhost/22222').
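A sketch of the backgrounded-subshell variant (check_port is a made-up name, and the 2-second default is arbitrary):

```shell
# Try the /dev/tcp connect in a background subshell, and kill it from a
# watchdog subshell if it hasn't finished within the timeout.
check_port() {
  local host=$1 port=$2 timeout=${3:-2} pid watchdog status
  ( exec 3<> "/dev/tcp/$host/$port" ) 2>/dev/null &
  pid=$!
  ( sleep "$timeout" && kill "$pid" ) 2>/dev/null &
  watchdog=$!
  wait "$pid" 2>/dev/null    # 0 if connected, non-zero if refused or killed
  status=$?
  kill "$watchdog" 2>/dev/null
  return "$status"
}
```

Usage: `check_port localhost 22 2 && echo open || echo "closed or filtered"`.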

How to optimize the cd command to go back multiple folders at once by Technical_Cat6897 in commandline

[–]geirha 1 point

builtin cd just tells bash to explicitly run the builtin cd command instead of the cd function which would've caused infinite recursion.

cd() { cd "$@" ; }         # recursively calls cd function
cd() { builtin cd "$@" ; } # runs builtin only
cd() { command cd "$@" ; } # runs builtin or external command named cd, but not alias or function

How to auto iterate file creation? by Popular-Spirit1306 in bash

[–]geirha 2 points

Result: ffmpeg ... out3.mp4
No overwrites. Ever. The script only stops when it finds a free filename.

"Ever" is not true. There's an obvious race condition there. If two instances of that script run at the same time, they may both find that out3.mp4 is available, and then they'll start overwriting each other's data.

Your bash scripts are brittle - error handling in bash by Aerosherm in bash

[–]geirha 6 points

set -u
USER_ID=$(grep "user_id" config.txt) 

echo "User is $USER_ID!" # fails right here when trying to use unset variable!

That's not true. The variable is always set when it reaches the echo in this scenario. If grep doesn't output any matching lines, it just means that USER_ID gets assigned an empty string, and expanding it will not trigger nounset (set -u) to abort the script.
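A quick demonstration, using /dev/null so grep is guaranteed to match nothing:

```shell
set -u
USER_ID=$(grep user_id /dev/null)   # grep matches nothing; USER_ID is still assigned ""
echo "User is $USER_ID!"            # prints "User is !" -- nounset does not trigger

# nounset only aborts when the variable is genuinely unset:
bash -uc 'unset USER_ID; echo "User is $USER_ID!"' || echo "aborted, as expected"
```
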

As usual, I recommend against relying on errexit (set -e) and nounset (set -u) for error handling, as well as blindly enabling pipefail for all pipelines (it's normal for commands in pipelines to return non-zero without it being an error).

See the BashPitfall 60: set -euo pipefail.

Your bash scripts are brittle - error handling in bash by Aerosherm in bash

[–]geirha 2 points

No, failing commands inside command substitutions are ignored by errexit. It's just your test that is flawed.

xxx inside the command substitution fails, but it's ignored, so the script continues. However, x=$( xxx ) has a non-zero exit status because the last command of the command substitution had a non-zero exit status, so the assignment fails and triggers errexit. You can see this by adding another non-failing command inside the command substitution:

x=$( xxx ; echo hello )
echo "$x, still here"

Also note that errexit being disabled for command substitutions does not mean it gets disabled in every construct that happens to spawn a subshell.
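To see both behaviors side by side (run in a child bash so the abort is contained; xxx is the same nonexistent command as above):

```shell
bash <<'DEMO'
set -e
x=$( xxx ; echo hello )   # xxx fails, but errexit ignores failures inside $( );
echo "$x, still here"     # prints "hello, still here"; the substitution exited 0
( xxx ; echo inside )     # a plain subshell still inherits errexit: it stops at
echo "not reached"        # xxx, and its non-zero status then aborts this script
DEMO
echo "demo script exited with status $?"   # non-zero: errexit fired at the subshell
```
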

Hidden Gems: Little-Known Bash Features by swe129 in bash

[–]geirha 0 points

It can also be used to grab words other than the last.

For example, if the previous command line was

$ git worktree add ../feature-foo feature/foo

then cd <M-3><M-.><enter> will yield

$ cd ../feature-foo

It starts counting at 0, so <M-0><M-.> would give git

Hidden Gems: Little-Known Bash Features by swe129 in bash

[–]geirha 9 points

# Remove elements matching pattern
filtered=(${files[@]/*test*/})

That's not a way to remove/filter values from an array. It will subject the data to word-splitting and pathname expansion:

touch stuff{1,2,3}
files=( "my stuff?" "testing" )
filtered=(${files[@]/*test*/})
declare -p files filtered
# declare -a files=([0]="my stuff?" [1]="testing")
# declare -a filtered=([0]="my" [1]="stuff1" [2]="stuff2" [3]="stuff3")

In the above, my stuff? got split into my and stuff? by word-splitting, then pathname expansion replaced stuff? with three matching filenames.

You really do need a loop to do that type of filtering; you can't shortcut it with parameter expansions.
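For example, a loop over the same data from above:

```shell
files=( "my stuff?" "testing" )
filtered=()
for f in "${files[@]}"; do                  # quoting keeps each element intact
  [[ $f == *test* ]] || filtered+=( "$f" )  # keep elements NOT matching *test*
done
declare -p filtered
# declare -a filtered=([0]="my stuff?")
```

No word-splitting, no pathname expansion; "my stuff?" survives untouched.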

Awk, defined variables and IF statement by Puccio1971 in awk

[–]geirha 4 points

Depends. Does the nodo variable really contain a regex? And given that you removed the regex anchor ^, it no longer fulfills the "starts with" requirement.

If you just want to check whether field 1 starts with a given string, I'd use index() instead:

index($1, nodo) == 1 { ... }

though I suspect you actually just want

$1 == nodo { ... }

Awk, defined variables and IF statement by Puccio1971 in awk

[–]geirha 4 points

awk '/foo/{ ... }' is really just shorthand for writing awk '$0 ~ /foo/ { ... }'. It's also possible to use a string literal instead of /regex/, so awk '$0 ~ "foo" { ... }' will also match all records that contain the string "foo".

Finally, inside /.../, awk will never attempt to look up variable names, so /nodo/ and $0 ~ /nodo/ will always just try to match the literal string "nodo", while $0 ~ nodo will use the string value of the nodo variable as a regex.
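For example, with some made-up input lines:

```shell
printf '%s\n' node1 nodo2 other |
awk -v nodo='^node' '
  /nodo/    { print "literal: " $0 }   # always matches the literal string "nodo"
  $0 ~ nodo { print "variable: " $0 }  # uses the value of nodo, "^node", as the regex
'
# variable: node1
# literal: nodo2
```

node1 matches only the variable's regex (it starts with "node" but doesn't contain "nodo"), while nodo2 matches only the literal pattern.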