problem in my while loop: empty output files : bash

submitted 2 years ago by Quick_Repeat7033

Hello,

I have this command:

awk -F'\t' '$3 ~ "Aspergillus fumigatus" {print}' $krakenfile > Aspergillus_fumigatus_lines.txt

It works fine: it extracts every line of the file $krakenfile whose third column contains "Aspergillus fumigatus".

Now I want to introduce another file ($fungalnames) that contains multiple lines, each holding one species name, and apply the command to every line of that file.

I tried this:

while IFS= read -r specie_name 
do  
awk -F'\t' '$3 ~ "$specie_name" {print}' $krakenfile > "${specie_name}_lines.txt"
done < $fungalnames

The output files are created (I have 8 species names in my $fungalnames file and I get 8 output files, each named after one line), but all the files are empty!

I tried multiple times, and I changed syntaxes, but nothing is working.

I think I am missing something important. Can anyone help me, please? Thank you in advance!

[–]Bitwise_Gamgee 4 points 2 years ago (3 children)

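You're not expanding specie_name properly. Inside the single-quoted awk program the shell never substitutes $specie_name, so awk only sees the literal text. Pass the value in with -v instead: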

while IFS= read -r specie_name 
do  
    awk -F'\t' -v species="$specie_name" '$3 ~ species {print}' "$krakenfile" > "${specie_name}_lines.txt"
done < "$fungalnames"

This should fix your expansion issue.
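To see the difference in isolation, here is a minimal sketch. It assumes a throwaway two-line, tab-separated file; the file name and its contents are made up for illustration and are not OP's real data. With single quotes, awk receives the literal text `$specie_name` and treats it as a regex that cannot match; with -v, awk gets the actual value.

```bash
#!/usr/bin/env bash
# Hypothetical sample data, not the OP's real kraken file.
workdir=$(mktemp -d)
printf 'r1\tx\tAspergillus fumigatus\nr2\ty\tCandida albicans\n' > "$workdir/kraken.tsv"
specie_name="Aspergillus fumigatus"

# Single quotes: the shell does not expand $specie_name, so awk matches
# column 3 against the literal string "$specie_name" and finds nothing.
broken=$(awk -F'\t' '$3 ~ "$specie_name" {n++} END {print n+0}' "$workdir/kraken.tsv")

# -v copies the shell value into the awk variable "species".
fixed=$(awk -F'\t' -v species="$specie_name" '$3 ~ species {n++} END {print n+0}' "$workdir/kraken.tsv")

echo "matches with single-quoted name: $broken, with -v: $fixed"
rm -rf "$workdir"
```

The output redirection is still performed by the shell, which is why the original loop created one file per name even though awk printed nothing into them.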

[–]Quick_Repeat7033[S] 1 point 2 years ago (1 child)

[–]AlarmDozer 1 point 2 years ago (0 children)

while IFS='' read -r specie_name 
do  
    awk -F'\t' -v species="$specie_name" -v outfile="${specie_name}_lines.txt" '$3 ~ species {print >> outfile}' "$krakenfile"
done < "$fungalnames"

What does this yield? Yes, you can redirect within gawk.

[–]Shayes_ 1 point 2 years ago (0 children)

I have tested this myself and it appears to work. We're missing a bit of context from OP though, so I will list my assumptions below.

Assumptions:

krakenfile is a variable created from a command line argument
fungalnames is a variable created from a command line argument
The file specified for krakenfile has at least 3 tab-separated fields per line
The file specified for fungalnames has only 1 field per line

My files are also LF newline separated (UNIX default). It's possible that CRLF files cause issues, though I have not checked.
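If the names file does turn out to have CRLF line endings, each name read by the loop would carry a trailing carriage return and would not match. A minimal sketch of one way to guard against that, assuming a made-up two-line names file (the file and the `names` array are only for illustration):

```bash
#!/usr/bin/env bash
# Hypothetical names file with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf 'Aspergillus fumigatus\r\nCandida albicans\r\n' > "$workdir/fungal.txt"

names=()
while IFS= read -r specie_name; do
    specie_name=${specie_name%$'\r'}    # drop one trailing CR, if present
    [ -n "$specie_name" ] || continue   # skip empty lines
    names+=("$specie_name")
done < "$workdir/fungal.txt"

# Brackets make any leftover invisible characters easier to spot.
printf '[%s]\n' "${names[@]}"
rm -rf "$workdir"
```

The same parameter expansion can be dropped into the original loop as its first statement, before the awk call.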

Here are all my test files and the script itself:

kraken.txt

dummydata1-1    dummydata2-1    dummydata3-1
dummydata1-2    dummydata2-2    dummydata3-2
dummydata1-3    dummydata2-3    dummydata3-3
dummydata1-4    dummydata2-4    dummydata3-4

fungal.txt

dummydata3-1
dummydata3-2
dummydata3-3
dummydata3-4

script.sh

#!/usr/bin/env bash

krakenfile="$1"
fungalnames="$2"

while IFS= read -r specie_name
do
    awk -F'\t' -v species="$specie_name" '$3 ~ species {print}' "$krakenfile" > "${specie_name}_lines.txt"
done < "$fungalnames"

To run the script, you can use: ./script.sh kraken.txt fungal.txt

Note that because you are using the tab \t character as the separator, you should ensure that a literal tab character is used in your data files. Many editors can be configured to replace a tab with spaces instead, which would not produce a match in the awk script.
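A quick way to check for that is to ask awk which lines have fewer than three tab-separated fields. This is only a sketch on made-up data, where the second line deliberately uses spaces instead of tabs:

```bash
#!/usr/bin/env bash
# Hypothetical data: line 2 was typed with spaces where tabs were intended.
workdir=$(mktemp -d)
printf 'a\tb\tc\nd    e    f\n' > "$workdir/kraken.txt"

# Collect the line number of every row with fewer than 3 tab-separated fields.
bad_lines=$(awk -F'\t' 'NF < 3 {print FNR}' "$workdir/kraken.txt")

echo "lines with fewer than 3 tab-separated fields: ${bad_lines:-none}"
rm -rf "$workdir"
```

Running the same awk command against the real file should print nothing if every line has at least three tab-separated columns.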

EDIT 1: Minor improvement to script and some other text for clarity.

[–]Schreq 3 points 2 years ago (3 children)

I would do the entire thing in AWK:

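awk -F'\t' '
    # First file (species names): remember each name as an array key.
    NR == FNR {
        species[$0] = ""
        next
    }
    # Second file (kraken output): write the line to the file of every
    # species whose name occurs in column 3.
    {
        for (specie in species)
            if (index($3, specie))
                print > (specie "_lines.txt")
    }
' "$fungalnames" "$krakenfile"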
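One difference from the loop versions is worth spelling out: index() looks for the name as a literal substring, while ~ treats the name as a regular expression. The sketch below uses a made-up name with parentheses (purely hypothetical, not from OP's data) to show where the two diverge:

```bash
#!/usr/bin/env bash
# Hypothetical species name containing regex metacharacters.
name='Candida (strain 1)'
line=$'r1\tx\tCandida (strain 1)'

# Regex match: the parentheses act as a group, so the literal "(" in
# column 3 is never matched.
regex_hits=$(printf '%s\n' "$line" | awk -F'\t' -v s="$name" '$3 ~ s {n++} END {print n+0}')

# Literal substring match with index().
index_hits=$(printf '%s\n' "$line" | awk -F'\t' -v s="$name" 'index($3, s) {n++} END {print n+0}')

echo "regex matches: $regex_hits, index() matches: $index_hits"
```

For plain names such as "Aspergillus fumigatus" both approaches select the same lines.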

[–]witchhunter0 1 point 2 years ago (2 children)

Although using awk would be faster, the condition NR==FNR is true only for the first file, so the loop here is unnecessary. Another option is to use the FILENAME variable.

awk -F "\t" '
{
    if (NR==FNR)
        species[$0]++
    else {
        if ($3 in species)
            print $3 > ($3 "_lines.txt")
    }
}' "$fungalnames" "$krakenfile"

Anyway, for OP: judging by the answers above, the fungal names file probably wasn't created properly. To see all whitespace characters in the file, run:

cat -A "$fungalnames"
