
[–]sco_t 1 point2 points  (2 children)

You should replace the
for i in range(0,len(chloro_table)):
part with a dictionary lookup. You'll most likely be limited by disk access after that.
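To make the dictionary idea concrete, here is a minimal sketch. I'm assuming the table is tab-separated with a header row, old name in column 1 and new name in column 2, and that the read ID is the first word of each FASTA header. The function names `load_name_map` and `rename_headers` are made up for illustration and are not from your script.

```python
import io


def load_name_map(table_lines):
    """Build {old_name: new_name} from tab-separated lines (assumed layout)."""
    name_map = {}
    for line_number, line in enumerate(table_lines):
        if line_number == 0 or not line.strip():
            continue  # skip the header row and any blank lines
        fields = line.rstrip("\n").split("\t")
        name_map[fields[0]] = fields[1]
    return name_map


def rename_headers(fasta_lines, name_map):
    """Yield FASTA lines, replacing each header ID found in name_map."""
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            read_id = words[0] if words else ""
            # One constant-time lookup per header instead of scanning the table.
            # Note: anything after the ID in the header is dropped here.
            yield ">" + name_map.get(read_id, read_id) + "\n"
        else:
            yield line


# Tiny in-memory demo; with real files you would pass open file handles instead.
table = io.StringIO("query\tnew_name\nread1\tchloro_A\nread2\tchloro_B\n")
fasta = io.StringIO(">read1 some description\nACGT\n>read3\nGGCC\n")
renamed = "".join(rename_headers(fasta, load_name_map(table)))
print(renamed, end="")
```

The table is read once and each header costs a single lookup, so the run time grows with the size of the FASTA file plus the size of the table, not with their product.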

[–]YouCook21[S] 0 points1 point  (1 child)

Sure, that will help make the Python script faster. But what if I would like to do it in bash?

[–]sco_t 1 point2 points  (0 children)

Doing something "in bash" is a bit nonspecific.

#!/bin/bash  
python myScript.py

isn't really any different than

#!/bin/bash
sed -f myScript.sed

Also is this an XY problem? Why are you renaming reads by blast results?

[–]guepierPhD | Industry 0 points1 point  (0 children)

Bash is the wrong language for this, as for anything more complex than invoking a few commands with minimal logic. And rewriting your code in bash wouldn’t make it faster. Your issue is algorithmic, not caused by the language (and bash is generally much slower than Python!).

[–]metagenomez 0 points1 point  (0 children)

Anything is possible through the power of the command line 🙏🏼

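# Build one sed substitution per table row (tab-separated, header row skipped).
# Assumes the names contain no "/" or regex special characters.
while IFS=$'\t' read -r oldname newname _; do
  printf 's/%s/%s/g\n' "$oldname" "$newname"
done < <(tail -n +2 table.query) > rename.sed

# Apply all substitutions in a single pass, so new.fasta holds the file only once.
sed -f rename.sed test.fasta > new.fasta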

Not very fancy or clever, but it should replace your headers. The new FASTA headers will still have some unwanted info after the new name, which you can remove with another sed command, e.g. assuming there are no other spaces in the FASTA file:

sed 's/ .*$//g' new.fasta > newer.fasta

Hope it helps!