Use awk to count the number of sequences in a FASTA file

Nice one-liner that counts the header lines in a file, which are marked with a greater-than (>) symbol. In a FASTA file, each sequence has exactly one header line beginning with “>”, so the count equals the number of sequences.

$ awk '/^>/ { count++ } END { print count + 0 }' InputFastaFile.fasta

Here’s an example of the FASTA format for those who aren’t familiar with it:

>sequence_ID_1
atcgatcgggatcaatgacttcattggagaccgaga
>sequence_ID_2
gatccatggacgtttaacgcgatgacatactaggatcagat
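
To try the one-liner on the two example records above, here is a minimal sketch. It assumes a POSIX shell; the file name example.fasta and the helper function count_fasta are made up for illustration and are not part of the original one-liner.

```shell
# Write the two example records to a scratch file (example.fasta is a hypothetical name).
printf '>sequence_ID_1\natcgatcgggatcaatgacttcattggagaccgaga\n>sequence_ID_2\ngatccatggacgtttaacgcgatgacatactaggatcagat\n' > example.fasta

# Hypothetical helper: count lines that start with ">" in the file given as $1.
# "count + 0" makes awk print 0 instead of an empty line when no header is found.
count_fasta() {
    awk '/^>/ { count++ } END { print count + 0 }' "$1"
}

count_fasta example.fasta

# grep -c counts matching lines, so it can serve as a cross-check.
grep -c '^>' example.fasta
```

Both commands should print 2 for this file, one for each header line.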
