Using awk on Linux to extract records from a FASTQ file by read ID
001、
(base) [b20223040323@admin2 test]$ ls
read.list  SRR1770413_1.fastq.gz  SRR1770413_2.fastq.gz
(base) [b20223040323@admin2 test]$ head read.list    ## list of read IDs
SRR1770413.sralite.1.1
SRR1770413.sralite.1.2
SRR1770413.sralite.1.3
SRR1770413.sralite.1.4
SRR1770413.sralite.1.5
SRR1770413.sralite.1.6
SRR1770413.sralite.1.7
SRR1770413.sralite.1.8
SRR1770413.sralite.1.9
SRR1770413.sralite.1.10
(base) [b20223040323@admin2 test]$ awk '{if(NR == FNR) {ay1["@"$1]} else {if($1 ~ /^@/ && $1 in ay1){k = "yes"} else if($1 ~ /^@/ && (!($1 in ay1))) {k = "no"}}; if(k == "yes") {print $0}}' read.list <(gzip -dc SRR1770413_1.fastq.gz) | head -n 4
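The command above treats every line whose first field starts with @ as a record header. FASTQ quality strings may also begin with @, and such a line would switch the flag off in the middle of a wanted record. A minimal sketch of one way around this follows. It assumes strict 4-line records (no wrapped sequence lines) and picks out headers by line position instead. The function name extract_reads and the demo file names are hypothetical and are not part of the original commands.

```shell
# Sketch: select FASTQ records by read ID using line position.
# Assumption: every record is exactly 4 lines (header, sequence, "+", quality).
# extract_reads is a hypothetical helper name.
extract_reads() {
    # $1: file with one read ID per line (without the leading "@")
    # $2: gzip-compressed FASTQ file
    gzip -dc "$2" | awk '
        NR == FNR    { want["@" $1]; next }   # first input: store IDs with "@" prefix
        FNR % 4 == 1 { keep = ($1 in want) }  # only line 1 of each record is a header
        keep                                  # print lines while the record is wanted
    ' "$1" -
}

# Tiny demo: the quality line of r1 starts with "@" on purpose.
tmp=$(mktemp -d)
printf '%s\n' r1 > "$tmp/read.list"
printf '%s\n' '@r1 len=4' ACGT + '@III' '@r2 len=4' TTTT + IIII |
    gzip -c > "$tmp/demo.fastq.gz"
extract_reads "$tmp/read.list" "$tmp/demo.fastq.gz"
```

On the demo data this prints all four lines of r1, including the quality line @III. Note that the NR == FNR test relies on the ID list being non-empty.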
002、
(base) [b20223040323@admin2 test]$ ls
read.list  SRR1770413_1.fastq.gz  SRR1770413_2.fastq.gz
(base) [b20223040323@admin2 test]$ head read.list
SRR1770413.sralite.1.1
SRR1770413.sralite.1.2
SRR1770413.sralite.1.3
SRR1770413.sralite.1.4
SRR1770413.sralite.1.5
SRR1770413.sralite.1.6
SRR1770413.sralite.1.7
SRR1770413.sralite.1.8
SRR1770413.sralite.1.9
SRR1770413.sralite.1.10
(base) [b20223040323@admin2 test]$ awk 'NR == FNR {ay1["@"$1]; next} /^@/ {f=($1 in ay1)}f' read.list <(gzip -dc SRR1770413_1.fastq.gz) | head
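The directory listing shows two mate files, so the compact filter above could be applied to both and the result kept compressed. The sketch below is one possible way to do that. It assumes both mate files use the same read IDs as read.list, which the original commands do not confirm. The function name subset_pair, the output prefix, and the demo file names are hypothetical.

```shell
# Sketch: run the same awk filter on both mate files, writing gzip output.
# Assumptions: mates are named <prefix>_1.fastq.gz and <prefix>_2.fastq.gz,
# and both use the IDs found in the list. subset_pair is a hypothetical name.
subset_pair() {
    # $1: read ID list; $2: input prefix; $3: output prefix
    for mate in 1 2; do
        gzip -dc "${2}_${mate}.fastq.gz" |
            awk 'NR == FNR { ay1["@" $1]; next } /^@/ { f = ($1 in ay1) } f' "$1" - |
            gzip -c > "${3}_${mate}.fastq.gz"
    done
}

# Tiny demo data so the sketch can be tried without the real files.
demo=$(mktemp -d)
printf '%s\n' r2 > "$demo/read.list"
for mate in 1 2; do
    printf '%s\n' '@r1' AAAA + IIII '@r2' CCCC + IIII |
        gzip -c > "$demo/in_${mate}.fastq.gz"
done
subset_pair "$demo/read.list" "$demo/in" "$demo/subset"
gzip -dc "$demo/subset_2.fastq.gz"
```

With the real data the call would look like subset_pair read.list SRR1770413 subset, where subset is an arbitrary output prefix. Reading the FASTQ from standard input (the - argument) avoids process substitution, so the function also runs in a plain POSIX shell.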