Using awk on Linux to extract records from a FASTQ file by read ID

 

001.

(base) [b20223040323@admin2 test]$ ls
read.list  SRR1770413_1.fastq.gz  SRR1770413_2.fastq.gz
(base) [b20223040323@admin2 test]$ head read.list        ## list of read IDs
SRR1770413.sralite.1.1
SRR1770413.sralite.1.2
SRR1770413.sralite.1.3
SRR1770413.sralite.1.4
SRR1770413.sralite.1.5
SRR1770413.sralite.1.6
SRR1770413.sralite.1.7
SRR1770413.sralite.1.8
SRR1770413.sralite.1.9
SRR1770413.sralite.1.10
(base) [b20223040323@admin2 test]$ awk '{if(NR == FNR) {ay1["@"$1]} else {if($1 ~ /^@/ && $1 in ay1){k = "yes"} else if($1 ~ /^@/ && (!($1 in ay1))) {k = "no"}}; if(k == "yes") {print $0}}' read.list <(gzip -dc SRR1770413_1.fastq.gz) | head -n 4

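To make the logic of the one-liner easier to follow, here is a minimal sketch that spreads the same idea over several commented lines and runs it on toy data. The file names, the read IDs (read.1, read.2, read.3) and the variable names tmp and extracted are made up for illustration and are not part of the original session. For gzipped input, the second file argument would be <(gzip -dc SRR1770413_1.fastq.gz) as in the command above.

```shell
#!/usr/bin/env bash
# Toy data (hypothetical IDs) so the example runs without the SRR files.
tmp=$(mktemp -d)
printf '%s\n' read.1 read.3 > "$tmp/read.list"
cat > "$tmp/toy.fastq" <<'EOF'
@read.1 1 length=4
ACGT
+read.1 1 length=4
IIII
@read.2 2 length=4
TTTT
+read.2 2 length=4
IIII
@read.3 3 length=4
GGCC
+read.3 3 length=4
IIII
EOF

extracted=$(awk '
    # First file (NR == FNR): store every ID, prefixed with "@", as an array key.
    NR == FNR { ay1["@" $1]; next }
    # Second file: a line whose first field starts with "@" is treated as a header;
    # set the flag according to whether that ID is in the list.
    $1 ~ /^@/ { k = ($1 in ay1) ? "yes" : "no" }
    # While the flag is "yes", print the line (header, sequence, "+", quality).
    k == "yes"
' "$tmp/read.list" "$tmp/toy.fastq")

printf '%s\n' "$extracted"
```

With this toy input, the records for read.1 and read.3 are printed (eight lines) and read.2 is skipped.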

 

002.

(base) [b20223040323@admin2 test]$ ls
read.list  SRR1770413_1.fastq.gz  SRR1770413_2.fastq.gz
(base) [b20223040323@admin2 test]$ head read.list
SRR1770413.sralite.1.1
SRR1770413.sralite.1.2
SRR1770413.sralite.1.3
SRR1770413.sralite.1.4
SRR1770413.sralite.1.5
SRR1770413.sralite.1.6
SRR1770413.sralite.1.7
SRR1770413.sralite.1.8
SRR1770413.sralite.1.9
SRR1770413.sralite.1.10
(base) [b20223040323@admin2 test]$ awk 'NR == FNR {ay1["@"$1]; next} /^@/ {f=($1 in ay1)}f' read.list <(gzip -dc SRR1770413_1.fastq.gz) | head

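One caveat with matching headers by /^@/: in Phred+33 encoded FASTQ, a quality line can itself begin with "@", so it may be mistaken for a header and cut a record short. A possible variant, sketched below, picks headers by line position instead. It assumes every record occupies exactly four lines (no wrapped sequences). The toy data and the names tmp, by_position and by_pattern are hypothetical, chosen only to show the difference.

```shell
#!/usr/bin/env bash
# Toy data: the quality line of read.1 deliberately starts with "@".
tmp=$(mktemp -d)
printf '%s\n' read.1 > "$tmp/read.list"
cat > "$tmp/toy.fastq" <<'EOF'
@read.1 1 length=4
ACGT
+
@III
@read.2 2 length=4
TTTT
+
IIII
EOF

# Variant: headers are lines 1, 5, 9, ... of the FASTQ (assumes 4 lines per record).
by_position=$(awk '
    NR == FNR { ay1["@" $1]; next }
    FNR % 4 == 1 { f = ($1 in ay1) }
    f
' "$tmp/read.list" "$tmp/toy.fastq")

# Same logic as the command above: headers are any lines starting with "@".
by_pattern=$(awk '
    NR == FNR { ay1["@" $1]; next }
    /^@/ { f = ($1 in ay1) }
    f
' "$tmp/read.list" "$tmp/toy.fastq")

echo "by position: $(printf '%s\n' "$by_position" | wc -l) lines"
echo "by pattern:  $(printf '%s\n' "$by_pattern" | wc -l) lines"
```

On this input the position-based variant returns the full four-line record for read.1, while the pattern-based one stops after three lines because the quality line "@III" is read as a new header that is not in the list.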


 

posted @ 2025-10-13 22:41  小鲨鱼2018