Batch-renaming scaffolds in a FASTA file with seqkit

 

001. Check the original scaffold names:

(base) [b20223040323@admin2 GCA_024222265.1_ASM2422226v1]$ grep "^>" GCA_024222265.1_ASM2422226v1_genomic.fna | head -n 5      ## show the names of the first five chromosomes
>CM044122.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 1, whole genome shotgun sequence
>CM044123.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 2, whole genome shotgun sequence
>CM044124.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 3, whole genome shotgun sequence
>CM044125.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 4, whole genome shotgun sequence
>CM044126.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 5, whole genome shotgun sequence


002. Prepare the mapping file rename.list (tab-separated):

(base) [b20223040323@admin2 GCA_024222265.1_ASM2422226v1]$ cat rename.list | head      ## column 1 is the original name, column 2 is the replacement name
CM044122.1      chr1
CM044123.1      chr2
CM044124.1      chr3
CM044125.1      chr4
CM044126.1      chr5
CM044127.1      chr6
CM044128.1      chr7
CM044129.1      chr8
CM044130.1      chr9
CM044131.1      chr10
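
The post does not show how rename.list was created. One possible way, sketched here as an assumption rather than the author's method, is to derive it from the FASTA headers with awk: take the first field as the ID and the token after the word "chromosome" as the label. The file names demo.fna and demo_rename.list are hypothetical, and the toy sequences are made up. Headers that do not contain "chromosome" (for example unplaced scaffolds) produce no line and would need to be added by hand.

```shell
# Toy FASTA standing in for the real assembly (hypothetical file name demo.fna).
cat > demo.fna <<'EOF'
>CM044122.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 1, whole genome shotgun sequence
ACGT
>CM044123.1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 2, whole genome shotgun sequence
GGCC
EOF

# For each header line: strip the leading ">" from field 1 to get the ID,
# then find the word "chromosome" and use the next token (without its
# trailing comma) as the chromosome label.
awk '/^>/ {
    id = substr($1, 2)
    for (i = 2; i < NF; i++) {
        if ($i == "chromosome") {
            label = $(i + 1)
            sub(/,$/, "", label)
            printf "%s\tchr%s\n", id, label   # tab-separated: old name, new name
            break
        }
    }
}' demo.fna > demo_rename.list

cat demo_rename.list
```

On the real data, demo.fna would be replaced by the genomic .fna file and the result inspected before use.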


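003. Rename in batch with seqkit replace: -p "^(\S+)" captures the sequence ID, and {kv} in -r substitutes the value that ID maps to in the -k file: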

(base) [b20223040323@admin2 GCA_024222265.1_ASM2422226v1]$ seqkit replace -p "^(\S+)" -r "{kv}" -k rename.list GCA_024222265.1_ASM2422226v1_genomic.fna | grep "^>" | head -n 5  ## rename in batch and show the first five chromosomes
[INFO] read key-value file: rename.list
[INFO] 27 pairs of key-value loaded
>chr1 Ovis aries isolate WAG1 breed Waggir sheep chromosome 1, whole genome shotgun sequence
>chr2 Ovis aries isolate WAG1 breed Waggir sheep chromosome 2, whole genome shotgun sequence
>chr3 Ovis aries isolate WAG1 breed Waggir sheep chromosome 3, whole genome shotgun sequence
>chr4 Ovis aries isolate WAG1 breed Waggir sheep chromosome 4, whole genome shotgun sequence
>chr5 Ovis aries isolate WAG1 breed Waggir sheep chromosome 5, whole genome shotgun sequence
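
Only 27 key-value pairs were loaded, so any record whose ID is not in rename.list has no replacement. My understanding is that seqkit replace substitutes an empty string for a missing key unless options such as -K/--keep-key or -U/--keep-untouch are given; it is worth confirming this with `seqkit replace --help` for your version. A minimal sketch for listing such IDs before renaming is shown below. It uses only awk; the file names and the ID JAMXYZ01.1 are made up for illustration.

```shell
# Toy inputs (hypothetical names and sequences).
cat > demo.fna <<'EOF'
>CM044122.1 chromosome 1
ACGT
>CM044123.1 chromosome 2
GGCC
>JAMXYZ01.1 unplaced scaffold
TTAA
EOF
printf 'CM044122.1\tchr1\nCM044123.1\tchr2\n' > demo_rename.list

# First file (NR == FNR): remember every key in the mapping.
# Second file: print each FASTA ID that has no entry in the mapping.
awk 'NR == FNR { known[$1] = 1; next }
     /^>/ { id = substr($1, 2); if (!(id in known)) print id }' \
    demo_rename.list demo.fna > demo_unmapped.txt

cat demo_unmapped.txt
```

If the list is empty, every record has a mapping; otherwise the missing IDs can be added to rename.list or handled with one of the options mentioned above.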


posted @ 2025-12-05 10:32 by 小鲨鱼2018