Getting Started with nextflow (Part 1)

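The common camps among bioinformatics workflow tools are snakemake, wdl, cwl, and nextflow, and there is even a make camp. After bouncing back and forth between snakemake and nextflow, I concluded that it is best to know both.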

Official documentation: https://www.nextflow.io/docs/latest/index.html

Installation

nextflow depends on a Java environment. The nextflow version at the time of writing is 25.10.2.

curl -s https://get.nextflow.io | bash
chmod +x nextflow
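
Since nextflow depends on Java, it can help to check which Java is on PATH before running the installer. The following is only a minimal sketch and not part of the original setup: java_major is a hypothetical helper, and the minimum of 17 in MIN_JAVA is an assumption based on recent nextflow installation docs, so check the official documentation for the exact supported range.

```shell
# Assumption: recent nextflow releases expect Java 17 or later; adjust if the docs say otherwise.
MIN_JAVA=17

# Hypothetical helper: extract the major version from the first line of `java -version`.
# Handles both legacy "1.8.0_292" and modern "17.0.9" version strings.
java_major() {
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    # `java -version` writes its banner to stderr
    banner=$(java -version 2>&1 | head -n 1)
    major=$(java_major "$banner")
    if [ "$major" -ge "$MIN_JAVA" ] 2>/dev/null; then
        echo "Java $major found, looks sufficient"
    else
        echo "Java '$major' found, but $MIN_JAVA or later is assumed to be required"
    fi
else
    echo "java not found on PATH; install a JDK first"
fi
```

After the install itself, running ./nextflow -version is a quick way to confirm that the launcher works.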

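Creating test files

In the data directory, create empty paired-end sequencing files for 20 samples (40 files in total) for later use: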

seq 1 20|xargs -I {}  touch sample_{}_1.fq
seq 1 20|xargs -I {}  touch sample_{}_2.fq
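
As a sanity check on the two commands above, the same files can be created in a loop and then counted. This is a sketch under assumptions: it works in a throwaway directory from mktemp (demo_dir is a name introduced here) so that it does not touch a real project.

```shell
# Work in a throwaway directory so the sketch does not touch a real project.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/data"
cd "$demo_dir/data"

# One forward (_1) and one reverse (_2) empty file per sample.
for i in $(seq 1 20); do
    touch "sample_${i}_1.fq" "sample_${i}_2.fq"
done

# Count all files, and count samples via their _1 mates.
n_files=$(ls sample_*_[12].fq | wc -l | tr -d ' ')
n_samples=$(ls sample_*_1.fq | wc -l | tr -d ' ')
echo "files: $n_files, samples: $n_samples"   # prints "files: 40, samples: 20"
```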

Writing a test example

#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

params.read_path = "${workflow.projectDir}/data"
params.outdir = "${workflow.projectDir}/result"
params.pattern = "*_{1,2}.fq"


process chd {
        publishDir  params.outdir, mode: 'link'

        input:
                tuple val(sample_id), path(reads)

        output:
                path "${sample_id}_info.txt", emit: sample_info
        script:
                """
                        echo  "sample_id: $sample_id, seq_file: ${reads[0]}:\t:${reads[1]}" > ${sample_id}_info.txt
                """
				
                // Note: the !{variable} syntax applies to a shell: block written with triple single quotes, not to script:
                // shell: ''' echo "sample_id: !{sample_id}, seq_file: !{reads}"  '''
}

process cats {
        publishDir params.outdir, mode: 'link'
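        cache 'lenient' // lenient caching: key input files on path and size only, so -resume is not invalidated by timestamp changes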

        input:
                path(sample_files)

        output:
                path "merged.txt"

        script:
                """
                        for file in ${sample_files}; do
                                cat \$file >> "merged.txt"
                        done
                """
}

workflow {
	println "projectDir: ${workflow.projectDir}"
	ch_fq = channel.fromFilePairs("${params.read_path}/${params.pattern}", flat: false, checkIfExists: true)

    // Declare the closure parameter explicitly (e.g. sample_data) instead of relying on the implicit it
    ch_fq.view { sample_data ->  "raw ctx: sm=${sample_data[0]}, fq1 = ${sample_data[1][0]}, fq2= ${sample_data[1][1]}" }

    chd_out = chd(ch_fq)
    res1 = chd_out.sample_info.collect()
    cats(res1)
}
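
To make the shape of the tuples coming out of fromFilePairs more concrete, the grouping implied by the pattern *_{1,2}.fq can be imitated in plain shell: strip the _1.fq or _2.fq suffix to get the sample id, then list the two mates. This is only an illustration of the idea, not how nextflow implements it; pair_dir and the pairs variable are names introduced here, and only two samples are created to keep the output short.

```shell
# Throwaway directory with two complete pairs.
pair_dir=$(mktemp -d)
touch "$pair_dir/sample_1_1.fq" "$pair_dir/sample_1_2.fq" \
      "$pair_dir/sample_2_1.fq" "$pair_dir/sample_2_2.fq"

# For every _1 file, derive the sample id and look for the matching _2 file.
pairs=$(
    for fq1 in "$pair_dir"/*_1.fq; do
        id=$(basename "$fq1" _1.fq)
        fq2="$pair_dir/${id}_2.fq"
        # Only report complete pairs
        if [ -e "$fq2" ]; then
            echo "$id $(basename "$fq1") $(basename "$fq2")"
        fi
    done
)
echo "$pairs"
# prints:
# sample_1 sample_1_1.fq sample_1_2.fq
# sample_2 sample_2_1.fq sample_2_2.fq
```

Each printed line corresponds to one tuple val(sample_id), path(reads) received by the chd process, with reads[0] and reads[1] being the two mates.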

Check for syntax errors:

nextflow lint main.nf

The lint output is as follows:

Nextflow linting complete!
 ✅ 1 file had no errors
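
One way to use the lint step in practice is to run it as a gate before the pipeline itself. The sketch below is an assumption about workflow, not something from the nextflow docs: lint_then_run is a hypothetical wrapper, and the NF variable exists only so the executable can be swapped out for a dry run. It uses only the lint and run subcommands and the -resume option of run.

```shell
# NF is the nextflow executable; overridable so the sketch can be dry-run (e.g. NF=echo).
NF=${NF:-nextflow}

# Hypothetical wrapper: lint the script first and only run it when lint succeeds.
lint_then_run() {
    script=$1
    if ! "$NF" lint "$script"; then
        echo "lint failed, not running $script" >&2
        return 1
    fi
    # -resume reuses cached task results from earlier runs where possible
    "$NF" run "$script" -resume
}

# Usage: lint_then_run main.nf
```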
posted @ 2026-01-21 14:23  un-define