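Getting Started with Nextflow (Part 1)
Bioinformatics workflow tools fall into several camps: snakemake, wdl, cwl, nextflow, and even plain make. After bouncing back and forth between snakemake and nextflow, I concluded that it is best to know both.
Official documentation: https://www.nextflow.io/docs/latest/index.html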
Installation
Nextflow depends on a Java runtime. The current Nextflow version at the time of writing is 25.10.2.
curl -s https://get.nextflow.io | bash
chmod +x nextflow
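Since the installer depends on Java, it can help to check the Java version first. The following is a minimal sketch, not part of the official installer: java_major is a hypothetical helper, and the MIN_JAVA value of 17 is my assumption for this release, so please confirm it against the installation page of the documentation.

```shell
# Assumed minimum Java major version (an assumption, check the Nextflow docs).
MIN_JAVA=17

# Hypothetical helper: extract the major version from a Java version string
# such as "17.0.9" or the legacy "1.8.0_292".
java_major() {
  v="${1#1.}"            # legacy scheme: "1.8.0_292" -> "8.0_292"
  echo "${v%%[!0-9]*}"   # keep only the leading digits
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr, e.g.: openjdk version "17.0.9" 2023-10-17
  ver=$(java -version 2>&1 | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
  major=$(java_major "$ver")
  if [ "${major:-0}" -ge "$MIN_JAVA" ]; then
    echo "Java $major found"
  else
    echo "Java ${major:-unknown} is older than $MIN_JAVA"
  fi
else
  echo "java not found on PATH"
fi
```

After the install finishes, ./nextflow -version prints the installed release, which is a quick way to confirm that the download worked.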
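Building the test files
Inside the data directory, create empty placeholder files for 20 paired-end samples (40 files in total) for later use: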
seq 1 20|xargs -I {} touch sample_{}_1.fq
seq 1 20|xargs -I {} touch sample_{}_2.fq
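The two commands above assume the shell is already inside the data directory. As a sketch of the same step that can be run from the project root, the loop below creates the directory if needed and then counts the files. The directory name data is taken from the text; the loop form is just my variation on the xargs version.

```shell
# Create the data directory if it does not exist yet.
mkdir -p data

# One pair of empty files per sample: sample_<n>_1.fq and sample_<n>_2.fq.
for i in $(seq 1 20); do
  touch "data/sample_${i}_1.fq" "data/sample_${i}_2.fq"
done

# Sanity check: count the .fq files (20 samples x 2 reads each).
ls data/*.fq | wc -l
```

Running it twice is harmless, because touch only updates the timestamps of files that already exist.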
Writing a test example (main.nf)
#!/usr/bin/env nextflow
nextflow.enable.dsl = 2

params.read_path = "${workflow.projectDir}/data"
params.outdir = "${workflow.projectDir}/result"
params.pattern = "*_{1,2}.fq"

// Write one info file per sample, listing its two read files
process chd {
    publishDir params.outdir, mode: 'link' // hard-link outputs into params.outdir

    input:
    tuple val(sample_id), path(reads)

    output:
    path "${sample_id}_info.txt", emit: sample_info

    script:
    """
    echo "sample_id: $sample_id, seq_file: ${reads[0]}:\t:${reads[1]}" > ${sample_id}_info.txt
    """
    // Note: the !{variable} syntax applies only to a `shell:` block written with triple single quotes, e.g.
    // shell: ''' echo "sample_id: !{sample_id}, seq_file: !{reads}" '''
}

// Concatenate all per-sample info files into a single merged.txt
process cats {
    publishDir params.outdir, mode: 'link'
    cache 'lenient' // hash input files by path and size only (not timestamp), so -resume reruns less often

    input:
    path(sample_files)

    output:
    path "merged.txt"

    script:
    """
    for file in ${sample_files}; do
        cat \$file >> "merged.txt"
    done
    """
}

workflow {
    println "project dir: ${workflow.projectDir}"
    ch_fq = channel.fromFilePairs("${params.read_path}/${params.pattern}", flat: false, checkIfExists: true)
    // Name the closure parameter explicitly (here sample_data) instead of relying on the implicit `it`
    ch_fq.view { sample_data -> "raw ctx: sm=${sample_data[0]}, fq1 = ${sample_data[1][0]}, fq2= ${sample_data[1][1]}" }
    chd_out = chd(ch_fq)
    // collect() gathers every info file into one list, so cats runs exactly once
    res1 = chd_out.sample_info.collect()
    cats(res1)
}
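To make the shape of the fromFilePairs channel more concrete, here is a plain-shell emulation of the grouping that the pattern *_{1,2}.fq produces: one sample ID plus its two read files. This is only an illustration of the result, not how Nextflow implements it, and pair_reads is a hypothetical helper name.

```shell
# Hypothetical helper: print "<sample_id> <fq1> <fq2>" for every complete pair in a directory.
pair_reads() {
  dir="$1"
  for fq1 in "$dir"/*_1.fq; do
    [ -e "$fq1" ] || continue            # the glob matched nothing
    id=$(basename "$fq1" _1.fq)          # sample_3_1.fq -> sample_3
    fq2="$dir/${id}_2.fq"
    if [ -e "$fq2" ]; then
      echo "$id $(basename "$fq1") $(basename "$fq2")"
    fi
  done
}

# Demo on a throwaway directory with two samples.
tmp=$(mktemp -d)
touch "$tmp/sample_1_1.fq" "$tmp/sample_1_2.fq" "$tmp/sample_2_1.fq" "$tmp/sample_2_2.fq"
pair_reads "$tmp"
rm -rf "$tmp"
```

Each printed line corresponds to one channel item of the form [sample_id, [fq1, fq2]], which is why the view closure in the workflow reads sample_data[0] for the ID and sample_data[1][0] and sample_data[1][1] for the two reads.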
Check for syntax errors:
nextflow lint main.nf
The lint output looks like this:
Nextflow linting complete!
✅ 1 file had no errors
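Once lint passes, the natural next step is to run the pipeline. Below is a small sketch of a wrapper for that. run_pipeline and the DRY_RUN variable are hypothetical names of my own; the nextflow run command, the -resume option, and the double-dash overrides for params.read_path and params.outdir are standard Nextflow usage.

```shell
# Hypothetical helper: build the nextflow command line.
# With DRY_RUN=1 the command is only printed, not executed.
run_pipeline() {
  read_path="$1"
  outdir="$2"
  # -resume reuses cached task results from earlier runs;
  # --read_path and --outdir override the params defaults in main.nf.
  set -- nextflow run main.nf -resume --read_path "$read_path" --outdir "$outdir"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Print the command that would be run for the data and result directories.
DRY_RUN=1 run_pipeline "$PWD/data" "$PWD/result"
```

Dropping DRY_RUN=1 runs the pipeline for real. With the 20 test samples, result should end up with 20 *_info.txt files plus merged.txt.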
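Author: un-define
Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be displayed prominently on the page; otherwise the author reserves the right to pursue legal liability.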
