Downsampling Bam file | 平衡测序深度
目前对peak的数据处理上,发现测序深度对peak的数量有很大影响,即使做了normalization也没办法,所以这里希望从原始的bam文件开始做downsampling。
参考一:Downsample BAM file to specific amount of reads
input_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/processed_bam/Exp1/
output_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/downsampling_bam/Exp1
cd $input_dir
for bam in `ls *.bam`
do
echo $bam
reads=5000000
fraction=$(samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {print ct/total}')
samtools view -b -s ${fraction} $bam > $output_dir/$bam
done
以上脚本的不完善的地方就是当测序read低于5000000,它还是执行downsampling,这不是我们想看到的。
input_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/processed_bam/Exp1/
output_dir=/home/da528/ATAC_Analysis/cut_run/zhixin_analysis/downsampling_bam2/Exp1
cd $input_dir
for bam in `ls *.bam`
do
echo $bam
reads=5000000
# fraction=$(samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {print ct/total}')
frac=$( samtools idxstats $bam | cut -f3 | awk -v ct=$reads 'BEGIN {total=0} {total += $1} END {frac=ct/total; if (frac > 1) {print 0.99} else {print frac}}' )
echo $frac
samtools view -b -s ${frac} $bam > $output_dir/$bam
done
samtools view: Incorrect sampling argument "1"
浙公网安备 33010602011771号