I Background

CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage. Genome quality can also be examined using plots of key genomic characteristics (e.g., GC content, coding density) that highlight sequences outside the expected distributions of a typical genome. CheckM also provides tools for identifying genome bins that are likely candidates for merging based on marker set compatibility, similarity in genomic characteristics, and proximity within a reference genome tree.

II Computational principles

Estimation of completeness, contamination, and strain heterogeneity
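
The estimates named above can be written out as formulas. The following is a sketch based on the definitions in the CheckM paper listed in the references, not on the CheckM source code. Here M is the set of collocated marker sets for the chosen lineage, G_M is the set of marker genes identified in the genome, and C_g is N - 1 for a marker gene found N >= 1 times and 0 for a missing gene. Both values are usually reported as percentages.

```latex
\[
\text{completeness} = \frac{1}{|M|} \sum_{s \in M} \frac{|s \cap G_M|}{|s|},
\qquad
\text{contamination} = \frac{1}{|M|} \sum_{s \in M} \frac{\sum_{g \in s} C_g}{|s|}
\]
```

As a worked example with hypothetical marker sets {A, B}, {C}, and {D, E, F, G}, and observed copy numbers A=1, B=2, C=1, D=1, E=0, F=3, G=0: completeness is (2/2 + 1/1 + 2/4) / 3 = 83.3%, and contamination is (1/2 + 0/1 + 2/4) / 3 = 33.3%. Strain heterogeneity is, as far as the paper describes it, the fraction of multi-copy marker gene pairs whose amino acid identity is at least 90%, so a high value suggests the extra copies come from closely related strains rather than unrelated organisms.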

III Installation

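# 1. Install via conda
conda create -n checkm
conda activate checkm
conda install -c bioconda checkm-genome

# 2. Or install via pip (HMMER, Prodigal, and pplacer are installed separately)
pip3 install numpy
pip3 install matplotlib
pip3 install pysam
pip3 install checkm-genome
conda install -c bioconda hmmer prodigal pplacer

# 3. Download the reference database and set the data root
wget -c https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz
# Extract into a dedicated folder, since the archive unpacks into the target directory directly
mkdir -p /path/to/checkm_data
tar -zxvf checkm_data_2015_01_16.tar.gz -C /path/to/checkm_data
checkm data setRoot /path/to/checkm_data

# 4. After installation, test with the following command
# (-t sets the number of threads; adjust it to your machine)
checkm lineage_wf -t 150 -x fasta -f sag_2554.txt --tab_table ../fasta_file output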
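
Once lineage_wf has finished, the tab-separated summary written by -f can be filtered for bins that meet a quality threshold. The following is a minimal sketch, assuming the table starts with a header row containing columns named "Bin Id", "Completeness", and "Contamination". The function name filter_bins and the default thresholds (90% completeness, 5% contamination) are illustrative choices, not part of CheckM.

```shell
#!/bin/sh
# filter_bins FILE [MIN_COMPLETENESS] [MAX_CONTAMINATION]
# Prints bin id, completeness, and contamination for every bin that passes
# both thresholds. Columns are located by header name (assumed names:
# "Bin Id", "Completeness", "Contamination"), so column order does not matter.
filter_bins() {
    awk -F '\t' -v min_comp="${2:-90}" -v max_cont="${3:-5}" '
        NR == 1 {
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("Bin Id" in col) || !("Completeness" in col) || !("Contamination" in col)) {
                print "expected header columns not found" > "/dev/stderr"
                exit 1
            }
            b = col["Bin Id"]; c = col["Completeness"]; k = col["Contamination"]
            next
        }
        # Adding 0 forces a numeric comparison.
        ($c + 0) >= min_comp && ($k + 0) <= max_cont {
            printf "%s\t%s\t%s\n", $b, $c, $k
        }
    ' "$1"
}
```

For the test run above, this could be called as: filter_bins sag_2554.txt 90 5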

IV References
https://ecogenomics.github.io/CheckM/
Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Research, 25: 1043-1055.
Posted on 2023-07-11 13:01 by 白泽儿