linux - Counting how often each word appears in a file

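#!/bin/bash
# Print how many times each word occurs in the given file.
if [ $# -ne 1 ]
then
    echo "Usage: $0 filename" >&2
    exit 1
fi

filename=$1
# -o prints every match on its own line; \b marks a word boundary.
grep -E -o "\b[[:alpha:]]+\b" "$filename" | \
awk '{ count[$0]++ }
END { printf("%-14s%s\n", "word", "count");
      for (ind in count)
      { printf("%-14s%d\n", ind, count[ind]); }
}'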
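egrep -o "\b[[:alpha:]]+\b" $filename (equivalently grep -E -o) extracts every word in the file, one per line; \b is the word-boundary marker. awk then uses each word as an array key and increments its counter, and the END block prints the totals. Note that for(ind in count) visits the keys in no particular order, so the output is not sorted.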
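To see what each stage of the pipeline produces, here is a minimal sketch run on a small sample. The sample text, the temporary file, and the variable names `words` and `counts` are made up for illustration and are not part of the original script. The final `sort` is an optional addition that orders the result by count.

```shell
#!/bin/bash
# Hypothetical sample input (not from the original post).
sample=$(mktemp)
printf 'the cat and the dog\nthe end\n' > "$sample"

# Stage 1: -o emits one matched word per line.
words=$(grep -E -o "\b[[:alpha:]]+\b" "$sample")
echo "$words"

# Stage 2: the same awk counting as in the script, followed by a sort
# on the count column (descending), with ties broken alphabetically.
counts=$(echo "$words" \
    | awk '{ count[$0]++ } END { for (w in count) printf("%s %d\n", w, count[w]) }' \
    | sort -k2,2nr -k1,1)
echo "$counts"

rm -f "$sample"
```

With this sample, the first stage prints seven words and the second stage lists five distinct words, with `the` (3 occurrences) at the top.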
posted @ 2016-08-26 11:01  無限大