Given a string that contains only English letters and spaces, implement a function that finds the 10 most frequent letters in it (case-insensitive).

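Approach:

(1) Read the content from a file and convert it to a character array
(2) Check that the content meets the requirements
(3) Count how often each letter occurs
(4) Find the N most frequent letters
(5) Print those letters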
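
Steps (3) and (4) can also be tried on an in-memory string, without a file. The following is only a minimal sketch: the class name TopLettersSketch, the method topLetters, and the rule that ties are broken alphabetically are assumptions made for illustration, not part of the original solution. It sorts the letters that occur instead of scanning for the maximum repeatedly.

```java
import java.util.ArrayList;
import java.util.List;

final class TopLettersSketch {

    // Returns up to n letters in lowercase, most frequent first.
    // Ties are broken alphabetically (an assumption of this sketch).
    static String topLetters(String text, int n) {
        int[] counts = new int[26];
        for (char c : text.toCharArray()) {
            // Fold ASCII uppercase to lowercase so counting is case-insensitive
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c - 'A' + 'a');
            }
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }

        // Keep only the letters that actually occur
        List<Integer> present = new ArrayList<>();
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                present.add(i);
            }
        }

        // Sort by descending count, then by ascending letter
        present.sort((a, b) -> counts[a] != counts[b]
                ? Integer.compare(counts[b], counts[a])
                : Integer.compare(a, b));

        StringBuilder result = new StringBuilder();
        for (int i = 0; i < Math.min(n, present.size()); i++) {
            result.append((char) ('a' + present.get(i)));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(topLetters("Hello World", 3)); // prints "lod"
    }
}
```

With only 26 possible letters, the cost of sorting is negligible; the choice between sorting and repeated maximum selection is mostly a matter of readability here.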

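A few edge cases need to be checked, though:
(1) Does the content contain only English letters and spaces?
(2) What should happen if the file contains no letters at all?
(3) What should happen if the input has only three distinct letters, such as "abc", but the 10 most frequent letters are requested?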
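
One possible way to answer these three questions is sketched here. The names EdgeCaseSketch, isValid, and resultSize are hypothetical, and the policy of reporting zero letters for letter-free input is an assumption of this sketch; the full listing in this post throws an exception in that case instead.

```java
final class EdgeCaseSketch {

    // Question (1): true only if every character is an ASCII letter or a space
    static boolean isValid(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            if (!letter && c != ' ') {
                return false;
            }
        }
        return true;
    }

    // Questions (2) and (3): how many letters can actually be reported.
    // The result is capped by the number of distinct letters present,
    // so letter-free input yields 0 and "abc" with 10 requested yields 3.
    static int resultSize(String text, int requested) {
        boolean[] seen = new boolean[26];
        int distinct = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int index = -1;
            if (c >= 'a' && c <= 'z') {
                index = c - 'a';
            } else if (c >= 'A' && c <= 'Z') {
                index = c - 'A';
            }
            if (index >= 0 && !seen[index]) {
                seen[index] = true;
                distinct++;
            }
        }
        return Math.min(distinct, Math.max(requested, 0));
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc def"));    // prints "true"
        System.out.println(isValid("abc, def"));   // prints "false"
        System.out.println(resultSize("abc", 10)); // prints "3"
        System.out.println(resultSize("   ", 10)); // prints "0"
    }
}
```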

 

The Java code is as follows:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ComputeLetterFrequency {

    public static void main(String[] args) {
        char[] charArray = null;
        String fileName = "C:/article.txt";
        try {
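            // Read the file content into a character array
            charArray = readCharArrayFromFile(fileName);
            // Check that the content is valid
            checkContentValid(charArray);
            // Count how often each letter occurs
            int[] frequency = computeFrequency(charArray);
            // Get the N most frequent letters (10, as the problem statement asks)
            int[] tops = getTopLettersByNum(frequency, 10);
            // Print those letters
            System.out.println("The most frequent letters are:");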
            for (int i = 0; i < tops.length; i++) {
                char c = (char) ('a' + tops[i]);
                System.out.print(c + "\t");
            }

        } catch (FileNotFoundException e) {
            System.out.println(fileName + " not found, so the program exits.");
            System.exit(-1);
        } catch (RuntimeException e) {
            e.printStackTrace();
        }

    }

    public static void checkContentValid(char[] charArr) {
        for (int i = 0; i < charArr.length; i++) {
            if ((charArr[i] >= 'a' && charArr[i] <= 'z')
                    || (charArr[i] >= 'A' && charArr[i] <= 'Z')
                    || (charArr[i] == ' ')) {

                continue;

            } else {
                throw new RuntimeException(
                        "The content is invalid: it may contain only English letters and spaces");

            }

        }

    }

    public static int getValidLettersLength(int[] frequency) {
        int validLength = 0;
        for (int i = 0; i < frequency.length; i++) {
            if (frequency[i] != 0) {
                validLength++;
            }

        }
        if (validLength == 0) {
            throw new RuntimeException("No valid letter in file");
        }
        return validLength;

    }

    public static int[] getTopLettersByNum(int[] frequency, int num) {
        // Number of distinct letters that occur at least once
        int validLength = getValidLettersLength(frequency);

        // If fewer distinct letters occur than requested, return only those
        int[] tops = new int[Math.min(validLength, num)];
        // Work on a copy so the caller's frequency array is not modified
        int[] remaining = frequency.clone();
        for (int i = 0; i < tops.length; i++) {
            int max = -1;
            int maxIndex = 0;
            for (int j = 0; j < remaining.length; j++) {
                if (remaining[j] > max) {
                    max = remaining[j];
                    maxIndex = j;
                }
            }
            tops[i] = maxIndex;
            // Mark this letter as taken so the next pass skips it
            remaining[maxIndex] = -1;
        }
        return tops;
    }

    public static int[] computeFrequency(char[] charArr) {
        int[] frequency = new int[26];
        for (int i = 0; i < charArr.length; i++) {
            if (charArr[i] >= 'a' && charArr[i] <= 'z') {

                frequency[charArr[i] - 'a']++;

            }
            if (charArr[i] >= 'A' && charArr[i] <= 'Z') {

                frequency[charArr[i] - 'A']++;
            }
        }
        return frequency;
    }

    public static char[] readCharArrayFromFile(String fileName)
            throws FileNotFoundException {
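        File file = new File(fileName);
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the Scanner (and the underlying file)
        try (Scanner scanner = new Scanner(file)) {
            // nextLine() drops line terminators, so lines are joined directly
            while (scanner.hasNextLine()) {
                sb.append(scanner.nextLine());
            }
        }

        return sb.toString().toCharArray();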

    }

}
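
One detail of the reading loop in readCharArrayFromFile is easy to miss: Scanner.nextLine() returns each line without its line terminator, so a multi-line file still passes the letters-and-spaces check, and the last word of one line is joined directly to the first word of the next. The sketch below demonstrates this with a temporary file; the class name ReadFileSketch, the method readJoinedLines, and the sample content are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

final class ReadFileSketch {

    // Same loop as readCharArrayFromFile, but reading from a Path
    // and closing the Scanner when done.
    static String readJoinedLines(Path path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Scanner scanner = new Scanner(path, StandardCharsets.UTF_8.name())) {
            while (scanner.hasNextLine()) {
                // nextLine() excludes the line separator
                sb.append(scanner.nextLine());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("letters", ".txt");
        try {
            Files.write(temp, "ab cd\nEF\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(readJoinedLines(temp)); // prints "ab cdEF"
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

Joining lines this way does not change any letter count, which is all this problem needs.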

posted @ 2009-11-27 13:19  Chris Wang