Design and Implementation of a Lexical Analyzer

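Requirements for the lexical analyzer:

- Scan the character stream of the source program from left to right

- Recognize the lexically meaningful words (lexemes)

- Return a token record (token category, the lexeme itself)

- Filter out whitespace

- Skip comments

- Report lexical errors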
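The "filter out whitespace" and "skip comments" requirements can be sketched as a single helper that advances a position past everything the scanner should ignore. This is only an illustration: the class and method names are hypothetical, and it assumes that a comment starts with "//" and runs to the end of the line, which is the only comment form the program in this post handles.

```java
// Hypothetical helper for illustration; not part of the original program.
class SkipSketch {
    /**
     * Returns the index of the first character at or after pos that is
     * neither whitespace nor part of a "//" line comment.
     */
    static int skipBlanksAndComments(String src, int pos) {
        while (pos < src.length()) {
            char c = src.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++;
            } else if (src.startsWith("//", pos)) {
                // Skip to just past the end of the line, or to the end of input.
                int eol = src.indexOf('\n', pos);
                pos = (eol < 0) ? src.length() : eol + 1;
            } else {
                break;
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        String src = "  // a comment\n  x = 1";
        int p = skipBlanksAndComments(src, 0);
        System.out.println(src.charAt(p)); // prints "x"
    }
}
```

Calling such a helper before each token is read keeps the whitespace and comment handling in one place instead of spreading it across the recognition code.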

 

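Program structure:

Input: character stream (which input method, and which data structure holds it)

Processing:

– Traversal (which traversal strategy)

– Lexical rules

Output: token stream (which output form)

– Two-tuples (category code, lexeme)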
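One possible answer to the "which output form" question is a small value type for the two-tuple, collected into a list. The sketch below is an assumption for illustration: the names Token and TokenStreamSketch are hypothetical, and the program in this post prints its pairs directly instead of storing them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical type for illustration; the original program prints its pairs directly.
final class Token {
    final int code;       // category code, e.g. 10 for an identifier
    final String lexeme;  // the matched text itself

    Token(int code, String lexeme) {
        this.code = code;
        this.lexeme = lexeme;
    }

    @Override
    public String toString() {
        return "(" + code + ", " + lexeme + ")";
    }
}

class TokenStreamSketch {
    public static void main(String[] args) {
        // Token stream for the input "x := 1", using the codes 10, 18 and 11.
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token(10, "x"));
        tokens.add(new Token(18, ":="));
        tokens.add(new Token(11, "1"));
        for (Token t : tokens) {
            System.out.println(t);
        }
    }
}
```

Keeping the tokens in a list rather than printing them immediately would let a later phase, such as a parser, consume the same stream.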

 

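Token categories:

1. Identifier (10)

2. Unsigned number (11)

3. Reserved words (one code per word)

4. Operators (one code per word)

5. Delimiters (one code per word)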
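The categories above can be turned into a small classification routine for a word that has already been isolated. The sketch below is illustrative only: ClassifySketch and classify are hypothetical names, the return value -1 is an invented marker for "no category", and it assumes that l in the pattern l(l|d)* means an ASCII letter and d an ASCII digit. The reserved-word codes are the ones listed in the token table of this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical classifier for illustration; not the original program's method.
class ClassifySketch {
    static final Map<String, Integer> RESERVED = new HashMap<>();
    static {
        // Reserved words get one code each.
        RESERVED.put("public", 1);
        RESERVED.put("class", 2);
        RESERVED.put("static", 3);
        RESERVED.put("void", 4);
        RESERVED.put("main", 5);
    }

    /** Returns the category code of a word, or -1 if it fits no category. */
    static int classify(String lexeme) {
        Integer code = RESERVED.get(lexeme);
        if (code != null) {
            return code;                       // reserved word
        }
        if (lexeme.matches("[A-Za-z][A-Za-z0-9]*")) {
            return 10;                         // identifier: l(l|d)*
        }
        if (lexeme.matches("[0-9]+")) {
            return 11;                         // unsigned number: dd*
        }
        return -1;                             // assumed error marker
    }

    public static void main(String[] args) {
        System.out.println(classify("class"));   // prints 2
        System.out.println(classify("count1"));  // prints 10
        System.out.println(classify("42"));      // prints 11
    }
}
```

The reserved-word lookup has to come before the identifier pattern, because every reserved word also matches l(l|d)*.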

 

Token      Code    Token    Code
public     1       :        17
class      2       :=       18
static     3       <        20
void       4       <=       21
main       5       <>       22
}          30      >        23
l(l|d)*    10      >=       24
dd*        11      =        25
+          13      ;        26
-          14      (        27
*          15      )        28
/          16      {        29
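Several operators in the table share a first character (":" and ":=", "<" and "<=" and "<>", ">" and ">="), so the scanner has to prefer the longer form. A minimal sketch of that longest-match rule follows; OperatorSketch and matchOperator are hypothetical names, and the codes are copied from the table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical longest-match helper for illustration.
class OperatorSketch {
    static final Map<String, Integer> CODES = new HashMap<>();
    static {
        CODES.put("+", 13);  CODES.put("-", 14);  CODES.put("*", 15);
        CODES.put("/", 16);  CODES.put(":", 17);  CODES.put(":=", 18);
        CODES.put("<", 20);  CODES.put("<=", 21); CODES.put("<>", 22);
        CODES.put(">", 23);  CODES.put(">=", 24); CODES.put("=", 25);
        CODES.put(";", 26);  CODES.put("(", 27);  CODES.put(")", 28);
        CODES.put("{", 29);  CODES.put("}", 30);
    }

    /** Returns the longest operator or delimiter starting at pos, or null if there is none. */
    static String matchOperator(String src, int pos) {
        // Try the two-character form first so that "<=" is not split into "<" and "=".
        if (pos + 2 <= src.length()) {
            String two = src.substring(pos, pos + 2);
            if (CODES.containsKey(two)) {
                return two;
            }
        }
        if (pos < src.length()) {
            String one = src.substring(pos, pos + 1);
            if (CODES.containsKey(one)) {
                return one;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String op = matchOperator("a<=b", 1);
        System.out.println(op + "-->" + CODES.get(op)); // prints "<=-->21"
    }
}
```

Checking one character at a time would report "<=" as "<" followed by "=", which is why the two-character lookup runs first.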


package cn.itcast.day13Collection;

import java.io.*;
import java.util.*;

public class Lexical_Analyzer {
    public static void main(String[] args) throws IOException {


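        // Input file: an optional command-line argument overrides the author's default path.
        String path = args.length > 0 ? args[0]
                : "D:\\webquanzhan\\day05-code\\day05_code\\src\\cn\\itcast\\day13Collection\\demo01.txt";

        // Read the file line by line until the end, dropping "//" comments.
        // A space is appended after each line so that words on adjacent lines do not merge.
        String tail = "";
        String data = "";
        try (BufferedReader bf = new BufferedReader(new FileReader(path))) {
            while ((tail = bf.readLine()) != null) {
                int comment = tail.indexOf("//");
                if (comment >= 0) {
                    tail = tail.substring(0, comment);
                }
                data += tail + " ";
            }
        }

        // Split the text on whitespace. A chunk may still mix letters and symbols, e.g. a+b.
        String[] word = data.split("\\s+");
        // Copy the non-empty chunks into a list for processing.
        ArrayList<String> wordlist = new ArrayList<>();
        for (int i = 0; i < word.length; i++) {
            if (!word[i].isEmpty()) {
                wordlist.add(word[i]);
            }
        }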


        HashMap<String, Integer> rewordmap = new HashMap<>();
        rewordmap.put("public", 1);
        rewordmap.put("class", 2);
        rewordmap.put("static", 3);
        rewordmap.put("void", 4);
        rewordmap.put("main", 5);
        rewordmap.put("+", 13);
        rewordmap.put("-", 14);
        rewordmap.put("*", 15);
        rewordmap.put("/", 16);
        rewordmap.put(":", 17);
        rewordmap.put(":=", 18);
        rewordmap.put("<", 20);
        rewordmap.put("<=", 21);
        rewordmap.put("<>", 22);
        rewordmap.put(">", 23);
        rewordmap.put(">=", 24);
        rewordmap.put("=", 25);
        rewordmap.put(";", 26);
        rewordmap.put("(", 27);
        rewordmap.put(")", 28);
        rewordmap.put("{", 29);
        rewordmap.put("}", 30);

        // Scan every chunk from left to right so that tokens are printed in source order.
        for (String words : wordlist) {
            int i = 0;
            while (i < words.length()) {
                char c = words.charAt(i);
                if (Character.isLetterOrDigit(c)) {
                    // Collect a maximal run of letters and digits.
                    int start = i;
                    while (i < words.length() && Character.isLetterOrDigit(words.charAt(i))) {
                        i++;
                    }
                    String s = words.substring(start, i);
                    if (rewordmap.containsKey(s)) {
                        // Reserved word: look up its code in the map.
                        System.out.println(s + "-->" + rewordmap.get(s));
                    } else if (checkStrIsNum01(s)) {
                        // Unsigned number: dd*
                        System.out.println(s + "-->11");
                    } else if (Character.isLetter(s.charAt(0))) {
                        // Identifier: l(l|d)*
                        System.out.println(s + "-->10");
                    } else {
                        // Digits followed by letters, e.g. 12ab, fit no token category.
                        System.out.println(s + "-->lexical error");
                    }
                } else {
                    // Operator or delimiter: try the two-character form (:=, <=, <>, >=) first.
                    String two = i + 2 <= words.length() ? words.substring(i, i + 2) : "";
                    String one = String.valueOf(c);
                    if (rewordmap.containsKey(two)) {
                        System.out.println(two + "-->" + rewordmap.get(two));
                        i += 2;
                    } else if (rewordmap.containsKey(one)) {
                        System.out.println(one + "-->" + rewordmap.get(one));
                        i++;
                    } else {
                        // Character that appears nowhere in the token table.
                        System.out.println(one + "-->lexical error");
                        i++;
                    }
                }
            }
        }
    }


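    // Returns true if str is a non-empty sequence of digits (dd*).
    public static boolean checkStrIsNum01(String str) {
        if (str.isEmpty()) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isDigit(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }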

}

Test file: [image]

Run result: [image]

posted @ 2019-10-11 20:01  缪孝文