Queue and Stack Html Parser

Some info: HTML is a markup language based on tags. There are many types of tags, for example html, div, h1, and h2.
Each type of tag has start and end notations, for example <html> and </html>.
This pair may hold text content or child tags:

<html>text</html>

You may learn more on the wiki.

Task: Today you will create your own HTML parser! It should print the content for each pair of tags.

There are two rules for printing order:

1. Tags on the same hierarchy level are processed left to right.
2. If a tag has other tags as children, the children are processed first. Once a tag has no children, or all of them have already been processed, the tag itself can be processed.
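
Taken together, the two rules describe a post-order traversal of the tag tree. The following is a minimal sketch of that order only, assuming a hypothetical TagNode class and hypothetical tag names that are not part of the task (the task itself works on a raw string, not on a prebuilt tree):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node, introduced only for this illustration.
class TagNode {
    final String name;
    final List<TagNode> children = new ArrayList<>();

    TagNode(String name) {
        this.name = name;
    }

    // Appends a child and returns this node so calls can be chained.
    TagNode add(TagNode child) {
        children.add(child);
        return this;
    }
}

class PostOrderSketch {
    // Appends tag names to `out` in the order the rules process them.
    static void visit(TagNode node, List<String> out) {
        for (TagNode child : node.children) { // rule 1: siblings left to right
            visit(child, out);                // rule 2: children before the parent
        }
        out.add(node.name);
    }

    static List<String> order(TagNode root) {
        List<String> out = new ArrayList<>();
        visit(root, out);
        return out;
    }

    public static void main(String[] args) {
        TagNode root = new TagNode("html")
                .add(new TagNode("h1"))
                .add(new TagNode("h2").add(new TagNode("h3")));
        System.out.println(order(root)); // prints [h1, h3, h2, html]
    }
}
```

A parent is emitted only after the loop over its children has finished, which is why html always comes last.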
Let's see how it works and why:

First example input: <html>content</html>

Output: content

Explanation: There is a single pair of html tags; its content is "content".

More complicated input:

<html><h1>content1</h1><h2>content2</h2></html>

Output:

content1
content2
<h1>content1</h1><h2>content2</h2>

Explanation: There are 3 pairs of tags: html, h1 and h2. h1 and h2 are processed first by rule 2 because they are children of html.
h1 goes before h2 by rule 1.

And the hardest one so far. Input:

<html><h1>content1</h1><h2><h3>content2</h3></h2></html>

Output:

content1
content2
<h3>content2</h3>
<h1>content1</h1><h2><h3>content2</h3></h2>

See this one's explanation in the hint, if needed.

Hint

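Sample Input 1:

<html><a>hello</a><h1><h4>nestedHello</h4><h3>nestedWorld</h3><h6><br>top</br></h6></h1><br>world</br></html>

Sample Output 1:

hello
nestedHello
nestedWorld
top
<br>top</br>
<h4>nestedHello</h4><h3>nestedWorld</h3><h6><br>top</br></h6>
world
<a>hello</a><h1><h4>nestedHello</h4><h3>nestedWorld</h3><h6><br>top</br></h6></h1><br>world</br>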

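import java.util.ArrayDeque;
import java.util.Scanner;

// Solution 1: a stack of content start indices.
class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        String line = scanner.nextLine();

        // Start index of the content of each currently open tag.
        ArrayDeque<Integer> starts = new ArrayDeque<>();
        // True while scanning inside a closing tag such as </h1>.
        boolean inClosingTag = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '>' && !inClosingTag) {
                // End of an opening tag: its content starts at the next character.
                starts.addLast(i + 1);
            } else if (c == '/') {
                // "</" closes the innermost open tag; its content ends just before '<'.
                // Assumes '/' never appears in text content.
                inClosingTag = true;
                System.out.println(line.substring(starts.pollLast(), i - 1));
            } else if (c == '>') {
                // End of the closing tag.
                inClosingTag = false;
            }
        }
    }
}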
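
To check the stack-based loop without reading from standard input, the same logic can be wrapped in a method that returns the contents as a list. This is a sketch under the same assumptions as the solution above (well-formed input, no '/' in text content); the class name, method name, and input string are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class StackParserSketch {
    // Returns tag contents in the order the task asks them to be printed.
    // Assumes well-formed input and no '/' outside closing tags.
    static List<String> contents(String html) {
        List<String> result = new ArrayList<>();
        Deque<Integer> starts = new ArrayDeque<>(); // content start index per open tag
        boolean inClosingTag = false;

        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '>' && !inClosingTag) {
                starts.addLast(i + 1);          // opening tag ended, content begins
            } else if (c == '/') {
                inClosingTag = true;            // innermost open tag is closing
                result.add(html.substring(starts.pollLast(), i - 1));
            } else if (c == '>') {
                inClosingTag = false;           // closing tag ended
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input, not taken from the task statement.
        System.out.println(contents("<a><b>x</b><c>y</c></a>"));
        // prints [x, y, <b>x</b><c>y</c>]
    }
}
```

Because the innermost open tag is always on top of the stack, children are emitted before their parent, and siblings come out left to right as the scan advances.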


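import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Solution 2: recursive regex matching.
class Main {
    // Matches <tag>content</tag>; group 1 is the tag name, group 2 the content.
    // The lazy match pairs each opening tag with the nearest same-named closing tag,
    // so a tag nested inside another tag of the same name is not handled.
    private static final Pattern TAG_PAIR = Pattern.compile("<(\\w+)>(.+?)</\\1>");

    public static void main(String[] args) {
        dfs(new Scanner(System.in).nextLine());
    }

    // Prints the contents of all nested tags first, then the content of each tag itself.
    private static void dfs(String input) {
        Matcher m = TAG_PAIR.matcher(input);
        while (m.find()) {
            dfs(m.group(2));
            System.out.println(m.group(2));
        }
    }
}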
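
One caveat of the regex approach is worth seeing in action: the backreference with a lazy quantifier pairs an opening tag with the nearest closing tag of the same name. The sketch below reuses the same pattern but collects results in a list; the class name, method names, and inputs are hypothetical and chosen only to show the behavior:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLimitSketch {
    // Same pattern as in the regex solution above.
    private static final Pattern TAG_PAIR = Pattern.compile("<(\\w+)>(.+?)</\\1>");

    static List<String> contents(String input) {
        List<String> out = new ArrayList<>();
        collect(input, out);
        return out;
    }

    // Collects nested contents first, then the content of each matched pair.
    private static void collect(String input, List<String> out) {
        Matcher m = TAG_PAIR.matcher(input);
        while (m.find()) {
            collect(m.group(2), out);
            out.add(m.group(2));
        }
    }

    public static void main(String[] args) {
        // Distinct tag names: works as expected.
        System.out.println(contents("<a><b>x</b></a>"));          // prints [x, <b>x</b>]
        // Same tag name nested in itself: the outer <div> is paired with the inner </div>.
        System.out.println(contents("<div><div>x</div>y</div>")); // prints [<div>x]
    }
}
```

The stack-based solution does not have this problem, since it never compares tag names and simply closes the innermost open tag.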

posted @ 2020-09-28 11:08 longlong6296
