Queue and Stack: HTML Parser
Some info: HTML is a markup language based on tags. There are many types of tags, for example html, div, h1, and h2.
Each type of tag has a start part, <tag>, and an end part, </tag>.
This pair may hold text content or child tags:
<tag>text</tag>
Task: Today you will create your own HTML parser! It should print the content for each pair of tags.
There are two rules for printing order:
1. tags on the same hierarchy level are processed left to right;
2. if a tag has other tags as children, the children are processed first. A tag can be processed once it has no children or all of them have already been processed.
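Taken together, the two rules describe a post-order traversal of the tag tree: children left to right, then the parent. The sketch below only illustrates that ordering on a hand-built tree for an input such as `<html><h1>content1</h1><h2>content2</h2></html>`; it does no parsing, and `TagNode` and `PrintOrderDemo` are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree model: each node is one tag pair with its raw content.
class TagNode {
    final String content;                              // text between the start and end tags
    final List<TagNode> children = new ArrayList<>();  // nested tag pairs, left to right

    TagNode(String content) {
        this.content = content;
    }

    TagNode add(TagNode child) {
        children.add(child);
        return this;
    }
}

class PrintOrderDemo {
    // Rule 2: visit all children first. Rule 1: visit them left to right.
    static void collect(TagNode node, List<String> out) {
        for (TagNode child : node.children) {
            collect(child, out);
        }
        out.add(node.content);
    }

    static List<String> order(TagNode root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    public static void main(String[] args) {
        // Hand-built tree for <html><h1>content1</h1><h2>content2</h2></html>
        TagNode html = new TagNode("<h1>content1</h1><h2>content2</h2>")
                .add(new TagNode("content1"))
                .add(new TagNode("content2"));
        order(html).forEach(System.out::println);
    }
}
```

Running `main` prints the two children's contents first and the content of html last.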
Let's see how it works and why:
- First example input: <html>content</html>
Output: content
Explanation: There is a single pair of html tags; its content is "content".
- More complicated input: <html><h1>content1</h1><h2>content2</h2></html>
Output:
content1
content2
<h1>content1</h1><h2>content2</h2>
Explanation: There are 3 pairs of tags: html, h1 and h2. h1 and h2 are processed first by rule 2 because they are children of html.
h1 goes before h2 by rule 1.
- And the hardest input so far:
content1
content2
content1
content2
content2
content1
content2
See this one's explanation in the hint, if needed.
Hint
Sample Input 1:
hellonestedHello
nestedWorld
top
world
Sample Output 1:
hello
nestedHello
nestedWorld
top
top
nestedHello
nestedWorld
top
world
hellonestedHello
nestedWorld
top
world
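import java.util.*;

// Solution 1: a stack of indices where each open tag's content starts.
class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        String line = scanner.nextLine();
        char[] html = line.toCharArray();
        // Top of the deque = content start of the innermost open tag.
        ArrayDeque<Integer> deque = new ArrayDeque<>();
        boolean flag = false; // true while scanning inside an end tag
        for (int i = 0; i < html.length; i++) {
            if (html[i] == '>' && !flag) {
                // End of a start tag: its content begins at the next character.
                deque.addLast(i + 1);
            } else if (html[i] == '/') {
                // '/' is assumed to appear only in end tags; the content ends before the '<' at i - 1.
                flag = true;
                System.out.println(line.substring(deque.pollLast(), i - 1));
            } else if (html[i] == '>' && flag) {
                // End of the end tag.
                flag = false;
            }
        }
    }
}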
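The stack-based solution above treats every '/' as the start of an end tag, so text content containing a slash would break it. A minimal sketch of a variant that looks for the two-character sequence `</` instead is shown below. It assumes well-formed input with attribute-free tags, and `StackParserSketch` and `parse` are hypothetical names; it returns the lines rather than printing them so the order is easy to check.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class StackParserSketch {
    // Hypothetical helper: returns the contents in print order instead of printing them.
    static List<String> parse(String line) {
        Deque<Integer> starts = new ArrayDeque<>(); // content start index of each open tag
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            if (line.charAt(i) != '<') {
                i++; // plain text, including any '/' inside it
                continue;
            }
            int close = line.indexOf('>', i);
            if (close < 0) {
                break; // malformed input: unterminated tag, stop scanning
            }
            if (line.startsWith("</", i)) {
                // End tag: the innermost open tag's content ends just before this '<'.
                out.add(line.substring(starts.pop(), i));
            } else {
                // Start tag: its content begins right after the closing '>'.
                starts.push(close + 1);
            }
            i = close + 1; // continue after the tag
        }
        return out;
    }

    public static void main(String[] args) {
        parse("<html><h1>a/b</h1><h2>content2</h2></html>").forEach(System.out::println);
    }
}
```

Because the innermost open tag is always on top of the stack, a tag's content is emitted only after all of its children have been closed, which is the order the task asks for.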
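import java.util.*;
import java.util.regex.*;

// Solution 2: recursive regex matching.
class Main {
    // Matches one tag pair; group 2 is its content. The lazy (.+?) stops at the first
    // matching end tag, so a tag nested inside a tag of the same name is not handled.
    private static final Pattern TAG_PAIR = Pattern.compile("<(\\w+)>(.+?)</\\1>");

    public static void main(String[] args) {
        dfs(new Scanner(System.in).nextLine());
    }

    private static void dfs(String input) {
        Matcher m = TAG_PAIR.matcher(input);
        // Each find() yields the next sibling pair, left to right.
        while (m.find()) {
            dfs(m.group(2));                 // children first
            System.out.println(m.group(2));  // then the pair's own content
        }
    }
}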
