POJ 1002 Solution Analysis


Problem Description

Problem link: POJ 1002

487-3279

Time Limit: 2000MS
Memory Limit: 65536K

Total Submissions: 135193
Accepted: 22975

Description

Businesses like to have memorable telephone numbers. One way to make a telephone number memorable is to have it spell a memorable word or phrase. For example, you can call the University of Waterloo by dialing the memorable TUT-GLOP. Sometimes only part of the number is used to spell a word. When you get back to your hotel tonight you can order a pizza from Gino's by dialing 310-GINO. Another way to make a telephone number memorable is to group the digits in a memorable way. You could order your pizza from Pizza Hut by calling their ``three tens'' number 3-10-10-10.
The standard form of a telephone number is seven decimal digits with a hyphen between the third and fourth digits (e.g. 888-1200). The keypad of a phone supplies the mapping of letters to numbers, as follows:
A, B, and C map to 2
D, E, and F map to 3
G, H, and I map to 4
J, K, and L map to 5
M, N, and O map to 6
P, R, and S map to 7
T, U, and V map to 8
W, X, and Y map to 9
There is no mapping for Q or Z. Hyphens are not dialed, and can be added and removed as necessary. The standard form of TUT-GLOP is 888-4567, the standard form of 310-GINO is 310-4466, and the standard form of 3-10-10-10 is 310-1010.
Two telephone numbers are equivalent if they have the same standard form. (They dial the same number.)
Your company is compiling a directory of telephone numbers from local businesses. As part of the quality control process you want to check that no two (or more) businesses in the directory have the same telephone number.

Input

The input will consist of one case. The first line of the input specifies the number of telephone numbers in the directory (up to 100,000) as a positive integer alone on the line. The remaining lines list the telephone numbers in the directory, with each number alone on a line. Each telephone number consists of a string composed of decimal digits, uppercase letters (excluding Q and Z) and hyphens. Exactly seven of the characters in the string will be digits or letters.

Output

Generate a line of output for each telephone number that appears more than once in any form. The line should give the telephone number in standard form, followed by a space, followed by the number of times the telephone number appears in the directory. Arrange the output lines by telephone number in ascending lexicographical order. If there are no duplicates in the input print the line:
No duplicates.

Sample Input

12
4873279
ITS-EASY
888-4567
3-10-10-10
888-GLOP
TUT-GLOP
967-11-11
310-GINO
F101010
888-1200
-4-8-7-3-2-7-9-
487-3279

Sample Output

310-1010 2
487-3279 4
888-4567 3

Source

East Central North America 1999

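Solution Analysis

This analysis does not translate the problem statement; I will not repeat this note in later posts.

The problem is quite simple and breaks down into three steps: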

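1. Translate (normalize) each input line into the standard telephone number format XXX-XXXX.
2. Count how many times each telephone number occurs.
3. Output every telephone number that occurs more than once, together with its count, in ascending lexicographical order; if there are none, output "No duplicates."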

The first step poses no difficulty: skip every "-", convert a digit by subtracting '0', and map a letter according to the rules given in the problem.
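The normalization step can be sketched as follows. This is only an illustration, not the code behind this post: the names `KEYPAD`, `LETTER_TO_DIGIT` and `normalize` are made up here, the input string is assumed to be already stripped of its line ending, and digits are kept as characters instead of being converted with "subtract '0'", because the result is rebuilt as a string anyway.

```python
# Keypad mapping from the problem statement; Q and Z have no mapping.
KEYPAD = {
    "ABC": "2", "DEF": "3", "GHI": "4", "JKL": "5",
    "MNO": "6", "PRS": "7", "TUV": "8", "WXY": "9",
}
# Flatten the groups into a per-letter lookup table (hypothetical helper name).
LETTER_TO_DIGIT = {ch: d for letters, d in KEYPAD.items() for ch in letters}


def normalize(raw):
    """Return the standard form XXX-XXXX of one directory entry."""
    digits = []
    for ch in raw:
        if ch == "-":
            continue                            # hyphens are not dialed
        if ch.isdigit():
            digits.append(ch)                   # digits pass through unchanged
        else:
            digits.append(LETTER_TO_DIGIT[ch])  # letters follow the keypad
    number = "".join(digits)
    # Put the hyphen back between the third and fourth digits.
    return number[:3] + "-" + number[3:]
```

For instance, `normalize("TUT-GLOP")` gives `888-4567`, matching the example in the problem statement.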

The second step is where approaches diverge. There are roughly three options:

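1. Insert each parsed telephone number directly into a sequence container, sort the sequence once all input has been scanned, then count and output the duplicated numbers.
2. Store records holding a telephone number and its occurrence count in a hash table, then sort and output.
3. Store records holding a telephone number and its occurrence count in a binary search tree, then traverse it and output.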
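To make the second option concrete, here is a minimal sketch that uses Python's built-in `dict` as the hash table. The function name `count_with_hash_table` is hypothetical, and the input is assumed to be a list of numbers already in standard form.

```python
def count_with_hash_table(numbers):
    """Count standard-form numbers with a dict, then sort the duplicates."""
    counts = {}
    for number in numbers:
        # One hash lookup and update per number.
        counts[number] = counts.get(number, 0) + 1
    # Only numbers seen more than once are reported, in ascending order.
    return sorted((num, c) for num, c in counts.items() if c > 1)
```

Only the distinct duplicated numbers need to be sorted here, which can be far fewer than n when the directory contains many repeats.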

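Sorting Approach

The first method is the most direct, but I later saw on the forum that many people who tried it got Time Limit Exceeded.

Personally I do not think this method should time out, and I do not know why so many people got stuck on it.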

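Take an array as the container and quicksort as the sorting method, and analyze the time complexity:

Assume each input telephone number has length at most m, and there are n telephone numbers in total.

Then parsing one telephone number takes O(m), inserting it takes O(1), sorting takes O(nlogn) on average (quicksort degrades to O(n^2) in the worst case), and the final counting and output take O(n).

The overall time complexity is T(n)=n(O(m)+O(1))+O(nlogn)+O(n)=O(nm+nlogn), which is O(nlogn) when m is treated as a constant.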
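As a rough sanity check, plugging in the largest input allowed by the problem (n = 100,000) gives an idea of the magnitude. This is only an order-of-magnitude estimate of the comparison count, assuming a base-2 logarithm and ignoring constant factors:

```latex
n \log_2 n = 10^5 \cdot \log_2 10^5 \approx 10^5 \cdot 16.6 \approx 1.7 \times 10^6
```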

The pseudocode is as follows:

Procedure
Begin
	telNumberArray <- empty
	For currentLine <- each line of input
	Begin
		telNumber <- parse tel number from currentLine
		push telNumber at the end of telNumberArray
	End For

	Quick sort telNumberArray

	// count is the length of the current run of equal numbers
	count <- 1
	noDuplicates <- true
	For i from 1 to n - 1
	Begin
		If telNumberArray(i - 1) = telNumberArray(i)
			count <- count + 1
		Else
		Begin
			If count > 1
			Begin
				Print telNumberArray(i - 1) and count
				noDuplicates <- false
			End If
			count <- 1
		End Else
	End For

	// The last run is not followed by a different number, so flush it here
	If n > 0 and count > 1
	Begin
		Print telNumberArray(n - 1) and count
		noDuplicates <- false
	End If

	If noDuplicates
		Print "No duplicates."

End Procedure
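The sort-then-scan idea above might look like this in runnable form. It is a sketch under a few assumptions: the names `find_duplicates` and `report` are invented for illustration, the input is a list of numbers already in standard form, and Python's built-in `sorted` (Timsort, not quicksort) stands in for the sort.

```python
def find_duplicates(numbers):
    """Sort the numbers and return (number, count) for every run longer than 1."""
    ordered = sorted(numbers)          # built-in sort, O(n log n)
    result = []
    count = 1                          # length of the current run
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            count += 1
        else:
            if count > 1:
                result.append((ordered[i - 1], count))
            count = 1
    # Flush the final run, which has no differing successor.
    if ordered and count > 1:
        result.append((ordered[-1], count))
    return result


def report(numbers):
    """Build the output lines in the format the problem asks for."""
    duplicates = find_duplicates(numbers)
    if not duplicates:
        return ["No duplicates."]
    return ["%s %d" % (num, c) for num, c in duplicates]
```

Because the array is sorted before the scan, the duplicates come out in ascending order without a second sort.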