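A Microsoft Written-Test Question
Fresh from last weekend.
Given a string containing only common printable ASCII characters, spaces, and newlines, filter it as follows:
1. Remove leading and trailing whitespace;
2. Collapse each run of consecutive whitespace in the middle down to a single character;
3. Remove any whitespace immediately before or after a newline.
The problem isn't hard. But in keeping with Microsoft's usual style, the point of a question like this is not to check whether a candidate can write a program (though failing to produce one wouldn't look good); it is to check whether the candidate thinks through every angle. The skill shows in the details.
Following the "test first" principle, we can start from the test cases, as shown below: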
// Test cases for leading & trailing spaces.
char arr00[] = "hello_world";
char arr01[] = "hello world";
char arr02[] = " hello world";
char arr03[] = "hello world ";
char arr04[] = " hello world ";
// Test cases for consecutive spaces.
char arr05[] = "hello    world";
char arr06[] = "  hello    world  ";
// Test cases for spaces around new-lines.
char arr07[] = "hello \n world ";
char arr08[] = " hello world \n ";
char arr09[] = "\n hello world \n ";
// Corner cases
char arr10[] = " ";
char arr11[] = "\n";
char arr12[] = " \n ";
Readers are welcome to add more test cases.
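The cases above only list inputs. Under one reading of the rules (only spaces get stripped or collapsed, while the newlines themselves are always kept), the expected outputs can be written down and checked mechanically. Here is a minimal sketch: `squeeze_ref`, `filter_case`, and `count_failures` are hypothetical names, `squeeze_ref` is a deliberately simple reference filter rather than the implementation given in this post, and the inputs with runs of spaces are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference filter (an assumption, not the post's code):
 * one forward pass with separate read (r) and write (w) indices. */
void squeeze_ref(char *s, size_t len)
{
    size_t w = 0;           /* next write position; always <= r */
    int pending_space = 0;  /* one or more spaces seen since the last kept char */

    for (size_t r = 0; r < len; ++r) {
        char c = s[r];
        if (c == ' ') {
            pending_space = 1;
            continue;
        }
        /* Emit a single space only between two non-newline characters.
         * Leading spaces (w == 0) and spaces next to '\n' are dropped. */
        if (pending_space && w > 0 && s[w - 1] != '\n' && c != '\n')
            s[w++] = ' ';
        pending_space = 0;
        s[w++] = c;
    }
    s[w] = '\0';            /* trailing spaces were never written */
}

struct filter_case {
    const char *input;
    const char *expected;
};

/* Expected results, assuming newlines are preserved as-is. */
static const struct filter_case cases[] = {
    { "hello_world",          "hello_world"     },
    { "hello world",          "hello world"     },
    { "  hello world",        "hello world"     },
    { "hello world  ",        "hello world"     },
    { "  hello    world  ",   "hello world"     },
    { "hello \n world ",      "hello\nworld"    },
    { " hello world \n ",     "hello world\n"   },
    { "\n hello world \n ",   "\nhello world\n" },
    { " ",                    ""                },
    { "\n",                   "\n"              },
    { " \n ",                 "\n"              },
};

/* Runs every case through `filter` and returns the number of mismatches. */
int count_failures(void (*filter)(char *, size_t))
{
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; ++i) {
        char buf[64];
        assert(strlen(cases[i].input) < sizeof buf);
        strcpy(buf, cases[i].input);   /* filters work in place, so copy first */
        filter(buf, strlen(buf));
        if (strcmp(buf, cases[i].expected) != 0)
            ++failures;
    }
    return failures;
}
```

Since `count_failures` takes a function with the same `(char *, size_t)` signature as `filter_spaces`, the implementation from this post can be passed to it for comparison.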
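With these test cases in hand, we can think about the implementation. Walking from front to back and deleting characters in place means shifting the rest of the string again and again. It is better to walk from back to front, pack the kept characters against the end of the buffer, and move the result to the front once at the end.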
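The back-to-front idea is easier to see on a smaller problem first. As a rough sketch, the hypothetical helper `drop_spaces_backward` (not part of the original solution) simply deletes every space: it scans backward, packs the kept characters against the end of the buffer, and finishes with a single move to the front. It uses indices rather than pointers so that nothing ever points before the start of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove every ' ' from str (of length len) in place. */
void drop_spaces_backward(char *str, size_t len)
{
    assert(str != NULL);

    size_t dst = len;   /* one past the next write slot */
    size_t curr = len;  /* one past the next read slot */

    while (curr > 0) {
        --curr;
        /* dst >= curr always holds, so a write never clobbers unread input. */
        if (str[curr] != ' ')
            str[--dst] = str[curr];
    }

    /* Kept characters now sit in str[dst .. len-1] and str[len] is still '\0'.
     * One overlapping move (including the terminator) finishes the job. */
    memmove(str, str + dst, len - dst + 1);
}
```

Each character is read once and written at most once, and the only bulk move is the final `memmove`.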
The complete code is as follows:
#include <stdio.h>
#include <assert.h>
#include <string.h>
// Test cases for leading & trailing spaces.
char arr00[] = "hello_world";
char arr01[] = "hello world";
char arr02[] = " hello world";
char arr03[] = "hello world ";
char arr04[] = " hello world ";
// Test cases for consecutive spaces.
char arr05[] = "hello    world";
char arr06[] = "  hello    world  ";
// Test cases for spaces around new-lines.
char arr07[] = "hello \n world ";
char arr08[] = " hello world \n ";
char arr09[] = "\n hello world \n ";
// Corner cases
char arr10[] = " ";
char arr11[] = "\n";
char arr12[] = " \n ";
void filter_spaces(char *str, size_t len)
{
    if (len == 0) // empty string: nothing to filter.
        return;
    char *dst = str + len - 1;
    char *curr = str + len - 1;
    while (curr >= str && *curr == ' ') // check bounds before dereferencing.
        --curr; // skip trailing spaces.
    if (curr < str) { // all spaces.
        *str = '\0';
        return;
    }
    int after_space = 0;    // the last char written to dst was a space.
    int around_newline = 0; // only spaces seen since the last newline.
    while (curr >= str) {
        switch (*curr) {
        case ' ':
            if (after_space) { // a space followed by another space, omit it.
                --curr;
            } else if (around_newline) { // a space around a newline, omit it.
                --curr;
            } else {
                after_space = 1;
                *dst-- = *curr--;
            }
            break;
        case '\n':
            around_newline = 1;
            if (after_space) { // remove last recorded space.
                assert(*(dst + 1) == ' ');
                ++dst;
                after_space = 0;
            }
            *dst-- = *curr--;
            break;
        default: // other chars
            *dst-- = *curr--;
            after_space = 0;
            around_newline = 0;
            break;
        }
    }
    ++dst; // dst now points at the first kept character.
    if (*dst == ' ') // remove the single remaining leading space.
        ++dst;
    // now the filtered string size is ( (str + len) - dst + 1 ),
    // including the trailing '\0'
    memmove(str, dst, (str + len) - dst + 1);
}
// No trailing semicolon after while (0): the caller supplies it.
#define TEST_STR(str) do {               \
        filter_spaces(str, strlen(str)); \
        printf(#str ": \"%s\"\n", str);  \
    } while (0)
int main(int argc, char *argv[])
{
    TEST_STR(arr00);
    TEST_STR(arr01);
    TEST_STR(arr02);
    TEST_STR(arr03);
    TEST_STR(arr04);
    TEST_STR(arr05);
    TEST_STR(arr06);
    TEST_STR(arr07);
    TEST_STR(arr08);
    TEST_STR(arr09);
    TEST_STR(arr10);
    TEST_STR(arr11);
    TEST_STR(arr12);
    return 0;
}
Note the final memmove: the source and destination regions may overlap, so neither memcpy nor strcpy can be used here.
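To isolate that last step, here is a small sketch with a hypothetical helper, `shift_to_front`, which moves the tail of a buffer down to its start. The source and destination ranges overlap whenever the tail is longer than the gap in front of it. The C standard leaves `memcpy` and `strcpy` undefined in that situation, while `memmove` is specified to behave as if it copied through a temporary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: move the substring starting at offset `from`
 * to the front of buf, including the terminating '\0'. */
void shift_to_front(char *buf, size_t from)
{
    assert(from <= strlen(buf));

    size_t n = strlen(buf + from) + 1;  /* bytes to move, counting '\0' */

    /* Source is buf[from .. from+n-1], destination is buf[0 .. n-1].
     * They overlap when from < n, so memmove is the only safe choice. */
    memmove(buf, buf + from, n);
}
```

For example, calling `shift_to_front` on a buffer holding `"xxhello world"` with `from` set to 2 moves 12 bytes over a gap of only 2, so the two regions overlap heavily.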