String Comparison Operations in C#

Comparing strings is a very common operation. There are generally two reasons to do it:

  • Testing for equality
  • Sorting strings

The String API provides the following methods for testing equality and for ordering:

        public override bool Equals(object obj);
        public bool Equals(string value);
        public static bool Equals(string a, string b);
        public bool Equals(string value, StringComparison comparisonType);
        public static bool Equals(string a, string b, StringComparison comparisonType);

        public static int Compare(string strA, string strB);
        public static int Compare(string strA, string strB, bool ignoreCase);
        public static int Compare(string strA, string strB, StringComparison comparisonType);
        public static int Compare(string strA, string strB, bool ignoreCase, CultureInfo culture);
        public static int Compare(string strA, string strB, CultureInfo culture, CompareOptions options);
        public static int Compare(string strA, int indexA, string strB, int indexB, int length);
        public static int Compare(string strA, int indexA, string strB, int indexB, int length, bool ignoreCase);
        public static int Compare(string strA, int indexA, string strB, int indexB, int length, StringComparison comparisonType);
        public static int Compare(string strA, int indexA, string strB, int indexB, int length, bool ignoreCase, CultureInfo culture);
        public static int Compare(string strA, int indexA, string strB, int indexB, int length, CultureInfo culture, CompareOptions options);

        public static int CompareOrdinal(string strA, string strB);
        public static int CompareOrdinal(string strA, int indexA, string strB, int indexB, int length);

        public int CompareTo(object value);
        public int CompareTo(string strB);

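Many of these overloads take a parameter of the StringComparison enum type. MSDN documents it as follows:

[Image: the StringComparison enumeration as documented on MSDN]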

Here is a short program that times Compare(string strA, string strB, StringComparison comparisonType) with StringComparison.CurrentCulture and with StringComparison.Ordinal, and also times CompareOrdinal:

    using System;
    using System.Diagnostics;

    class Program
    {
        static void Main(string[] args)
        {
            string strA = "asdfadsfasdfew我啊地方的asd";
            string strB = "adsfeaqfaead啊多发安德森efadsfa";
            Stopwatch sw = new Stopwatch();

            // Culture-sensitive comparison using the current culture.
            sw.Start();
            for (int i = 0; i < 1000000; i++)
            {
                string.Compare(strA, strB, StringComparison.CurrentCulture);
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            // Restart() zeroes the elapsed time and starts timing again.
            // Reset() alone leaves the stopwatch stopped, so nothing is measured.
            sw.Restart();
            for (int i = 0; i < 1000000; i++)
            {
                string.Compare(strA, strB, StringComparison.Ordinal);
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            // The dedicated ordinal method.
            sw.Restart();
            for (int i = 0; i < 1000000; i++)
            {
                string.CompareOrdinal(strA, strB);
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
    }

The output was:

[Image: console output showing the three elapsed times in milliseconds]

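The difference is obvious. StringComparison.CurrentCulture explicitly applies the current culture, whereas StringComparison.Ordinal ignores culture entirely and compares the strings by their UTF-16 code unit values. That makes it the fastest way to compare strings.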
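Speed is not the only difference: the two modes can also disagree on ordering. The following is a minimal sketch of that; the class and the `Sign` helper are hypothetical names introduced for illustration, and InvariantCulture stands in for CurrentCulture so that the result does not depend on the machine's settings.

```csharp
using System;

static class OrdinalVersusCultureDemo
{
    // Reduce a comparison result to -1, 0, or 1; only the sign is meaningful.
    public static int Sign(string a, string b, StringComparison kind)
    {
        return Math.Sign(string.Compare(a, b, kind));
    }

    static void Main()
    {
        // Ordinal compares code units: 'a' (97) comes after 'B' (66).
        Console.WriteLine(Sign("a", "B", StringComparison.Ordinal)); // 1

        // A culture-aware comparison orders alphabetically, so "a" is expected
        // to sort before "B" whenever culture data is available.
        Console.WriteLine(Sign("a", "B", StringComparison.InvariantCulture));
    }
}
```

Code that sorts text for display and code that sorts keys for internal lookup can therefore legitimately produce different orders for the same input.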

Here is the source as decompiled with .NET Reflector:

public static int Compare(string strA, string strB, StringComparison comparisonType)
        {
            if ((comparisonType < StringComparison.CurrentCulture) || (comparisonType > StringComparison.OrdinalIgnoreCase))
            {
                throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
            }
            if (strA == strB)
            {
                return 0;
            }
            if (strA == null)
            {
                return -1;
            }
            if (strB == null)
            {
                return 1;
            }
            switch (comparisonType)
            {
                case StringComparison.CurrentCulture:
                    return CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.None);

                case StringComparison.CurrentCultureIgnoreCase:
                    return CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.IgnoreCase);

                case StringComparison.InvariantCulture:
                    return CultureInfo.InvariantCulture.CompareInfo.Compare(strA, strB, CompareOptions.None);

                case StringComparison.InvariantCultureIgnoreCase:
                    return CultureInfo.InvariantCulture.CompareInfo.Compare(strA, strB, CompareOptions.IgnoreCase);

                case StringComparison.Ordinal:
                    return CompareOrdinalHelper(strA, strB);

                case StringComparison.OrdinalIgnoreCase:
                    if (!strA.IsAscii() || !strB.IsAscii())
                    {
                        return TextInfo.CompareOrdinalIgnoreCase(strA, strB);
                    }
                    return CompareOrdinalIgnoreCaseHelper(strA, strB);
            }
            throw new NotSupportedException(Environment.GetResourceString("NotSupported_StringComparison"));
        }
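The test above also timed String.CompareOrdinal, which is just as fast. Its source turns out to be the same as the StringComparison.Ordinal branch of Compare; the method is simply a special case of Compare: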
 
public static int CompareOrdinal(string strA, string strB)
        {
            if (strA == strB)
            {
                return 0;
            }
            if (strA == null)
            {
                return -1;
            }
            if (strB == null)
            {
                return 1;
            }
            return CompareOrdinalHelper(strA, strB);
        }
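The equivalence can be checked with a small sketch like the one below. `CompareOrdinalDemo` and `SameOrdering` are hypothetical names for illustration. Only the sign of each result is compared, because the exact magnitude returned is an implementation detail.

```csharp
using System;

static class CompareOrdinalDemo
{
    // True when Compare(..., Ordinal) and CompareOrdinal agree on the ordering.
    public static bool SameOrdering(string a, string b)
    {
        int viaCompare = Math.Sign(string.Compare(a, b, StringComparison.Ordinal));
        int viaOrdinal = Math.Sign(string.CompareOrdinal(a, b));
        return viaCompare == viaOrdinal;
    }

    static void Main()
    {
        Console.WriteLine(SameOrdering("apple", "Apple")); // True
        Console.WriteLine(SameOrdering("abc", "abd"));     // True

        // Null handling matches the decompiled source: null sorts before
        // any non-null string, and two nulls compare as equal.
        Console.WriteLine(string.CompareOrdinal(null, "x") < 0); // True
        Console.WriteLine(string.CompareOrdinal(null, null));    // 0
    }
}
```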

 

 

Next, the source of String.CompareTo():

[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
       public int CompareTo(string strB)
       {
           if (strB == null)
           {
               return 1;
           }
           return CultureInfo.CurrentCulture.CompareInfo.Compare(this, strB, CompareOptions.None);
       }

This is equivalent to calling Compare with StringComparison.CurrentCulture.

StringComparer also implements a Compare() method for comparing strings. Here is its source:

public int Compare(object x, object y)
      {
          if (x == y)
          {
              return 0;
          }
          if (x == null)
          {
              return -1;
          }
          if (y == null)
          {
              return 1;
          }
          string str = x as string;
          if (str != null)
          {
              string str2 = y as string;
              if (str2 != null)
              {
                  return this.Compare(str, str2);
              }
          }
          IComparable comparable = x as IComparable;
          if (comparable == null)
          {
              throw new ArgumentException(Environment.GetResourceString("Argument_ImplementIComparable"));
          }
          return comparable.CompareTo(y);
      }
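In practice StringComparer is mostly passed to sorting routines and collections instead of being called directly. The sketch below assumes that usage; `StringComparerDemo`, `SortOrdinal`, and the header dictionary are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

static class StringComparerDemo
{
    // Returns a sorted copy of the input using ordinal (code unit) ordering.
    public static string[] SortOrdinal(string[] items)
    {
        string[] copy = (string[])items.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }

    static void Main()
    {
        // Uppercase letters have lower code units than lowercase ones.
        string[] sorted = SortOrdinal(new[] { "b", "A", "a" });
        Console.WriteLine(string.Join(",", sorted)); // A,a,b

        // A dictionary whose keys are matched case-insensitively, without culture rules.
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        headers["Content-Type"] = "text/plain";
        Console.WriteLine(headers.ContainsKey("content-type")); // True
    }
}
```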

 

 

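If a program uses strings only for internal purposes, such as path names, file names, URLs, environment variables, reflection, or XML tags, those strings normally stay inside the program and are never shown to the user. Compare them with StringComparison.Ordinal or with the String.CompareOrdinal() method.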
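As a rough sketch of what that looks like, the helpers below compare an XML tag name and a file extension ordinally. `InternalStringDemo`, `IsTag`, and `HasExtension` are hypothetical names. Using OrdinalIgnoreCase for the extension is an assumption that suits case-insensitive file systems; on a case-sensitive file system plain Ordinal may be the better choice.

```csharp
using System;

static class InternalStringDemo
{
    // XML tag names are case-sensitive, so match them exactly.
    public static bool IsTag(string name, string expected)
    {
        return string.Equals(name, expected, StringComparison.Ordinal);
    }

    // Assumption: the file system ignores case, so ignore it here too,
    // but still without applying any culture-specific rules.
    public static bool HasExtension(string path, string extension)
    {
        return path.EndsWith(extension, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(IsTag("item", "item"));               // True
        Console.WriteLine(IsTag("Item", "item"));               // False
        Console.WriteLine(HasExtension("Report.TXT", ".txt"));  // True
    }
}
```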

 

Summary and recommendations:

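  1. Use overloads that explicitly specify the comparison rules. In general, that means the overloads that take a StringComparison parameter.
  2. For culture-agnostic string comparisons, use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the default. This also improves performance.
  3. When producing output for the user, use string operations based on StringComparison.CurrentCulture.
  4. Use an overload of String.Equals to test whether two strings are equal.
  5. Do not call an overload of String.Compare or CompareTo and check for a return value of 0 to decide whether two strings are equal. Those methods are meant for ordering strings, not for testing equality.
  6. When normalizing strings for comparison, use String.ToUpperInvariant and not ToLowerInvariant, because Microsoft has optimized the code that performs uppercase comparisons. ToUpper and ToLower are avoided because they are culture-sensitive.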
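A few of these recommendations can be sketched as follows. `EqualityDemo`, `SameKey`, and `Normalize` are hypothetical names used only for illustration.

```csharp
using System;

static class EqualityDemo
{
    // Recommendations 1, 2 and 4: an explicit, culture-agnostic equality test.
    public static bool SameKey(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    // Recommendation 6: normalize with the invariant uppercase mapping.
    public static string Normalize(string s)
    {
        return s.ToUpperInvariant();
    }

    static void Main()
    {
        Console.WriteLine(SameKey("Id", "ID"));   // True
        Console.WriteLine(SameKey("Id", "Idx"));  // False

        // The static Equals overload accepts null without throwing.
        Console.WriteLine(string.Equals(null, "ID", StringComparison.OrdinalIgnoreCase)); // False

        Console.WriteLine(Normalize("id"));       // ID
    }
}
```

Calling the static String.Equals overload states the intent (equality) and the rules (ordinal, case-insensitive) in one place, which is what recommendation 5 is asking for in place of `Compare(...) == 0`.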
Posted 2012-03-23 13:55 by JunBird