Why is System.Attribute's GetHashCode method designed this way?

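Yesterday, while implementing the code for "Improving the ASP.NET MVC Validation Mechanism through Extensions [Usage]" (《通过扩展改善ASP.NET MVC的验证机制[使用篇]》), a small problem with Attribute cost me most of the day. I eventually tracked down the cause and fixed it, but I still do not know why Microsoft designed it this way. Without further ado, let us first reproduce the problem I ran into.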

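Contents:
1. Reproducing the problem
2. Testing equality with Attribute's Equals and GetHashCode methods
3. The hash code of an Attribute object versus that of its type
4. What if a property/field is added to FooAttribute?
5. How is Attribute's GetHashCode implemented?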

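1. Reproducing the problem

As the following snippet shows, we define two attributes. The abstract BaseAttribute defines a Name property, while FooAttribute derives directly from BaseAttribute and defines no properties or fields of its own. We apply three FooAttribute instances to the type Bar, with Name set to A, B, and C respectively.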

    [Foo(Name = "A")]
    [Foo(Name = "B")]
    [Foo(Name = "C")]
    public class Bar
    {
    }

    [AttributeUsage(AttributeTargets.Class, Inherited = true, AllowMultiple = true)]
    public abstract class BaseAttribute : Attribute
    {
        public string Name { get; set; }
    }

    public class FooAttribute : BaseAttribute
    {
    }

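My program contains code similar to the following. We call GetCustomAttributes on the Bar type to get all of its attributes and keep only those of type FooAttribute; this list obviously contains three FooAttribute objects whose Name values are A, B, and C. We then remove the FooAttribute whose Name is C from the list and finally print the Name of each remaining FooAttribute.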

    var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList<FooAttribute>();
    var attribute = attributes.First(item => item.Name == "C");
    attributes.Remove(attribute);
    Array.ForEach(attributes.ToArray(), a => Console.WriteLine(a.Name));

Most people would expect A and B to be printed, but the actual result is B and C. Here is the output:

    B
    C
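List<T>.Remove locates the element to delete through the default equality comparer, which ends up calling Equals, so it removes the first element that compares equal rather than necessarily the instance that was passed in. A minimal sketch of one way around this, assuming the same Bar, BaseAttribute, and FooAttribute definitions as above (the Program class and the index variable are illustrative only): find the position with a predicate and remove by index, which never consults Equals.

```csharp
using System;
using System.Linq;

[AttributeUsage(AttributeTargets.Class, Inherited = true, AllowMultiple = true)]
public abstract class BaseAttribute : Attribute
{
    public string Name { get; set; }
}

public class FooAttribute : BaseAttribute
{
}

[Foo(Name = "A")]
[Foo(Name = "B")]
[Foo(Name = "C")]
public class Bar
{
}

public static class Program
{
    public static void Main()
    {
        var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList();

        // FindIndex evaluates only the predicate; it does not call Equals.
        int index = attributes.FindIndex(item => item.Name == "C");
        if (index >= 0)
        {
            // RemoveAt deletes by position, so the C instance itself is removed.
            attributes.RemoveAt(index);
        }

        // Prints the names of the two remaining attributes, A and B
        // (reflection does not guarantee the order in which they are listed).
        attributes.ForEach(a => Console.WriteLine(a.Name));
    }
}
```

Because nothing here depends on how Attribute implements Equals, the result should be the same whichever way the runtime compares attributes.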

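2. Testing equality with Attribute's Equals and GetHashCode methods

Next we test whether two FooAttribute objects are equal. As the following snippet shows, we call the constructor directly to create two FooAttribute objects whose Name properties are set to "ABC" and "123". The last two statements call Equals and GetHashCode to determine whether the two FooAttribute objects are equal.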

    FooAttribute attribute1 = new FooAttribute { Name = "ABC" };
    FooAttribute attribute2 = new FooAttribute { Name = "123" };
    Console.WriteLine("attribute1.Equals(attribute2) = {0}", attribute1.Equals(attribute2));
    Console.WriteLine("attribute1.GetHashCode() == attribute2.GetHashCode() = {0}", attribute1.GetHashCode() == attribute2.GetHashCode());

The output below shows that these two FooAttribute objects, which clearly have different Name values, are nevertheless considered "equal":

    attribute1.Equals(attribute2) = True
    attribute1.GetHashCode() == attribute2.GetHashCode() = True

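3. The hash code of an Attribute object versus that of its type

In fact, the hash code of both FooAttribute objects equals the hash code of the FooAttribute type itself. To show this, we add two more lines that compare the hash code of typeof(FooAttribute) with that of a FooAttribute object.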

    FooAttribute attribute1 = new FooAttribute { Name = "ABC" };
    FooAttribute attribute2 = new FooAttribute { Name = "123" };
    Console.WriteLine("attribute1.Equals(attribute2) = {0}", attribute1.Equals(attribute2));
    Console.WriteLine("attribute1.GetHashCode() == attribute2.GetHashCode() = {0}", attribute1.GetHashCode() == attribute2.GetHashCode());
    Console.WriteLine("attribute1.GetHashCode() == typeof(FooAttribute).GetHashCode() = {0}",
        attribute1.GetHashCode() == typeof(FooAttribute).GetHashCode());

The output below shows that typeof(FooAttribute) and the FooAttribute object have the same hash code:

    attribute1.Equals(attribute2) = True
    attribute1.GetHashCode() == attribute2.GetHashCode() = True
    attribute1.GetHashCode() == typeof(FooAttribute).GetHashCode() = True

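4. What if a property/field is added to FooAttribute?

But do not assume that Attribute's GetHashCode always returns the hash code of the type itself. If we define a property/field on FooAttribute, the equality test gives a different result. To show this, we define a Type property on FooAttribute.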

    public class FooAttribute : BaseAttribute
    {
        public Type Type { get; set; }
    }

Then we specify the Type property when creating the FooAttribute objects:

    FooAttribute attribute1 = new FooAttribute { Name = "ABC", Type = typeof(string) };
    FooAttribute attribute2 = new FooAttribute { Name = "ABC", Type = typeof(int) };
    Console.WriteLine("attribute1.Equals(attribute2) = {0}", attribute1.Equals(attribute2));
    Console.WriteLine("attribute1.GetHashCode() == attribute2.GetHashCode() = {0}", attribute1.GetHashCode() == attribute2.GetHashCode());
    Console.WriteLine("attribute1.GetHashCode() == typeof(FooAttribute).GetHashCode() = {0}",
        attribute1.GetHashCode() == typeof(FooAttribute).GetHashCode());

Now the two FooAttribute objects with different Type values are no longer equal, as the following output shows:
 
    attribute1.Equals(attribute2) = False
    attribute1.GetHashCode() == attribute2.GetHashCode() = False
    attribute1.GetHashCode() == typeof(FooAttribute).GetHashCode() = False

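5. How is Attribute's GetHashCode implemented?

Attribute's hash code is derived from the value of a field on the attribute's own type: specifically, the first field whose value is non-null and not an array. Values stored in private fields of base classes, which is where an inherited auto-implemented property such as Name keeps its data, do not take part, because GetFields does not return those fields. If no suitable field is found, the hash code of the type itself is used. This can be seen from the implementation of Attribute's GetHashCode method below; Equals follows similar logic.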

   1: [SecuritySafeCritical]
   2: public override int GetHashCode()
   3: {
   4:     Type type = base.GetType();
   5:     FieldInfo[] fields = type.GetFields(BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance);
   6:     object obj2 = null;
   7:     for (int i = 0; i < fields.Length; i++)
   8:     {
   9:         object obj3 = ((RtFieldInfo) fields[i]).InternalGetValue(this, false, false);
  10:         if ((obj3 != null) && !obj3.GetType().IsArray)
  11:         {
  12:             obj2 = obj3;
  13:         }
  14:         if (obj2 != null)
  15:         {
  16:             break;
  17:         }
  18:     }
  19:     if (obj2 != null)
  20:     {
  21:         return obj2.GetHashCode();
  22:     }
  23:     return type.GetHashCode();
  24: }
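Why the Name value never takes part can be checked with reflection. The sketch below assumes the BaseAttribute and FooAttribute definitions from section 1 and makes the same GetFields call as line 5 of the listing above; the Program class and the flags constant are illustrative names only. Name is an auto-implemented property, so its value lives in a compiler-generated private field on BaseAttribute, and GetFields does not return private fields declared on base classes.

```csharp
using System;
using System.Reflection;

public abstract class BaseAttribute : Attribute
{
    public string Name { get; set; }
}

public class FooAttribute : BaseAttribute
{
}

public static class Program
{
    public static void Main()
    {
        const BindingFlags flags =
            BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance;

        // Queried through the derived type, the private backing field of Name is not visible.
        FieldInfo[] fooFields = typeof(FooAttribute).GetFields(flags);
        Console.WriteLine("FooAttribute fields: {0}", fooFields.Length);

        // Queried on the declaring type, the compiler-generated backing field is returned.
        FieldInfo[] baseFields = typeof(BaseAttribute).GetFields(flags);
        Console.WriteLine("BaseAttribute fields: {0}", baseFields.Length);
        foreach (FieldInfo field in baseFields)
        {
            Console.WriteLine("  {0}", field.Name);
        }
    }
}
```

With an empty field list for FooAttribute, the loop in the listing never assigns obj2, which is why the method falls through to type.GetHashCode().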
posted @ 2012-01-12 17:03 Artech

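#1 2012-01-12 17:35 Leo.Z
Learned something new.

#2 2012-01-12 17:58 zzfff
When designing a class, you cannot ignore whether it has value semantics or reference semantics. If it has value semantics, you should (preferably) implement the IEquatable<> interface or provide a dedicated IEqualityComparer<> class.

#3 2012-01-12 18:44 水牛刀刀
My bold guess is that this is a bug: line 5 of GetHashCode should use GetProperties instead of GetFields. My reasons: (1) for an Attribute, it is the properties that carry business meaning, not the fields. (2) If it were changed to GetProperties, inherited properties (which also carry business meaning) would be retrieved by default; GetFields evidently fails to retrieve the private fields, and that produces this bizarre result.

#4 2012-01-12 19:24 _冻结_
Waiting for an expert to answer.

#5 2012-01-12 19:56 Fish Li
@水牛刀刀
Properties are methods; they do not store data.
Not walking the inheritance hierarchy improves performance.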

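#6 [author] 2012-01-12 20:36 Artech
Quoting Fish Li:
@水牛刀刀
Properties are methods; they do not store data.
Not walking the inheritance hierarchy improves performance.

Apart from performance, I really cannot think of a better reason either. Still, I personally think this does count as a bug: I simply cannot accept that two objects holding different data can be equal.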

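#7 2012-01-12 20:36 kiminozo
Learned something. With Attribute implemented that way, you can usually be lazy and skip overriding GetHashCode(), but in a situation like this you are out of luck.
From now on I will be careful with custom attributes that have no members.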

#8 2012-01-12 21:41 Fish Li
@Artech
Haha, I can accept it, because I would never do it that way. I would write it like this:
(from a in typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>()
where a.Name != "C"
select a).ToList().ForEach(x => Console.WriteLine(x.Name));

#9 2012-01-12 22:19 阿毅
@Fish Li
Agreed. By design, Attribute instances should rarely, if ever, be compared directly with each other; querying by condition is far more common.

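#10 2012-01-13 00:47 水牛刀刀
Quoting Fish Li:
@水牛刀刀
Properties are methods; they do not store data.
Not walking the inheritance hierarchy improves performance.

Performance optimization has to be built on logical correctness.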

#11 2012-01-13 00:51 水牛刀刀
@阿毅
Then Microsoft could simply not override GetHashCode and Equals.

#12 2012-01-13 01:28 水牛刀刀
I asked this question on SO: http://stackoverflow.com/questions/8839445/gethashcode-and-equals-are-implemented-incorrectly-in-system-attribute The answers there should be more authoritative.

#13 2012-01-13 09:13 JerryHao
Another question: why is only the first member found compared?

#14 [author] 2012-01-13 09:33 Artech
Quoting Fish Li:
@Artech
Haha, I can accept it, because I would never do it that way. I would write it like this:
(from a in typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>()
where a.Name != "C"
select a).ToList().ForEach(x => Console.WriteLine(x.Name));

If I do not override Equals and GetHashCode, this is how I will write it from now on :)

#15 [author] 2012-01-13 09:34 Artech
Quoting JerryHao: Another question: why is only the first member found compared?

That puzzles me too :)

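#16 [author] 2012-01-13 09:37 Artech
Quoting 水牛刀刀: I asked this question on SO: http://stackoverflow.com/questions/8839445/gethashcode-and-equals-are-implemented-incorrectly-in-system-attribute The answers there should be more authoritative.

I did not expect Stack Overflow to respond so quickly. I posted there a long time ago, nobody answered, and I have hardly used it since :) It looks like I should make more use of Stack Overflow.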

#17 [author] 2012-01-13 09:40 Artech
@水牛刀刀
The first "answer" from SO:
A clear bug, no. A good idea, perhaps or perhaps not.

What does it mean for one thing to be equal to another? We could get quite philosophical, if we really wanted to.

Being only slightly philosophical, there are a few things that must hold:

Equality is reflexive: Identity entails equality. x.Equals(x) must hold.
Equality is symmetric. If x.Equals(y) then y.Equals(x) and if !x.Equals(y) then !y.Equals(x).
Equality is transitive. If x.Equals(y) and y.Equals(z) then x.Equals(z).
There's a few others, though only these can directly be reflected by the code for Equals() alone.

If an implementation of an override of object.Equals(object), IEquatable<T>.Equals(T), IEqualityComparer.Equals(object, object), IEqualityComparer<T>.Equals(T, T), == or of != does not meet the above, it's a clear bug.

The other method that reflects equality in .NET are object.GetHashCode(), IEqualityComparer.GetHashCode(object) and IEqualityComparer<T>.GetHashCode(T). Here there's the simple rule:

If a.Equals(b) then it must hold that a.GetHashCode() == b.GetHashCode(). The equivalent holds for IEqualityComparer and IEqualityComparer<T>.

If that doesn't hold, then again we've got a bug.

Beyond that, there are no over-all rules on what equality must mean. It depends on the semantics of the class provided by its own Equals() overrides or by those imposed upon it by an equality comparer. Of course, those semantics should either be blatantly obvious or else documented in the class or the equality comparer.

In all, how does an Equals and/or a GetHashCode have a bug:

If it fails to provide the reflexive, symmetric and transitive properties detailed above.
If the relationship between GetHashCode and Equals is not as above.
If it doesn't match its documented semantics.
If it throws an inappropriate exception.
If it wanders off into an infinite loop.
In practice, if it takes so long to return as to cripple things, though one could argue there's a theory vs. practice thing here.
With the overrides on Attribute, the Equals does have the reflexive, symmetric and transitive properties, its GetHashCode does match it, and the documentation for its Equals override is:

This API supports the .NET Framework infrastructure and is not intended to be used directly from your code.

You can't really say your example disproves that!

Since the code you complain about doesn't fail on any of these points, it's not a bug.

There's a bug though in this code:
var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList<FooAttribute>();
var getC = attributes.First(item => item.Name == "C");
attributes.Remove(getC);

You first ask for an item that fulfills a criteria, and then ask for one that is equal to it to be removed. There's no reason without examining the semantics of equality for the type in question to expect that getC would be removed.
What you should do is:
bool calledAlready = false;
attributes.RemoveAll(item => {
  if(!calledAlready && item.Name == "C")
  {
    return calledAlready = true;
  }
  return false;
});

That is to say, we use a predicate that matches the first attribute with Name == "C" and no other.

#18 [author] 2012-01-13 09:44 Artech
@水牛刀刀
The second "answer" from SO:
Yep, a bug as others have already mentioned in the comments. I can suggest a few possible fixes:

Option 1, Don't use inheritance in the Attribute class, this will allow the default implementation to function. The other option is use a custom comparer to ensure you are using reference equality when removing the item. You can implement a comparer easily enough. Just use Object.ReferenceEquals for comparison and for your use you could use the type's hash code or use System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode.
public sealed class ReferenceEqualityComparer<T> : IEqualityComparer<T>
{
    bool IEqualityComparer<T>.Equals(T x, T y)
    {
        return Object.ReferenceEquals(x, y);
    }
    int IEqualityComparer<T>.GetHashCode(T obj)
    {
        return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode(obj);
    }
}
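A usage sketch for the comparer above, assuming the Bar and FooAttribute definitions from the article. List<T>.Remove does not accept a comparer, so this version filters with Enumerable.Except instead; that substitution, the Program class, and the remaining variable are choices made for this illustration and are not part of the SO answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class ReferenceEqualityComparer<T> : IEqualityComparer<T>
{
    bool IEqualityComparer<T>.Equals(T x, T y)
    {
        return Object.ReferenceEquals(x, y);
    }

    int IEqualityComparer<T>.GetHashCode(T obj)
    {
        return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode(obj);
    }
}

[AttributeUsage(AttributeTargets.Class, Inherited = true, AllowMultiple = true)]
public abstract class BaseAttribute : Attribute
{
    public string Name { get; set; }
}

public class FooAttribute : BaseAttribute
{
}

[Foo(Name = "A")]
[Foo(Name = "B")]
[Foo(Name = "C")]
public class Bar
{
}

public static class Program
{
    public static void Main()
    {
        var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList();
        var getC = attributes.First(item => item.Name == "C");

        // The comparer's members are explicit interface implementations,
        // so it is used through the interface type.
        IEqualityComparer<FooAttribute> comparer = new ReferenceEqualityComparer<FooAttribute>();

        // Except drops only the element that is the very same instance as getC;
        // Attribute.Equals and Attribute.GetHashCode are never consulted.
        var remaining = attributes.Except(new[] { getC }, comparer).ToList();
        remaining.ForEach(a => Console.WriteLine(a.Name));
    }
}
```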

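#19 2012-01-13 10:09 君之蘭
A property is indeed a method, but it still reads the value of its backing field,
and GetFields cannot get at that backing field.
Treating two attributes as equal without checking the base class's properties makes no logical sense.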

#20 2012-01-13 10:31 ray.li [unregistered user]
var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList<FooAttribute>();
var getC = attributes.First(item => item.Name == "C");
attributes.Remove(getC);
attributes.ForEach(a => Console.WriteLine(a.Name));
Console.WriteLine("=====================");
(typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().Where(u => u.Name != "C").ToList()).ForEach(
u => Console.WriteLine(u.Name));


Printing both together gives: A,B,A,B
If it is changed to:
var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList<FooAttribute>();
var getC = attributes.First(item => item.Name == "C");
attributes.Remove(getC);
foreach (var attribute in attributes)
Console.WriteLine(attribute.Name);
Console.WriteLine("=====================");
(typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().Where(u => u.Name != "C").ToList()).ForEach(
u => Console.WriteLine(u.Name));

the result is: A,C,A,B
Why is that?

#21 [author] 2012-01-13 10:59 Artech
Quoting ray.li:
var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList<FooAttribute>();
var getC = attributes.First(item => item.Name == "C");
attributes.Remove(getC);
attributes.ForEach(a => Console.WriteLine(a.Name));
Console.WriteLine("==...

This code has three possible outputs:
A
B

A
C

B
C

because it is always the "first" FooAttribute that gets removed!

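#22 2012-01-13 14:05 Ivony...
Look closely at the GetHashCode logic: only the first non-array field is considered; none of the other fields are.

In Equals, however, all the fields are compared.

It does look as though the base class's fields were missed when the fields are retrieved.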

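#23 2012-01-13 14:38 水牛刀刀
@Artech
Yes, SO is both fast and high quality, so that is usually where I go to discuss problems.

#24 2012-01-13 17:56 黑曜石
For comparing objects, it is still better to override Equals and compare all the properties and their values.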
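The suggestion in #24 can be sketched as follows. This is one possible shape, not the article author's code: it assumes that equality for these attributes should mean "same runtime type and same Name", and it overrides GetHashCode alongside Equals so the two stay consistent. The Program class is an illustrative name.

```csharp
using System;
using System.Linq;

[AttributeUsage(AttributeTargets.Class, Inherited = true, AllowMultiple = true)]
public abstract class BaseAttribute : Attribute
{
    public string Name { get; set; }

    // Assumption for this sketch: equal means same runtime type and same Name.
    public override bool Equals(object obj)
    {
        var other = obj as BaseAttribute;
        return other != null
            && other.GetType() == GetType()
            && string.Equals(other.Name, Name, StringComparison.Ordinal);
    }

    // Kept consistent with Equals: objects that compare equal return the same hash code.
    public override int GetHashCode()
    {
        int nameHash = Name == null ? 0 : Name.GetHashCode();
        return GetType().GetHashCode() ^ nameHash;
    }
}

public class FooAttribute : BaseAttribute
{
}

[Foo(Name = "A")]
[Foo(Name = "B")]
[Foo(Name = "C")]
public class Bar
{
}

public static class Program
{
    public static void Main()
    {
        var attributes = typeof(Bar).GetCustomAttributes(true).OfType<FooAttribute>().ToList();
        var attribute = attributes.First(item => item.Name == "C");

        // Remove now matches on Name, so the element named C is the one removed.
        attributes.Remove(attribute);
        attributes.ForEach(a => Console.WriteLine(a.Name));
    }
}
```

Name is mutable, so the hash code changes if Name is reassigned while the attribute sits in a hash-based collection; the sketch assumes attributes are treated as read-only once created. A derived attribute that adds its own state, such as the Type property from section 4, would need to extend both overrides.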

#25 2012-01-13 17:58 黑曜石
By the way, the question I asked you in the comments yesterday or the day before is solved.
Add the following routing code in global:
routes.Add(new Route("", new MvcRouteHandler())
{
// default system route
Defaults = new RouteValueDictionary(new { controller = "Home", action = "Login", id = UrlParameter.Optional }),
// register the default area value
DataTokens = new RouteValueDictionary(new { area = "Home", namespaces = new[] { "AF.Areas.Areas.Home" } })
});
The default route can then be deleted.
This way http://localhost/ goes straight to
http://localhost/Home/Home/Login without typing the full URL.
