Differences Between the Stack and the Heap in .NET (Comparison) (Part 2)

English original:

Even though with the .NET framework we don't have to actively worry about memory management and garbage collection (GC), we still have to keep memory management and GC in mind in order to optimize the performance of our applications. Also, having a basic understanding of how memory management works will help explain the behavior of the variables we work with in every program we write. In this article I'll cover some of the behaviors we need to be aware of when passing parameters to methods.

In Part I we covered the basics of the Heap and Stack functionality and where Value Types and Reference Types are allocated as our program executes. We also covered the basic idea of what a Pointer is.

Parameters, the Big Picture.

Here's the detailed view of what happens as our code executes. We covered the basics of what happens when we make a method call in Part I. Let's get into more detail...

When we make a method call here's what happens:

  1. Space is allocated and the method itself is copied from the instance of our object to the Stack for execution (called a Frame). This is only the bits containing the instructions required to execute the method and includes no data items.
  2. The calling address (a pointer) is placed on the stack. This is basically a GOTO instruction so when the thread finishes running our method it knows where to go back to in order to continue execution. (However, this is a nice-to-know, not a need-to-know, because it will not affect how we code.)
  3. Space is allocated for our method parameters and they are copied over. This is what we want to look at more closely.
  4. Control is passed to the base of the frame and the thread starts executing code. Hence, we have another method on the "call stack".

The code:

          public int AddFive(int pValue)
          {
                int result;
                result = pValue + 5;
                return result;

          }

Will make the stack look like this:


 
As discussed in Part I, parameter placement on the stack is handled differently depending on whether the parameter is a value type or a reference type. A value type is copied over, and for a reference type the reference is copied over.

Passing Value Types.

Here's the catch with value types...

First, when we pass a value type, space is allocated and the value in our type is copied to the new space on the stack. Look at the following method:

     class Class1
     {
          public void Go()
          {
              int x = 5;
              AddFive(x);                       // the return value is discarded
              Console.WriteLine(x.ToString());  // still prints 5
          }

          public int AddFive(int pValue)
          {
              pValue += 5;                      // changes only the local copy
              return pValue;
          }
     }

The method is placed on the stack and as it executes, space for "x" is placed on the stack with a value of 5.


 
Next, AddFive() is placed on the stack with space for its parameters, and the value is copied, bit by bit, from x.


 
When AddFive() has finished execution, the thread is passed back to Go() and AddFive() and pValue are removed:


 
So it makes sense that the output from our code is "5", right? The point is that any value type parameters passed into a method are carbon copies and we count on the original variable's value to be preserved.
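To make the copy semantics concrete, here is a minimal sketch that keeps the value returned by AddFive() instead of discarding it. The class name ValueCopyDemo and the variable y are illustrative names that are not part of the original example, and the methods are made static only so the sketch can run on its own.

```csharp
using System;

public static class ValueCopyDemo
{
    // Receives a copy of the caller's int; the change stays local to this call.
    public static int AddFive(int pValue)
    {
        pValue += 5;
        return pValue;
    }

    public static void Main()
    {
        int x = 5;
        int y = AddFive(x);    // keep the result this time

        Console.WriteLine(x);  // 5: the original variable is untouched
        Console.WriteLine(y);  // 10: the modified copy, handed back by value
    }
}
```

If the caller wants the new value, it has to come back through the return value (or through a ref parameter, as shown later in the article).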

One thing to keep in mind is that if we have a very large value type (such as a big struct) and pass it to the stack, it can get very expensive in terms of space and processor cycles to copy it over each time. The stack does not have infinite space and just like filling a glass of water from the tap, it can overflow. A struct is a value type that can get pretty big and we have to be aware of how we are handling it.

Here's a pretty big struct:

          public struct MyStruct
          {
               // public so that code outside the struct can read and set the fields
               public long a, b, c, d, e, f, g, h, i, j, k, l, m;
          }

Take a look at what happens when we execute Go() and get to the DoSomething() method below:

          public void Go()
          {
              MyStruct x = new MyStruct();
              DoSomething(x);   // the whole struct is copied for this call
          }

          public void DoSomething(MyStruct pValue)
          {
              // DO SOMETHING HERE....
          }

This can be really inefficient. Imagine passing MyStruct a couple of thousand times and you can see how it could really bog things down.
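As a rough way to see what each by-value call has to copy, the sketch below asks the runtime for the size of the struct. It assumes the 13-field MyStruct from above with its fields made public; StructSizeDemo and BytesPerCopy are hypothetical names. Marshal.SizeOf reports the marshaled size, which for a struct made up only of long fields should match 13 x 8 bytes.

```csharp
using System;
using System.Runtime.InteropServices;

public struct MyStruct
{
    public long a, b, c, d, e, f, g, h, i, j, k, l, m;
}

public static class StructSizeDemo
{
    // Size in bytes of one MyStruct, i.e. what a by-value call copies.
    public static int BytesPerCopy()
    {
        return Marshal.SizeOf<MyStruct>();
    }

    public static void Main()
    {
        Console.WriteLine(BytesPerCopy());  // 104 (13 longs x 8 bytes)
    }
}
```

A reference to the same struct is only pointer sized (4 or 8 bytes depending on the process), which is the saving the next example is after.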

So how do we get around this problem? By passing a reference to the original value type as follows: 

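          public void Go()
          {
              MyStruct x = new MyStruct();
              DoSomething(ref x);   // only a reference to x is passed
          }

          public struct MyStruct
          {
               // public so that code outside the struct can read and set the fields
               public long a, b, c, d, e, f, g, h, i, j, k, l, m;
          }

          public void DoSomething(ref MyStruct pValue)
          {
              // DO SOMETHING HERE....
          }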

This way only a reference to x is placed on the stack instead of a full copy of the struct, so the call is cheaper in both memory and processor cycles.


 
The one thing to watch out for when passing a value type by reference is that the method now has access to the original value. Whatever is changed through pValue is changed in x. With the code below, the output is "12345" because pValue.a refers to the memory where our original x variable was declared.

          public void Go()
          {
              MyStruct x = new MyStruct();
              x.a = 5;
              DoSomething(ref x);
              Console.WriteLine(x.a.ToString());   // prints 12345
          }

          public void DoSomething(ref MyStruct pValue)
          {
              pValue.a = 12345;   // writes straight into the caller's x
          }
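If the goal is only to avoid the copy and the method should not be able to change the caller's struct, newer compilers offer a read-only variant of ref. The sketch below assumes C# 7.2 or later, where the in parameter modifier is available; the article itself predates that feature and does not use it. ReadOnlyRefDemo and ReadA are hypothetical names.

```csharp
using System;

public struct MyStruct
{
    public long a, b, c, d, e, f, g, h, i, j, k, l, m;
}

public static class ReadOnlyRefDemo
{
    // 'in' passes a reference like 'ref', but the method may only read through it.
    public static long ReadA(in MyStruct pValue)
    {
        // pValue.a = 12345;   // would not compile: pValue is read-only here
        return pValue.a;
    }

    public static void Main()
    {
        MyStruct x = new MyStruct();
        x.a = 5;

        long seen = ReadA(in x);

        Console.WriteLine(seen);  // 5
        Console.WriteLine(x.a);   // 5: the callee could not modify x
    }
}
```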

Passing Reference Types.

Passing parameters that are reference types is similar to passing value types by reference as in the previous example.

If we are using the reference type

          public class MyInt
          {
              public int MyValue;
          }

And call the Go() method, the MyInt ends up on the heap because it is a reference type:

          public void Go()
          {
              MyInt x = new MyInt();   // x lives on the stack, the MyInt object on the heap
          }

 

If we execute Go() as in the following code ...

          public void Go()
          {
              MyInt x = new MyInt();
              x.MyValue = 2;
              DoSomething(x);
              Console.WriteLine(x.MyValue.ToString());   // prints 12345
          }

          public void DoSomething(MyInt pValue)
          {
              pValue.MyValue = 12345;   // pValue and x point at the same object
          }

Here's what happens...

 

  1. The method Go() goes on the stack
  2. The variable x in the Go() method goes on the stack
  3. DoSomething() goes on the stack
  4. The parameter pValue goes on the stack
  5. The value of x (the address of MyInt on the heap) is copied to pValue

So it makes sense that when we change the MyValue property of the MyInt object in the heap using pValue and we later refer to the object on the heap using x, we get the value "12345".
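Because pValue is only a copy of the reference, changing the object it points to is visible to the caller, while pointing pValue at a different object is not. A minimal sketch of that distinction follows; ReferenceCopyDemo, Mutate and Reassign are hypothetical names, and MyInt is the class from above.

```csharp
using System;

public class MyInt
{
    public int MyValue;
}

public static class ReferenceCopyDemo
{
    // Changes the heap object that both x and pValue point at.
    public static void Mutate(MyInt pValue)
    {
        pValue.MyValue = 12345;
    }

    // Re-points only the local copy of the reference; the caller's x is unaffected.
    public static void Reassign(MyInt pValue)
    {
        pValue = new MyInt();
        pValue.MyValue = 99;
    }

    public static void Main()
    {
        MyInt x = new MyInt();
        x.MyValue = 2;

        Reassign(x);
        Console.WriteLine(x.MyValue);  // 2: x still points at the original object

        Mutate(x);
        Console.WriteLine(x.MyValue);  // 12345: the shared object was changed
    }
}
```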

So here's where it gets interesting. What happens when we pass a reference type by reference?

Check it out. If we have a Thing class, and Animal and Vegetable are both Things:

          public class Thing
          {
          }

          public class Animal : Thing
          {
              public int Weight;
          }

          public class Vegetable : Thing
          {
              public int Length;
          }

And we execute the Go() method below:

          public void Go()
          {
              Thing x = new Animal();

              Switcharoo(ref x);

              Console.WriteLine(
                  "x is Animal    :   "
                  + (x is Animal).ToString());
              Console.WriteLine(
                  "x is Vegetable :   "
                  + (x is Vegetable).ToString());
          }

          public void Switcharoo(ref Thing pValue)
          {
              pValue = new Vegetable();   // replaces the object that x refers to
          }

Our variable x is turned into a Vegetable.

x is Animal    :   False
x is Vegetable :   True

Let's take a look at what's happening:

 

  1. The Go() method goes on the stack
  2. The x pointer goes on the stack
  3. The Animal goes on the heap
  4. The Switcharoo() method goes on the stack
  5. The pValue goes on the stack and points to x


  6. The Vegetable goes on the heap
  7. The value of x is changed through pValue to the address of the Vegetable

If we don't pass the Thing by ref, we'll keep the Animal and get the opposite results from our code.
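The two cases can be put side by side. This is a minimal sketch in which SwitchByValue and SwitchByRef are hypothetical names standing in for Switcharoo() without and with ref; the Thing, Animal and Vegetable classes are the ones defined above.

```csharp
using System;

public class Thing { }
public class Animal : Thing { public int Weight; }
public class Vegetable : Thing { public int Length; }

public static class SwitcharooDemo
{
    // pValue is a copy of the reference: the assignment is lost on return.
    public static void SwitchByValue(Thing pValue)
    {
        pValue = new Vegetable();
    }

    // pValue is an alias for the caller's variable: the assignment sticks.
    public static void SwitchByRef(ref Thing pValue)
    {
        pValue = new Vegetable();
    }

    public static void Main()
    {
        Thing x = new Animal();

        SwitchByValue(x);
        Console.WriteLine(x is Animal);     // True: x still refers to the Animal

        SwitchByRef(ref x);
        Console.WriteLine(x is Vegetable);  // True: x now refers to the Vegetable
    }
}
```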

If the above code doesn't make sense, check out my article on types of Reference variables to get a better understanding of how variables work with reference types.

In Conclusion.

We've looked at how parameter passing is handled in memory and now know what to look out for. In the next part of this series, we'll take a look at what happens to reference variables that live in the stack and how to overcome some of the issues we'll have when copying objects.

For now.

-Happy Coding
Chinese translation:

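Although with the .NET Framework we do not need to worry about memory management and garbage collection (GC), we should still understand them in order to optimize our applications. A basic knowledge of how memory management works also helps explain the behavior of the variables we use in everyday programming. In this article I will cover the parameter-passing behaviors we need to be aware of.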

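In Part I, I introduced the basic functions of the stack and the heap, how value types and reference types are allocated as a program executes, and what a pointer is.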


* Parameters, the Big Picture

Here is a detailed look at what happens as our code executes. We will go deeper into the method call process introduced in Part I...

When we call a method, the following happens:

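1.Space is first allocated on the stack for the method, and the method is copied from the object instance onto the stack (this region is called a frame). The space holds only the instructions needed to execute the method and none of its data items.
2.The calling address (a pointer) is placed on the stack. This is basically a GOTO instruction, so that when the method finishes we know where to go back to and continue execution. (This is nice to know but not essential, because it does not affect how we code.)
3.Space is allocated for the method parameters and they are copied over. This is the step we need to look at more closely.
4.Control is passed to the base of the frame and the thread starts executing our code. Hence we have another method on the "call stack".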

The sample code:
public int AddFive(int pValue)
{
     int result;
     result = pValue + 5;
     return result;
}
The stack then looks like this:


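As discussed in Part I, how a parameter is placed on the stack depends on whether it is a value type or a reference type. For a value type the value is copied onto the stack; for a reference type the reference (a pointer) is copied onto the stack.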


* Passing Value Types

First, when we pass a value type parameter, new space is allocated on the stack and the parameter's value is copied into it.

Look at the following method:
class Class1
{
     public void Go()
     {
         int x = 5;
         AddFive(x);
         Console.WriteLine(x.ToString());
     }

     public int AddFive(int pValue)
     {
         pValue += 5;
         return pValue;
     }
}
The Go() method is placed on the stack and, as it executes, the integer variable "x" with the value 5 is placed at the top of the stack.


Next, AddFive() is placed on top of the stack along with space for its parameter, and the value of "x" is copied into that parameter.


When AddFive() finishes, the thread uses the previously stored calling address to go back to Go(), and pValue and AddFive() are removed from the top of the stack in turn:


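So the output of our code is "5", right? The key point is that any value type parameter passed into a method is a copy, so the value in the original variable is preserved unchanged.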

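Keep in mind that if we push a very large value type (such as a big struct) onto the stack, it takes up a lot of space and costs many processor cycles to copy. The stack does not have infinite space; like a glass being filled from the tap, it can overflow. A struct is a value type that can hold a lot of data, so we have to use it with care.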

Here is a fairly big struct:
public struct MyStruct
{
     public long a, b, c, d, e, f, g, h, i, j, k, l, m;
}
Let's see what happens when we execute Go() and reach the DoSomething() method:
public void Go()
{
     MyStruct x = new MyStruct();
     DoSomething(x);
}

public void DoSomething(MyStruct pValue)
{
     // DO SOMETHING HERE....
}


This can be very inefficient. Imagine passing MyStruct a couple of thousand times and you can see how it could bog the program down.

So how do we solve this problem? By passing a reference to the original value, as follows:
public void Go()
{
     MyStruct x = new MyStruct();
     DoSomething(ref x);
}
public struct MyStruct
{
     public long a, b, c, d, e, f, g, h, i, j, k, l, m;
}
public void DoSomething(ref MyStruct pValue)
{
     // DO SOMETHING HERE....
}
This way we allocate our objects in memory more efficiently.


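The one thing to watch out for is that when we pass a value type by reference, the method can modify its value; that is, a change to pValue is a change to x. With the code below, the result is "12345", because pValue refers to the same memory where the variable x was declared.
public void Go()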
{
     MyStruct x = new MyStruct();
     x.a = 5;
     DoSomething(ref x);
     Console.WriteLine(x.a.ToString());
}
public void DoSomething(ref MyStruct pValue)
{
     pValue.a = 12345;
}

* Passing Reference Types

Passing a reference type parameter is similar to passing a value type by reference, as in the previous example.

If we use the reference type:
public class MyInt
{
     public int MyValue;
}
and then call the Go() method, the MyInt object ends up on the heap:
public void Go()
{
     MyInt x = new MyInt();
}


If we execute the following Go() method:
public void Go()
{
     MyInt x = new MyInt();
     x.MyValue = 2;
     DoSomething(x);
     Console.WriteLine(x.MyValue.ToString());
}
public void DoSomething(MyInt pValue)
{
     pValue.MyValue = 12345;
}
here is what happens...


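1.The method Go() goes on the stack
2.The variable x in the Go() method goes on the stack
3.The method DoSomething() goes on the stack
4.The parameter pValue goes on the stack
5.The value of x (the address of the MyInt object on the heap) is copied to pValue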

So when we change the MyValue member of the MyInt object on the heap through pValue, and then read MyValue through x, the other reference to the same object, we get the value "12345".

Things get more interesting when we pass a reference type by reference. What happens then?

Let's check it out. Suppose we have a Thing class and two classes, Animal and Vegetable, that inherit from Thing:
public class Thing
{
}
public class Animal : Thing
{
     public int Weight;
}
public class Vegetable : Thing
{
     public int Length;
}
and then execute the Go() method below:
public void Go()
{
     Thing x = new Animal();
     Switcharoo(ref x);
     Console.WriteLine(
       "x is Animal     :    "
       + (x is Animal).ToString());
     Console.WriteLine(
         "x is Vegetable :    "
         + (x is Vegetable).ToString());
}
public void Switcharoo(ref Thing pValue)
{
     pValue = new Vegetable();
}
The variable x comes back as a Vegetable.
x is Animal     :    False
x is Vegetable :    True
Let's look at what is happening:


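1.The Go() method goes on the stack
2.The x pointer goes on the stack
3.The Animal object is created on the heap
4.The Switcharoo() method goes on the stack
5.pValue goes on the stack and points to x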


6.The Vegetable object is created on the heap
7.The value of x is changed through pValue to the address of the Vegetable object.


If we do not pass the Thing by ref, we keep the Animal and get the opposite results.

If the code above does not make sense, take a look at my article on reference variables for a better understanding of how variables of reference types work.

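We have looked at how parameter passing is handled in memory. In the next part of this series, we will look at what happens to reference variables that live on the stack and how to overcome some of the issues we run into when copying objects.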

posted on 2007-11-09 13:18 by hzwang
