How to read code

Posted on 2006-01-06 21:28 by 飞鼠

Recently, I found myself in the position of having to understand a large body of (largely undocumented) .NET code in a very short amount of time. When you’re in that situation, what strategies do you use to quickly get up to speed? For me, Reflector is *the* tool for understanding a large code base quickly. I’m at the point now where I prefer Reflector to the IDE (at least when the goal is understanding as much code as possible in the shortest amount of time). Even if I have the source code available to me, I’ll generally prefer to forgo the IDE initially and start with Reflector. I like Reflector over the IDE because it seems to hit the sweet spot between hiding the information I don’t care about and still giving me easy access to a lot of detail when I need it. Maybe that will change in Whidbey, but for me right now, Reflector kicks Visual Studio’s ass when it comes to this specific use case.

The hardest part about tackling a large foreign code base is determining where to start. It’s possible to just start opening up random source files and begin reading code until you die, but I can’t ever get that approach to work. Here’s how I attack these sorts of problems:

Start at a high level. Personally, I find that the easiest way to dive into a code base is to look at the binaries first – in other words, I like to point Reflector at the compiled binaries and see what’s inside. Reflector is great because it immediately shows you the namespace structure of the assembly, which gives you an idea of the high-level organization of the code. From that, I usually pick the root-level namespace and look at what classes are defined. At this point, I’m really looking to ascertain what the important classes are, and what behaviors they implement.
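Reflector does this survey for .NET assemblies. As a rough illustration of the same first pass, here is a minimal sketch in Python (not .NET, and not Reflector) that uses reflection to list the classes a module defines and their public methods. The helper name `survey_classes` and the choice of the standard-library `json` module as the target are assumptions made purely for the example.

```python
import inspect
import json  # stand-in target; swap in the package you are actually studying


def survey_classes(module):
    """Map each class that belongs to `module`'s package to its public method names."""
    package = module.__name__
    survey = {}
    for name, cls in inspect.getmembers(module, inspect.isclass):
        # Skip classes that were merely imported from unrelated packages.
        if not cls.__module__.startswith(package):
            continue
        methods = [
            method_name
            for method_name, _ in inspect.getmembers(cls, inspect.isfunction)
            if not method_name.startswith("_")
        ]
        survey[name] = methods
    return survey


if __name__ == "__main__":
    for class_name, methods in survey_classes(json).items():
        print(class_name, "->", ", ".join(methods))
```

A listing like this is only a starting point: it shows which classes exist and roughly what they do, which is enough to decide where to look next.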

Root out the initialization story. Usually, I’m looking at a piece of code with the intent of understanding one particular aspect of its functionality. By looking at the class organization, I can usually determine which classes are related to that feature. From there, I try to piece together the initialization story – what state do objects need to get their jobs done? How does that state get initialized? If it requires external resources, how are those resources acquired? Understanding initialization is important, because it gives you a feel for the division of responsibility within the system. Reflector’s hyperlinked decompiler comes in *really* handy for this task – the decompiled code it produces is generally pretty readable, and every member, function, and type is immediately accessible with only one click. Figuring out initialization also helps you understand what parts of the system are important, what dependencies are absolutely required for the code to run, and what pieces might be able to be stubbed out. My goal is to figure out how I can execute some of this code inside of NUnit.
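To make the "what can be stubbed out" question concrete, here is a minimal sketch using Python's `unittest` as a stand-in for NUnit. Everything in it is hypothetical: `ReportService`, its two dependencies, and the stub classes are invented for illustration and do not come from any real code base. The point is only that once the constructor tells you what an object needs, you can hand it fakes and get it running inside a test.

```python
import unittest


class ReportService:
    """Hypothetical class under study; its constructor reveals what it depends on."""

    def __init__(self, connection, clock):
        self._connection = connection
        self._clock = clock

    def headline(self):
        count = self._connection.count_orders()
        return f"{count} orders as of {self._clock.today()}"


class StubConnection:
    """Stands in for a real database connection (an external resource)."""

    def count_orders(self):
        return 3


class StubClock:
    """Stands in for the system clock so the result is predictable."""

    def today(self):
        return "2006-01-06"


class ReportServiceInitializationTest(unittest.TestCase):
    def test_runs_with_stubbed_dependencies(self):
        service = ReportService(StubConnection(), StubClock())
        self.assertEqual(service.headline(), "3 orders as of 2006-01-06")
```

If the object refuses to run with stubs, that failure is informative too: it usually points at a dependency that initialization acquires somewhere less obvious than the constructor.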

Look down the call stack. Sometimes I’ll find a method and wonder “how does that work?” Look at the decompiled code and find out. Hyperlink around – look at the methods that get called. What do they do? How do they work? Explore the path of execution. Develop a solid understanding of one small piece of the code.

Look up the call stack. Now that I understand how a method gets its work done, it’s time to figure out what role it plays in the larger system. Who calls it? What do they do with the results? This is where Reflector’s “Member References” feature is invaluable. The number and types of the method’s callers are often quite informative and tell much about the larger structure of the system.
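The "who calls it?" question can be approximated without a dedicated tool. The following is a minimal sketch in Python using the standard `ast` module; it is not how Reflector's “Member References” works, and the sample source and the helper name `find_callers` are assumptions for illustration. It only catches direct calls by plain name, so method calls through an object would need extra handling.

```python
import ast

# Hypothetical source text standing in for the code base under study.
SOURCE = '''
def load_config(path):
    return {"path": path}

def start_server():
    config = load_config("server.ini")
    return config

def run_tests():
    return load_config("test.ini")

def unrelated():
    return 42
'''


def find_callers(source, target):
    """Return the names of functions in `source` whose bodies call `target` by name."""
    tree = ast.parse(source)
    callers = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.FunctionDef):
            continue
        for inner in ast.walk(node):
            if (
                isinstance(inner, ast.Call)
                and isinstance(inner.func, ast.Name)
                and inner.func.id == target
            ):
                callers.append(node.name)
                break  # one hit is enough to count this function as a caller
    return callers


print(find_callers(SOURCE, "load_config"))  # prints ['start_server', 'run_tests']
```

Even a crude caller list like this hints at structure: here one function is reached from both the server startup path and the test path.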

Start asking yourself “why did they do it that way?” Developing a mental model as to *why* code was written in a certain way is almost as important as understanding the “how” of its implementation. Look for clues that might indicate reasons for the design decisions that were made by the original implementers. This is a great way to unearth some hidden requirements.

Make some hypotheses, and use NUnit to validate them. I always end up writing a lot of NUnit tests when I’m trying to understand a codebase. Not because I’m testing the codebase, but because I’m testing my understanding of it. It’s always interesting to write an NUnit test asserting something I think should be true and then have that test fail, because it exposes a flaw in my mental model. Why did that test fail? What did I miss? What don’t I understand yet?
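Here is what such a test of understanding might look like, sketched with Python's `unittest` as a stand-in for NUnit. The function `parse_quantity` is a hypothetical placeholder for an undocumented method under study; each test records one hypothesis about how it behaves.

```python
import unittest


def parse_quantity(text):
    """Hypothetical stand-in for an undocumented method under study."""
    return int(text.strip() or 0)


class ParseQuantityUnderstandingTest(unittest.TestCase):
    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(parse_quantity(" 12 "), 12)

    def test_blank_input_means_zero(self):
        # A hypothesis that blank input raises would fail here:
        # the code falls back to 0 instead.
        self.assertEqual(parse_quantity("   "), 0)

    def test_non_numeric_input_raises(self):
        with self.assertRaises(ValueError):
            parse_quantity("twelve")
```

Tests like these double as documentation once they pass: each one pins down a behavior that was previously just a guess.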

If you have the source, set some strategic breakpoints. This is another good place to check the validity of the mental model. The Call Stack window is your friend!

I treat code reading as a combination of exploration and detective work. The key is to structure your exploration in a goal-oriented fashion -- by formulating hypotheses and then finding ways to validate them. The process can be arduous at times, but (for me at least) the sense of accomplishment I get when I feel like I’ve “tamed the beast” is quite worth it.


