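.Text (博客园 / cnblogs) URL rewriting: a detailed analysis

Before I wrote this post, an expert had already published an analysis of how .Text implements URL rewriting (the original post linked to it here). The problem is that, as a beginner, I came away feeling it had not got to the heart of the matter. Hence the urge to write this piece.

Let's start from the beginning. .Text implements URL rewriting through httpModules and httpHandlers. Here is the relevant configuration in web.config: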
 1        <httpHandlers>
 2            <!-- Can not see to load asmx like .aspx, since we will grap all requests later, make sure these are processed by their default factory -->
 3            <add verb="*" path="*.asmx" type="System.Web.Services.Protocols.WebServiceHandlerFactory, System.Web.Services, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"
 4                validate="false" />
 5            <!--Since we are grabbing all requests after this, make sure Error.aspx does not rely on .Text -->
 6            <add verb="*" path="Error.aspx" type="System.Web.UI.PageHandlerFactory" />
 7            <!--This will process any ext mapped to aspnet_isapi.dll -->
 8            <add verb="*" path="*" type="Dottext.Common.UrlManager.UrlReWriteHandlerFactory,Dottext.Common" />
 9        </httpHandlers>
10        <httpModules>
11            <add name="UrlReWriteModule" type="Dottext.Common.UrlManager.UrlReWriteModule, Dottext.Common" />
12            <add name="EventHttpModule" type="Dottext.Framework.ScheduledEvents.EventHttpModule, Dottext.Framework" />
13            <!--<add name="MsftBlogsHttpModule" type= "AspNetWeb.MsftBlogsHttpModule, MsftBlogsHttpModule" />-->
14        </httpModules>

Look at line 11: it registers Dottext.Common.UrlManager.UrlReWriteModule as an httpModule. Judging by its name, this is clearly the module that implements URL rewriting. Here is its source:
    static UrlReWriteModule()
    {
        // Matches an .aspx file that sits directly in the application root
        regexPath = new Regex(@"^/?(\w|-|_)+\.aspx$", RegexOptions.IgnoreCase | RegexOptions.Compiled);
        // The application path itself is used as the pattern
        regexApplication = new Regex(HttpContext.Current.Request.ApplicationPath, RegexOptions.IgnoreCase | RegexOptions.Compiled);
    }

    private void context_BeginRequest(object sender, EventArgs e)
    {
        if (ConfigProvider.Instance().IsAggregateSite)
        {
            HttpContext context = ((HttpApplication)sender).Context;

            string path = context.Request.Path.ToLower();
            int iExtraStuff = path.IndexOf(".aspx");
            // Only .aspx requests and requests without any extension are examined here
            if (iExtraStuff > -1 || path.IndexOf(".") == -1)
            {
                if (iExtraStuff > -1)
                {
                    // Drop anything that follows ".aspx"
                    path = path.Remove(iExtraStuff + 5, path.Length - (iExtraStuff + 5));
                }

                // Strip the first occurrence of the application path
                path = regexApplication.Replace(path, string.Empty, 1, 0);

                if (path == "" || path == "/" || regexPath.IsMatch(path))
                {
                    UrlHelper.SetEnableUrlReWriting(context, false);
                }
            }
            else if (context.Request.Path.ToLower().IndexOf("services") > 0 && context.Request.Path.ToLower().IndexOf(".asmx") > 0)
            {
                if (AlllowService(context))
                {
                    if (context.Request.RequestType != "POST")
                    {
                        // Map /<blog>/services/... onto the shared /services/ folder
                        string regexstr = @"/\w+/services/";
                        string url = Regex.Replace(context.Request.RawUrl, regexstr, "/services/", RegexOptions.IgnoreCase);
                        context.RewritePath(url);
                    }
                    //string fileName =context.Request; //System.IO.Path.GetFileName(context.Request.Path);
                    //context.RewritePath("~/Services/" + fileName);
                }
                else
                {
                    context.Response.Clear();
                    context.Response.End();
                }
            }
        }
    }

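The code is a little long, but its structure is clear. On an aggregate site (IsAggregateSite), every requested URL gets its first pass in context_BeginRequest. The function normalizes the path and then checks whether the request is for an .aspx file in the application root; if it is, URL rewriting is switched off for that request (through UrlHelper.SetEnableUrlReWriting). What, you can't read regexPath = new Regex(@"^/?(\w|-|_)+\.aspx$", ...)? Then I suggest brushing up on regular expressions. The pattern tests whether the path names a file that sits directly in the root directory. For example,
/contact.aspx
default.aspx
both match, while
/oop80/default.aspx
does not.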
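The first-pass check can be tried outside ASP.NET with a small sketch. The pattern and the normalization steps are taken from context_BeginRequest; the class and method names (FirstPassFilterSketch, SkipsRewriting, RewriteServiceUrl) are made up for illustration, applicationPath stands in for Request.ApplicationPath, and the application path is escaped with Regex.Escape here, which the original code does not do.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-alone sketch of the checks made in context_BeginRequest.
public static class FirstPassFilterSketch
{
    // Same pattern as regexPath in UrlReWriteModule.
    private static readonly Regex RootPage =
        new Regex(@"^/?(\w|-|_)+\.aspx$", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns true when the module would call SetEnableUrlReWriting(context, false).
    public static bool SkipsRewriting(string requestPath, string applicationPath)
    {
        string path = requestPath.ToLowerInvariant();
        int aspx = path.IndexOf(".aspx", StringComparison.Ordinal);

        // Requests with some other extension are not examined by this branch.
        if (aspx == -1 && path.IndexOf('.') != -1)
            return false;

        // Drop anything that follows ".aspx".
        if (aspx > -1)
            path = path.Substring(0, aspx + 5);

        // Remove the first occurrence of the application path.
        // Assumption: escaped here; the original passes ApplicationPath as a raw pattern.
        Regex app = new Regex(Regex.Escape(applicationPath), RegexOptions.IgnoreCase);
        path = app.Replace(path, string.Empty, 1, 0);

        return path == "" || path == "/" || RootPage.IsMatch(path);
    }

    // Mirrors the .asmx branch: /<blog>/services/... becomes /services/...
    public static string RewriteServiceUrl(string rawUrl)
    {
        return Regex.Replace(rawUrl, @"/\w+/services/", "/services/", RegexOptions.IgnoreCase);
    }
}
```

With an application path of "/", SkipsRewriting("/contact.aspx", "/") is true and SkipsRewriting("/oop80/default.aspx", "/") is false, which agrees with the examples above; RewriteServiceUrl("/oop80/services/foo.asmx") gives "/services/foo.asmx" (foo.asmx is an invented file name).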
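In short, context_BeginRequest marks requests for .aspx files in the root directory as not needing URL rewriting and deals with web service (.asmx) requests separately; everything else goes on to the next step. Where does that step happen? Look at web.config again: line 8 tells us plainly that Dottext.Common.UrlManager.UrlReWriteHandlerFactory analyzes and handles the request further. Here is its source: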

public class UrlReWriteHandlerFactory : IHttpHandlerFactory
{
    public UrlReWriteHandlerFactory() {} // Nothing to do in the constructor

    protected virtual HttpHandler[] GetHttpHandlers(HttpContext context)
    {
        return HandlerConfiguration.Instance().HttpHandlers;
    }

    /// <summary>
    /// Implementation of IHttpHandlerFactory. By default, it will load an array of HttpHandler (Dottext.UrlManager.HttpHandler) from
    /// the blog.config. This can be changed by overriding the GetHttpHandlers(HttpContext context) method.
    /// </summary>
    /// <param name="context">Current HttpContext</param>
    /// <param name="requestType">Request Type (Passed along to other IHttpHandlerFactory's)</param>
    /// <param name="url">The current requested url. (Passed along to other IHttpHandlerFactory's)</param>
    /// <param name="path">The physical path of the current request. Is not guaranteed to exist (Passed along to other IHttpHandlerFactory's)</param>
    /// <returns>
    /// Returns an instance of IHttpHandler either by loading an instance of IHttpHandler or by returning another
    /// IHttpHandlerFactory.GetHandler(HttpContext context, string requestType, string url, string path) method
    /// </returns>
    public virtual IHttpHandler GetHandler(HttpContext context, string requestType, string url, string path)
    {
        //Get the handlers to process. By default, we grab them from the blog.config
        //Dottext.Framework.Logger.LogManager.Log("url",context.Request.Path);
        HttpHandler[] items = GetHttpHandlers(context);
        //Dottext.Framework.Logger.LogManager.Log("path",Dottext.Framework.Util.Globals.RemoveAppFromPath(context.Request.Path,context.Request.ApplicationPath));
        //Do we have any?
        if (items != null)
        {
            int count = items.Length;

            for (int i = 0; i < count; i++)
            {
                //We should use our own cached Regex. This should limit the number of Regex's created
                //and allows us to take advantage of RegexOptions.Compiled

                if (items[i].IsMatch(Dottext.Framework.Util.Globals.RemoveAppFromPath(context.Request.Path, context.Request.ApplicationPath)))
                {
                    //throw new Exception();
                    switch (items[i].HandlerType)
                    {
                        case HandlerType.Page: // Page is the default
                            return ProccessHandlerTypePage(items[i], context, requestType, url);
                        case HandlerType.Direct:
                            HandlerConfiguration.SetControls(context, items[i].BlogControls);
                            return (IHttpHandler)items[i].Instance();
                        case HandlerType.Factory:
                            //Pass along the request to a custom IHttpHandlerFactory
                            return ((IHttpHandlerFactory)items[i].Instance()).GetHandler(context, requestType, url, path);
                        default:
                            throw new Exception("Invalid HandlerType: Unknown");
                    }
                }
            }
        }

        //If we do not find the page, just let ASP.NET take over
        return PageHandlerFactory.GetHandler(context, requestType, url, path);
    }

    private IHttpHandler ProccessHandlerTypePage(HttpHandler item, HttpContext context, string requestType, string url)
    {
        string pagepath = item.FullPageLocation;
        if (pagepath == null)
        {
            // Fall back to the default page location from the configuration
            pagepath = HandlerConfiguration.Instance().FullPageLocation;
        }

        HandlerConfiguration.SetControls(context, item.BlogControls);
        IHttpHandler myhandler = PageParser.GetCompiledPageInstance(url, pagepath, context);
        return myhandler;
    }

    public virtual void ReleaseHandler(IHttpHandler handler)
    {
    }
}

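The code is long, but it is the key to the URL rewriting, so let's work through it.
UrlReWriteHandlerFactory implements the IHttpHandlerFactory interface, and the most important member of IHttpHandlerFactory is GetHandler, so that is where we start.
First comes HttpHandler[] items = GetHttpHandlers(context);. The definition of GetHttpHandlers shows that it returns the HttpHandlers property of HandlerConfiguration, and that object is initialized by this statement: return ((HandlerConfiguration)ConfigurationSettings.GetConfig("HandlerConfiguration")); So, as you have probably worked out by now, the HandlerConfiguration object is obtained by deserializing the <HandlerConfiguration> section of web.config. Plenty has already been written online about ConfigurationSettings.GetConfig, so I won't repeat it here. Here is the HandlerConfiguration section of web.config: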
    <HandlerConfiguration defaultPageLocation="default.aspx" type="Dottext.Common.UrlManager.HandlerConfiguration, Dottext.Common">
        <HttpHandlers>
            <HttpHandler pattern="(\.config|\.asax|\.ascx|\.config|\.cs|\.vb|\.vbproj|\.asp|\.licx|\.resx|\.resources)$"    type="Dottext.Framework.UrlManager.HttpForbiddenHandler, Dottext.Framework" handlerType="Direct" />
            <HttpHandler pattern="(\.gif|\.js|\.jpg|\.zip|\.jpeg|\.jpe|\.css|\.rar|\.xml|\.xsl)$" type="Dottext.Common.UrlManager.BlogStaticFileHandler, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern="/rss\.aspx$" type="Dottext.Common.Syndication.RssHandler, Dottext.Common"    handlerType="Direct" />
            <HttpHandler pattern="/CommentsRSS\.aspx$" type="Dottext.Common.Syndication.RecentCommentsRSS, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern = "/RecentCommentsRSS\.aspx$" type = "Dottext.Common.Syndication.RecentCommentsRSS, Dottext.Common" handlerType = "Direct" />
            <HttpHandler pattern="/atom\.aspx$" type="Dottext.Common.Syndication.AtomHandler, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern="/comments/commentRss/\d+\.aspx$" type="Dottext.Common.Syndication.RssCommentHandler, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern="/aggbug/\d+\.aspx$" type="Dottext.Framework.Tracking.AggBugHandler, Dottext.Framework" handlerType="Direct" />
            <HttpHandler pattern="/customcss\.aspx$" type="Dottext.Web.UI.Handlers.BlogSecondaryCssHandler, Dottext.Web" handlerType="Direct" />
            <HttpHandler pattern="/category\/(\d|\w|\s)+\.aspx/rss$" type="Dottext.Common.Syndication.RssCategoryHandler, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern="/favorite\/(\d|\w|\s)+\.aspx/rss$" type="Dottext.Common.Syndication.RssLinksHandler, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern="/articles/\d+\.aspx$" controls="viewpost.ascx,Comments.ascx,AnonymousPostComment.ascx,LoginPostComment.ascx" />
            <HttpHandler pattern="/articles/\w+\.aspx$" controls="viewpost.ascx,Comments.ascx,AnonymousPostComment.ascx,LoginPostComment.ascx" />
            <HttpHandler pattern="/PreviewPost.aspx$" controls="PreviewPost.ascx" />
            <HttpHandler pattern="/archive/\d{4}/\d{2}/\d{2}/\d+\.(aspx|htm)$" controls="viewpost.ascx,Comments.ascx,AnonymousPostComment.ascx,LoginPostComment.ascx" />
            <HttpHandler pattern="/archive/\d{4}/\d{2}/\d{2}/\w+\.(aspx|htm)$" controls="viewpost.ascx,Comments.ascx,AnonymousPostComment.ascx,LoginPostComment.ascx" />
            <HttpHandler pattern="/archive/\d{4}/\d{1,2}/\d{1,2}\.aspx$" controls="ArchiveDay.ascx" />
            <HttpHandler pattern="/archive/\d{4}/\d{1,2}\.aspx$" controls="ArchiveMonth.ascx" />
            <HttpHandler pattern="/archives/\d{4}/\d{1,2}\.aspx$" controls="ArticleArchiveMonth.ascx" />
            <HttpHandler pattern="/contact\.aspx$" controls="Contact.ascx" />
            <HttpHandler pattern="/AddToFavorite\.aspx$" handlerType="Page" pageLocation="AddToFavorite.aspx" />
            <HttpHandler pattern="/BlogSearch\.aspx$" controls="BlogSearch.ascx" />
            <HttpHandler pattern="/posts/|/story/|/archive/" type="Dottext.Web.UI.Handlers.RedirectHandler,Dottext.Web"    handlerType="Direct" />
            <HttpHandler pattern="/gallery\/\d+\.aspx$" controls="GalleryThumbNailViewer.ascx" />
            <HttpHandler pattern="/gallery\/image\/\d+\.aspx$" controls="ViewPicture.ascx" />
            <HttpHandler pattern="/(?:category|stories)/(\w|\s)+\.aspx$" controls="CategoryEntryList.ascx" />
            <HttpHandler pattern="/favorite/(\w|\s)+\.aspx$" controls="FavoriteList.ascx" />
            <HttpHandler pattern="/(?:admin)" type="Dottext.Web.UI.Handlers.BlogExistingPageHandler, Dottext.Web" handlerType="Factory" />
            <!--<HttpHandler pattern = "^(?:\/(\w|\s|\.(?!aspx))+((\/login\.aspx)?|(\/?))?)$" type = "Dottext.Web.UI.Handlers.BlogExistingPageHandler, Dottext.Web" handlerType = "Factory" />-->
            <!--<HttpHandler pattern = "^(?:/\w+\/(\w|\s|\.)+\/(?:admin|logout\.aspx|login\.aspx))" type = "Dottext.Web.UI.Handlers.BlogExistingPageHandler, Dottext.Web" handlerType = "Factory" />-->
            <HttpHandler pattern="/comments\/\d+\.aspx$" type="Dottext.Common.Syndication.CommentHandler, Dottext.Common" handlerType="Direct" />
            <HttpHandler pattern="/services\/trackbacks/\d+\.aspx$" type="Dottext.Framework.Tracking.TrackBackHandler, Dottext.Framework" handlerType="Direct" />
            <HttpHandler pattern="/services\/pingback\.aspx$" type="Dottext.Framework.Tracking.PingBackService, Dottext.Framework"    handlerType="Direct" />
            <HttpHandler pattern="/services\/metablogapi\.aspx$" type="Dottext.Framework.XmlRpc.MetaWeblog, Dottext.Framework"    handlerType="Direct" />
            <HttpHandler pattern="/Services\/SyncHanlder\.aspx$" handlerType="Page" pageLocation="Sevices/DottextAPI.aspx" />
            <!-- Show only the list of post titles -->
            <!--<HttpHandler pattern="^((/default\.aspx)|(/)|(/index\.aspx))$" controls="CategoryPostsList.ascx" />-->
            <!-- Show the post content -->
            <HttpHandler pattern="^((/default\.aspx)|(/)|(/index\.aspx))$" controls="PagedPosts.ascx" />
            <HttpHandler pattern="^(/posts\.aspx)$" controls="PagedPosts.ascx" />
            <!--<HttpHandler pattern = "/services/\w+.asmx$" type="System.Web.Services.Protocols.WebServiceHandlerFactory, System.Web.Services, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"  handlerType = "Factory" />-->
            <HttpHandler pattern = "^(?:\/(\w|\s|\.(?!aspx))+((\/default\.aspx)?|(\/?))?)$"  controls = "homepage.ascx"/>
            <!--<HttpHandler pattern="^(?:/\w+\/(\w|\s|\.(?!aspx))+((\/default\.aspx)?|(\/?))?)$" controls="homepage.ascx" />-->
        </HttpHandlers>
    </HandlerConfiguration>

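This section is actually easier to read in Visual Studio .NET. Each pattern here ends up as a corresponding Regex that is matched against the requested URL; the loop in GetHandler, discussed next, makes this clear.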
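As a rough illustration of "deserialize the section, then build one Regex per pattern", here is a sketch that reads a trimmed copy of the section with XmlSerializer. It is not the .Text source: the class names (HandlerEntry, HandlerConfigSketch, HandlerKind), the sample XML, the use of XmlSerializer and the IgnoreCase option are all assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Serialization;

// Hypothetical counterpart of the handlerType attribute values.
public enum HandlerKind { Page, Direct, Factory }

// Hypothetical counterpart of one <HttpHandler .../> entry.
public class HandlerEntry
{
    [XmlAttribute("pattern")] public string Pattern;
    [XmlAttribute("controls")] public string Controls;
    // Keeps its initial value (Page) when the attribute is absent.
    [XmlAttribute("handlerType")] public HandlerKind Kind = HandlerKind.Page;

    private Regex compiled; // built on first use, then reused

    public bool IsMatch(string path)
    {
        if (compiled == null)
        {
            // Assumption: case-insensitive matching.
            compiled = new Regex(Pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);
        }
        return compiled.IsMatch(path);
    }
}

// Hypothetical counterpart of the <HandlerConfiguration> section.
[XmlRoot("HandlerConfiguration")]
public class HandlerConfigSketch
{
    // Trimmed sample with two entries copied from the section above.
    public const string SampleXml = @"<HandlerConfiguration defaultPageLocation=""default.aspx"">
  <HttpHandlers>
    <HttpHandler pattern=""/rss\.aspx$"" handlerType=""Direct"" />
    <HttpHandler pattern=""/contact\.aspx$"" controls=""Contact.ascx"" />
  </HttpHandlers>
</HandlerConfiguration>";

    [XmlAttribute("defaultPageLocation")] public string DefaultPageLocation;

    [XmlArray("HttpHandlers"), XmlArrayItem("HttpHandler")]
    public HandlerEntry[] Handlers;

    // Turns the XML text into an object graph.
    public static HandlerConfigSketch Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(HandlerConfigSketch));
        using (StringReader reader = new StringReader(xml))
        {
            return (HandlerConfigSketch)serializer.Deserialize(reader);
        }
    }
}
```

Calling HandlerConfigSketch.Load(HandlerConfigSketch.SampleXml) yields two entries; the second has Kind == HandlerKind.Page because it omits handlerType, and its IsMatch("/oop80/contact.aspx") returns true. Caching the compiled Regex on the entry is in the spirit of the "use our own cached Regex" comment in GetHandler.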

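Next comes a loop, which runs this check:
if (items[i].IsMatch(Dottext.Framework.Util.Globals.RemoveAppFromPath(context.Request.Path,context.Request.ApplicationPath))) {...}
It tests each HttpHandler entry from web.config against the request in turn. IsMatch builds a Regex from the entry's pattern and calls Regex.IsMatch; the first entry whose pattern matches decides which handler processes the request. This is clearly the crux of the URL rewriting. For example, take this URL:
http://oop80.cnblogs.com/archive/2005/08/09/210550.html
It is meant to match this handler:
<HttpHandler pattern="/archive/\d{4}/\d{2}/\d{2}/\d+\.(aspx|htm)$" controls="viewpost.ascx,Comments.ascx,AnonymousPostComment.ascx,LoginPostComment.ascx" />
so the program uses viewpost.ascx, Comments.ascx, AnonymousPostComment.ascx and LoginPostComment.ascx to handle the request. (Strictly speaking, the pattern as listed accepts only paths ending in .aspx or .htm: the same address ending in .aspx matches it as written, while a path ending in .html falls through to the later /posts/|/story/|/archive/ entry unless the deployed pattern also allows html.)
As another example, this URL
http://oop80.cnblogs.com/category/23435.aspx
matches
<HttpHandler pattern="/(?:category|stories)/(\w|\s)+\.aspx$" controls="CategoryEntryList.ascx" />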
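The first-match behaviour of the loop can be sketched without ASP.NET. The four patterns below are copied from the HandlerConfiguration section in the same relative order; the class name FirstMatchSketch, the Resolve method and the plain strings used as results are invented for the example, and case-insensitive matching is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sketch of "walk the entries in order, first match wins".
public static class FirstMatchSketch
{
    // { pattern, what the entry points at }, in configuration order.
    private static readonly string[][] Rules =
    {
        new[] { @"/archive/\d{4}/\d{2}/\d{2}/\d+\.(aspx|htm)$",
                "viewpost.ascx,Comments.ascx,AnonymousPostComment.ascx,LoginPostComment.ascx" },
        new[] { @"/archive/\d{4}/\d{1,2}\.aspx$", "ArchiveMonth.ascx" },
        new[] { @"/posts/|/story/|/archive/", "RedirectHandler" },
        new[] { @"/(?:category|stories)/(\w|\s)+\.aspx$", "CategoryEntryList.ascx" },
    };

    // path is the request path with the application path already removed.
    public static string Resolve(string path)
    {
        foreach (string[] rule in Rules)
        {
            // Assumption: patterns are matched case-insensitively.
            if (Regex.IsMatch(path, rule[0], RegexOptions.IgnoreCase))
                return rule[1];
        }
        // No entry matched; the real factory hands the request back to ASP.NET here.
        return null;
    }
}
```

Resolve("/archive/2005/08/09/210550.aspx") returns the viewpost control list and Resolve("/category/23435.aspx") returns "CategoryEntryList.ascx", while Resolve("/archive/2005/08/09/210550.html") returns "RedirectHandler". Because the first match wins, the order of the entries in the configuration matters: the broad /archive/ entry only sees requests that the more specific archive patterns before it did not claim.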

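Finally there is a switch statement that handles the request differently depending on the HandlerType. I won't explain it any further, because... I haven't dug any deeper into this part myself. Some details remain open, such as where the controls registered by SetControls actually get loaded and exactly how GetCompiledPageInstance is used. In the end, though, those have little to do with URL rewriting, so they are outside the scope of this post.

Comments and corrections are very welcome. Thanks.
posted @ 2005-08-09 17:00  OOP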