Recently I have been working on a program that parses URLs and then extracts some useful information from those pages. It is like a crawler. When I used multiple threads to send web requests, I ran into a problem.
On my test site, the index page has 4 links. When I ran my program, I could not get the content of all of them. Sometimes it succeeded, but not always. When it failed, I got the correct content of the first page, and sometimes the second page as well. The error message was: timeout. The code looked like this:
private void Run()
{
    // Get 50 URLs
    List<string> urls = GetTobeURLs(50);
    foreach (string url in urls)
    {
        // Queue one thread-pool work item per URL
        ThreadPool.QueueUserWorkItem(PraseURL, url);
    }
}

private void PraseURL(object state)
{
    string url = state as string;
    HttpWebRequest request = CreateWebRequest(url);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    // The response is not closed until DoSomeThing returns
    DoSomeThing(response);
    response.Close();
}

private void DoSomeThing(HttpWebResponse response)
{
    string content = GetContent(response);
    // Upload the page content to the service
    UpLoadDataToService(content);
}
In the DoSomeThing() method, I get the page content and then call the service method to upload the data to the server. When I commented out UpLoadDataToService(), it ran fine and I got the content of all 4 pages correctly. I thought the problem might be caused by the time spent uploading the data, so I tried setting a longer WebRequest timeout, but the problem remained.
I did not know what was wrong. In the end I solved the problem with the following code, after spending a lot of time on it.
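private void PraseURL(object state)
{
    string url = state as string;
    HttpWebRequest request = CreateWebRequest(url);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    // Hand the slow work to another thread-pool thread.
    // Assumes DoSomeThing is declared to take an object parameter
    // so that it matches the WaitCallback delegate.
    ThreadPool.QueueUserWorkItem(DoSomeThing, response);
    // Reached right away, without waiting for DoSomeThing to finish
    response.Close();
}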
I did not understand the difference between these two versions, so I googled it, and at last I got it. The reason is this rule from the HTTP/1.1 specification (RFC 2616): a single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.
This means the default connection limit is 2 per server. We can solve the problem by raising it:
ServicePointManager.CheckCertificateRevocationList = true;
ServicePointManager.DefaultConnectionLimit = 100;
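As a rough sketch of where such settings could live, assuming the classic .NET Framework HttpWebRequest stack: the limit is best set once at startup, before the first request is created, so that new connections pick it up. The class name ConnectionLimits, its method names, and the example URL mentioned afterwards are my own placeholders, not part of the original program. The second method shows the per-host alternative through ServicePointManager.FindServicePoint.

```csharp
using System;
using System.Net;

// Hypothetical helper; the names here are illustrative only.
static class ConnectionLimits
{
    // Raise the default limit for every host contacted from now on.
    // Call this once at startup, before any request is created.
    public static void Configure(int defaultLimit)
    {
        ServicePointManager.DefaultConnectionLimit = defaultLimit;
    }

    // Raise the limit for a single host only and return its ServicePoint.
    public static ServicePoint LimitForHost(string hostUrl, int limit)
    {
        ServicePoint servicePoint =
            ServicePointManager.FindServicePoint(new Uri(hostUrl));
        servicePoint.ConnectionLimit = limit;
        return servicePoint;
    }
}
```

For example, calling ConnectionLimits.Configure(100) at the start of the program, or ConnectionLimits.LimitForHost("http://example.com/", 4) for just one site. A smaller per-host value may be kinder to the server being crawled than a blanket 100.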
You can find more information at:
http://blogs.msdn.com/darrenj/archive/2005/03/07/386655.aspx
So, because the call to the service method takes several seconds, the WebRequest cannot release its connection quickly. When I create the third WebRequest, it throws an error: timeout.
Now let's look at the difference between my two versions above. I printed some debugging output to see what was happening. I found that if I use ThreadPool.QueueUserWorkItem(DoSomeThing, response);
then response.Close() is reached almost immediately, without waiting for DoSomeThing() to finish. There are never three WebRequests holding connections at the same time, so it works fine. How fun!
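One thing to be careful about with the queued version: response.Close() can run before DoSomeThing() has actually read the response. Here is a minimal sketch of a variation that frees the connection early without that race, by reading the whole page first, closing the response, and only then doing the slow upload. This is not the code I used at the time. The class name CrawlerSketch and the helper ReadAll are made up for illustration, UpLoadDataToService is an empty stand-in for the real service call, and WebRequest.Create replaces my CreateWebRequest helper.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical sketch; names other than PraseURL and
// UpLoadDataToService do not come from the original program.
static class CrawlerSketch
{
    // Read a stream to the end as text and dispose it.
    public static string ReadAll(Stream stream)
    {
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Stand-in for the slow service call from the post.
    static void UpLoadDataToService(string content)
    {
        // The real upload would go here.
    }

    public static void PraseURL(object state)
    {
        string url = (string)state;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        string content;
        // Leaving the using block closes the response and
        // hands the connection back before the upload starts.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            content = ReadAll(response.GetResponseStream());
        }

        // Slow work happens after the connection has been released.
        UpLoadDataToService(content);
    }
}
```

It can be queued the same way as before, with ThreadPool.QueueUserWorkItem(CrawlerSketch.PraseURL, url). The connection is only held while the page is being downloaded, so the slow upload no longer counts against the 2-connection limit.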
Posted on 2007-01-23 11:02 by Anders06
Category: .NET Tips