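First, be aware that fields in HTTP requests can depend on earlier responses: a later request may have to send back a value that the server returned previously.
For example, for a login operation, the captured session looks like this: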

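Here, force-refreshing the home page with Ctrl+F5 returns the userSession in the response body.
We fetch it in code and pass it along as a parameter: take the session from the response of the previous page (the home page) and pass it in as a parameter of the login request.
Straight to the code: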
public String doHomePage() throws ClientProtocolException, IOException, URISyntaxException {
// Force-refresh the home page: http://172.31.3.161:1080/WebTours/nav.pl?in=home
client = HttpClients.createDefault();
URI uri = new URIBuilder().setScheme("http").setHost("172.31.3.161").setPort(1080).setPath("/WebTours/nav.pl")
.setParameter("in", "home").build();
HttpGet get = new HttpGet(uri);
response = client.execute(get);
httpEntity = response.getEntity();
String result = EntityUtils.toString(httpEntity);
// JSONObject object = JSONObject.parseObject(result);
// System.out.println(result);
return result;
}
Next, we need to extract the session field from the HTML held in the result String, so we parse the HTML with jsoup.
jsoup link: http://www.oschina.net/p/jsoup/?fromerr=YuZhBZHi
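jsoup is a Java HTML parser that can directly parse a URL or a piece of HTML text. It provides a very convenient API for extracting and manipulating data through DOM methods, CSS selectors, and jQuery-like operations.
Its main features are: parsing HTML from a URL, file, or string; finding and extracting data with DOM traversal or CSS selectors; and manipulating HTML elements, attributes, and text.
jsoup is released under the MIT license, so it can safely be used in commercial projects.
Sample code:
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
    String linkHref = link.attr("href");
    String linkText = link.text();
}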
The parsing code is as follows:
Document parse = Jsoup.parse(html);
// Select the element whose name attribute is "userSession"
Elements sessionElement = parse.getElementsByAttributeValue("name", "userSession");
// Read its value attribute (an empty string if nothing matched)
String attr = sessionElement.attr("value");
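One thing worth guarding against in the parsing step: jsoup's Elements.attr returns an empty string when no matched element carries the attribute, so a changed page layout would silently produce an empty session. Below is a minimal sketch of a fail-fast check that uses only the JDK. The class and method names (SessionGuardSketch, requireSession) are hypothetical and not part of the original code.

```java
/** Hypothetical helper (not from the original post): fail fast on a missing session. */
final class SessionGuardSketch {

    private SessionGuardSketch() {
    }

    /**
     * Returns the session unchanged if it is usable, otherwise throws.
     * An empty value is what Elements.attr("value") yields when nothing matched.
     */
    static String requireSession(String session) {
        if (session == null || session.trim().isEmpty()) {
            throw new IllegalStateException("userSession not found in the home page HTML");
        }
        return session;
    }
}
```

It could be wrapped around the parsing result before the login request is built, for example String session = SessionGuardSketch.requireSession(attr);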
Capture the login page with Fiddler and inspect the request and the response body:

The part marked in red in the figure is the session value passed over from the home page:
We implement this passing in code:

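At the spot marked in red, the printed value did not match what Fiddler captured; after looking into it, the cause turned out to be escaping (encoding) of the value.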
post = new HttpPost("http://172.31.3.161:1080/WebTours/login.pl");
// HttpPost post = new
// HttpPost("http://172.31.3.183:1080/cgi-bin/nav.pl?in=home");
ArrayList<BasicNameValuePair> list = new ArrayList<BasicNameValuePair>();
list.add(new BasicNameValuePair("userSession", session));
BasicNameValuePair valuePair = new BasicNameValuePair("userSession", session);
String value = valuePair.getValue();
System.out.println(valuePair.getName() + "111111111111111---->" + value);
list.add(new BasicNameValuePair("username", "jojo"));
list.add(new BasicNameValuePair("password", "bean"));
list.add(new BasicNameValuePair("login.x", "0"));
list.add(new BasicNameValuePair("login.y", "0"));
list.add(new BasicNameValuePair("login", "Login"));
list.add(new BasicNameValuePair("JSFormSubmit", "off"));
formEntity = new UrlEncodedFormEntity(list, Consts.UTF_8);
post.setEntity(formEntity);
response = client.execute(post);
httpEntity = response.getEntity();
String result = EntityUtils.toString(httpEntity);
System.out.println(result);
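The escaping mentioned above can be illustrated without any HTTP library. UrlEncodedFormEntity URL-encodes every name and value before the request is sent, so a value printed from Java can look different from the bytes Fiddler shows on the wire if it contains reserved characters. Below is a minimal sketch using the JDK's URLEncoder, which follows the same application/x-www-form-urlencoded convention. The class name FormEncodingSketch and the session value are made up for illustration, and whether this explains the mismatch in a given capture is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration (not from the original post) of form URL-encoding. */
final class FormEncodingSketch {

    private FormEncodingSketch() {
    }

    /** Percent-encodes one name or value the way an HTML form submission does. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Joins the fields as name=value pairs separated by '&', in insertion order. */
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("userSession", "abc 123/=+"); // made-up value, not a real WebTours session
        fields.put("username", "jojo");
        // prints: userSession=abc+123%2F%3D%2B&username=jojo
        System.out.println(encodeForm(fields));
    }
}
```

Note that letters, digits, and the characters '.', '-', '*', and '_' pass through unchanged, so a session made up only of those characters would look identical before and after encoding.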
Finally, compare the output with what Fiddler captured; the two results match:
<!-- User password was correct - added a cookie with the user's default information. Set the user up to make reservations... --->
<html>
<title>Web Tours</title>
<frameset cols="160,*" border=1 frameborder=1>
<frame src=nav.pl?page=menu&in=home name=navbar marginheight=2 marginwidth=2 noresize scrolling=auto>
<frame src=login.pl?intro=true name=info marginheight=2 marginwidth=2 noresize scrolling=auto>
</frameset>
</body>
</html>
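Comparing against Fiddler by eye works once, but the same check can be automated. Below is a minimal sketch that looks for the success comment at the top of the response shown above. The class name LoginCheckSketch is hypothetical, and it is an assumption that the server emits this marker only after a successful login.

```java
/** Hypothetical check (not from the original post) for a successful WebTours login. */
final class LoginCheckSketch {

    // Marker copied from the HTML comment at the top of the login response.
    // Assumption: the server emits this text only after a successful login.
    private static final String SUCCESS_MARKER = "User password was correct";

    private LoginCheckSketch() {
    }

    /** Returns true if the response body contains the success marker. */
    static boolean isLoginSuccessful(String responseBody) {
        return responseBody != null && responseBody.contains(SUCCESS_MARKER);
    }

    public static void main(String[] args) {
        // Shortened sample of the response body, for illustration only.
        String sample = "<!-- User password was correct - added a cookie ---> <html></html>";
        System.out.println(isLoginSuccessful(sample)); // prints: true
    }
}
```

Inside a JUnit test, the result of the login request could then be asserted with Assert.assertTrue(LoginCheckSketch.isLoginSuccessful(result)) instead of only being printed.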
The complete example code is as follows:
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import org.apache.http.Consts;
import org.apache.http.HttpEntity;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
import org.junit.Test;
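/**
* Q1: Do variables declared in static methods affect one another?
* Q2: The session value parsed from the HTML by the code differs from the one captured by Fiddler.
* Q3: How to verify that the code is correct and that the test has passed.
* Q4: Authentication - deferred for now.
*/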
public class SessionTest {
// Shared across the methods below: doHomePage() creates the client, doLogin() reuses it
CloseableHttpClient client;
HttpPost post;
UrlEncodedFormEntity formEntity;
CloseableHttpResponse response;
HttpEntity httpEntity;
@Test
public void startDo() throws ClientProtocolException, IOException, URISyntaxException {
String result = doHomePage();
String session = HtmlParse(result);
System.out.println(session);
System.out.println("----------------------------");
doLogin(session);
}
public static String HtmlParse(String html) {
Document parse = Jsoup.parse(html);
// Select the element whose name attribute is "userSession"
Elements sessionElement = parse.getElementsByAttributeValue("name", "userSession");
// Read its value attribute (an empty string if nothing matched)
String attr = sessionElement.attr("value");
return attr;
}
public String doHomePage() throws ClientProtocolException, IOException, URISyntaxException {
// Force-refresh the home page: http://172.31.3.161:1080/WebTours/nav.pl?in=home
client = HttpClients.createDefault();
URI uri = new URIBuilder().setScheme("http").setHost("172.31.3.161").setPort(1080).setPath("/WebTours/nav.pl")
.setParameter("in", "home").build();
HttpGet get = new HttpGet(uri);
response = client.execute(get);
httpEntity = response.getEntity();
String result = EntityUtils.toString(httpEntity);
// JSONObject object = JSONObject.parseObject(result);
System.out.println(result);
return result;
}
public void doLogin(String session) throws ClientProtocolException, IOException {
post = new HttpPost("http://172.31.3.161:1080/WebTours/login.pl");
// HttpPost post = new
// HttpPost("http://172.31.3.183:1080/cgi-bin/nav.pl?in=home");
ArrayList<BasicNameValuePair> list = new ArrayList<BasicNameValuePair>();
list.add(new BasicNameValuePair("userSession", session));
BasicNameValuePair valuePair = new BasicNameValuePair("userSession", session);
String value = valuePair.getValue();
System.out.println(valuePair.getName() + "111111111111111---->" + value);
list.add(new BasicNameValuePair("username", "jojo"));
list.add(new BasicNameValuePair("password", "bean"));
list.add(new BasicNameValuePair("login.x", "0"));
list.add(new BasicNameValuePair("login.y", "0"));
list.add(new BasicNameValuePair("login", "Login"));
list.add(new BasicNameValuePair("JSFormSubmit", "off"));
formEntity = new UrlEncodedFormEntity(list, Consts.UTF_8);
post.setEntity(formEntity);
response = client.execute(post);
httpEntity = response.getEntity();
String result = EntityUtils.toString(httpEntity);
System.out.println(result);
}
}
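The open questions listed in the comments above will be covered in detail in the next chapter.
If anyone has already solved them, please let me know. Thanks.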