How to write a simple HTTP proxy with Boost.Asio
In this article I describe how to write a simple cross-platform HTTP proxy.
What we need
To develop this example (source code) I used Boost version 1.35. To build the example you can use cmake, although you can also build the sources manually. To configure and build, run the following commands (on Unix-like OSes) [1]:

> cmake .
> make

After compilation you'll get the proxy-asio-async executable, which you can run from the command line. The program accepts a single argument: the number of threads that will process requests (2 by default). The port on which requests are accepted is hardcoded in the source code as 10001 [2].

Architecture
As in the previous examples, our program consists of three parts:
- the main function, which parses the command line, creates separate threads for the asio services together with the server object, and then enters the request processing loop;
- the server class, which accepts requests and creates a connection object for each of them;
- the connection class, which implements all the connection-handling logic and passes data between the client and the web server.

Data is processed asynchronously, and to distribute load between processors we can use several independent asio services (asio::io_service) that dispatch the calls.

Note: The hardest part of developing asynchronous code is designing the data flow properly. I usually draw a state diagram and then turn each state into a separate function. Such a diagram also helps other developers understand the code.
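The idea of spreading connections over several independent services can be pictured as a circular pick from a fixed pool. The sketch below is only an illustration under stated assumptions: the class name round_robin_pool and its next method are hypothetical and not taken from the example's sources, and the pool is generic so that the block compiles without Boost. In the proxy the element type would presumably be a pointer to an asio::io_service.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the article's sources): hands out the
// elements of a fixed pool in circular order. In the proxy, T would be
// something like a pointer to an asio::io_service.
template <typename T>
class round_robin_pool {
public:
    explicit round_robin_pool(std::vector<T> items)
        : items_(std::move(items)), next_(0) {
        assert(!items_.empty() && "the pool needs at least one element");
    }

    // Return the current element and advance, wrapping around at the end.
    // Not thread-safe: meant to be called from the accepting code only.
    T& next() {
        T& item = items_[next_];
        next_ = (next_ + 1) % items_.size();
        return item;
    }

private:
    std::vector<T> items_;
    std::size_t next_;
};
```

With a pool of three elements, four successive calls to next() return the first, second, third, and then the first element again, so each new connection lands on a different service until the list wraps around.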
Implementation
The main function is pretty simple, so we won't analyze it; you can just look at its source code to understand what it does (all common definitions are in the file common.h).

The implementation of the server (the server class, proxy-server.hpp & proxy-server.cpp) also differs little from the previous examples. The only change is in the method that selects the service that will perform dispatching: in our example the new service is picked from a circular list of services, which gives us some load balancing between requests.

All data processing logic is implemented in the connection class (proxy-conn.hpp & proxy-conn.cpp). Note that header parsing was written without any optimisation [3].

Data processing starts with a call to the start function from the server class, which accepts a connection and creates a new object of the connection class. This function initiates asynchronous reading of request headers from the browser.

Request headers are read in the handle_browser_read_headers function, which is called when some data arrives from the browser. If the headers are incomplete (the empty line, \r\n\r\n, has not been seen yet), the function initiates another read and keeps trying until all headers have arrived.

Once all headers are received, the function parses them and extracts the HTTP protocol version, the method used, and the address of the web server (some of this data is needed later to detect persistent connections).
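The two checks described above, detecting the end of the headers and splitting the request line, can be sketched with the standard library alone. This is a minimal illustration, not the code from proxy-conn.cpp: the request_line struct and the functions headers_complete and parse_request_line are hypothetical names, and the sketch assumes a request line of the form "METHOD URL VERSION".

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch; not from the article's sources.
struct request_line {
    std::string method;   // e.g. "GET"
    std::string url;      // e.g. "http://example.com/index.html"
    std::string version;  // e.g. "HTTP/1.1"
};

// True once the buffer holds the blank line that ends the header block.
bool headers_complete(const std::string& buffer) {
    return buffer.find("\r\n\r\n") != std::string::npos;
}

// Split the first line of the request into three space-separated parts.
// Returns false if the line is missing or does not have exactly that shape.
bool parse_request_line(const std::string& buffer, request_line& out) {
    const std::size_t eol = buffer.find("\r\n");
    if (eol == std::string::npos) return false;

    std::istringstream line(buffer.substr(0, eol));
    if (!(line >> out.method >> out.url >> out.version)) return false;

    std::string extra;
    return !(line >> extra);  // reject trailing tokens
}
```

A read handler would typically call headers_complete on the accumulated buffer after every read, and only call parse_request_line once it returns true; until then it would schedule another read.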
After parsing the headers, this function calls start_connect, which parses the address of the web server. If we don't have an open connection to this server, it initiates name resolution; if we do, we simply start the data transfer with the start_write_to_server function.

The handle_resolve function is called after name resolution, and if we got the server's address it initiates the connection. The result is handled by the handle_connect function, which starts the data transfer to the server with start_write_to_server; that function forms correct headers and passes the data to the server.

After the data has been sent to the server, the handle_server_write function initiates reading of the response from the server (headers only at first). The headers are processed by the handle_server_read_headers function, which is similar to handle_browser_read_headers but also works out whether the connection should be closed after the data transfer. After processing the headers, this function starts sending data to the browser.

After the headers are sent, we enter a loop that transfers the body of the response from the server to the browser. The loop uses two functions, handle_server_read_body and handle_browser_write, which call each other until we finish reading data from the server (either the number of bytes specified in the headers has been read, or we get end of file).

If we get end of file, we pass the rest of the data to the browser and close the connection. If we use a persistent connection, we pass control to the start function, which initiates reading of new headers from the browser.

That's all. As I mentioned above, the main problem is building the right data-flow sequence.
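The data flow described in this section can also be written down as a small transition table, which is roughly what a state diagram turns into. The sketch below is an illustration only: the step enum, the context struct, and the next_step function are hypothetical and do not exist in the example's sources, the enumerators merely mirror the handler names used above, and error paths (for example a failed name resolution) are left out.

```cpp
#include <cassert>

// One enumerator per handler named in the article. The enum itself and
// next_step() are illustrative and not part of the original sources.
enum class step {
    start,
    browser_read_headers,
    start_connect,
    resolve,
    connect,
    write_to_server,
    server_write,
    server_read_headers,
    browser_write,
    server_read_body,
    closed
};

// Facts a handler knows when it finishes; the field names are assumptions.
struct context {
    bool headers_complete;  // blank line seen in the browser request
    bool server_connected;  // a connection to the target server is open
    bool body_finished;     // declared length reached or end of file seen
    bool persistent;        // connection may be reused for another request
};

// Return the step that follows `current`, ignoring all error paths.
step next_step(step current, const context& ctx) {
    switch (current) {
    case step::start:
        return step::browser_read_headers;
    case step::browser_read_headers:
        return ctx.headers_complete ? step::start_connect
                                    : step::browser_read_headers;
    case step::start_connect:
        return ctx.server_connected ? step::write_to_server : step::resolve;
    case step::resolve:
        return step::connect;
    case step::connect:
        return step::write_to_server;
    case step::write_to_server:
        return step::server_write;
    case step::server_write:
        return step::server_read_headers;
    case step::server_read_headers:
        return step::browser_write;
    case step::browser_write:
        if (!ctx.body_finished) return step::server_read_body;
        return ctx.persistent ? step::start : step::closed;
    case step::server_read_body:
        return step::browser_write;
    case step::closed:
        return step::closed;
    }
    return step::closed;  // not reached for valid enumerators
}
```

Writing the transitions out this way makes the two loops easy to spot: browser_read_headers repeats until the headers are complete, and browser_write and server_read_body alternate until the body is finished.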
1. If cmake can't find the required libraries, you can specify their location with two cmake variables, CMAKE_INCLUDE_PATH and CMAKE_LIBRARY_PATH, by running cmake the following way:
> cmake . -DCMAKE_INCLUDE_PATH=~/exp/include -DCMAKE_LIBRARY_PATH=~/exp/lib
2. I could also have implemented code that allows the port number to be specified on the command line, but I was lazy, as this example was just a prototype to check some of my ideas.
3. There is also the cpp-netlib project, which has parsers (development in progress) for basic protocols: HTTP, SMTP, etc.