HTTP Parser
===========

+ [![Build Status](https://api.travis-ci.org/nodejs/http-parser.svg?branch=master)](https://travis-ci.org/nodejs/http-parser)
+
This is a parser for HTTP messages written in C. It parses both requests and
responses. The parser is designed to be used in high-performance HTTP
applications. It does not make any syscalls nor allocations, it does not
@@ -34,43 +36,46 @@ Usage
One `http_parser` object is used per TCP connection. Initialize the struct
using `http_parser_init()` and set the callbacks. That might look something
like this for a request parser:
+ ```c
+ http_parser_settings settings;
+ settings.on_url = my_url_callback;
+ settings.on_header_field = my_header_field_callback;
+ /* ... */

- http_parser_settings settings;
- settings.on_url = my_url_callback;
- settings.on_header_field = my_header_field_callback;
- /* ... */
-
- http_parser *parser = malloc(sizeof(http_parser));
- http_parser_init(parser, HTTP_REQUEST);
- parser->data = my_socket;
+ http_parser *parser = malloc(sizeof(http_parser));
+ http_parser_init(parser, HTTP_REQUEST);
+ parser->data = my_socket;
+ ```

When data is received on the socket execute the parser and check for errors.

- size_t len = 80*1024, nparsed;
- char buf[len];
- ssize_t recved;
+ ```c
+ size_t len = 80*1024, nparsed;
+ char buf[len];
+ ssize_t recved;

- recved = recv(fd, buf, len, 0);
+ recved = recv(fd, buf, len, 0);

- if (recved < 0) {
-   /* Handle error. */
- }
+ if (recved < 0) {
+   /* Handle error. */
+ }

- /* Start up / continue the parser.
-  * Note we pass recved==0 to signal that EOF has been recieved.
-  */
- nparsed = http_parser_execute(parser, &settings, buf, recved);
+ /* Start up / continue the parser.
+  * Note we pass recved==0 to signal that EOF has been received.
+  */
+ nparsed = http_parser_execute(parser, &settings, buf, recved);

- if (parser->upgrade) {
-   /* handle new protocol */
- } else if (nparsed != recved) {
-   /* Handle error. Usually just close the connection. */
- }
+ if (parser->upgrade) {
+   /* handle new protocol */
+ } else if (nparsed != recved) {
+   /* Handle error. Usually just close the connection. */
+ }
+ ```

- HTTP needs to know where the end of the stream is. For example, sometimes
+ `http_parser` needs to know where the end of the stream is. For example, sometimes
servers send responses without Content-Length and expect the client to
- consume input (for the body) until EOF. To tell http_parser about EOF, give
- `0` as the forth parameter to `http_parser_execute()`. Callbacks and errors
+ consume input (for the body) until EOF. To tell `http_parser` about EOF, give
+ `0` as the fourth parameter to `http_parser_execute()`. Callbacks and errors
can still be encountered during an EOF, so one must still be prepared
to receive them.

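The checks that follow `http_parser_execute()` in the usage snippet can be folded into one small decision helper. The following is a minimal sketch and not part of the library: the `next_step` enum and `classify_execute_result()` are hypothetical names, and the helper takes the three values from the snippet (`parser->upgrade`, `nparsed`, `recved`) as plain arguments so that it compiles without `http_parser.h`.

```c
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical names, for illustration only. */
enum next_step {
  STEP_CONTINUE,  /* all bytes were consumed, keep reading from the socket */
  STEP_EOF,       /* EOF (recved == 0) was passed to the parser without a mismatch */
  STEP_UPGRADE,   /* hand the connection over to the new protocol */
  STEP_ERROR      /* read or parse error, usually close the connection */
};

/* Mirrors the order of the checks in the usage snippet:
 * upgrade first, then the nparsed/recved comparison. */
enum next_step classify_execute_result(int upgrade, size_t nparsed, ssize_t recved) {
  if (recved < 0) {
    return STEP_ERROR;            /* recv() itself failed */
  }
  if (upgrade) {
    return STEP_UPGRADE;
  }
  if (nparsed != (size_t)recved) {
    return STEP_ERROR;            /* the parser did not accept the whole input */
  }
  if (recved == 0) {
    return STEP_EOF;              /* end of stream, stop reading */
  }
  return STEP_CONTINUE;
}
```

A read loop would call this once per `recv()`/`http_parser_execute()` pair and switch on the result.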
@@ -88,8 +93,8 @@ the on_body callback.
The Special Problem of Upgrade
------------------------------

- HTTP supports upgrading the connection to a different protocol. An
- increasingly common example of this is the Web Socket protocol which sends
+ `http_parser` supports upgrading the connection to a different protocol. An
+ increasingly common example of this is the WebSocket protocol which sends
a request like

    GET /demo HTTP/1.1
@@ -101,11 +106,11 @@ a request like

followed by non-HTTP data.

- (See http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75 for more
- information the Web Socket protocol.)
+ (See [RFC6455](https://tools.ietf.org/html/rfc6455) for more information on the
+ WebSocket protocol.)

To support this, the parser will treat this as a normal HTTP message without a
- body. Issuing both on_headers_complete and on_message_complete callbacks. However
+ body, issuing both on_headers_complete and on_message_complete callbacks. However
http_parser_execute() will stop parsing at the end of the headers and return.

The user is expected to check if `parser->upgrade` has been set to 1 after
@@ -126,21 +131,84 @@ There are two types of callbacks:
* notification `typedef int (*http_cb) (http_parser*);`
  Callbacks: on_message_begin, on_headers_complete, on_message_complete.
* data `typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);`
-   Callbacks: (requests only) on_uri,
+   Callbacks: (requests only) on_url,
      (common) on_header_field, on_header_value, on_body;

Callbacks must return 0 on success. Returning a non-zero value indicates
error to the parser, making it exit immediately.

+ For cases where it is necessary to pass local information to/from a callback,
+ the `http_parser` object's `data` field can be used.
+ An example of such a case is when using threads to handle a socket connection,
+ parse a request, and then give a response over that socket. By instantiating
+ a thread-local struct containing relevant data (e.g. accepted socket,
+ allocated memory for callbacks to write into, etc.), a parser's callbacks are
+ able to communicate data between the scope of the thread and the scope of the
+ callback in a thread-safe manner. This allows `http_parser` to be used in
+ multi-threaded contexts.
+
+ Example:
+ ```c
+ typedef struct {
+   socket_t sock;
+   void* buffer;
+   int buf_len;
+ } custom_data_t;
+
+
+ int my_url_callback(http_parser* parser, const char *at, size_t length) {
+   /* Access the thread-local custom_data_t struct through parser->data.
+    * Use it to save parsed data into the thread-local buffer for later use,
+    * or to communicate over the socket.
+    */
+   custom_data_t *my_data = parser->data;
+   /* ... */
+   return 0;
+ }
+
+ /* ... */
+
+ void http_parser_thread(socket_t sock) {
+   size_t nparsed = 0;
+   /* allocate memory for user data */
+   custom_data_t *my_data = malloc(sizeof(custom_data_t));
+
+   /* some information for use by callbacks.
+    * achieves thread -> callback information flow */
+   my_data->sock = sock;
+
+   /* instantiate a thread-local parser */
+   http_parser *parser = malloc(sizeof(http_parser));
+   http_parser_init(parser, HTTP_REQUEST); /* initialise parser */
+   /* this custom data reference is accessible through the reference to the
+    * parser supplied to callback functions */
+   parser->data = my_data;
+
+   http_parser_settings settings; /* set up callbacks */
+   memset(&settings, 0, sizeof(settings)); /* unset callbacks must be NULL */
+   settings.on_url = my_url_callback;
+
+   /* execute parser; buf and recved come from recv(), as in the usage example */
+   nparsed = http_parser_execute(parser, &settings, buf, recved);
+
+   /* ... */
+   /* parsed information copied from callback.
+    * can now perform action on data copied into thread-local memory from callbacks.
+    * achieves callback -> thread information flow */
+   my_data->buffer;
+   /* ... */
+ }
+
+ ```
+
In case you parse an HTTP message in chunks (i.e. `read()` request line
from socket, parse, read half headers, parse, etc.) your data callbacks
- may be called more than once. Http-parser guarantees that data pointer is only
+ may be called more than once. `http_parser` guarantees that the data pointer is only
valid for the lifetime of the callback. You can also `read()` into a heap allocated
buffer to avoid copying memory around if this fits your application.

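Because a data callback may fire several times for one logical value, and the `at` pointer is only valid during the call, a callback that needs the value later has to append a copy on each invocation. The following is a minimal sketch under stated assumptions: `request_state` and `append_url_cb` are hypothetical names, the 256-byte limit is arbitrary, and the one-field `http_parser` struct is a stand-in so that the example compiles on its own (real code would include `http_parser.h` instead).

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the real struct so the sketch compiles on its own.
 * In real code, include "http_parser.h"; only the `data` field is used here. */
typedef struct http_parser { void *data; } http_parser;

/* Hypothetical per-connection state, reached through parser->data. */
typedef struct {
  char   url[256];   /* arbitrary limit chosen for this example */
  size_t url_len;
} request_state;

/* Data callback with the http_data_cb signature. It may be called more than
 * once for a single URL, so each fragment is appended rather than assigned. */
int append_url_cb(http_parser *parser, const char *at, size_t length) {
  request_state *st = parser->data;

  /* Keep one byte free for the terminating NUL. */
  if (length >= sizeof(st->url) - st->url_len) {
    return 1;  /* non-zero return reports an error to the parser */
  }

  /* Copy the bytes: `at` is not valid once the callback returns. */
  memcpy(st->url + st->url_len, at, length);
  st->url_len += length;
  st->url[st->url_len] = '\0';
  return 0;
}
```

Resetting `url_len` to 0 at the start of each message (for example in an `on_message_begin` callback) would let the same state be reused across requests on one connection.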
Reading headers may be a tricky task if you read/parse headers partially.
Basically, you need to remember whether last header callback was field or value
- and apply following logic:
+ and apply the following logic:

    (on_header_field and on_header_value shortened to on_h_*)
     ------------------------ ------------ --------------------------------------------