documentation updates:
* add CONTRIBUTING.md
* update pynsq docs
* update README.md (add Authors/Contributors)
* update protocol docs
* update nsqd docs
* update INSTALLING.md
mreiferson committed Oct 22, 2012
1 parent 4d1f5a6 commit d9af9e5
Showing 6 changed files with 237 additions and 52 deletions.
37 changes: 37 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,37 @@
# Contributing

Thanks for your interest in contributing to NSQ!

## Getting Started

* make sure you have a [GitHub account](https://github.com/signup/free)
* submit a ticket for your issue, assuming one does not already exist
* clearly describe the issue including steps to reproduce when it is a bug
* identify specific versions of the binaries and client libraries
* fork the repository on GitHub

## Making Changes

* create a branch from where you want to base your work
* we typically name branches according to the following format: `helpful_name_<issue_number>`
* make commits of logical units
* make sure your commit messages are in a clear and readable format, example:

```
nsqd: fixed bug in protocol_v2
* update the message pump to properly account for RDYness
* cleanup variable names
* ...
```

* if you're fixing a bug or adding functionality it probably makes sense to write a test
* make sure to run `fmt.sh` and `test.sh` in the root of the repo to ensure that your code is
properly formatted and that tests pass (NOTE: we integrate Travis with GitHub for continuous
integration)

## Submitting Changes

* push your changes to your branch in your fork of the repository
* submit a pull request against bitly's repository
* comment in the pull request when you're ready for the changes to be reviewed: `"ready for review"`
23 changes: 16 additions & 7 deletions INSTALLING.md
@@ -1,4 +1,13 @@
# Pre-requisites
## Binary Releases

Pre-built binaries (`nsqd`, `nsqlookupd`, `nsqadmin`, and all example apps) for linux and darwin are
available for [download][binary].

## Building From Source

### Pre-requisites

**golang** http://golang.org/doc/install

**simplejson** https://github.com/bitly/go-simplejson

@@ -12,12 +21,10 @@

$ go get github.com/bmizerany/assert

# Installing

Binaries (`nsqd`, `nsqlookupd`, `nsqadmin`, and all example apps)
### Compiling

Note: Binaries can not be built from within $GOPATH because of relative imports. To build, checkout to a directory
outside of $GOPATH
NOTE: binaries cannot be built from within `$GOPATH` because of relative imports. To build,
check out the repository to a directory outside of your `$GOPATH`.

$ git clone https://github.com/bitly/nsq.git
$ cd $REPO
@@ -32,6 +39,8 @@ Python module (for building Python readers)

$ pip install pynsq

# Testing
## Testing

$ ./test.sh

[binary]: https://github.com/bitly/nsq/downloads
57 changes: 46 additions & 11 deletions README.md
@@ -14,8 +14,8 @@ and darwin.

[![Build Status](https://secure.travis-ci.org/bitly/nsq.png)](http://travis-ci.org/bitly/nsq)

**NSQ** was built as a successor to [simplequeue][simplequeue] (part of [simplehttp][simplehttp]) and as
such was designed to (in no particular order):
**NSQ** was built as a successor to [simplequeue][simplequeue] (part of [simplehttp][simplehttp])
and as such was designed to (in no particular order):

* provide easy topology solutions that enable high-availability and eliminate SPOFs
* address the need for stronger message delivery guarantees
@@ -77,10 +77,10 @@ This was one of our **highest** priorities. Our production systems handle a larg
all built upon our existing messaging tools, so we needed a way to slowly and methodically upgrade
specific parts of our infrastructure with little to no impact.

First, on the message *producer* side we built `nsqd` to match [simplequeue][simplequeue]. Specifically, nsqd
exposes an HTTP `/put` endpoint, just like `simplequeue`, to POST binary data (with the one caveat
that the endpoint takes an additional query parameter specifying the "topic"). Services that wanted
to switch to start publishing to `nsqd` only have to make minor code changes.
First, on the message *producer* side we built `nsqd` to match [simplequeue][simplequeue].
Specifically, nsqd exposes an HTTP `/put` endpoint, just like `simplequeue`, to POST binary data
(with the one caveat that the endpoint takes an additional query parameter specifying the "topic").
Services that want to start publishing to `nsqd` only have to make minor code changes.
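
As a rough illustration of how small the producer-side change is, publishing amounts to an HTTP
POST with the topic in the query string. The sketch below uses only the Python standard library.
The helper name `build_put_request` is ours (it is not part of NSQ), and the default address is
simply the local `nsqd` HTTP port used elsewhere in this README.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_put_request(body, topic, nsqd_http="127.0.0.1:4151"):
    """Build (but do not send) a POST to nsqd's /put endpoint.

    Hypothetical helper for illustration; not part of NSQ itself.
    """
    if isinstance(body, str):
        body = body.encode("utf-8")
    # the topic travels as a query parameter, the message as the raw POST body
    url = "http://%s/put?%s" % (nsqd_http, urlencode({"topic": topic}))
    return Request(url, data=body, method="POST")


req = build_put_request("hello world", "test")
print(req.get_method(), req.full_url)

# To actually publish (requires a running nsqd):
# urlopen(req).read()
```

Building the request separately from sending it keeps the example runnable without a live `nsqd`.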

Second, we built libraries in both Python and Go that matched the functionality and idioms we had
been accustomed to in our existing libraries. This eased the transition on the message *consumer*
@@ -199,8 +199,8 @@ we would have traditionally maintained the older toolchain discussed above.
### Go

We made a strategic decision early on to build the **NSQ** core in [Go][golang]. We recently blogged
about our [use of Go at bitly][go_at_bitly] and alluded to this very project - it might be helpful to browse
through that post to get an understanding of our thinking with respect to the language.
about our [use of Go at bitly][go_at_bitly] and alluded to this very project - it might be helpful
to browse through that post to get an understanding of our thinking with respect to the language.

Regarding **NSQ**, Go channels (not to be confused with **NSQ** channels) and the language's built
in concurrency features are a perfect fit for the internal workings of `nsqd`. We leverage buffered
@@ -226,9 +226,10 @@ There is also a [protocol spec][protocol].
### Getting Started

The following steps will run **NSQ** on your local machine and walk through publishing, consuming,
and archive messages to disk.
and archiving messages to disk.

1. follow the instructions in the [INSTALLING][installing] doc.
1. follow the instructions in the [INSTALLING][installing] doc (or [download a binary
release][binary]).
2. in one shell, start `nsqlookupd`:

$ nsqlookupd
@@ -252,11 +253,31 @@ and archive messages to disk.
$ curl -d 'hello world 3' 'http://127.0.0.1:4151/put?topic=test'

7. to verify things worked as expected, in a web browser open `http://127.0.0.1:4171/` to view
the `nsqadmin` UI and check the log files (`test.*.log`) written to `/tmp`.
the `nsqadmin` UI and see statistics. Also, check the contents of the log files (`test.*.log`)
written to `/tmp`.

The important lesson here is that `nsq_to_file` (the client) is not explicitly told where the `test`
topic is produced; it retrieves this information from `nsqlookupd`.
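
To make the discovery step more concrete, here is a minimal sketch of the kind of lookup a client
could perform. It is an illustration under assumptions, not a description of what `nsq_to_file`
does internally: the `/lookup` path on the `nsqlookupd` HTTP port and the response fields (`data`,
`producers`, `address`, `tcp_port`) are assumed here and may differ between versions, and both
helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def lookup_url(topic, lookupd_http="127.0.0.1:4161"):
    """URL a client might query to find producers of a topic (assumed endpoint)."""
    return "http://%s/lookup?%s" % (lookupd_http, urlencode({"topic": topic}))


def producer_addresses(payload):
    """Extract (address, tcp_port) pairs from a decoded lookup response.

    The field names are assumptions; adjust them to what your nsqlookupd returns.
    """
    data = payload.get("data", payload)
    return [(p["address"], p["tcp_port"]) for p in data.get("producers", [])]


# A made-up response body, used only so the example runs without a server.
sample = json.loads(
    '{"data": {"producers": [{"address": "127.0.0.1", "tcp_port": 4150}]}}'
)
print(lookup_url("test"))
print(producer_addresses(sample))
```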

## Authors

NSQ was designed and developed by Matt Reiferson ([@imsnakes][snakes_twitter]) and Jehiah Czebotar
([@jehiah][jehiah_twitter]) but wouldn't have been possible without the support of
[bitly][bitly]:

* Dan Frank ([@danielhfrank][dan_twitter])
* Pierce Lopez ([@ploxiln][pierce_twitter])
* Will McCutchen ([@mccutchen][mccutch_twitter])
* Micha Gorelick ([@mynameisfiber][micha_twitter])
* Jay Ridgeway ([@jayridge][jay_twitter])

### Contributors

* Phillip Rosen ([@phillro][phil_github]) for the [Node.js Client Library][node_lib]
* David Gardner ([@davegardnerisme][david_twitter]) for the [PHP Client Library][php_lib]
* Harley Laue ([@losinggeneration][harley_github])
* Justin Azoff ([@JustinAzoff][justin_github])

[simplehttp]: https://github.com/bitly/simplehttp
[idempotence]: http://en.wikipedia.org/wiki/Idempotence
[golang]: http://golang.org
@@ -273,3 +294,17 @@ topic is produced, it retrieves this information from `nsqlookupd`.
[pynsq]: https://github.com/bitly/nsq/tree/master/pynsq
[nsq_post]: http://word.bitly.com/post/33232969144/nsq
[binary]: https://github.com/bitly/nsq/downloads
[snakes_twitter]: https://twitter.com/imsnakes
[jehiah_twitter]: https://twitter.com/jehiah
[dan_twitter]: https://twitter.com/danielhfrank
[pierce_twitter]: https://twitter.com/ploxiln
[mccutch_twitter]: https://twitter.com/mccutchen
[micha_twitter]: https://twitter.com/mynameisfiber
[harley_github]: https://github.com/losinggeneration
[david_twitter]: https://twitter.com/davegardnerisme
[justin_github]: https://github.com/JustinAzoff
[phil_github]: https://github.com/phillro
[node_lib]: https://github.com/phillro/nodensq
[php_lib]: https://github.com/davegardnerisme/nsqphp
[bitly]: https://bitly.com
[jay_twitter]: https://twitter.com/jayridge
4 changes: 2 additions & 2 deletions docs/protocol.md
@@ -125,8 +125,8 @@ Commands are line oriented and structured as follows:

NOTE: there is no response

Unlike **V1**, data is streamed asynchronously to the client and framed in order to support the
various reply bodies, ie:
Data is streamed asynchronously to the client and framed in order to support the various reply
bodies, ie:

[ 4-byte integer frame ID ][ 4-byte size in bytes ][ N-byte data ]
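
A client-side sketch of this framing might look like the following. It follows the field order
exactly as written in the line above; the byte order (big-endian), the signedness of the integers,
and the reading that the size field counts only the data bytes are assumptions, so check them
against the rest of the spec. The function names and the frame ID values used are placeholders.

```python
import struct

# two 4-byte integers: frame ID, then size (big-endian is an assumption)
HEADER = struct.Struct(">ii")


def pack_frame(frame_id, data):
    """Serialize one frame: header followed by the raw data bytes."""
    return HEADER.pack(frame_id, len(data)) + data


def unpack_frame(buf):
    """Return (frame_id, data, remaining_bytes), or None if buf is incomplete."""
    if len(buf) < HEADER.size:
        return None
    frame_id, size = HEADER.unpack_from(buf)
    end = HEADER.size + size
    if len(buf) < end:
        return None
    return frame_id, buf[HEADER.size:end], buf[end:]


# Two frames arriving back to back on the same stream.
stream = pack_frame(0, b"OK") + pack_frame(2, b"msg")
first = unpack_frame(stream)
second = unpack_frame(first[2])
print(first[:2], second[:2])
```

Returning the unconsumed bytes lets a reader loop over a buffer that holds several frames or a
partial one, which matters because data is streamed asynchronously.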

74 changes: 42 additions & 32 deletions nsqd/README.md
@@ -1,41 +1,51 @@
nsqd
====
## nsqd

`nsqd` is the daemon that receives, buffers, and delivers messages to clients. It optionally connects to `nsqlookupd`
instances to announce topic and channels. It has a TCP API for clients, and HTTP API for publishing messages,
administrative actions, and statistics.
`nsqd` is the daemon that receives, buffers, and delivers messages to clients.

HTTP API
--------
It is normally run alongside `nsqlookupd` instances to announce topics and channels but can be run
standalone.

* `/put?topic=...` [message is POST body]
It listens on two TCP ports, one for clients and another for the HTTP API.

curl -d "<message>" http://127.0.0.1:4151/put?topic=message_topic
### HTTP API

* `/mput?topic=...` [messages are new line separated POST body]
* `/put?topic=...`

curl -d "<message>\n<message>" http://127.0.0.1:4151/put?topic=message_topic
POST message body

`$ curl -d "<message>" http://127.0.0.1:4151/put?topic=message_topic`

* `/mput?topic=...`

POST message body (`\n` separated)

`$ curl -d "<message>\n<message>" http://127.0.0.1:4151/mput?topic=message_topic`

* `/empty_channel?topic=...&channel=...`
* `/delete_channel?topic=...&channel=...`
* `/stats` [?format=json]
* `/ping` (returns "OK" for use with monitoring)
* `/info` returns server version information.


Command Line Options
--------------------

Usage of ./nsqd:
-data-path="": path to store disk-backed messages
-debug=false: enable debug mode
-http-address="0.0.0.0:4151": <addr>:<port> to listen on for HTTP clients
-lookupd-tcp-address=[]: lookupd TCP address (may be given multiple times)
-max-bytes-per-file=104857600: number of bytes per diskqueue file before rolling
-mem-queue-size=10000: number of messages to keep in memory (per topic)
-msg-timeout=60000: time (ms) to wait before auto-requeing a message
-sync-every=2500: number of messages between diskqueue syncs
-tcp-address="0.0.0.0:4150": <addr>:<port> to listen on for TCP clients
-verbose=false: enable verbose logging
-version=false: print version string
-worker-id=0: unique identifier (int) for this worker (will default to a hash of hostname)
* `/stats`

supports both text and JSON via `?format=json`

* `/ping`

returns `OK`, helpful when monitoring

* `/info`

returns version information
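
For callers that are not using `curl`, a multi-message publish can be sketched in Python as shown
below. This is a minimal illustration using only the standard library; `build_mput_request` is a
hypothetical helper name, and rejecting bodies that contain a newline is our own precaution, since
`\n` is the separator.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_mput_request(messages, topic, nsqd_http="127.0.0.1:4151"):
    """Build (but do not send) a POST to /mput with newline separated bodies."""
    bodies = [m.encode("utf-8") if isinstance(m, str) else m for m in messages]
    for body in bodies:
        if b"\n" in body:
            # a newline inside a body would be read as a message boundary
            raise ValueError("message bodies must not contain newlines for /mput")
    url = "http://%s/mput?%s" % (nsqd_http, urlencode({"topic": topic}))
    return Request(url, data=b"\n".join(bodies), method="POST")


req = build_mput_request(["first", "second"], "message_topic")
print(req.full_url, req.data)

# To actually publish (requires a running nsqd):
# urlopen(req).read()
```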

### Command Line Options

    -data-path="": path to store disk-backed messages
    -debug=false: enable debug mode
    -http-address="0.0.0.0:4151": <addr>:<port> to listen on for HTTP clients
    -lookupd-tcp-address=[]: lookupd TCP address (may be given multiple times)
    -max-bytes-per-file=104857600: number of bytes per diskqueue file before rolling
    -mem-queue-size=10000: number of messages to keep in memory (per topic)
    -msg-timeout=60000: time (ms) to wait before auto-requeing a message
    -sync-every=2500: number of messages between diskqueue syncs
    -tcp-address="0.0.0.0:4150": <addr>:<port> to listen on for TCP clients
    -verbose=false: enable verbose logging
    -version=false: print version string
    -worker-id=0: unique identifier (int) for this worker (will default to a hash of hostname)
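
If you launch `nsqd` from a script, the options above can be assembled programmatically. The sketch
below is one possible way to do that: `nsqd_argv` is a hypothetical helper, and the data path and
`nsqlookupd` addresses (including port `4160`) are placeholder values, not defaults taken from
this document.

```python
def nsqd_argv(data_path, lookupd_tcp_addresses=(), mem_queue_size=10000,
              msg_timeout_ms=60000, binary="nsqd"):
    """Build an argv list for launching nsqd (illustrative helper only)."""
    argv = [
        binary,
        "-data-path=%s" % data_path,
        "-mem-queue-size=%d" % mem_queue_size,
        "-msg-timeout=%d" % msg_timeout_ms,
    ]
    # -lookupd-tcp-address may be given multiple times, once per nsqlookupd
    for addr in lookupd_tcp_addresses:
        argv.append("-lookupd-tcp-address=%s" % addr)
    return argv


argv = nsqd_argv("/tmp/nsqd-data", ["10.0.0.1:4160", "10.0.0.2:4160"])
print(" ".join(argv))

# To launch (requires nsqd on your PATH):
# import subprocess; subprocess.Popen(argv)
```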
94 changes: 94 additions & 0 deletions pynsq/README.md
@@ -0,0 +1,94 @@
## pynsq

`pynsq` is a Python NSQ client library.

It provides a high-level reader library for building consumers and two low-level modules for
sync and async communication over the NSQ protocol (if you want to write your own high-level
functionality).

The async module is built on top of the [Tornado IOLoop][tornado] and as such requires `tornado` be
installed:

`$ pip install tornado`

### Reader

Reader provides high-level functionality for building robust NSQ consumers in Python on top of the
async module.

Multiple reader instances can be instantiated in a single process (to consume from multiple
topics/channels at once), each specifying a set of tasks that will be called for each message over
that channel. Tasks are defined as a dictionary of string names -> callables passed as
`all_tasks` during instantiation.

`preprocess_method` defines an optional callable that can alter the message data before other task
functions are called.

`validate_method` defines an optional callable that returns a boolean indicating whether or not this
message should be processed.
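
As a sketch of what such callables could look like, the two functions below are plain Python and
can be tested without NSQ. Several things here are assumptions made for illustration: that the
message reaches these callables as a dict, that `preprocess_method` returns the altered message
rather than mutating it in place, and the field names `user_id` and `email`.

```python
def validate(message):
    """Return True only for messages that carry a non-empty user_id."""
    return bool(message.get("user_id"))


def preprocess(message):
    """Return a copy of the message with a normalized email field."""
    out = dict(message)
    out["email"] = out.get("email", "").strip().lower()
    return out


msg = {"user_id": 7, "email": "  Someone@Example.COM "}
print(validate(msg), preprocess(msg))

# Hypothetical wiring, using the keyword names described above:
# r = nsq.Reader(all_tasks, topic="nsq_reader", channel="asdf",
#                validate_method=validate, preprocess_method=preprocess)
```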

`async` determines whether handlers will do asynchronous processing. If set to True, handlers must
accept a keyword argument called `finisher` that will be a callable used to signal message
completion (with a boolean argument indicating success).

The library handles backoff as well as maintaining a sufficient RDY count based on the # of
producers and your configured `max_in_flight`.
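
To give a feel for the arithmetic involved, one simple policy is to split `max_in_flight` evenly
across producer connections. The function below is an illustrative sketch of that idea only; it is
not necessarily how the library computes RDY, and `per_connection_rdy` is a name we made up.

```python
def per_connection_rdy(max_in_flight, num_producers):
    """Split max_in_flight evenly across producers, never dropping below 1."""
    if num_producers <= 0:
        return 0
    # integer division; the floor of 1 keeps every connection able to receive
    return max(1, max_in_flight // num_producers)


for producers in (1, 3, 20):
    print(producers, per_connection_rdy(10, producers))
```

Note the trade-off in this policy: once there are more producers than `max_in_flight`, the floor of
1 means the total across connections can exceed the configured limit.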

Here is an example that demonstrates synchronous message processing:

```python
import nsq

def task1(message):
    print message
    return True

def task2(message):
    print message
    return True

all_tasks = {"task1": task1, "task2": task2}
r = nsq.Reader(all_tasks, lookupd_http_addresses=['127.0.0.1:4161'],
        topic="nsq_reader", channel="asdf")
nsq.run()
```

And async:

```python
"""
This is a simple example of async processing with nsq.Reader.
It will print "deferring processing" twice, and then print
the last 3 messages that it received.
Note in particular that we pass the `async=True` argument to Reader(),
and also that we cache a different finisher callable with
each message, to be called when we have successfully finished
processing it.
"""
import nsq

buf = []

def process_message(message, finisher):
    global buf
    # cache both the message and the finisher callable for later processing
    buf.append((message, finisher))
    if len(buf) >= 3:
        print '****'
        for msg, finish_fxn in buf:
            print msg
            finish_fxn(True) # use finish_fxn to tell NSQ of success
        print '****'
        buf = []
    else:
        print 'deferring processing'

all_tasks = {"task1": process_message}
r = nsq.Reader(all_tasks, lookupd_http_addresses=['127.0.0.1:4161'],
        topic="nsq_reader", channel="async", async=True)
nsq.run()
```

[tornado]: https://github.com/facebook/tornado
