mojodna.net
Binary Streaming with Hadoop (and Node.js) :: Drive-by Digressions
http://mojodna.net/2013/12/27/binary-streaming-with-hadoop-and-nodejs.html
Manipulating binary data from Hadoop streaming jobs is a black art. There are Python (dumbo) and R (rmr) tools to facilitate streaming jobs, but all of them have abstracted the handling of byte streams (using typed bytes) successfully enough that it's difficult to determine how they actually work. Which, as it turns out, is important to know if you're... While researching this topic, I kept returning to a Stack Overflow post. Create Some Typed Bytes. Hadoop jar /...
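Since the snippet notes that the typed bytes layer is hard to see through the tooling, here is a minimal sketch of the wire format as I understand it: each value is a one-byte type code followed by a big-endian payload. The helper names `encode` and `decode` are hypothetical (they are not the dumbo or rmr APIs), and only three of the format's type codes are covered.

```python
import struct

# Subset of the Hadoop typed bytes type codes (assumed from the
# org.apache.hadoop.typedbytes format; other codes are omitted here).
TYPE_BYTES = 0   # 4-byte length, then raw bytes
TYPE_INT = 3     # 4-byte signed integer
TYPE_STRING = 7  # 4-byte length, then UTF-8 bytes


def encode(value):
    """Serialize one value: a 1-byte type code followed by a big-endian payload."""
    if isinstance(value, bytes):
        return struct.pack(">bi", TYPE_BYTES, len(value)) + value
    if isinstance(value, int) and not isinstance(value, bool):
        return struct.pack(">bi", TYPE_INT, value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return struct.pack(">bi", TYPE_STRING, len(data)) + data
    raise TypeError("unsupported type in this sketch: %r" % type(value))


def decode(buf, offset=0):
    """Read one value starting at offset; return (value, next_offset)."""
    code = buf[offset]
    offset += 1
    if code == TYPE_INT:
        (value,) = struct.unpack_from(">i", buf, offset)
        return value, offset + 4
    if code in (TYPE_BYTES, TYPE_STRING):
        (length,) = struct.unpack_from(">i", buf, offset)
        start = offset + 4
        data = bytes(buf[start:start + length])
        value = data if code == TYPE_BYTES else data.decode("utf-8")
        return value, start + length
    raise ValueError("unsupported type code in this sketch: %d" % code)


# A streaming record is a key followed immediately by a value.
record = encode("key") + encode(b"\x00\x01\x02")
key, pos = decode(record)
value, pos = decode(record, pos)
```

Because every value carries its own type code and length, a reader can walk a stream of key/value pairs without any line-based delimiters, which is what makes the format usable for binary payloads.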
picar.us
Architecture — Picarus 0.2.0 documentation
http://www.picar.us/en/latest/architecture.html
How Picarus fits in. The Picarus REST server stores state in Redis, launches jobs on Hadoop, and manages data on HBase/Redis. Because all state is stored in Redis, it is safe to have multiple instances running with a load balancer distributing requests between them. The server is written in Python using the gevent socket library and the bottle micro web-framework. Gevent allows the server to process many connections simultaneously. Hadoop CDH4 (with mr1) is used on the cluster. The Hadoopy...