green-firehose
A byte-oriented buffering micro-service for AWS Greengrass.
Current version: 1.0.5
Lead Maintainer: Halim Qarroum
Install
green add npm://green-firehose
Description
This application acts as a buffering mechanism between data emitted by green applications and the local filesystem. green-firehose first buffers data in memory until a given size threshold is reached or a given time threshold is exceeded, then flushes the data in bulk to the filesystem using a file rotation mechanism.
green-firehose is stream-oriented (akin to AWS Firehose), meaning that received data is partitioned and pushed into a named stream. Each stream has its own in-memory size threshold and buffering time threshold, and a maximum of 25 streams are available for green applications to consume.
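The buffering behavior described above can be sketched as follows. This is an illustrative mock only, not green-firehose's implementation: the class name, option names, and byte accounting are all assumptions made for this example.

```javascript
// Sketch of size/time-threshold buffering: records accumulate in memory
// and are flushed in bulk when either the size threshold is reached or
// the time threshold expires, whichever comes first.
class MemoryBuffer {
  constructor({ maxBytes, maxMillis, onFlush }) {
    this.maxBytes = maxBytes;     // size threshold, in bytes
    this.maxMillis = maxMillis;   // time threshold, in milliseconds
    this.onFlush = onFlush;       // called with the batch of buffered records
    this.records = [];
    this.size = 0;
    this.timer = null;
  }

  push(record) {
    this.records.push(record);
    this.size += Buffer.byteLength(JSON.stringify(record));
    // Arm the time-threshold timer when the first record is buffered.
    if (!this.timer) {
      this.timer = setTimeout(() => this.flush(), this.maxMillis);
    }
    // Flush immediately once the size threshold is reached.
    if (this.size >= this.maxBytes) this.flush();
  }

  flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null; }
    if (!this.records.length) return;
    const batch = this.records;
    this.records = [];
    this.size = 0;
    this.onFlush(batch);
  }
}
```

In green-firehose, the flush target is the local filesystem (with file rotation) rather than a callback, and one such buffer exists per stream.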
Decoupled ingestion
This application was designed from the start to be decoupled from the other application(s) running on the local Greengrass core. As such, you use the Greengrass subscription model to route the messages to be ingested to green-firehose.
For example, if you are deploying the green-monkey application, which randomly generates messages on the local/+/monkey topic, and would like these messages to be ingested by green-firehose, you simply need to create a subscription in your green-cli deployment template as follows:
{
  "source": {
    "type": "lambda",
    "function": "green-monkey"
  },
  "destination": {
    "type": "lambda",
    "function": "green-firehose"
  },
  "topic": "local/+/monkey"
}
If you need to do this with another application, simply add another subscription from that application to green-firehose.
Data format
green-firehose does not care about the format of your message payload, as long as it is a valid JSON document. The received data is treated as raw, buffered, and then persisted to disk as such. However, since green-firehose is stream-oriented, meaning that you push your data payloads to a specific stream, it determines the stream to which your data should be pushed based on the topic on which the data was received.
For instance, messages received from the local/+/monkey topic will be sent to a stream of the same name. If you wish to publish to multiple streams from an application, you need to publish to different topics and associate a subscription with each of them in your deployment template, so that green-firehose can receive messages issued on these topics as well.
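To illustrate, a deployment template routing two topics to green-firehose would contain one subscription per topic, each feeding a stream named after its topic. The local/+/sensor topic below is a hypothetical second topic used purely for this example:

{
  "source": { "type": "lambda", "function": "green-monkey" },
  "destination": { "type": "lambda", "function": "green-firehose" },
  "topic": "local/+/monkey"
},
{
  "source": { "type": "lambda", "function": "green-monkey" },
  "destination": { "type": "lambda", "function": "green-firehose" },
  "topic": "local/+/sensor"
}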
If your application needs to publish to different streams without knowing ahead of time the number or the names of the streams it is going to publish to, you can use the green-firehose Expressify API (documented in the API section) to do so. Be aware, however, that by doing so you tightly couple your application to Expressify, since you need to call its specific API.
API
This application exposes an Expressify API which is accessible locally from any green application, and also remotely from the cloud through the AWS IoT Core service. The available API endpoints are documented and described below.
Method | Resource | Return code(s) | Payload required | Description |
---|---|---|---|---|
GET | /streams | 200 | No | Returns the list of streams currently registered on green-firehose, along with their current state. |
GET | /streams/:id | 200/404 | No | Returns statistics and metadata associated with the given stream. |
POST | /streams/:id | 200/503 | Yes | Allows green applications to publish data on a given stream. |
DELETE | /streams/:id | 200/404 | No | Deletes a stream, which causes all data currently managed by the stream to be flushed to disk. |
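The response semantics of GET /streams/:id can be mocked as follows. This is an illustrative sketch of the route's documented behavior (200 with stats when the stream exists, 404 otherwise); the registry contents and field names are invented for this example and are not green-firehose's actual data model.

```javascript
// A stand-in for green-firehose's internal stream registry.
const registry = new Map([
  ['local/+/monkey', { bufferedRecords: 3, bufferedBytes: 128 }]
]);

function getStream(id) {
  // Unknown stream: the route answers 404, as documented in the table above.
  if (!registry.has(id)) return { code: 404 };
  // Known stream: 200 along with its statistics and metadata.
  return { code: 200, body: registry.get(id) };
}
```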
See also
- The green-cli command-line tool.
- The green-monkey payload generator.
- The green-sdk software development kit for green applications.
- The Expressify framework.