s3-getobject-accelerator v2.15.0
S3 GetObject Accelerator
Get large objects from S3 by using parallel byte-range fetches/parts without the AWS SDK to improve performance.
We measured a troughoput of 6.5 Gbit/s on an m5zn.6xlarge in eu-west-1 using this lib with this settings:
{concurrency: 64}.
Installation
npm i s3-getobject-acceleratorExamples
Compact
const {createWriteStream} = require('node:fs');
const {pipeline} = require('node:stream');
const {download} = require('s3-getobject-accelerator');
pipeline(
download({bucket: 'bucket', key: 'key', version: 'optional version'}, {partSizeInMegabytes: 8, concurrency: 4}).readStream(),
createWriteStream('/tmp/test'),
(err) => {
if (err) {
console.error('something went wrong', err);
} else {
console.log('done');
}
}
);More verbose
Get insights into the part downloads and write to file directly without stream if it is smaller than 1 TiB:
const {download} = require('s3-getobject-accelerator');
const d = download({bucket: 'bucket', key: 'key', version: 'optional version'}, {partSizeInMegabytes: 8, concurrency: 4});
d.on('part:downloading', ({partNo}) => {
console.log('start downloading part', partNo);
});
d.on('part:downloaded', ({partNo}) => {
console.log('part downloaded, write to disk next in correct order', partNo);
});
d.on('part:writing', ({partNo}) => {
console.log('start writing part to disk', partNo);
});
d.on('part:done', ({partNo}) => {
console.log('part written to disk', partNo);
});
d.meta((err, metadata) => {
if (err) {
console.error('something went wrong', err);
} else {
if (metadata.lengthInBytes > 1024 * 1024 * 1024 * 1024) {
console.error('file is larger than 1 TiB');
} else {
d.file('/tmp/test', (err) => {
if (err) {
console.error('something went wrong', err);
} else {
console.log('done');
}
});
}
}
});API
download(s3source, options)
s3source<Object>bucket<string>key<string>version<string>(optional)
options<Object>partSizeInMegabytes<number>(optional, defaults to uploaded part size)concurrency<number>requestTimeoutInMilliseconds<number>Maxium time for a request to complete from start to finish (optional, defaults to 300,000, 0 := no timeout)resolveTimeoutInMilliseconds<number>Maximum time for a DNS query to resolve (optional, defaults to 3,000, 0 := no timeout)connectionTimeoutInMilliseconds<number>Maximum time for a socket to connect (optional, defaults to 3,000, 0 := no timeout)readTimeoutInMilliseconds<number>Maxium time to read the response body (optional, defaults to 300,000, 0 := no timeout)dataTimeoutInMilliseconds<number>Maxium time between two data events while reading the response body (optional, defaults to 3,000, 0 := no timeout)writeTimeoutInMilliseconds<number>Maxium time to write the request body (optional, defaults to 300,000, 0 := no timeout)v2AwsSdkCredentials<AWS.Credentials>(optional)endpointHostname<string>(optional, defaults to s3.${region}.amazonaws.com)agent<https.Agent>(optional)
- Returns:
meta(cb)<Function>Get meta-data before starting the download (downloads the first part and keeps the body in memory until download starts)cb(err, metadata)<Function>err<Error>metadata<Object>lengthInBytes<number>parts<number>Number of parts available (optional)
readStream()<Function>Start download- Returns: ReadStream
file(path, cb)<Function>Start downloadpath<string>cb(err)<Function>err<Error>
abort([err])<Function>Abort downloaderr<Error>
partsDownloading()<Function>Number of parts downloading at the moment- Returns
<number>
- Returns
addListener(eventName, listener)See https://nodejs.org/api/events.html#emitteraddlistenereventname-listeneroff(eventName, listener)See https://nodejs.org/api/events.html#emitteroffeventname-listeneron(eventName, listener)See https://nodejs.org/api/events.html#emitteroneventname-listeneronce(eventName, listener)See https://nodejs.org/api/events.html#emitteronceeventname-listenerremoveListener(eventName, listener)See https://nodejs.org/api/events.html#emitterremovelistenereventname-listener
AWS credentials & region
AWS credentials are fetched in the following order:
options.v2AwsSdkCredentials- Environment variables
AWS_ACCESS_KEY_IDAWS_SECRET_ACCESS_KEYAWS_SESSION_TOKEN(optional)
- IMDSv2
AWS region is fetched in the following order:
- Environment variable
AWS_REGION - IMDSv2
Considerations
- Typical sizes
partSizeInMegabytesare 8 MB or 16 MB. If objects are uploaded using a multipart upload, it’s a good practice to download them in the same part sizes ( do not specifypartSizeInMegabytes), or at least aligned to part boundaries, for best performance (see https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html). - Keep in mind that you pay per GET request to Amazon S3. The smaller the parts, the more expensive a download is.
8 months ago
8 months ago
9 months ago
8 months ago
8 months ago
1 year ago
1 year ago
1 year ago
2 years ago
2 years ago
2 years ago
2 years ago
3 years ago
2 years ago
2 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago
3 years ago