2.0.2 • Published 10 months ago
json-indexer
json-indexer is a TypeScript utility for efficient indexing of large JSON files. It allows you to parse files incrementally, minimizing memory usage while building a structured index for quick access to objects. This is particularly useful for scenarios where you need to work with massive JSON files containing arrays of objects.
Features
- Efficient Parsing: Reads JSON files in chunks to handle large files without loading the entire content into memory.
- Customizable Indexing: Allows you to define additional keys to include in the index.
- Scalable: Suitable for large-scale data processing.
- Type-Safe: Leverages TypeScript for strong typing and compile-time safety.
Installation
Install the package via npm:
npm install json-indexer
Usage
Example
Suppose you have a large JSON file (data.json) with the following structure:
{
"shoes": [
{ "id": "1", "name": "Nike Air", "size": 42, "color": "black" },
{ "id": "2", "name": "Adidas Boost", "size": 43, "color": "white" },
...
]
}
You can use json-indexer to parse and index the shoes array like this:
import { JsonIndexer } from 'json-indexer';
// Your data type
interface Shoe {
id: string;
name: string;
size: number;
color: string;
}
// The resulting indexed data type
interface ShoeMetadata {
// id, filePosition, and length are required
id: string;
filePosition: number;
length: number;
// Extra keys that should be added to the index
name: string;
size: number;
}
// Assume `file` is a File object representing your JSON file
const file = new File([/* file content */], "data.json", {
type: "application/json"
});
// Create an instance of JsonIndexer
const indexer = new JsonIndexer(file);
// Build the index with additional properties
const shoeIndex = await indexer.index<ShoeMetadata>("shoes", ["name", "size"]);
/**
* Output:
* Map {
* "1" => {
* id: "1",
* filePosition: 123,
* length: 456,
* name: "Nike Air",
* size: 42,
* },
* "2" => { ... }
* }
**/
// Subsequent lookups
const metadata = shoeIndex.get('1');
if (metadata) {
const chunk = file.slice(
metadata.filePosition,
metadata.filePosition + metadata.length
);
const record = JSON.parse(await chunk.text());
}
API Reference
JsonIndexer
A class for indexing JSON files.
Constructor
constructor(file: File, chunkSize = 1024 * 1024)
- file (File): The JSON file to index.
- chunkSize (number, optional): Size of each chunk read from the file (default: 1 MB).
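To illustrate what chunkSize controls, here is a minimal sketch of reading a file one slice at a time using the standard Blob.slice and TextDecoder APIs. The readInChunks helper is hypothetical and is not part of json-indexer; the library's internal reader may work differently.

```typescript
// Hypothetical helper, not part of json-indexer's API.
// Yields the text of a Blob (or File) one chunk at a time,
// so only about `chunkSize` bytes are held in memory per step.
async function* readInChunks(
  blob: Blob,
  chunkSize = 1024 * 1024
): AsyncGenerator<string> {
  // A streaming decoder keeps multi-byte UTF-8 characters intact
  // when they straddle a chunk boundary.
  const decoder = new TextDecoder("utf-8");
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const slice = blob.slice(offset, offset + chunkSize);
    const bytes = new Uint8Array(await slice.arrayBuffer());
    yield decoder.decode(bytes, { stream: true });
  }
  // Flush any bytes still buffered in the decoder.
  const tail = decoder.decode();
  if (tail) yield tail;
}
```

A smaller chunkSize lowers peak memory at the cost of more read operations; a larger one does the opposite.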
Methods
index<T>
async index<T extends { id: string, filePosition: number, length: number }>(
key: string,
additionalIndexKeys: Array<RequiredAdditionalKeys<T>> = []
): Promise<Map<string, T>>
- Generic type T must extend the base type containing id, filePosition, and length.
- key (string): The key of the array to index (e.g., "shoes").
- additionalIndexKeys (Array<keyof T>): Keys to include in the index, beyond the base requirements.
- Returns a Promise resolving to a Map where the keys are the id values of the indexed objects, and the values are the indexed objects with metadata.
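Because the extra keys live in the index itself, entries can be filtered in memory before any record is read from the file. The sketch below assumes that filePosition and length are byte offsets into the file, as the lookup example in the Usage section suggests. IndexEntry, filterIndex, and loadRecord are hypothetical names introduced for illustration and are not exported by json-indexer.

```typescript
// Base shape of every index entry, as described in the API reference.
interface IndexEntry {
  id: string;
  filePosition: number;
  length: number;
}

// Hypothetical helper: select entries by an indexed key without touching the file.
function filterIndex<T extends IndexEntry>(
  index: Map<string, T>,
  predicate: (entry: T) => boolean
): T[] {
  return Array.from(index.values()).filter(predicate);
}

// Hypothetical helper: read and parse one record using the byte range
// stored in its index entry (assumes filePosition/length are byte offsets).
async function loadRecord<R>(file: Blob, entry: IndexEntry): Promise<R> {
  const chunk = file.slice(
    entry.filePosition,
    entry.filePosition + entry.length
  );
  return JSON.parse(await chunk.text()) as R;
}
```

With the shoeIndex from the Usage section, filterIndex(shoeIndex, (entry) => entry.size === 42) would return the matching metadata entries, and loadRecord<Shoe>(file, entry) would then read only those records from the file.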
Benefits
- Memory Efficient: Processes the file in chunks, avoiding high memory usage.
- Incremental Parsing: Supports working with large files incrementally.
- Customizable Metadata: Add additional fields to the index for detailed object representation.
- Flexible Type System: Generic type parameters at the method level for improved type safety and reusability.
Error Handling
If you forget to include all required keys in additionalIndexKeys, the index() method will throw an error:
// This will throw an error because 'name' is required by the ShoeMetadata type
const index = await indexer.index<ShoeMetadata>("shoes", []);
// Error: Missing keys in additionalIndexKeys: name
License
This project is licensed under the MIT License.