remove-csv-duplicates

CSV Duplicate Remover

A Node.js script that removes duplicate rows from a CSV file based on the values in a specified column. It prompts for its inputs on the command line and shows a progress bar while processing.

Table of Contents

Features
Requirements
Installation
Dependencies
Usage
Contributing
License

Features

CSV Input: Accepts a CSV file with any number of columns.
Duplicate Removal: Removes duplicate entries based on the specified column.
User-Friendly CLI: Asks for the input file, output file, and target column via the command line.
Progress Bar: Shows a progress bar during processing.
Validation: Checks if the input file is a CSV file.
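
The duplicate-removal feature can be sketched as follows. This is a minimal illustration, not the package's actual code: the function name removeDuplicatesByColumn is hypothetical, the CSV is assumed to have no quoted fields or embedded commas, and keeping the first occurrence of each value is an assumption.

```javascript
// Hypothetical helper, not the package's implementation.
// Assumes a simple CSV (no quoted fields, no embedded commas) and
// keeps the first row seen for each value in the target column.
function removeDuplicatesByColumn(csvText, columnName) {
  const lines = csvText.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return "";

  const header = lines[0].split(",");
  const columnIndex = header.indexOf(columnName); // exact, case-sensitive match
  if (columnIndex === -1) {
    throw new Error(`Column "${columnName}" not found`);
  }

  const seen = new Set();
  const kept = [lines[0]]; // always keep the header row
  for (const line of lines.slice(1)) {
    const value = line.split(",")[columnIndex];
    if (!seen.has(value)) {
      seen.add(value);
      kept.push(line);
    }
  }
  return kept.join("\n");
}

const sample = "id,email\n1,a@example.com\n2,b@example.com\n3,a@example.com";
console.log(removeDuplicatesByColumn(sample, "email"));
```

With the sample above, the third row is dropped because its email value already appeared in the first row.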

Requirements

Node.js
npm

Installation

Run the following command in your project to install the package:

npm install remove-csv-duplicates

This installs the package together with the dependencies declared in its package.json file.

In your project's package.json file, add the following line to the scripts section:

"remove-duplicates": "node ./node_modules/remove-csv-duplicates/remove-csv-duplicates.js"

Dependencies

Usage

To run the script, execute the following command in your project directory (the one containing your package.json):

npm run remove-duplicates

The script will then prompt you for the input file name (must be a .csv file), output file name, and the column to parse for removing duplicates.
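
A check like the .csv requirement above could look roughly like this. The function name isCsvFile is hypothetical, and comparing the extension case-insensitively is an assumption; the actual script may validate differently.

```javascript
const path = require("node:path");

// Hypothetical check: accepts a file name only if its extension is ".csv".
// The case-insensitive comparison is an assumption, not confirmed behavior.
function isCsvFile(fileName) {
  return path.extname(fileName).toLowerCase() === ".csv";
}

console.log(isCsvFile("data/file.csv")); // true
console.log(isCsvFile("notes.txt"));     // false
```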

The input file name can be a relative or absolute path. If the file is in the same directory as the script, enter just the file name. For a file in a subdirectory called data, enter data/file.csv or ./data/file.csv; for a file in the parent directory, enter ../file.csv.

The output file name can also be a relative or absolute path. To save the file in the same directory as the script, enter just the file name. To save it in a subdirectory called output, enter output/file.csv or ./output/file.csv; to save it in the parent directory, enter ../file.csv.
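
The relative and absolute forms described above can be illustrated with Node's path.resolve. This is only a sketch: resolveUserPath and baseDir are hypothetical names, and the base directory the script actually resolves against is an assumption here.

```javascript
const path = require("node:path");

// Resolve a user-supplied file name against a base directory.
// `baseDir` is a hypothetical parameter used for illustration only.
function resolveUserPath(baseDir, userInput) {
  // An absolute `userInput` is returned as-is (normalized);
  // a relative one is interpreted relative to `baseDir`.
  return path.resolve(baseDir, userInput);
}

const baseDir = path.join(path.sep, "projects", "demo");
console.log(resolveUserPath(baseDir, "file.csv"));        // file inside baseDir
console.log(resolveUserPath(baseDir, "./data/file.csv")); // file in the data subdirectory
console.log(resolveUserPath(baseDir, "../file.csv"));     // file in the parent directory
```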

The script displays the column names found in the CSV file and prompts you to enter one. The name you enter must match one of those column names exactly; matching is case-sensitive.
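
The case-sensitive column lookup can be illustrated with a short sketch. The helper names parseHeader and findColumnIndex are hypothetical, and the header row is assumed to contain no quoted fields.

```javascript
// Hypothetical helpers; assumes the header row has no quoted fields.
function parseHeader(headerLine) {
  return headerLine.split(",");
}

// Returns the index of `columnName`, or -1 if there is no exact match.
function findColumnIndex(columns, columnName) {
  return columns.indexOf(columnName);
}

const columns = parseHeader("id,Email,Name");
console.log(`Available columns: ${columns.join(", ")}`);
console.log(findColumnIndex(columns, "Email")); // 1
console.log(findColumnIndex(columns, "email")); // -1, the case does not match
```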