npm.io
0.8.7 • Published 2d ago

@michaelborck/cite-sight-core

Licence: MIT
Version: 0.8.7
Dependencies: 2
Size: 228 kB
Vulnerabilities: 0
Weekly downloads: 0

CiteSight

edtech academic-integrity citation-analysis document-analysis frontend react research typescript vite web-app

Academic integrity tool for checking student assignments.

A desktop app, CLI tool, and web service that loads a student assignment, extracts references and in-text citations, verifies every reference exists via academic databases, checks URLs, validates citation formatting, and flags suspicious or fabricated references.

Features

  • Reference Verification — Checks every bibliography entry against Crossref, Semantic Scholar, and OpenAlex APIs
  • Citation Format Validation — Checks APA, MLA, and Chicago formatting rules
  • Cross-Reference Checking — Matches in-text citations to bibliography entries, flags orphans
  • URL Verification — HTTP checks on referenced URLs, with screenshots as evidence (desktop)
  • DOI Resolution — Validates DOIs via Crossref (bot-blocked/paywalled publisher pages are reported as blocked, not dead)
  • Citation Patterns — Future-dated citations, suspicious year clusters, mixed citation styles, and placeholder/template citation text
  • File Support — PDF, DOCX, TXT, Markdown, JSON
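Two of the citation-pattern checks listed above, future-dated citations and suspicious year clusters, can be illustrated with a small sketch. The `Citation` shape, the function names, and the 60% cluster threshold are assumptions made for illustration; this is not CiteSight's actual implementation.

```typescript
// Hypothetical shape for a parsed citation; not CiteSight's real type.
interface Citation {
  key: string;
  year: number;
}

// Return citations whose year is later than the given current year.
function findFutureDated(citations: Citation[], currentYear: number): Citation[] {
  return citations.filter((c) => c.year > currentYear);
}

// Return a year shared by at least `threshold` of all citations, or null.
// The 0.6 default is an arbitrary illustrative cutoff.
function findYearCluster(citations: Citation[], threshold = 0.6): number | null {
  if (citations.length === 0) return null;
  const counts: Record<number, number> = {};
  for (const c of citations) {
    counts[c.year] = (counts[c.year] ?? 0) + 1;
  }
  for (const key of Object.keys(counts)) {
    const year = Number(key);
    if ((counts[year] ?? 0) / citations.length >= threshold) return year;
  }
  return null;
}
```

A cluster on its own is only a hint: a literature review of one conference year would trip the same check, so a flag like this is best read as a prompt for a human to look closer.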

CiteSight focuses on citation integrity. Prose-level signals — readability, writing quality, and AI-writing tells (emojis, em-dashes, adverb ratio) — live in the sibling document-analyser tool; run both for a full picture.

Install

Three ways to use CiteSight:

| Method | Best for | Install |
|---|---|---|
| Desktop app | Offline use, URL screenshots | Download for your platform |
| CLI | Automation, CI pipelines | npm install -g cite-sight |
| Docker | VPS hosting, shared access | docker pull michaelborck/cite-sight |

Platform Comparison

| Feature | Web / Docker | Desktop | CLI |
|---|---|---|---|
| File input | Single file | Multiple files | Single file |
| File types | PDF, DOCX, TXT | PDF, DOCX, TXT, MD | PDF, DOCX, TXT, MD, JSON |
| URL screenshots | | Yes | |
| PDF/CSV export | | Yes | |
| Output format | Browser dashboard | Desktop dashboard | Text or JSON (stdout) |

Deploy on a VPS

Pull the pre-built Docker image — no Node.js or build tools needed on the server.

Quick deploy
docker run -d -p 3000:3000 --restart unless-stopped --name cite-sight michaelborck/cite-sight

The web app and API are available at http://your-server:3000.

Create a docker-compose.yml on your VPS:

services:
  app:
    image: michaelborck/cite-sight:latest
    ports:
      - "3000:3000"
    restart: unless-stopped
    environment:
      - PORT=3000

Then:

docker compose up -d

Update to latest version
docker compose pull
docker compose up -d

With Redis job queue (optional)

For heavier usage, add Redis to queue analysis jobs instead of processing synchronously:

services:
  app:
    image: michaelborck/cite-sight:latest
    ports:
      - "3000:3000"
    restart: unless-stopped
    environment:
      - PORT=3000
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    restart: unless-stopped

Behind a reverse proxy (Nginx/Caddy)

If you're serving on a domain with HTTPS, point your reverse proxy at port 3000. Example Caddy config:

citesight.yourdomain.com {
    reverse_proxy localhost:3000
}

Quick Start (Development)

Desktop App
npm install
npm run build:core

# Terminal 1: Vite dev server
cd packages/desktop && npx vite

# Terminal 2: Electron
npx tsc -p packages/desktop/tsconfig.json
cd packages/desktop && npx electron .

Web App + Server
npm install
npm run build:core
npm run build:server

# Terminal 1: API server
node packages/server/dist/index.js

# Terminal 2: Web frontend
cd packages/web && npx vite

Open http://localhost:5173 — Vite proxies API calls to the server.

CLI
npm run build:core
npx tsc -p packages/cli/tsconfig.json

cite-sight check paper.pdf
cite-sight check paper.pdf --json
cite-sight check paper.pdf --style apa --email you@example.com

# Reports show the detail behind each issue by default — what was cited vs.
# what the matched record holds, plus the surrounding text for in-text
# citations. Use --minimal for a condensed summary-and-verdicts view.
cite-sight check paper.pdf --minimal

# For a bare source list / annotated bibliography (e.g. a deep-research export)
# rather than a manuscript, use --source-list to skip the in-text
# cross-reference check (otherwise every entry is reported as "uncited").
# CiteSight also auto-skips that check when no reference is cited at all.
cite-sight check sources.md --source-list
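For automation, the commands above can be driven from a Node script. The sketch below only uses flags shown in this README; the `CheckOptions` interface and both function names are hypothetical, the JSON report schema is not documented here (so it is returned as `unknown`), and the code assumes the CLI exits with status 0 on a completed run.

```typescript
import { execFile } from "node:child_process";

// Hypothetical options object mirroring the CLI flags documented above.
interface CheckOptions {
  json?: boolean;
  style?: string; // e.g. "apa"
  email?: string;
  minimal?: boolean;
  sourceList?: boolean;
}

// Build the argument list for `cite-sight check <file> [flags]`.
function buildCheckArgs(file: string, opts: CheckOptions = {}): string[] {
  const args = ["check", file];
  if (opts.json) args.push("--json");
  if (opts.style) args.push("--style", opts.style);
  if (opts.email) args.push("--email", opts.email);
  if (opts.minimal) args.push("--minimal");
  if (opts.sourceList) args.push("--source-list");
  return args;
}

// Run the CLI with --json and parse stdout.
// Assumption: a non-zero exit code means the run itself failed.
function runCheck(file: string, opts: CheckOptions = {}): Promise<unknown> {
  const args = buildCheckArgs(file, { ...opts, json: true });
  return new Promise((resolve, reject) => {
    execFile("cite-sight", args, (error, stdout) => {
      if (error) return reject(error);
      try {
        resolve(JSON.parse(stdout));
      } catch (parseError) {
        reject(parseError);
      }
    });
  });
}
```

Passing the arguments as an array to `execFile` avoids shell quoting problems with file names that contain spaces.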

Batch checking and rate limits. Lookups run one reference at a time, and every external request is paced to one per second. Checking a folder is therefore slow, but it stays within the citation databases' polite-pool limits. Results are cached per run, so a work cited across many papers is looked up only once.

Always pass --email, which joins the Crossref and OpenAlex polite pools. Semantic Scholar's keyless tier can still rate-limit a large batch; to avoid that, supply a key with --s2-key or the SEMANTIC_SCHOLAR_API_KEY environment variable (the desktop app and server read the same variable). When a lookup is throttled, that reference is reported as unverified with the reason (e.g. "rate-limited on Semantic Scholar"). This is not a confirmed miss; re-run to retry those references.
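The pacing and per-run caching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not CiteSight's code: the `PacedLookup` class and `LookupResult` type are hypothetical, and the choice not to cache throttled results is an assumption of the sketch.

```typescript
// Resolve after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical result type: a throttled lookup is "unverified", not a miss.
type LookupResult =
  | { status: "found"; source: string }
  | { status: "not-found" }
  | { status: "unverified"; reason: string };

// Runs lookups strictly one at a time, at most one real request per
// `intervalMs`, and serves repeated keys from an in-memory per-run cache.
class PacedLookup {
  private cache = new Map<string, LookupResult>();
  private lastCall = 0;
  private chain: Promise<unknown> = Promise.resolve();
  private lookup: (key: string) => Promise<LookupResult>;
  private intervalMs: number;

  constructor(lookup: (key: string) => Promise<LookupResult>, intervalMs = 1000) {
    this.lookup = lookup;
    this.intervalMs = intervalMs;
  }

  get(key: string): Promise<LookupResult> {
    // Chain onto the previous call so requests never overlap.
    const run = this.chain.then(async () => {
      const cached = this.cache.get(key);
      if (cached) return cached;
      const wait = this.lastCall + this.intervalMs - Date.now();
      if (wait > 0) await sleep(wait);
      this.lastCall = Date.now();
      const result = await this.lookup(key);
      // Assumption of this sketch: throttled results are not cached,
      // so a later call for the same key can try again.
      if (result.status !== "unverified") this.cache.set(key, result);
      return result;
    });
    this.chain = run.catch(() => undefined);
    return run;
  }
}
```

Keying the cache on a normalised identifier (a DOI, say) is what lets a work cited across many papers cost only one request per run.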

Docker (local build)
docker compose up --build
# Open http://localhost:3000

Project Structure

cite-sight/
├── packages/
│   ├── core/          # Shared analysis library
│   ├── desktop/       # Electron app
│   ├── cli/           # CLI tool
│   ├── server/        # Express API server
│   └── web/           # Landing page + online tool
├── Dockerfile
├── docker-compose.yml
└── package.json       # Workspace root

How Reference Verification Works

For each reference in the bibliography:

  1. Parse — Extract authors, title, year, journal, DOI, URL
  2. Validate Format — Check against APA/MLA/Chicago rules
  3. Verify Existence (cascade):
    • DOI → Crossref API
    • Search Crossref by title + author
    • Search Semantic Scholar (fallback)
    • Search OpenAlex (fallback)
    • If the reference has a URL → HTTP status check
  4. Cross-Reference — Match bibliography entries to in-text citations
  5. Score — Confidence score (0–1) based on metadata match quality

Building Releases

Releases are built automatically via GitHub Actions when a version tag is pushed.

Version bump script

Use the bump script to update all workspace versions, commit, tag, and push in one step:

npm run bump -- patch   # 0.2.9 → 0.2.10
npm run bump -- minor   # 0.2.9 → 0.3.0
npm run bump -- major   # 0.2.9 → 1.0.0
npm run bump -- 1.0.0   # exact version

The script updates all 6 package.json files, commits, creates an annotated vX.Y.Z tag, and prompts before pushing.
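The version arithmetic behind the four invocations above can be sketched as a single function. This is a minimal illustration, not the bump script itself: `bumpVersion` is a hypothetical name, and pre-release or build suffixes are not handled.

```typescript
// Compute the next version for a plain MAJOR.MINOR.PATCH string.
// `bump` is "patch", "minor", "major", or an exact version such as "1.0.0".
function bumpVersion(current: string, bump: string): string {
  const pattern = /^(\d+)\.(\d+)\.(\d+)$/;
  if (pattern.test(bump)) return bump; // exact version given

  const parts = pattern.exec(current);
  if (!parts) throw new Error(`Not a MAJOR.MINOR.PATCH version: ${current}`);
  const major = Number(parts[1]);
  const minor = Number(parts[2]);
  const patch = Number(parts[3]);

  switch (bump) {
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    case "major":
      return `${major + 1}.0.0`;
    default:
      throw new Error(`Unknown bump type: ${bump}`);
  }
}
```

Parsing the components as numbers matters: a string comparison or increment would turn 0.2.9 into 0.2.91 rather than 0.2.10.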

What the tag triggers

Pushing a v* tag triggers:

  • Electron installers — macOS (DMG), Windows (NSIS), Linux (AppImage) with auto-update
  • npm publish — @michaelborck/cite-sight-core + cite-sight CLI
  • Docker images — pushed to Docker Hub and GitHub Container Registry (amd64 + arm64)

Technology Stack

  • Core: TypeScript, pdfjs-dist, mammoth
  • Desktop: Electron, React 19, Zustand, Vite, electron-updater
  • Web: React 19, Vite
  • Server: Express, multer, BullMQ (optional)
  • CLI: Commander.js, chalk
  • APIs: Crossref, Semantic Scholar, OpenAlex (all free tier)

Licence

See LICENSE.