2.0.4 • Published 1 month ago

@split.dev/analytics v2.0.4

License
MIT

@split.dev/analytics

🚀 Simple AI crawler tracking for any website

Zero dependencies, lightweight, reliable tracking of AI crawlers like ChatGPT, Claude, Perplexity, and 20+ others.

⚡ Quick Start

npm install @split.dev/analytics

Configuration

Quick Setup:

import { createSplitMiddleware } from '@split.dev/analytics/middleware'

export const middleware = createSplitMiddleware({
  apiKey: process.env.SPLIT_API_KEY!,
  debug: process.env.NODE_ENV === 'development'
})

Environment Variables

# Your API key from Split Analytics dashboard
SPLIT_API_KEY=split_live_your_key_here

Next.js Setup

  1. Create middleware.ts in your project root:
import { createSplitMiddleware } from '@split.dev/analytics/middleware'

export const middleware = createSplitMiddleware({
  apiKey: process.env.SPLIT_API_KEY!,
  debug: process.env.NODE_ENV === 'development'
})

export const config = {
  matcher: ['/((?!api|_next/static|_next/image|favicon.ico).*)']
}
  2. The package will automatically track AI crawlers visiting your site!

That's it! AI crawler visits will appear in your Split Dashboard within 5-10 seconds.


📋 Complete Setup Guide

1. Get Your API Key

  1. Sign up at split.dev
  2. Go to Settings → API Keys
  3. Click "Generate Live Key"
  4. Copy the key immediately (you won't see it again)

2. Install Package

npm install @split.dev/analytics
# or
yarn add @split.dev/analytics
# or  
pnpm add @split.dev/analytics

3. Add Environment Variable

# .env.local (Next.js)
SPLIT_API_KEY=split_live_your_actual_key_here

# .env (Node.js)
SPLIT_API_KEY=split_live_your_actual_key_here

4. Implement Tracking

Next.js Middleware (Recommended)

// middleware.ts
import { NextRequest, NextResponse } from 'next/server'
import { trackCrawlerVisit } from '@split.dev/analytics/middleware'

export async function middleware(request: NextRequest) {
  // Add Split Analytics tracking
  if (process.env.SPLIT_API_KEY) {
    trackCrawlerVisit(request, {
      apiKey: process.env.SPLIT_API_KEY,
      debug: process.env.NODE_ENV === 'development'
    }).then((wasTracked) => {
      if (wasTracked && process.env.NODE_ENV === 'development') {
        console.log('✅ AI crawler tracked successfully')
      }
    }).catch((error) => {
      console.error('❌ Split Analytics error:', error)
    })
  }
  
  // Your existing middleware logic here...
  return NextResponse.next()
}

export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)']
}

Express/Node.js

const express = require('express')
const { trackCrawlerVisit } = require('@split.dev/analytics')

const app = express()

app.use(async (req, res, next) => {
  // Track crawler visits (non-blocking)
  if (process.env.SPLIT_API_KEY) {
    trackCrawlerVisit({
      url: req.url,
      userAgent: req.headers['user-agent'],
      method: req.method
    }, {
      apiKey: process.env.SPLIT_API_KEY,
      debug: process.env.NODE_ENV === 'development'
    }).catch(console.error)
  }
  
  next()
})

5. Test Your Setup

Trigger an AI crawler visit:

Option 1: Use ChatGPT Search on your website

  1. Ask ChatGPT Search a question that makes it fetch a page from your website
  2. Check your logs for: ✅ AI crawler tracked successfully
  3. Wait 5-10 seconds (batching delay)
  4. Check your Split Dashboard

Option 2: Simulate Locally with curl

You can simulate an AI crawler visit locally using curl with a known AI crawler User-Agent.
Run this command in your terminal (replace the URL if needed):

curl -H "User-Agent: Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)" \
     http://localhost:3000/


🔧 Troubleshooting

"API returns 401 Unauthorized"

Cause: API key validation failing

Solutions:

  1. Check your API key format:

    echo $SPLIT_API_KEY
    # Should start with: split_live_ or split_test_
  2. Verify key exists in Split Dashboard:

    • Go to Settings → API Keys
    • Confirm your key is listed and active
  3. Check for extra characters:

    # Remove any quotes or whitespace
    SPLIT_API_KEY=split_live_abc123  # ✅ Correct
    SPLIT_API_KEY="split_live_abc123"  # ❌ Has quotes
    SPLIT_API_KEY= split_live_abc123   # ❌ Has space
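
The format checks above can also be automated. The following is a minimal sketch, assuming only the rules stated in this section (a `split_live_` or `split_test_` prefix, no quotes, no surrounding whitespace). `checkApiKeyFormat` is a hypothetical helper and is not part of @split.dev/analytics.

```typescript
// Hypothetical helper, not part of @split.dev/analytics.
// Returns a list of problems; an empty list means the key looks well-formed.
function checkApiKeyFormat(raw: string | undefined): string[] {
  if (raw === undefined || raw === '') {
    return ['SPLIT_API_KEY is not set']
  }
  const problems: string[] = []
  if (raw !== raw.trim()) {
    problems.push('key has leading or trailing whitespace')
  }
  if (/["']/.test(raw)) {
    problems.push('key contains quote characters')
  }
  // Strip whitespace and quotes before checking the prefix so that each
  // problem is reported independently of the others.
  const cleaned = raw.trim().replace(/["']/g, '')
  if (!/^split_(live|test)_/.test(cleaned)) {
    problems.push('key does not start with split_live_ or split_test_')
  }
  return problems
}
```

Calling it at startup with the value of `process.env.SPLIT_API_KEY` and logging any returned problems would surface a malformed key before the first 401 response.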

"Crawler detected but no data in dashboard"

Cause: Authentication issues or data format problems

Solutions:

  1. Check if you're logged into Split Dashboard
  2. Verify API key belongs to your account
  3. Enable debug mode to see detailed logs:

trackCrawlerVisit(request, {
  apiKey: process.env.SPLIT_API_KEY,
  debug: true // Shows detailed logging
})

"No crawler visits detected"

Cause: Middleware not detecting AI crawlers

Debug steps:

  1. Add debug logging:

    const userAgent = request.headers.get('user-agent')
    console.log('User-Agent:', userAgent)

    trackCrawlerVisit(request, {
      apiKey: process.env.SPLIT_API_KEY,
      debug: true
    }).then((wasTracked) => {
      console.log('Tracking result:', wasTracked ? 'SUCCESS' : 'NOT_DETECTED')
    }).catch(console.error)
  2. Test with known crawler:
    # Simulate ChatGPT visit
    curl -H "User-Agent: Mozilla/5.0 (compatible; ChatGPT-User/1.0; +https://openai.com/bot)" \
         https://your-website.com

"5-10 second delay before data appears"

This is normal! Events are batched for efficiency:

  • Single visit: sent after a 5-second delay (batching)
  • 10+ visits: sent immediately
  • Production: treat this delay as normal behavior

To reduce delay (not recommended):

import { SplitAnalytics } from '@split.dev/analytics'

const analytics = new SplitAnalytics({
  apiKey: process.env.SPLIT_API_KEY,
  batchIntervalMs: 1000 // 1 second (increases API calls)
})
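
To make the batching behavior described in this section concrete, here is a minimal sketch of a queue that flushes either after a fixed interval or as soon as enough events have accumulated. It is an illustration only, not the package's internal implementation. The class name `BatchQueue`, the `send` callback, and the reading of "10+ visits" as a queue-size threshold are all assumptions.

```typescript
// Illustrative batching queue; NOT the internals of @split.dev/analytics.
type Sender<T> = (batch: T[]) => void

class BatchQueue<T> {
  private pending: T[] = []
  private timer: ReturnType<typeof setTimeout> | undefined
  private send: Sender<T>
  private intervalMs: number
  private maxBatchSize: number

  constructor(send: Sender<T>, intervalMs = 5000, maxBatchSize = 10) {
    this.send = send
    this.intervalMs = intervalMs     // delay before a partial batch is sent
    this.maxBatchSize = maxBatchSize // queue size that triggers an immediate send
  }

  add(event: T): void {
    this.pending.push(event)
    if (this.pending.length >= this.maxBatchSize) {
      // Enough events queued: send right away.
      this.flush()
    } else if (this.timer === undefined) {
      // First event of a new batch: start the countdown.
      this.timer = setTimeout(() => this.flush(), this.intervalMs)
    }
  }

  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer)
      this.timer = undefined
    }
    if (this.pending.length === 0) return
    const batch = this.pending
    this.pending = []
    this.send(batch)
  }
}
```

A shorter interval means fewer events per request and therefore more API calls, which is the trade-off behind the "not recommended" note above.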

🎯 Supported AI Crawlers

The package automatically detects 25+ AI crawlers:

OpenAI

  • GPTBot (training)
  • ChatGPT-User (search)
  • OAI-SearchBot (search)

Anthropic

  • ClaudeBot (training)
  • Claude-Web (assistant)

Google

  • Google-Extended (training)
  • Googlebot (search)

Microsoft

  • Bingbot (search)
  • BingPreview (search)

Others

  • PerplexityBot (Perplexity)
  • FacebookBot (Meta)
  • Bytespider (ByteDance)
  • CCBot (Common Crawl)
  • And 15+ more...
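
For intuition about how User-Agent based detection of the crawlers listed above can work, here is a minimal sketch using case-insensitive substring matching. The table covers only a few of the listed crawlers, and the names `KNOWN_CRAWLERS` and `detectCrawler` are hypothetical. The package's actual detection rules are not documented here and may differ.

```typescript
// Illustrative only: a small subset of the crawlers listed above.
// The real detection logic in @split.dev/analytics may differ.
interface CrawlerInfo {
  name: string
  company: string
}

const KNOWN_CRAWLERS: CrawlerInfo[] = [
  { name: 'GPTBot', company: 'OpenAI' },
  { name: 'ChatGPT-User', company: 'OpenAI' },
  { name: 'OAI-SearchBot', company: 'OpenAI' },
  { name: 'ClaudeBot', company: 'Anthropic' },
  { name: 'PerplexityBot', company: 'Perplexity' },
  { name: 'Bytespider', company: 'ByteDance' },
  { name: 'CCBot', company: 'Common Crawl' }
]

// Returns the first crawler whose name appears in the User-Agent string,
// or undefined for ordinary browsers and missing headers.
function detectCrawler(userAgent: string | null | undefined): CrawlerInfo | undefined {
  if (!userAgent) return undefined
  const ua = userAgent.toLowerCase()
  for (const crawler of KNOWN_CRAWLERS) {
    if (ua.indexOf(crawler.name.toLowerCase()) !== -1) return crawler
  }
  return undefined
}
```

Substring matching is easy to spoof, which is what makes the curl tests in this guide possible. It identifies what a client claims to be, not what it is.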

🔍 Advanced Usage

Custom Event Tracking

import { SplitAnalytics } from '@split.dev/analytics'

const analytics = new SplitAnalytics({
  apiKey: process.env.SPLIT_API_KEY,
  debug: true
})

// Manual tracking
await analytics.track({
  url: 'https://example.com/page',
  userAgent: 'GPTBot/1.0',
  crawler: {
    name: 'GPTBot',
    company: 'OpenAI', 
    category: 'ai-training'
  },
  metadata: {
    source: 'manual-tracking',
    custom: 'data'
  }
})

Test API Connection

import { ping } from '@split.dev/analytics'

const result = await ping({
  apiKey: process.env.SPLIT_API_KEY,
  debug: true
})

console.log('Connection:', result.status) // 'ok' or 'error'

Environment-Specific Keys

# Use test keys in development
SPLIT_API_KEY=split_test_your_test_key_here  # Development
SPLIT_API_KEY=split_live_your_live_key_here  # Production

🚨 Common Mistakes

āŒ Blocking the response

// DON'T do this - blocks every request
export async function middleware(request: NextRequest) {
  await trackCrawlerVisit(request, config) // āŒ Blocks response
  return NextResponse.next()
}

āœ… Non-blocking approach

// DO this - doesn't block responses
export async function middleware(request: NextRequest) {
  trackCrawlerVisit(request, config).catch(console.error) // āœ… Non-blocking
  return NextResponse.next()
}

āŒ Missing error handling

// DON'T do this - can crash your app
trackCrawlerVisit(request, config) // āŒ No error handling

āœ… Proper error handling

// DO this - never crashes your app
trackCrawlerVisit(request, config).catch((error) => {
  console.error('Split Analytics error:', error) // āœ… Handles errors
})
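
The error-handling pattern above can be wrapped once and reused. The sketch below is a hypothetical helper (`safeTrack` is not part of the package) that guards against both a rejected promise and an exception thrown before any promise is returned. The `track` argument stands in for a call such as `() => trackCrawlerVisit(request, config)`.

```typescript
type ErrorReporter = (error: unknown) => void

// Hypothetical wrapper: starts a tracking call without awaiting it and routes
// any failure to `report` instead of letting it propagate to the caller.
function safeTrack(track: () => PromiseLike<unknown>, report: ErrorReporter): void {
  try {
    // Asynchronous failure: the returned promise rejects later.
    track().then(undefined, report)
  } catch (error) {
    // Synchronous failure: the call threw before returning a promise.
    report(error)
  }
}
```

Because the helper returns `void` and never awaits, it keeps the non-blocking behavior recommended earlier in this section.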

āŒ Wrong matcher config

// DON'T track static files
export const config = {
  matcher: '/(.*)', // āŒ Tracks everything
}

āœ… Optimized matcher

// DO exclude static files
export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'] // āœ… Optimized
}
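
To see which paths the optimized matcher lets through, the negative lookahead can be tried out with a plain regular expression. This is an approximation: Next.js processes matcher strings with its own path-matching rules, so the anchored `RegExp` below is only a stand-in for experimentation, and `wouldRunMiddleware` is a hypothetical name.

```typescript
// Approximation of the matcher as an anchored RegExp. Next.js applies its own
// path-matching rules, so treat the results as a guide rather than a guarantee.
const MATCHER_SOURCE = '/((?!_next/static|_next/image|favicon.ico).*)'
const matcherPattern = new RegExp('^' + MATCHER_SOURCE + '$')

// True when the middleware (and therefore tracking) would run for this path.
function wouldRunMiddleware(pathname: string): boolean {
  return matcherPattern.test(pathname)
}
```

With this pattern, page routes such as `/` and `/blog/post` match, while `/_next/static/...`, `/_next/image` and `/favicon.ico` are skipped. Paths under `/api` still match here; the Quick Start matcher adds `api` to the lookahead to exclude them as well.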

💡 Best Practices

1. Environment Variables

# Use different keys per environment
SPLIT_API_KEY_DEV=split_test_...
SPLIT_API_KEY_PROD=split_live_...
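
The other examples in this guide read a single `SPLIT_API_KEY` variable, so per-environment variables like the ones above need a small selection step. A minimal sketch, assuming `NODE_ENV` is set to `production` in production; `resolveApiKey` is a hypothetical helper, not a package export:

```typescript
// Hypothetical helper: picks the key that matches the current environment.
type Env = { [name: string]: string | undefined }

function resolveApiKey(env: Env): string | undefined {
  const isProduction = env['NODE_ENV'] === 'production'
  // Anything other than production falls back to the test key.
  return isProduction ? env['SPLIT_API_KEY_PROD'] : env['SPLIT_API_KEY_DEV']
}
```

In an application, the result of `resolveApiKey(process.env)` would be passed as the `apiKey` option.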

2. Error Monitoring

trackCrawlerVisit(request, config).catch((error) => {
  // Send to your error monitoring service
  console.error('Split Analytics error:', error)
  // Sentry.captureException(error)
})

3. Performance

// Only enable debug in development
const config = {
  apiKey: process.env.SPLIT_API_KEY,
  debug: process.env.NODE_ENV === 'development'
}

4. Testing

# Test your implementation
npx @split.dev/analytics --test-api YOUR_API_KEY

📊 Dashboard Features

Once set up, your Split Dashboard shows:

  • 📈 Crawler Visits: Timeline of AI crawler activity
  • 🏢 Attribution by Source: Which AI companies are crawling you
  • 📍 Geographic Data: Where crawlers are coming from
  • ⚡ Response Times: How fast your site responds to crawlers
  • 📄 Popular Pages: Most crawled content
  • 🔍 Search Trends: What AI models are interested in

🆘 Need Help?

  1. Check the troubleshooting section
  2. Enable debug mode and check logs
  3. Test your API key: npx @split.dev/analytics --test-api YOUR_KEY

📝 Changelog

v2.0.0

  • ✅ Added 25+ AI crawler detection
  • ✅ Batching for performance (5-second default)
  • ✅ Next.js middleware helpers
  • ✅ Automatic retry logic

Built with ❤️ by the Split team
