2.3.4 • Published 10 months ago

@uniwebcms/site-content-collector v2.3.4

@uniwebcms/site-content-collector

A Node.js library that processes website content from a structured folder hierarchy into a standardized JSON format. The library reads content files (Markdown and JSON) and metadata (YAML), organizing them into a complete site structure that preserves page hierarchy and section ordering.

Installation

npm install @uniwebcms/site-content-collector

Usage Methods

This library can be used in three ways:

  1. As a Node.js module
  2. As a CLI tool
  3. As a webpack plugin

Node.js Module

import { collectSiteContent } from "@uniwebcms/site-content-collector";

async function processWebsite() {
  try {
    // Read the ./website folder and build the site content object
    const content = await collectSiteContent("./website");
    console.log(content);
  } catch (err) {
    console.error("Processing error:", err);
  }
}

processWebsite();

CLI Tool

Process content directly from the command line using npx:

# Output to directory (creates site-content.json)
npx collect-content ./source-dir ./output-dir

# Output to specific JSON file
npx collect-content ./source-dir ./output-dir/custom-name.json

# With pretty-printed JSON output
npx collect-content ./source-dir ./output.json --pretty

# With verbose logging
npx collect-content ./source-dir ./output.json --verbose

The CLI applies these rules to the output argument for safety:

  • When specifying a directory, it creates site-content.json inside it
  • When specifying a file, it must have a .json extension
  • Creates output directories if they don't exist
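
The rules above can be sketched as a small path-resolution function. This is only an illustration of the documented behavior, not the CLI's actual source: the name resolveOutputPath is hypothetical, and it assumes that an argument without a file extension is meant as a directory.

```javascript
import path from "node:path";

// Hypothetical helper: mirrors the documented rules, not the CLI's real code.
function resolveOutputPath(outputArg) {
  const ext = path.extname(outputArg);

  // Assumption: no extension means the argument is a directory,
  // so the default filename is placed inside it.
  if (ext === "") {
    return path.join(outputArg, "site-content.json");
  }

  // An explicit output file must be a .json file.
  if (ext.toLowerCase() !== ".json") {
    throw new Error(`Output file must have a .json extension: ${outputArg}`);
  }

  return path.normalize(outputArg);
}

console.log(resolveOutputPath("./output-dir"));
console.log(resolveOutputPath("./output-dir/custom-name.json"));
```

In a real tool, the directory part of the resolved path could then be created with fs.mkdir and the recursive option before the file is written.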

Webpack Plugin

The webpack plugin integrates content collection into your build process:

import { SiteContentPlugin } from "@uniwebcms/site-content-collector/webpack";

export default {
  plugins: [
    new SiteContentPlugin({
      injectToHtml: true, // Optional: inject into HTML (requires html-webpack-plugin)
      variableName: "__SITE_CONTENT__", // Optional: id/variable name when injecting
      filename: "site-content.json", // Optional: output filename
      injectFormat: "json", // Optional: injection format ('json' or 'script')
    }),
  ],
};

HTML Injection Formats

The plugin supports two formats for injecting content into HTML:

  1. JSON format (default):
<script type="application/json" id="__SITE_CONTENT__">
  {
    "pages": {
      /* content */
    }
  }
</script>

Access in your code:

const content = JSON.parse(
  document.getElementById("__SITE_CONTENT__").textContent
);

  2. Script format:
<script>
  window.__SITE_CONTENT__ = {
    /* content */
  };
</script>

Access in your code:

const content = window.__SITE_CONTENT__;
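
If the same client code has to work with either injection format, a small reader along these lines may help. The function name readSiteContent and its parameters are hypothetical. It checks for the element first because browsers also expose elements with an id as properties of window, so testing window.__SITE_CONTENT__ first could return the script element rather than the data.

```javascript
// Hypothetical helper; not part of the package.
function readSiteContent(
  name = "__SITE_CONTENT__",
  doc = globalThis.document,
  win = globalThis.window
) {
  // JSON format: the content is the text of a <script type="application/json"> element.
  const element = doc ? doc.getElementById(name) : null;
  if (element) {
    return JSON.parse(element.textContent);
  }

  // Script format: the content was assigned to a global variable.
  return win && win[name] !== undefined ? win[name] : null;
}
```

Calling readSiteContent() with no arguments in a browser uses the default variable name from the plugin options shown earlier; it returns null when neither format is present.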

Content Structure

The library expects a folder structure organized as follows:

website/
├── site.yml               # Site-wide metadata and settings
├── home/                  # Each folder is a page
│   ├── page.yml          # Page-specific metadata
│   ├── 1-hero.md         # Section with prefix "1"
│   ├── 2-features.md     # Section with prefix "2"
│   └── 2.1-feature.md    # Subsection of "2"
└── about/
    ├── page.yml
    └── 1-intro.json      # JSON sections are also supported

Content Files

The library processes two types of content files:

Markdown Files (.md)

---
component: Hero # Optional component name
props: # Optional component properties
  background: ./bg.jpg
---

# Section Title

Content in Markdown format
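
To illustrate how such a file divides into metadata and content, here is a minimal sketch that separates the front matter block from the Markdown body. The function splitFrontMatter is hypothetical and does not reflect how the library parses files internally; it also leaves the YAML text unparsed, which a YAML library would handle.

```javascript
// Hypothetical helper: splits "---"-delimited front matter from the body.
function splitFrontMatter(source) {
  const pattern = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)([\s\S]*)$/;
  const match = pattern.exec(source);

  // No front matter: the whole file is Markdown content.
  if (!match) {
    return { frontMatter: null, body: source };
  }

  return { frontMatter: match[1], body: match[2].trim() };
}

const example = [
  "---",
  "component: Hero",
  "props:",
  "  background: ./bg.jpg",
  "---",
  "",
  "# Section Title",
  "",
  "Content in Markdown format",
].join("\n");

const { frontMatter, body } = splitFrontMatter(example);
console.log(frontMatter); // raw YAML text, still to be parsed
console.log(body); // Markdown content
```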

JSON Files (.json)

{
  "component": "Feature",
  "props": {
    "icon": "star"
  },
  "content": {
    "type": "doc",
    "content": []
  }
}

The content field holds a document in ProseMirror/TipTap format.

File Naming Convention

Content files must follow these rules:

  1. Must have a numeric prefix (e.g., 1-, 2.1-) that determines order and hierarchy
  2. Must use either .md or .json extension
  3. Files without numeric prefixes are ignored
  4. Subsection numbers must reference existing parent sections (e.g., 2.1- requires a 2- section)
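
The naming rules above can be expressed as a single pattern. The sketch below is an assumption about how such names could be parsed, not the library's own implementation; parseSectionFilename and the shape of its return value are hypothetical.

```javascript
// Matches names such as "1-hero.md" or "2.1-feature.json".
const SECTION_FILE = /^(\d+(?:\.\d+)*)-(.+)\.(md|json)$/;

// Hypothetical parser: returns null for files that would be ignored.
function parseSectionFilename(filename) {
  const match = SECTION_FILE.exec(filename);
  if (!match) {
    return null; // no numeric prefix, or an unsupported extension
  }

  const [, id, title, extension] = match;
  const lastDot = id.lastIndexOf(".");

  return {
    id,
    title,
    extension,
    // "2.1" belongs under "2"; top-level sections have no parent.
    parentId: lastDot === -1 ? null : id.slice(0, lastDot),
  };
}

console.log(parseSectionFilename("2.1-feature.md"));
console.log(parseSectionFilename("page.yml"));
```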

Metadata Files

  • site.yml: Site-wide configuration and metadata
  • page.yml: Page-specific metadata (optional in each page folder)

Output Structure

The library produces a JavaScript object with this structure:

{
  siteMetadata: {
    // Contents of site.yml
  },
  pages: {
    home: {
      metadata: {
        // Contents of page.yml
      },
      sections: [
        {
          id: "1",
          title: "hero",
          component: "Hero",
          props: {},
          content: {},     // ProseMirror/TipTap JSON
          subsections: []  // Nested sections
        }
      ]
    }
  },
  errors: []  // Processing errors if any
}
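
As an example of consuming this structure, the following sketch walks the nested sections of each page and lists them with their depth. The helper flattenSections and the sample data are hypothetical; only the field names (pages, sections, id, component, subsections) come from the structure shown above.

```javascript
// Hypothetical helper: flattens nested sections into a list with depth.
function flattenSections(sections, depth = 0) {
  return sections.flatMap((section) => [
    { id: section.id, component: section.component, depth },
    ...flattenSections(section.subsections ?? [], depth + 1),
  ]);
}

// Sample data in the documented shape (values are made up).
const content = {
  siteMetadata: {},
  pages: {
    home: {
      metadata: {},
      sections: [
        { id: "1", title: "hero", component: "Hero", props: {}, content: {}, subsections: [] },
        {
          id: "2",
          title: "features",
          component: "Features",
          props: {},
          content: {},
          subsections: [
            { id: "2.1", title: "feature", component: "Feature", props: {}, content: {}, subsections: [] },
          ],
        },
      ],
    },
  },
  errors: [],
};

for (const [pageName, page] of Object.entries(content.pages)) {
  for (const { id, component, depth } of flattenSections(page.sections)) {
    console.log(`${pageName}: ${"  ".repeat(depth)}${id} (${component})`);
  }
}
```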

Error Handling

The library handles several types of errors:

  • Missing parent sections for subsections
  • Malformed YAML, JSON, or Markdown content
  • Invalid file structure or naming
  • Missing required files

Errors are collected in the errors array of the output, allowing processing to continue even when some files fail.
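
The collect-and-continue approach can be illustrated with the missing-parent case. This is a sketch under assumptions, not the library's code: buildSectionTree and compareIds are hypothetical names, and the shape of each error entry ({ id, message }) is invented for the example, since the format of the errors array is not specified here.

```javascript
// Compares ids such as "2" and "2.1" numerically, part by part,
// so that parents sort before their subsections.
function compareIds(a, b) {
  const left = a.split(".").map(Number);
  const right = b.split(".").map(Number);
  for (let i = 0; i < Math.max(left.length, right.length); i++) {
    const diff = (left[i] ?? -1) - (right[i] ?? -1);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical sketch: nests sections and records problems instead of throwing.
function buildSectionTree(entries) {
  const byId = new Map();
  const sections = [];
  const errors = [];

  for (const entry of [...entries].sort((a, b) => compareIds(a.id, b.id))) {
    const node = { ...entry, subsections: [] };
    const lastDot = entry.id.lastIndexOf(".");

    if (lastDot === -1) {
      sections.push(node); // top-level section
    } else {
      const parent = byId.get(entry.id.slice(0, lastDot));
      if (!parent) {
        // Record the problem and keep processing the remaining entries.
        errors.push({ id: entry.id, message: `Missing parent section for ${entry.id}` });
        continue;
      }
      parent.subsections.push(node);
    }
    byId.set(entry.id, node);
  }

  return { sections, errors };
}

const result = buildSectionTree([
  { id: "2.1", title: "feature" },
  { id: "1", title: "hero" },
  { id: "2", title: "features" },
  { id: "3.1", title: "orphan" }, // no "3" section exists
]);
console.log(result.errors);
```

A caller would then check whether errors is empty before deciding to publish or fail a build.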

Requirements

  • Node.js >=18.0.0
  • When using the webpack plugin, webpack >=5.0.0 is required

This package uses ES Modules (ESM). Your project should either:

  • Have "type": "module" in its package.json, or
  • Use the .mjs extension for files using ESM syntax

License

GPL-3.0-or-later - see LICENSE for details
