@uniwebcms/site-content-collector v2.3.4
A Node.js library that processes website content from a structured folder hierarchy into a standardized JSON format. The library reads content files (Markdown and JSON) and metadata (YAML), organizing them into a complete site structure that preserves page hierarchy and section ordering.
Installation
npm install @uniwebcms/site-content-collector

Usage Methods
This library can be used in three ways:
- As a Node.js module
- As a CLI tool
- As a webpack plugin
Node.js Module
import { collectSiteContent } from "@uniwebcms/site-content-collector";
async function processWebsite() {
  try {
    const content = await collectSiteContent("./website");
    console.log(content);
  } catch (err) {
    console.error("Processing error:", err);
  }
}

CLI Tool
Process content directly from the command line using npx:
# Output to directory (creates site-content.json)
npx collect-content ./source-dir ./output-dir
# Output to specific JSON file
npx collect-content ./source-dir ./output-dir/custom-name.json
# With pretty-printed JSON output
npx collect-content ./source-dir ./output.json --pretty
# With verbose logging
npx collect-content ./source-dir ./output.json --verbose

The CLI enforces these rules for safety:
- When specifying a directory, it creates site-content.json inside it
- When specifying a file, it must have a .json extension
- Creates output directories if they don't exist
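The output rules above can be sketched as a small path-resolution function. This is an illustration only, not the CLI's actual source: the name resolveOutputPath is hypothetical, and treating any target without a file extension as a directory is an assumption. Directory creation is left out.

```javascript
// Hypothetical helper that mirrors the documented output rules.
const DEFAULT_FILENAME = "site-content.json";

function resolveOutputPath(target) {
  // Normalize separators and drop trailing slashes
  const normalized = target.replace(/\\/g, "/").replace(/\/+$/, "");
  const lastSegment = normalized.split("/").pop();
  const dot = lastSegment.lastIndexOf(".");

  // Assumption: no extension (or "." / "..") means the target is a directory
  const isDirectory = lastSegment === "." || lastSegment === ".." || dot <= 0;
  if (isDirectory) {
    return `${normalized}/${DEFAULT_FILENAME}`;
  }

  // A file target must end in .json
  if (lastSegment.slice(dot).toLowerCase() !== ".json") {
    throw new Error(`Output file must have a .json extension: ${target}`);
  }
  return normalized;
}

console.log(resolveOutputPath("./output-dir"));
console.log(resolveOutputPath("./output-dir/custom-name.json"));
```

Rejecting non-JSON file targets up front keeps a mistyped argument from silently overwriting an unrelated file.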
Webpack Plugin
The webpack plugin integrates content collection into your build process:
import { SiteContentPlugin } from "@uniwebcms/site-content-collector/webpack";
export default {
  plugins: [
    new SiteContentPlugin({
      injectToHtml: true, // Optional: inject into HTML (requires html-webpack-plugin)
      variableName: "__SITE_CONTENT__", // Optional: id/variable name when injecting
      filename: "site-content.json", // Optional: output filename
      injectFormat: "json", // Optional: injection format ('json' or 'script')
    }),
  ],
};

HTML Injection Formats
The plugin supports two formats for injecting content into HTML:
- JSON format (default):
<script type="application/json" id="__SITE_CONTENT__">
  {
    "pages": {
      /* content */
    }
  }
</script>
Access in your code:
const content = JSON.parse(
  document.getElementById("__SITE_CONTENT__").textContent
);
- Script format:
<script>
  window.__SITE_CONTENT__ = {
    /* content */
  };
</script>
Access in your code:
const content = window.__SITE_CONTENT__;

Content Structure
The library expects a folder structure organized as follows:
website/
├── site.yml # Site-wide metadata and settings
├── home/ # Each folder is a page
│ ├── page.yml # Page-specific metadata
│ ├── 1-hero.md # Section with prefix "1"
│ ├── 2-features.md # Section with prefix "2"
│ └── 2.1-feature.md # Subsection of "2"
└── about/
    ├── page.yml
    └── 1-intro.json # JSON sections are also supported

Content Files
The library processes two types of content files:
Markdown Files (.md)
---
component: Hero # Optional component name
props: # Optional component properties
  background: ./bg.jpg
---
# Section Title
Content in Markdown format

JSON Files (.json)
{
  "component": "Feature",
  "props": {
    "icon": "star"
  },
  "content": {
    "type": "doc",
    "content": []
  }
}
The content field uses the ProseMirror/TipTap document format.

File Naming Convention
Content files must follow these rules:
- Must have a numeric prefix (e.g., 1-, 2.1-) that determines order and hierarchy
- Must use either the .md or .json extension
- Files without numeric prefixes are ignored
- Subsection numbers must reference existing parent sections (e.g., 2.1- requires a 2- section)
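To make the naming rules concrete, here is a minimal sketch of how a filename could be split into its numeric prefix, title, and extension, and how subsections without a parent could be detected. The regular expression and the function names parseSectionFile and findOrphans are assumptions for illustration; the library's own parsing may differ.

```javascript
// Assumed pattern: dot-separated integers, a hyphen, a name, then .md or .json
const SECTION_FILE = /^(\d+(?:\.\d+)*)-(.+)\.(md|json)$/;

function parseSectionFile(filename) {
  const match = SECTION_FILE.exec(filename);
  if (!match) return null; // no numeric prefix or unsupported extension: ignored

  const [, id, title, ext] = match;
  const lastDot = id.lastIndexOf(".");
  // "2.1" has parent "2"; a top-level id such as "2" has no parent
  const parentId = lastDot === -1 ? null : id.slice(0, lastDot);
  return { id, title, ext, parentId };
}

// Return the ids of subsections whose parent section file is missing
function findOrphans(filenames) {
  const parsed = filenames.map(parseSectionFile).filter(Boolean);
  const ids = new Set(parsed.map((section) => section.id));
  return parsed
    .filter((section) => section.parentId !== null && !ids.has(section.parentId))
    .map((section) => section.id);
}

const files = ["page.yml", "1-hero.md", "2-features.md", "2.1-feature.md", "3.1-lost.json"];
console.log(findOrphans(files)); // prints [ '3.1' ]
```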
Metadata Files
- site.yml: Site-wide configuration and metadata
- page.yml: Page-specific metadata (optional in each page folder)
Output Structure
The library produces a JavaScript object with this structure:
{
  siteMetadata: {
    // Contents of site.yml
  },
  pages: {
    home: {
      metadata: {
        // Contents of page.yml
      },
      sections: [
        {
          id: "1",
          title: "hero",
          component: "Hero",
          props: {},
          content: {}, // ProseMirror/TipTap JSON
          subsections: [] // Nested sections
        }
      ]
    }
  },
  errors: [] // Processing errors if any
}

Error Handling
The library handles several types of errors:
- Missing parent sections for subsections
- Malformed YAML, JSON, or Markdown content
- Invalid file structure or naming
- Missing required files
Errors are collected in the errors array of the output, allowing processing to continue even when some files fail.
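Because errors are reported in the output rather than thrown, a caller will typically inspect the result after collection. The sketch below walks the documented pages, sections, and subsections fields and counts the collected errors. The helper names flattenSections and summarize are hypothetical, and since the shape of individual error entries is not documented here, the code only counts them.

```javascript
// Depth-first walk over sections and their nested subsections
function flattenSections(sections, depth = 0) {
  return sections.flatMap((section) => [
    { id: section.id, component: section.component, depth },
    ...flattenSections(section.subsections ?? [], depth + 1),
  ]);
}

// Summarize a collected site: flattened sections per page plus the error count
function summarize(content) {
  const pages = Object.entries(content.pages ?? {}).map(([name, page]) => ({
    name,
    sections: flattenSections(page.sections ?? []),
  }));
  return { pages, errorCount: (content.errors ?? []).length };
}

// Hand-written sample in the documented output shape (not real library output)
const sample = {
  siteMetadata: {},
  pages: {
    home: {
      metadata: {},
      sections: [
        { id: "1", title: "hero", component: "Hero", props: {}, content: {}, subsections: [] },
      ],
    },
  },
  errors: [],
};

const summary = summarize(sample);
if (summary.errorCount > 0) {
  console.warn(`${summary.errorCount} file(s) failed to process`);
}
console.log(summary.pages);
```

In a build script, a non-zero errorCount could be used to fail the build while still writing the partial output for inspection.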
Requirements
- Node.js >=18.0.0
- When using the webpack plugin, webpack >=5.0.0 is required
This package uses ES Modules (ESM). Your project should either:
- Have "type": "module" in its package.json, or
- Use the .mjs extension for files using ESM syntax
License
GPL-3.0-or-later - see LICENSE for details