@uniwebcms/site-content-collector v2.3.4
A Node.js library that processes website content from a structured folder hierarchy into a standardized JSON format. The library reads content files (Markdown and JSON) and metadata (YAML), organizing them into a complete site structure that preserves page hierarchy and section ordering.
Installation
npm install @uniwebcms/site-content-collector
Usage Methods
This library can be used in three ways:
- As a Node.js module
- As a CLI tool
- As a webpack plugin
Node.js Module
import { collectSiteContent } from "@uniwebcms/site-content-collector";
async function processWebsite() {
  try {
    const content = await collectSiteContent("./website");
    console.log(content);
  } catch (err) {
    console.error("Processing error:", err);
  }
}
CLI Tool
Process content directly from the command line using npx:
# Output to directory (creates site-content.json)
npx collect-content ./source-dir ./output-dir
# Output to specific JSON file
npx collect-content ./source-dir ./output-dir/custom-name.json
# With pretty-printed JSON output
npx collect-content ./source-dir ./output.json --pretty
# With verbose logging
npx collect-content ./source-dir ./output.json --verbose
The CLI enforces these rules for safety:
- When specifying a directory, it creates site-content.json inside it
- When specifying a file, it must have a .json extension
- Creates output directories if they don't exist
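The output-path rules above can be sketched as a small function. This is an illustration only, not the CLI's actual code: resolveOutputPath and DEFAULT_FILENAME are hypothetical names, and the sketch assumes that any path whose last segment has an extension is meant as a file. It does not create directories.

```javascript
// Hypothetical helper, not part of the package: mirrors the documented CLI rules.
const DEFAULT_FILENAME = "site-content.json";

function resolveOutputPath(target) {
  // Drop trailing separators, then look at the last path segment only.
  const trimmed = target.replace(/[\\/]+$/, "");
  const lastSegment = trimmed.split(/[\\/]/).pop();
  const dot = lastSegment.lastIndexOf(".");
  // A leading dot (".", "..", ".hidden") is not treated as an extension.
  const ext = dot > 0 ? lastSegment.slice(dot).toLowerCase() : "";

  if (ext === ".json") return trimmed; // explicit JSON file
  if (ext !== "") {
    throw new Error(`Output file must have a .json extension: ${target}`);
  }
  return `${trimmed}/${DEFAULT_FILENAME}`; // directory: use the default file name
}

console.log(resolveOutputPath("./output-dir")); // ./output-dir/site-content.json
console.log(resolveOutputPath("./output-dir/custom-name.json")); // unchanged
```

One consequence of the assumption above is that a directory name containing a dot (for example ./dist.v2) would be rejected by this sketch; the real CLI may behave differently.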
Webpack Plugin
The webpack plugin integrates content collection into your build process:
import { SiteContentPlugin } from "@uniwebcms/site-content-collector/webpack";
export default {
  plugins: [
    new SiteContentPlugin({
      injectToHtml: true, // Optional: inject into HTML (requires html-webpack-plugin)
      variableName: "__SITE_CONTENT__", // Optional: id/variable name when injecting
      filename: "site-content.json", // Optional: output filename
      injectFormat: "json", // Optional: injection format ('json' or 'script')
    }),
  ],
};
HTML Injection Formats
The plugin supports two formats for injecting content into HTML:
- JSON format (default):
<script type="application/json" id="__SITE_CONTENT__">
  {
    "pages": {
      /* content */
    }
  }
</script>
Access in your code:
const content = JSON.parse(
  document.getElementById("__SITE_CONTENT__").textContent
);
- Script format:
<script>
  window.__SITE_CONTENT__ = {
    /* content */
  };
</script>
Access in your code:
const content = window.__SITE_CONTENT__;
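If client code should not depend on which injection format was configured, one option is a small reader that tries both. The sketch below is an assumption-laden illustration, not something the plugin provides: readSiteContent is a hypothetical helper, and its doc and win parameters exist only so it can be exercised outside a browser.

```javascript
// Hypothetical helper, not part of the plugin: reads injected content in either format.
// `doc` and `win` default to the browser globals; pass stand-ins when testing.
function readSiteContent(
  name = "__SITE_CONTENT__",
  doc = globalThis.document,
  win = globalThis
) {
  // JSON format: <script type="application/json" id="...">
  const element = doc?.getElementById(name);
  if (element) return JSON.parse(element.textContent);

  // Script format: a global variable assigned by an inline <script>
  if (win && win[name] !== undefined) return win[name];

  return null; // nothing was injected under this name
}

const content = readSiteContent();
console.log(content === null ? "No injected content found" : content);
```

The element lookup runs first because browsers also expose elements with an id as properties of window, so checking the global first could return the script element rather than the data.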
Content Structure
The library expects a folder structure organized as follows:
website/
├── site.yml # Site-wide metadata and settings
├── home/ # Each folder is a page
│ ├── page.yml # Page-specific metadata
│ ├── 1-hero.md # Section with prefix "1"
│ ├── 2-features.md # Section with prefix "2"
│ └── 2.1-feature.md # Subsection of "2"
└── about/
    ├── page.yml
    └── 1-intro.json # JSON sections are also supported
Content Files
The library processes two types of content files:
Markdown Files (.md)
---
component: Hero # Optional component name
props: # Optional component properties
  background: ./bg.jpg
---
# Section Title
Content in Markdown format
JSON Files (.json)
{
  "component": "Feature",
  "props": {
    "icon": "star"
  },
  "content": {
    "type": "doc",
    "content": [] // ProseMirror/TipTap format
  }
}
File Naming Convention
Content files must follow these rules:
- Must have a numeric prefix (e.g., 1-, 2.1-) that determines order and hierarchy
- Must use either .md or .json extension
- Files without numeric prefixes are ignored
- Subsection numbers must reference existing parent sections (e.g., 2.1- requires a 2- section)
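To make the naming rules concrete, here is a minimal sketch of how such file names could be parsed. The regular expression and the parseSectionFilename helper are assumptions written for this example; the library's own parsing may differ in detail.

```javascript
// Assumed pattern: "<number>[.<number>...]-<name>.<md|json>"
const SECTION_FILE = /^(\d+(?:\.\d+)*)-(.+)\.(md|json)$/;

// Hypothetical helper: returns null for files the rules say are ignored.
function parseSectionFilename(filename) {
  const match = SECTION_FILE.exec(filename);
  if (!match) return null; // no numeric prefix or unsupported extension

  const [, id, title, type] = match;
  const lastDot = id.lastIndexOf(".");
  return {
    id, // e.g. "2.1"
    title, // e.g. "feature"
    type, // "md" or "json"
    // "2.1" belongs to "2"; top-level sections have no parent
    parentId: lastDot === -1 ? null : id.slice(0, lastDot),
  };
}

console.log(parseSectionFilename("2.1-feature.md"));
console.log(parseSectionFilename("page.yml")); // null
```

The parentId value is what a collector would check against existing sections to enforce the last rule in the list.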
Metadata Files
- site.yml: Site-wide configuration and metadata
- page.yml: Page-specific metadata (optional in each page folder)
Output Structure
The library produces a JavaScript object with this structure:
{
  siteMetadata: {
    // Contents of site.yml
  },
  pages: {
    home: {
      metadata: {
        // Contents of page.yml
      },
      sections: [
        {
          id: "1",
          title: "hero",
          component: "Hero",
          props: {},
          content: {}, // ProseMirror/TipTap JSON
          subsections: [] // Nested sections
        }
      ]
    }
  },
  errors: [] // Processing errors if any
}
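Because subsections nest inside sections, consumers usually walk the structure recursively. The sketch below shows one way to do that; listSectionIds and sampleOutput are made up for this example, with sampleOutput shaped like the structure shown above.

```javascript
// Hypothetical helper: collects section ids depth-first, parents before children.
function listSectionIds(sections, ids = []) {
  for (const section of sections) {
    ids.push(section.id);
    listSectionIds(section.subsections ?? [], ids);
  }
  return ids;
}

// Hand-written sample in the documented output shape (not real library output).
const sampleOutput = {
  siteMetadata: {},
  pages: {
    home: {
      metadata: {},
      sections: [
        { id: "1", title: "hero", component: "Hero", props: {}, content: {}, subsections: [] },
        {
          id: "2",
          title: "features",
          component: null,
          props: {},
          content: {},
          subsections: [
            { id: "2.1", title: "feature", component: null, props: {}, content: {}, subsections: [] },
          ],
        },
      ],
    },
  },
  errors: [],
};

for (const [name, page] of Object.entries(sampleOutput.pages)) {
  console.log(name, listSectionIds(page.sections)); // home [ '1', '2', '2.1' ]
}
```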
Error Handling
The library handles several types of errors:
- Missing parent sections for subsections
- Malformed YAML, JSON, or Markdown content
- Invalid file structure or naming
- Missing required files
Errors are collected in the errors array of the output, allowing processing to continue even when some files fail.
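The collect-and-continue approach can be illustrated with the missing-parent case. This is a sketch under stated assumptions, not the library's implementation: nestSections and compareIds are hypothetical, and the { id, message } error shape is invented here because the actual shape of error entries is not documented in this README.

```javascript
// Compare dotted ids numerically so "2" < "2.1" < "2.10" < "10".
function compareIds(a, b) {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] ?? -1) - (partsB[i] ?? -1);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical helper: nests a flat list of sections and records orphans as errors.
function nestSections(flatSections) {
  const byId = new Map();
  const sections = [];
  const errors = [];
  const sorted = [...flatSections].sort((a, b) => compareIds(a.id, b.id));

  for (const section of sorted) {
    const node = { ...section, subsections: [] };
    byId.set(node.id, node);

    const lastDot = node.id.lastIndexOf(".");
    if (lastDot === -1) {
      sections.push(node); // top-level section
      continue;
    }
    const parent = byId.get(node.id.slice(0, lastDot));
    if (parent) {
      parent.subsections.push(node);
    } else {
      // Record the problem and keep going instead of throwing.
      errors.push({ id: node.id, message: `Missing parent section for ${node.id}` });
    }
  }
  return { sections, errors };
}

const result = nestSections([{ id: "2.1" }, { id: "1" }, { id: "3.1" }, { id: "2" }]);
console.log(result.errors); // one entry, for "3.1"
```

A caller can then decide whether a non-empty errors array should fail the build or only produce a warning.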
Requirements
- Node.js >=18.0.0
- When using the webpack plugin, webpack >=5.0.0 is required
This package uses ES Modules (ESM). Your project should either:
- Have "type": "module" in its package.json, or
- Use the .mjs extension for files using ESM syntax
License
GPL-3.0-or-later - see LICENSE for details