@powerhousedao/sky-atlas-notion-data v1.1.12
Sky Atlas Notion Data
A tool for fetching and processing Sky Atlas data from Notion, converting it into a structured tree representation, and optionally committing it to GitHub and posting it to an import API.
Overview
This project fetches data from Notion pages, processes it, and creates a structured tree representation. The data transformation pipeline:
- Fetches raw data from Notion using the Notion API
- Processes the raw data into a structured format
- Creates a tree of ViewNodes that represent the Sky Atlas structure
- Optionally commits the data to GitHub and posts to an import API
Data Transformation Stages
Notion Data Fetching
- Fetches raw data from Notion pages
- Stores in the data/notion-pages/ directory
- Preserves Notion's original structure and relationships
Initial Processing
- Converts Notion's block-based structure into a simpler format
- Processes page properties and relationships
- Stores in the data/processed/ directory
- Maintains parent-child relationships from Notion
Data Parsing
- Creates a unified data structure from processed pages
- Uses make-notion-data-by-id.ts to combine Hub and Atlas page data
- Creates a map of all items by their ID for quick lookups
- Stores in the data/parsed/ directory
- Prepares data for tree generation
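The "map of all items by their ID" step can be pictured with a minimal sketch. The `ProcessedPage` shape, the `makeDataById` helper, and the sample pages are all hypothetical and only illustrate the idea; the project's actual logic lives in make-notion-data-by-id.ts and may differ.

```typescript
// Hypothetical shape of a processed page; the real types live in the project's types/ directory.
interface ProcessedPage {
  id: string;
  title: string;
  parentId?: string;
}

// Merge several lists of processed pages into one lookup object keyed by page ID.
// Assumption of this sketch: on duplicate IDs, the later list wins.
function makeDataById(...pageLists: ProcessedPage[][]): Record<string, ProcessedPage> {
  const byId: Record<string, ProcessedPage> = {};
  for (const pages of pageLists) {
    for (const page of pages) {
      byId[page.id] = page;
    }
  }
  return byId;
}

// Illustrative data only.
const hubPages: ProcessedPage[] = [{ id: "hub-1", title: "Hub entry" }];
const atlasPages: ProcessedPage[] = [
  { id: "scope-1", title: "Example Scope" },
  { id: "article-1", title: "Example Article", parentId: "scope-1" },
];

const notionDataById = makeDataById(hubPages, atlasPages);
```

With such a map, resolving a parent is a single property lookup, for example `notionDataById["article-1"].parentId`, rather than a scan over all pages.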
Tree Generation
- Creates the View Node Tree structure
- Applies numbering system
- Generates View Node Map for quick lookups
- Produces simplified text representation
Installation
To install dependencies:
bun install
Environment Setup
Create a .env file based on the .env.example template with the following variables:
API_KEY="your-notion-api-key"
IMPORT_API_KEY="your-import-api-key"
GITHUB_TOKEN="your-github-token"
IMPORT_API_URL="your-import-api-url"
SKIP_IMPORT_API="true"
SKIP_GITHUB_SNAPSHOT="true"
USE_LOCAL_DATA="false"
Usage
Available Scripts
make-atlas-data.ts
The main script for generating the tree structure. This script:
- Fetches data from Notion
- Processes it into a structured format
- Creates the view node tree and map
- Optionally commits to GitHub and posts to import API
bun run make-atlas-data [options]
Options:
- --outputPath <path>: Specify output directory (default: "data")
- --useLocalData: Use locally cached data instead of fetching from Notion
- --skipImportApi: Skip posting to import API
- --skipGithubSnapshot: Skip committing to GitHub
- --help: Display help message
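To make the option semantics concrete, here is a small hand-rolled parser for the flags listed above. It is a sketch, not the script's actual argument handling: the `MakeAtlasDataOptions` interface and `parseOptions` function are hypothetical, and the `false` defaults for the boolean flags are an assumption (only the "data" default for --outputPath is documented).

```typescript
// Hypothetical options object mirroring the documented flags.
interface MakeAtlasDataOptions {
  outputPath: string;
  useLocalData: boolean;
  skipImportApi: boolean;
  skipGithubSnapshot: boolean;
  help: boolean;
}

// Parse an argument list such as ["--outputPath", "out", "--useLocalData"].
function parseOptions(argv: string[]): MakeAtlasDataOptions {
  const options: MakeAtlasDataOptions = {
    outputPath: "data", // documented default
    useLocalData: false, // assumed default
    skipImportApi: false, // assumed default
    skipGithubSnapshot: false, // assumed default
    help: false,
  };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    switch (arg) {
      case "--outputPath": {
        const value = argv[i + 1];
        if (value === undefined || value.indexOf("--") === 0) {
          throw new Error("--outputPath requires a path");
        }
        options.outputPath = value;
        i++; // skip the value we just consumed
        break;
      }
      case "--useLocalData":
        options.useLocalData = true;
        break;
      case "--skipImportApi":
        options.skipImportApi = true;
        break;
      case "--skipGithubSnapshot":
        options.skipGithubSnapshot = true;
        break;
      case "--help":
        options.help = true;
        break;
      default:
        throw new Error(`Unknown option: ${arg}`);
    }
  }
  return options;
}
```

How these flags interact with the environment variables of the same meaning (for example SKIP_IMPORT_API) is not covered by this sketch.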
For development, you can use the make-atlas-data-dev script which automatically sets development-friendly options:
bun run make-atlas-data-dev
fetch-latest-atlas-data.ts
Fetches already processed Atlas data from the server instead of processing raw Notion data. Useful for development to avoid repeated fetches.
bun run fetch-latest-atlas-data [options]
Options:
- --atlasDataUrl <url>: URL to fetch atlas data from
- --outputPath <path>: Path to write output files
- --help: Show help message
diff-atlas-data.ts
Compares two atlas data files and generates diffs in multiple formats to help identify changes:
bun run diff-atlas-data [options]
Options:
- --baseDataPath <path>: Path to base data file
- --newDataPath <path>: Path to new data file
- --outputPath <path>: Path to save diff files
- --help: Display help message
The script generates three types of diffs to help identify changes:
Raw JSON Diff
- Direct comparison of the JSON files
- Shows structural changes in the data
- Can be noisy due to JSON key ordering differences
Simplified Text Diff
- Converts the data to a human-readable text format
- Each node is represented as a block of text with its properties
- Makes it easier to see content changes
- Preserves the hierarchical structure
Sorted Simplified Diff
- Same as the simplified text diff but with all lines sorted
- Helps identify when the same data appears in a different order
- Useful for detecting when only JSON key ordering has changed
- Makes it easier to spot actual content changes vs. structural changes
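The idea behind the sorted variant can be illustrated with a short sketch: if two simplified texts differ as written but become identical once their lines are sorted, only the ordering changed. The helper names and sample text below are hypothetical and are not taken from diff-atlas-data.ts.

```typescript
// Sort the lines of a text so that ordering differences disappear.
function sortLines(text: string): string {
  return text.split("\n").sort().join("\n");
}

// True when the two texts differ only in the order of their lines.
function isReorderOnly(baseText: string, newText: string): boolean {
  return baseText !== newText && sortLines(baseText) === sortLines(newText);
}

// Illustrative data only.
const baseText = ["id: a", "title: One", "id: b", "title: Two"].join("\n");
const reorderedText = ["id: b", "title: Two", "id: a", "title: One"].join("\n");
const editedText = ["id: a", "title: One", "id: b", "title: Three"].join("\n");

const reorderOnly = isReorderOnly(baseText, reorderedText); // ordering change only
const contentChanged = !isReorderOnly(baseText, editedText) && baseText !== editedText; // real edit
```

Sorting discards the hierarchy, so this check complements the simplified diff rather than replacing it.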
Example output files:
<timestamp>-raw.diff # Raw JSON comparison
<timestamp>-simplified.diff # Human-readable text comparison
<timestamp>-simplified-sorted.diff # Sorted text comparison
upload-data
Uploads the data generated by make-atlas-data to Vercel Blob storage.
bun run upload-data
Data Structure Explanation
The project transforms Notion data into several interconnected data structures:
View Node Tree (Atlas Data)
- This is the primary data structure representing the Sky Atlas
- Stored in atlas-data.json (also available as view-node-tree.json for legacy compatibility)
- Hierarchical representation with Scopes (type: SCOPE) as root nodes
- Each node can have subDocuments (children)
- Contains processed content with correct numbering
- Used for navigation and display in the Atlas Explorer
- Represents the complete structure of the Atlas
View Node Map
- A transformed version of the same data optimized for quick lookups
- Stored in view-node-map.json
- Takes the view node tree and flattens it into a map where:
  - Keys are the node's slug suffix (e.g., "node-id|parent-suffix")
  - Values are the complete node objects
- Used by the Next.js app for prerendering and quick node lookups
- Contains the same data as the tree, just organized differently
- Enables direct access to any node without tree traversal
- Makes it easy to find nodes by their URL path
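The flattening step could look roughly like the following sketch. The `ViewNode` fields other than subDocuments, the `flattenTree` helper, and the exact way a slug suffix is built (node ID, then "|", then the parent's full suffix; root nodes use their ID alone) are assumptions made for illustration.

```typescript
// Hypothetical minimal node shape; the real ViewNode type has more fields.
interface ViewNode {
  id: string;
  type: string;
  title: string;
  subDocuments: ViewNode[];
}

// Flatten a tree into a lookup object keyed by slug suffix.
// Assumption: a child's suffix is "<node id>|<parent suffix>".
function flattenTree(roots: ViewNode[]): Record<string, ViewNode> {
  const map: Record<string, ViewNode> = {};
  const visit = (node: ViewNode, parentSuffix: string | null): void => {
    const suffix = parentSuffix === null ? node.id : `${node.id}|${parentSuffix}`;
    map[suffix] = node; // the value is the complete node object
    for (const child of node.subDocuments) {
      visit(child, suffix);
    }
  };
  for (const root of roots) {
    visit(root, null);
  }
  return map;
}

// Illustrative data only.
const exampleTree: ViewNode[] = [
  {
    id: "scope-1",
    type: "SCOPE",
    title: "Example Scope",
    subDocuments: [
      {
        id: "article-1",
        type: "ARTICLE",
        title: "Example Article",
        subDocuments: [
          { id: "section-1", type: "SECTION", title: "Example Section", subDocuments: [] },
        ],
      },
    ],
  },
];

const viewNodeMap = flattenTree(exampleTree);
```

Because the values are the same objects as in the tree, the map adds lookup keys without duplicating node content in memory; the serialized JSON file does repeat it.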
Processing Flow
- Raw Notion data is fetched and stored in notion-pages/
- Data is processed into a structured format in processed/
- Processed data is parsed into parsed/
- Final tree structure is generated and stored in atlas-data.json
Node Types
- SCOPE: Top-level nodes in the Atlas
- ARTICLE: Main content nodes
- SECTION: Content sections within articles
- CATEGORY: Grouping nodes with special numbering
- ANNOTATION: Supporting documentation
- And more (see constants.ts for full list)
Numbering System
The Sky Atlas uses a sophisticated numbering system to create unique identifiers for each node in the tree. Here's how it works:
Basic Structure
- Each node has a formalId consisting of a prefix and numberPath
- The prefix is typically "A" for the first scope
- The numberPath is an array of numbers/strings representing the node's position in the hierarchy
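As a minimal sketch of how a prefix and numberPath combine into a dotted identifier, consider the following. The `FormalId` interface and `formatAtlasId` function are hypothetical names for illustration; the project's own function for this is makeViewNodeAtlasId, whose exact behavior is not reproduced here.

```typescript
// Hypothetical representation of a formalId.
interface FormalId {
  prefix: string;
  numberPath: (number | string)[];
}

// Join the prefix and every path segment with dots.
function formatAtlasId(formalId: FormalId): string {
  return [formalId.prefix, ...formalId.numberPath].join(".");
}

const regularId = formatAtlasId({ prefix: "A", numberPath: [1, 2, 3] }); // "A.1.2.3"
const agentId = formatAtlasId({ prefix: "A", numberPath: ["AG1"] }); // "A.AG1"
```

Allowing string segments in numberPath is what lets identifiers such as "A.AG1" share the same structure as purely numeric ones.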
Numbering Rules
- Scopes: Use their index in the sorted scope list as their numberPath (e.g., 0, 1, 2)
- Regular Nodes: Inherit parent's numberPath and append their counter value
- Categories: Special handling where children are flattened into the parent's numbering
- Agent Artifacts: Use "AG" prefix with index (e.g., "A.AG1", "A.AG2")
- Sky Primitives: Use "P" prefix with counter (e.g., "A.P1", "A.P2")
- Support Documents: Don't receive numbers in the path
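A simplified sketch of three of these rules (scopes numbered by index, regular nodes appending a counter to the parent's path, support documents left unnumbered) is shown below. Every name in it is hypothetical, the counter is assumed to start at 1, scopes are assumed to be already sorted, and Categories, Agent Artifacts, Sky Primitives, and children of support documents are deliberately left out.

```typescript
// Hypothetical input node before numbering.
interface DraftNode {
  title: string;
  isSupportDocument?: boolean;
  children?: DraftNode[];
}

// Hypothetical output node; numberPath is null for unnumbered nodes.
interface NumberedNode {
  title: string;
  numberPath: number[] | null;
  children: NumberedNode[];
}

// Regular nodes inherit the parent's path and append their counter value.
function numberChildren(children: DraftNode[], parentPath: number[]): NumberedNode[] {
  let counter = 0; // assumption: the first numbered child gets 1
  return children.map((child): NumberedNode => {
    if (child.isSupportDocument) {
      // Support documents receive no number and do not advance the counter.
      return { title: child.title, numberPath: null, children: [] };
    }
    counter += 1;
    const numberPath = [...parentPath, counter];
    return {
      title: child.title,
      numberPath,
      children: numberChildren(child.children ?? [], numberPath),
    };
  });
}

// Scopes use their index in the (already sorted) scope list.
function numberScopes(scopes: DraftNode[]): NumberedNode[] {
  return scopes.map((scope, index): NumberedNode => ({
    title: scope.title,
    numberPath: [index],
    children: numberChildren(scope.children ?? [], [index]),
  }));
}

// Illustrative data only.
const numbered = numberScopes([
  { title: "Scope 0" },
  {
    title: "Scope 1",
    children: [
      { title: "Article" },
      { title: "Annotation", isSupportDocument: true },
      { title: "Article 2", children: [{ title: "Section" }] },
    ],
  },
]);
```

In this example the annotation sits between two articles without consuming a number, so the second article is numbered directly after the first.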
Key Functions
- makeSortedByNumberOrDocNo: Sorts items by explicit number or document number
- updateCounter: Manages counter increments based on node type
- flattenedCategoryChildren: Calculates total children for category numbering
- makeViewNodeAtlasId: Creates the final Atlas ID (e.g., "A.1.2.3")
Special Cases
- Categories remove their parent's last number from the path
- Agent Artifacts reset their parent's path and use their own numbering
- Support document types don't affect the numbering sequence
- Sky Primitives use a global counter for unique "P" numbers
Example Numbering
See existing tree and https://www.notion.so/atlas-axis/7b5370146f1e448897b189299222e206?v=e21c5c37020f4935a58d45b750d9bd1e&pvs=4
Hub and Relationships
- The Hub is a special Notion page that defines relationships between Atlas pages
- Used to establish parent-child relationships that aren't explicit in Notion
- Helps maintain the correct structure when pages can have multiple parents
- Ensures consistent navigation paths in the Atlas
Simplified Tree Format
- Human-readable text representation of the tree
- Each node is represented as a block of text with:
  - Node ID
  - Formal ID and title
  - Content
  - Hub URLs
  - Sub-document IDs
  - Supporting document IDs
- Used for:
  - Reviewing changes in GitHub diffs
  - Understanding the structure without parsing JSON
  - Spotting structural changes more easily
- Generated alongside the JSON files for convenience
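To give a feel for the format, the sketch below renders one node as a block of text containing the fields listed above. The `SimplifiedNode` interface, the `renderNodeBlock` function, and the line labels are assumptions for illustration; the actual layout produced by make-simplified-atlas-data.ts may differ.

```typescript
// Hypothetical flat view of the fields that appear in a simplified block.
interface SimplifiedNode {
  id: string;
  formalId: string;
  title: string;
  content: string;
  hubUrls: string[];
  subDocumentIds: string[];
  supportingDocumentIds: string[];
}

// Render one node as a multi-line text block, one field per line.
function renderNodeBlock(node: SimplifiedNode): string {
  return [
    `Node ID: ${node.id}`,
    `${node.formalId} - ${node.title}`,
    `Content: ${node.content}`,
    `Hub URLs: ${node.hubUrls.join(", ")}`,
    `Sub-documents: ${node.subDocumentIds.join(", ")}`,
    `Supporting documents: ${node.supportingDocumentIds.join(", ")}`,
  ].join("\n");
}

// Illustrative data only.
const exampleBlock = renderNodeBlock({
  id: "article-1",
  formalId: "A.1.1",
  title: "Example Article",
  content: "Example content.",
  hubUrls: ["https://example.com/hub/article-1"],
  subDocumentIds: ["section-1"],
  supportingDocumentIds: [],
});
```

Keeping one field per line is what makes such a file diff cleanly: a changed title touches exactly one line.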
Output Files
The scripts generate several output files:
- data/notion-pages/: Raw Notion page data
- data/processed/: Processed Notion page data
- data/parsed/: Parsed data ready for tree generation
- data/atlas-data.json: Generated tree structure
- data/view-node-map.json: Map of nodes in the tree
- data/simplified-atlas-tree.txt: Human-readable tree representation
- data/view-node-tree.json: Same as atlas-data.json, for legacy use
Development
This project was created using bun init in bun v1.1.0. Bun is a fast all-in-one JavaScript runtime.
Project Structure
- src/: Source code
  - constants.ts: Constants used throughout the application
  - fetching.ts: Functions for fetching data from Notion
  - make-notion-data-by-id.ts: Functions for creating a unified data structure from processed pages
  - make-view-node-tree.ts: Functions for creating the view node tree
  - processors.ts: Functions for processing Notion data
  - types/: TypeScript type definitions
  - utils/: Utility functions
  - page-properties/: Functions for handling Notion page properties
- scripts/: Executable scripts
  - make-atlas-data.ts: Main script for generating the tree structure
  - fetch-latest-atlas-data.ts: Script for fetching processed data
  - diff-atlas-data.ts: Script for comparing atlas data files
  - upload-data-to-vercel.ts: Script to upload atlas data to Vercel Blob storage
  - make-simplified-atlas-data.ts: Script for generating simplified text representation
  - handleEnv.ts: Environment variable handling
  - utils.ts: Script utility functions
License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).