COPC Validator

Table of Contents

  1. Introduction
    1. Getting Started
  2. Usage
    1. CLI
      1. Options
    2. Import
      1. Options
  3. Scans
    1. Shallow scan
    2. Deep scan
    3. Output
  4. Details
    1. Checks
      1. Status & Check Objects
      2. Functions
      3. Suites
      4. Parsers
      5. Collections
      6. All checks
    2. Report
      1. Report schema
  5. Future Plans

Introduction

COPC Validator is a library & command-line application for validating the header and content of a Cloud-Optimized Point Cloud (COPC) LAS file. Extending the copc.js library, it accepts either a (relative) file path or a COPC URL, and runs a series of checks against the values parsed by copc.js.

Getting Started

  1. Install from npm

    npm i -g copc-validator

    Global install is recommended for CLI usage

  2. Scan copc.laz file with copcc CLI

Examples:

  • Default

    copcc ./path/to/example.copc.laz
  • Deep scan, output to <pwd>/output.json

    copcc --deep path/to/example.copc.laz --output=output.json
  • Deep & Minified scan with worker count = 64, showing a progress bar

    copcc path/to/example.copc.laz -dmpw 64

Usage

COPC Validator can be used in two ways: via the copcc Command-Line Interface (CLI), or imported as the generateReport() function.

CLI

copcc [options] <path>

The usage and implementation of COPC Validator are meant to be as simple as possible. The CLI needs only a single file path: it runs a shallow scan by default, or a deep scan when given the --deep option. All other functionality is completely optional.

Options

| Option | Alias | Description | Type | Default |
| --- | --- | --- | --- | --- |
| deep | d | Read all points of each node; otherwise, read only the root point | boolean | false |
| name | n | Replace name in Report with the provided string | string | <path> |
| mini | m | Omit Copc or Las from Report, leaving checks and scan info | boolean | false |
| pdal | P | Output a pdal.metadata object containing header & vlr data in pdal info format | boolean | false |
| workers | w | Number of Workers to create - use at own (performance) risk | number | CPU count |
| queue | q | Queue size limit for reading PDR data; useful for very high node counts (>10000) | number | Unlimited |
| sample | s | Select a random sample of nodes to read & validate | number | All nodes |
| progress | p | Show a progress bar while reading the point data | boolean | false |
| output | o | Write the Report to the provided filepath; otherwise, write to stdout | string | N/A |
| help | h | Display help information for the copcc command; overrides all other options | boolean | N/A |
| version | v | Display the copc-validator version (from package.json) | boolean | N/A |
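
For example, a deep scan that trims the report and writes it to a file might look like this (a sketch, assuming the long-form flags simply mirror the option names above):

    copcc ./path/to/example.copc.laz --deep --mini --workers=8 --output=report.json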

Import

  1. Add to project:

    yarn add copc-validator
      # or
    npm i copc-validator
  2. Import generateReport():

    import { generateReport } from 'copc-validator'
  • Example:
    async function printReport() {
      const report = await generateReport({
        source: 'path/to/example.copc.laz',
        options: {} // default options
      })
      console.log(report)
    }
  3. Copy laz-perf.wasm to /public (for browser usage)

Options

generateReport accepts most* of the same options as the CLI through the options property of the first parameter:

Pseudo-TypeScript:

const generateReport = ({
  source: string | File
  options?: {
    name?: string          //default: source | 'COPC Validator Report'
    mini?: boolean         //default: false
    pdal?: boolean         //default: false
    deep?: boolean         //default: false
    workers?: number       //default: CPU Thread Count
    queueLimit?: number    //default: Infinity
    sampleSize?: number    //default: All nodes
    showProgress?: boolean //default: false
  },
  collections?: {copc, las, fallback}
}) => Promise<Report>

See below for collections information

* Key option differences:

  • No output, help, or version options
  • queue is renamed to queueLimit
  • sample is renamed to sampleSize
  • progress is renamed to showProgress
    • Not usable in a browser
  • Any Alias (listed above) will not work
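
For instance, the deep/mini/progress CLI example from Getting Started translates roughly as follows (a sketch; the file path is a placeholder, and the option names follow the table and renames above):

import { generateReport } from 'copc-validator'

async function deepScan() {
  const report = await generateReport({
    source: 'path/to/example.copc.laz', // placeholder path
    options: {
      deep: true,         // CLI: --deep / -d
      mini: true,         // CLI: --mini / -m
      workers: 64,        // CLI: --workers / -w
      showProgress: true, // CLI: --progress / -p (not usable in a browser)
    },
  })
  console.log(report.checks)
}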

Scans

COPC Validator comes with two scan types: shallow and deep.

(see requirements.md for more details)

The report output also supports a custom scan type, intended for other developers who may extend the base functionality of COPC Validator. It is not currently used anywhere in this library.

Shallow scan

The shallow scan checks the LAS Public Header Block and various Variable Length Records (VLRs) to ensure the values adhere to the COPC specifications.

This scan also checks the root (first) point of every node in the COPC Hierarchy to ensure those points are valid according to the contents of the Las Header and COPC Info VLR.

Deep scan

The deep scan performs the same checks as a shallow scan, but scans every point of each node rather than just the root point, in order to validate the full contents of the Point Data Records (PDRs) against the COPC specs and Header info.

Output

COPC Validator outputs a JSON report according to the Report Schema, intended to be translated into a more human-readable format (such as a PDF or webpage summary).
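
As a rough sketch of such a translation (assuming Node.js and a report previously saved with the CLI's --output option; 'report.json' is a placeholder path):

import { readFileSync } from 'fs'

// Summarize the `checks` array of a saved Report into a one-line console message
const report = JSON.parse(readFileSync('report.json', 'utf8'))
const failed = report.checks.filter((c: { status: string }) => c.status === 'fail')
const warned = report.checks.filter((c: { status: string }) => c.status === 'warn')
console.log(`${report.name} (${report.scan.type} scan): ${failed.length} failed, ${warned.length} warned`)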

Details

Checks

A Check ultimately refers to the Object created by calling a Check.Function with performCheck(), which uses the Check.Suite property name to build the returned Check.Status into a complete Check.Check. This already feels like a bit much, without even mentioning Check.Parsers or Check.Collections, so we'll break it down piece-by-piece here

Pseudo-TypeScript:

namespace Check {
  type Status = {
    status: 'pass' | 'fail' | 'warn'
    description: string
    info?: string
  }
  type Check = Status & { id: string }

  type Function<T> =
    | (c: T) => Status
    | (c: T) => Promise<Status>

  type Suite<T> = { [id: string]: Check.Function<T> }
  type SuiteWithSource<T> = { source: T, suite: Suite<T>}

  type Parser<Source, Parsed> = (s: Source) => Promise<SuiteWithSource<Parsed>>

  type Collection = (SuiteWithSource<any> | Promise<SuiteWithSource<any>>)[]
}
type Check = Check.Check

See ./src/types/check.ts for the actual TypeScript code

Status & Check Objects

A Check.Status Object contains a status property with a value of "pass", "fail", or "warn", and optionally contains an info property with a string value.

A Check Object is the same as a Status Object with an additional string property named id

pass means the file definitely matches the COPC specifications
fail means the file does not meet the COPC specifications
warn means the file may not match current COPC specifications or recommendations
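
For example, a Check produced for the headerLength check (from the table below) might look like this hypothetical literal:

// a hypothetical Check.Status, as returned by a Check.Function
const status = { status: 'pass', info: 'headerLength is 375' }

// the corresponding Check object, once performCheck() attaches the Suite's id
const check = { id: 'headerLength', ...status }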

Functions

Check.Functions maintain the following properties:

  • Single (Object) parameter
  • Synchronous or asynchronous
  • Output: Check.Status (or a Promise)
  • Pure function
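
A minimal sketch of a Function following these rules (illustrative only, not the library's actual implementation):

// hypothetical synchronous Check.Function: one object parameter in, one Check.Status out, no side effects
const headerLength = (header: { headerLength: number }) =>
  header.headerLength === 375
    ? { status: 'pass' as const }
    : { status: 'fail' as const, info: `headerLength is ${header.headerLength}, expected 375` }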

Suites

A Check.Suite is a map of string ids to Check.Functions, where each Function uses the same Object as its parameter (such as the Copc Object, for ./src/suites/copc.ts). The id of a Function becomes the id value of the Check Object when a Check.Suite invokes its Functions

The purpose of this type of grouping is to limit the number of Getter calls for the same section of a file, such as the 375-byte Header.

All Suites (with their Check.Functions) are located under src/suites
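
A sketch of a small Suite over a simplified header object (the real Suites under src/suites operate on the full copc.js objects):

// hypothetical Check.Suite: every Function receives the same header object,
// and each key becomes the `id` of the resulting Check
type MiniHeader = { minorVersion: number; pointDataRecordFormat: number }
const headerSuite = {
  minorVersion: (h: MiniHeader) =>
    h.minorVersion === 4 ? { status: 'pass' as const } : { status: 'fail' as const },
  pointDataRecordFormat: (h: MiniHeader) =>
    [6, 7, 8].includes(h.pointDataRecordFormat)
      ? { status: 'pass' as const }
      : { status: 'fail' as const },
}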

Parsers

Check.Parsers are functions that take a source Object and return a Check.SuiteWithSource Object. Their main purpose is to parse a section of the given file into a usable object, then return that object with its corresponding Suite to be invoked from within a Collection.

All Parsers are located under src/parsers (ex: nodeParser)
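
A rough sketch of a Parser, assuming copc.js's Getter and Las.Header.parse() (the one-check suite shown is a stand-in, not one of the library's actual Suites):

import { Getter, Las } from 'copc'

// hypothetical Check.Parser: read & parse the 375-byte Public Header Block,
// then return it alongside the Suite that should be invoked against it
const headerParser = async (get: Getter) => {
  const header = Las.Header.parse(await get(0, 375))
  return {
    source: header,
    suite: {
      headerLength: ({ headerLength }: Las.Header) =>
        headerLength === 375 ? { status: 'pass' as const } : { status: 'fail' as const },
    },
  }
}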

nodes.ts

src/parsers/nodes.ts is unique among Parsers, in that it's actually running a Suite repeatedly as it parses. However, the data is not returned from the multithreaded Workers like a regular Check.Suite, so nodes.ts then gives the output data to the (new) pointDataSuite for sorting into Check.Statuses

src/utils/worker.js essentially matches the structure of a Suite because it used to be the src/suites/point-data.ts Suite. To increase speed, the pointDataSuite became per-Node instead of per-File, which maximizes multi-threading, but creates quite a mess since worker.js must be (nearly) entirely self-contained for Worker/Web Worker threading. So src/suites/point-data.ts now parses the output of src/utils/worker.js, all of which is controlled by the src/parsers/nodes.ts Parser

Collections

Check.Collections are arrays of Check.Suites with their respective source Object (Check.SuiteWithSource above). They allow Promises in order to use Check.Parsers internally without having to await them.

All Collections are located under src/collections (ex: CopcCollection)

Replacing Collections is the primary way of generating custom reports through generateReport, as you can supply different Check.Suites to perform different Check.Functions per source object.

Custom scan

generateReport has functionality to build customized reports by overwriting the Check.Collections used within:

Pseudo-Type:

import type {Copc, Getter, Las} from 'copc'
type Collections = {
  copc: ({
    filepath: string,
    copc: Copc,
    get: Getter,
    deep: boolean,
    workerCount?: number
  }) => Promise<Check.Collection>,
  las: ({
    get: Getter,
    header: Las.Header,
    vlrs: Las.Vlr[]
  }) => Promise<Check.Collection>,
  fallback: (get: Getter) => Promise<Check.Collection>
}

const generateReport = async ({
  source: string | File,
  options?: {...},
  collections?: Collections
}) => Promise<Report>
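
A loose sketch of overriding collections (all three properties are supplied; the suites here are trivial stand-ins, and the exact Status shape follows the pseudo-types above):

import type { Copc } from 'copc'
import { generateReport } from 'copc-validator'

async function customScan() {
  const report = await generateReport({
    source: 'path/to/example.copc.laz', // placeholder path
    collections: {
      // a custom COPC Collection: one Suite invoked against the parsed Copc object
      copc: async ({ copc }) => [
        {
          source: copc,
          suite: {
            minorVersion: (c: Copc) =>
              c.header.minorVersion === 4
                ? { status: 'pass' as const }
                : { status: 'fail' as const },
          },
        },
      ],
      // the LAS and fallback Collections are left empty in this sketch
      las: async () => [],
      fallback: async () => [],
    },
  })
  console.log(report.checks)
}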

All Checks

| ID | Description | Scan | Suite |
| --- | --- | --- | --- |
| minorVersion | copc.header.minorVersion is 4 | Shallow | Header |
| pointDataRecordFormat | copc.header.pointDataRecordFormat is 6, 7, or 8 | Shallow | Header |
| headerLength | copc.header.headerLength is 375 | Shallow | Header |
| pointCountByReturn | Sum of copc.header.pointCountByReturn equals copc.header.pointCount | Shallow | Header |
| legacyPointCount | header.legacyPointCount follows COPC/LAS specs | Shallow | manualHeader |
| legacyPointCountByReturn | header.legacyPointCountByReturn follows COPC/LAS specs | Shallow | manualHeader |
| vlrCount | Number of VLRs in copc.vlrs matches copc.header.vlrCount | Shallow | Vlr |
| evlrCount | Number of EVLRs in copc.vlrs matches copc.header.evlrCount | Shallow | Vlr |
| copc-info | Exactly 1 copc info VLR exists with size of 160 | Shallow | Vlr |
| copc-hierarchy | Exactly 1 copc hierarchy VLR exists | Shallow | Vlr |
| laszip-encoded | Checks for existence of LasZIP compression VLR, warns if not found | Shallow | Vlr |
| wkt | Ensures wkt string can initialize proj4 | Shallow | manualVlr |
| bounds within cube | Copc cube envelops Las bounds (min & max) | Shallow | Copc |
| rgb | RGB channels are used in PDR, if present | Shallow | PointData |
| rgbi | Checks for 16-bit scaling of RGBI values, warns if 8-bit | Shallow | PointData |
| xyz | Each point exists within Las and Copc bounds, per node | Shallow | PointData |
| gpsTime | Each point has GpsTime value within Las bounds | Shallow | PointData |
| sortedGpsTime | The points in each node are sorted by GpsTime value, warns if not | Deep | PointData |
| returnNumber | Each point has ReturnNumber <= NumberOfReturns | Shallow | PointData |
| zeroPoint | Warns with list of all pointCount: 0 nodes in the Hierarchy | Deep* | PointData |
| nodesReachable | Every Node ('D-X-Y-Z') in the Hierarchy is reachable | Shallow | PointData |
| pointsReachable | Each Node pageOffset + pageLength leads into another Node page | Shallow | PointData |
| ...ID... | ...Description | Shallow | ... |

Checks and their IDs are subject to change as I see fit

Report

Report schema

See JSON Schema

TypeScript pseudo-type Report:

import * as Copc from 'copc'

type Report = {
  name: string
  scan: {
    type: 'shallow' | 'deep' | 'custom' | string  //| 'shallow-X/N' | 'deep-X/N'
    filetype: 'COPC' | 'LAS' | 'Unknown'
    start: Date
    end: Date
    time: number
  }
  checks: ({
    id: string
    status: 'pass' | 'fail' | 'warn'
    info?: string
  })[]

  // When scan.filetype === 'COPC'
  copc?: {
    header: Copc.Las.Header
    vlrs: Copc.Las.Vlr[]
    info: Copc.Info
    wkt: string
    eb: Copc.Las.ExtraBytes
  }

  // When scan.filetype === 'LAS'
  las?: {
    header: Copc.Las.Header
    vlrs: Copc.Las.Vlr[]
  }
  error: {
    message: string
    stack?: string
  }

  // When scan.filetype === 'Unknown'
  error: {
    message: string
    stack?: string
  }
  copcError?: {
    message: string
    stack?: string
  } // only used if Copc.create() and Las.*.parse() fail for different reasons
}

Future Plans

  • Add more Check.Functions - waiting on laz-perf chunk table
  • Rewrite LAS Check.Collection to validate LAS 1.4 specifications
  • Continue to optimize for speed, especially for large (1.5GB+) files