pdw v0.0.32
pdw is a library for creating quantified self applications. It's intended to be platform-agnostic with regard to the underlying database and the presentation layer. It handles the core logic, and provides interfaces out to whatever databases and whatever UI components you want to build.
flowchart LR
A("Database A")
H("Database B")
subgraph "Personal Data Warehouse"
B("DataStore
Plugins"):::special
C("PDW Library"):::special
end
D("UI Components")
F("Microservice APIs")
A --> B
B --> C
C --> F
C --> D
H --> B
classDef special fill:#46d
The pdw library exists because I got tired of starting from scratch every time I wanted to migrate my personal quantified self system from one platform to another, and because I wanted a common backbone of code that made it easier to work with disparate datasets federated across multiple databases.
Concepts
#TODO - all this needs rewritten somewhat now that I'm treating all Elements as wrappers for ElementData objects.
The library is built around a few core data structures: Def, PointDef, Entry, EntryPoint, Tag and Period
- A record of a thing that happened is called an Entry
- An Entry takes place during a time Period
- An Entry contains zero or more EntryPoints, which are data associated with the event that occurred
- An EntryPoint has a specified data type called a PointType
- An Entry has exactly one associated Def, which describes what the Entry means
- Each Def contains zero or more PointDefs, which describe the EntryPoints contained by the Entry associated with that Def
- Entry, Def, and Tag are all generically called Elements, because they are not contained by anything and extend the same abstract base class
- A DataStore is a file or database that persists Elements in storage
- A Query searches a DataStore for Entries that match a set of StandardParams
- A list of Entries can have their EntryPoint values summarized using Rollups
👉 Defs contain PointDefs and describe Entries, which take place in Periods
An Entry is a record of a thing that occurred during a given period of time (Period). A Definition (or Def) describes a type of Entry that can exist. Every Entry has exactly one definition. A Def contains zero or more PointDefs. A PointDef describes a key/value pair (called an EntryPoint) that can exist on its associated Entry. EntryPoints may be of any type. Each Entry is associated with exactly one Period; Periods range in granularity from seconds to years. Running a Query returns the Entry instances that match the query's parameters.
Simplified Conceptual Example
For the purposes of introduction, this section omits certain data properties so as not to obscure the main role of each data type.
Entry
An example entry, with some properties omitted for clarity.
{
_did: "rxyb",
_period: "2023-08-20",
aaaa: 10,
bbbb: "Had a good day. Ate a *stellar* hamburger. Watched The Avengers."
}
Related Definition
Definition for the example entry, also with some properties omitted for clarity.
{
_did: "rxyb",
_scope: "DAY",
_lbl: "Nightly Review",
_desc: "A journal entry I make every night before bed",
_pts: [
{
_pid: "aaaa",
_lbl: "Satisfaction",
_desc: "How happy are you with the day, on a scale from 1 to 10?",
_type: "NUMBER"
},
{
_pid: "bbbb",
_lbl: "Journal",
_desc: "A paragraph about your day. What you did. How you felt.",
_type: "MARKDOWN"
},
]
}
The Def describes the Entry because they share the same _did value. The _pid values of the PointDefs in the Def._pts array match the keys of the Entry that are not prefixed with an underscore. This means those key/value pairs are EntryPoints, each described by its matching PointDef.
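The matching rule above can be sketched in TypeScript. This is only an illustration, not pdw's actual API: the `DefLike`, `PointDefLike`, and `EntryLike` shapes and the `describeEntry` helper are hypothetical names that carry just the properties shown in the simplified example.

```typescript
// Hypothetical, pared-down shapes based on the simplified example above.
interface PointDefLike {
  _pid: string;
  _lbl: string;
  _desc: string;
  _type: string;
}

interface DefLike {
  _did: string;
  _scope: string;
  _lbl: string;
  _desc: string;
  _pts: PointDefLike[];
}

// Underscore-prefixed keys are Entry properties; every other key is an EntryPoint.
type EntryLike = { _did: string; _period: string } & Record<string, unknown>;

interface DescribedPoint {
  pid: string;
  label: string;
  type: string;
  value: unknown;
}

// Pair each EntryPoint on the Entry with the PointDef that describes it.
function describeEntry(entry: EntryLike, def: DefLike): DescribedPoint[] {
  if (entry._did !== def._did) {
    throw new Error(`Def "${def._did}" does not describe an Entry with _did "${entry._did}"`);
  }
  const described: DescribedPoint[] = [];
  for (const key of Object.keys(entry)) {
    if (key.charAt(0) === "_") continue; // an Entry property, not an EntryPoint
    const pointDef = def._pts.filter((pd) => pd._pid === key)[0] as PointDefLike | undefined;
    if (pointDef === undefined) {
      throw new Error(`Def "${def._did}" has no PointDef with _pid "${key}"`);
    }
    described.push({ pid: key, label: pointDef._lbl, type: pointDef._type, value: entry[key] });
  }
  return described;
}

const exampleDef: DefLike = {
  _did: "rxyb",
  _scope: "DAY",
  _lbl: "Nightly Review",
  _desc: "A journal entry I make every night before bed",
  _pts: [
    { _pid: "aaaa", _lbl: "Satisfaction", _desc: "How happy are you with the day?", _type: "NUMBER" },
    { _pid: "bbbb", _lbl: "Journal", _desc: "A paragraph about your day.", _type: "MARKDOWN" },
  ],
};

const exampleEntry: EntryLike = {
  _did: "rxyb",
  _period: "2023-08-20",
  aaaa: 10,
  bbbb: "Had a good day.",
};

// Pairs "aaaa" with the "Satisfaction" PointDef and "bbbb" with "Journal".
const describedPoints = describeEntry(exampleEntry, exampleDef);
```

Throwing on an unknown key is a choice made for this sketch; a real implementation might prefer to collect such keys and report them instead.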
Principles
- Don’t delete. Don’t update*. Create new & mark old as deleted.
  - *updates allowed to: _deleted, _updated
- Surrogate IDs for everything!
  - _did, _uid, _eid, _pid
  - ID property values may never be changed once they are established
- Case doesn't matter. Everything gets trimmed.
- Timezones suck and are not included in Periods
- Data density is important! Don’t store redundant data
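As a rough illustration of the first principle, the sketch below "updates" an element by marking the old copy as deleted and appending a new one. The `ElementLike` shape and the `supersede` helper are hypothetical names, not pdw's API. The caller supplies the new `_uid` and the timestamp, because this sketch makes no assumption about how pdw generates either. Giving the new copy a fresh `_created` value is likewise an assumption.

```typescript
// Hypothetical base shape shared by every element in this sketch.
interface ElementLike {
  _uid: string;
  _created: string;
  _updated: string;
  _deleted: boolean;
}

// Returns a new list in which the element with `oldUid` is marked deleted
// and a changed copy with `newUid` is appended. Nothing is mutated in place.
function supersede<T extends ElementLike>(
  store: T[],
  oldUid: string,
  changes: Partial<T>,
  newUid: string,
  now: string
): T[] {
  const result: T[] = [];
  let replacement: T | undefined;
  for (const el of store) {
    if (el._uid === oldUid && !el._deleted) {
      // The only changes allowed on an existing element: _deleted and _updated.
      result.push({ ...el, _deleted: true, _updated: now });
      // Assumption: the new copy gets its own _uid and fresh timestamps.
      replacement = { ...el, ...changes, _uid: newUid, _created: now, _updated: now, _deleted: false };
    } else {
      result.push(el);
    }
  }
  if (replacement === undefined) {
    throw new Error(`No live element with _uid "${oldUid}"`);
  }
  result.push(replacement);
  return result;
}

// Usage with a made-up element type that adds a label.
interface LabelledElement extends ElementLike {
  _lbl: string;
}

const storeBefore: LabelledElement[] = [
  { _uid: "u1", _created: "t1", _updated: "t1", _deleted: false, _lbl: "Old label" },
];
const storeAfter = supersede<LabelledElement>(storeBefore, "u1", { _lbl: "New label" }, "u2", "t2");
```

The old copy stays in the list with its original label, so the history of a value remains recoverable.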
Full Data Structure
The full data structure includes a few more properties that add features and make it possible to combine data from different datasets without duplication. The Entry and Def interfaces extend an abstract base interface called Element.
erDiagram
DEF{
SmallID _did
string _lbl
string _emoji
string _desc
enum _scope
string[] _tags
}
ENTRY {
SmallID _did
UID _eid
PeriodStr _period
string _note
string _source
}
POINTDEF{
SmallID _pid
string _lbl
string _emoji
string _desc
enum _type
enum _rollup
}
ELEMENT{
UID _uid
string _created
string _updated
boolean _deleted
}
DEF ||--o{ POINTDEF : owns
DEF ||--o{ ENTRY : describes
DEF ||--|| ELEMENT : is
ELEMENT ||--|| ENTRY : is
ELEMENT ||--|| TAG : is
ENTRYPOINT {
any pid "<- key is value of _pid from PointDef"
}
ENTRY ||--o{ ENTRYPOINT : owns
POINTDEF ||--o{ ENTRYPOINT : describes
Element
Entries, Definitions, and Tags are all “Elements”, because they all extend the Element interface:
{
_uid: "ekdjwjsn-pwl8",
_created: "ekdjwjsn",
_updated: "ekdjwjsn",
_deleted: false
}
These properties enable the PDW to uniquely identify individual instances of each element (using the _uid), manage data updates, and handle merges of elements that might exist in multiple datasets.
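To make the merge idea concrete, here is a minimal sketch of combining two datasets without duplicating elements that share a `_uid`. It is not pdw's merge logic: `ElementLike` and `mergeElements` are hypothetical names, and the caller passes in `isNewer` because nothing above specifies how `_updated` values should be compared.

```typescript
// Hypothetical base shape; mirrors the Element properties shown above.
interface ElementLike {
  _uid: string;
  _created: string;
  _updated: string;
  _deleted: boolean;
}

// Combine two lists, keeping one copy per _uid. When both lists hold the
// same _uid, the copy that `isNewer` prefers wins.
function mergeElements<T extends ElementLike>(
  left: T[],
  right: T[],
  isNewer: (candidate: T, current: T) => boolean
): T[] {
  const byUid: Record<string, T> = {};
  const order: string[] = []; // remembers first-seen order of each _uid
  for (const el of left.concat(right)) {
    const current = byUid[el._uid] as T | undefined;
    if (current === undefined) {
      order.push(el._uid);
      byUid[el._uid] = el;
    } else if (isNewer(el, current)) {
      byUid[el._uid] = el;
    }
  }
  return order.map((uid) => byUid[uid]);
}

// Illustration only: this comparison ASSUMES that _updated strings sort
// chronologically when compared as plain strings.
const updatedLater = (a: ElementLike, b: ElementLike): boolean => a._updated > b._updated;

const datasetA: ElementLike[] = [
  { _uid: "x", _created: "t1", _updated: "t1", _deleted: false },
  { _uid: "y", _created: "t1", _updated: "t1", _deleted: false },
];
const datasetB: ElementLike[] = [
  { _uid: "x", _created: "t1", _updated: "t2", _deleted: true },
  { _uid: "z", _created: "t2", _updated: "t2", _deleted: false },
];
const mergedElements = mergeElements(datasetA, datasetB, updatedLater);
```

Because only `_deleted` and `_updated` may change on an existing element, two copies with the same `_uid` can differ only in those two properties, which is what makes a keep-the-newer rule workable here.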
Def & PointDef
Definitions describe Entries. They may or may not contain Point Definitions, which would describe EntryPoints. Every definition contains a Definition ID (_did), label (_lbl), and level of granularity (_scope). Every PointDefinition contains a Point ID (_pid), its own label (_lbl), a point type (_type), a default rollup (_rollup), and a list of enumeration options (_opts).
{
_uid: "ekdjwjsn-pwl8",
_created: "ekdjwjsn",
_updated: "ekdjwjsn",
_deleted: false,
_did: "3pbm",
_lbl: "Exercised",
_emoji: "🏃‍♀️",
_desc: "Broke a sweat in the name of fitness",
_scope: "SECOND",
_tags: ["Health"],
_pts: [
{
_pid: "bw7k",
_lbl: "Workout Type",
_emoji: "🏋️‍♀️",
_desc: "The genre of exercise you did",
_type: "ENUM",
_rollup: "COUNTEACH",
_opts: {
"2akb": "Strength",
"maaf": "Cardio",
"82p9": "Mobility"
}
}
]
}
Entry
An Entry is a record of an event that happened during some period of time. It is associated with exactly one defining Def, with which it shares the same _did value. All entries must have a _period, which is an ISO8601-formatted string that does not contain timezone information. All entries also have standard _source and _note properties, which default to an empty string.
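One way to picture timezone-free period strings at different granularities is to infer the granularity from the string's shape. The sketch below is an assumption-heavy illustration, not pdw's parser: `inferScope` is a hypothetical helper, only the "DAY" and "SECOND" scope names appear in this document, and the other names ("YEAR", "MONTH", "HOUR", "MINUTE") are guesses. It also covers only the plain calendar forms, not ISO week notation.

```typescript
// Each pattern matches one timezone-free ISO 8601 shape.
// Scope names other than "DAY" and "SECOND" are assumed for illustration.
const PERIOD_PATTERNS: Array<[string, RegExp]> = [
  ["YEAR", /^\d{4}$/],
  ["MONTH", /^\d{4}-\d{2}$/],
  ["DAY", /^\d{4}-\d{2}-\d{2}$/],
  ["HOUR", /^\d{4}-\d{2}-\d{2}T\d{2}$/],
  ["MINUTE", /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}$/],
  ["SECOND", /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$/],
];

// Returns the matching scope name, or undefined when the string is not one
// of the shapes above (for example, when it carries a timezone suffix).
function inferScope(period: string): string | undefined {
  for (const [scope, pattern] of PERIOD_PATTERNS) {
    if (pattern.test(period)) return scope;
  }
  return undefined;
}

const dayScope = inferScope("2023-08-20"); // "DAY"
const secondScope = inferScope("2028-08-22T17:30:50"); // "SECOND"
const withTimezone = inferScope("2028-08-22T17:30:50Z"); // undefined
```

The patterns check shape only; they do not reject impossible dates such as month 13.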
Entry properties that are not prefixed with an underscore are EntryPoints. Their key should correspond to the _pid of a PointDef contained by the associated Def. The value of the EntryPoint should correspond with the PointDef’s _type enum value.
{
_uid: "ekdppjxx-28si",
_created: "ekdppjxx",
_updated: "ekdppjxx",
_deleted: false,
_did: "3pbm",
_note: "1.5 miles. sweat a ton",
_period: "2028-08-22T17:30:50",
_source: "iOS Shortcuts",
"bw7k": "maaf"
}
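The example above stores the ENUM value as an option key ("maaf") rather than its label, and the PointDef names "COUNTEACH" as its default rollup. The sketch below shows one plausible reading of that rollup, which is to count how many entries carry each option. That interpretation is an assumption, and `countEach`, `EnumPointDefLike`, and `EntryLike` are hypothetical names rather than pdw's API.

```typescript
// Hypothetical, pared-down shapes based on the examples above.
interface EnumPointDefLike {
  _pid: string;
  _lbl: string;
  _type: string;
  _rollup: string;
  _opts: Record<string, string>; // option key -> option label
}

type EntryLike = { _did: string; _period: string } & Record<string, unknown>;

// Tally how many entries carry each option of an ENUM point, keyed by label.
// Entries with no value for the point are skipped.
function countEach(entries: EntryLike[], pointDef: EnumPointDefLike): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of entries) {
    const raw = entry[pointDef._pid];
    if (typeof raw !== "string") continue;
    const label: string | undefined = pointDef._opts[raw];
    if (label === undefined) {
      throw new Error(`"${raw}" is not an option of point "${pointDef._pid}"`);
    }
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}

const workoutType: EnumPointDefLike = {
  _pid: "bw7k",
  _lbl: "Workout Type",
  _type: "ENUM",
  _rollup: "COUNTEACH",
  _opts: { "2akb": "Strength", maaf: "Cardio", "82p9": "Mobility" },
};

const workouts: EntryLike[] = [
  { _did: "3pbm", _period: "2028-08-22T17:30:50", bw7k: "maaf" },
  { _did: "3pbm", _period: "2028-08-23T07:05:00", bw7k: "maaf" },
  { _did: "3pbm", _period: "2028-08-24T18:10:12", bw7k: "2akb" },
];

// Two "Cardio" entries and one "Strength" entry.
const workoutCounts = countEach(workouts, workoutType);
```

Storing the option key instead of the label means a label can be reworded later without touching any existing Entry.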