group-stats-event-schemas v1.1.0
Event schemas
This repo contains all the schemas used across Domain (currently only group-stats events) under the `schemas` folder.
- Each schema has a versioned schema (JSON Schema) and an accompanying samples file under a folder with its name.
- Changes to schemas should be made by adding a new version following semver guidelines. The following release types are supported:
  - `patch`: if it is fixing an existing validation issue. This is a non-breaking change.
  - `minor`: if it is adding new properties. This is a non-breaking change.
  - `major`: if it is making a breaking change.
- Any new PR should be based against `stage` first and then `master`. The build will fail if you do otherwise. The `master` branch is used in production and the `stage` branch for the staging environment.
- Merging to `stage` or `master` makes the schemas available via the following HTTP endpoints immediately (if the builds pass):

| Endpoint | Staging | Production | Note |
| --- | --- | --- | --- |
| GET Schema | https://stage-event-schemas.domain.com.au/v1/group-stats/AdvertView/2.0.6 | https://event-schemas.domain.com.au/v1/group-stats/AdvertView/2.0.6 | Use this endpoint to retrieve a published schema for a given schema key and version |
| GET Meta | https://stage-event-schemas.domain.com.au/v1/group-stats/AdvertView/meta | https://event-schemas.domain.com.au/v1/group-stats/AdvertView/meta | Use this endpoint to retrieve meta info for a given schema key |
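The URL pattern in the endpoint table can be sketched as a small helper. This is only an illustration: the names `BASE_URLS`, `schemaUrl` and `metaUrl` are hypothetical, and the `/v1/<schema key>/<version>` pattern is inferred from the example URLs in the table rather than from an API reference.

```javascript
// Base URLs copied from the endpoint table.
const BASE_URLS = {
  stage: "https://stage-event-schemas.domain.com.au",
  production: "https://event-schemas.domain.com.au",
};

// Hypothetical helper: look up the base URL for "stage" or "production".
function baseUrl(env) {
  if (!Object.prototype.hasOwnProperty.call(BASE_URLS, env)) {
    throw new Error(`Unknown environment: ${env}`);
  }
  return BASE_URLS[env];
}

// URL of a published schema for a schema key (e.g. "group-stats/AdvertView") and version.
function schemaUrl(env, schemaKey, version) {
  return `${baseUrl(env)}/v1/${schemaKey}/${version}`;
}

// URL of the meta info for a schema key.
function metaUrl(env, schemaKey) {
  return `${baseUrl(env)}/v1/${schemaKey}/meta`;
}

console.log(schemaUrl("stage", "group-stats/AdvertView", "2.0.6"));
console.log(metaUrl("production", "group-stats/AdvertView"));
```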
How to add a new schema or update an existing schema?
Prerequisites (Local dev setup)
- Node 10 or higher
- yarn
- Basic knowledge of JSON Schema
Steps to add/update a schema
- `git clone` the repo to a local directory on your machine.
- `cd` into the local dir.
- Make a new branch for your changes based off the `stage` branch.
- Run `yarn install` in the root directory of the locally cloned repo.
- Run `yarn run new group-stats/YourEventType` to add a new event `YourEventType` to the `group-stats` dir with a default minimal schema, samples, `README` and a `meta` file. If `YourEventType` already exists, it will copy the latest version and create a new version of the schema and sample inside the same directory. Additionally, you can pass in the release type (`patch`, `minor`, `major`).
- The meta (yaml) file can be used to provide meta information:
  - `teamName`: Provide your team name. If there are multiple teams, make it an array.
  - `deprecatedVersions`: If you decide to deprecate any versions, add them to the array here. Keep in mind that not all event producers may be using the same version. Some event producers could be using an older version of the event schema.
  - `criticality`: One of `high`, `medium`, `low`. Talk to the Data Activation team to find out if your event is critical. Critical events have higher monitoring and a stricter review process.
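As a rough illustration of how the release type (`patch`, `minor`, `major`) relates to the next version number, the sketch below applies the standard semver bump rules. The function `bumpVersion` is hypothetical and is not the implementation behind `yarn run new`.

```javascript
// Hypothetical helper: compute the next version for a given release type
// using the usual semver rules (lower parts reset to 0 on a bump).
function bumpVersion(version, releaseType) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a plain semver version: ${version}`);
  const [major, minor, patch] = match.slice(1).map(Number);
  switch (releaseType) {
    case "major": // breaking change
      return `${major + 1}.0.0`;
    case "minor": // new properties, non-breaking
      return `${major}.${minor + 1}.0`;
    case "patch": // validation fix, non-breaking
      return `${major}.${minor}.${patch + 1}`;
    default:
      throw new Error(`Unknown release type: ${releaseType}`);
  }
}

console.log(bumpVersion("2.0.6", "patch")); // 2.0.7
console.log(bumpVersion("2.0.6", "minor")); // 2.1.0
console.log(bumpVersion("2.0.6", "major")); // 3.0.0
```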
Test your changes
- `yarn check-schemas`: Run checks against all schemas.
- `yarn check-schemas:single group-stats/AdvertView`: Run checks against a single schema key. This is only for local testing, so you can test your schemas quickly. Replace `group-stats/AdvertView` with your schema key.
Submit your changes for review
- Commit and push your changes.
- Create a PR based against the `stage` branch, title it `YourEventType@Version` and let us know at #event-schemas-prs.
Event versions
All event schemas are versioned and once published cannot be changed. Any change needs to be done by adding a new version.
All the `1.*.*` versions of group stats event schemas are auto-generated. So far, all events have used either `2.0` or `v2.0` in the `EventVersion` property of their payload.
So they are mapped to the latest `1.*.*` version. All new schemas added manually should begin from version `2.0.1`, and that version should be used in the `EventVersion` property of the event payload.
Semver ranges in EventVersion
When sending events to group stats, you can use the semver range format to target schema versions. For example, you can use `^2`, `~2`, etc. instead of specifying a fixed schema version. Group stats will resolve the range to the highest version that satisfies it.
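To make the resolution rule concrete, here is a minimal sketch of "highest satisfying version" for caret, tilde and exact versions. It is not the group stats resolver: the function names and the `published` list are hypothetical, it follows the common node-semver interpretation of `^` and `~`, and it deliberately ignores prerelease tags, comparators such as `>=`, and caret ranges below `1.0.0`.

```javascript
// Parse "2", "2.1" or "2.1.3" into [major, minor, patch]; missing parts become null.
function parsePartial(text) {
  const match = /^(\d+)(?:\.(\d+))?(?:\.(\d+))?$/.exec(text);
  if (!match) throw new Error(`Unsupported version: ${text}`);
  return match.slice(1, 4).map((part) => (part === undefined ? null : Number(part)));
}

// Compare two full [major, minor, patch] triples: negative, zero or positive.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Turn "^2", "~2.1" or "2.0.6" into an inclusive lower and exclusive upper bound.
function toBounds(range) {
  const operator = range[0] === "^" || range[0] === "~" ? range[0] : "";
  const [major, minor, patch] = parsePartial(range.slice(operator.length));
  if (operator === "^" && major === 0) {
    throw new Error("Caret ranges below 1.0.0 are not handled in this sketch");
  }
  const lower = [major, minor || 0, patch || 0];
  let upper;
  if (operator === "^" || minor === null) {
    upper = [major + 1, 0, 0]; // ^2, ^2.1, ^2.1.3, ~2 and bare "2"
  } else if (operator === "~" || patch === null) {
    upper = [major, minor + 1, 0]; // ~2.1, ~2.1.3 and bare "2.1"
  } else {
    upper = [major, minor, patch + 1]; // exact version such as "2.1.3"
  }
  return { lower, upper };
}

// Return the highest published version inside the range, or null if none matches.
function maxSatisfying(versions, range) {
  const { lower, upper } = toBounds(range);
  let best = null;
  for (const version of versions) {
    const parsed = parsePartial(version);
    if (parsed.includes(null)) throw new Error(`Not a full version: ${version}`);
    const inRange = compare(parsed, lower) >= 0 && compare(parsed, upper) < 0;
    if (inRange && (best === null || compare(parsed, parsePartial(best)) > 0)) {
      best = version;
    }
  }
  return best;
}

// Hypothetical list of published versions for one schema key.
const published = ["1.0.3", "2.0.1", "2.0.6", "2.1.0", "3.0.0"];
console.log(maxSatisfying(published, "^2")); // 2.1.0
console.log(maxSatisfying(published, "~2.0")); // 2.0.6
console.log(maxSatisfying(published, "2.0.1")); // 2.0.1
```

One practical consequence: with `^2`, a producer picks up new `minor` and `patch` versions automatically, while `~2.0` only follows `patch` releases of `2.0`.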
Testing your schemas on stage
Once your PR is merged to stage, your schema is available on stage for you to test. You can send events to group stats on the stage environment using the new version and verify that it works as expected.
- Produce some events from your service on stage.
- Or, if you don't have a service set up already, produce events manually:
  - Go to https://internal-stage-statistics-api.domain.com.au/swagger
  - In the Swagger UI, "Try it out" the `PUT /v2/stats/event` endpoint.
  - In the body, paste your event body, change the event type and version to your own event type and version, and hit the Execute button. See the payload below.
{
  "Data": {
    "ClientType": "Website - Desktop",
    "EventGeneratedTimestamp": 1589256320255,
    "EventType": "SurveyResponded", << THIS IS YOUR EVENT TYPE
    "EventVersion": "^2", << THIS IS YOUR NEW VERSION, you can use semver range format or use a specific version
    "MetaData": {
      "Context": "search-result",
      "EventProvider": "domain-survey-api",
      "GAClientId": "420940774.1573787940",
      "RespondentId": "b74d52aa-5121-4fb0-8606-bd5bfbbf0aa4",
      "SurveyId": "sur_1",
      "SurveyItemId": "ite_sc_first_home",
      "SurveyItemOptionId": "opt_1",
      "SurveyVersion": "v0",
      "UserToken": "b74d52aa-5121-4fb0-8606-bd5bfbbf0aa4"
    }
  }
}
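If you prefer a script over the Swagger UI, the same request could be assembled along these lines. This is a sketch under assumptions: `buildEventRequest` and `STAGE_API` are hypothetical names, the API is assumed to be served from the same host as the Swagger UI, and a JSON `Content-Type` header is assumed. Only the `PUT /v2/stats/event` path and the payload shape come from the sample above.

```javascript
// Assumption: the API lives on the same host as the Swagger UI.
const STAGE_API = "https://internal-stage-statistics-api.domain.com.au";

// Hypothetical helper: build the URL and request options for one event.
function buildEventRequest(eventType, eventVersion, metaData) {
  const body = {
    Data: {
      ClientType: "Website - Desktop", // value copied from the sample payload
      EventGeneratedTimestamp: Date.now(), // epoch milliseconds, as in the sample
      EventType: eventType,
      EventVersion: eventVersion, // a fixed version or a semver range such as "^2"
      MetaData: metaData,
    },
  };
  return {
    url: `${STAGE_API}/v2/stats/event`,
    options: {
      method: "PUT",
      headers: { "Content-Type": "application/json" }, // assumed header
      body: JSON.stringify(body),
    },
  };
}

const request = buildEventRequest("SurveyResponded", "^2", {
  EventProvider: "domain-survey-api",
  SurveyId: "sur_1",
});
console.log(request.options.method, request.url);

// Sending it needs network access to stage and a runtime with fetch (e.g. Node 18+):
// fetch(request.url, request.options).then((res) => console.log(res.status));
```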
Use the Kibana links below to debug your events.

| Staging | Production | Note |
| --- | --- | --- |
| Schema Found | Schema Found | Use this to see logs when a schema is found. |
| Schema Not Found | Schema Not Found | Use this to see logs when a schema is not found. |
| Schema Version Not Found | Schema Version Not Found | Use this to see logs when a schema version is not found. |
| Validation Failed | Validation Failed | Use this to see logs by EventType/EventProvider when an event fails validation. |
| Dashboard | Dashboard | Combines all of the above into a single dashboard. |
NOTE: Events that pass validation are not logged, so you will not find them in the logs. Validation Failed logs are also not written for every failure: they are sampled by a given factor and throttled (not logged) if they exceed a certain rate per second. Other logs are not written every time either, because they are cached.
Known issue: misleading `additionalProperties` validation error
You may notice errors such as `#/ClientType: #/additionalProperties/$false: #/ClientType (All values fail against the false schema)`. This usually means that you have set `additionalProperties` to `false` but provided properties that are not part of the schema definition. However, there is a known issue where this error appears even when there are no additional properties. This happens only when validation fails for some other reason. If you fix the other errors first, this error goes away on its own.
Case sensitivity
Normally, event types are case insensitive. However, it is recommended to use the same casing, usually `PascalCasing`, every time you use an event type.
Definitions
If you have multiple schemas that share common sub-schemas or have a common base schema, you can use `definitions` to reuse them. For example, the `GAClientId` sub-schema is repeated in most of the schemas and is therefore `$ref`ed by the schemas from the definitions directory. These definitions are also versioned. This saves you from repeating the same definition in all your events. There is also automation for updating schemas to the latest definitions.
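The reuse idea can be illustrated with a local `$ref`. Everything in this sketch is hypothetical: the shape of the `GAClientId` definition, the event schema, and the `inlineDefinitions` helper. The repo references files in the definitions directory, whereas this example only resolves in-document `#/definitions/<name>` pointers and does not handle recursive definitions.

```javascript
// Hypothetical shared definition; the real ones live in the definitions directory.
const definitions = {
  GAClientId: { type: "string" },
};

// Hypothetical event schema that points at the shared definition instead of repeating it.
const eventSchema = {
  type: "object",
  properties: {
    GAClientId: { $ref: "#/definitions/GAClientId" },
    SurveyId: { type: "string" },
  },
};

// Return a copy of the schema with every local "#/definitions/<name>" reference
// replaced by the definition it names. The input schema is left unchanged.
function inlineDefinitions(node, defs) {
  if (Array.isArray(node)) return node.map((item) => inlineDefinitions(item, defs));
  if (node === null || typeof node !== "object") return node;
  if (typeof node.$ref === "string" && node.$ref.startsWith("#/definitions/")) {
    const name = node.$ref.slice("#/definitions/".length);
    if (!Object.prototype.hasOwnProperty.call(defs, name)) {
      throw new Error(`Unknown definition: ${name}`);
    }
    return inlineDefinitions(defs[name], defs);
  }
  const copy = {};
  for (const [key, value] of Object.entries(node)) {
    copy[key] = inlineDefinitions(value, defs);
  }
  return copy;
}

console.log(JSON.stringify(inlineDefinitions(eventSchema, definitions), null, 2));
```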
Event validation architecture
See the Confluence page.
Want to add schemas for GA/MixPanel?
Drop a message at #customer-data-platform to discuss. New groups can be added by creating another folder under the `schemas` folder and following the same pattern.
CI/CD
Builds run on Jenkins; see the `Jenkinsfile`. The build also uses `CloudFormation` to create/update the stack; see the `cloudformation.yml` file.
DR
- Change of region: Change the `AWS_REGION` env variable in the `Jenkinsfile` to the new region.
Architecture