ms-choices v3.3.0
MS Choices
Description
MS Choices is a router microservice that aggregates and fine-tunes scores from multiple services in order to return the best campaign to serve.
Ownership
- AdServer
Input / Output
This project exposes the following HTTP/HTTPS endpoints; input and output are in JSON format:
- /health: health check of the service
- /context: called by Beeswax to choose the best campaign
- /bestCampaigns: called by wsback to choose the best campaigns from a list
- /header-bidding: called by ms-bidder for header bidding
- /check-fraud, /check-brand-safety: called manually to check apps for fraud and to check brand safety for web (tutorial here)
Workflows
Dependencies
Services requesting ms-choices:
- wsback calls ms-choices-int (internal + header-bidding in-app)
- ms-bidder calls ms-choices-int (internal + header-bidding web)
- Beeswax calls ms-choices-ext (external)
Services requested/used by ms-choices:
- ms-speed-api (GRPC)
- ms-yield (GRPC)
- ms-auction (http protobuf)
- Redis - clusters:
dv-prod-<region> (read only)
redis-ms-choices-<env>-<region> (read/write)
- dynamoDB - tables:
data-<env> (DAX)
cappings-<env>
- S3 - buckets:
ogury-choices-<env>
ogury-audience-serving-<env>
ogury-context-<env>
- Kafka - produces in topics:
choices
Delivery
bidrequest
choices_requests
_bidrequestraw
- feature-config server (see all params used by ms-choices)
- prometheus
Fallback
In general, there is no explicit fallback when an external service fails.
This section describes the expected behavior when a service fails. These effects have not been tested and are not guaranteed.
- dynamo: Dynamo is part of the health check, so a Dynamo failure would trigger a restart of all pods.
- kafka: every step feeds Kafka queues, so a slow or unresponsive Kafka server may degrade ms-choices performance.
- prometheus: Prometheus should not be an issue; it works in pull mode, scraping a metrics page exposed by the service.
- featureConfig: probably not an issue; everything should fall back to default values.
Step | Dependency | default behavior | detection | complications |
---|---|---|---|---|
BrandSafety | Redis | block (configurable in feature config) | timeout/error | slow or unresponsive |
Fraud | Redis | pass | timeout/error | slow or unresponsive |
externalBlackList | S3 | pass | NA | none |
ordering | S3 | pass | NA | none |
programmaticResponse | ms-auction | no programmatic | timeout/error | slow or unresponsive |
speed | speed-api | use cache | cache get stale | burst of speed-api request in background, possible pod crash |
yield | yield-api | pass | timeout/error | slow or unresponsive |
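The timeout/error rows of the table can be illustrated with a timeout wrapper plus a per-step default. This is only a sketch under assumptions: `withTimeout`, `decideWithFallback`, `fallbackFor` and `DEFAULT_ON_FAILURE` are hypothetical names that do not come from the ms-choices code, and treating an unlisted step as "pass" is an assumption made for the example.

```typescript
// Hypothetical sketch: none of these names come from the ms-choices codebase.
type Decision = "pass" | "block";

// Default decision per step when its dependency times out or errors,
// mirroring the table above (BrandSafety's default is configurable in feature config).
const DEFAULT_ON_FAILURE: Record<string, Decision> = {
  BrandSafety: "block",
  Fraud: "pass",
  yield: "pass",
};

// Assumption: a step missing from the map falls back to "pass".
function fallbackFor(stepName: string): Decision {
  return DEFAULT_ON_FAILURE[stepName] ?? "pass";
}

// Rejects with a "timeout" error if `work` does not settle within `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Runs the dependency call and returns the step's default decision on timeout or error.
async function decideWithFallback(
  stepName: string,
  call: () => Promise<Decision>,
  timeoutMs: number,
): Promise<Decision> {
  try {
    return await withTimeout(call(), timeoutMs);
  } catch {
    return fallbackFor(stepName);
  }
}
```

With this shape, a Redis outage during the Fraud step would resolve to "pass" once the timeout elapses, while the same outage during BrandSafety would resolve to "block".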
Installation
Install package
yarn install
Initialize submodules
git submodule update --init --recursive
INFO: after editing a proto in the external project, run this command to update the submodule source:
git submodule update --remote --merge
Check and update config file if needed
config/config.yml
Run
Locally
Start internal server:
docker-compose -f mock/docker-compose.yml up -d
yarn start-local:int
Or to start external server:
docker-compose -f mock/docker-compose.yml up -d
yarn start-local:ext
Run unit tests:
- ensure Redis server is running
yarn test
Tests
An easy way to run Redis is to use the docker-compose file in test:
docker-compose -f test/docker-compose.yml up --build -d
then
yarn test
If Redis is running on a non-standard port, set the REDIS_PORT env var.
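As an illustration of how a test setup could pick up that variable, here is a minimal sketch. `resolveRedisPort` is a hypothetical helper, not a function from this repository; the fallback of 6379 is the standard Redis port, which is also the one used in the telnet example in this section.

```typescript
// Hypothetical helper: reads REDIS_PORT from an env-like object, defaulting to 6379.
function resolveRedisPort(
  env: Record<string, string | undefined>,
  fallback: number = 6379,
): number {
  const raw = env["REDIS_PORT"];
  // Unset or blank means "use the standard port".
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const port = Number(raw);
  // Reject values that are not a valid TCP port instead of silently ignoring them.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid REDIS_PORT: ${raw}`);
  }
  return port;
}
```

In a Node.js test setup this would typically be called as `resolveRedisPort(process.env)`.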
If some tests crash, connect to Redis and flush all data by calling the FLUSHALL command:
$ telnet localhost 6379
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
FLUSHALL
+OK
Add a Delivery step
Create a class yourStepName.step.ts that implements the Step interface. You'll have to implement 3 methods:
1. getName: should return the name of your step (will be used by decorators such as metrics and kafka)
2. execute: the execution of your step. The given request object should not be changed. The result object may be changed.
3. getLabels: should return the prometheus labels (used by the metrics decorator)
Instantiate your step where you want to use it, using the StepBuilder class and adding the decorator(s) you need. Example:
this.dvFraudAppStep = new StepBuilder(new DvfraudAppStep()).kafka().metrics().dryrun().build()
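To make the three methods concrete, here is a self-contained sketch of a step and of a wrapper in the spirit of a decorator. Everything in it is an assumption made for illustration: the real Step interface, request and result types and StepBuilder live in the ms-choices codebase and may differ (for example, `execute` may be asynchronous). `StepRequest`, `StepResult`, `BlockListStep` and `CountingStep` are hypothetical names.

```typescript
// Hypothetical shapes; the real request/result types are defined in ms-choices.
interface StepRequest {
  readonly assetKey: string;
}

interface StepResult {
  campaignIds: string[];
}

// Assumed shape of the Step interface, based on the three methods listed above.
interface Step {
  getName(): string;
  execute(request: StepRequest, result: StepResult): void;
  getLabels(): Record<string, string>;
}

// Example step: removes campaigns that appear in a made-up block list.
class BlockListStep implements Step {
  private readonly blocked: Set<string>;

  constructor(blocked: Set<string>) {
    this.blocked = blocked;
  }

  getName(): string {
    return "blockList";
  }

  // The request is only read; the result is the object that gets modified.
  execute(_request: StepRequest, result: StepResult): void {
    result.campaignIds = result.campaignIds.filter((id) => !this.blocked.has(id));
  }

  getLabels(): Record<string, string> {
    return { step: this.getName() };
  }
}

// Illustrative wrapper: counts executions and delegates everything else.
// This is not the real metrics decorator, only the general wrapping idea.
class CountingStep implements Step {
  calls = 0;
  private readonly inner: Step;

  constructor(inner: Step) {
    this.inner = inner;
  }

  getName(): string {
    return this.inner.getName();
  }

  execute(request: StepRequest, result: StepResult): void {
    this.calls += 1;
    this.inner.execute(request, result);
  }

  getLabels(): Record<string, string> {
    return this.inner.getLabels();
  }
}
```

A wrapper like `CountingStep` keeps the step itself free of cross-cutting concerns, which is presumably why decorators are attached through a builder rather than written into each step.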
Compile protobufs
If you need the protobuf files:
cd delivery-common-protos
git checkout master
git pull
cd ..
yarn build:proto
You can replace master with any other commit, branch or tag you intend to use.
Remember to commit the updated submodule reference.
Details
Running local mocks and testing Beeswax
Please see the mock README
Possible Errors
If you get a build error with node-gyp, add the following flags:
export CPPFLAGS=-I/usr/local/opt/openssl/include
export LDFLAGS=-L/usr/local/opt/openssl/lib
Useful tools
Manually checking brand safety in DV: scripts/brand_safety (refer to README in the folder)
Profiling
It is possible to launch a profiling session in production. To do so, call the /profile endpoint. This launches a 5-minute profiling session only on the pod you reached. After 5 minutes, it writes a file to an S3 bucket that you can read and analyze.
Example: to launch a 5-minute profiling session on prod internal eu-west, make sure your VPN connection to the server is open, then run:
curl -H "Authorization: static Kdfuo902FDSMMnsipam" http://ms-choices-int-euw1a.prod.cloud.ogury.io/profile
Wait 5 minutes, then go to the AWS S3 bucket ogury-choices-prod-eu-west-1/inspector, where you will find a new file, and download it.
Go to https://www.speedscope.app/, select the file you downloaded and start analyzing.
Feature Config
Feature | Expected Type | Description | Default Value |
---|---|---|---|
ms-choices.bypass-normal-flow-assets | string list separated by , | List of assets for which the normal flow in internal is skipped and the request goes directly to the marketplace/open market step | void |
ms-choices.queue-buffering-max-kbytes | number | queue.buffering.max.kbytes setting of kafka client | 0 |
ms-choices.prio-web-weight | number | how much we should prioritize web inventory between 0 and 1 | 0.75 |
ms-choices.prio-thumbnail-weight | number | how much we should prioritize thumbnail inventory between 0 and 1 | 0.25 |
ms-choices.brand-safety.allow-unknown | boolean | should allow unknown brand safety level assets | false |
ms-choices.brand-safety.test-only-campaigns | string list separated by , | test campaign list which bypass brand safety step | void |
ms-choices.dv-bypass-apikeys | string list separated by , | api key list which bypass dv step | void |
ms-choices.header-bidding-display-default-score | number | default acc rate score for header bidding inventory when hashmap does not have value | 0 |
ms-choices.uniformize-scores | boolean | should use percentile uniformization for scoring | false |
ms-choices.composite-score-weights | complex type | score weights for every score | null => { default: { cta: 0, audience: 0} } is used |
ms-choices.excluded-regions-programmatic-internal | string list separated by , | AWS regions in which programmatic is disabled for internal inventory | void |
ms-choices.excluded-regions-programmatic-header-bidding | string list separated by , | AWS regions in which programmatic is disabled for header bidding inventory | void |
ms-choices.max-delay-pass-speed-threshold | number | max advance a campaign can have for speed; when the campaign's delay > this value, it does not pass speedThresholdStep | 1.5 |
ms-choices.random-noise-composite-score-speed | number | noise setting passed to jStat, which is used to randomize composite score variation | 0 |
ms-choices.ms-choices.fallback-campaign-ids | string list separated by , | fallback campaign list used when no normal/open market campaign is selected | void |
ms-choices.log-request-kafka-pct | float number in [0, 1] | chance to log incoming request to Kafka topic choices_request | 0.005 (0.5% requests logged) |
ms-choices.log-request-pct | float number in [0, 1] | chance to log incoming request and response to datadog | 0 |
ms-choices.brand-safety.allow-unknown.[dv,ias] | boolean | allow passing when brand safety is unknown | false |
ms-choices.brand-safety.severity.[dv,ias] | string low, medium, high | set the level of the filter | high |
ms-choices.brand-safety.custom-filtering.[dv,ias] | string list separated by , | filter out specific segments | void |
ms-choices.publisher-fraud.severity.ias | string medium, high | set the level of the filter | high |
ms-choices.publisher-fraud.allow-unknown.ias | boolean | allow passing when publisher fraud is unknown | false |
ms-choices.brand-suitability.allow-unknown.ias | boolean | allow passing when brand suitability is unknown | false |
ms-choices.device-fraud.fill-cache.ias | boolean | force caching of ip based segments | false |
ms-choices.device-fraud.allow-unknown.ias | boolean | allow passing when device fraud is unknown | true |
ms-choices.brand-safety.fill-cache.[dv,ias] | boolean | force caching of url based segments | false |
ms-choices.header-bidding.third-party-extra-fee | complex type | third party fee for header bidding | null => {"Amazon TAM":0, "MAX":0, "PreBid":0} is used |
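Two of the value types in the table, "string list separated by ," and the 0 to 1 logging chances, can be sketched as follows. This is an illustration under assumptions, not the actual parsing code: `parseStringList` and `shouldSample` are hypothetical names, trimming whitespace around items is an assumption, and an empty or missing value is taken to mean the "void" default.

```typescript
// Hypothetical parser for "string list separated by ," values.
// An undefined or empty value (the "void" default) yields an empty list.
function parseStringList(raw: string | undefined): string[] {
  if (!raw) {
    return [];
  }
  return raw
    .split(",")
    .map((item) => item.trim()) // assumption: surrounding whitespace is not significant
    .filter((item) => item.length > 0);
}

// Hypothetical sampler for settings such as ms-choices.log-request-kafka-pct.
// `pct` is a chance between 0 and 1; `random` is injectable so tests are deterministic.
function shouldSample(pct: number, random: () => number = Math.random): boolean {
  const clamped = Math.min(1, Math.max(0, pct));
  return random() < clamped;
}
```

With the default of 0.005, `shouldSample` returns true for roughly 0.5% of calls, and a value of 0 disables logging entirely because `random()` is never below 0.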
Brand safety
Design proposal: please read it if you are not familiar with the subject; it helps a lot.
Missing implementation details in DP:
- There is no true authentication/authorization on DV's API; we use a hashed/salted query parameter with a unique partner id to let DV identify the origin of the HTTP request.
- DV's segments and their categories: the segment type is determined in the following code. Detailed DV API doc.
- Redis lock technique: we use the redlock lib, details here.
Jenkins
You can validate your Jenkinsfile syntax using
curl -X POST -F "jenkinsfile=<Jenkinsfile" https://:@jenkins.ci.cloud.ogury.io/pipeline-model-converter/validate