@rl-js/redux-mdp

MdpFactory ⇐ EnvironmentFactory

Class for constructing an Environment implemented as a ReduxMDP

Kind: global class
Extends: EnvironmentFactory

new MdpFactory(params)

Create a factory for a particular MDP

Param	Type	Default	Description
params	object		Parameters for constructing the MDP
params.reducer	Reducer		Redux reducer representing the state of the MDP
params.getObservation	getObservation		Compute the current observation
params.computeReward	computeReward		Compute the current reward
params.isTerminated	isTerminated		Compute whether the environment is terminated
params.resolveAction	resolveAction		Resolve the MdpAction into a ReduxAction
params.gamma	number	1	Reward discounting factor for the MDP
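
To illustrate how these parameters fit together, the following sketch assembles a params object for a hypothetical one-dimensional corridor MDP. The corridor, the MOVE action type, the 20-step limit, and the single-argument form of resolveAction are assumptions made for this example; only the parameter names come from the table above. The commented lines at the end show where the object would be passed to the constructor, with an import path and export name that are assumed rather than confirmed here.

```javascript
// Hypothetical corridor: the agent starts at position 0 and the goal is at position 3.
const GOAL = 3;

const params = {
  // Redux reducer: returns a new state object and never mutates the old one.
  reducer: (state = { position: 0 }, action) =>
    action.type === 'MOVE'
      ? { ...state, position: state.position + action.payload }
      : state,
  // The agent observes its position directly.
  getObservation: state => state.position,
  // Reward of 1 on reaching the goal, 0 otherwise.
  computeReward: (state, action, nextState) =>
    nextState.position === GOAL ? 1 : 0,
  // The episode ends at the goal or once 20 timesteps have passed.
  isTerminated: (state, action, nextState, time) =>
    nextState.position === GOAL || time >= 20,
  // Assumption: resolveAction receives the MdpAction ('left' or 'right').
  resolveAction: mdpAction => ({
    type: 'MOVE',
    payload: mdpAction === 'right' ? 1 : -1,
  }),
  gamma: 0.9,
};

// Assumed usage (import path and export name are not confirmed by this document):
// const { MdpFactory } = require('@rl-js/redux-mdp');
// const environment = new MdpFactory(params).createEnvironment();
```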

mdpFactory.createEnvironment() ⇒ ReduxMDP

Create an instance of the environment.

Kind: instance method of MdpFactory

mdpFactory.setMdpMiddleware(middleware)

Configure any MdpMiddleware that should be part of the next invocation of createEnvironment()

Kind: instance method of MdpFactory

Param	Type
middleware	function

mdpFactory.setReduxMiddleware(middleware)

Configure any ReduxMiddleware that should be part of the next invocation of createEnvironment()

Kind: instance method of MdpFactory

Param	Type
middleware	function
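
For reference, standard Redux middleware is a curried function of the form store => next => action. The sketch below is a hypothetical recording middleware in that shape, exercised against a stub store so that it runs on its own. Whether setReduxMiddleware expects a single function or some other argument shape is not stated here, so the final commented call is an assumption.

```javascript
// A standard Redux middleware: records every action type before passing it along.
const seenTypes = [];
const recordActions = store => next => action => {
  seenTypes.push(action.type);
  return next(action); // hand the action to the next middleware or the reducer
};

// Stub store and terminal dispatch, only to exercise the middleware outside Redux.
const stubStore = { getState: () => ({}), dispatch: action => action };
const dispatch = recordActions(stubStore)(action => action);
dispatch({ type: 'MOVE', payload: 1 });

// Assumed usage (argument shape is not confirmed by this document):
// mdpFactory.setReduxMiddleware(recordActions);
```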

ReduxMDP ⇐ Environment

Class representing an Environment as an MDP using Redux.

Kind: global class
Extends: Environment

State : *

The underlying state representation of the environment. Should be a serializable object, i.e. state => JSON.parse(JSON.stringify(state)) should be the identity function.
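
One way to check this property during development is to compare a state against its own JSON round trip. The helper name survivesJsonRoundTrip is hypothetical and not part of the package; the sketch relies on Node's built-in util.isDeepStrictEqual for the comparison.

```javascript
const { isDeepStrictEqual } = require('util');

// Hypothetical helper (not part of the package): true when the state
// survives a JSON round trip unchanged.
function survivesJsonRoundTrip(state) {
  return isDeepStrictEqual(JSON.parse(JSON.stringify(state)), state);
}

survivesJsonRoundTrip({ position: 2, visited: [0, 1] }); // true: plain data
survivesJsonRoundTrip({ startedAt: new Date(0) });       // false: Date becomes a string
survivesJsonRoundTrip({ seen: new Set([1]) });           // false: Set becomes {}
```

States built only from plain objects, arrays, strings, finite numbers, booleans, and null pass this check.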

Kind: global typedef

MdpAction : *

An object representing an action in an MDP. The type is specific to the MDP.

Kind: global typedef

Observation : *

An object representing the observation of an agent in the current state. The type is specific to the MDP.

Kind: global typedef

ReduxAction : Object

A Redux action, e.g. a Flux Standard Action (https://github.com/redux-utilities/flux-standard-action). Your MdpAction will be converted into a ReduxAction by resolveAction.

Kind: global typedef
Properties

Name	Type	Description
type	string	Each action must have a type associated with it.
payload	*	Any data associated with the action goes here
error	boolean	Should be true if and only if the action represents an error
meta	*	Any data that is not explicitly part of the payload
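
As a concrete illustration of the properties above, the sketch below shows two actions for a hypothetical MOVE step and a small shape check. The action type, payload contents, and the looksLikeReduxAction helper are invented for this example and are not part of the package.

```javascript
// A ReduxAction for a hypothetical MOVE step, using only the listed properties.
const moveAction = {
  type: 'MOVE',
  payload: { delta: 1 },
  meta: { slipped: false },
};

// By Flux Standard Action convention, a failed action sets error to true
// and carries the Error object as its payload.
const failedAction = {
  type: 'MOVE',
  payload: new Error('Unknown direction'),
  error: true,
};

// Hypothetical check (not part of the package) against the property table.
const ALLOWED_KEYS = ['type', 'payload', 'error', 'meta'];
function looksLikeReduxAction(action) {
  return (
    typeof action === 'object' &&
    action !== null &&
    typeof action.type === 'string' &&
    Object.keys(action).every(key => ALLOWED_KEYS.includes(key))
  );
}
```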

reducer ⇒ State

A Redux reducer. Computes the next state without mutating the previous state object

Kind: global typedef
Returns: State - The new state object after the action is applied

Param	Type	Description
state	State	The current state of the MDP
action	ReduxAction	The resolved action for the MDP
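
A minimal sketch of such a reducer, assuming a hypothetical corridor state of the form { position, steps } and a MOVE action whose payload is the signed step size. Neither the state shape nor the action type comes from the package.

```javascript
// Hypothetical corridor state: { position, steps }.
const initialState = { position: 0, steps: 0 };

function reducer(state = initialState, action) {
  switch (action.type) {
    case 'MOVE':
      // Build a new object instead of assigning to state.position.
      return {
        ...state,
        position: state.position + action.payload,
        steps: state.steps + 1,
      };
    default:
      // Unknown actions leave the state untouched.
      return state;
  }
}

// Freezing the input guarantees the reducer cannot have mutated it.
const before = Object.freeze({ position: 1, steps: 4 });
const after = reducer(before, { type: 'MOVE', payload: 1 });
// before is still { position: 1, steps: 4 }; after is { position: 2, steps: 5 }
```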

getObservation ⇒ Observation

A function to get the observation of the agent given the current state.

Kind: global typedef
Returns: Observation - The observation for the current state

Param	Type	Description
state	State	The current state of the MDP
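
The observation does not have to expose the whole state. The sketch below assumes a hypothetical state that stores both the agent's position and a goal location, and returns an observation that hides the goal location from the agent.

```javascript
// Hypothetical state: { position, goal }.
// The agent sees its own position and whether it is standing on the goal.
function getObservation(state) {
  return {
    position: state.position,
    atGoal: state.position === state.goal,
  };
}

const observation = getObservation({ position: 2, goal: 3 });
// observation is { position: 2, atGoal: false }; the goal location is not exposed
```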

computeReward ⇒ number

A function to compute the reward given a state transition, i.e. (s, a, s'). This function should be completely deterministic; any non-determinism should be handled by resolveAction.

Kind: global typedef
Returns: number - The reward for the given state transition.

Param	Type	Description
state	State	The current state of the MDP
action	ReduxAction	The resolved action for the transition
nextState	State	The next state of the MDP
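
A minimal sketch of a reward function with this signature, assuming a hypothetical corridor in which entering the goal position earns a bonus and every other step carries a small cost. The constants are illustrative only.

```javascript
const GOAL = 3;         // hypothetical goal position
const STEP_COST = -0.1; // hypothetical per-step penalty

// Deterministic: the result depends only on (state, action, nextState).
function computeReward(state, action, nextState) {
  if (state.position !== GOAL && nextState.position === GOAL) {
    return 1; // bonus for the transition that enters the goal
  }
  return STEP_COST;
}
```

Because the function reads no random source or external data, calling it twice with the same transition always yields the same reward.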

isTerminated ⇒ boolean

A function to compute whether the environment is terminated, i.e. the current episode is over.

Kind: global typedef
Returns: boolean - True if the environment is terminated, false otherwise.

Param	Type	Description
state	State	The current state of the MDP
action	ReduxAction	The resolved action for the transition
nextState	State	The next state of the MDP
time	number	The current timestep of the MDP, useful for finite horizon MDPs.
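
The time parameter makes a finite-horizon cutoff straightforward. The sketch below assumes a hypothetical corridor with a goal position and a 20-step limit; whether the timestep counter starts at 0 or 1 is not stated here, so the exact cutoff comparison is an assumption.

```javascript
const GOAL = 3;     // hypothetical goal position
const HORIZON = 20; // hypothetical maximum episode length

function isTerminated(state, action, nextState, time) {
  const reachedGoal = nextState.position === GOAL;
  const outOfTime = time >= HORIZON; // finite-horizon cutoff
  return reachedGoal || outOfTime;
}
```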

resolveAction ⇒ ReduxAction

A function to resolve an MdpAction into a ReduxAction. Any non-determinism in your environment should go here, as your Redux reducer should be completely deterministic.

Kind: global typedef
Returns: ReduxAction - The ReduxAction resolved from the MdpAction
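
The parameter list for resolveAction is not shown in this section, so the sketch below assumes it is called with the MdpAction alone. It models a hypothetical "slippery" corridor in which the agent occasionally moves the wrong way. The makeResolveAction factory, the slip probability, and the MOVE action type are all invented for illustration.

```javascript
// Assumption: resolveAction is called with the MdpAction ('left' or 'right').
// The random source is injectable so the sketch can be tested deterministically.
function makeResolveAction(random = Math.random, slipProbability = 0.1) {
  return function resolveAction(mdpAction) {
    const intended = mdpAction === 'right' ? 1 : -1;
    const slipped = random() < slipProbability;
    return {
      type: 'MOVE',
      // On a slip the agent moves in the opposite direction.
      payload: slipped ? -intended : intended,
      meta: { slipped },
    };
  };
}

const resolveAction = makeResolveAction();
```

Keeping the randomness here means the reducer and computeReward can stay pure functions of the resolved action.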