Customization
When the configuration options provided by RL.ts are not enough to customize an environment or an algorithm such as PPO (for example, adding random network distillation to your algorithm), you can copy the original source file from this repository and edit it directly instead of coding from scratch.
The following section walks through customizing an algorithm to your own needs in TypeScript and JavaScript.
The example uses DQN, and the customization replaces the exponential decay schedule that RL.ts uses for the epsilon value with a linear one. Epsilon is the probability of taking a purely random action instead of one based on the deep Q network, which is important for exploration in DQN.
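To make the role of epsilon concrete, here is a minimal sketch of epsilon-greedy action selection. The helper name `selectAction` and its parameters are hypothetical and are not taken from the RL.ts source; the actual DQN implementation may structure this differently.

```javascript
// Hypothetical helper, not part of RL.ts: pick an action index epsilon-greedily.
// qValues: array of estimated action values, one entry per action.
// epsilon: probability of taking a uniformly random action.
// rng: function returning a number in [0, 1); defaults to Math.random.
function selectAction(qValues, epsilon, rng = Math.random) {
  if (rng() < epsilon) {
    // Explore: choose a uniformly random action index.
    return Math.floor(rng() * qValues.length);
  }
  // Exploit: choose the index of the largest estimated Q-value.
  let best = 0;
  for (let i = 1; i < qValues.length; i++) {
    if (qValues[i] > qValues[best]) best = i;
  }
  return best;
}

// With epsilon = 0 the agent always exploits, so this returns index 1.
const greedyAction = selectAction([0.1, 0.9, 0.3], 0);
```

A high epsilon early in training favors exploration; the decay schedule controls how quickly the agent shifts towards exploiting its learned Q-values.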
First make sure to install RL.ts and typescript via
npm install rl-ts
npm install -g typescript
TypeScript is needed only once, to compile the source code from this repo into JavaScript code you can use and edit.
Now find the code you would like to customize. For DQN, this is https://github.com/StoneT2000/rl-ts/blob/main/src/Algos/dqn/index.ts
Copy it into a file called dqn.ts in the same directory where you installed RL.ts.
For DQN, you will need to also install typings for the numjs package
npm i --save-dev @types/numjs
Other models, algorithms, environments etc. may require additional installations, but the typescript compiler will let you know what you need.
Now create a file called tsconfig.json in the same folder with the following contents:
{
  "compilerOptions": {
    "target": "ESNext",
    "module": "commonjs",
    "allowJs": true,
    "esModuleInterop": true,
    "isolatedModules": true,
    "forceConsistentCasingInFileNames": true,
    "strict": true,
    "skipLibCheck": true
  },
  "include": ["dqn.ts"]
}
This tells TypeScript how to compile the code into a usable JavaScript module. To compile, run
tsc
This produces a new file called dqn.js. You can now delete anything related to TypeScript, including dqn.ts, as it is no longer needed. With dqn.js, you can import the JS version of DQN into your JavaScript code to use and train like so:
const DQN = require("./dqn.js").DQN;
But let's make our change first. Open dqn.js and change the getEpsilon function to
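getEpsilon(timeStep, epsDecay, epsStart, epsEnd) {
  // Linear schedule: epsDecay is no longer used.
  // fraction goes from 0 to 1 over the first 10000 steps, then stays at 1.
  const fraction = Math.min(1, timeStep / 10000);
  return epsStart + fraction * (epsEnd - epsStart);
}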
Now, during training, DQN picks epsilon values that move linearly from epsStart towards epsEnd (0.05 by default) over the first 10000 steps, and keeps using an epsilon of epsEnd after that.
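To see how the two schedules differ, the sketch below evaluates both at a few time steps. The linear function uses the same arithmetic as the patched getEpsilon. The exponential form, `epsEnd + (epsStart - epsEnd) * exp(-timeStep / epsDecay)`, is a commonly used one and is only an assumption here; the formula in the RL.ts source may differ. The values epsStart = 0.9 and epsDecay = 1000 are also illustrative assumptions, not confirmed RL.ts defaults.

```javascript
// Linear schedule: same arithmetic as the patched getEpsilon.
function linearEpsilon(timeStep, epsStart, epsEnd) {
  const fraction = Math.min(1, timeStep / 10000);
  return epsStart + fraction * (epsEnd - epsStart);
}

// ASSUMPTION: a common exponential decay form, not copied from RL.ts.
function exponentialEpsilon(timeStep, epsDecay, epsStart, epsEnd) {
  return epsEnd + (epsStart - epsEnd) * Math.exp(-timeStep / epsDecay);
}

// Illustrative values only: epsStart = 0.9 and epsDecay = 1000 are assumptions.
for (const t of [0, 2500, 5000, 10000, 20000]) {
  const linear = linearEpsilon(t, 0.9, 0.05).toFixed(3);
  const exponential = exponentialEpsilon(t, 1000, 0.9, 0.05).toFixed(3);
  console.log(`step ${t}: linear=${linear} exponential=${exponential}`);
}
```

The linear schedule reaches epsEnd exactly at step 10000 and stays there, whereas an exponential schedule of this form only approaches epsEnd asymptotically.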
For example usage of DQN in javascript, check out https://github.com/StoneT2000/rl-ts/blob/main/examples/dqn/cartpole.js. To test your customization, you can replace
const dqn = new RL.Algos.DQN(...)
with
const dqn = new DQN(...)
after importing dqn.js with const DQN = require("./dqn.js").DQN;
To make the same customization in TypeScript instead, first install RL.ts via
npm install rl-ts
Find the source code for the algorithm. For DQN, this is https://github.com/StoneT2000/rl-ts/blob/main/src/Algos/dqn/index.ts
Copy it into a file called dqn.ts in the same directory where you installed RL.ts.
For DQN, you will need to also install typings for the numjs package
npm i --save-dev @types/numjs
Other models, algorithms, environments etc. may require additional installations, but the typescript compiler will let you know what you need.
Now create a file called tsconfig.json in the same folder with the following contents:
{
  "compilerOptions": {
    "target": "ESNext",
    "module": "commonjs",
    "allowJs": true,
    "esModuleInterop": true,
    "isolatedModules": true,
    "forceConsistentCasingInFileNames": true,
    "strict": true,
    "skipLibCheck": true
  },
  "include": ["dqn.ts"]
}
Now let's change the exponential schedule to a linear one.
In dqn.ts, change the getEpsilon function as follows:
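private getEpsilon(timeStep: number, epsDecay: number, epsStart: number, epsEnd: number) {
  // Linear schedule: epsDecay is no longer used.
  // fraction goes from 0 to 1 over the first 10000 steps, then stays at 1.
  const fraction = Math.min(1, timeStep / 10000);
  return epsStart + fraction * (epsEnd - epsStart);
}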
Now, during training, DQN picks epsilon values that move linearly from epsStart towards epsEnd (0.05 by default) over the first 10000 steps, and keeps using an epsilon of epsEnd after that.
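The 10000-step horizon is hard-coded in the function above, and the epsDecay argument goes unused. One possible variation, sketched here as a standalone function, reinterprets epsDecay as the number of steps over which epsilon is annealed. This is a design choice for illustration, not how RL.ts defines epsDecay, and the name `getEpsilonLinear` is hypothetical. It is written in plain JavaScript so it runs without compiling; inside dqn.ts the same body would work with the type annotations shown above.

```javascript
// Hypothetical variation: treat epsDecay as the length of the linear anneal, in steps.
function getEpsilonLinear(timeStep, epsDecay, epsStart, epsEnd) {
  // Guard against a zero or negative horizon to avoid dividing by zero.
  const decaySteps = Math.max(1, epsDecay);
  // Clamp the progress fraction to [0, 1].
  const fraction = Math.min(1, Math.max(0, timeStep / decaySteps));
  return epsStart + fraction * (epsEnd - epsStart);
}

// Halfway through a 2000-step anneal from 1.0 to 0.0 gives 0.5.
const halfway = getEpsilonLinear(1000, 2000, 1.0, 0.0);
```

With this variation the anneal length can be tuned through the existing epsDecay configuration value instead of editing the source again, at the cost of changing what that value means.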
We can now compile the typescript code by running
tsc
If the command is not found, make sure to install typescript globally via
npm i -g typescript
This generates a file called dqn.js, from which you can now import the DQN model. You can also stay in TypeScript and import from dqn.ts directly, configuring the tsconfig.json file as necessary.