9aa07fed6a
Adding Environment Wrapper and including index randomization for trajectory selection
233ca77aa4
Completing initial model and fixing memory leak
c8fdd11d8c
Outputting only 3 channels from the decoder
a83149f61e
Keeping channels as 3
1f4667a08d
Adding preprocessing function
7d7387bd5d
Adding target value function updates and momentum updates
ac714e3495
Correct history with detach
de17cab9f5
Add MoCo to introduce lower-bound loss
05dd20cdfa
Add a class to freeze parameters
8fd56ba94d
Adding model architecture for Reward, Value and Target Value
47090449d1
Adding Reward, Value and Target Value models
c4283ced6f
Changing CLUB loss and Tensor stacking
6b4762d5fc
Changing Upper Bound loss
5caea7695a
Changing variable reshaping strategy
ada3cadf0c
Adding momentum encoder
d9d350e191
Adding Contrastive learning models
47a0772c9d
Replacing seed with version name variable in environment id naming
d558b9f558
Changing names for clean and noisy environments via version
41dcf22262
Collecting dataset from noiseless environment
11f00ad695
Add encoder loss and include tqdm for visualization
a1fe81f018
Grouping for actions too
38cc645253
Update models to also return a distribution in the output