Simplified implementation of Implicit Language Q Learning (Snell et al. 2022) (official, paper)
Evaluated on the graph shortest-path task from Decision Transformer (Chen et al. 2021):
for each random graph, a transformer is trained to find optimal trajectories using only 1000 random walks.
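
To make the task setup concrete, here is a minimal sketch of how such a dataset could be generated. It is not the code from this repo or from either paper: the function names, graph size, edge probability, walk length, and the reward of -1 per step are all illustrative assumptions. The BFS distances are included only as a reference for what an "optimal trajectory" means.

```python
import random
from collections import deque

def make_random_graph(num_nodes, edge_prob, rng):
    """Directed random graph as adjacency lists (hypothetical helper).
    Every node is given at least one outgoing edge so walks never get stuck."""
    adj = {u: [v for v in range(num_nodes) if v != u and rng.random() < edge_prob]
           for u in range(num_nodes)}
    for u in range(num_nodes):
        if not adj[u]:
            adj[u].append(rng.choice([v for v in range(num_nodes) if v != u]))
    return adj

def random_walk(adj, goal, max_len, rng):
    """Uniform random walk from a random start node.
    Stops on reaching the goal or after max_len nodes."""
    node = rng.randrange(len(adj))
    path = [node]
    while node != goal and len(path) < max_len:
        node = rng.choice(adj[node])
        path.append(node)
    return path

def walk_return(path):
    """Assumed reward scheme: -1 for every step taken."""
    return -(len(path) - 1)

def shortest_distances(adj, goal):
    """BFS on the reversed graph: fewest steps from each node to the goal."""
    rev = {u: [] for u in adj}
    for u, vs in adj.items():
        for v in vs:
            rev[v].append(u)
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        v = queue.popleft()
        for u in rev[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

rng = random.Random(0)
adj = make_random_graph(num_nodes=20, edge_prob=0.1, rng=rng)
goal = 0

# The offline dataset: 1000 random walks, each paired with its return.
walks = [random_walk(adj, goal, max_len=10, rng=rng) for _ in range(1000)]
dataset = [(path, walk_return(path)) for path in walks]

# Reference optimum that a trained model would ideally match.
dist = shortest_distances(adj, goal)
```

Most random walks are far from optimal, so the point of the benchmark is whether a model trained only on `dataset` can produce paths whose length approaches `dist[start]`.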