[Re] Goal-conditioned Imitation Learning

I examined the code and changed how often trajectories were relabeled by HER and by expert relabeling. This significantly improved the initially poor results, but I still could not reproduce the original results for the baselines. GAIL had great difficulty improving on its initial results, while HER reached a higher value fairly early on. I had initially expected GAIL to outperform HER at first and for HER to slowly catch up; that pattern might still have appeared later in training. Overall, though, goalGAIL performed more strongly than either baseline.
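To make the relabeling-frequency knob concrete, here is a minimal sketch of hindsight relabeling with an adjustable probability. It is not the code from the reproduction: the function name `relabel_trajectory`, the `relabel_prob` parameter, the dictionary layout of a transition, the "future state" goal selection, and the sparse 1/0 reward are all assumptions chosen for illustration.

```python
import random


def relabel_trajectory(trajectory, relabel_prob, goal_reached, rng=None):
    """Return a copy of `trajectory` with some goals replaced in hindsight.

    Each transition is a dict with keys "state", "next_state", "goal", "reward"
    (a hypothetical layout). With probability `relabel_prob`, a transition's goal
    is replaced by a state reached at the same or a later step of the trajectory,
    and its reward is recomputed against the new goal.
    """
    rng = rng or random.Random()
    relabeled = []
    for t, step in enumerate(trajectory):
        new_step = dict(step)  # shallow copy so the input is left untouched
        if rng.random() < relabel_prob:
            # Pick a step index from t to the end of the trajectory.
            future = rng.randrange(t, len(trajectory))
            new_goal = trajectory[future]["next_state"]
            new_step["goal"] = new_goal
            # Assumed sparse reward: 1 if this transition reaches the new goal.
            reached = goal_reached(step["next_state"], new_goal)
            new_step["reward"] = 1.0 if reached else 0.0
        relabeled.append(new_step)
    return relabeled


if __name__ == "__main__":
    # Toy 1-D trajectory whose original goal (99,) is never reached.
    traj = [
        {"state": (i,), "next_state": (i + 1,), "goal": (99,), "reward": 0.0}
        for i in range(5)
    ]
    out = relabel_trajectory(traj, 0.8, lambda s, g: s == g, random.Random(0))
    for step in out:
        print(step)
```

Raising or lowering `relabel_prob` changes what fraction of sampled transitions carry a hindsight goal. The same routine could be applied to expert demonstrations to sketch expert relabeling, again as an illustration rather than a description of the original implementation.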


Rashi Dhar
