Welcome to IRL Benchmark’s documentation!

Getting started:

You can find installation instructions here.

If you’d like to use this project for your own experiments or if you are interested in replicating some results, have a look at the quickstart tutorial.

See extending the benchmark if you want to make one of the following changes:

  • adding a new environment
  • adding a new IRL algorithm
  • adding a new RL algorithm
  • adding a new metric

If you want your changes to be available to the entire community, please fork the repository and make a pull request. General guidelines for this can be found in our collaboration guide.

About the project

We want reusable results, reproducible results, robust results for IRL!

What should be reusable:

  • environments used for testing
  • potentially expert demonstrations (these are especially expensive if collected from humans)
  • IRL algorithm implementations
  • RL algorithm implementations
  • experiment code and metrics

Reusing these components makes experiments reproducible and the findings more robust.

Reinforcement learning research suffers from a reproducibility crisis (we highly recommend this ICLR talk by Joelle Pineau). The situation seems to be even worse in inverse reinforcement learning, where in addition the expert demonstrations used in experiments are often not available, and so far no one really knows what the current state of the art is.

This benchmark aims to address this issue by establishing a standard for reproducible IRL research.

Dive into the docs:

Some important classes to know:

  • BaseIRLAlgorithm: all IRL algorithms extend this class and must override its train method. When a new algorithm is created, it pre-processes the given config to ensure that values lie within their allowed ranges and that missing values are set to their defaults. It also provides helper methods used by a variety of IRL algorithms, such as calculating empirical feature counts. A minimal sketch of a subclass follows after this list.
  • TODO: add more
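
To make this pattern concrete, here is a self-contained sketch. Only the class name BaseIRLAlgorithm and its responsibilities (filling in default config values, requiring a train override, computing empirical feature counts) come from the description above; the stand-in base class below, its constructor arguments, the config keys, and the helper name empirical_feature_counts are hypothetical simplifications rather than the real API.

```python
import numpy as np


class BaseIRLAlgorithm:
    """Stand-in for the real base class, mimicking the behavior described above:
    it fills in default config values and offers a feature-count helper."""

    # hypothetical default config; the real class has its own keys and ranges
    default_config = {"gamma": 0.99, "learning_rate": 0.1}

    def __init__(self, env, expert_trajs, config=None):
        self.env = env
        self.expert_trajs = expert_trajs
        # pre-process the config: reject unknown keys, fill in missing defaults
        config = config or {}
        assert all(key in self.default_config for key in config)
        self.config = {**self.default_config, **config}

    def empirical_feature_counts(self, trajs):
        """Average discounted feature counts over a list of trajectories.

        Each trajectory is assumed to be a list of feature vectors here; the
        real helper works on recorded trajectory data."""
        gamma = self.config["gamma"]
        counts = np.zeros_like(np.asarray(trajs[0][0], dtype=float))
        for traj in trajs:
            for t, features in enumerate(traj):
                counts += (gamma ** t) * np.asarray(features, dtype=float)
        return counts / len(trajs)

    def train(self, no_irl_iterations):
        raise NotImplementedError("Subclasses must override train().")


class MyIRLAlgorithm(BaseIRLAlgorithm):
    """Toy algorithm: nudges a linear reward towards the expert feature counts."""

    def train(self, no_irl_iterations):
        expert_counts = self.empirical_feature_counts(self.expert_trajs)
        reward_weights = np.zeros_like(expert_counts)
        for _ in range(no_irl_iterations):
            # move the reward weights towards the expert's feature counts
            reward_weights += self.config["learning_rate"] * (expert_counts - reward_weights)
        return reward_weights


# Usage with made-up demonstration data (two trajectories of 2-D feature vectors):
expert_trajs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
algo = MyIRLAlgorithm(env=None, expert_trajs=expert_trajs, config={"gamma": 0.9})
print(algo.train(no_irl_iterations=50))
```

A real IRL algorithm would additionally roll out an RL agent inside train and compare its feature counts with the expert's; the toy update above only illustrates the structure of overriding train and reusing the helpers provided by the base class.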