tia/Dreamer/local_dm_control_suite/README.md

# DeepMind Control Suite.

This submodule contains the domains and tasks described in the
[DeepMind Control Suite tech report](https://arxiv.org/abs/1801.00690).

## Quickstart

```python
from dm_control import suite
import numpy as np

# Load one task:
env = suite.load(domain_name="cartpole", task_name="swingup")

# Iterate over a task set:
for domain_name, task_name in suite.BENCHMARKING:
  env = suite.load(domain_name, task_name)

# Step through an episode and print out reward, discount and observation.
action_spec = env.action_spec()
time_step = env.reset()
while not time_step.last():
  action = np.random.uniform(action_spec.minimum,
                             action_spec.maximum,
                             size=action_spec.shape)
  time_step = env.step(action)
  print(time_step.reward, time_step.discount, time_step.observation)
```

## Illustration video

Below is a video montage of solved Control Suite tasks, with reward
visualisation enabled.

[![Video montage](https://img.youtube.com/vi/rAai4QzcYbs/0.jpg)](https://www.youtube.com/watch?v=rAai4QzcYbs)


### Quadruped domain [April 2019]

Roughly based on the 'ant' model introduced by [Schulman et al. 2015](https://arxiv.org/abs/1506.02438). Main modifications to the body are:

- 4 DoFs per leg, 1 constraining tendon.
- 3 actuators per leg: 'yaw', 'lift', 'extend'.
- Filtered position actuators with timescale of 100ms.
- Sensors include an IMU, force/torque sensors, and rangefinders.

Four tasks:

- `walk` and `run`: self-right the body then move forward at a desired speed.
- `escape`: escape a bowl-shaped random terrain (uses rangefinders).
- `fetch`, go to a moving ball and bring it to a target.

All behaviors in the video below were trained with [Abdolmaleki et al's
MPO](https://arxiv.org/abs/1806.06920).

[![Video montage](https://img.youtube.com/vi/RhRLjbb7pBE/0.jpg)](https://www.youtube.com/watch?v=RhRLjbb7pBE)
Add Dreamer Files 2023-07-17 08:48:01 +00:00			`# DeepMind Control Suite.`

			`This submodule contains the domains and tasks described in the`
			`[DeepMind Control Suite tech report](https://arxiv.org/abs/1801.00690).`

			`## Quickstart`

			```python
			`from dm_control import suite`
			`import numpy as np`

			`# Load one task:`
			`env = suite.load(domain_name="cartpole", task_name="swingup")`

			`# Iterate over a task set:`
			`for domain_name, task_name in suite.BENCHMARKING:`
			`env = suite.load(domain_name, task_name)`

			`# Step through an episode and print out reward, discount and observation.`
			`action_spec = env.action_spec()`
			`time_step = env.reset()`
			`while not time_step.last():`
			`action = np.random.uniform(action_spec.minimum,`
			`action_spec.maximum,`
			`size=action_spec.shape)`
			`time_step = env.step(action)`
			`print(time_step.reward, time_step.discount, time_step.observation)`
			```

			`## Illustration video`

			`Below is a video montage of solved Control Suite tasks, with reward`
			`visualisation enabled.`

			`[![Video montage](https://img.youtube.com/vi/rAai4QzcYbs/0.jpg)](https://www.youtube.com/watch?v=rAai4QzcYbs)`


			`### Quadruped domain [April 2019]`

			`Roughly based on the 'ant' model introduced by [Schulman et al. 2015](https://arxiv.org/abs/1506.02438). Main modifications to the body are:`

			`- 4 DoFs per leg, 1 constraining tendon.`
			`- 3 actuators per leg: 'yaw', 'lift', 'extend'.`
			`- Filtered position actuators with timescale of 100ms.`
			`- Sensors include an IMU, force/torque sensors, and rangefinders.`

			`Four tasks:`

			- `walk` and `run`: self-right the body then move forward at a desired speed.
			- `escape`: escape a bowl-shaped random terrain (uses rangefinders).
			- `fetch`, go to a moving ball and bring it to a target.

			`All behaviors in the video below were trained with [Abdolmaleki et al's`
			`MPO](https://arxiv.org/abs/1806.06920).`

			`[![Video montage](https://img.youtube.com/vi/RhRLjbb7pBE/0.jpg)](https://www.youtube.com/watch?v=RhRLjbb7pBE)`