Read the Docs build information
Build id: 223673
Project: openai-education-spinningup
Version: latest
Commit: 2e0eff9bd019c317af908b72c056a33f14626602
Date: 2019-07-15T17:12:14.154170Z
State: finished
Success: True

[rtd-command-info] start-time: 2019-07-15T17:12:14.622714Z, end-time: 2019-07-15T17:12:15.915378Z, duration: 1, exit-code: 0
git clone --no-single-branch --depth 50 git@github.com:openai/spinningup.git .
Cloning into '.'...

[rtd-command-info] start-time: 2019-07-15T17:12:16.210025Z, end-time: 2019-07-15T17:12:16.429257Z, duration: 0, exit-code: 0
git checkout --force origin/master
Note: checking out 'origin/master'.

You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 2e0eff9 Merge pull request #167 from seungjaeryanlee/keypapers/1907.02057

[rtd-command-info] start-time: 2019-07-15T17:12:16.510259Z, end-time: 2019-07-15T17:12:16.518356Z, duration: 0, exit-code: 0
git clean -d -f -f

[rtd-command-info] start-time: 2019-07-15T17:12:17.509054Z, end-time: 2019-07-15T17:12:22.282333Z, duration: 4, exit-code: 0
python3.6 -mvirtualenv --no-site-packages --no-download
Using base prefix '/home/docs/.pyenv/versions/3.6.8'
New python executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/bin/python3.6
Also creating executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/bin/python
Installing setuptools, pip, wheel... done.
[rtd-command-info] start-time: 2019-07-15T17:12:22.379129Z, end-time: 2019-07-15T17:12:23.353246Z, duration: 0, exit-code: 0 python -m pip install --upgrade --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip pip Requirement already up-to-date: pip in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (19.1.1) [rtd-command-info] start-time: 2019-07-15T17:12:23.428853Z, end-time: 2019-07-15T17:12:35.151826Z, duration: 11, exit-code: 0 python -m pip install --upgrade --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip Pygments==2.3.1 setuptools==41.0.1 docutils==0.14 mock==1.0.1 pillow==5.4.1 alabaster>=0.7,<0.8,!=0.7.5 commonmark==0.8.1 recommonmark==0.5.0 sphinx<2 sphinx-rtd-theme<0.5 readthedocs-sphinx-ext<0.6 Collecting Pygments==2.3.1 Downloading https://files.pythonhosted.org/packages/13/e5/6d710c9cf96c31ac82657bcfb441df328b22df8564d58d0c4cd62612674c/Pygments-2.3.1-py2.py3-none-any.whl (849kB) Requirement already up-to-date: setuptools==41.0.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (41.0.1) Collecting docutils==0.14 Downloading https://files.pythonhosted.org/packages/36/fa/08e9e6e0e3cbd1d362c3bbee8d01d0aedb2155c4ac112b19ef3cae8eed8d/docutils-0.14-py3-none-any.whl (543kB) Collecting mock==1.0.1 Downloading https://files.pythonhosted.org/packages/a2/52/7edcd94f0afb721a2d559a5b9aae8af4f8f2c79bc63fdbe8a8a6c9b23bbe/mock-1.0.1.tar.gz (818kB) Collecting pillow==5.4.1 Downloading https://files.pythonhosted.org/packages/85/5e/e91792f198bbc5a0d7d3055ad552bc4062942d27eaf75c3e2783cf64eae5/Pillow-5.4.1-cp36-cp36m-manylinux1_x86_64.whl (2.0MB) Collecting alabaster!=0.7.5,<0.8,>=0.7 Downloading https://files.pythonhosted.org/packages/10/ad/00b090d23a222943eb0eda509720a404f531a439e803f6538f35136cae9e/alabaster-0.7.12-py2.py3-none-any.whl 
Collecting commonmark==0.8.1 Downloading https://files.pythonhosted.org/packages/ab/ca/439c88039583a29564a0043186875258e9a4f041fb5c422cd387b8e10175/commonmark-0.8.1-py2.py3-none-any.whl (47kB) Collecting recommonmark==0.5.0 Downloading https://files.pythonhosted.org/packages/9b/3d/92ea48401622510e57b4bdaa74dc9db2fb9e9e892324b48f9c02d716a93a/recommonmark-0.5.0-py2.py3-none-any.whl Collecting sphinx<2 Downloading https://files.pythonhosted.org/packages/7d/66/a4af242b4348b729b9d46ce5db23943ce9bca7da9bbe2ece60dc27f26420/Sphinx-1.8.5-py2.py3-none-any.whl (3.1MB) Collecting sphinx-rtd-theme<0.5 Downloading https://files.pythonhosted.org/packages/60/b4/4df37087a1d36755e3a3bfd2a30263f358d2dea21938240fa02313d45f51/sphinx_rtd_theme-0.4.3-py2.py3-none-any.whl (6.4MB) Collecting readthedocs-sphinx-ext<0.6 Downloading https://files.pythonhosted.org/packages/d1/f3/68de6af559ae681921a7b676a86c360454de67f3c8bd5327ae9352897ef4/readthedocs_sphinx_ext-0.5.17-py2.py3-none-any.whl Collecting future (from commonmark==0.8.1) Downloading https://files.pythonhosted.org/packages/90/52/e20466b85000a181e1e144fd8305caf2cf475e2f9674e797b222f8105f5f/future-0.17.1.tar.gz (829kB) Collecting imagesize (from sphinx<2) Downloading https://files.pythonhosted.org/packages/fc/b6/aef66b4c52a6ad6ac18cf6ebc5731ed06d8c9ae4d3b2d9951f261150be67/imagesize-1.1.0-py2.py3-none-any.whl Collecting requests>=2.0.0 (from sphinx<2) Downloading https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl (57kB) Collecting sphinxcontrib-websupport (from sphinx<2) Downloading https://files.pythonhosted.org/packages/2a/59/d64bda9b7480a84a3569be4dde267c0f6675b255ba63b4c8e84469940457/sphinxcontrib_websupport-1.1.2-py2.py3-none-any.whl Collecting six>=1.5 (from sphinx<2) Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl Collecting packaging 
(from sphinx<2) Downloading https://files.pythonhosted.org/packages/91/32/58bc30e646e55eab8b21abf89e353f59c0cc02c417e42929f4a9546e1b1d/packaging-19.0-py2.py3-none-any.whl Collecting Jinja2>=2.3 (from sphinx<2) Downloading https://files.pythonhosted.org/packages/1d/e7/fd8b501e7a6dfe492a433deb7b9d833d39ca74916fa8bc63dd1a4947a671/Jinja2-2.10.1-py2.py3-none-any.whl (124kB) Collecting babel!=2.0,>=1.3 (from sphinx<2) Downloading https://files.pythonhosted.org/packages/2c/60/f2af68eb046c5de5b1fe6dd4743bf42c074f7141fe7b2737d3061533b093/Babel-2.7.0-py2.py3-none-any.whl (8.4MB) Collecting snowballstemmer>=1.1 (from sphinx<2) Downloading https://files.pythonhosted.org/packages/a0/5e/d9ead2d57d39b3e1c1868ce84212319e5543a19c4185dce7e42a9dd968b0/snowballstemmer-1.9.0.tar.gz (76kB) Collecting idna<2.9,>=2.5 (from requests>=2.0.0->sphinx<2) Downloading https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl (58kB) Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests>=2.0.0->sphinx<2) Downloading https://files.pythonhosted.org/packages/e6/60/247f23a7121ae632d62811ba7f273d0e58972d75e58a94d329d51550a47d/urllib3-1.25.3-py2.py3-none-any.whl (150kB) Collecting certifi>=2017.4.17 (from requests>=2.0.0->sphinx<2) Downloading https://files.pythonhosted.org/packages/69/1b/b853c7a9d4f6a6d00749e94eb6f3a041e342a885b87340b79c1ef73e3a78/certifi-2019.6.16-py2.py3-none-any.whl (157kB) Collecting chardet<3.1.0,>=3.0.2 (from requests>=2.0.0->sphinx<2) Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB) Collecting pyparsing>=2.0.2 (from packaging->sphinx<2) Downloading https://files.pythonhosted.org/packages/dd/d9/3ec19e966301a6e25769976999bd7bbe552016f0d32b577dc9d63d2e0c49/pyparsing-2.4.0-py2.py3-none-any.whl (62kB) Collecting MarkupSafe>=0.23 (from Jinja2>=2.3->sphinx<2) Downloading 
https://files.pythonhosted.org/packages/b2/5f/23e0023be6bb885d00ffbefad2942bc51a620328ee910f64abe5a8d18dd1/MarkupSafe-1.1.1-cp36-cp36m-manylinux1_x86_64.whl Collecting pytz>=2015.7 (from babel!=2.0,>=1.3->sphinx<2) Downloading https://files.pythonhosted.org/packages/3d/73/fe30c2daaaa0713420d0382b16fbb761409f532c56bdcc514bf7b6262bb6/pytz-2019.1-py2.py3-none-any.whl (510kB) Building wheels for collected packages: mock, future, snowballstemmer Building wheel for mock (setup.py): started Building wheel for mock (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/7e/72/92/744b532c779242b57aab4bcba80c312b30c069bbd60025e7e6 Building wheel for future (setup.py): started Building wheel for future (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/0c/61/d2/d6b7317325828fbb39ee6ad559dbe4664d0896da4721bf379e Building wheel for snowballstemmer (setup.py): started Building wheel for snowballstemmer (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/93/db/97/496f1d6bbcad1fecbc58fe45363540414be519312eded82bab Successfully built mock future snowballstemmer Installing collected packages: Pygments, docutils, mock, pillow, alabaster, future, commonmark, imagesize, idna, urllib3, certifi, chardet, requests, sphinxcontrib-websupport, six, pyparsing, packaging, MarkupSafe, Jinja2, pytz, babel, snowballstemmer, sphinx, recommonmark, sphinx-rtd-theme, readthedocs-sphinx-ext Successfully installed Jinja2-2.10.1 MarkupSafe-1.1.1 Pygments-2.3.1 alabaster-0.7.12 babel-2.7.0 certifi-2019.6.16 chardet-3.0.4 commonmark-0.8.1 docutils-0.14 future-0.17.1 idna-2.8 imagesize-1.1.0 mock-1.0.1 packaging-19.0 pillow-5.4.1 pyparsing-2.4.0 pytz-2019.1 readthedocs-sphinx-ext-0.5.17 
recommonmark-0.5.0 requests-2.22.0 six-1.12.0 snowballstemmer-1.9.0 sphinx-1.8.5 sphinx-rtd-theme-0.4.3 sphinxcontrib-websupport-1.1.2 urllib3-1.25.3 [rtd-command-info] start-time: 2019-07-15T17:12:35.247613Z, end-time: 2019-07-15T17:13:32.226644Z, duration: 56, exit-code: 0 python -m pip install --exists-action=w --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip -r docs/docs_requirements.txt Collecting cloudpickle==0.5.2 (from -r docs/docs_requirements.txt (line 1)) Downloading https://files.pythonhosted.org/packages/aa/18/514b557c4d8d4ada1f0454ad06c845454ad438fd5c5e0039ba51d6b032fe/cloudpickle-0.5.2-py2.py3-none-any.whl Collecting gym>=0.10.8 (from -r docs/docs_requirements.txt (line 2)) Downloading https://files.pythonhosted.org/packages/9d/38/87aefd5388f6062267384b7e8f97dbc27c54b3e6137a5148b43d5c10890c/gym-0.13.1.tar.gz (1.6MB) Collecting ipython (from -r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/a6/2c/c7d44277b599df35af734d8f4142d501192fdb7aef5d04daf882d7eccfbc/ipython-7.6.1-py3-none-any.whl (774kB) Collecting joblib (from -r docs/docs_requirements.txt (line 4)) Downloading https://files.pythonhosted.org/packages/cd/c1/50a758e8247561e58cb87305b1e90b171b8c767b15b12a1734001f41d356/joblib-0.13.2-py2.py3-none-any.whl (278kB) Collecting matplotlib (from -r docs/docs_requirements.txt (line 5)) Downloading https://files.pythonhosted.org/packages/57/4f/dd381ecf6c6ab9bcdaa8ea912e866dedc6e696756156d8ecc087e20817e2/matplotlib-3.1.1-cp36-cp36m-manylinux1_x86_64.whl (13.1MB) Collecting numpy (from -r docs/docs_requirements.txt (line 6)) Downloading https://files.pythonhosted.org/packages/87/2d/e4656149cbadd3a8a0369fcd1a9c7d61cc7b87b3903b85389c70c989a696/numpy-1.16.4-cp36-cp36m-manylinux1_x86_64.whl (17.3MB) Collecting pandas (from -r docs/docs_requirements.txt (line 7)) Downloading 
https://files.pythonhosted.org/packages/19/74/e50234bc82c553fecdbd566d8650801e3fe2d6d8c8d940638e3d8a7c5522/pandas-0.24.2-cp36-cp36m-manylinux1_x86_64.whl (10.1MB) Collecting pytest (from -r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/69/1d/2430053122a3c6106f7fd1ff0bc68eb73e27db8f951db70fcd942da52c7b/pytest-5.0.1-py3-none-any.whl (221kB) Collecting psutil (from -r docs/docs_requirements.txt (line 9)) Downloading https://files.pythonhosted.org/packages/1c/ca/5b8c1fe032a458c2c4bcbe509d1401dca9dda35c7fc46b36bb81c2834740/psutil-5.6.3.tar.gz (435kB) Collecting scipy (from -r docs/docs_requirements.txt (line 10)) Downloading https://files.pythonhosted.org/packages/72/4c/5f81e7264b0a7a8bd570810f48cd346ba36faedbd2ba255c873ad556de76/scipy-1.3.0-cp36-cp36m-manylinux1_x86_64.whl (25.2MB) Collecting seaborn==0.8.1 (from -r docs/docs_requirements.txt (line 11)) Downloading https://files.pythonhosted.org/packages/10/01/dd1c7838cde3b69b247aaeb61016e238cafd8188a276e366d36aa6bcdab4/seaborn-0.8.1.tar.gz (178kB) Collecting sphinx==1.5.6 (from -r docs/docs_requirements.txt (line 12)) Downloading https://files.pythonhosted.org/packages/cd/c3/3fc2985e07f6111b47328be116df9e05d5c2f246a050e2e2ebf6bdc9c692/Sphinx-1.5.6-py2.py3-none-any.whl (1.6MB) Collecting sphinx-autobuild==0.7.1 (from -r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/41/21/d7407dd6258ca4f4dfe6b3edbd076702042c02bfcdd82b6f71cb58a359d2/sphinx-autobuild-0.7.1.tar.gz Collecting sphinx-rtd-theme==0.4.1 (from -r docs/docs_requirements.txt (line 14)) Downloading https://files.pythonhosted.org/packages/87/30/7460f7b77b6e8a080dd3688f750fe5d5666c49358f8941449c5b128fa97d/sphinx_rtd_theme-0.4.1-py2.py3-none-any.whl (5.4MB) Collecting tensorflow>=1.8.0 (from -r docs/docs_requirements.txt (line 15)) Downloading 
https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (109.2MB) Collecting tqdm (from -r docs/docs_requirements.txt (line 16)) Downloading https://files.pythonhosted.org/packages/9f/3d/7a6b68b631d2ab54975f3a4863f3c4e9b26445353264ef01f465dc9b0208/tqdm-4.32.2-py2.py3-none-any.whl (50kB) Requirement already satisfied: six in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) (1.12.0) Collecting pyglet<=1.3.2,>=1.2.0 (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Downloading https://files.pythonhosted.org/packages/1c/fc/dad5eaaab68f0c21e2f906a94ddb98175662cc5a654eee404d59554ce0fa/pyglet-1.3.2-py2.py3-none-any.whl (1.0MB) Requirement already satisfied: pygments in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) (2.3.1) Collecting pexpect; sys_platform != "win32" (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/0e/3e/377007e3f36ec42f1b84ec322ee12141a9e10d808312e5738f52f80a232c/pexpect-4.7.0-py2.py3-none-any.whl (58kB) Collecting decorator (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/5f/88/0075e461560a1e750a0dcbf77f1d9de775028c37a19a346a6c565a257399/decorator-4.4.0-py2.py3-none-any.whl Collecting jedi>=0.10 (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/4e/06/e906725a5b3ad7996bbdbfe9958aab75db64ef84bbaabefe47574de58865/jedi-0.14.1-py2.py3-none-any.whl (1.0MB) Collecting backcall (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading 
https://files.pythonhosted.org/packages/84/71/c8ca4f5bb1e08401b916c68003acf0a0655df935d74d93bf3f3364b310e0/backcall-0.1.0.tar.gz Collecting prompt-toolkit<2.1.0,>=2.0.0 (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/f7/a7/9b1dd14ef45345f186ef69d175bdd2491c40ab1dfa4b2b3e4352df719ed7/prompt_toolkit-2.0.9-py3-none-any.whl (337kB) Collecting pickleshare (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl Collecting traitlets>=4.2 (from ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/93/d6/abcb22de61d78e2fc3959c964628a5771e47e7cc60d53e9342e21ed6cc9a/traitlets-4.3.2-py2.py3-none-any.whl (74kB) Requirement already satisfied: setuptools>=18.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) (41.0.1) Collecting python-dateutil>=2.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Downloading https://files.pythonhosted.org/packages/41/17/c62faccbfbd163c7f57f3844689e3a78bae1f403648a6afb1d0866d87fbb/python_dateutil-2.8.0-py2.py3-none-any.whl (226kB) Collecting cycler>=0.10 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Downloading https://files.pythonhosted.org/packages/f7/d2/e07d3ebb2bd7af696440ce7e754c59dd546ffe1bbe732c8ab68b9c834e61/cycler-0.10.0-py2.py3-none-any.whl Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from matplotlib->-r docs/docs_requirements.txt (line 5)) (2.4.0) Collecting kiwisolver>=1.0.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Downloading 
https://files.pythonhosted.org/packages/f8/a1/5742b56282449b1c0968197f63eae486eca2c35dcd334bab75ad524e0de1/kiwisolver-1.1.0-cp36-cp36m-manylinux1_x86_64.whl (90kB) Requirement already satisfied: pytz>=2011k in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from pandas->-r docs/docs_requirements.txt (line 7)) (2019.1) Collecting wcwidth (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/7e/9f/526a6947247599b084ee5232e4f9190a38f398d7300d866af3ab571a5bfe/wcwidth-0.1.7-py2.py3-none-any.whl Collecting importlib-metadata>=0.12 (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/bd/23/dce4879ec58acf3959580bfe769926ed8198727250c5e395e6785c764a02/importlib_metadata-0.18-py2.py3-none-any.whl Collecting atomicwrites>=1.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/52/90/6155aa926f43f2b2a22b01be7241be3bfd1ceaf7d0b3267213e8127d41f4/atomicwrites-1.3.0-py2.py3-none-any.whl Collecting attrs>=17.4.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/23/96/d828354fa2dbdf216eaa7b7de0db692f12c234f7ef888cc14980ef40d1d2/attrs-19.1.0-py2.py3-none-any.whl Requirement already satisfied: packaging in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from pytest->-r docs/docs_requirements.txt (line 8)) (19.0) Collecting py>=1.5.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/76/bc/394ad449851729244a97857ee14d7cba61ddb268dce3db538ba2f2ba1f0f/py-1.8.0-py2.py3-none-any.whl (83kB) Collecting pluggy<1.0,>=0.12 (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading 
https://files.pythonhosted.org/packages/06/ee/de89e0582276e3551df3110088bf20844de2b0e7df2748406876cc78e021/pluggy-0.12.0-py2.py3-none-any.whl Collecting more-itertools>=4.0.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/1f/9e/942df77ddde2fae3f319f2ab8b5d00d5f6b115496e2eb4bad37d1aaefeea/more_itertools-7.1.0-py3-none-any.whl (55kB) Requirement already satisfied: docutils>=0.11 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (0.14) Requirement already satisfied: requests>=2.0.0 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (2.22.0) Requirement already satisfied: alabaster<0.8,>=0.7 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (0.7.12) Requirement already satisfied: snowballstemmer>=1.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (1.9.0) Requirement already satisfied: babel!=2.0,>=1.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (2.7.0) Requirement already satisfied: imagesize in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (1.1.0) Requirement already satisfied: Jinja2>=2.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from 
sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (2.10.1) Collecting watchdog>=0.7.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/bb/e3/5a55d48a29300160779f0a0d2776d17c1b762a2039b36de528b093b87d5b/watchdog-0.9.0.tar.gz (85kB) Collecting argh>=0.24.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/06/1c/e667a7126f0b84aaa1c56844337bf0ac12445d1beb9c8a6199a7314944bf/argh-0.26.2-py2.py3-none-any.whl Collecting pathtools>=0.1.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/e7/7f/470d6fcdf23f9f3518f6b0b76be9df16dcc8630ad409947f8be2eb0ed13a/pathtools-0.1.2.tar.gz Collecting PyYAML>=3.10 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/a3/65/837fefac7475963d1eccf4aa684c23b95aa6c1d033a2c5965ccb11e22623/PyYAML-5.1.1.tar.gz (274kB) Collecting tornado>=3.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/30/78/2d2823598496127b21423baffaa186b668f73cd91887fcef78b6eade136b/tornado-6.0.3.tar.gz (482kB) Collecting port_for==0.3.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/ec/f1/e7d7a36b5f3e77fba587ae3ea4791512ffff74bc1d065d6185e463279bc4/port-for-0.3.1.tar.gz Collecting livereload>=2.3.0 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Downloading https://files.pythonhosted.org/packages/12/4d/30cfe74402d2e962d66d35da29bf8850b0557b559ce84d09967c8ade859e/livereload-2.6.1-py2.py3-none-any.whl Collecting keras-preprocessing>=1.0.5 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading 
https://files.pythonhosted.org/packages/28/6a/8c1f62c37212d9fc441a7e26736df51ce6f0e38455816445471f10da4f0a/Keras_Preprocessing-1.1.0-py2.py3-none-any.whl (41kB) Collecting keras-applications>=1.0.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/71/e3/19762fdfc62877ae9102edf6342d71b28fbfd9dea3d2f96a882ce099b03f/Keras_Applications-1.0.8-py3-none-any.whl (50kB) Collecting absl-py>=0.7.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/da/3f/9b0355080b81b15ba6a9ffcf1f5ea39e307a2778b2f2dc8694724e8abd5b/absl-py-0.7.1.tar.gz (99kB) Collecting gast>=0.2.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz Collecting tensorboard<1.15.0,>=1.14.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/91/2d/2ed263449a078cd9c8a9ba50ebd50123adf1f8cfbea1492f9084169b89d9/tensorboard-1.14.0-py3-none-any.whl (3.1MB) Collecting tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/3c/d5/21860a5b11caf0678fbc8319341b0ae21a07156911132e0e71bffed0510d/tensorflow_estimator-1.14.0-py2.py3-none-any.whl (488kB) Collecting termcolor>=1.1.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz Collecting astor>=0.6.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/d1/4f/950dfae467b384fc96bc6469de25d832534f6b4441033c39f914efd13418/astor-0.8.0-py2.py3-none-any.whl Collecting google-pasta>=0.1.6 (from tensorflow>=1.8.0->-r 
docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/d0/33/376510eb8d6246f3c30545f416b2263eee461e40940c2a4413c711bdf62d/google_pasta-0.1.7-py3-none-any.whl (52kB) Collecting wrapt>=1.11.1 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/23/84/323c2415280bc4fc880ac5050dddfb3c8062c2552b34c2e512eb4aa68f79/wrapt-1.11.2.tar.gz Collecting protobuf>=3.6.1 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/dc/0e/e7cdff89745986c984ba58e6ff6541bc5c388dd9ab9d7d312b3b1532584a/protobuf-3.9.0-cp36-cp36m-manylinux1_x86_64.whl (1.2MB) Requirement already satisfied: wheel>=0.26 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) (0.33.4) Collecting grpcio>=1.8.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/f2/5d/b434403adb2db8853a97828d3d19f2032e79d630e0d11a8e95d243103a11/grpcio-1.22.0-cp36-cp36m-manylinux1_x86_64.whl (2.2MB) Requirement already satisfied: future in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from pyglet<=1.3.2,>=1.2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) (0.17.1) Collecting ptyprocess>=0.5 (from pexpect; sys_platform != "win32"->ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/d1/29/605c2cc68a9992d18dada28206eeada56ea4bd07a239669da41674648b6f/ptyprocess-0.6.0-py2.py3-none-any.whl Collecting parso>=0.5.0 (from jedi>=0.10->ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/a3/bd/bf4e5bd01d79906e5b945a7af033154da49fd2b0d5b5c705a21330323305/parso-0.5.1-py2.py3-none-any.whl (95kB) Collecting 
ipython-genutils (from traitlets>=4.2->ipython->-r docs/docs_requirements.txt (line 3)) Downloading https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl Collecting zipp>=0.5 (from importlib-metadata>=0.12->pytest->-r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/da/bd/1a5fdf15aa44231fd09f63ecf175b60f057ae37ec65b343bb009364923f3/zipp-0.5.2-py2.py3-none-any.whl Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0.0->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (1.25.3) Requirement already satisfied: idna<2.9,>=2.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0.0->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (2.8) Requirement already satisfied: certifi>=2017.4.17 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0.0->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (2019.6.16) Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0.0->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (3.0.4) Requirement already satisfied: MarkupSafe>=0.23 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from Jinja2>=2.3->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) (1.1.1) Collecting h5py (from keras-applications>=1.0.6->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading 
https://files.pythonhosted.org/packages/30/99/d7d4fbf2d02bb30fb76179911a250074b55b852d34e98dd452a9f394ac06/h5py-2.9.0-cp36-cp36m-manylinux1_x86_64.whl (2.8MB) Collecting werkzeug>=0.11.15 (from tensorboard<1.15.0,>=1.14.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/9f/57/92a497e38161ce40606c27a86759c6b92dd34fcdb33f64171ec559257c02/Werkzeug-0.15.4-py2.py3-none-any.whl (327kB) Collecting markdown>=2.6.8 (from tensorboard<1.15.0,>=1.14.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Downloading https://files.pythonhosted.org/packages/c0/4e/fd492e91abdc2d2fcb70ef453064d980688762079397f779758e055f6575/Markdown-3.1.1-py2.py3-none-any.whl (87kB) Building wheels for collected packages: gym, psutil, seaborn, sphinx-autobuild, backcall, watchdog, pathtools, PyYAML, tornado, port-for, absl-py, gast, termcolor, wrapt Building wheel for gym (setup.py): started Building wheel for gym (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/95/14/8e/b4f5c72600f654312b40c0844d4c23f146f291c48ac7a5df62 Building wheel for psutil (setup.py): started Building wheel for psutil (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/90/7e/74/bb640d77775e6b6a78bcc3120f9fea4d2a28b2706de1cff37d Building wheel for seaborn (setup.py): started Building wheel for seaborn (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/26/0a/44/53ddd89769e62f7c6691976375b86c6492e7dd20a2d3970e32 Building wheel for sphinx-autobuild (setup.py): started Building wheel for sphinx-autobuild (setup.py): finished with status 'done' Stored in directory: 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/b8/4e/e7/4f5c82cd66a171ac79006454fb74f576ed9d4f14bf66f75e0c Building wheel for backcall (setup.py): started Building wheel for backcall (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/98/b0/dd/29e28ff615af3dda4c67cab719dd51357597eabff926976b45 Building wheel for watchdog (setup.py): started Building wheel for watchdog (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/61/1d/d0/04cfe495619be2095eb8d89a31c42adb4e42b76495bc8f784c Building wheel for pathtools (setup.py): started Building wheel for pathtools (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/0b/04/79/c3b0c3a0266a3cb4376da31e5bfe8bba0c489246968a68e843 Building wheel for PyYAML (setup.py): started Building wheel for PyYAML (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/16/27/a1/775c62ddea7bfa62324fd1f65847ed31c55dadb6051481ba3f Building wheel for tornado (setup.py): started Building wheel for tornado (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/84/bf/40/2f6ef700f48401ca40e5e3dd7d0e3c0a90e064897b7fe5fc08 Building wheel for port-for (setup.py): started Building wheel for port-for (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/76/d3/1b/ea48e3544c50666eed11eac26df8c741d197106ded6fd646e3 Building wheel for absl-py (setup.py): started Building wheel for absl-py (setup.py): finished 
with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/ee/98/38/46cbcc5a93cfea5492d19c38562691ddb23b940176c14f7b48 Building wheel for gast (setup.py): started Building wheel for gast (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/5c/2e/7e/a1d4d4fcebe6c381f378ce7743a3ced3699feb89bcfbdadadd Building wheel for termcolor (setup.py): started Building wheel for termcolor (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/7c/06/54/bc84598ba1daf8f970247f550b175aaaee85f68b4b0c5ab2c6 Building wheel for wrapt (setup.py): started Building wheel for wrapt (setup.py): finished with status 'done' Stored in directory: /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip/wheels/d7/de/2e/efa132238792efb6459a96e85916ef8597fcb3d2ae51590dfd Successfully built gym psutil seaborn sphinx-autobuild backcall watchdog pathtools PyYAML tornado port-for absl-py gast termcolor wrapt ERROR: gym 0.13.1 has requirement cloudpickle~=1.2.0, but you'll have cloudpickle 0.5.2 which is incompatible. 
Installing collected packages: cloudpickle, numpy, scipy, pyglet, gym, ptyprocess, pexpect, decorator, parso, jedi, backcall, wcwidth, prompt-toolkit, pickleshare, ipython-genutils, traitlets, ipython, joblib, python-dateutil, cycler, kiwisolver, matplotlib, pandas, zipp, importlib-metadata, atomicwrites, attrs, py, pluggy, more-itertools, pytest, psutil, seaborn, sphinx, PyYAML, argh, pathtools, watchdog, tornado, port-for, livereload, sphinx-autobuild, sphinx-rtd-theme, keras-preprocessing, h5py, keras-applications, absl-py, gast, protobuf, grpcio, werkzeug, markdown, tensorboard, tensorflow-estimator, termcolor, astor, google-pasta, wrapt, tensorflow, tqdm Found existing installation: Sphinx 1.8.5 Uninstalling Sphinx-1.8.5: Successfully uninstalled Sphinx-1.8.5 Found existing installation: sphinx-rtd-theme 0.4.3 Uninstalling sphinx-rtd-theme-0.4.3: Successfully uninstalled sphinx-rtd-theme-0.4.3 Successfully installed PyYAML-5.1.1 absl-py-0.7.1 argh-0.26.2 astor-0.8.0 atomicwrites-1.3.0 attrs-19.1.0 backcall-0.1.0 cloudpickle-0.5.2 cycler-0.10.0 decorator-4.4.0 gast-0.2.2 google-pasta-0.1.7 grpcio-1.22.0 gym-0.13.1 h5py-2.9.0 importlib-metadata-0.18 ipython-7.6.1 ipython-genutils-0.2.0 jedi-0.14.1 joblib-0.13.2 keras-applications-1.0.8 keras-preprocessing-1.1.0 kiwisolver-1.1.0 livereload-2.6.1 markdown-3.1.1 matplotlib-3.1.1 more-itertools-7.1.0 numpy-1.16.4 pandas-0.24.2 parso-0.5.1 pathtools-0.1.2 pexpect-4.7.0 pickleshare-0.7.5 pluggy-0.12.0 port-for-0.3.1 prompt-toolkit-2.0.9 protobuf-3.9.0 psutil-5.6.3 ptyprocess-0.6.0 py-1.8.0 pyglet-1.3.2 pytest-5.0.1 python-dateutil-2.8.0 scipy-1.3.0 seaborn-0.8.1 sphinx-1.5.6 sphinx-autobuild-0.7.1 sphinx-rtd-theme-0.4.1 tensorboard-1.14.0 tensorflow-1.14.0 tensorflow-estimator-1.14.0 termcolor-1.1.0 tornado-6.0.3 tqdm-4.32.2 traitlets-4.3.2 watchdog-0.9.0 wcwidth-0.1.7 werkzeug-0.15.4 wrapt-1.11.2 zipp-0.5.2 [rtd-command-info] start-time: 2019-07-15T17:13:33.026030Z, end-time: 2019-07-15T17:13:33.334968Z, duration: 0, 
exit-code: 0 cat docs/conf.py #!/usr/bin/env python3 # -*- coding: utf-8 -*- # # Spinning Up documentation build configuration file, created by # sphinx-quickstart on Wed Aug 15 04:21:07 2018. # # This file is execfile()d with the current directory set to its # containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. # import os import sys # Make sure spinup is accessible without going through setup.py dirname = os.path.dirname sys.path.insert(0, dirname(dirname(__file__))) # Mock mpi4py to get around having to install it on RTD server (which fails) from unittest.mock import MagicMock class Mock(MagicMock): @classmethod def __getattr__(cls, name): return MagicMock() MOCK_MODULES = ['mpi4py'] sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES) # Finish imports import spinup from recommonmark.parser import CommonMarkParser source_parsers = { '.md': CommonMarkParser, } # -- General configuration ------------------------------------------------ # If your documentation needs a minimal Sphinx version, state it here. # # needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = ['sphinx.ext.imgmath', 'sphinx.ext.viewcode', 'sphinx.ext.autodoc', 'sphinx.ext.napoleon'] #'sphinx.ext.mathjax', ?? # imgmath settings imgmath_image_format = 'svg' imgmath_font_size = 14 # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix(es) of source filenames. 
# You can specify multiple suffix as a list of string: # source_suffix = ['.rst', '.md'] # source_suffix = '.rst' # The master toctree document. master_doc = 'index' # General information about the project. project = 'Spinning Up' copyright = '2018, OpenAI' author = 'Joshua Achiam' # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version. version = '' # The full version, including alpha/beta/rc tags. release = '' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. # # This is also used if you do content translation via gettext catalogs. # Usually you set "language" from the command line for these cases. language = None # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This patterns also effect to html_static_path and html_extra_path exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'default' #'sphinx' # If true, `todo` and `todoList` produce output, else they produce nothing. todo_include_todos = False # -- Options for HTML output ---------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # # html_theme = 'alabaster' html_theme = "sphinx_rtd_theme" # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. # # html_theme_options = {} # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". 
html_static_path = ['_static'] html_logo = 'images/spinning-up-logo2.png' html_theme_options = { 'logo_only': True } #html_favicon = 'openai-favicon2_32x32.ico' html_favicon = 'openai_icon.ico' # -- Options for HTMLHelp output ------------------------------------------ # Output file base name for HTML help builder. htmlhelp_basename = 'SpinningUpdoc' # -- Options for LaTeX output --------------------------------------------- latex_elements = { # The paper size ('letterpaper' or 'a4paper'). # # 'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). # # 'pointsize': '10pt', # Additional stuff for the LaTeX preamble. # # 'preamble': '', # Latex figure (float) alignment # # 'figure_align': 'htbp', } imgmath_latex_preamble = r''' \usepackage{algorithm} \usepackage{algorithmic} \usepackage{cancel} \usepackage[verbose=true,letterpaper]{geometry} \geometry{ textheight=12in, textwidth=6.5in, top=1in, headheight=12pt, headsep=25pt, footskip=30pt } \newcommand{\E}{{\mathrm E}} \newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1 \end{subarray}}{\E}\left[ #2 \right]} \newcommand{\Epi}[1]{\underset{\begin{subarray}{c}\tau \sim \pi \end{subarray}}{\E}\left[ #1 \right]} ''' # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ (master_doc, 'SpinningUp.tex', 'Spinning Up Documentation', 'Joshua Achiam', 'manual'), ] # -- Options for manual page output --------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ (master_doc, 'spinningup', 'Spinning Up Documentation', [author], 1) ] # -- Options for Texinfo output ------------------------------------------- # Grouping the document tree into Texinfo files. 
List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ (master_doc, 'SpinningUp', 'Spinning Up Documentation', author, 'SpinningUp', 'One line description of project.', 'Miscellaneous'), ] def setup(app): app.add_stylesheet('css/modify.css') ########################################################################### # auto-created readthedocs.org specific configuration # ########################################################################### # # The following code was added during an automated build on readthedocs.org # It is auto created and injected for every build. The result is based on the # conf.py.tmpl file found in the readthedocs.org codebase: # https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl # import importlib import sys import os.path from six import string_types from sphinx import version_info # Get suffix for proper linking to GitHub # This is deprecated in Sphinx 1.3+, # as each page can have its own suffix if globals().get('source_suffix', False): if isinstance(source_suffix, string_types): SUFFIX = source_suffix elif isinstance(source_suffix, (list, tuple)): # Sphinx >= 1.3 supports list/tuple to define multiple suffixes SUFFIX = source_suffix[0] elif isinstance(source_suffix, dict): # Sphinx >= 1.8 supports a mapping dictionary for mulitple suffixes SUFFIX = list(source_suffix.keys())[0] # make a ``list()`` for py2/py3 compatibility else: # default to .rst SUFFIX = '.rst' else: SUFFIX = '.rst' # Add RTD Static Path. Add to the end because it overwrites previous files. 
if not 'html_static_path' in globals(): html_static_path = [] if os.path.exists('_static'): html_static_path.append('_static') # Add RTD Theme only if they aren't overriding it already using_rtd_theme = ( ( 'html_theme' in globals() and html_theme in ['default'] and # Allow people to bail with a hack of having an html_style 'html_style' not in globals() ) or 'html_theme' not in globals() ) if using_rtd_theme: theme = importlib.import_module('sphinx_rtd_theme') html_theme = 'sphinx_rtd_theme' html_style = None html_theme_options = {} if 'html_theme_path' in globals(): html_theme_path.append(theme.get_html_theme_path()) else: html_theme_path = [theme.get_html_theme_path()] if globals().get('websupport2_base_url', False): websupport2_base_url = 'https://readthedocs.com/websupport' websupport2_static_url = 'https://media.readthedocs.com/' #Add project information to the template context. context = { 'using_theme': using_rtd_theme, 'html_theme': html_theme, 'current_version': "latest", 'version_slug': "latest", 'MEDIA_URL': "https://media.readthedocs.com/media/", 'STATIC_URL': "https://media.readthedocs.com/", 'PRODUCTION_DOMAIN': "readthedocs.com", 'versions': [ ("latest", "/en/latest/"), ], 'downloads': [ ("pdf", "//readthedocs.com/projects/openai-education-spinningup/downloads/pdf/latest/"), ("html", "//readthedocs.com/projects/openai-education-spinningup/downloads/htmlzip/latest/"), ("epub", "//readthedocs.com/projects/openai-education-spinningup/downloads/epub/latest/"), ], 'subprojects': [ ], 'slug': 'openai-education-spinningup', 'name': u'spinningup', 'rtd_language': u'en', 'programming_language': u'words', 'canonical_url': 'https://spinningup.openai.com/en/latest/', 'analytics_code': 'UA-129132782-1', 'single_version': False, 'conf_py_path': '/docs/', 'api_host': 'https://readthedocs.com', 'github_user': 'openai', 'github_repo': 'spinningup', 'github_version': 'master', 'display_github': True, 'bitbucket_user': 'None', 'bitbucket_repo': 'None', 
'bitbucket_version': 'master', 'display_bitbucket': False, 'gitlab_user': 'None', 'gitlab_repo': 'None', 'gitlab_version': 'master', 'display_gitlab': False, 'READTHEDOCS': True, 'using_theme': (html_theme == "default"), 'new_theme': (html_theme == "sphinx_rtd_theme"), 'source_suffix': SUFFIX, 'ad_free': False, 'user_analytics_code': 'UA-129132782-1', 'global_analytics_code': 'UA-17997319-2', 'commit': '2e0eff9b', } if 'html_context' in globals(): html_context.update(context) else: html_context = context # Add custom RTD extension if 'extensions' in globals(): # Insert at the beginning because it can interfere # with other extensions. # See https://github.com/rtfd/readthedocs.org/pull/4054 extensions.insert(0, "readthedocs_ext.readthedocs") else: extensions = ["readthedocs_ext.readthedocs"] project_language = 'en' # User's Sphinx configurations language_user = globals().get('language', None) latex_engine_user = globals().get('latex_engine', None) latex_elements_user = globals().get('latex_elements', None) # Remove this once xindy gets installed in Docker image and XINDYOPS # env variable is supported # https://github.com/rtfd/readthedocs-docker-images/pull/98 latex_use_xindy = False chinese = any([ language_user in ('zh_CN', 'zh_TW'), project_language in ('zh_CN', 'zh_TW'), ]) japanese = any([ language_user == 'ja', project_language == 'ja', ]) if chinese: latex_engine = latex_engine_user or 'xelatex' latex_elements_rtd = { 'preamble': '\\usepackage[UTF8]{ctex}\n', } latex_elements = latex_elements_user or latex_elements_rtd elif japanese: latex_engine = latex_engine_user or 'platex' [rtd-command-info] start-time: 2019-07-15T17:13:33.418873Z, end-time: 2019-07-15T17:15:03.440567Z, duration: 90, exit-code: 0 python sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html Running Sphinx v1.5.6 making output directory... WARNING: Logging before flag parsing goes to stderr. 
W0715 17:13:37.655814 140251884380288 deprecation_wrapper.py:119] From /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/spinup/utils/mpi_tf.py:29: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead. loading translations [en]... done building [mo]: targets for 0 po files that are out of date building [readthedocs]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... 
[100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree done preparing documents... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... [ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... [ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... [ 61%] spinningup/rl_intro3 writing output... [ 64%] spinningup/rl_intro4 writing output... [ 67%] spinningup/spinningup writing output... [ 70%] user/algorithms writing output... [ 74%] user/installation writing output... [ 77%] user/introduction writing output... 
[ 80%] user/plotting writing output... [ 83%] user/running writing output... [ 87%] user/saving_and_loading writing output... [ 90%] utils/logger writing output... [ 93%] utils/mpi writing output... [ 96%] utils/plotter writing output... [100%] utils/run_utils generating indices... genindex py-modindex highlighting module code... [ 10%] spinup.algos.ddpg.ddpg highlighting module code... [ 20%] spinup.algos.ppo.ppo highlighting module code... [ 30%] spinup.algos.sac.sac highlighting module code... [ 40%] spinup.algos.td3.td3 highlighting module code... [ 50%] spinup.algos.trpo.trpo highlighting module code... [ 60%] spinup.algos.vpg.vpg highlighting module code... [ 70%] spinup.utils.logx highlighting module code... [ 80%] spinup.utils.mpi_tools highlighting module code... [ 90%] spinup.utils.mpi_tf highlighting module code... [100%] spinup.utils.run_utils writing additional pages... search copying images... [ 10%] images/spinning-up-in-rl.png copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg copying images... [ 40%] spinningup/../images/bench/bench_walker.svg copying images... [ 50%] spinningup/../images/bench/bench_swim.svg copying images... [ 60%] spinningup/../images/bench/bench_ant.svg copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg copying static files... WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done dumping search index in English (code: en) ... done dumping object inventory... done build succeeded, 15 warnings. 
[rtd-command-info] start-time: 2019-07-15T17:15:03.714599Z, end-time: 2019-07-15T17:16:18.383261Z, duration: 74, exit-code: 0 python sphinx-build -T -b readthedocssinglehtmllocalmedia -d _build/doctrees-readthedocssinglehtmllocalmedia -D language=en . _build/localmedia Running Sphinx v1.5.6 making output directory... WARNING: Logging before flag parsing goes to stderr. W0715 17:15:06.350558 140106452594816 deprecation_wrapper.py:119] From /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/spinup/utils/mpi_tf.py:29: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead. loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [readthedocssinglehtmllocalmedia]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... 
[ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". 
looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done preparing documents... done assembling single document... user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author writing... done writing additional files... copying images... [ 12%] images/spinning-up-in-rl.png copying images... [ 25%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [ 37%] spinningup/../images/rl_algorithms_9_15.svg copying images... [ 50%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 62%] spinningup/../images/bench/bench_hopper.svg copying images... [ 75%] spinningup/../images/bench/bench_walker.svg copying images... 
[ 87%] spinningup/../images/bench/bench_swim.svg copying images... [100%] spinningup/../images/bench/bench_ant.svg copying static files... done WARNING: favicon file 'openai_icon.ico' does not exist copying extra files... done dumping object inventory... done build succeeded, 15 warnings. [rtd-command-info] start-time: 2019-07-15T17:16:18.589551Z, end-time: 2019-07-15T17:16:25.329665Z, duration: 6, exit-code: 0 python sphinx-build -b latex -D language=en -d _build/doctrees . _build/latex Running Sphinx v1.5.6 making output directory... WARNING: Logging before flag parsing goes to stderr. W0715 17:16:21.306950 140558918713472 deprecation_wrapper.py:119] From /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/spinup/utils/mpi_tf.py:29: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead. loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [latex]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... 
[ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree done /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree processing SpinningUp.tex... index user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author resolving references... writing... done copying images... 
images/spinning-up-in-rl.png spinningup/../images/rl_diagram_transparent_bg.png spinningup/../images/rl_algorithms_9_15.svg spinningup/../images/bench/bench_halfcheetah.svg spinningup/../images/bench/bench_hopper.svg spinningup/../images/bench/bench_walker.svg spinningup/../images/bench/bench_swim.svg spinningup/../images/bench/bench_ant.svg copying TeX support files... done build succeeded, 14 warnings. [rtd-command-info] start-time: 2019-07-15T17:16:25.767158Z, end-time: 2019-07-15T17:16:27.373630Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2017-04-15> Babel <3.18> and hyphenation patterns for 84 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel/switch.def) (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def (/usr/share/texlive/texmf-dist/tex/generic/babel/txtbabel.def)))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics-cfg/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/graphics-def/pdftex.def))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) (/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics-cfg/color.cfg)) 
(/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx No file SpinningUp.aux. (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/mkii/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] [1] [2] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in- rl.png>] [2] Chapter 1. (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) LaTeX Warning: Hyper reference `user/introduction:introduction' on page 3 undef ined on input line 68. LaTeX Warning: Hyper reference `user/introduction:what-this-is' on page 3 undef ined on input line 71. 
LaTeX Warning: Hyper reference `user/introduction:why-we-built-this' on page 3 undefined on input line 74. LaTeX Warning: Hyper reference `user/introduction:how-this-serves-our-mission' on page 3 undefined on input line 77. LaTeX Warning: Hyper reference `user/introduction:code-design-philosophy' on pa ge 3 undefined on input line 80. LaTeX Warning: Hyper reference `user/introduction:support-plan' on page 3 undef ined on input line 83. [3] [4] [5] [6] Chapter 2. LaTeX Warning: Hyper reference `user/installation:installation' on page 7 undef ined on input line 207. LaTeX Warning: Hyper reference `user/installation:installing-python' on page 7 undefined on input line 210. LaTeX Warning: Hyper reference `user/installation:installing-openmpi' on page 7 undefined on input line 213. LaTeX Warning: Hyper reference `user/installation:ubuntu' on page 7 undefined o n input line 216. LaTeX Warning: Hyper reference `user/installation:mac-os-x' on page 7 undefined on input line 219. LaTeX Warning: Hyper reference `user/installation:installing-spinning-up' on pa ge 7 undefined on input line 224. LaTeX Warning: Hyper reference `user/installation:check-your-install' on page 7 undefined on input line 227. LaTeX Warning: Hyper reference `user/installation:installing-mujoco-optional' o n page 7 undefined on input line 230. [7] [8] [9] [10] Chapter 3. LaTeX Warning: Hyper reference `user/algorithms:algorithms' on page 11 undefine d on input line 361. LaTeX Warning: Hyper reference `user/algorithms:what-s-included' on page 11 und efined on input line 364. LaTeX Warning: Hyper reference `user/algorithms:why-these-algorithms' on page 1 1 undefined on input line 367. LaTeX Warning: Hyper reference `user/algorithms:the-on-policy-algorithms' on pa ge 11 undefined on input line 370. LaTeX Warning: Hyper reference `user/algorithms:the-off-policy-algorithms' on p age 11 undefined on input line 373. 
LaTeX Warning: Hyper reference `user/algorithms:code-format' on page 11 undefin ed on input line 378. LaTeX Warning: Hyper reference `user/algorithms:the-algorithm-file' on page 11 undefined on input line 381. LaTeX Warning: Hyper reference `user/algorithms:the-core-file' on page 11 undef ined on input line 384. [11] [12] [13] [14] Chapter 4. LaTeX Warning: Hyper reference `user/running:running-experiments' on page 15 un defined on input line 530. LaTeX Warning: Hyper reference `user/running:launching-from-the-command-line' o n page 15 undefined on input line 533. LaTeX Warning: Hyper reference `user/running:setting-hyperparameters-from-the-c ommand-line' on page 15 undefined on input line 536. LaTeX Warning: Hyper reference `user/running:launching-multiple-experiments-at- once' on page 15 undefined on input line 539. LaTeX Warning: Hyper reference `user/running:special-flags' on page 15 undefine d on input line 542. LaTeX Warning: Hyper reference `user/running:environment-flag' on page 15 undef ined on input line 545. LaTeX Warning: Hyper reference `user/running:shortcut-flags' on page 15 undefin ed on input line 548. LaTeX Warning: Hyper reference `user/running:config-flags' on page 15 undefined on input line 551. LaTeX Warning: Hyper reference `user/running:where-results-are-saved' on page 1 5 undefined on input line 556. LaTeX Warning: Hyper reference `user/running:how-is-suffix-determined' on page 15 undefined on input line 559. LaTeX Warning: Hyper reference `user/running:extra' on page 15 undefined on inp ut line 564. LaTeX Warning: Hyper reference `user/running:launching-from-scripts' on page 15 undefined on input line 569. LaTeX Warning: Hyper reference `user/running:using-experimentgrid' on page 15 u ndefined on input line 572. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. LaTeX Warning: Hyper reference `user/saving_and_loading:experiment-outputs' on page 21 undefined on input line 900. 
LaTeX Warning: Hyper reference `user/saving_and_loading:algorithm-outputs' on p age 21 undefined on input line 903. LaTeX Warning: Hyper reference `user/saving_and_loading:save-directory-location ' on page 21 undefined on input line 906. LaTeX Warning: Hyper reference `user/saving_and_loading:loading-and-running-tra ined-policies' on page 21 undefined on input line 909. LaTeX Warning: Hyper reference `user/saving_and_loading:if-environment-saves-su ccessfully' on page 21 undefined on input line 912. LaTeX Warning: Hyper reference `user/saving_and_loading:environment-not-found-e rror' on page 21 undefined on input line 915. LaTeX Warning: Hyper reference `user/saving_and_loading:using-trained-value-fun ctions' on page 21 undefined on input line 918. [21] LaTeX Warning: Hyper reference `user/saving_and_loading:details-below' on page 22 undefined on input line 963. [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. LaTeX Warning: Hyper reference `spinningup/rl_intro:part-1-key-concepts-in-rl' on page 29 undefined on input line 1279. LaTeX Warning: Hyper reference `spinningup/rl_intro:what-can-rl-do' on page 29 undefined on input line 1282. LaTeX Warning: Hyper reference `spinningup/rl_intro:key-concepts-and-terminolog y' on page 29 undefined on input line 1285. LaTeX Warning: Hyper reference `spinningup/rl_intro:optional-formalism' on page 29 undefined on input line 1288. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1531 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1531 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1550 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1550 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1557 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1557 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1564 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1564 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1571 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1571 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1585 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1585 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1618 \end{align*} ! Missing } inserted. } l.1618 \end{align*} ! Missing { inserted. { l.1618 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1618 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1618 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1618 \end{align*} ! Missing } inserted. } l.1618 \end{align*} ! Missing \endgroup inserted. \endgroup l.1618 \end{align*} ! 
Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1618 \end{align*} ! Missing { inserted. { l.1618 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1618 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1618 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1625 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1625 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1625 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1625 \end{align*} [36] [37] [38] Chapter 8. LaTeX Warning: Hyper reference `spinningup/rl_intro2:part-2-kinds-of-rl-algorit hms' on page 39 undefined on input line 1678. LaTeX Warning: Hyper reference `spinningup/rl_intro2:a-taxonomy-of-rl-algorithm s' on page 39 undefined on input line 1681. LaTeX Warning: Hyper reference `spinningup/rl_intro2:links-to-algorithms-in-tax onomy' on page 39 undefined on input line 1684. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1699 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1699 ...ncludegraphics{{rl_algorithms_9_15}.svg} LaTeX Warning: Hyper reference `spinningup/rl_intro2:citations-below' on page 3 9 undefined on input line 1700. [39] [40] [41] [42] Chapter 9. LaTeX Warning: Hyper reference `spinningup/rl_intro3:part-3-intro-to-policy-opt imization' on page 43 undefined on input line 1841. 
LaTeX Warning: Hyper reference `spinningup/rl_intro3:deriving-the-simplest-poli cy-gradient' on page 43 undefined on input line 1844. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-the-simplest- policy-gradient' on page 43 undefined on input line 1847. LaTeX Warning: Hyper reference `spinningup/rl_intro3:expected-grad-log-prob-lem ma' on page 43 undefined on input line 1850. LaTeX Warning: Hyper reference `spinningup/rl_intro3:don-t-let-the-past-distrac t-you' on page 43 undefined on input line 1853. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-reward-to-go- policy-gradient' on page 43 undefined on input line 1856. LaTeX Warning: Hyper reference `spinningup/rl_intro3:baselines-in-policy-gradie nts' on page 43 undefined on input line 1859. LaTeX Warning: Hyper reference `spinningup/rl_intro3:other-forms-of-the-policy- gradient' on page 43 undefined on input line 1862. LaTeX Warning: Hyper reference `spinningup/rl_intro3:recap' on page 43 undefine d on input line 1865. ! Undefined control sequence. l.1890 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... 
l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2082 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2082 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2099 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2099 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2107 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2107 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2115 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2115 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2167 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! 
Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2167 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2171 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2171 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2185 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2185 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2199 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2199 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. LaTeX Warning: Hyper reference `spinningup/spinningup:spinning-up-as-a-deep-rl- researcher' on page 53 undefined on input line 2253. LaTeX Warning: Hyper reference `spinningup/spinningup:the-right-background' on page 53 undefined on input line 2256. LaTeX Warning: Hyper reference `spinningup/spinningup:learn-by-doing' on page 5 3 undefined on input line 2259. LaTeX Warning: Hyper reference `spinningup/spinningup:developing-a-research-pro ject' on page 53 undefined on input line 2262. LaTeX Warning: Hyper reference `spinningup/spinningup:doing-rigorous-research-i n-rl' on page 53 undefined on input line 2265. LaTeX Warning: Hyper reference `spinningup/spinningup:closing-thoughts' on page 53 undefined on input line 2268. LaTeX Warning: Hyper reference `spinningup/spinningup:ps-other-resources' on pa ge 53 undefined on input line 2271. 
LaTeX Warning: Hyper reference `spinningup/spinningup:references' on page 53 un defined on input line 2274. [53] [54] [55] [56] [57] [58] Chapter 11. LaTeX Warning: Hyper reference `spinningup/keypapers:key-papers-in-deep-rl' on page 59 undefined on input line 2383. LaTeX Warning: Hyper reference `spinningup/keypapers:model-free-rl' on page 59 undefined on input line 2386. LaTeX Warning: Hyper reference `spinningup/keypapers:exploration' on page 59 un defined on input line 2389. LaTeX Warning: Hyper reference `spinningup/keypapers:transfer-and-multitask-rl' on page 59 undefined on input line 2392. LaTeX Warning: Hyper reference `spinningup/keypapers:hierarchy' on page 59 unde fined on input line 2395. LaTeX Warning: Hyper reference `spinningup/keypapers:memory' on page 59 undefin ed on input line 2398. LaTeX Warning: Hyper reference `spinningup/keypapers:model-based-rl' on page 59 undefined on input line 2401. LaTeX Warning: Hyper reference `spinningup/keypapers:meta-rl' on page 59 undefi ned on input line 2404. LaTeX Warning: Hyper reference `spinningup/keypapers:scaling-rl' on page 59 und efined on input line 2407. LaTeX Warning: Hyper reference `spinningup/keypapers:rl-in-the-real-world' on p age 59 undefined on input line 2410. LaTeX Warning: Hyper reference `spinningup/keypapers:safety' on page 59 undefin ed on input line 2413. LaTeX Warning: Hyper reference `spinningup/keypapers:imitation-learning-and-inv erse-reinforcement-learning' on page 59 undefined on input line 2416. LaTeX Warning: Hyper reference `spinningup/keypapers:reproducibility-analysis-a nd-critique' on page 59 undefined on input line 2419. LaTeX Warning: Hyper reference `spinningup/keypapers:bonus-classic-papers-in-rl -theory-or-review' on page 59 undefined on input line 2422. [59] [60] Overfull \vbox (103.35579pt too high) has occurred while \output is active [61] [62] Chapter 12. LaTeX Warning: Hyper reference `spinningup/exercises:exercises' on page 63 unde fined on input line 2511. 
LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-1-basics-of-im plementation' on page 63 undefined on input line 2514. LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-2-algorithm-fa ilure-modes' on page 63 undefined on input line 2517. LaTeX Warning: Hyper reference `spinningup/exercises:challenges' on page 63 und efined on input line 2520. [63] [64] [65] [66] Chapter 13. LaTeX Warning: Hyper reference `spinningup/bench:benchmarks-for-spinning-up-imp lementations' on page 67 undefined on input line 2659. LaTeX Warning: Hyper reference `spinningup/bench:performance-in-each-environmen t' on page 67 undefined on input line 2662. LaTeX Warning: Hyper reference `spinningup/bench:halfcheetah' on page 67 undefi ned on input line 2665. LaTeX Warning: Hyper reference `spinningup/bench:hopper' on page 67 undefined o n input line 2668. LaTeX Warning: Hyper reference `spinningup/bench:walker' on page 67 undefined o n input line 2671. LaTeX Warning: Hyper reference `spinningup/bench:swimmer' on page 67 undefined on input line 2674. LaTeX Warning: Hyper reference `spinningup/bench:ant' on page 67 undefined on i nput line 2677. LaTeX Warning: Hyper reference `spinningup/bench:experiment-details' on page 67 undefined on input line 2682. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2700 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2700 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2709 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.2709 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2718 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2718 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2727 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2727 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2736 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2736 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. LaTeX Warning: Hyper reference `algorithms/vpg:vanilla-policy-gradient' on page 71 undefined on input line 2759. LaTeX Warning: Hyper reference `algorithms/vpg:background' on page 71 undefined on input line 2762. LaTeX Warning: Hyper reference `algorithms/vpg:quick-facts' on page 71 undefine d on input line 2765. LaTeX Warning: Hyper reference `algorithms/vpg:key-equations' on page 71 undefi ned on input line 2768. LaTeX Warning: Hyper reference `algorithms/vpg:exploration-vs-exploitation' on page 71 undefined on input line 2771. LaTeX Warning: Hyper reference `algorithms/vpg:pseudocode' on page 71 undefined on input line 2774. LaTeX Warning: Hyper reference `algorithms/vpg:documentation' on page 71 undefi ned on input line 2779. 
LaTeX Warning: Hyper reference `algorithms/vpg:saved-model-contents' on page 71 undefined on input line 2782. LaTeX Warning: Hyper reference `algorithms/vpg:references' on page 71 undefined on input line 2787. LaTeX Warning: Hyper reference `algorithms/vpg:relevant-papers' on page 71 unde fined on input line 2790. LaTeX Warning: Hyper reference `algorithms/vpg:why-these-papers' on page 71 und efined on input line 2793. LaTeX Warning: Hyper reference `algorithms/vpg:other-public-implementations' on page 71 undefined on input line 2796. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2833 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2833 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2850 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2851 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2853 \begin{algorithmic} [1] ! Undefined control sequence. l.2854 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2855 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2856 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2857 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2858 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2859 \STATE Estimate policy gradient as ! Undefined control sequence. l.2863 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. 
l.2868 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2873 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2874 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2875 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2881--2881 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2881--2881 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. LaTeX Warning: Hyper reference `algorithms/trpo:trust-region-policy-optimizatio n' on page 77 undefined on input line 3081. LaTeX Warning: Hyper reference `algorithms/trpo:background' on page 77 undefine d on input line 3084. LaTeX Warning: Hyper reference `algorithms/trpo:quick-facts' on page 77 undefin ed on input line 3087. LaTeX Warning: Hyper reference `algorithms/trpo:key-equations' on page 77 undef ined on input line 3090. LaTeX Warning: Hyper reference `algorithms/trpo:exploration-vs-exploitation' on page 77 undefined on input line 3093. LaTeX Warning: Hyper reference `algorithms/trpo:pseudocode' on page 77 undefine d on input line 3096. LaTeX Warning: Hyper reference `algorithms/trpo:documentation' on page 77 undef ined on input line 3101. LaTeX Warning: Hyper reference `algorithms/trpo:saved-model-contents' on page 7 7 undefined on input line 3104. LaTeX Warning: Hyper reference `algorithms/trpo:references' on page 77 undefine d on input line 3109. 
LaTeX Warning: Hyper reference `algorithms/trpo:relevant-papers' on page 77 und efined on input line 3112. LaTeX Warning: Hyper reference `algorithms/trpo:why-these-papers' on page 77 un defined on input line 3115. LaTeX Warning: Hyper reference `algorithms/trpo:other-public-implementations' o n page 77 undefined on input line 3118. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3162 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3162 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3168 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3168 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3217 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3218 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3220 \begin{algorithmic} [1] ! Undefined control sequence. l.3221 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3222 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3223 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3224 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3225 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3226 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3227 \STATE Estimate policy gradient as ! 
Undefined control sequence. l.3231 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3236 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3241 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3246 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3247 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3248 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. 
LaTeX Warning: Hyper reference `algorithms/ppo:proximal-policy-optimization' on page 85 undefined on input line 3528. LaTeX Warning: Hyper reference `algorithms/ppo:background' on page 85 undefined on input line 3531. LaTeX Warning: Hyper reference `algorithms/ppo:quick-facts' on page 85 undefine d on input line 3534. LaTeX Warning: Hyper reference `algorithms/ppo:key-equations' on page 85 undefi ned on input line 3537. LaTeX Warning: Hyper reference `algorithms/ppo:exploration-vs-exploitation' on page 85 undefined on input line 3540. LaTeX Warning: Hyper reference `algorithms/ppo:pseudocode' on page 85 undefined on input line 3543. LaTeX Warning: Hyper reference `algorithms/ppo:documentation' on page 85 undefi ned on input line 3548. LaTeX Warning: Hyper reference `algorithms/ppo:saved-model-contents' on page 85 undefined on input line 3551. LaTeX Warning: Hyper reference `algorithms/ppo:references' on page 85 undefined on input line 3556. LaTeX Warning: Hyper reference `algorithms/ppo:relevant-papers' on page 85 unde fined on input line 3559. LaTeX Warning: Hyper reference `algorithms/ppo:why-these-papers' on page 85 und efined on input line 3562. LaTeX Warning: Hyper reference `algorithms/ppo:other-public-implementations' on page 85 undefined on input line 3565. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3674 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3675 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3677 \begin{algorithmic} [1] ! Undefined control sequence. l.3678 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3679 \FOR {$k = 0,1,2,...$} ! 
Undefined control sequence. l.3680 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3681 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3682 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3683 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3691 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3696 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3697 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3698 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3704--3704 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. LaTeX Warning: Hyper reference `algorithms/ddpg:deep-deterministic-policy-gradi ent' on page 91 undefined on input line 3924. LaTeX Warning: Hyper reference `algorithms/ddpg:background' on page 91 undefine d on input line 3927. LaTeX Warning: Hyper reference `algorithms/ddpg:quick-facts' on page 91 undefin ed on input line 3930. LaTeX Warning: Hyper reference `algorithms/ddpg:key-equations' on page 91 undef ined on input line 3933. LaTeX Warning: Hyper reference `algorithms/ddpg:the-q-learning-side-of-ddpg' on page 91 undefined on input line 3936. LaTeX Warning: Hyper reference `algorithms/ddpg:the-policy-learning-side-of-ddp g' on page 91 undefined on input line 3939. LaTeX Warning: Hyper reference `algorithms/ddpg:exploration-vs-exploitation' on page 91 undefined on input line 3944. 
LaTeX Warning: Hyper reference `algorithms/ddpg:pseudocode' on page 91 undefine d on input line 3947. LaTeX Warning: Hyper reference `algorithms/ddpg:documentation' on page 91 undef ined on input line 3952. LaTeX Warning: Hyper reference `algorithms/ddpg:saved-model-contents' on page 9 1 undefined on input line 3955. LaTeX Warning: Hyper reference `algorithms/ddpg:references' on page 91 undefine d on input line 3960. LaTeX Warning: Hyper reference `algorithms/ddpg:relevant-papers' on page 91 und efined on input line 3963. LaTeX Warning: Hyper reference `algorithms/ddpg:why-these-papers' on page 91 un defined on input line 3966. LaTeX Warning: Hyper reference `algorithms/ddpg:other-public-implementations' o n page 91 undefined on input line 3969. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4088 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4089 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4091 \begin{algorithmic} [1] ! Undefined control sequence. l.4092 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4093 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4094 \REPEAT ! Undefined control sequence. l.4095 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4096 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4097 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4098 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. 
l.4099 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4100 \IF {it's time to update} ! Undefined control sequence. l.4101 \FOR {however many updates} ! Undefined control sequence. l.4102 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4103 \STATE Compute targets ! Undefined control sequence. l.4107 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4111 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4115 \STATE Update target networks with ! Undefined control sequence. l.4120 \ENDFOR ! Undefined control sequence. l.4121 \ENDIF ! Undefined control sequence. l.4122 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4123 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4124 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4130--4130 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4130--4130 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. LaTeX Warning: Hyper reference `algorithms/td3:twin-delayed-ddpg' on page 97 un defined on input line 4345. LaTeX Warning: Hyper reference `algorithms/td3:background' on page 97 undefined on input line 4348. LaTeX Warning: Hyper reference `algorithms/td3:quick-facts' on page 97 undefine d on input line 4351. 
LaTeX Warning: Hyper reference `algorithms/td3:key-equations' on page 97 undefi ned on input line 4354. LaTeX Warning: Hyper reference `algorithms/td3:exploration-vs-exploitation' on page 97 undefined on input line 4357. LaTeX Warning: Hyper reference `algorithms/td3:pseudocode' on page 97 undefined on input line 4360. LaTeX Warning: Hyper reference `algorithms/td3:documentation' on page 97 undefi ned on input line 4365. LaTeX Warning: Hyper reference `algorithms/td3:saved-model-contents' on page 97 undefined on input line 4368. LaTeX Warning: Hyper reference `algorithms/td3:references' on page 97 undefined on input line 4373. LaTeX Warning: Hyper reference `algorithms/td3:relevant-papers' on page 97 unde fined on input line 4376. LaTeX Warning: Hyper reference `algorithms/td3:other-public-implementations' on page 97 undefined on input line 4379. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4436 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4436 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4464 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4465 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4467 \begin{algorithmic} [1] ! Undefined control sequence. l.4468 \STATE Input: initial policy parameters $\theta$, Q-function para... ! 
Undefined control sequence. l.4469 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4470 \REPEAT ! Undefined control sequence. l.4471 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4472 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4473 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4474 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4475 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4476 \IF {it's time to update} ! Undefined control sequence. l.4477 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4478 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4479 \STATE Compute target actions ! Undefined control sequence. l.4483 \STATE Compute targets ! Undefined control sequence. l.4487 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4491 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4492 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4496 \STATE Update target networks with ! Undefined control sequence. l.4501 \ENDIF ! Undefined control sequence. l.4502 \ENDFOR ! Undefined control sequence. l.4503 \ENDIF ! Undefined control sequence. l.4504 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4505 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4506 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4512--4512 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4512--4512 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. LaTeX Warning: Hyper reference `algorithms/sac:soft-actor-critic' on page 103 u ndefined on input line 4738. LaTeX Warning: Hyper reference `algorithms/sac:background' on page 103 undefine d on input line 4741. LaTeX Warning: Hyper reference `algorithms/sac:quick-facts' on page 103 undefin ed on input line 4744. LaTeX Warning: Hyper reference `algorithms/sac:key-equations' on page 103 undef ined on input line 4747. LaTeX Warning: Hyper reference `algorithms/sac:entropy-regularized-reinforcemen t-learning' on page 103 undefined on input line 4750. LaTeX Warning: Hyper reference `algorithms/sac:id1' on page 103 undefined on in put line 4753. LaTeX Warning: Hyper reference `algorithms/sac:exploration-vs-exploitation' on page 103 undefined on input line 4758. LaTeX Warning: Hyper reference `algorithms/sac:pseudocode' on page 103 undefine d on input line 4761. LaTeX Warning: Hyper reference `algorithms/sac:documentation' on page 103 undef ined on input line 4766. LaTeX Warning: Hyper reference `algorithms/sac:saved-model-contents' on page 10 3 undefined on input line 4769. LaTeX Warning: Hyper reference `algorithms/sac:references' on page 103 undefine d on input line 4774. LaTeX Warning: Hyper reference `algorithms/sac:relevant-papers' on page 103 und efined on input line 4777. 
LaTeX Warning: Hyper reference `algorithms/sac:other-public-implementations' on page 103 undefined on input line 4780. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4827 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4827 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4831 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4831 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4835 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4835 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4843 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4843 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. 
...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4885 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4885 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4924 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4925 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4927 \begin{algorithmic} [1] ! Undefined control sequence. l.4928 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. 
l.4929 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4930 \REPEAT ! Undefined control sequence. l.4931 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4932 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4933 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4934 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4935 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4936 \IF {it's time to update} ! Undefined control sequence. l.4937 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4938 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4939 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4944 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4948 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4952 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4957 \STATE Update target value network with ! Undefined control sequence. l.4961 \ENDFOR ! Undefined control sequence. l.4962 \ENDIF ! Undefined control sequence. l.4963 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4964 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4965 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4971--4971 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4971--4971 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. LaTeX Warning: Hyper reference `utils/logger:logger' on page 111 undefined on i nput line 5235. LaTeX Warning: Hyper reference `utils/logger:using-a-logger' on page 111 undefi ned on input line 5238. LaTeX Warning: Hyper reference `utils/logger:examples' on page 111 undefined on input line 5241. LaTeX Warning: Hyper reference `utils/logger:logging-and-mpi' on page 111 undef ined on input line 5244. LaTeX Warning: Hyper reference `utils/logger:logger-classes' on page 111 undefi ned on input line 5249. LaTeX Warning: Hyper reference `utils/logger:loading-saved-graphs' on page 111 undefined on input line 5252. [111] [112] [113] [114] LaTeX Warning: Hyper reference `utils/logger:spinup.utils.logx.Logger' on page 115 undefined on input line 5550. [115] [116] Chapter 21. [117] [118] Chapter 22. LaTeX Warning: Hyper reference `utils/mpi:mpi-tools' on page 119 undefined on i nput line 5695. LaTeX Warning: Hyper reference `utils/mpi:module-spinup.utils.mpi_tools' on pag e 119 undefined on input line 5698. LaTeX Warning: Hyper reference `utils/mpi:mpi-tensorflow-utilities' on page 119 undefined on input line 5701. [119] [120] Chapter 23. LaTeX Warning: Hyper reference `utils/run_utils:run-utils' on page 121 undefine d on input line 5828. 
LaTeX Warning: Hyper reference `utils/run_utils:experimentgrid' on page 121 undefined on input line 5831.
LaTeX Warning: Hyper reference `utils/run_utils:calling-experiments' on page 121 undefined on input line 5834.
[121]
Underfull \hbox (badness 10000) in paragraph at lines 5986--5986 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/10 ,
[122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. [129] [130]
LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tf' on page 131 undefined on input line 6119.
LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tools' on page 131 undefined on input line 6120.
[131]
No file SpinningUp.ind.
(./SpinningUp.aux)
Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'.
LaTeX Warning: There were undefined references.
LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right.
) (see the transcript file for additional information){/usr/share/texlive/texmf-dist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb>
Output written on SpinningUp.pdf (135 pages, 1117470 bytes).
Transcript written on SpinningUp.log.
[rtd-command-info] start-time: 2019-07-15T17:16:27.466667Z, end-time: 2019-07-15T17:16:27.938100Z, duration: 0, exit-code: 0
makeindex -s python.ist SpinningUp.idx
This is makeindex, version 2.15 [TeX Live 2017] (kpathsea + Thai support).
Scanning style file ./python.ist.......done (7 attributes redefined, 0 ignored).
Scanning input file SpinningUp.idx....done (78 entries accepted, 0 rejected).
Sorting entries....done (506 comparisons).
Generating output file SpinningUp.ind....done (144 lines written, 0 warnings).
Output written in SpinningUp.ind.
Transcript written in SpinningUp.ilg.
[rtd-command-info] start-time: 2019-07-15T17:16:28.027008Z, end-time: 2019-07-15T17:16:29.524516Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2017-04-15> Babel <3.18> and hyphenation patterns for 84 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel/switch.def) (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def (/usr/share/texlive/texmf-dist/tex/generic/babel/txtbabel.def)))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics-cfg/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/graphics-def/pdftex.def))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) (/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics-cfg/color.cfg)) 
(/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx (./SpinningUp.aux LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. ) (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/mkii/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (./SpinningUp.out) (./SpinningUp.out) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] (./SpinningUp.toc [1] [2]) [3] [4] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in- rl.png>] [2] Chapter 1. 
(/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) [3] [4] [5] [6] Chapter 2. [7] [8] [9] [10] Chapter 3. [11] [12] [13] [14] Chapter 4. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. [21] [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1531 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1531 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1550 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1550 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1557 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1557 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1564 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1564 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1571 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1571 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... 
l.1585 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1585 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1618 \end{align*} ! Missing } inserted. } l.1618 \end{align*} ! Missing { inserted. { l.1618 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1618 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1618 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1618 \end{align*} ! Missing } inserted. } l.1618 \end{align*} ! Missing \endgroup inserted. \endgroup l.1618 \end{align*} ! Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1618 \end{align*} ! Missing { inserted. { l.1618 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1618 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1618 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1625 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1625 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1625 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1625 \end{align*} [36] [37] [38] Chapter 8. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1699 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1699 ...ncludegraphics{{rl_algorithms_9_15}.svg} [39] [40] [41] [42] Chapter 9. ! Undefined control sequence. l.1890 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1921 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1933 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2082 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2082 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2099 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2099 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2107 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2107 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2115 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2115 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2167 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2167 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2171 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2171 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2185 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2185 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. 
...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2199 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2199 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. [53] [54] [55] [56] [57] [58] Chapter 11. [59] [60] Overfull \vbox (103.35579pt too high) has occurred while \output is active [61] [62] Chapter 12. [63] [64] [65] [66] Chapter 13. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2700 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2700 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2709 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2709 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2718 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2718 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2727 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2727 ...\sphinxincludegraphics{{bench_swim}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2736 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2736 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2833 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2833 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2850 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2851 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2853 \begin{algorithmic} [1] ! Undefined control sequence. l.2854 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2855 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2856 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2857 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2858 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2859 \STATE Estimate policy gradient as ! Undefined control sequence. l.2863 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. l.2868 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2873 \ENDFOR ! 
LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2874 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2875 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2881--2881 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2881--2881 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3162 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3162 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3168 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3168 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3217 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3218 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3220 \begin{algorithmic} [1] ! Undefined control sequence. 
l.3221 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3222 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3223 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3224 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3225 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3226 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3227 \STATE Estimate policy gradient as ! Undefined control sequence. l.3231 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3236 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3241 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3246 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3247 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.3248 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3254--3254 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3674 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3675 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3677 \begin{algorithmic} [1] ! Undefined control sequence. l.3678 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3679 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. 
l.3680 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3681 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3682 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3683 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3691 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3696 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3697 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3698 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3704--3704 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4088 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4089 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4091 \begin{algorithmic} [1] ! Undefined control sequence. l.4092 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4093 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4094 \REPEAT ! Undefined control sequence. 
l.4095 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4096 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4097 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4098 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4099 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4100 \IF {it's time to update} ! Undefined control sequence. l.4101 \FOR {however many updates} ! Undefined control sequence. l.4102 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4103 \STATE Compute targets ! Undefined control sequence. l.4107 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4111 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4115 \STATE Update target networks with ! Undefined control sequence. l.4120 \ENDFOR ! Undefined control sequence. l.4121 \ENDIF ! Undefined control sequence. l.4122 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4123 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4124 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4130--4130 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4130--4130 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4436 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4436 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4464 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4465 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4467 \begin{algorithmic} [1] ! Undefined control sequence. l.4468 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4469 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4470 \REPEAT ! Undefined control sequence. l.4471 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. 
l.4472 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4473 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4474 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4475 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4476 \IF {it's time to update} ! Undefined control sequence. l.4477 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4478 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4479 \STATE Compute target actions ! Undefined control sequence. l.4483 \STATE Compute targets ! Undefined control sequence. l.4487 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4491 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4492 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4496 \STATE Update target networks with ! Undefined control sequence. l.4501 \ENDIF ! Undefined control sequence. l.4502 \ENDFOR ! Undefined control sequence. l.4503 \ENDIF ! Undefined control sequence. l.4504 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4505 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4506 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4512--4512 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4512--4512 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4827 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4827 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4831 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4831 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4835 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4835 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4843 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4843 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4848 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4871 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. 
...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4879 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4885 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4885 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4902 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4906 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4924 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! 
LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4925 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4927 \begin{algorithmic} [1] ! Undefined control sequence. l.4928 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4929 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4930 \REPEAT ! Undefined control sequence. l.4931 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4932 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4933 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4934 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4935 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4936 \IF {it's time to update} ! Undefined control sequence. l.4937 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4938 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4939 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4944 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4948 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4952 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4957 \STATE Update target value network with ! Undefined control sequence. l.4961 \ENDFOR ! Undefined control sequence. l.4962 \ENDIF ! Undefined control sequence. l.4963 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.4964 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4965 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4971--4971 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4971--4971 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. [111] [112] [113] [114] [115] [116] Chapter 21. [117] [118] Chapter 22. [119] [120] Chapter 23. [121] Underfull \hbox (badness 10000) in paragraph at lines 5986--5986 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/10 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26.
[129] [130] [131] (./SpinningUp.ind [132] Underfull \hbox (badness 7522) in paragraph at lines 47--48 []\T1/ptm/m/n/10 add() (spinup.utils.run_utils.ExperimentGrid method), Overfull \hbox (5.61969pt too wide) in paragraph at lines 48--49 []\T1/ptm/m/n/10 apply_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer Overfull \hbox (17.83952pt too wide) in paragraph at lines 74--75 []\T1/ptm/m/n/10 compute_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer [133] Underfull \hbox (badness 10000) in paragraph at lines 103--104 []\T1/ptm/m/n/10 mpi_statistics_scalar() (in mod-ule Underfull \hbox (badness 10000) in paragraph at lines 119--120 []\T1/ptm/m/n/10 run() (spinup.utils.run_utils.ExperimentGrid method), Underfull \hbox (badness 10000) in paragraph at lines 140--141 []\T1/ptm/m/n/10 variant_name() (spinup.utils.run_utils.ExperimentGrid Underfull \hbox (badness 10000) in paragraph at lines 141--142 []\T1/ptm/m/n/10 variants() (spinup.utils.run_utils.ExperimentGrid [134]) (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were multiply-defined labels. ) (see the transcript file for additional information){/usr/share/texlive/texmf-dist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (140 pages, 1145584 bytes). Transcript written on SpinningUp.log.
[rtd-command-info] start-time: 2019-07-15T17:16:29.637815Z, end-time: 2019-07-15T17:16:29.963509Z, duration: 0, exit-code: 0 mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.pdf /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/latest/sphinx_pdf/openai-education-spinningup.pdf [rtd-command-info] start-time: 2019-07-15T17:16:30.067393Z, end-time: 2019-07-15T17:17:57.322496Z, duration: 87, exit-code: 0 python sphinx-build -T -b epub -d _build/doctrees-epub -D language=en . _build/epub Running Sphinx v1.5.6 making output directory... WARNING: Logging before flag parsing goes to stderr. W0715 17:16:32.600282 139847130681472 deprecation_wrapper.py:119] From /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/spinup/utils/mpi_tf.py:29: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead. loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [epub]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... 
[ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils looking for now-outdated files... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". none found pickling environment... done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done preparing documents... done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... [ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... 
[ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... [ 61%] spinningup/rl_intro3 writing output... [ 64%] spinningup/rl_intro4 writing output... [ 67%] spinningup/spinningup writing output... [ 70%] user/algorithms writing output... [ 74%] user/installation writing output... [ 77%] user/introduction writing output... [ 80%] user/plotting writing output... [ 83%] user/running writing output... [ 87%] user/saving_and_loading writing output... [ 90%] utils/logger writing output... [ 93%] utils/mpi writing output... [ 96%] utils/plotter writing output... [100%] utils/run_utils generating indices... genindex py-modindex writing additional pages... copying images... [ 10%] images/spinning-up-in-rl.png copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg copying images... [ 40%] spinningup/../images/bench/bench_walker.svg copying images... [ 50%] spinningup/../images/bench/bench_swim.svg copying images... [ 60%] spinningup/../images/bench/bench_ant.svg copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg copying static files... WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done writing mimetype file... writing META-INF/container.xml file... writing content.opf file... WARNING: unknown mimetype for _static/openai-favicon2_32x32.ico, ignoring WARNING: unknown mimetype for _static/openai_icon.ico, ignoring writing nav.xhtml file... writing toc.ncx file... writing SpinningUp.epub file... build succeeded, 17 warnings. 
[rtd-command-info] start-time: 2019-07-15T17:17:57.471492Z, end-time: 2019-07-15T17:17:57.780438Z, duration: 0, exit-code: 0 mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/epub/SpinningUp.epub /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/latest/sphinx_epub/openai-education-spinningup.epub