Read the Docs build information
Build id: 158208
Project: openai-education-spinningup
Version: latest
Commit: 23cb30c1c03067105cdc955c15f45ecd30a66e57
Date: 2018-11-11T08:38:11.587873Z
State: finished
Success: True

[rtd-command-info] start-time: 2018-11-11T14:38:12.110025Z, end-time: 2018-11-11T14:38:12.116480Z, duration: 0, exit-code: 0
git remote set-url origin git@github.com:openai/spinningup.git

[rtd-command-info] start-time: 2018-11-11T14:38:12.177584Z, end-time: 2018-11-11T14:38:15.262221Z, duration: 3, exit-code: 0
git fetch --tags --prune --prune-tags
From github.com:openai/spinningup
   9ef9590..23cb30c  master     -> origin/master
 * [new tag]         v0.1       -> v0.1

[rtd-command-info] start-time: 2018-11-11T14:38:15.332228Z, end-time: 2018-11-11T14:38:15.523452Z, duration: 0, exit-code: 0
git checkout --force origin/master
Previous HEAD position was 9ef9590 first commit
HEAD is now at 23cb30c Update documentation to make MuJoCo installation optional, with accompanying minor version bump 0.1->0.1.1 to signify this.

[rtd-command-info] start-time: 2018-11-11T14:38:15.609023Z, end-time: 2018-11-11T14:38:15.616795Z, duration: 0, exit-code: 0
git clean -d -f -f

[rtd-command-info] start-time: 2018-11-11T14:38:15.696534Z, end-time: 2018-11-11T14:38:15.701468Z, duration: 0, exit-code: 0
git branch -r
  origin/HEAD -> origin/master
  origin/master

[rtd-command-info] start-time: 2018-11-11T14:38:16.829506Z, end-time: 2018-11-11T14:38:20.167328Z, duration: 3, exit-code: 0
python3.6 -mvirtualenv --no-site-packages --no-download
Using base prefix '/home/docs/.pyenv/versions/3.6.2'
New python executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/bin/python3.6
Also creating executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/bin/python
Installing setuptools, pip, wheel...done.

[rtd-command-info] start-time: 2018-11-11T14:38:20.240814Z, end-time: 2018-11-11T14:38:30.756188Z, duration: 10, exit-code: 0 python pip install --upgrade --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip Pygments==2.2.0 setuptools<40 docutils==0.13.1 mock==1.0.1 pillow==2.6.1 alabaster>=0.7,<0.8,!=0.7.5 commonmark==0.5.4 recommonmark==0.4.0 sphinx<1.8 sphinx-rtd-theme<0.5 readthedocs-sphinx-ext<0.6 Collecting Pygments==2.2.0 Using cached https://files.pythonhosted.org/packages/02/ee/b6e02dc6529e82b75bb06823ff7d005b141037cb1416b10c6f00fc419dca/Pygments-2.2.0-py2.py3-none-any.whl Collecting setuptools<40 Using cached https://files.pythonhosted.org/packages/7f/e1/820d941153923aac1d49d7fc37e17b6e73bfbd2904959fffbad77900cf92/setuptools-39.2.0-py2.py3-none-any.whl Collecting docutils==0.13.1 Using cached https://files.pythonhosted.org/packages/7c/30/8fb30d820c012a6f701a66618ce065b6d61d08ac0a77e47fc7808dbaee47/docutils-0.13.1-py3-none-any.whl Collecting mock==1.0.1 Collecting pillow==2.6.1 Collecting alabaster!=0.7.5,<0.8,>=0.7 Using cached https://files.pythonhosted.org/packages/10/ad/00b090d23a222943eb0eda509720a404f531a439e803f6538f35136cae9e/alabaster-0.7.12-py2.py3-none-any.whl Collecting commonmark==0.5.4 Collecting recommonmark==0.4.0 Using cached https://files.pythonhosted.org/packages/df/a5/8ee4b84af7f997dfdba71254a88008cfc19c49df98983c9a4919e798f8ce/recommonmark-0.4.0-py2.py3-none-any.whl Collecting sphinx<1.8 Using cached https://files.pythonhosted.org/packages/90/f9/a0babe32c78480994e4f1b93315558f5ed756104054a7029c672a8d77b72/Sphinx-1.7.9-py2.py3-none-any.whl Collecting sphinx-rtd-theme<0.5 Using cached https://files.pythonhosted.org/packages/ef/0c/e4a462190506bc4bff6ca8cf93da07b2d13e540466d2e8a760352d0c69b0/sphinx_rtd_theme-0.4.2-py2.py3-none-any.whl Collecting readthedocs-sphinx-ext<0.6 Using cached 
https://files.pythonhosted.org/packages/2b/c5/126eb75a57918bb3d2f858ddda05f5670d6f07bfa356bc8870e2885f6aac/readthedocs_sphinx_ext-0.5.15-py2.py3-none-any.whl Collecting Jinja2>=2.3 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl Collecting packaging (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/89/d1/92e6df2e503a69df9faab187c684585f0136662c12bb1f36901d426f3fab/packaging-18.0-py2.py3-none-any.whl Collecting six>=1.5 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl Collecting requests>=2.0.0 (from sphinx<1.8) Downloading https://files.pythonhosted.org/packages/ff/17/5cbb026005115301a8fb2f9b0e3e8d32313142fe8b617070e7baad20554f/requests-2.20.1-py2.py3-none-any.whl (57kB) Collecting sphinxcontrib-websupport (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/52/69/3c2fbdc3702358c5b34ee25e387b24838597ef099761fc9a42c166796e8f/sphinxcontrib_websupport-1.1.0-py2.py3-none-any.whl Collecting babel!=2.0,>=1.3 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/b8/ad/c6f60602d3ee3d92fbed87675b6fb6a6f9a38c223343ababdb44ba201f10/Babel-2.6.0-py2.py3-none-any.whl Collecting snowballstemmer>=1.1 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/d4/6c/8a935e2c7b54a37714656d753e4187ee0631988184ed50c0cf6476858566/snowballstemmer-1.2.1-py2.py3-none-any.whl Collecting imagesize (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/fc/b6/aef66b4c52a6ad6ac18cf6ebc5731ed06d8c9ae4d3b2d9951f261150be67/imagesize-1.1.0-py2.py3-none-any.whl Collecting MarkupSafe>=0.23 (from Jinja2>=2.3->sphinx<1.8) Using cached 
https://files.pythonhosted.org/packages/08/04/f2191b50fb7f0712f03f064b71d8b4605190f2178ba02e975a87f7b89a0d/MarkupSafe-1.1.0-cp36-cp36m-manylinux1_x86_64.whl Collecting pyparsing>=2.0.2 (from packaging->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/71/e8/6777f6624681c8b9701a8a0a5654f3eb56919a01a78e12bf3c73f5a3c714/pyparsing-2.3.0-py2.py3-none-any.whl Collecting urllib3<1.25,>=1.21.1 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl Collecting idna<2.8,>=2.5 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl Collecting chardet<3.1.0,>=3.0.2 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl Collecting certifi>=2017.4.17 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/56/9d/1d02dd80bc4cd955f98980f28c5ee2200e1209292d5f9e9cc8d030d18655/certifi-2018.10.15-py2.py3-none-any.whl Collecting pytz>=0a (from babel!=2.0,>=1.3->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/f8/0e/2365ddc010afb3d79147f1dd544e5ee24bf4ece58ab99b16fbb465ce6dc0/pytz-2018.7-py2.py3-none-any.whl Installing collected packages: Pygments, setuptools, docutils, mock, pillow, alabaster, commonmark, recommonmark, MarkupSafe, Jinja2, pyparsing, six, packaging, urllib3, idna, chardet, certifi, requests, sphinxcontrib-websupport, pytz, babel, snowballstemmer, imagesize, sphinx, sphinx-rtd-theme, readthedocs-sphinx-ext Found existing installation: setuptools 39.0.1 Uninstalling setuptools-39.0.1: Successfully uninstalled setuptools-39.0.1 Successfully installed Jinja2-2.10 MarkupSafe-1.1.0 Pygments-2.2.0 alabaster-0.7.12 
babel-2.6.0 certifi-2018.10.15 chardet-3.0.4 commonmark-0.5.4 docutils-0.13.1 idna-2.7 imagesize-1.1.0 mock-1.0.1 packaging-18.0 pillow-2.6.1 pyparsing-2.3.0 pytz-2018.7 readthedocs-sphinx-ext-0.5.15 recommonmark-0.4.0 requests-2.20.1 setuptools-39.2.0 six-1.11.0 snowballstemmer-1.2.1 sphinx-1.7.9 sphinx-rtd-theme-0.4.2 sphinxcontrib-websupport-1.1.0 urllib3-1.24.1 You are using pip version 9.0.3, however version 18.1 is available. You should consider upgrading via the 'pip install --upgrade pip' command. [rtd-command-info] start-time: 2018-11-11T14:38:30.816241Z, end-time: 2018-11-11T14:39:20.131491Z, duration: 49, exit-code: 0 python pip install --exists-action=w --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip -r docs/docs_requirements.txt Collecting cloudpickle==0.5.2 (from -r docs/docs_requirements.txt (line 1)) Using cached https://files.pythonhosted.org/packages/aa/18/514b557c4d8d4ada1f0454ad06c845454ad438fd5c5e0039ba51d6b032fe/cloudpickle-0.5.2-py2.py3-none-any.whl Collecting gym>=0.10.8 (from -r docs/docs_requirements.txt (line 2)) Collecting ipython (from -r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/1b/e2/ffb8c1b574f972cf4183b0aac8f16b57f1e3bbe876b31555b107ea3fd009/ipython-7.1.1-py3-none-any.whl Collecting joblib (from -r docs/docs_requirements.txt (line 4)) Using cached https://files.pythonhosted.org/packages/0d/1b/995167f6c66848d4eb7eabc386aebe07a1571b397629b2eac3b7bebdc343/joblib-0.13.0-py2.py3-none-any.whl Collecting matplotlib (from -r docs/docs_requirements.txt (line 5)) Downloading https://files.pythonhosted.org/packages/71/07/16d781df15be30df4acfd536c479268f1208b2dfbc91e9ca5d92c9caf673/matplotlib-3.0.2-cp36-cp36m-manylinux1_x86_64.whl (12.9MB) Collecting numpy (from -r docs/docs_requirements.txt (line 6)) Using cached 
https://files.pythonhosted.org/packages/ff/7f/9d804d2348471c67a7d8b5f84f9bc59fd1cefa148986f2b74552f8573555/numpy-1.15.4-cp36-cp36m-manylinux1_x86_64.whl Collecting pandas (from -r docs/docs_requirements.txt (line 7)) Using cached https://files.pythonhosted.org/packages/e1/d8/feeb346d41f181e83fba45224ab14a8d8af019b48af742e047f3845d8cff/pandas-0.23.4-cp36-cp36m-manylinux1_x86_64.whl Collecting pytest (from -r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/2d/b8/fc3795707bb47ed9eb83c8d65515f3424977d779d6af333d24787b1f364e/pytest-3.10.0-py2.py3-none-any.whl Collecting psutil (from -r docs/docs_requirements.txt (line 9)) Collecting scipy (from -r docs/docs_requirements.txt (line 10)) Using cached https://files.pythonhosted.org/packages/a8/0b/f163da98d3a01b3e0ef1cab8dd2123c34aee2bafbb1c5bffa354cc8a1730/scipy-1.1.0-cp36-cp36m-manylinux1_x86_64.whl Collecting seaborn==0.8.1 (from -r docs/docs_requirements.txt (line 11)) Collecting sphinx==1.5.6 (from -r docs/docs_requirements.txt (line 12)) Using cached https://files.pythonhosted.org/packages/cd/c3/3fc2985e07f6111b47328be116df9e05d5c2f246a050e2e2ebf6bdc9c692/Sphinx-1.5.6-py2.py3-none-any.whl Collecting sphinx-autobuild==0.7.1 (from -r docs/docs_requirements.txt (line 13)) Collecting sphinx-rtd-theme==0.4.1 (from -r docs/docs_requirements.txt (line 14)) Using cached https://files.pythonhosted.org/packages/87/30/7460f7b77b6e8a080dd3688f750fe5d5666c49358f8941449c5b128fa97d/sphinx_rtd_theme-0.4.1-py2.py3-none-any.whl Collecting tensorflow>=1.8.0 (from -r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/22/cc/ca70b78087015d21c5f3f93694107f34ebccb3be9624385a911d4b52ecef/tensorflow-1.12.0-cp36-cp36m-manylinux1_x86_64.whl Collecting tqdm (from -r docs/docs_requirements.txt (line 16)) Using cached https://files.pythonhosted.org/packages/91/55/8cb23a97301b177e9c8e3226dba45bb454411de2cbd25746763267f226c2/tqdm-4.28.1-py2.py3-none-any.whl Requirement 
already satisfied: requests>=2.0 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: six in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting pyglet>=1.2.0 (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Using cached https://files.pythonhosted.org/packages/1c/fc/dad5eaaab68f0c21e2f906a94ddb98175662cc5a654eee404d59554ce0fa/pyglet-1.3.2-py2.py3-none-any.whl Collecting pexpect; sys_platform != "win32" (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/89/e6/b5a1de8b0cc4e07ca1b305a4fcc3f9806025c1b651ea302646341222f88b/pexpect-4.6.0-py2.py3-none-any.whl Requirement already satisfied: setuptools>=18.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting prompt-toolkit<2.1.0,>=2.0.0 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/d1/e6/adb3be5576f5d27c6faa33f1e9fea8fe5dbd9351db12148de948507e352c/prompt_toolkit-2.0.7-py3-none-any.whl Collecting pickleshare (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl Collecting jedi>=0.10 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/7a/1a/9bd24a185873b998611c2d8d4fb15cd5e8a879ead36355df7ee53e9111bf/jedi-0.13.1-py2.py3-none-any.whl Collecting backcall (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting decorator (from ipython->-r 
docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/bc/bb/a24838832ba35baf52f32ab1a49b906b5f82fb7c76b2f6a7e35e140bac30/decorator-4.3.0-py2.py3-none-any.whl Requirement already satisfied: pygments in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting traitlets>=4.2 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/93/d6/abcb22de61d78e2fc3959c964628a5771e47e7cc60d53e9342e21ed6cc9a/traitlets-4.3.2-py2.py3-none-any.whl Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from matplotlib->-r docs/docs_requirements.txt (line 5)) Collecting kiwisolver>=1.0.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/69/a7/88719d132b18300b4369fbffa741841cfd36d1e637e1990f27929945b538/kiwisolver-1.0.1-cp36-cp36m-manylinux1_x86_64.whl Collecting cycler>=0.10 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/f7/d2/e07d3ebb2bd7af696440ce7e754c59dd546ffe1bbe732c8ab68b9c834e61/cycler-0.10.0-py2.py3-none-any.whl Collecting python-dateutil>=2.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/74/68/d87d9b36af36f44254a8d512cbfc48369103a3b9e474be9bdfe536abfc45/python_dateutil-2.7.5-py2.py3-none-any.whl Requirement already satisfied: pytz>=2011k in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from pandas->-r docs/docs_requirements.txt (line 7)) Collecting pluggy>=0.7 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached 
https://files.pythonhosted.org/packages/1c/e7/017c262070af41fe251401cb0d0e1b7c38f656da634cd0c15604f1f30864/pluggy-0.8.0-py2.py3-none-any.whl Collecting attrs>=17.4.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3a/e1/5f9023cc983f1a628a8c2fd051ad19e76ff7b142a0faf329336f9a62a514/attrs-18.2.0-py2.py3-none-any.whl Collecting atomicwrites>=1.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3a/9a/9d878f8d885706e2530402de6417141129a943802c084238914fa6798d97/atomicwrites-1.2.1-py2.py3-none-any.whl Collecting py>=1.5.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3e/c7/3da685ef117d42ac8d71af525208759742dd235f8094221fdaafcd3dba8f/py-1.7.0-py2.py3-none-any.whl Collecting more-itertools>=4.0.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/79/b1/eace304ef66bd7d3d8b2f78cc374b73ca03bc53664d78151e9df3b3996cc/more_itertools-4.3.0-py3-none-any.whl Requirement already satisfied: snowballstemmer>=1.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: Jinja2>=2.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: docutils>=0.11 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: babel!=2.0,>=1.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 
12)) Requirement already satisfied: imagesize in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: alabaster<0.8,>=0.7 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Collecting pathtools>=0.1.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting watchdog>=0.7.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting livereload>=2.3.0 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Using cached https://files.pythonhosted.org/packages/dd/b4/213daced3ff1b4e02a1f700748e20e9a7481f5bfef57d11ae9babfd4aa2f/livereload-2.5.2-py2.py3-none-any.whl Collecting PyYAML>=3.10 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting tornado>=3.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting argh>=0.24.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Using cached https://files.pythonhosted.org/packages/06/1c/e667a7126f0b84aaa1c56844337bf0ac12445d1beb9c8a6199a7314944bf/argh-0.26.2-py2.py3-none-any.whl Collecting port-for==0.3.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting grpcio>=1.8.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/c3/4c/0a7c55764ac3013ca7a5e9638ee7b161488c0611afc2be465452987a3ccc/grpcio-1.16.0-cp36-cp36m-manylinux1_x86_64.whl Collecting astor>=0.6.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl Collecting protobuf>=3.6.1 (from 
tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/c2/f9/28787754923612ca9bfdffc588daa05580ed70698add063a5629d1a4209d/protobuf-3.6.1-cp36-cp36m-manylinux1_x86_64.whl Collecting keras-applications>=1.0.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/3f/c4/2ff40221029f7098d58f8d7fb99b97e8100f3293f9856f0fb5834bef100b/Keras_Applications-1.0.6-py2.py3-none-any.whl Collecting tensorboard<1.13.0,>=1.12.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/e0/d0/65fe48383146199f16dbd5999ef226b87bce63ad5cd73c840cf722637969/tensorboard-1.12.0-py3-none-any.whl Collecting termcolor>=1.1.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting absl-py>=0.1.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting keras-preprocessing>=1.0.5 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/fc/94/74e0fa783d3fc07e41715973435dd051ca89c550881b3454233c39c73e69/Keras_Preprocessing-1.0.5-py2.py3-none-any.whl Collecting gast>=0.2.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: wheel>=0.26 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: certifi>=2017.4.17 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r 
docs/docs_requirements.txt (line 2)) Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: idna<2.8,>=2.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting future (from pyglet>=1.2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting ptyprocess>=0.5 (from pexpect; sys_platform != "win32"->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/d1/29/605c2cc68a9992d18dada28206eeada56ea4bd07a239669da41674648b6f/ptyprocess-0.6.0-py2.py3-none-any.whl Collecting wcwidth (from prompt-toolkit<2.1.0,>=2.0.0->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/7e/9f/526a6947247599b084ee5232e4f9190a38f398d7300d866af3ab571a5bfe/wcwidth-0.1.7-py2.py3-none-any.whl Collecting parso>=0.3.0 (from jedi>=0.10->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/09/51/9c48a46334be50c13d25a3afe55fa05c445699304c5ad32619de953a2305/parso-0.3.1-py2.py3-none-any.whl Collecting ipython-genutils (from traitlets>=4.2->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl Requirement already satisfied: MarkupSafe>=0.23 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from Jinja2>=2.3->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Collecting h5py (from keras-applications>=1.0.6->tensorflow>=1.8.0->-r docs/docs_requirements.txt 
(line 15)) Using cached https://files.pythonhosted.org/packages/8e/cb/726134109e7bd71d98d1fcc717ffe051767aac42ede0e7326fd1787e5d64/h5py-2.8.0-cp36-cp36m-manylinux1_x86_64.whl Collecting werkzeug>=0.11.10 (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl Collecting markdown>=2.6.8 (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/7a/6b/5600647404ba15545ec37d2f7f58844d690baf2f81f3a60b862e48f29287/Markdown-3.0.1-py2.py3-none-any.whl Installing collected packages: cloudpickle, numpy, scipy, future, pyglet, gym, ptyprocess, pexpect, wcwidth, prompt-toolkit, pickleshare, parso, jedi, backcall, decorator, ipython-genutils, traitlets, ipython, joblib, kiwisolver, cycler, python-dateutil, matplotlib, pandas, pluggy, attrs, atomicwrites, py, more-itertools, pytest, psutil, seaborn, sphinx, pathtools, PyYAML, argh, watchdog, tornado, livereload, port-for, sphinx-autobuild, sphinx-rtd-theme, grpcio, astor, protobuf, h5py, keras-applications, werkzeug, markdown, tensorboard, termcolor, absl-py, keras-preprocessing, gast, tensorflow, tqdm Found existing installation: Sphinx 1.7.9 Uninstalling Sphinx-1.7.9: Successfully uninstalled Sphinx-1.7.9 Found existing installation: sphinx-rtd-theme 0.4.2 Uninstalling sphinx-rtd-theme-0.4.2: Successfully uninstalled sphinx-rtd-theme-0.4.2 Successfully installed PyYAML-3.13 absl-py-0.6.1 argh-0.26.2 astor-0.7.1 atomicwrites-1.2.1 attrs-18.2.0 backcall-0.1.0 cloudpickle-0.5.2 cycler-0.10.0 decorator-4.3.0 future-0.17.1 gast-0.2.0 grpcio-1.16.0 gym-0.10.9 h5py-2.8.0 ipython-7.1.1 ipython-genutils-0.2.0 jedi-0.13.1 joblib-0.13.0 keras-applications-1.0.6 keras-preprocessing-1.0.5 kiwisolver-1.0.1 livereload-2.5.2 markdown-3.0.1 matplotlib-3.0.2 
more-itertools-4.3.0 numpy-1.15.4 pandas-0.23.4 parso-0.3.1 pathtools-0.1.2 pexpect-4.6.0 pickleshare-0.7.5 pluggy-0.8.0 port-for-0.3.1 prompt-toolkit-2.0.7 protobuf-3.6.1 psutil-5.4.8 ptyprocess-0.6.0 py-1.7.0 pyglet-1.3.2 pytest-3.10.0 python-dateutil-2.7.5 scipy-1.1.0 seaborn-0.8.1 sphinx-1.5.6 sphinx-autobuild-0.7.1 sphinx-rtd-theme-0.4.1 tensorboard-1.12.0 tensorflow-1.12.0 termcolor-1.1.0 tornado-5.1.1 tqdm-4.28.1 traitlets-4.3.2 watchdog-0.9.0 wcwidth-0.1.7 werkzeug-0.14.1
You are using pip version 9.0.3, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

[rtd-command-info] start-time: 2018-11-11T14:39:20.611438Z, end-time: 2018-11-11T14:39:20.677549Z, duration: 0, exit-code: 0
cat docs/conf.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Spinning Up documentation build configuration file, created by
# sphinx-quickstart on Wed Aug 15 04:21:07 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys

# Make sure spinup is accessible without going through setup.py
dirname = os.path.dirname
sys.path.insert(0, dirname(dirname(__file__)))

# Mock mpi4py to get around having to install it on RTD server (which fails)
from unittest.mock import MagicMock

class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        return MagicMock()

MOCK_MODULES = ['mpi4py']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)

# Finish imports
import spinup

from recommonmark.parser import CommonMarkParser

source_parsers = {
    '.md': CommonMarkParser,
}


# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.imgmath',
    'sphinx.ext.viewcode',
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon']

#'sphinx.ext.mathjax', ??

# imgmath settings
imgmath_image_format = 'svg'
imgmath_font_size = 14

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = ['.rst', '.md']
# source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = 'Spinning Up'
copyright = '2018, OpenAI'
author = 'Joshua Achiam'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ''
# The full version, including alpha/beta/rc tags.
release = ''

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'default' #'sphinx'

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = "sphinx_rtd_theme"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

html_logo = 'images/spinning-up-logo2.png'
html_theme_options = {
    'logo_only': True
}
#html_favicon = 'openai-favicon2_32x32.ico'
html_favicon = 'openai_icon.ico'

# -- Options for HTMLHelp output ------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'SpinningUpdoc'


# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

imgmath_latex_preamble = r'''
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{cancel}

\usepackage[verbose=true,letterpaper]{geometry}
\geometry{
    textheight=12in,
    textwidth=6.5in,
    top=1in,
    headheight=12pt,
    headsep=25pt,
    footskip=30pt
    }

\newcommand{\E}{{\mathrm E}}

\newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1 \end{subarray}}{\E}\left[ #2 \right]}

\newcommand{\Epi}[1]{\underset{\begin{subarray}{c}\tau \sim \pi \end{subarray}}{\E}\left[ #1 \right]}
'''

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'SpinningUp.tex', 'Spinning Up Documentation',
     'Joshua Achiam', 'manual'),
]


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'spinningup', 'Spinning Up Documentation',
     [author], 1)
]


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'SpinningUp', 'Spinning Up Documentation',
     author, 'SpinningUp', 'One line description of project.',
     'Miscellaneous'),
]


def setup(app):
    app.add_stylesheet('css/modify.css')


###########################################################################
#          auto-created readthedocs.org specific configuration            #
###########################################################################


#
# The following code was added during an automated build on readthedocs.org
# It is auto created and injected for every build. The result is based on the
# conf.py.tmpl file found in the readthedocs.org codebase:
# https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl
#


import importlib
import sys
import os.path
from six import string_types

from sphinx import version_info

# Get suffix for proper linking to GitHub
# This is deprecated in Sphinx 1.3+,
# as each page can have its own suffix
if globals().get('source_suffix', False):
    if isinstance(source_suffix, string_types):
        SUFFIX = source_suffix
    else:
        SUFFIX = source_suffix[0]
else:
    SUFFIX = '.rst'

# Add RTD Static Path. Add to the end because it overwrites previous files.
if not 'html_static_path' in globals():
    html_static_path = []
if os.path.exists('_static'):
    html_static_path.append('_static')
html_static_path.append('/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static')

# Add RTD Theme only if they aren't overriding it already
using_rtd_theme = (
    (
        'html_theme' in globals() and
        html_theme in ['default'] and
        # Allow people to bail with a hack of having an html_style
        'html_style' not in globals()
    ) or 'html_theme' not in globals()
)
if using_rtd_theme:
    theme = importlib.import_module('sphinx_rtd_theme')
    html_theme = 'sphinx_rtd_theme'
    html_style = None
    html_theme_options = {}
    if 'html_theme_path' in globals():
        html_theme_path.append(theme.get_html_theme_path())
    else:
        html_theme_path = [theme.get_html_theme_path()]

if globals().get('websupport2_base_url', False):
    websupport2_base_url = 'https://readthedocs.com/websupport'
    websupport2_static_url = 'https://media.readthedocs.com/'


#Add project information to the template context.
context = {
    'using_theme': using_rtd_theme,
    'html_theme': html_theme,
    'current_version': "latest",
    'version_slug': "latest",
    'MEDIA_URL': "https://media.readthedocs.com/media/",
    'STATIC_URL': "https://media.readthedocs.com/",
    'PRODUCTION_DOMAIN': "readthedocs.com",
    'versions': [
        ("latest", "/en/latest/"),
        ("stable", "/en/stable/"),
    ],
    'downloads': [
        ("pdf", "//readthedocs.com/projects/openai-education-spinningup/downloads/pdf/latest/"),
        ("htmlzip", "//readthedocs.com/projects/openai-education-spinningup/downloads/htmlzip/latest/"),
        ("epub", "//readthedocs.com/projects/openai-education-spinningup/downloads/epub/latest/"),
    ],
    'subprojects': [
    ],
    'slug': 'openai-education-spinningup',
    'name': u'spinningup',
    'rtd_language': u'en',
    'programming_language': u'words',
    'canonical_url': 'https://spinningup.openai.com/en/latest/',
    'analytics_code': '',
    'single_version': False,
    'conf_py_path': '/docs/',
    'api_host': 'https://readthedocs.com',
    'github_user': 'openai',
    'github_repo': 'spinningup',
    'github_version': 'master',
    'display_github': True,
    'bitbucket_user': 'None',
    'bitbucket_repo': 'None',
    'bitbucket_version': 'master',
    'display_bitbucket': False,
    'gitlab_user': 'None',
    'gitlab_repo': 'None',
    'gitlab_version': 'master',
    'display_gitlab': False,
    'READTHEDOCS': True,
    'using_theme': (html_theme == "default"),
    'new_theme': (html_theme == "sphinx_rtd_theme"),
    'source_suffix': SUFFIX,
    'ad_free': False,
    'user_analytics_code': '',
    'global_analytics_code': 'UA-17997319-2',
    'commit': '23cb30c1',
}


if 'html_context' in globals():
    html_context.update(context)
else:
    html_context = context

# Add custom RTD extension
if 'extensions' in globals():
    # Insert at the beginning because it can interfere
    # with other extensions.
# See https://github.com/rtfd/readthedocs.org/pull/4054 extensions.insert(0, "readthedocs_ext.readthedocs") else: extensions = ["readthedocs_ext.readthedocs"] [rtd-command-info] start-time: 2018-11-11T14:39:20.754880Z, end-time: 2018-11-11T14:41:01.012781Z, duration: 100, exit-code: 0 python sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html Running Sphinx v1.5.6 making output directory... loading translations [en]... done building [mo]: targets for 0 po files that are out of date building [readthedocs]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... 
[100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done preparing documents... done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... [ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... [ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... [ 61%] spinningup/rl_intro3 writing output... [ 64%] spinningup/rl_intro4 writing output... [ 67%] spinningup/spinningup writing output... [ 70%] user/algorithms writing output... [ 74%] user/installation writing output... [ 77%] user/introduction writing output... 
[ 80%] user/plotting writing output... [ 83%] user/running writing output... [ 87%] user/saving_and_loading writing output... [ 90%] utils/logger writing output... [ 93%] utils/mpi writing output... [ 96%] utils/plotter writing output... [100%] utils/run_utils generating indices... genindex py-modindex highlighting module code... [ 10%] spinup.algos.ddpg.ddpg highlighting module code... [ 20%] spinup.algos.ppo.ppo highlighting module code... [ 30%] spinup.algos.sac.sac highlighting module code... [ 40%] spinup.algos.td3.td3 highlighting module code... [ 50%] spinup.algos.trpo.trpo highlighting module code... [ 60%] spinup.algos.vpg.vpg highlighting module code... [ 70%] spinup.utils.logx highlighting module code... [ 80%] spinup.utils.mpi_tools highlighting module code... [ 90%] spinup.utils.mpi_tf highlighting module code... [100%] spinup.utils.run_utils writing additional pages... search copying images... [ 10%] images/spinning-up-in-rl.png copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg copying images... [ 40%] spinningup/../images/bench/bench_walker.svg copying images... [ 50%] spinningup/../images/bench/bench_swim.svg copying images... [ 60%] spinningup/../images/bench/bench_ant.svg copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg copying static files... WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist done WARNING: favicon file 'openai_icon.ico' does not exist copying extra files... done dumping search index in English (code: en) ... done dumping object inventory... done build succeeded, 16 warnings. 
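Two of the warnings above (`html_static_path entry ... does not exist` and `favicon file 'openai_icon.ico' does not exist`) concern paths configured in conf.py. A minimal sketch of a pre-build check, assuming these settings resolve relative to the directory containing conf.py; the helper name `missing_paths` is hypothetical and not part of Sphinx:

```python
import os.path


def missing_paths(conf_dir, paths):
    """Return the entries of `paths` that do not exist under `conf_dir`.

    Hypothetical helper, not a Sphinx API. Absolute entries are checked
    as-is, because os.path.join discards conf_dir for absolute paths.
    """
    return [p for p in paths
            if not os.path.exists(os.path.join(conf_dir, p))]


if __name__ == "__main__":
    # Values copied from the conf.py shown in this log.
    configured = ["_static", "images/spinning-up-logo2.png", "openai_icon.ico"]
    for path in missing_paths(".", configured):
        print("missing:", path)
```

Run from the docs directory, this would list any configured file that is absent before Sphinx reports it at the end of a build.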
[rtd-command-info] start-time: 2018-11-11T14:41:01.217415Z, end-time: 2018-11-11T14:42:18.751877Z, duration: 77, exit-code: 0 python sphinx-build -T -b readthedocssinglehtmllocalmedia -d _build/doctrees-readthedocssinglehtmllocalmedia -D language=en . _build/localmedia Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [readthedocssinglehtmllocalmedia]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... 
[100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done preparing documents... done assembling single document... user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author writing... done writing additional files... copying images... [ 12%] images/spinning-up-in-rl.png copying images... [ 25%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [ 37%] spinningup/../images/rl_algorithms_9_15.svg copying images... [ 50%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 62%] spinningup/../images/bench/bench_hopper.svg copying images... [ 75%] spinningup/../images/bench/bench_walker.svg copying images... [ 87%] spinningup/../images/bench/bench_swim.svg copying images... [100%] spinningup/../images/bench/bench_ant.svg copying static files... 
WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist done WARNING: favicon file 'openai_icon.ico' does not exist copying extra files... done dumping object inventory... done build succeeded, 16 warnings. [rtd-command-info] start-time: 2018-11-11T14:42:18.975367Z, end-time: 2018-11-11T14:42:24.641388Z, duration: 5, exit-code: 0 python sphinx-build -b latex -D language=en -d _build/doctrees . _build/latex Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... failed: source directory has changed building [mo]: targets for 0 po files that are out of date building [latex]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... 
[ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done processing SpinningUp.tex... index user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author resolving references... writing... done copying images... images/spinning-up-in-rl.png spinningup/../images/rl_diagram_transparent_bg.png spinningup/../images/rl_algorithms_9_15.svg spinningup/../images/bench/bench_halfcheetah.svg spinningup/../images/bench/bench_hopper.svg spinningup/../images/bench/bench_walker.svg spinningup/../images/bench/bench_swim.svg spinningup/../images/bench/bench_ant.svg copying TeX support files... done build succeeded, 14 warnings. 
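The conf.py shown earlier in this log defines `\E`, `\underE` and `\Epi` only in `imgmath_latex_preamble`, and the `'preamble'` entry of `latex_elements` is commented out. The LaTeX builder therefore probably has no definition for these macros, which would be consistent with the `Undefined control sequence` errors at `\underE` in the pdflatex run. A minimal sketch of one possible fix, assuming the macros should be shared with the LaTeX builder; the name `MATH_MACROS` is hypothetical:

```python
# Hypothetical refactor of conf.py: keep the math macros in one raw string
# so the imgmath extension and the LaTeX builder see the same definitions.
MATH_MACROS = r'''
\newcommand{\E}{{\mathrm E}}
\newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1 \end{subarray}}{\E}\left[ #2 \right]}
\newcommand{\Epi}[1]{\underset{\begin{subarray}{c}\tau \sim \pi \end{subarray}}{\E}\left[ #1 \right]}
'''

# Preamble for math rendered as images in HTML (packages as in the original).
imgmath_latex_preamble = r'''
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{cancel}
\usepackage[verbose=true,letterpaper]{geometry}
\geometry{
    textheight=12in,
    textwidth=6.5in,
    top=1in,
    headheight=12pt,
    headsep=25pt,
    footskip=30pt
}
''' + MATH_MACROS

# Preamble for the LaTeX/PDF builder: macros only, no extra packages.
latex_elements = {
    'preamble': MATH_MACROS,
}
```

Only the `\newcommand` lines are shared here. The `geometry` and `algorithm` packages stay in the imgmath preamble, since the pdflatex output in this log shows `geometry.sty` already being loaded by the Sphinx LaTeX setup.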
[rtd-command-info] start-time: 2018-11-11T14:42:24.703295Z, end-time: 2018-11-11T14:42:25.997889Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx No file SpinningUp.aux. (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] [1] [2] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) LaTeX Warning: Hyper reference `user/introduction:introduction' on page 3 undef ined on input line 70. LaTeX Warning: Hyper reference `user/introduction:what-this-is' on page 3 undef ined on input line 73. 
LaTeX Warning: Hyper reference `user/introduction:why-we-built-this' on page 3 undefined on input line 76.
LaTeX Warning: Hyper reference `user/introduction:how-this-serves-our-mission' on page 3 undefined on input line 79.
LaTeX Warning: Hyper reference `user/introduction:code-design-philosophy' on page 3 undefined on input line 82.
LaTeX Warning: Hyper reference `user/introduction:support-plan' on page 3 undefined on input line 85.
[3] [4] [5] [6]
Chapter 2.
LaTeX Warning: Hyper reference `user/installation:installation' on page 7 undefined on input line 209.
LaTeX Warning: Hyper reference `user/installation:installing-python' on page 7 undefined on input line 212.
LaTeX Warning: Hyper reference `user/installation:installing-openmpi' on page 7 undefined on input line 215.
LaTeX Warning: Hyper reference `user/installation:ubuntu' on page 7 undefined on input line 218.
LaTeX Warning: Hyper reference `user/installation:mac-os-x' on page 7 undefined on input line 221.
LaTeX Warning: Hyper reference `user/installation:installing-spinning-up' on page 7 undefined on input line 226.
LaTeX Warning: Hyper reference `user/installation:check-your-install' on page 7 undefined on input line 229.
LaTeX Warning: Hyper reference `user/installation:installing-mujoco-optional' on page 7 undefined on input line 232.
[7] [8] [9] [10]
Chapter 3.
LaTeX Warning: Hyper reference `user/algorithms:algorithms' on page 11 undefined on input line 359.
LaTeX Warning: Hyper reference `user/algorithms:what-s-included' on page 11 undefined on input line 362.
LaTeX Warning: Hyper reference `user/algorithms:why-these-algorithms' on page 11 undefined on input line 365.
LaTeX Warning: Hyper reference `user/algorithms:the-on-policy-algorithms' on page 11 undefined on input line 368.
LaTeX Warning: Hyper reference `user/algorithms:the-off-policy-algorithms' on page 11 undefined on input line 371.
LaTeX Warning: Hyper reference `user/algorithms:code-format' on page 11 undefined on input line 376.
LaTeX Warning: Hyper reference `user/algorithms:the-algorithm-file' on page 11 undefined on input line 379.
LaTeX Warning: Hyper reference `user/algorithms:the-core-file' on page 11 undefined on input line 382.
[11] [12] [13] [14]
Chapter 4.
LaTeX Warning: Hyper reference `user/running:running-experiments' on page 15 undefined on input line 528.
LaTeX Warning: Hyper reference `user/running:launching-from-the-command-line' on page 15 undefined on input line 531.
LaTeX Warning: Hyper reference `user/running:setting-hyperparameters-from-the-command-line' on page 15 undefined on input line 534.
LaTeX Warning: Hyper reference `user/running:launching-multiple-experiments-at-once' on page 15 undefined on input line 537.
LaTeX Warning: Hyper reference `user/running:special-flags' on page 15 undefined on input line 540.
LaTeX Warning: Hyper reference `user/running:environment-flag' on page 15 undefined on input line 543.
LaTeX Warning: Hyper reference `user/running:shortcut-flags' on page 15 undefined on input line 546.
LaTeX Warning: Hyper reference `user/running:config-flags' on page 15 undefined on input line 549.
LaTeX Warning: Hyper reference `user/running:where-results-are-saved' on page 15 undefined on input line 554.
LaTeX Warning: Hyper reference `user/running:how-is-suffix-determined' on page 15 undefined on input line 557.
LaTeX Warning: Hyper reference `user/running:extra' on page 15 undefined on input line 562.
LaTeX Warning: Hyper reference `user/running:launching-from-scripts' on page 15 undefined on input line 567.
LaTeX Warning: Hyper reference `user/running:using-experimentgrid' on page 15 undefined on input line 570.
[15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20]
Chapter 5.
LaTeX Warning: Hyper reference `user/saving_and_loading:experiment-outputs' on page 21 undefined on input line 898.
LaTeX Warning: Hyper reference `user/saving_and_loading:algorithm-outputs' on p age 21 undefined on input line 901. LaTeX Warning: Hyper reference `user/saving_and_loading:save-directory-location ' on page 21 undefined on input line 904. LaTeX Warning: Hyper reference `user/saving_and_loading:loading-and-running-tra ined-policies' on page 21 undefined on input line 907. LaTeX Warning: Hyper reference `user/saving_and_loading:if-environment-saves-su ccessfully' on page 21 undefined on input line 910. LaTeX Warning: Hyper reference `user/saving_and_loading:environment-not-found-e rror' on page 21 undefined on input line 913. LaTeX Warning: Hyper reference `user/saving_and_loading:using-trained-value-fun ctions' on page 21 undefined on input line 916. [21] LaTeX Warning: Hyper reference `user/saving_and_loading:details-below' on page 22 undefined on input line 961. [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. LaTeX Warning: Hyper reference `spinningup/rl_intro:part-1-key-concepts-in-rl' on page 29 undefined on input line 1277. LaTeX Warning: Hyper reference `spinningup/rl_intro:what-can-rl-do' on page 29 undefined on input line 1280. LaTeX Warning: Hyper reference `spinningup/rl_intro:key-concepts-and-terminolog y' on page 29 undefined on input line 1283. LaTeX Warning: Hyper reference `spinningup/rl_intro:optional-formalism' on page 29 undefined on input line 1286. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing \endgroup inserted. \endgroup l.1616 \end{align*} ! 
Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} [36] [37] [38] Chapter 8. LaTeX Warning: Hyper reference `spinningup/rl_intro2:part-2-kinds-of-rl-algorit hms' on page 39 undefined on input line 1676. LaTeX Warning: Hyper reference `spinningup/rl_intro2:a-taxonomy-of-rl-algorithm s' on page 39 undefined on input line 1679. LaTeX Warning: Hyper reference `spinningup/rl_intro2:links-to-algorithms-in-tax onomy' on page 39 undefined on input line 1682. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} LaTeX Warning: Hyper reference `spinningup/rl_intro2:citations-below' on page 3 9 undefined on input line 1698. [39] [40] [41] [42] Chapter 9. LaTeX Warning: Hyper reference `spinningup/rl_intro3:part-3-intro-to-policy-opt imization' on page 43 undefined on input line 1839. 
LaTeX Warning: Hyper reference `spinningup/rl_intro3:deriving-the-simplest-poli cy-gradient' on page 43 undefined on input line 1842. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-the-simplest- policy-gradient' on page 43 undefined on input line 1845. LaTeX Warning: Hyper reference `spinningup/rl_intro3:expected-grad-log-prob-lem ma' on page 43 undefined on input line 1848. LaTeX Warning: Hyper reference `spinningup/rl_intro3:don-t-let-the-past-distrac t-you' on page 43 undefined on input line 1851. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-reward-to-go- policy-gradient' on page 43 undefined on input line 1854. LaTeX Warning: Hyper reference `spinningup/rl_intro3:baselines-in-policy-gradie nts' on page 43 undefined on input line 1857. LaTeX Warning: Hyper reference `spinningup/rl_intro3:other-forms-of-the-policy- gradient' on page 43 undefined on input line 1860. LaTeX Warning: Hyper reference `spinningup/rl_intro3:recap' on page 43 undefine d on input line 1863. ! Undefined control sequence. l.1888 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... 
l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! 
Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. LaTeX Warning: Hyper reference `spinningup/spinningup:spinning-up-as-a-deep-rl- researcher' on page 53 undefined on input line 2251. LaTeX Warning: Hyper reference `spinningup/spinningup:the-right-background' on page 53 undefined on input line 2254. LaTeX Warning: Hyper reference `spinningup/spinningup:learn-by-doing' on page 5 3 undefined on input line 2257. LaTeX Warning: Hyper reference `spinningup/spinningup:developing-a-research-pro ject' on page 53 undefined on input line 2260. LaTeX Warning: Hyper reference `spinningup/spinningup:doing-rigorous-research-i n-rl' on page 53 undefined on input line 2263. LaTeX Warning: Hyper reference `spinningup/spinningup:closing-thoughts' on page 53 undefined on input line 2266. LaTeX Warning: Hyper reference `spinningup/spinningup:ps-other-resources' on pa ge 53 undefined on input line 2269. 
LaTeX Warning: Hyper reference `spinningup/spinningup:references' on page 53 un defined on input line 2272. [53] [54] [55] [56] [57] [58] Chapter 11. LaTeX Warning: Hyper reference `spinningup/keypapers:key-papers-in-deep-rl' on page 59 undefined on input line 2381. LaTeX Warning: Hyper reference `spinningup/keypapers:model-free-rl' on page 59 undefined on input line 2384. LaTeX Warning: Hyper reference `spinningup/keypapers:exploration' on page 59 un defined on input line 2387. LaTeX Warning: Hyper reference `spinningup/keypapers:transfer-and-multitask-rl' on page 59 undefined on input line 2390. LaTeX Warning: Hyper reference `spinningup/keypapers:hierarchy' on page 59 unde fined on input line 2393. LaTeX Warning: Hyper reference `spinningup/keypapers:memory' on page 59 undefin ed on input line 2396. LaTeX Warning: Hyper reference `spinningup/keypapers:model-based-rl' on page 59 undefined on input line 2399. LaTeX Warning: Hyper reference `spinningup/keypapers:meta-rl' on page 59 undefi ned on input line 2402. LaTeX Warning: Hyper reference `spinningup/keypapers:scaling-rl' on page 59 und efined on input line 2405. LaTeX Warning: Hyper reference `spinningup/keypapers:rl-in-the-real-world' on p age 59 undefined on input line 2408. LaTeX Warning: Hyper reference `spinningup/keypapers:safety' on page 59 undefin ed on input line 2411. LaTeX Warning: Hyper reference `spinningup/keypapers:imitation-learning-and-inv erse-reinforcement-learning' on page 59 undefined on input line 2414. LaTeX Warning: Hyper reference `spinningup/keypapers:bonus-classic-papers-in-rl -theory-or-review' on page 59 undefined on input line 2417. [59] [60] Overfull \vbox (74.45543pt too high) has occurred while \output is active [61] [62] Chapter 12. LaTeX Warning: Hyper reference `spinningup/exercises:exercises' on page 63 unde fined on input line 2503. LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-1-basics-of-im plementation' on page 63 undefined on input line 2506. 
LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-2-algorithm-fa ilure-modes' on page 63 undefined on input line 2509. LaTeX Warning: Hyper reference `spinningup/exercises:challenges' on page 63 und efined on input line 2512. [63] [64] [65] [66] Chapter 13. LaTeX Warning: Hyper reference `spinningup/bench:benchmarks-for-spinning-up-imp lementations' on page 67 undefined on input line 2651. LaTeX Warning: Hyper reference `spinningup/bench:performance-in-each-environmen t' on page 67 undefined on input line 2654. LaTeX Warning: Hyper reference `spinningup/bench:halfcheetah' on page 67 undefi ned on input line 2657. LaTeX Warning: Hyper reference `spinningup/bench:hopper' on page 67 undefined o n input line 2660. LaTeX Warning: Hyper reference `spinningup/bench:walker' on page 67 undefined o n input line 2663. LaTeX Warning: Hyper reference `spinningup/bench:swimmer' on page 67 undefined on input line 2666. LaTeX Warning: Hyper reference `spinningup/bench:ant' on page 67 undefined on i nput line 2669. LaTeX Warning: Hyper reference `spinningup/bench:experiment-details' on page 67 undefined on input line 2674. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2692 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2692 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2701 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2701 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.2710 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2710 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2719 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2719 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2728 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2728 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. LaTeX Warning: Hyper reference `algorithms/vpg:vanilla-policy-gradient' on page 71 undefined on input line 2751. LaTeX Warning: Hyper reference `algorithms/vpg:background' on page 71 undefined on input line 2754. LaTeX Warning: Hyper reference `algorithms/vpg:quick-facts' on page 71 undefine d on input line 2757. LaTeX Warning: Hyper reference `algorithms/vpg:key-equations' on page 71 undefi ned on input line 2760. LaTeX Warning: Hyper reference `algorithms/vpg:exploration-vs-exploitation' on page 71 undefined on input line 2763. LaTeX Warning: Hyper reference `algorithms/vpg:pseudocode' on page 71 undefined on input line 2766. LaTeX Warning: Hyper reference `algorithms/vpg:documentation' on page 71 undefi ned on input line 2771. LaTeX Warning: Hyper reference `algorithms/vpg:saved-model-contents' on page 71 undefined on input line 2774. 
LaTeX Warning: Hyper reference `algorithms/vpg:references' on page 71 undefined on input line 2779. LaTeX Warning: Hyper reference `algorithms/vpg:relevant-papers' on page 71 unde fined on input line 2782. LaTeX Warning: Hyper reference `algorithms/vpg:why-these-papers' on page 71 und efined on input line 2785. LaTeX Warning: Hyper reference `algorithms/vpg:other-public-implementations' on page 71 undefined on input line 2788. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2825 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2825 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2842 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2843 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2845 \begin{algorithmic} [1] ! Undefined control sequence. l.2846 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2847 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2848 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2849 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2850 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2851 \STATE Estimate policy gradient as ! Undefined control sequence. l.2855 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. l.2860 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2865 \ENDFOR ! 
LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2866 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2867 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2873--2873 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2873--2873 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. LaTeX Warning: Hyper reference `algorithms/trpo:trust-region-policy-optimizatio n' on page 77 undefined on input line 3073. LaTeX Warning: Hyper reference `algorithms/trpo:background' on page 77 undefine d on input line 3076. LaTeX Warning: Hyper reference `algorithms/trpo:quick-facts' on page 77 undefin ed on input line 3079. LaTeX Warning: Hyper reference `algorithms/trpo:key-equations' on page 77 undef ined on input line 3082. LaTeX Warning: Hyper reference `algorithms/trpo:exploration-vs-exploitation' on page 77 undefined on input line 3085. LaTeX Warning: Hyper reference `algorithms/trpo:pseudocode' on page 77 undefine d on input line 3088. LaTeX Warning: Hyper reference `algorithms/trpo:documentation' on page 77 undef ined on input line 3093. LaTeX Warning: Hyper reference `algorithms/trpo:saved-model-contents' on page 7 7 undefined on input line 3096. LaTeX Warning: Hyper reference `algorithms/trpo:references' on page 77 undefine d on input line 3101. LaTeX Warning: Hyper reference `algorithms/trpo:relevant-papers' on page 77 und efined on input line 3104. 
LaTeX Warning: Hyper reference `algorithms/trpo:why-these-papers' on page 77 un defined on input line 3107. LaTeX Warning: Hyper reference `algorithms/trpo:other-public-implementations' o n page 77 undefined on input line 3110. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3154 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3154 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3160 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3160 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3209 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3210 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3212 \begin{algorithmic} [1] ! Undefined control sequence. l.3213 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3214 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3215 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3216 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3217 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3218 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3219 \STATE Estimate policy gradient as ! Undefined control sequence. l.3223 \STATE Use the conjugate gradient algorithm to compute ! 
Undefined control sequence. l.3228 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3233 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3238 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3239 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3240 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. LaTeX Warning: Hyper reference `algorithms/ppo:proximal-policy-optimization' on page 85 undefined on input line 3520. 
LaTeX Warning: Hyper reference `algorithms/ppo:background' on page 85 undefined on input line 3523. LaTeX Warning: Hyper reference `algorithms/ppo:quick-facts' on page 85 undefine d on input line 3526. LaTeX Warning: Hyper reference `algorithms/ppo:key-equations' on page 85 undefi ned on input line 3529. LaTeX Warning: Hyper reference `algorithms/ppo:exploration-vs-exploitation' on page 85 undefined on input line 3532. LaTeX Warning: Hyper reference `algorithms/ppo:pseudocode' on page 85 undefined on input line 3535. LaTeX Warning: Hyper reference `algorithms/ppo:documentation' on page 85 undefi ned on input line 3540. LaTeX Warning: Hyper reference `algorithms/ppo:saved-model-contents' on page 85 undefined on input line 3543. LaTeX Warning: Hyper reference `algorithms/ppo:references' on page 85 undefined on input line 3548. LaTeX Warning: Hyper reference `algorithms/ppo:relevant-papers' on page 85 unde fined on input line 3551. LaTeX Warning: Hyper reference `algorithms/ppo:why-these-papers' on page 85 und efined on input line 3554. LaTeX Warning: Hyper reference `algorithms/ppo:other-public-implementations' on page 85 undefined on input line 3557. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3666 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3667 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3669 \begin{algorithmic} [1] ! Undefined control sequence. l.3670 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3671 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3672 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. 
l.3673 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3674 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3675 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3683 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3688 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3689 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3690 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3696--3696 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. LaTeX Warning: Hyper reference `algorithms/ddpg:deep-deterministic-policy-gradi ent' on page 91 undefined on input line 3916. LaTeX Warning: Hyper reference `algorithms/ddpg:background' on page 91 undefine d on input line 3919. LaTeX Warning: Hyper reference `algorithms/ddpg:quick-facts' on page 91 undefin ed on input line 3922. LaTeX Warning: Hyper reference `algorithms/ddpg:key-equations' on page 91 undef ined on input line 3925. LaTeX Warning: Hyper reference `algorithms/ddpg:the-q-learning-side-of-ddpg' on page 91 undefined on input line 3928. LaTeX Warning: Hyper reference `algorithms/ddpg:the-policy-learning-side-of-ddp g' on page 91 undefined on input line 3931. LaTeX Warning: Hyper reference `algorithms/ddpg:exploration-vs-exploitation' on page 91 undefined on input line 3936. LaTeX Warning: Hyper reference `algorithms/ddpg:pseudocode' on page 91 undefine d on input line 3939. 
LaTeX Warning: Hyper reference `algorithms/ddpg:documentation' on page 91 undef ined on input line 3944. LaTeX Warning: Hyper reference `algorithms/ddpg:saved-model-contents' on page 9 1 undefined on input line 3947. LaTeX Warning: Hyper reference `algorithms/ddpg:references' on page 91 undefine d on input line 3952. LaTeX Warning: Hyper reference `algorithms/ddpg:relevant-papers' on page 91 und efined on input line 3955. LaTeX Warning: Hyper reference `algorithms/ddpg:why-these-papers' on page 91 un defined on input line 3958. LaTeX Warning: Hyper reference `algorithms/ddpg:other-public-implementations' o n page 91 undefined on input line 3961. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4080 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4081 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4083 \begin{algorithmic} [1] ! Undefined control sequence. l.4084 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4085 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4086 \REPEAT ! Undefined control sequence. l.4087 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4088 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4089 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4090 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4091 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4092 \IF {it's time to update} ! 
Undefined control sequence. l.4093 \FOR {however many updates} ! Undefined control sequence. l.4094 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4095 \STATE Compute targets ! Undefined control sequence. l.4099 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4103 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4107 \STATE Update target networks with ! Undefined control sequence. l.4112 \ENDFOR ! Undefined control sequence. l.4113 \ENDIF ! Undefined control sequence. l.4114 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4115 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4116 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4122--4122 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4122--4122 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. LaTeX Warning: Hyper reference `algorithms/td3:twin-delayed-ddpg' on page 97 un defined on input line 4337. LaTeX Warning: Hyper reference `algorithms/td3:background' on page 97 undefined on input line 4340. LaTeX Warning: Hyper reference `algorithms/td3:quick-facts' on page 97 undefine d on input line 4343. LaTeX Warning: Hyper reference `algorithms/td3:key-equations' on page 97 undefi ned on input line 4346. 
LaTeX Warning: Hyper reference `algorithms/td3:exploration-vs-exploitation' on page 97 undefined on input line 4349. LaTeX Warning: Hyper reference `algorithms/td3:pseudocode' on page 97 undefined on input line 4352. LaTeX Warning: Hyper reference `algorithms/td3:documentation' on page 97 undefi ned on input line 4357. LaTeX Warning: Hyper reference `algorithms/td3:saved-model-contents' on page 97 undefined on input line 4360. LaTeX Warning: Hyper reference `algorithms/td3:references' on page 97 undefined on input line 4365. LaTeX Warning: Hyper reference `algorithms/td3:relevant-papers' on page 97 unde fined on input line 4368. LaTeX Warning: Hyper reference `algorithms/td3:other-public-implementations' on page 97 undefined on input line 4371. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4428 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4428 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4432 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4432 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4456 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4457 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4459 \begin{algorithmic} [1] ! Undefined control sequence. l.4460 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4461 \STATE Set target parameters equal to main parameters $\theta_{\t... ! 
Undefined control sequence. l.4462 \REPEAT ! Undefined control sequence. l.4463 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4464 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4465 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4466 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4467 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4468 \IF {it's time to update} ! Undefined control sequence. l.4469 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4470 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4471 \STATE Compute target actions ! Undefined control sequence. l.4475 \STATE Compute targets ! Undefined control sequence. l.4479 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4483 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4484 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4488 \STATE Update target networks with ! Undefined control sequence. l.4493 \ENDIF ! Undefined control sequence. l.4494 \ENDFOR ! Undefined control sequence. l.4495 \ENDIF ! Undefined control sequence. l.4496 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4497 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4498 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4504--4504 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4504--4504 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. LaTeX Warning: Hyper reference `algorithms/sac:soft-actor-critic' on page 103 u ndefined on input line 4730. LaTeX Warning: Hyper reference `algorithms/sac:background' on page 103 undefine d on input line 4733. LaTeX Warning: Hyper reference `algorithms/sac:quick-facts' on page 103 undefin ed on input line 4736. LaTeX Warning: Hyper reference `algorithms/sac:key-equations' on page 103 undef ined on input line 4739. LaTeX Warning: Hyper reference `algorithms/sac:entropy-regularized-reinforcemen t-learning' on page 103 undefined on input line 4742. LaTeX Warning: Hyper reference `algorithms/sac:id1' on page 103 undefined on in put line 4745. LaTeX Warning: Hyper reference `algorithms/sac:exploration-vs-exploitation' on page 103 undefined on input line 4750. LaTeX Warning: Hyper reference `algorithms/sac:pseudocode' on page 103 undefine d on input line 4753. LaTeX Warning: Hyper reference `algorithms/sac:documentation' on page 103 undef ined on input line 4758. LaTeX Warning: Hyper reference `algorithms/sac:saved-model-contents' on page 10 3 undefined on input line 4761. LaTeX Warning: Hyper reference `algorithms/sac:references' on page 103 undefine d on input line 4766. LaTeX Warning: Hyper reference `algorithms/sac:relevant-papers' on page 103 und efined on input line 4769. 
LaTeX Warning: Hyper reference `algorithms/sac:other-public-implementations' on page 103 undefined on input line 4772. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4819 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4819 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4823 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4823 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4827 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4827 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4831 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4831 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4835 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4835 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. 
...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4877 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4877 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4916 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4917 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4919 \begin{algorithmic} [1] ! Undefined control sequence. l.4920 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. 
l.4921 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4922 \REPEAT ! Undefined control sequence. l.4923 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4924 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4925 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4926 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4927 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4928 \IF {it's time to update} ! Undefined control sequence. l.4929 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4930 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4931 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4936 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4940 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4944 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4949 \STATE Update target value network with ! Undefined control sequence. l.4953 \ENDFOR ! Undefined control sequence. l.4954 \ENDIF ! Undefined control sequence. l.4955 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4956 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4957 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4963--4963 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4963--4963 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. LaTeX Warning: Hyper reference `utils/logger:logger' on page 111 undefined on i nput line 5227. LaTeX Warning: Hyper reference `utils/logger:using-a-logger' on page 111 undefi ned on input line 5230. LaTeX Warning: Hyper reference `utils/logger:examples' on page 111 undefined on input line 5233. LaTeX Warning: Hyper reference `utils/logger:logging-and-mpi' on page 111 undef ined on input line 5236. LaTeX Warning: Hyper reference `utils/logger:logger-classes' on page 111 undefi ned on input line 5241. LaTeX Warning: Hyper reference `utils/logger:loading-saved-graphs' on page 111 undefined on input line 5244. [111] [112] [113] [114] LaTeX Warning: Hyper reference `utils/logger:spinup.utils.logx.Logger' on page 115 undefined on input line 5542. [115] [116] Chapter 21. [117] [118] Chapter 22. LaTeX Warning: Hyper reference `utils/mpi:mpi-tools' on page 119 undefined on i nput line 5687. LaTeX Warning: Hyper reference `utils/mpi:module-spinup.utils.mpi_tools' on pag e 119 undefined on input line 5690. LaTeX Warning: Hyper reference `utils/mpi:mpi-tensorflow-utilities' on page 119 undefined on input line 5693. [119] [120] Chapter 23. LaTeX Warning: Hyper reference `utils/run_utils:run-utils' on page 121 undefine d on input line 5820. 
LaTeX Warning: Hyper reference `utils/run_utils:experimentgrid' on page 121 und efined on input line 5823. LaTeX Warning: Hyper reference `utils/run_utils:calling-experiments' on page 12 1 undefined on input line 5826. [121] Underfull \hbox (badness 10000) in paragraph at lines 5978--5978 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. [129] [130] LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tf' on page 131 und efined on input line 6111. LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tools' on page 131 undefined on input line 6112. [131] No file SpinningUp.ind. (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were undefined references. LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (135 pages, 1117784 bytes). Transcript written on SpinningUp.log. [rtd-command-info] start-time: 2018-11-11T14:42:26.074550Z, end-time: 2018-11-11T14:42:26.211046Z, duration: 0, exit-code: 0 makeindex -s python.ist SpinningUp.idx This is makeindex, version 2.15 [TeX Live 2015] (kpathsea + Thai support). Scanning style file ./python.ist.......done (7 attributes redefined, 0 ignored). Scanning input file SpinningUp.idx....done (78 entries accepted, 0 rejected). Sorting entries....done (506 comparisons). Generating output file SpinningUp.ind....done (144 lines written, 0 warnings). Output written in SpinningUp.ind. Transcript written in SpinningUp.ilg. 
[rtd-command-info] start-time: 2018-11-11T14:42:26.269755Z, end-time: 2018-11-11T14:42:27.466119Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx (./SpinningUp.aux LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. ) (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (./SpinningUp.out) (./SpinningUp.out) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] (./SpinningUp.toc [1] [2]) [3] [4] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. 
(/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) [3] [4] [5] [6] Chapter 2. [7] [8] [9] [10] Chapter 3. [11] [12] [13] [14] Chapter 4. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. [21] [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... 
l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing \endgroup inserted. \endgroup l.1616 \end{align*} ! Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} [36] [37] [38] Chapter 8. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} [39] [40] [41] [42] Chapter 9. ! Undefined control sequence. l.1888 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. 
...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. [53] [54] [55] [56] [57] [58] Chapter 11. [59] [60] Overfull \vbox (74.45543pt too high) has occurred while \output is active [61] [62] Chapter 12. [63] [64] [65] [66] Chapter 13. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2692 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2692 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2701 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2701 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2710 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2710 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2719 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2719 ...\sphinxincludegraphics{{bench_swim}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2728 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2728 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2825 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2825 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2842 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2843 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2845 \begin{algorithmic} [1] ! Undefined control sequence. l.2846 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2847 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2848 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2849 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2850 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2851 \STATE Estimate policy gradient as ! Undefined control sequence. l.2855 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. l.2860 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2865 \ENDFOR ! 
LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2866 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2867 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2873--2873 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2873--2873 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3154 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3154 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3160 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3160 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3209 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3210 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3212 \begin{algorithmic} [1] ! Undefined control sequence. 
l.3213 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3214 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3215 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3216 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3217 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3218 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3219 \STATE Estimate policy gradient as ! Undefined control sequence. l.3223 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3228 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3233 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3238 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3239 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.3240 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3246--3246 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3666 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3667 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3669 \begin{algorithmic} [1] ! Undefined control sequence. l.3670 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3671 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. 
l.3672 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3673 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3674 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3675 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3683 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3688 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3689 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3690 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3696--3696 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4080 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4081 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4083 \begin{algorithmic} [1] ! Undefined control sequence. l.4084 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4085 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4086 \REPEAT ! Undefined control sequence. 
l.4087 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4088 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4089 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4090 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4091 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4092 \IF {it's time to update} ! Undefined control sequence. l.4093 \FOR {however many updates} ! Undefined control sequence. l.4094 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4095 \STATE Compute targets ! Undefined control sequence. l.4099 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4103 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4107 \STATE Update target networks with ! Undefined control sequence. l.4112 \ENDFOR ! Undefined control sequence. l.4113 \ENDIF ! Undefined control sequence. l.4114 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4115 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4116 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4122--4122 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4122--4122 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4428 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4428 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4432 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4432 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4456 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4457 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4459 \begin{algorithmic} [1] ! Undefined control sequence. l.4460 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4461 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4462 \REPEAT ! Undefined control sequence. l.4463 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. 
l.4464 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4465 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4466 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4467 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4468 \IF {it's time to update} ! Undefined control sequence. l.4469 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4470 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4471 \STATE Compute target actions ! Undefined control sequence. l.4475 \STATE Compute targets ! Undefined control sequence. l.4479 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4483 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4484 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4488 \STATE Update target networks with ! Undefined control sequence. l.4493 \ENDIF ! Undefined control sequence. l.4494 \ENDFOR ! Undefined control sequence. l.4495 \ENDIF ! Undefined control sequence. l.4496 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4497 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4498 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4504--4504 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4504--4504 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4819 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4819 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4823 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4823 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4827 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4827 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4831 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4831 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4835 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4835 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4840 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4863 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. 
...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4871 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4877 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4877 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4894 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4898 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4916 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! 
LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4917 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4919 \begin{algorithmic} [1] ! Undefined control sequence. l.4920 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4921 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4922 \REPEAT ! Undefined control sequence. l.4923 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4924 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4925 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4926 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4927 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4928 \IF {it's time to update} ! Undefined control sequence. l.4929 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4930 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4931 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4936 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4940 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4944 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4949 \STATE Update target value network with ! Undefined control sequence. l.4953 \ENDFOR ! Undefined control sequence. l.4954 \ENDIF ! Undefined control sequence. l.4955 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.4956 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4957 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4963--4963 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4963--4963 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. [111] [112] [113] [114] [115] [116] Chapter 21. [117] [118] Chapter 22. [119] [120] Chapter 23. [121] Underfull \hbox (badness 10000) in paragraph at lines 5978--5978 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. 
[129] [130] [131] (./SpinningUp.ind [132] Underfull \hbox (badness 7522) in paragraph at lines 47--48 []\T1/ptm/m/n/10 add() (spinup.utils.run_utils.ExperimentGrid method), Overfull \hbox (5.61969pt too wide) in paragraph at lines 48--49 []\T1/ptm/m/n/10 apply_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer Overfull \hbox (17.83952pt too wide) in paragraph at lines 74--75 []\T1/ptm/m/n/10 compute_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer [133] Underfull \hbox (badness 10000) in paragraph at lines 103--104 []\T1/ptm/m/n/10 mpi_statistics_scalar() (in mod-ule Underfull \hbox (badness 10000) in paragraph at lines 119--120 []\T1/ptm/m/n/10 run() (spinup.utils.run_utils.ExperimentGrid method), Underfull \hbox (badness 10000) in paragraph at lines 140--141 []\T1/ptm/m/n/10 variant_name() (spinup.utils.run_utils.ExperimentGrid Underfull \hbox (badness 10000) in paragraph at lines 141--142 []\T1/ptm/m/n/10 variants() (spinup.utils.run_utils.ExperimentGrid [134]) (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were multiply-defined labels. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (140 pages, 1145586 bytes). Transcript written on SpinningUp.log. 
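Note on the pdflatex errors above: they fall into three groups. The first is `Unknown graphics extension: .svg`. The second is `Undefined control sequence` for `\underE`, `\cancel`, `\STATE`, `\FOR` and related commands. The third is `Environment algorithm undefined` and `Environment algorithmic undefined`. One plausible cause is that the macros and packages used in the math and pseudocode are not loaded in the preamble of the LaTeX builder. The `conf.py` fragment below is a minimal sketch of a possible fix, not the project's actual configuration. It assumes `sphinx.ext.imgconverter` is usable (it requires Sphinx 1.6 or later plus ImageMagick, whereas the epub step in this log reports Sphinx v1.5.6). The `\underE` definition is illustrative only, and the name `LATEX_PREAMBLE` is hypothetical.

```python
# conf.py fragment: a hedged sketch, not the project's real settings.

# Assumption: sphinx.ext.imgconverter is available, so SVG figures can be
# converted to a format that pdflatex accepts. In a real conf.py, append this
# entry to the existing `extensions` list instead of replacing the list.
extensions = ["sphinx.ext.imgconverter"]

# Hypothetical helper name. The packages below provide the commands the log
# reports as undefined:
#   cancel      -> \cancel
#   algorithm   -> the `algorithm` float environment (with [H] placement)
#   algorithmic -> \STATE, \FOR, \ENDFOR, \IF, \ENDIF, \REPEAT, \UNTIL
# The \underE macro is an illustrative definition of an expectation with a
# subscript; the subarray wrapper allows line breaks (\\) inside the subscript.
LATEX_PREAMBLE = r"""
\usepackage{amsmath}
\usepackage{cancel}
\usepackage{algorithm}
\usepackage{algorithmic}
\newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1\end{subarray}}{\mathrm{E}}\left[ #2 \right]}
"""

# `latex_elements` is the standard Sphinx option for customizing the LaTeX
# builder; its "preamble" key is inserted into the generated .tex file.
latex_elements = {"preamble": LATEX_PREAMBLE}
```

Because the PDF step still exits with code 0 and writes `SpinningUp.pdf`, these errors do not fail the build. They only mean that the affected equations, figures and pseudocode are likely rendered incorrectly in the PDF.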
[rtd-command-info] start-time: 2018-11-11T14:42:27.541554Z, end-time: 2018-11-11T14:42:27.603226Z, duration: 0, exit-code: 0 mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.pdf /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/latest/sphinx_pdf/openai-education-spinningup.pdf [rtd-command-info] start-time: 2018-11-11T14:42:27.677540Z, end-time: 2018-11-11T14:43:58.914829Z, duration: 91, exit-code: 0 python sphinx-build -T -b epub -d _build/doctrees-epub -D language=en . _build/epub Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [epub]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... 
[ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils looking for now-outdated files... none found pickling environment... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree done preparing documents... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... [ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... [ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... 
[ 61%] spinningup/rl_intro3
writing output... [ 64%] spinningup/rl_intro4
writing output... [ 67%] spinningup/spinningup
writing output... [ 70%] user/algorithms
writing output... [ 74%] user/installation
writing output... [ 77%] user/introduction
writing output... [ 80%] user/plotting
writing output... [ 83%] user/running
writing output... [ 87%] user/saving_and_loading
writing output... [ 90%] utils/logger
writing output... [ 93%] utils/mpi
writing output... [ 96%] utils/plotter
writing output... [100%] utils/run_utils
generating indices... genindex py-modindex
writing additional pages...
copying images... [ 10%] images/spinning-up-in-rl.png
copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg
copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg
copying images... [ 40%] spinningup/../images/bench/bench_walker.svg
copying images... [ 50%] spinningup/../images/bench/bench_swim.svg
copying images... [ 60%] spinningup/../images/bench/bench_ant.svg
copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png
copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg
copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png
copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg
copying static files...
WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist
WARNING: favicon file 'openai_icon.ico' does not exist
done
copying extra files... done
writing mimetype file...
writing META-INF/container.xml file...
writing content.opf file...
WARNING: unknown mimetype for _static/openai-favicon2_32x32.ico, ignoring
WARNING: unknown mimetype for _static/openai_icon.ico, ignoring
writing nav.xhtml file...
writing toc.ncx file...
writing SpinningUp.epub file...
build succeeded, 18 warnings.
[rtd-command-info] start-time: 2018-11-11T14:43:58.997015Z, end-time: 2018-11-11T14:43:59.052681Z, duration: 0, exit-code: 0
mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/epub/SpinningUp.epub /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/latest/sphinx_epub/openai-education-spinningup.epub