Read the Docs build information
Build id: 158209
Project: openai-education-spinningup
Version: stable
Commit: c671b47d16ec1e818c9f68f3e27264b0cd507be5
Date: 2018-11-11T08:38:16.328797Z
State: finished
Success: True
[rtd-command-info] start-time: 2018-11-11T14:44:00.072369Z, end-time: 2018-11-11T14:44:02.457872Z, duration: 2, exit-code: 0
git clone git@github.com:openai/spinningup.git .
Cloning into '.'...
[rtd-command-info] start-time: 2018-11-11T14:44:02.743440Z, end-time: 2018-11-11T14:44:03.129615Z, duration: 0, exit-code: 0
git checkout --force c671b47d16ec1e818c9f68f3e27264b0cd507be5
Note: checking out 'c671b47d16ec1e818c9f68f3e27264b0cd507be5'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
  git checkout -b
HEAD is now at c671b47 Readme to link to the docs site.
[rtd-command-info] start-time: 2018-11-11T14:44:03.194995Z, end-time: 2018-11-11T14:44:03.203248Z, duration: 0, exit-code: 0
git clean -d -f -f
[rtd-command-info] start-time: 2018-11-11T14:44:03.301502Z, end-time: 2018-11-11T14:44:03.308696Z, duration: 0, exit-code: 0
git branch -r
  origin/HEAD -> origin/master
  origin/master
[rtd-command-info] start-time: 2018-11-11T14:44:04.067477Z, end-time: 2018-11-11T14:44:06.906206Z, duration: 2, exit-code: 0
python3.6 -mvirtualenv --no-site-packages --no-download
Using base prefix '/home/docs/.pyenv/versions/3.6.2'
New python executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/bin/python3.6
Also creating executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/bin/python
Installing setuptools, pip, wheel...done.
[rtd-command-info] start-time: 2018-11-11T14:44:06.972079Z, end-time: 2018-11-11T14:44:15.875363Z, duration: 8, exit-code: 0 python pip install --upgrade --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip Pygments==2.2.0 setuptools<40 docutils==0.13.1 mock==1.0.1 pillow==2.6.1 alabaster>=0.7,<0.8,!=0.7.5 commonmark==0.5.4 recommonmark==0.4.0 sphinx<1.8 sphinx-rtd-theme<0.5 readthedocs-sphinx-ext<0.6 Collecting Pygments==2.2.0 Using cached https://files.pythonhosted.org/packages/02/ee/b6e02dc6529e82b75bb06823ff7d005b141037cb1416b10c6f00fc419dca/Pygments-2.2.0-py2.py3-none-any.whl Collecting setuptools<40 Using cached https://files.pythonhosted.org/packages/7f/e1/820d941153923aac1d49d7fc37e17b6e73bfbd2904959fffbad77900cf92/setuptools-39.2.0-py2.py3-none-any.whl Collecting docutils==0.13.1 Using cached https://files.pythonhosted.org/packages/7c/30/8fb30d820c012a6f701a66618ce065b6d61d08ac0a77e47fc7808dbaee47/docutils-0.13.1-py3-none-any.whl Collecting mock==1.0.1 Collecting pillow==2.6.1 Collecting alabaster!=0.7.5,<0.8,>=0.7 Using cached https://files.pythonhosted.org/packages/10/ad/00b090d23a222943eb0eda509720a404f531a439e803f6538f35136cae9e/alabaster-0.7.12-py2.py3-none-any.whl Collecting commonmark==0.5.4 Collecting recommonmark==0.4.0 Using cached https://files.pythonhosted.org/packages/df/a5/8ee4b84af7f997dfdba71254a88008cfc19c49df98983c9a4919e798f8ce/recommonmark-0.4.0-py2.py3-none-any.whl Collecting sphinx<1.8 Using cached https://files.pythonhosted.org/packages/90/f9/a0babe32c78480994e4f1b93315558f5ed756104054a7029c672a8d77b72/Sphinx-1.7.9-py2.py3-none-any.whl Collecting sphinx-rtd-theme<0.5 Using cached https://files.pythonhosted.org/packages/ef/0c/e4a462190506bc4bff6ca8cf93da07b2d13e540466d2e8a760352d0c69b0/sphinx_rtd_theme-0.4.2-py2.py3-none-any.whl Collecting readthedocs-sphinx-ext<0.6 Using cached 
https://files.pythonhosted.org/packages/2b/c5/126eb75a57918bb3d2f858ddda05f5670d6f07bfa356bc8870e2885f6aac/readthedocs_sphinx_ext-0.5.15-py2.py3-none-any.whl Collecting requests>=2.0.0 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/ff/17/5cbb026005115301a8fb2f9b0e3e8d32313142fe8b617070e7baad20554f/requests-2.20.1-py2.py3-none-any.whl Collecting babel!=2.0,>=1.3 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/b8/ad/c6f60602d3ee3d92fbed87675b6fb6a6f9a38c223343ababdb44ba201f10/Babel-2.6.0-py2.py3-none-any.whl Collecting Jinja2>=2.3 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl Collecting packaging (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/89/d1/92e6df2e503a69df9faab187c684585f0136662c12bb1f36901d426f3fab/packaging-18.0-py2.py3-none-any.whl Collecting six>=1.5 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl Collecting snowballstemmer>=1.1 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/d4/6c/8a935e2c7b54a37714656d753e4187ee0631988184ed50c0cf6476858566/snowballstemmer-1.2.1-py2.py3-none-any.whl Collecting sphinxcontrib-websupport (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/52/69/3c2fbdc3702358c5b34ee25e387b24838597ef099761fc9a42c166796e8f/sphinxcontrib_websupport-1.1.0-py2.py3-none-any.whl Collecting imagesize (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/fc/b6/aef66b4c52a6ad6ac18cf6ebc5731ed06d8c9ae4d3b2d9951f261150be67/imagesize-1.1.0-py2.py3-none-any.whl Collecting chardet<3.1.0,>=3.0.2 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl 
Collecting idna<2.8,>=2.5 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl Collecting certifi>=2017.4.17 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/56/9d/1d02dd80bc4cd955f98980f28c5ee2200e1209292d5f9e9cc8d030d18655/certifi-2018.10.15-py2.py3-none-any.whl Collecting urllib3<1.25,>=1.21.1 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl Collecting pytz>=0a (from babel!=2.0,>=1.3->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/f8/0e/2365ddc010afb3d79147f1dd544e5ee24bf4ece58ab99b16fbb465ce6dc0/pytz-2018.7-py2.py3-none-any.whl Collecting MarkupSafe>=0.23 (from Jinja2>=2.3->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/08/04/f2191b50fb7f0712f03f064b71d8b4605190f2178ba02e975a87f7b89a0d/MarkupSafe-1.1.0-cp36-cp36m-manylinux1_x86_64.whl Collecting pyparsing>=2.0.2 (from packaging->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/71/e8/6777f6624681c8b9701a8a0a5654f3eb56919a01a78e12bf3c73f5a3c714/pyparsing-2.3.0-py2.py3-none-any.whl Installing collected packages: Pygments, setuptools, docutils, mock, pillow, alabaster, commonmark, recommonmark, chardet, idna, certifi, urllib3, requests, pytz, babel, MarkupSafe, Jinja2, six, pyparsing, packaging, snowballstemmer, sphinxcontrib-websupport, imagesize, sphinx, sphinx-rtd-theme, readthedocs-sphinx-ext Found existing installation: setuptools 39.0.1 Uninstalling setuptools-39.0.1: Successfully uninstalled setuptools-39.0.1 Successfully installed Jinja2-2.10 MarkupSafe-1.1.0 Pygments-2.2.0 alabaster-0.7.12 babel-2.6.0 certifi-2018.10.15 chardet-3.0.4 commonmark-0.5.4 docutils-0.13.1 idna-2.7 imagesize-1.1.0 mock-1.0.1 packaging-18.0 pillow-2.6.1 pyparsing-2.3.0 
pytz-2018.7 readthedocs-sphinx-ext-0.5.15 recommonmark-0.4.0 requests-2.20.1 setuptools-39.2.0 six-1.11.0 snowballstemmer-1.2.1 sphinx-1.7.9 sphinx-rtd-theme-0.4.2 sphinxcontrib-websupport-1.1.0 urllib3-1.24.1 You are using pip version 9.0.3, however version 18.1 is available. You should consider upgrading via the 'pip install --upgrade pip' command. [rtd-command-info] start-time: 2018-11-11T14:44:15.936982Z, end-time: 2018-11-11T14:45:04.527075Z, duration: 48, exit-code: 0 python pip install --exists-action=w --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip -r docs/docs_requirements.txt Collecting cloudpickle==0.5.2 (from -r docs/docs_requirements.txt (line 1)) Using cached https://files.pythonhosted.org/packages/aa/18/514b557c4d8d4ada1f0454ad06c845454ad438fd5c5e0039ba51d6b032fe/cloudpickle-0.5.2-py2.py3-none-any.whl Collecting gym>=0.10.8 (from -r docs/docs_requirements.txt (line 2)) Collecting ipython (from -r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/1b/e2/ffb8c1b574f972cf4183b0aac8f16b57f1e3bbe876b31555b107ea3fd009/ipython-7.1.1-py3-none-any.whl Collecting joblib (from -r docs/docs_requirements.txt (line 4)) Using cached https://files.pythonhosted.org/packages/0d/1b/995167f6c66848d4eb7eabc386aebe07a1571b397629b2eac3b7bebdc343/joblib-0.13.0-py2.py3-none-any.whl Collecting matplotlib (from -r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/71/07/16d781df15be30df4acfd536c479268f1208b2dfbc91e9ca5d92c9caf673/matplotlib-3.0.2-cp36-cp36m-manylinux1_x86_64.whl Collecting numpy (from -r docs/docs_requirements.txt (line 6)) Using cached https://files.pythonhosted.org/packages/ff/7f/9d804d2348471c67a7d8b5f84f9bc59fd1cefa148986f2b74552f8573555/numpy-1.15.4-cp36-cp36m-manylinux1_x86_64.whl Collecting pandas (from -r docs/docs_requirements.txt (line 7)) Using cached 
https://files.pythonhosted.org/packages/e1/d8/feeb346d41f181e83fba45224ab14a8d8af019b48af742e047f3845d8cff/pandas-0.23.4-cp36-cp36m-manylinux1_x86_64.whl Collecting pytest (from -r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/2d/b8/fc3795707bb47ed9eb83c8d65515f3424977d779d6af333d24787b1f364e/pytest-3.10.0-py2.py3-none-any.whl Collecting psutil (from -r docs/docs_requirements.txt (line 9)) Collecting scipy (from -r docs/docs_requirements.txt (line 10)) Using cached https://files.pythonhosted.org/packages/a8/0b/f163da98d3a01b3e0ef1cab8dd2123c34aee2bafbb1c5bffa354cc8a1730/scipy-1.1.0-cp36-cp36m-manylinux1_x86_64.whl Collecting seaborn==0.8.1 (from -r docs/docs_requirements.txt (line 11)) Collecting sphinx==1.5.6 (from -r docs/docs_requirements.txt (line 12)) Using cached https://files.pythonhosted.org/packages/cd/c3/3fc2985e07f6111b47328be116df9e05d5c2f246a050e2e2ebf6bdc9c692/Sphinx-1.5.6-py2.py3-none-any.whl Collecting sphinx-autobuild==0.7.1 (from -r docs/docs_requirements.txt (line 13)) Collecting sphinx-rtd-theme==0.4.1 (from -r docs/docs_requirements.txt (line 14)) Using cached https://files.pythonhosted.org/packages/87/30/7460f7b77b6e8a080dd3688f750fe5d5666c49358f8941449c5b128fa97d/sphinx_rtd_theme-0.4.1-py2.py3-none-any.whl Collecting tensorflow>=1.8.0 (from -r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/22/cc/ca70b78087015d21c5f3f93694107f34ebccb3be9624385a911d4b52ecef/tensorflow-1.12.0-cp36-cp36m-manylinux1_x86_64.whl Collecting tqdm (from -r docs/docs_requirements.txt (line 16)) Using cached https://files.pythonhosted.org/packages/91/55/8cb23a97301b177e9c8e3226dba45bb454411de2cbd25746763267f226c2/tqdm-4.28.1-py2.py3-none-any.whl Requirement already satisfied: requests>=2.0 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already 
satisfied: six in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting pyglet>=1.2.0 (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Using cached https://files.pythonhosted.org/packages/1c/fc/dad5eaaab68f0c21e2f906a94ddb98175662cc5a654eee404d59554ce0fa/pyglet-1.3.2-py2.py3-none-any.whl Requirement already satisfied: setuptools>=18.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting backcall (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting decorator (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/bc/bb/a24838832ba35baf52f32ab1a49b906b5f82fb7c76b2f6a7e35e140bac30/decorator-4.3.0-py2.py3-none-any.whl Requirement already satisfied: pygments in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting pexpect; sys_platform != "win32" (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/89/e6/b5a1de8b0cc4e07ca1b305a4fcc3f9806025c1b651ea302646341222f88b/pexpect-4.6.0-py2.py3-none-any.whl Collecting traitlets>=4.2 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/93/d6/abcb22de61d78e2fc3959c964628a5771e47e7cc60d53e9342e21ed6cc9a/traitlets-4.3.2-py2.py3-none-any.whl Collecting jedi>=0.10 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/7a/1a/9bd24a185873b998611c2d8d4fb15cd5e8a879ead36355df7ee53e9111bf/jedi-0.13.1-py2.py3-none-any.whl Collecting prompt-toolkit<2.1.0,>=2.0.0 (from ipython->-r docs/docs_requirements.txt (line 3)) 
Using cached https://files.pythonhosted.org/packages/d1/e6/adb3be5576f5d27c6faa33f1e9fea8fe5dbd9351db12148de948507e352c/prompt_toolkit-2.0.7-py3-none-any.whl Collecting pickleshare (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl Collecting cycler>=0.10 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/f7/d2/e07d3ebb2bd7af696440ce7e754c59dd546ffe1bbe732c8ab68b9c834e61/cycler-0.10.0-py2.py3-none-any.whl Collecting python-dateutil>=2.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/74/68/d87d9b36af36f44254a8d512cbfc48369103a3b9e474be9bdfe536abfc45/python_dateutil-2.7.5-py2.py3-none-any.whl Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from matplotlib->-r docs/docs_requirements.txt (line 5)) Collecting kiwisolver>=1.0.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/69/a7/88719d132b18300b4369fbffa741841cfd36d1e637e1990f27929945b538/kiwisolver-1.0.1-cp36-cp36m-manylinux1_x86_64.whl Requirement already satisfied: pytz>=2011k in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from pandas->-r docs/docs_requirements.txt (line 7)) Collecting atomicwrites>=1.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3a/9a/9d878f8d885706e2530402de6417141129a943802c084238914fa6798d97/atomicwrites-1.2.1-py2.py3-none-any.whl Collecting attrs>=17.4.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached 
https://files.pythonhosted.org/packages/3a/e1/5f9023cc983f1a628a8c2fd051ad19e76ff7b142a0faf329336f9a62a514/attrs-18.2.0-py2.py3-none-any.whl Collecting py>=1.5.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3e/c7/3da685ef117d42ac8d71af525208759742dd235f8094221fdaafcd3dba8f/py-1.7.0-py2.py3-none-any.whl Collecting pluggy>=0.7 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/1c/e7/017c262070af41fe251401cb0d0e1b7c38f656da634cd0c15604f1f30864/pluggy-0.8.0-py2.py3-none-any.whl Collecting more-itertools>=4.0.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/79/b1/eace304ef66bd7d3d8b2f78cc374b73ca03bc53664d78151e9df3b3996cc/more_itertools-4.3.0-py3-none-any.whl Requirement already satisfied: Jinja2>=2.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: snowballstemmer>=1.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: alabaster<0.8,>=0.7 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: babel!=2.0,>=1.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: docutils>=0.11 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) 
Requirement already satisfied: imagesize in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Collecting PyYAML>=3.10 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting watchdog>=0.7.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting tornado>=3.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting port-for==0.3.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting pathtools>=0.1.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting argh>=0.24.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Using cached https://files.pythonhosted.org/packages/06/1c/e667a7126f0b84aaa1c56844337bf0ac12445d1beb9c8a6199a7314944bf/argh-0.26.2-py2.py3-none-any.whl Collecting livereload>=2.3.0 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Using cached https://files.pythonhosted.org/packages/dd/b4/213daced3ff1b4e02a1f700748e20e9a7481f5bfef57d11ae9babfd4aa2f/livereload-2.5.2-py2.py3-none-any.whl Collecting protobuf>=3.6.1 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/c2/f9/28787754923612ca9bfdffc588daa05580ed70698add063a5629d1a4209d/protobuf-3.6.1-cp36-cp36m-manylinux1_x86_64.whl Collecting keras-preprocessing>=1.0.5 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/fc/94/74e0fa783d3fc07e41715973435dd051ca89c550881b3454233c39c73e69/Keras_Preprocessing-1.0.5-py2.py3-none-any.whl Collecting gast>=0.2.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting astor>=0.6.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached 
https://files.pythonhosted.org/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl Collecting tensorboard<1.13.0,>=1.12.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/e0/d0/65fe48383146199f16dbd5999ef226b87bce63ad5cd73c840cf722637969/tensorboard-1.12.0-py3-none-any.whl Collecting grpcio>=1.8.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/c3/4c/0a7c55764ac3013ca7a5e9638ee7b161488c0611afc2be465452987a3ccc/grpcio-1.16.0-cp36-cp36m-manylinux1_x86_64.whl Collecting termcolor>=1.1.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: wheel>=0.26 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting absl-py>=0.1.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting keras-applications>=1.0.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/3f/c4/2ff40221029f7098d58f8d7fb99b97e8100f3293f9856f0fb5834bef100b/Keras_Applications-1.0.6-py2.py3-none-any.whl Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: certifi>=2017.4.17 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: chardet<3.1.0,>=3.0.2 in 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: idna<2.8,>=2.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting future (from pyglet>=1.2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting ptyprocess>=0.5 (from pexpect; sys_platform != "win32"->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/d1/29/605c2cc68a9992d18dada28206eeada56ea4bd07a239669da41674648b6f/ptyprocess-0.6.0-py2.py3-none-any.whl Collecting ipython-genutils (from traitlets>=4.2->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl Collecting parso>=0.3.0 (from jedi>=0.10->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/09/51/9c48a46334be50c13d25a3afe55fa05c445699304c5ad32619de953a2305/parso-0.3.1-py2.py3-none-any.whl Collecting wcwidth (from prompt-toolkit<2.1.0,>=2.0.0->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/7e/9f/526a6947247599b084ee5232e4f9190a38f398d7300d866af3ab571a5bfe/wcwidth-0.1.7-py2.py3-none-any.whl Requirement already satisfied: MarkupSafe>=0.23 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/stable/lib/python3.6/site-packages (from Jinja2>=2.3->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Collecting werkzeug>=0.11.10 (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached 
https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl Collecting markdown>=2.6.8 (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/7a/6b/5600647404ba15545ec37d2f7f58844d690baf2f81f3a60b862e48f29287/Markdown-3.0.1-py2.py3-none-any.whl Collecting h5py (from keras-applications>=1.0.6->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/8e/cb/726134109e7bd71d98d1fcc717ffe051767aac42ede0e7326fd1787e5d64/h5py-2.8.0-cp36-cp36m-manylinux1_x86_64.whl Installing collected packages: cloudpickle, numpy, scipy, future, pyglet, gym, backcall, decorator, ptyprocess, pexpect, ipython-genutils, traitlets, parso, jedi, wcwidth, prompt-toolkit, pickleshare, ipython, joblib, cycler, python-dateutil, kiwisolver, matplotlib, pandas, atomicwrites, attrs, py, pluggy, more-itertools, pytest, psutil, seaborn, sphinx, PyYAML, pathtools, argh, watchdog, tornado, port-for, livereload, sphinx-autobuild, sphinx-rtd-theme, protobuf, keras-preprocessing, gast, astor, werkzeug, grpcio, markdown, tensorboard, termcolor, absl-py, h5py, keras-applications, tensorflow, tqdm Found existing installation: Sphinx 1.7.9 Uninstalling Sphinx-1.7.9: Successfully uninstalled Sphinx-1.7.9 Found existing installation: sphinx-rtd-theme 0.4.2 Uninstalling sphinx-rtd-theme-0.4.2: Successfully uninstalled sphinx-rtd-theme-0.4.2 Successfully installed PyYAML-3.13 absl-py-0.6.1 argh-0.26.2 astor-0.7.1 atomicwrites-1.2.1 attrs-18.2.0 backcall-0.1.0 cloudpickle-0.5.2 cycler-0.10.0 decorator-4.3.0 future-0.17.1 gast-0.2.0 grpcio-1.16.0 gym-0.10.9 h5py-2.8.0 ipython-7.1.1 ipython-genutils-0.2.0 jedi-0.13.1 joblib-0.13.0 keras-applications-1.0.6 keras-preprocessing-1.0.5 kiwisolver-1.0.1 livereload-2.5.2 markdown-3.0.1 matplotlib-3.0.2 more-itertools-4.3.0 numpy-1.15.4 
pandas-0.23.4 parso-0.3.1 pathtools-0.1.2 pexpect-4.6.0 pickleshare-0.7.5 pluggy-0.8.0 port-for-0.3.1 prompt-toolkit-2.0.7 protobuf-3.6.1 psutil-5.4.8 ptyprocess-0.6.0 py-1.7.0 pyglet-1.3.2 pytest-3.10.0 python-dateutil-2.7.5 scipy-1.1.0 seaborn-0.8.1 sphinx-1.5.6 sphinx-autobuild-0.7.1 sphinx-rtd-theme-0.4.1 tensorboard-1.12.0 tensorflow-1.12.0 termcolor-1.1.0 tornado-5.1.1 tqdm-4.28.1 traitlets-4.3.2 watchdog-0.9.0 wcwidth-0.1.7 werkzeug-0.14.1
You are using pip version 9.0.3, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[rtd-command-info] start-time: 2018-11-11T14:45:05.000225Z, end-time: 2018-11-11T14:45:05.075526Z, duration: 0, exit-code: 0
cat docs/conf.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Spinning Up documentation build configuration file, created by
# sphinx-quickstart on Wed Aug 15 04:21:07 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys

# Make sure spinup is accessible without going through setup.py
dirname = os.path.dirname
sys.path.insert(0, dirname(dirname(__file__)))

# Mock mpi4py to get around having to install it on RTD server (which fails)
from unittest.mock import MagicMock

class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        return MagicMock()

MOCK_MODULES = ['mpi4py']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)

# Finish imports
import spinup
from recommonmark.parser import CommonMarkParser

source_parsers = {
    '.md': CommonMarkParser,
}


# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.imgmath', 'sphinx.ext.viewcode', 'sphinx.ext.autodoc', 'sphinx.ext.napoleon']

#'sphinx.ext.mathjax', ??

# imgmath settings
imgmath_image_format = 'svg'
imgmath_font_size = 14

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = ['.rst', '.md']
# source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = 'Spinning Up'
copyright = '2018, OpenAI'
author = 'Joshua Achiam'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ''
# The full version, including alpha/beta/rc tags.
release = ''

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'default' #'sphinx'

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = "sphinx_rtd_theme"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

html_logo = 'images/spinning-up-logo2.png'
html_theme_options = {
    'logo_only': True
}
#html_favicon = 'openai-favicon2_32x32.ico'
html_favicon = 'openai_icon.ico'

# -- Options for HTMLHelp output ------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'SpinningUpdoc'


# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

imgmath_latex_preamble = r'''
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{cancel}

\usepackage[verbose=true,letterpaper]{geometry}
\geometry{
    textheight=12in,
    textwidth=6.5in,
    top=1in,
    headheight=12pt,
    headsep=25pt,
    footskip=30pt
    }

\newcommand{\E}{{\mathrm E}}

\newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1 \end{subarray}}{\E}\left[ #2 \right]}

\newcommand{\Epi}[1]{\underset{\begin{subarray}{c}\tau \sim \pi \end{subarray}}{\E}\left[ #1 \right]}
'''

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'SpinningUp.tex', 'Spinning Up Documentation',
     'Joshua Achiam', 'manual'),
]


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'spinningup', 'Spinning Up Documentation',
     [author], 1)
]


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'SpinningUp', 'Spinning Up Documentation',
     author, 'SpinningUp', 'One line description of project.',
     'Miscellaneous'),
]


def setup(app):
    app.add_stylesheet('css/modify.css')


###########################################################################
#          auto-created readthedocs.org specific configuration            #
###########################################################################


#
# The following code was added during an automated build on readthedocs.org
# It is auto created and injected for every build.
The result is based on the # conf.py.tmpl file found in the readthedocs.org codebase: # https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl # import importlib import sys import os.path from six import string_types from sphinx import version_info # Get suffix for proper linking to GitHub # This is deprecated in Sphinx 1.3+, # as each page can have its own suffix if globals().get('source_suffix', False): if isinstance(source_suffix, string_types): SUFFIX = source_suffix else: SUFFIX = source_suffix[0] else: SUFFIX = '.rst' # Add RTD Static Path. Add to the end because it overwrites previous files. if not 'html_static_path' in globals(): html_static_path = [] if os.path.exists('_static'): html_static_path.append('_static') html_static_path.append('/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static') # Add RTD Theme only if they aren't overriding it already using_rtd_theme = ( ( 'html_theme' in globals() and html_theme in ['default'] and # Allow people to bail with a hack of having an html_style 'html_style' not in globals() ) or 'html_theme' not in globals() ) if using_rtd_theme: theme = importlib.import_module('sphinx_rtd_theme') html_theme = 'sphinx_rtd_theme' html_style = None html_theme_options = {} if 'html_theme_path' in globals(): html_theme_path.append(theme.get_html_theme_path()) else: html_theme_path = [theme.get_html_theme_path()] if globals().get('websupport2_base_url', False): websupport2_base_url = 'https://readthedocs.com/websupport' websupport2_static_url = 'https://media.readthedocs.com/' #Add project information to the template context. 
context = { 'using_theme': using_rtd_theme, 'html_theme': html_theme, 'current_version': "stable", 'version_slug': "stable", 'MEDIA_URL': "https://media.readthedocs.com/media/", 'STATIC_URL': "https://media.readthedocs.com/", 'PRODUCTION_DOMAIN': "readthedocs.com", 'versions': [ ("latest", "/en/latest/"), ("stable", "/en/stable/"), ], 'downloads': [ ], 'subprojects': [ ], 'slug': 'openai-education-spinningup', 'name': u'spinningup', 'rtd_language': u'en', 'programming_language': u'words', 'canonical_url': 'https://spinningup.openai.com/en/latest/', 'analytics_code': '', 'single_version': False, 'conf_py_path': '/docs/', 'api_host': 'https://readthedocs.com', 'github_user': 'openai', 'github_repo': 'spinningup', 'github_version': 'c671b47d16ec1e818c9f68f3e27264b0cd507be5', 'display_github': True, 'bitbucket_user': 'None', 'bitbucket_repo': 'None', 'bitbucket_version': 'c671b47d16ec1e818c9f68f3e27264b0cd507be5', 'display_bitbucket': False, 'gitlab_user': 'None', 'gitlab_repo': 'None', 'gitlab_version': 'c671b47d16ec1e818c9f68f3e27264b0cd507be5', 'display_gitlab': False, 'READTHEDOCS': True, 'using_theme': (html_theme == "default"), 'new_theme': (html_theme == "sphinx_rtd_theme"), 'source_suffix': SUFFIX, 'ad_free': False, 'user_analytics_code': '', 'global_analytics_code': 'UA-17997319-2', 'commit': 'c671b47d', } if 'html_context' in globals(): html_context.update(context) else: html_context = context # Add custom RTD extension if 'extensions' in globals(): # Insert at the beginning because it can interfere # with other extensions. # See https://github.com/rtfd/readthedocs.org/pull/4054 extensions.insert(0, "readthedocs_ext.readthedocs") else: extensions = ["readthedocs_ext.readthedocs"] [rtd-command-info] start-time: 2018-11-11T14:45:05.154377Z, end-time: 2018-11-11T14:46:47.484969Z, duration: 102, exit-code: 0 python sphinx-build -T -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html Running Sphinx v1.5.6 making output directory... 
loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [readthedocs]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils looking for now-outdated files... none found pickling environment... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:141: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:282: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done preparing documents... done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... [ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... [ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... [ 61%] spinningup/rl_intro3 writing output... [ 64%] spinningup/rl_intro4 writing output... [ 67%] spinningup/spinningup writing output... [ 70%] user/algorithms writing output... [ 74%] user/installation writing output... [ 77%] user/introduction writing output... 
[ 80%] user/plotting writing output... [ 83%] user/running writing output... [ 87%] user/saving_and_loading writing output... [ 90%] utils/logger writing output... [ 93%] utils/mpi writing output... [ 96%] utils/plotter writing output... [100%] utils/run_utils generating indices... genindex py-modindex highlighting module code... [ 10%] spinup.algos.ddpg.ddpg highlighting module code... [ 20%] spinup.algos.ppo.ppo highlighting module code... [ 30%] spinup.algos.sac.sac highlighting module code... [ 40%] spinup.algos.td3.td3 highlighting module code... [ 50%] spinup.algos.trpo.trpo highlighting module code... [ 60%] spinup.algos.vpg.vpg highlighting module code... [ 70%] spinup.utils.logx highlighting module code... [ 80%] spinup.utils.mpi_tools highlighting module code... [ 90%] spinup.utils.mpi_tf highlighting module code... [100%] spinup.utils.run_utils writing additional pages... search copying images... [ 10%] images/spinning-up-in-rl.png copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg copying images... [ 40%] spinningup/../images/bench/bench_walker.svg copying images... [ 50%] spinningup/../images/bench/bench_swim.svg copying images... [ 60%] spinningup/../images/bench/bench_ant.svg copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg copying static files... WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done dumping search index in English (code: en) ... done dumping object inventory... done build succeeded, 16 warnings. 
[rtd-command-info] start-time: 2018-11-11T14:46:47.657572Z, end-time: 2018-11-11T14:48:05.694988Z, duration: 78, exit-code: 0 python sphinx-build -T -b readthedocssinglehtmllocalmedia -d _build/doctrees-readthedocssinglehtmllocalmedia -D language=en . _build/localmedia Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [readthedocssinglehtmllocalmedia]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... 
[100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:141: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:282: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree done preparing documents... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done assembling single document... user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author writing... done writing additional files... copying images... [ 12%] images/spinning-up-in-rl.png copying images... [ 25%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [ 37%] spinningup/../images/rl_algorithms_9_15.svg copying images... [ 50%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 62%] spinningup/../images/bench/bench_hopper.svg copying images... [ 75%] spinningup/../images/bench/bench_walker.svg copying images... [ 87%] spinningup/../images/bench/bench_swim.svg copying images... [100%] spinningup/../images/bench/bench_ant.svg copying static files... 
WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done dumping object inventory... done build succeeded, 16 warnings. [rtd-command-info] start-time: 2018-11-11T14:48:05.940395Z, end-time: 2018-11-11T14:48:11.526214Z, duration: 5, exit-code: 0 python sphinx-build -b latex -D language=en -d _build/doctrees . _build/latex Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... failed: source directory has changed building [mo]: targets for 0 po files that are out of date building [latex]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... 
[ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:141: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:282: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done processing SpinningUp.tex... index user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author resolving references... writing... done copying images... images/spinning-up-in-rl.png spinningup/../images/rl_diagram_transparent_bg.png spinningup/../images/rl_algorithms_9_15.svg spinningup/../images/bench/bench_halfcheetah.svg spinningup/../images/bench/bench_hopper.svg spinningup/../images/bench/bench_walker.svg spinningup/../images/bench/bench_swim.svg spinningup/../images/bench/bench_ant.svg copying TeX support files... done build succeeded, 14 warnings. 
[rtd-command-info] start-time: 2018-11-11T14:48:11.587124Z, end-time: 2018-11-11T14:48:12.854676Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option.
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx No file SpinningUp.aux. (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}] [2] [1] [2] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) LaTeX Warning: Hyper reference `user/introduction:introduction' on page 3 undefined on input line 70. LaTeX Warning: Hyper reference `user/introduction:what-this-is' on page 3 undefined on input line 73.
LaTeX Warning: Hyper reference `user/introduction:why-we-built-this' on page 3 undefined on input line 76. LaTeX Warning: Hyper reference `user/introduction:how-this-serves-our-mission' on page 3 undefined on input line 79. LaTeX Warning: Hyper reference `user/introduction:code-design-philosophy' on page 3 undefined on input line 82. LaTeX Warning: Hyper reference `user/introduction:support-plan' on page 3 undefined on input line 85. [3] [4] [5] [6] Chapter 2. LaTeX Warning: Hyper reference `user/installation:installation' on page 7 undefined on input line 209. LaTeX Warning: Hyper reference `user/installation:installing-python' on page 7 undefined on input line 212. LaTeX Warning: Hyper reference `user/installation:installing-mujoco-and-openai-gym' on page 7 undefined on input line 215. LaTeX Warning: Hyper reference `user/installation:installing-openmpi' on page 7 undefined on input line 218. LaTeX Warning: Hyper reference `user/installation:ubuntu' on page 7 undefined on input line 221. LaTeX Warning: Hyper reference `user/installation:mac-os-x' on page 7 undefined on input line 224. LaTeX Warning: Hyper reference `user/installation:installing-spinning-up' on page 7 undefined on input line 229. LaTeX Warning: Hyper reference `user/installation:check-your-install' on page 7 undefined on input line 232. [7] [8] [9] [10] Chapter 3. LaTeX Warning: Hyper reference `user/algorithms:algorithms' on page 11 undefined on input line 339. LaTeX Warning: Hyper reference `user/algorithms:what-s-included' on page 11 undefined on input line 342. LaTeX Warning: Hyper reference `user/algorithms:why-these-algorithms' on page 11 undefined on input line 345. LaTeX Warning: Hyper reference `user/algorithms:the-on-policy-algorithms' on page 11 undefined on input line 348. LaTeX Warning: Hyper reference `user/algorithms:the-off-policy-algorithms' on page 11 undefined on input line 351.
LaTeX Warning: Hyper reference `user/algorithms:code-format' on page 11 undefined on input line 356. LaTeX Warning: Hyper reference `user/algorithms:the-algorithm-file' on page 11 undefined on input line 359. LaTeX Warning: Hyper reference `user/algorithms:the-core-file' on page 11 undefined on input line 362. [11] [12] [13] [14] Chapter 4. LaTeX Warning: Hyper reference `user/running:running-experiments' on page 15 undefined on input line 508. LaTeX Warning: Hyper reference `user/running:launching-from-the-command-line' on page 15 undefined on input line 511. LaTeX Warning: Hyper reference `user/running:setting-hyperparameters-from-the-command-line' on page 15 undefined on input line 514. LaTeX Warning: Hyper reference `user/running:launching-multiple-experiments-at-once' on page 15 undefined on input line 517. LaTeX Warning: Hyper reference `user/running:special-flags' on page 15 undefined on input line 520. LaTeX Warning: Hyper reference `user/running:environment-flag' on page 15 undefined on input line 523. LaTeX Warning: Hyper reference `user/running:shortcut-flags' on page 15 undefined on input line 526. LaTeX Warning: Hyper reference `user/running:config-flags' on page 15 undefined on input line 529. LaTeX Warning: Hyper reference `user/running:where-results-are-saved' on page 15 undefined on input line 534. LaTeX Warning: Hyper reference `user/running:how-is-suffix-determined' on page 15 undefined on input line 537. LaTeX Warning: Hyper reference `user/running:extra' on page 15 undefined on input line 542. LaTeX Warning: Hyper reference `user/running:launching-from-scripts' on page 15 undefined on input line 547. LaTeX Warning: Hyper reference `user/running:using-experimentgrid' on page 15 undefined on input line 550. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) Overfull \vbox (11.785pt too high) detected at line 817 [19] [20] Chapter 5.
LaTeX Warning: Hyper reference `user/saving_and_loading:experiment-outputs' on page 21 undefined on input line 873. LaTeX Warning: Hyper reference `user/saving_and_loading:algorithm-outputs' on p age 21 undefined on input line 876. LaTeX Warning: Hyper reference `user/saving_and_loading:save-directory-location ' on page 21 undefined on input line 879. LaTeX Warning: Hyper reference `user/saving_and_loading:loading-and-running-tra ined-policies' on page 21 undefined on input line 882. LaTeX Warning: Hyper reference `user/saving_and_loading:if-environment-saves-su ccessfully' on page 21 undefined on input line 885. LaTeX Warning: Hyper reference `user/saving_and_loading:environment-not-found-e rror' on page 21 undefined on input line 888. LaTeX Warning: Hyper reference `user/saving_and_loading:using-trained-value-fun ctions' on page 21 undefined on input line 891. [21] LaTeX Warning: Hyper reference `user/saving_and_loading:details-below' on page 22 undefined on input line 936. [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. LaTeX Warning: Hyper reference `spinningup/rl_intro:part-1-key-concepts-in-rl' on page 29 undefined on input line 1252. LaTeX Warning: Hyper reference `spinningup/rl_intro:what-can-rl-do' on page 29 undefined on input line 1255. LaTeX Warning: Hyper reference `spinningup/rl_intro:key-concepts-and-terminolog y' on page 29 undefined on input line 1258. LaTeX Warning: Hyper reference `spinningup/rl_intro:optional-formalism' on page 29 undefined on input line 1261. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1504 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1504 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... 
l.1523 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1523 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1530 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1530 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1537 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1537 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1544 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1544 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1558 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1558 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1591 \end{align*} ! Missing } inserted. } l.1591 \end{align*} ! Missing { inserted. { l.1591 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1591 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1591 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1591 \end{align*} ! Missing } inserted. 
} l.1591 \end{align*} ! Missing \endgroup inserted. \endgroup l.1591 \end{align*} ! Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1591 \end{align*} ! Missing { inserted. { l.1591 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1591 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1591 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1598 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1598 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1598 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1598 \end{align*} [36] [37] [38] Chapter 8. LaTeX Warning: Hyper reference `spinningup/rl_intro2:part-2-kinds-of-rl-algorit hms' on page 39 undefined on input line 1651. LaTeX Warning: Hyper reference `spinningup/rl_intro2:a-taxonomy-of-rl-algorithm s' on page 39 undefined on input line 1654. LaTeX Warning: Hyper reference `spinningup/rl_intro2:links-to-algorithms-in-tax onomy' on page 39 undefined on input line 1657. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1672 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1672 ...ncludegraphics{{rl_algorithms_9_15}.svg} LaTeX Warning: Hyper reference `spinningup/rl_intro2:citations-below' on page 3 9 undefined on input line 1673. [39] [40] [41] [42] Chapter 9. 
LaTeX Warning: Hyper reference `spinningup/rl_intro3:part-3-intro-to-policy-opt imization' on page 43 undefined on input line 1814. LaTeX Warning: Hyper reference `spinningup/rl_intro3:deriving-the-simplest-poli cy-gradient' on page 43 undefined on input line 1817. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-the-simplest- policy-gradient' on page 43 undefined on input line 1820. LaTeX Warning: Hyper reference `spinningup/rl_intro3:expected-grad-log-prob-lem ma' on page 43 undefined on input line 1823. LaTeX Warning: Hyper reference `spinningup/rl_intro3:don-t-let-the-past-distrac t-you' on page 43 undefined on input line 1826. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-reward-to-go- policy-gradient' on page 43 undefined on input line 1829. LaTeX Warning: Hyper reference `spinningup/rl_intro3:baselines-in-policy-gradie nts' on page 43 undefined on input line 1832. LaTeX Warning: Hyper reference `spinningup/rl_intro3:other-forms-of-the-policy- gradient' on page 43 undefined on input line 1835. LaTeX Warning: Hyper reference `spinningup/rl_intro3:recap' on page 43 undefine d on input line 1838. ! Undefined control sequence. l.1863 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... 
l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2055 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2055 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2072 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2072 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2080 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2080 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2088 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2088 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! 
Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2140 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2140 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2144 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2144 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2158 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2158 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2172 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2172 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. LaTeX Warning: Hyper reference `spinningup/spinningup:spinning-up-as-a-deep-rl- researcher' on page 53 undefined on input line 2226. LaTeX Warning: Hyper reference `spinningup/spinningup:the-right-background' on page 53 undefined on input line 2229. LaTeX Warning: Hyper reference `spinningup/spinningup:learn-by-doing' on page 5 3 undefined on input line 2232. LaTeX Warning: Hyper reference `spinningup/spinningup:developing-a-research-pro ject' on page 53 undefined on input line 2235. LaTeX Warning: Hyper reference `spinningup/spinningup:doing-rigorous-research-i n-rl' on page 53 undefined on input line 2238. LaTeX Warning: Hyper reference `spinningup/spinningup:closing-thoughts' on page 53 undefined on input line 2241. 
LaTeX Warning: Hyper reference `spinningup/spinningup:ps-other-resources' on pa ge 53 undefined on input line 2244. LaTeX Warning: Hyper reference `spinningup/spinningup:references' on page 53 un defined on input line 2247. [53] [54] [55] [56] [57] [58] Chapter 11. LaTeX Warning: Hyper reference `spinningup/keypapers:key-papers-in-deep-rl' on page 59 undefined on input line 2356. LaTeX Warning: Hyper reference `spinningup/keypapers:model-free-rl' on page 59 undefined on input line 2359. LaTeX Warning: Hyper reference `spinningup/keypapers:exploration' on page 59 un defined on input line 2362. LaTeX Warning: Hyper reference `spinningup/keypapers:transfer-and-multitask-rl' on page 59 undefined on input line 2365. LaTeX Warning: Hyper reference `spinningup/keypapers:hierarchy' on page 59 unde fined on input line 2368. LaTeX Warning: Hyper reference `spinningup/keypapers:memory' on page 59 undefin ed on input line 2371. LaTeX Warning: Hyper reference `spinningup/keypapers:model-based-rl' on page 59 undefined on input line 2374. LaTeX Warning: Hyper reference `spinningup/keypapers:meta-rl' on page 59 undefi ned on input line 2377. LaTeX Warning: Hyper reference `spinningup/keypapers:scaling-rl' on page 59 und efined on input line 2380. LaTeX Warning: Hyper reference `spinningup/keypapers:rl-in-the-real-world' on p age 59 undefined on input line 2383. LaTeX Warning: Hyper reference `spinningup/keypapers:safety' on page 59 undefin ed on input line 2386. LaTeX Warning: Hyper reference `spinningup/keypapers:imitation-learning-and-inv erse-reinforcement-learning' on page 59 undefined on input line 2389. LaTeX Warning: Hyper reference `spinningup/keypapers:bonus-classic-papers-in-rl -theory-or-review' on page 59 undefined on input line 2392. [59] [60] Overfull \vbox (74.45543pt too high) has occurred while \output is active [61] [62] Chapter 12. LaTeX Warning: Hyper reference `spinningup/exercises:exercises' on page 63 unde fined on input line 2478. 
LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-1-basics-of-im plementation' on page 63 undefined on input line 2481. LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-2-algorithm-fa ilure-modes' on page 63 undefined on input line 2484. LaTeX Warning: Hyper reference `spinningup/exercises:challenges' on page 63 und efined on input line 2487. [63] [64] [65] [66] Chapter 13. LaTeX Warning: Hyper reference `spinningup/bench:benchmarks-for-spinning-up-imp lementations' on page 67 undefined on input line 2626. LaTeX Warning: Hyper reference `spinningup/bench:performance-in-each-environmen t' on page 67 undefined on input line 2629. LaTeX Warning: Hyper reference `spinningup/bench:halfcheetah' on page 67 undefi ned on input line 2632. LaTeX Warning: Hyper reference `spinningup/bench:hopper' on page 67 undefined o n input line 2635. LaTeX Warning: Hyper reference `spinningup/bench:walker' on page 67 undefined o n input line 2638. LaTeX Warning: Hyper reference `spinningup/bench:swimmer' on page 67 undefined on input line 2641. LaTeX Warning: Hyper reference `spinningup/bench:ant' on page 67 undefined on i nput line 2644. LaTeX Warning: Hyper reference `spinningup/bench:experiment-details' on page 67 undefined on input line 2649. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2667 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2667 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2676 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.2676 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2685 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2685 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2694 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2694 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2703 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2703 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. LaTeX Warning: Hyper reference `algorithms/vpg:vanilla-policy-gradient' on page 71 undefined on input line 2726. LaTeX Warning: Hyper reference `algorithms/vpg:background' on page 71 undefined on input line 2729. LaTeX Warning: Hyper reference `algorithms/vpg:quick-facts' on page 71 undefine d on input line 2732. LaTeX Warning: Hyper reference `algorithms/vpg:key-equations' on page 71 undefi ned on input line 2735. LaTeX Warning: Hyper reference `algorithms/vpg:exploration-vs-exploitation' on page 71 undefined on input line 2738. LaTeX Warning: Hyper reference `algorithms/vpg:pseudocode' on page 71 undefined on input line 2741. LaTeX Warning: Hyper reference `algorithms/vpg:documentation' on page 71 undefi ned on input line 2746. 
LaTeX Warning: Hyper reference `algorithms/vpg:saved-model-contents' on page 71 undefined on input line 2749. LaTeX Warning: Hyper reference `algorithms/vpg:references' on page 71 undefined on input line 2754. LaTeX Warning: Hyper reference `algorithms/vpg:relevant-papers' on page 71 unde fined on input line 2757. LaTeX Warning: Hyper reference `algorithms/vpg:why-these-papers' on page 71 und efined on input line 2760. LaTeX Warning: Hyper reference `algorithms/vpg:other-public-implementations' on page 71 undefined on input line 2763. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2800 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2800 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2817 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2818 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2820 \begin{algorithmic} [1] ! Undefined control sequence. l.2821 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2822 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2823 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2824 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2825 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2826 \STATE Estimate policy gradient as ! Undefined control sequence. l.2830 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. 
l.2835 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2840 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2841 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2842 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2848--2848 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2848--2848 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. LaTeX Warning: Hyper reference `algorithms/trpo:trust-region-policy-optimizatio n' on page 77 undefined on input line 3048. LaTeX Warning: Hyper reference `algorithms/trpo:background' on page 77 undefine d on input line 3051. LaTeX Warning: Hyper reference `algorithms/trpo:quick-facts' on page 77 undefin ed on input line 3054. LaTeX Warning: Hyper reference `algorithms/trpo:key-equations' on page 77 undef ined on input line 3057. LaTeX Warning: Hyper reference `algorithms/trpo:exploration-vs-exploitation' on page 77 undefined on input line 3060. LaTeX Warning: Hyper reference `algorithms/trpo:pseudocode' on page 77 undefine d on input line 3063. LaTeX Warning: Hyper reference `algorithms/trpo:documentation' on page 77 undef ined on input line 3068. LaTeX Warning: Hyper reference `algorithms/trpo:saved-model-contents' on page 7 7 undefined on input line 3071. LaTeX Warning: Hyper reference `algorithms/trpo:references' on page 77 undefine d on input line 3076. 
LaTeX Warning: Hyper reference `algorithms/trpo:relevant-papers' on page 77 und efined on input line 3079. LaTeX Warning: Hyper reference `algorithms/trpo:why-these-papers' on page 77 un defined on input line 3082. LaTeX Warning: Hyper reference `algorithms/trpo:other-public-implementations' o n page 77 undefined on input line 3085. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3129 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3129 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3135 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3135 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3184 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3185 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3187 \begin{algorithmic} [1] ! Undefined control sequence. l.3188 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3189 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3190 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3191 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3192 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3193 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3194 \STATE Estimate policy gradient as ! 
Undefined control sequence. l.3198 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3203 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3208 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3213 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3214 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3215 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. 
LaTeX Warning: Hyper reference `algorithms/ppo:proximal-policy-optimization' on page 85 undefined on input line 3495. LaTeX Warning: Hyper reference `algorithms/ppo:background' on page 85 undefined on input line 3498. LaTeX Warning: Hyper reference `algorithms/ppo:quick-facts' on page 85 undefine d on input line 3501. LaTeX Warning: Hyper reference `algorithms/ppo:key-equations' on page 85 undefi ned on input line 3504. LaTeX Warning: Hyper reference `algorithms/ppo:exploration-vs-exploitation' on page 85 undefined on input line 3507. LaTeX Warning: Hyper reference `algorithms/ppo:pseudocode' on page 85 undefined on input line 3510. LaTeX Warning: Hyper reference `algorithms/ppo:documentation' on page 85 undefi ned on input line 3515. LaTeX Warning: Hyper reference `algorithms/ppo:saved-model-contents' on page 85 undefined on input line 3518. LaTeX Warning: Hyper reference `algorithms/ppo:references' on page 85 undefined on input line 3523. LaTeX Warning: Hyper reference `algorithms/ppo:relevant-papers' on page 85 unde fined on input line 3526. LaTeX Warning: Hyper reference `algorithms/ppo:why-these-papers' on page 85 und efined on input line 3529. LaTeX Warning: Hyper reference `algorithms/ppo:other-public-implementations' on page 85 undefined on input line 3532. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3641 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3642 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3644 \begin{algorithmic} [1] ! Undefined control sequence. l.3645 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3646 \FOR {$k = 0,1,2,...$} ! 
Undefined control sequence. l.3647 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3648 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3649 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3650 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3658 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3663 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3664 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3665 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3671--3671 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. LaTeX Warning: Hyper reference `algorithms/ddpg:deep-deterministic-policy-gradi ent' on page 91 undefined on input line 3891. LaTeX Warning: Hyper reference `algorithms/ddpg:background' on page 91 undefine d on input line 3894. LaTeX Warning: Hyper reference `algorithms/ddpg:quick-facts' on page 91 undefin ed on input line 3897. LaTeX Warning: Hyper reference `algorithms/ddpg:key-equations' on page 91 undef ined on input line 3900. LaTeX Warning: Hyper reference `algorithms/ddpg:the-q-learning-side-of-ddpg' on page 91 undefined on input line 3903. LaTeX Warning: Hyper reference `algorithms/ddpg:the-policy-learning-side-of-ddp g' on page 91 undefined on input line 3906. LaTeX Warning: Hyper reference `algorithms/ddpg:exploration-vs-exploitation' on page 91 undefined on input line 3911. 
LaTeX Warning: Hyper reference `algorithms/ddpg:pseudocode' on page 91 undefine d on input line 3914. LaTeX Warning: Hyper reference `algorithms/ddpg:documentation' on page 91 undef ined on input line 3919. LaTeX Warning: Hyper reference `algorithms/ddpg:saved-model-contents' on page 9 1 undefined on input line 3922. LaTeX Warning: Hyper reference `algorithms/ddpg:references' on page 91 undefine d on input line 3927. LaTeX Warning: Hyper reference `algorithms/ddpg:relevant-papers' on page 91 und efined on input line 3930. LaTeX Warning: Hyper reference `algorithms/ddpg:why-these-papers' on page 91 un defined on input line 3933. LaTeX Warning: Hyper reference `algorithms/ddpg:other-public-implementations' o n page 91 undefined on input line 3936. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4055 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4056 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4058 \begin{algorithmic} [1] ! Undefined control sequence. l.4059 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4060 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4061 \REPEAT ! Undefined control sequence. l.4062 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4063 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4064 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4065 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. 
l.4066 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4067 \IF {it's time to update} ! Undefined control sequence. l.4068 \FOR {however many updates} ! Undefined control sequence. l.4069 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4070 \STATE Compute targets ! Undefined control sequence. l.4074 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4078 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4082 \STATE Update target networks with ! Undefined control sequence. l.4087 \ENDFOR ! Undefined control sequence. l.4088 \ENDIF ! Undefined control sequence. l.4089 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4090 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4091 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4097--4097 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4097--4097 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. LaTeX Warning: Hyper reference `algorithms/td3:twin-delayed-ddpg' on page 97 un defined on input line 4312. LaTeX Warning: Hyper reference `algorithms/td3:background' on page 97 undefined on input line 4315. LaTeX Warning: Hyper reference `algorithms/td3:quick-facts' on page 97 undefine d on input line 4318. 
LaTeX Warning: Hyper reference `algorithms/td3:key-equations' on page 97 undefi ned on input line 4321. LaTeX Warning: Hyper reference `algorithms/td3:exploration-vs-exploitation' on page 97 undefined on input line 4324. LaTeX Warning: Hyper reference `algorithms/td3:pseudocode' on page 97 undefined on input line 4327. LaTeX Warning: Hyper reference `algorithms/td3:documentation' on page 97 undefi ned on input line 4332. LaTeX Warning: Hyper reference `algorithms/td3:saved-model-contents' on page 97 undefined on input line 4335. LaTeX Warning: Hyper reference `algorithms/td3:references' on page 97 undefined on input line 4340. LaTeX Warning: Hyper reference `algorithms/td3:relevant-papers' on page 97 unde fined on input line 4343. LaTeX Warning: Hyper reference `algorithms/td3:other-public-implementations' on page 97 undefined on input line 4346. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4403 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4403 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4407 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4407 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4431 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4432 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4434 \begin{algorithmic} [1] ! Undefined control sequence. l.4435 \STATE Input: initial policy parameters $\theta$, Q-function para... ! 
Undefined control sequence. l.4436 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4437 \REPEAT ! Undefined control sequence. l.4438 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4439 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4440 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4441 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4442 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4443 \IF {it's time to update} ! Undefined control sequence. l.4444 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4445 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4446 \STATE Compute target actions ! Undefined control sequence. l.4450 \STATE Compute targets ! Undefined control sequence. l.4454 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4458 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4459 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4463 \STATE Update target networks with ! Undefined control sequence. l.4468 \ENDIF ! Undefined control sequence. l.4469 \ENDFOR ! Undefined control sequence. l.4470 \ENDIF ! Undefined control sequence. l.4471 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4472 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4473 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4479--4479 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4479--4479 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. LaTeX Warning: Hyper reference `algorithms/sac:soft-actor-critic' on page 103 u ndefined on input line 4705. LaTeX Warning: Hyper reference `algorithms/sac:background' on page 103 undefine d on input line 4708. LaTeX Warning: Hyper reference `algorithms/sac:quick-facts' on page 103 undefin ed on input line 4711. LaTeX Warning: Hyper reference `algorithms/sac:key-equations' on page 103 undef ined on input line 4714. LaTeX Warning: Hyper reference `algorithms/sac:entropy-regularized-reinforcemen t-learning' on page 103 undefined on input line 4717. LaTeX Warning: Hyper reference `algorithms/sac:id1' on page 103 undefined on in put line 4720. LaTeX Warning: Hyper reference `algorithms/sac:exploration-vs-exploitation' on page 103 undefined on input line 4725. LaTeX Warning: Hyper reference `algorithms/sac:pseudocode' on page 103 undefine d on input line 4728. LaTeX Warning: Hyper reference `algorithms/sac:documentation' on page 103 undef ined on input line 4733. LaTeX Warning: Hyper reference `algorithms/sac:saved-model-contents' on page 10 3 undefined on input line 4736. LaTeX Warning: Hyper reference `algorithms/sac:references' on page 103 undefine d on input line 4741. LaTeX Warning: Hyper reference `algorithms/sac:relevant-papers' on page 103 und efined on input line 4744. 
LaTeX Warning: Hyper reference `algorithms/sac:other-public-implementations' on page 103 undefined on input line 4747. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4794 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4794 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4798 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4798 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4802 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4802 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4806 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4806 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4810 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4810 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. 
...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4852 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4852 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4891 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4892 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4894 \begin{algorithmic} [1] ! Undefined control sequence. l.4895 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. 
l.4896 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4897 \REPEAT ! Undefined control sequence. l.4898 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4899 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4900 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4901 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4902 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4903 \IF {it's time to update} ! Undefined control sequence. l.4904 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4905 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4906 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4911 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4915 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4919 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4924 \STATE Update target value network with ! Undefined control sequence. l.4928 \ENDFOR ! Undefined control sequence. l.4929 \ENDIF ! Undefined control sequence. l.4930 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4931 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4932 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4938--4938 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4938--4938 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. LaTeX Warning: Hyper reference `utils/logger:logger' on page 111 undefined on i nput line 5202. LaTeX Warning: Hyper reference `utils/logger:using-a-logger' on page 111 undefi ned on input line 5205. LaTeX Warning: Hyper reference `utils/logger:examples' on page 111 undefined on input line 5208. LaTeX Warning: Hyper reference `utils/logger:logging-and-mpi' on page 111 undef ined on input line 5211. LaTeX Warning: Hyper reference `utils/logger:logger-classes' on page 111 undefi ned on input line 5216. LaTeX Warning: Hyper reference `utils/logger:loading-saved-graphs' on page 111 undefined on input line 5219. [111] [112] [113] [114] LaTeX Warning: Hyper reference `utils/logger:spinup.utils.logx.Logger' on page 115 undefined on input line 5517. [115] [116] Chapter 21. [117] [118] Chapter 22. LaTeX Warning: Hyper reference `utils/mpi:mpi-tools' on page 119 undefined on i nput line 5662. LaTeX Warning: Hyper reference `utils/mpi:module-spinup.utils.mpi_tools' on pag e 119 undefined on input line 5665. LaTeX Warning: Hyper reference `utils/mpi:mpi-tensorflow-utilities' on page 119 undefined on input line 5668. [119] [120] Chapter 23. LaTeX Warning: Hyper reference `utils/run_utils:run-utils' on page 121 undefine d on input line 5795. 
LaTeX Warning: Hyper reference `utils/run_utils:experimentgrid' on page 121 und efined on input line 5798. LaTeX Warning: Hyper reference `utils/run_utils:calling-experiments' on page 12 1 undefined on input line 5801. [121] Underfull \hbox (badness 10000) in paragraph at lines 5953--5953 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. [129] [130] LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tf' on page 131 und efined on input line 6086. LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tools' on page 131 undefined on input line 6087. [131] No file SpinningUp.ind. (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were undefined references. LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (135 pages, 1116491 bytes). Transcript written on SpinningUp.log. [rtd-command-info] start-time: 2018-11-11T14:48:12.936299Z, end-time: 2018-11-11T14:48:13.070222Z, duration: 0, exit-code: 0 makeindex -s python.ist SpinningUp.idx This is makeindex, version 2.15 [TeX Live 2015] (kpathsea + Thai support). Scanning style file ./python.ist.......done (7 attributes redefined, 0 ignored). Scanning input file SpinningUp.idx....done (78 entries accepted, 0 rejected). Sorting entries....done (506 comparisons). Generating output file SpinningUp.ind....done (144 lines written, 0 warnings). Output written in SpinningUp.ind. Transcript written in SpinningUp.ilg. 
[rtd-command-info] start-time: 2018-11-11T14:48:13.130054Z, end-time: 2018-11-11T14:48:14.312883Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/stable/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx (./SpinningUp.aux LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. ) (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (./SpinningUp.out) (./SpinningUp.out) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] (./SpinningUp.toc [1] [2]) [3] [4] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. 
(/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) [3] [4] [5] [6] Chapter 2. [7] [8] [9] [10] Chapter 3. [11] [12] [13] [14] Chapter 4. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) Overfull \vbox (11.785pt too high) detected at line 817 [19] [20] Chapter 5. [21] [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1504 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1504 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1523 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1523 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1530 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1530 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1537 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1537 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1544 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1544 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1558 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1558 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1591 \end{align*} ! Missing } inserted. } l.1591 \end{align*} ! Missing { inserted. { l.1591 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1591 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1591 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1591 \end{align*} ! Missing } inserted. } l.1591 \end{align*} ! Missing \endgroup inserted. \endgroup l.1591 \end{align*} ! Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1591 \end{align*} ! Missing { inserted. { l.1591 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1591 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1591 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1598 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1598 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1598 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1598 \end{align*} [36] [37] [38] Chapter 8. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.1672 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1672 ...ncludegraphics{{rl_algorithms_9_15}.svg} [39] [40] [41] [42] Chapter 9. ! Undefined control sequence. l.1863 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1894 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1906 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... 
l.2055 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2055 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2072 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2072 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2080 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2080 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2088 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2088 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2140 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2140 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2144 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2144 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2158 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... 
l.2158 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2172 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2172 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. [53] [54] [55] [56] [57] [58] Chapter 11. [59] [60] Overfull \vbox (74.45543pt too high) has occurred while \output is active [61] [62] Chapter 12. [63] [64] [65] [66] Chapter 13. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2667 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2667 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2676 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2676 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2685 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2685 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2694 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.2694 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2703 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2703 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2800 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2800 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2817 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2818 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2820 \begin{algorithmic} [1] ! Undefined control sequence. l.2821 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2822 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2823 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2824 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2825 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2826 \STATE Estimate policy gradient as ! Undefined control sequence. l.2830 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. 
l.2835 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2840 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2841 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2842 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2848--2848 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2848--2848 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3129 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3129 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3135 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3135 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3184 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3185 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.3187 \begin{algorithmic} [1] ! Undefined control sequence. l.3188 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3189 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3190 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3191 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3192 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3193 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3194 \STATE Estimate policy gradient as ! Undefined control sequence. l.3198 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3203 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3208 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3213 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3214 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.3215 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3221--3221 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3641 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3642 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3644 \begin{algorithmic} [1] ! Undefined control sequence. l.3645 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3646 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. 
l.3647 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3648 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3649 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3650 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3658 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3663 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3664 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3665 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3671--3671 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4055 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4056 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4058 \begin{algorithmic} [1] ! Undefined control sequence. l.4059 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4060 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4061 \REPEAT ! Undefined control sequence. 
l.4062 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4063 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4064 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4065 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4066 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4067 \IF {it's time to update} ! Undefined control sequence. l.4068 \FOR {however many updates} ! Undefined control sequence. l.4069 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4070 \STATE Compute targets ! Undefined control sequence. l.4074 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4078 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4082 \STATE Update target networks with ! Undefined control sequence. l.4087 \ENDFOR ! Undefined control sequence. l.4088 \ENDIF ! Undefined control sequence. l.4089 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4090 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4091 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4097--4097 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4097--4097 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4403 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4403 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4407 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4407 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4431 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4432 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4434 \begin{algorithmic} [1] ! Undefined control sequence. l.4435 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4436 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4437 \REPEAT ! Undefined control sequence. l.4438 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. 
l.4439 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4440 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4441 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4442 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4443 \IF {it's time to update} ! Undefined control sequence. l.4444 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4445 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4446 \STATE Compute target actions ! Undefined control sequence. l.4450 \STATE Compute targets ! Undefined control sequence. l.4454 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4458 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4459 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4463 \STATE Update target networks with ! Undefined control sequence. l.4468 \ENDIF ! Undefined control sequence. l.4469 \ENDFOR ! Undefined control sequence. l.4470 \ENDIF ! Undefined control sequence. l.4471 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4472 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4473 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4479--4479 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4479--4479 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4794 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4794 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4798 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4798 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4802 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4802 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4806 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4806 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4810 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4810 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4815 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4838 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. 
...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4846 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4852 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4852 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4869 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4873 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4891 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! 
LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4892 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4894 \begin{algorithmic} [1] ! Undefined control sequence. l.4895 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4896 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4897 \REPEAT ! Undefined control sequence. l.4898 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4899 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4900 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4901 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4902 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4903 \IF {it's time to update} ! Undefined control sequence. l.4904 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4905 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4906 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4911 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4915 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4919 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4924 \STATE Update target value network with ! Undefined control sequence. l.4928 \ENDFOR ! Undefined control sequence. l.4929 \ENDIF ! Undefined control sequence. l.4930 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.4931 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4932 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4938--4938 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4938--4938 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. [111] [112] [113] [114] [115] [116] Chapter 21. [117] [118] Chapter 22. [119] [120] Chapter 23. [121] Underfull \hbox (badness 10000) in paragraph at lines 5953--5953 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. 
[129] [130] [131] (./SpinningUp.ind [132] Underfull \hbox (badness 7522) in paragraph at lines 47--48 []\T1/ptm/m/n/10 add() (spinup.utils.run_utils.ExperimentGrid method), Overfull \hbox (5.61969pt too wide) in paragraph at lines 48--49 []\T1/ptm/m/n/10 apply_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer Overfull \hbox (17.83952pt too wide) in paragraph at lines 74--75 []\T1/ptm/m/n/10 compute_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer [133] Underfull \hbox (badness 10000) in paragraph at lines 103--104 []\T1/ptm/m/n/10 mpi_statistics_scalar() (in mod-ule Underfull \hbox (badness 10000) in paragraph at lines 119--120 []\T1/ptm/m/n/10 run() (spinup.utils.run_utils.ExperimentGrid method), Underfull \hbox (badness 10000) in paragraph at lines 140--141 []\T1/ptm/m/n/10 variant_name() (spinup.utils.run_utils.ExperimentGrid Underfull \hbox (badness 10000) in paragraph at lines 141--142 []\T1/ptm/m/n/10 variants() (spinup.utils.run_utils.ExperimentGrid [134]) (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were multiply-defined labels. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (140 pages, 1144256 bytes). Transcript written on SpinningUp.log. 
[rtd-command-info] start-time: 2018-11-11T14:48:14.387230Z, end-time: 2018-11-11T14:48:14.450179Z, duration: 0, exit-code: 0
mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/_build/latex/SpinningUp.pdf /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/stable/sphinx_pdf/openai-education-spinningup.pdf

[rtd-command-info] start-time: 2018-11-11T14:48:14.510591Z, end-time: 2018-11-11T14:49:45.234397Z, duration: 90, exit-code: 0
python sphinx-build -T -b epub -d _build/doctrees-epub -D language=en . _build/epub
Running Sphinx v1.5.6
making output directory...
loading translations [en]... done
loading pickled environment... not yet created
building [mo]: targets for 0 po files that are out of date
building [epub]: targets for 31 source files that are out of date
updating environment: 31 added, 0 changed, 0 removed
reading sources... [ 3%] algorithms/ddpg
reading sources... [ 6%] algorithms/ppo
reading sources... [ 9%] algorithms/sac
reading sources... [ 12%] algorithms/td3
reading sources... [ 16%] algorithms/trpo
reading sources... [ 19%] algorithms/vpg
reading sources... [ 22%] etc/acknowledgements
reading sources... [ 25%] etc/author
reading sources... [ 29%] index
reading sources... [ 32%] spinningup/bench
reading sources... [ 35%] spinningup/exercise2_1_soln
reading sources... [ 38%] spinningup/exercise2_2_soln
reading sources... [ 41%] spinningup/exercises
reading sources... [ 45%] spinningup/extra_pg_proof1
reading sources... [ 48%] spinningup/extra_pg_proof2
reading sources... [ 51%] spinningup/keypapers
reading sources... [ 54%] spinningup/rl_intro
reading sources... [ 58%] spinningup/rl_intro2
reading sources... [ 61%] spinningup/rl_intro3
reading sources... [ 64%] spinningup/rl_intro4
reading sources... [ 67%] spinningup/spinningup
reading sources... [ 70%] user/algorithms
reading sources... [ 74%] user/installation
reading sources... [ 77%] user/introduction
reading sources... [ 80%] user/plotting
reading sources... [ 83%] user/running
reading sources... [ 87%] user/saving_and_loading
reading sources... [ 90%] utils/logger
reading sources... [ 93%] utils/mpi
reading sources... [ 96%] utils/plotter
reading sources... [100%] utils/run_utils
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line.
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:141: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/running.rst:282: WARNING: Inline strong start-string without end-string.
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
looking for now-outdated files... none found
pickling environment... done
checking consistency...
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree
done
preparing documents... done
writing output... [ 3%] algorithms/ddpg
writing output... [ 6%] algorithms/ppo
writing output... [ 9%] algorithms/sac
writing output... [ 12%] algorithms/td3
writing output... [ 16%] algorithms/trpo
writing output... [ 19%] algorithms/vpg
writing output... [ 22%] etc/acknowledgements
writing output... [ 25%] etc/author
writing output... [ 29%] index
writing output... [ 32%] spinningup/bench
writing output... [ 35%] spinningup/exercise2_1_soln
writing output... [ 38%] spinningup/exercise2_2_soln
writing output... [ 41%] spinningup/exercises
writing output... [ 45%] spinningup/extra_pg_proof1
writing output... [ 48%] spinningup/extra_pg_proof2
writing output... [ 51%] spinningup/keypapers
writing output... [ 54%] spinningup/rl_intro
writing output... [ 58%] spinningup/rl_intro2
writing output... [ 61%] spinningup/rl_intro3
writing output... [ 64%] spinningup/rl_intro4
writing output... [ 67%] spinningup/spinningup
writing output... [ 70%] user/algorithms
writing output... [ 74%] user/installation
writing output... [ 77%] user/introduction
writing output... [ 80%] user/plotting
writing output... [ 83%] user/running
writing output... [ 87%] user/saving_and_loading
writing output... [ 90%] utils/logger
writing output... [ 93%] utils/mpi
writing output... [ 96%] utils/plotter
writing output... [100%] utils/run_utils
generating indices... genindex py-modindex
writing additional pages...
copying images... [ 10%] images/spinning-up-in-rl.png
copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg
copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg
copying images... [ 40%] spinningup/../images/bench/bench_walker.svg
copying images... [ 50%] spinningup/../images/bench/bench_swim.svg
copying images... [ 60%] spinningup/../images/bench/bench_ant.svg
copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png
copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg
copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png
copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg
copying static files...
WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist
WARNING: favicon file 'openai_icon.ico' does not exist
done
copying extra files... done
writing mimetype file...
writing META-INF/container.xml file...
writing content.opf file...
WARNING: unknown mimetype for _static/openai-favicon2_32x32.ico, ignoring
WARNING: unknown mimetype for _static/openai_icon.ico, ignoring
writing nav.xhtml file...
writing toc.ncx file...
writing SpinningUp.epub file...
build succeeded, 18 warnings.
[rtd-command-info] start-time: 2018-11-11T14:49:45.331391Z, end-time: 2018-11-11T14:49:45.391321Z, duration: 0, exit-code: 0 mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/stable/docs/_build/epub/SpinningUp.epub /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/stable/sphinx_epub/openai-education-spinningup.epub