Read the Docs build information
Build id: 158542
Project: openai-education-spinningup
Version: latest
Commit: 981e0d6586db7eb7c2fdd5f0e81c57e703ea1437
Date: 2018-11-12T19:01:27.550035Z
State: finished
Success: True

[rtd-command-info] start-time: 2018-11-13T01:01:28.016756Z, end-time: 2018-11-13T01:01:29.354158Z, duration: 1, exit-code: 0
git clone git@github.com:openai/spinningup.git .
Cloning into '.'...

[rtd-command-info] start-time: 2018-11-13T01:01:29.424035Z, end-time: 2018-11-13T01:01:29.806271Z, duration: 0, exit-code: 0
git checkout --force origin/master
Note: checking out 'origin/master'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 981e0d6 Merge pull request #25 from rcshubhadeep/feature/vscode-gitignore

[rtd-command-info] start-time: 2018-11-13T01:01:29.879098Z, end-time: 2018-11-13T01:01:29.886542Z, duration: 0, exit-code: 0
git clean -d -f -f

[rtd-command-info] start-time: 2018-11-13T01:01:29.965958Z, end-time: 2018-11-13T01:01:29.970723Z, duration: 0, exit-code: 0
git branch -r
  origin/HEAD -> origin/master
  origin/master

[rtd-command-info] start-time: 2018-11-13T01:01:30.697431Z, end-time: 2018-11-13T01:01:34.012527Z, duration: 3, exit-code: 0
python3.6 -mvirtualenv --no-site-packages --no-download
Using base prefix '/home/docs/.pyenv/versions/3.6.2'
New python executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/bin/python3.6
Also creating executable in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/bin/python
Installing setuptools, pip, wheel...done.
[rtd-command-info] start-time: 2018-11-13T01:01:34.075457Z, end-time: 2018-11-13T01:01:43.519513Z, duration: 9, exit-code: 0 python pip install --upgrade --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip Pygments==2.2.0 setuptools<40 docutils==0.13.1 mock==1.0.1 pillow==2.6.1 alabaster>=0.7,<0.8,!=0.7.5 commonmark==0.5.4 recommonmark==0.4.0 sphinx<1.8 sphinx-rtd-theme<0.5 readthedocs-sphinx-ext<0.6 Collecting Pygments==2.2.0 Using cached https://files.pythonhosted.org/packages/02/ee/b6e02dc6529e82b75bb06823ff7d005b141037cb1416b10c6f00fc419dca/Pygments-2.2.0-py2.py3-none-any.whl Collecting setuptools<40 Using cached https://files.pythonhosted.org/packages/7f/e1/820d941153923aac1d49d7fc37e17b6e73bfbd2904959fffbad77900cf92/setuptools-39.2.0-py2.py3-none-any.whl Collecting docutils==0.13.1 Using cached https://files.pythonhosted.org/packages/7c/30/8fb30d820c012a6f701a66618ce065b6d61d08ac0a77e47fc7808dbaee47/docutils-0.13.1-py3-none-any.whl Collecting mock==1.0.1 Collecting pillow==2.6.1 Collecting alabaster!=0.7.5,<0.8,>=0.7 Using cached https://files.pythonhosted.org/packages/10/ad/00b090d23a222943eb0eda509720a404f531a439e803f6538f35136cae9e/alabaster-0.7.12-py2.py3-none-any.whl Collecting commonmark==0.5.4 Collecting recommonmark==0.4.0 Using cached https://files.pythonhosted.org/packages/df/a5/8ee4b84af7f997dfdba71254a88008cfc19c49df98983c9a4919e798f8ce/recommonmark-0.4.0-py2.py3-none-any.whl Collecting sphinx<1.8 Using cached https://files.pythonhosted.org/packages/90/f9/a0babe32c78480994e4f1b93315558f5ed756104054a7029c672a8d77b72/Sphinx-1.7.9-py2.py3-none-any.whl Collecting sphinx-rtd-theme<0.5 Using cached https://files.pythonhosted.org/packages/ef/0c/e4a462190506bc4bff6ca8cf93da07b2d13e540466d2e8a760352d0c69b0/sphinx_rtd_theme-0.4.2-py2.py3-none-any.whl Collecting readthedocs-sphinx-ext<0.6 Using cached 
https://files.pythonhosted.org/packages/2b/c5/126eb75a57918bb3d2f858ddda05f5670d6f07bfa356bc8870e2885f6aac/readthedocs_sphinx_ext-0.5.15-py2.py3-none-any.whl Collecting snowballstemmer>=1.1 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/d4/6c/8a935e2c7b54a37714656d753e4187ee0631988184ed50c0cf6476858566/snowballstemmer-1.2.1-py2.py3-none-any.whl Collecting sphinxcontrib-websupport (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/52/69/3c2fbdc3702358c5b34ee25e387b24838597ef099761fc9a42c166796e8f/sphinxcontrib_websupport-1.1.0-py2.py3-none-any.whl Collecting babel!=2.0,>=1.3 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/b8/ad/c6f60602d3ee3d92fbed87675b6fb6a6f9a38c223343ababdb44ba201f10/Babel-2.6.0-py2.py3-none-any.whl Collecting imagesize (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/fc/b6/aef66b4c52a6ad6ac18cf6ebc5731ed06d8c9ae4d3b2d9951f261150be67/imagesize-1.1.0-py2.py3-none-any.whl Collecting six>=1.5 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl Collecting requests>=2.0.0 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/ff/17/5cbb026005115301a8fb2f9b0e3e8d32313142fe8b617070e7baad20554f/requests-2.20.1-py2.py3-none-any.whl Collecting packaging (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/89/d1/92e6df2e503a69df9faab187c684585f0136662c12bb1f36901d426f3fab/packaging-18.0-py2.py3-none-any.whl Collecting Jinja2>=2.3 (from sphinx<1.8) Using cached https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl Collecting pytz>=0a (from babel!=2.0,>=1.3->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/f8/0e/2365ddc010afb3d79147f1dd544e5ee24bf4ece58ab99b16fbb465ce6dc0/pytz-2018.7-py2.py3-none-any.whl Collecting 
idna<2.8,>=2.5 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl Collecting certifi>=2017.4.17 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/56/9d/1d02dd80bc4cd955f98980f28c5ee2200e1209292d5f9e9cc8d030d18655/certifi-2018.10.15-py2.py3-none-any.whl Collecting urllib3<1.25,>=1.21.1 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl Collecting chardet<3.1.0,>=3.0.2 (from requests>=2.0.0->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl Collecting pyparsing>=2.0.2 (from packaging->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/71/e8/6777f6624681c8b9701a8a0a5654f3eb56919a01a78e12bf3c73f5a3c714/pyparsing-2.3.0-py2.py3-none-any.whl Collecting MarkupSafe>=0.23 (from Jinja2>=2.3->sphinx<1.8) Using cached https://files.pythonhosted.org/packages/08/04/f2191b50fb7f0712f03f064b71d8b4605190f2178ba02e975a87f7b89a0d/MarkupSafe-1.1.0-cp36-cp36m-manylinux1_x86_64.whl Installing collected packages: Pygments, setuptools, docutils, mock, pillow, alabaster, commonmark, recommonmark, snowballstemmer, sphinxcontrib-websupport, pytz, babel, imagesize, six, idna, certifi, urllib3, chardet, requests, pyparsing, packaging, MarkupSafe, Jinja2, sphinx, sphinx-rtd-theme, readthedocs-sphinx-ext Found existing installation: setuptools 39.0.1 Uninstalling setuptools-39.0.1: Successfully uninstalled setuptools-39.0.1 Successfully installed Jinja2-2.10 MarkupSafe-1.1.0 Pygments-2.2.0 alabaster-0.7.12 babel-2.6.0 certifi-2018.10.15 chardet-3.0.4 commonmark-0.5.4 docutils-0.13.1 idna-2.7 imagesize-1.1.0 mock-1.0.1 packaging-18.0 pillow-2.6.1 pyparsing-2.3.0 
pytz-2018.7 readthedocs-sphinx-ext-0.5.15 recommonmark-0.4.0 requests-2.20.1 setuptools-39.2.0 six-1.11.0 snowballstemmer-1.2.1 sphinx-1.7.9 sphinx-rtd-theme-0.4.2 sphinxcontrib-websupport-1.1.0 urllib3-1.24.1 You are using pip version 9.0.3, however version 18.1 is available. You should consider upgrading via the 'pip install --upgrade pip' command. [rtd-command-info] start-time: 2018-11-13T01:01:43.580856Z, end-time: 2018-11-13T01:02:30.286291Z, duration: 46, exit-code: 0 python pip install --exists-action=w --cache-dir /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/.cache/pip -r docs/docs_requirements.txt Collecting cloudpickle==0.5.2 (from -r docs/docs_requirements.txt (line 1)) Using cached https://files.pythonhosted.org/packages/aa/18/514b557c4d8d4ada1f0454ad06c845454ad438fd5c5e0039ba51d6b032fe/cloudpickle-0.5.2-py2.py3-none-any.whl Collecting gym>=0.10.8 (from -r docs/docs_requirements.txt (line 2)) Collecting ipython (from -r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/1b/e2/ffb8c1b574f972cf4183b0aac8f16b57f1e3bbe876b31555b107ea3fd009/ipython-7.1.1-py3-none-any.whl Collecting joblib (from -r docs/docs_requirements.txt (line 4)) Using cached https://files.pythonhosted.org/packages/0d/1b/995167f6c66848d4eb7eabc386aebe07a1571b397629b2eac3b7bebdc343/joblib-0.13.0-py2.py3-none-any.whl Collecting matplotlib (from -r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/71/07/16d781df15be30df4acfd536c479268f1208b2dfbc91e9ca5d92c9caf673/matplotlib-3.0.2-cp36-cp36m-manylinux1_x86_64.whl Collecting numpy (from -r docs/docs_requirements.txt (line 6)) Using cached https://files.pythonhosted.org/packages/ff/7f/9d804d2348471c67a7d8b5f84f9bc59fd1cefa148986f2b74552f8573555/numpy-1.15.4-cp36-cp36m-manylinux1_x86_64.whl Collecting pandas (from -r docs/docs_requirements.txt (line 7)) Using cached 
https://files.pythonhosted.org/packages/e1/d8/feeb346d41f181e83fba45224ab14a8d8af019b48af742e047f3845d8cff/pandas-0.23.4-cp36-cp36m-manylinux1_x86_64.whl Collecting pytest (from -r docs/docs_requirements.txt (line 8)) Downloading https://files.pythonhosted.org/packages/57/94/305477fb977546970a3464c21b63c6800df6705384af2978b89acccfb151/pytest-3.10.1-py2.py3-none-any.whl (216kB) Collecting psutil (from -r docs/docs_requirements.txt (line 9)) Collecting scipy (from -r docs/docs_requirements.txt (line 10)) Using cached https://files.pythonhosted.org/packages/a8/0b/f163da98d3a01b3e0ef1cab8dd2123c34aee2bafbb1c5bffa354cc8a1730/scipy-1.1.0-cp36-cp36m-manylinux1_x86_64.whl Collecting seaborn==0.8.1 (from -r docs/docs_requirements.txt (line 11)) Collecting sphinx==1.5.6 (from -r docs/docs_requirements.txt (line 12)) Using cached https://files.pythonhosted.org/packages/cd/c3/3fc2985e07f6111b47328be116df9e05d5c2f246a050e2e2ebf6bdc9c692/Sphinx-1.5.6-py2.py3-none-any.whl Collecting sphinx-autobuild==0.7.1 (from -r docs/docs_requirements.txt (line 13)) Collecting sphinx-rtd-theme==0.4.1 (from -r docs/docs_requirements.txt (line 14)) Using cached https://files.pythonhosted.org/packages/87/30/7460f7b77b6e8a080dd3688f750fe5d5666c49358f8941449c5b128fa97d/sphinx_rtd_theme-0.4.1-py2.py3-none-any.whl Collecting tensorflow>=1.8.0 (from -r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/22/cc/ca70b78087015d21c5f3f93694107f34ebccb3be9624385a911d4b52ecef/tensorflow-1.12.0-cp36-cp36m-manylinux1_x86_64.whl Collecting tqdm (from -r docs/docs_requirements.txt (line 16)) Using cached https://files.pythonhosted.org/packages/91/55/8cb23a97301b177e9c8e3226dba45bb454411de2cbd25746763267f226c2/tqdm-4.28.1-py2.py3-none-any.whl Requirement already satisfied: six in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting 
pyglet>=1.2.0 (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Using cached https://files.pythonhosted.org/packages/1c/fc/dad5eaaab68f0c21e2f906a94ddb98175662cc5a654eee404d59554ce0fa/pyglet-1.3.2-py2.py3-none-any.whl Requirement already satisfied: requests>=2.0 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting prompt-toolkit<2.1.0,>=2.0.0 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/d1/e6/adb3be5576f5d27c6faa33f1e9fea8fe5dbd9351db12148de948507e352c/prompt_toolkit-2.0.7-py3-none-any.whl Requirement already satisfied: pygments in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting decorator (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/bc/bb/a24838832ba35baf52f32ab1a49b906b5f82fb7c76b2f6a7e35e140bac30/decorator-4.3.0-py2.py3-none-any.whl Collecting backcall (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting traitlets>=4.2 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/93/d6/abcb22de61d78e2fc3959c964628a5771e47e7cc60d53e9342e21ed6cc9a/traitlets-4.3.2-py2.py3-none-any.whl Collecting pexpect; sys_platform != "win32" (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/89/e6/b5a1de8b0cc4e07ca1b305a4fcc3f9806025c1b651ea302646341222f88b/pexpect-4.6.0-py2.py3-none-any.whl Collecting jedi>=0.10 (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/7a/1a/9bd24a185873b998611c2d8d4fb15cd5e8a879ead36355df7ee53e9111bf/jedi-0.13.1-py2.py3-none-any.whl Requirement already satisfied: setuptools>=18.5 in 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from ipython->-r docs/docs_requirements.txt (line 3)) Collecting pickleshare (from ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/9a/41/220f49aaea88bc6fa6cba8d05ecf24676326156c23b991e80b3f2fc24c77/pickleshare-0.7.5-py2.py3-none-any.whl Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from matplotlib->-r docs/docs_requirements.txt (line 5)) Collecting python-dateutil>=2.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/74/68/d87d9b36af36f44254a8d512cbfc48369103a3b9e474be9bdfe536abfc45/python_dateutil-2.7.5-py2.py3-none-any.whl Collecting cycler>=0.10 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/f7/d2/e07d3ebb2bd7af696440ce7e754c59dd546ffe1bbe732c8ab68b9c834e61/cycler-0.10.0-py2.py3-none-any.whl Collecting kiwisolver>=1.0.1 (from matplotlib->-r docs/docs_requirements.txt (line 5)) Using cached https://files.pythonhosted.org/packages/69/a7/88719d132b18300b4369fbffa741841cfd36d1e637e1990f27929945b538/kiwisolver-1.0.1-cp36-cp36m-manylinux1_x86_64.whl Requirement already satisfied: pytz>=2011k in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from pandas->-r docs/docs_requirements.txt (line 7)) Collecting attrs>=17.4.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3a/e1/5f9023cc983f1a628a8c2fd051ad19e76ff7b142a0faf329336f9a62a514/attrs-18.2.0-py2.py3-none-any.whl Collecting pluggy>=0.7 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached 
https://files.pythonhosted.org/packages/1c/e7/017c262070af41fe251401cb0d0e1b7c38f656da634cd0c15604f1f30864/pluggy-0.8.0-py2.py3-none-any.whl Collecting more-itertools>=4.0.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/79/b1/eace304ef66bd7d3d8b2f78cc374b73ca03bc53664d78151e9df3b3996cc/more_itertools-4.3.0-py3-none-any.whl Collecting py>=1.5.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3e/c7/3da685ef117d42ac8d71af525208759742dd235f8094221fdaafcd3dba8f/py-1.7.0-py2.py3-none-any.whl Collecting atomicwrites>=1.0 (from pytest->-r docs/docs_requirements.txt (line 8)) Using cached https://files.pythonhosted.org/packages/3a/9a/9d878f8d885706e2530402de6417141129a943802c084238914fa6798d97/atomicwrites-1.2.1-py2.py3-none-any.whl Requirement already satisfied: imagesize in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: Jinja2>=2.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: alabaster<0.8,>=0.7 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: docutils>=0.11 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: babel!=2.0,>=1.3 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) 
Requirement already satisfied: snowballstemmer>=1.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Collecting port-for==0.3.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting pathtools>=0.1.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting watchdog>=0.7.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting livereload>=2.3.0 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Using cached https://files.pythonhosted.org/packages/dd/b4/213daced3ff1b4e02a1f700748e20e9a7481f5bfef57d11ae9babfd4aa2f/livereload-2.5.2-py2.py3-none-any.whl Collecting tornado>=3.2 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting argh>=0.24.1 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Using cached https://files.pythonhosted.org/packages/06/1c/e667a7126f0b84aaa1c56844337bf0ac12445d1beb9c8a6199a7314944bf/argh-0.26.2-py2.py3-none-any.whl Collecting PyYAML>=3.10 (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Collecting termcolor>=1.1.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting gast>=0.2.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: wheel>=0.26 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting keras-preprocessing>=1.0.5 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/fc/94/74e0fa783d3fc07e41715973435dd051ca89c550881b3454233c39c73e69/Keras_Preprocessing-1.0.5-py2.py3-none-any.whl Collecting keras-applications>=1.0.6 (from tensorflow>=1.8.0->-r 
docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/3f/c4/2ff40221029f7098d58f8d7fb99b97e8100f3293f9856f0fb5834bef100b/Keras_Applications-1.0.6-py2.py3-none-any.whl Collecting absl-py>=0.1.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Collecting tensorboard<1.13.0,>=1.12.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/e0/d0/65fe48383146199f16dbd5999ef226b87bce63ad5cd73c840cf722637969/tensorboard-1.12.0-py3-none-any.whl Collecting astor>=0.6.0 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl Collecting protobuf>=3.6.1 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/c2/f9/28787754923612ca9bfdffc588daa05580ed70698add063a5629d1a4209d/protobuf-3.6.1-cp36-cp36m-manylinux1_x86_64.whl Collecting grpcio>=1.8.6 (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/c3/4c/0a7c55764ac3013ca7a5e9638ee7b161488c0611afc2be465452987a3ccc/grpcio-1.16.0-cp36-cp36m-manylinux1_x86_64.whl Collecting future (from pyglet>=1.2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: certifi>=2017.4.17 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: chardet<3.1.0,>=3.0.2 in 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: idna<2.8,>=2.5 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Collecting wcwidth (from prompt-toolkit<2.1.0,>=2.0.0->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/7e/9f/526a6947247599b084ee5232e4f9190a38f398d7300d866af3ab571a5bfe/wcwidth-0.1.7-py2.py3-none-any.whl Collecting ipython-genutils (from traitlets>=4.2->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl Collecting ptyprocess>=0.5 (from pexpect; sys_platform != "win32"->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/d1/29/605c2cc68a9992d18dada28206eeada56ea4bd07a239669da41674648b6f/ptyprocess-0.6.0-py2.py3-none-any.whl Collecting parso>=0.3.0 (from jedi>=0.10->ipython->-r docs/docs_requirements.txt (line 3)) Using cached https://files.pythonhosted.org/packages/09/51/9c48a46334be50c13d25a3afe55fa05c445699304c5ad32619de953a2305/parso-0.3.1-py2.py3-none-any.whl Requirement already satisfied: MarkupSafe>=0.23 in /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/envs/latest/lib/python3.6/site-packages (from Jinja2>=2.3->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Collecting h5py (from keras-applications>=1.0.6->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/8e/cb/726134109e7bd71d98d1fcc717ffe051767aac42ede0e7326fd1787e5d64/h5py-2.8.0-cp36-cp36m-manylinux1_x86_64.whl Collecting 
markdown>=2.6.8 (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/7a/6b/5600647404ba15545ec37d2f7f58844d690baf2f81f3a60b862e48f29287/Markdown-3.0.1-py2.py3-none-any.whl Collecting werkzeug>=0.11.10 (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Using cached https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl Installing collected packages: cloudpickle, numpy, scipy, future, pyglet, gym, wcwidth, prompt-toolkit, decorator, backcall, ipython-genutils, traitlets, ptyprocess, pexpect, parso, jedi, pickleshare, ipython, joblib, python-dateutil, cycler, kiwisolver, matplotlib, pandas, attrs, pluggy, more-itertools, py, atomicwrites, pytest, psutil, seaborn, sphinx, port-for, pathtools, argh, PyYAML, watchdog, tornado, livereload, sphinx-autobuild, sphinx-rtd-theme, termcolor, gast, keras-preprocessing, h5py, keras-applications, absl-py, grpcio, protobuf, markdown, werkzeug, tensorboard, astor, tensorflow, tqdm Found existing installation: Sphinx 1.7.9 Uninstalling Sphinx-1.7.9: Successfully uninstalled Sphinx-1.7.9 Found existing installation: sphinx-rtd-theme 0.4.2 Uninstalling sphinx-rtd-theme-0.4.2: Successfully uninstalled sphinx-rtd-theme-0.4.2 Successfully installed PyYAML-3.13 absl-py-0.6.1 argh-0.26.2 astor-0.7.1 atomicwrites-1.2.1 attrs-18.2.0 backcall-0.1.0 cloudpickle-0.5.2 cycler-0.10.0 decorator-4.3.0 future-0.17.1 gast-0.2.0 grpcio-1.16.0 gym-0.10.9 h5py-2.8.0 ipython-7.1.1 ipython-genutils-0.2.0 jedi-0.13.1 joblib-0.13.0 keras-applications-1.0.6 keras-preprocessing-1.0.5 kiwisolver-1.0.1 livereload-2.5.2 markdown-3.0.1 matplotlib-3.0.2 more-itertools-4.3.0 numpy-1.15.4 pandas-0.23.4 parso-0.3.1 pathtools-0.1.2 pexpect-4.6.0 pickleshare-0.7.5 pluggy-0.8.0 port-for-0.3.1 prompt-toolkit-2.0.7 protobuf-3.6.1 psutil-5.4.8 
ptyprocess-0.6.0 py-1.7.0 pyglet-1.3.2 pytest-3.10.1 python-dateutil-2.7.5 scipy-1.1.0 seaborn-0.8.1 sphinx-1.5.6 sphinx-autobuild-0.7.1 sphinx-rtd-theme-0.4.1 tensorboard-1.12.0 tensorflow-1.12.0 termcolor-1.1.0 tornado-5.1.1 tqdm-4.28.1 traitlets-4.3.2 watchdog-0.9.0 wcwidth-0.1.7 werkzeug-0.14.1
You are using pip version 9.0.3, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

[rtd-command-info] start-time: 2018-11-13T01:02:30.743438Z, end-time: 2018-11-13T01:02:30.811228Z, duration: 0, exit-code: 0
cat docs/conf.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Spinning Up documentation build configuration file, created by
# sphinx-quickstart on Wed Aug 15 04:21:07 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys

# Make sure spinup is accessible without going through setup.py
dirname = os.path.dirname
sys.path.insert(0, dirname(dirname(__file__)))

# Mock mpi4py to get around having to install it on RTD server (which fails)
from unittest.mock import MagicMock

class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        return MagicMock()

MOCK_MODULES = ['mpi4py']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)

# Finish imports
import spinup
from recommonmark.parser import CommonMarkParser

source_parsers = {
    '.md': CommonMarkParser,
}


# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.imgmath',
              'sphinx.ext.viewcode',
              'sphinx.ext.autodoc',
              'sphinx.ext.napoleon']

#'sphinx.ext.mathjax', ??

# imgmath settings
imgmath_image_format = 'svg'
imgmath_font_size = 14

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = ['.rst', '.md']
# source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = 'Spinning Up'
copyright = '2018, OpenAI'
author = 'Joshua Achiam'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ''
# The full version, including alpha/beta/rc tags.
release = ''

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'default' #'sphinx'

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = "sphinx_rtd_theme"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

html_logo = 'images/spinning-up-logo2.png'
html_theme_options = {
    'logo_only': True
}
#html_favicon = 'openai-favicon2_32x32.ico'
html_favicon = 'openai_icon.ico'

# -- Options for HTMLHelp output ------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'SpinningUpdoc'


# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

imgmath_latex_preamble = r'''
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{cancel}

\usepackage[verbose=true,letterpaper]{geometry}
\geometry{
    textheight=12in,
    textwidth=6.5in,
    top=1in,
    headheight=12pt,
    headsep=25pt,
    footskip=30pt
    }

\newcommand{\E}{{\mathrm E}}

\newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1 \end{subarray}}{\E}\left[ #2 \right]}

\newcommand{\Epi}[1]{\underset{\begin{subarray}{c}\tau \sim \pi \end{subarray}}{\E}\left[ #1 \right]}
'''

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'SpinningUp.tex', 'Spinning Up Documentation',
     'Joshua Achiam', 'manual'),
]


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'spinningup', 'Spinning Up Documentation',
     [author], 1)
]


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'SpinningUp', 'Spinning Up Documentation',
     author, 'SpinningUp', 'One line description of project.',
     'Miscellaneous'),
]


def setup(app):
    app.add_stylesheet('css/modify.css')


###########################################################################
# auto-created readthedocs.org specific configuration #
###########################################################################


#
# The following code was added during an automated build on readthedocs.org
# It is auto created and injected for every build. The result is based on the
# conf.py.tmpl file found in the readthedocs.org codebase:
# https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl
#


import importlib
import sys
import os.path
from six import string_types

from sphinx import version_info

# Get suffix for proper linking to GitHub
# This is deprecated in Sphinx 1.3+,
# as each page can have its own suffix
if globals().get('source_suffix', False):
    if isinstance(source_suffix, string_types):
        SUFFIX = source_suffix
    else:
        SUFFIX = source_suffix[0]
else:
    SUFFIX = '.rst'

# Add RTD Static Path. Add to the end because it overwrites previous files.
if not 'html_static_path' in globals(): html_static_path = [] if os.path.exists('_static'): html_static_path.append('_static') html_static_path.append('/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static') # Add RTD Theme only if they aren't overriding it already using_rtd_theme = ( ( 'html_theme' in globals() and html_theme in ['default'] and # Allow people to bail with a hack of having an html_style 'html_style' not in globals() ) or 'html_theme' not in globals() ) if using_rtd_theme: theme = importlib.import_module('sphinx_rtd_theme') html_theme = 'sphinx_rtd_theme' html_style = None html_theme_options = {} if 'html_theme_path' in globals(): html_theme_path.append(theme.get_html_theme_path()) else: html_theme_path = [theme.get_html_theme_path()] if globals().get('websupport2_base_url', False): websupport2_base_url = 'https://readthedocs.com/websupport' websupport2_static_url = 'https://media.readthedocs.com/' #Add project information to the template context. context = { 'using_theme': using_rtd_theme, 'html_theme': html_theme, 'current_version': "latest", 'version_slug': "latest", 'MEDIA_URL': "https://media.readthedocs.com/media/", 'STATIC_URL': "https://media.readthedocs.com/", 'PRODUCTION_DOMAIN': "readthedocs.com", 'versions': [ ("latest", "/en/latest/"), ], 'downloads': [ ("pdf", "//readthedocs.com/projects/openai-education-spinningup/downloads/pdf/latest/"), ("htmlzip", "//readthedocs.com/projects/openai-education-spinningup/downloads/htmlzip/latest/"), ("epub", "//readthedocs.com/projects/openai-education-spinningup/downloads/epub/latest/"), ], 'subprojects': [ ], 'slug': 'openai-education-spinningup', 'name': u'spinningup', 'rtd_language': u'en', 'programming_language': u'words', 'canonical_url': 'https://spinningup.openai.com/en/latest/', 'analytics_code': '', 'single_version': False, 'conf_py_path': '/docs/', 'api_host': 'https://readthedocs.com', 'github_user': 'openai', 'github_repo': 'spinningup', 'github_version': 'master', 
'display_github': True, 'bitbucket_user': 'None', 'bitbucket_repo': 'None', 'bitbucket_version': 'master', 'display_bitbucket': False, 'gitlab_user': 'None', 'gitlab_repo': 'None', 'gitlab_version': 'master', 'display_gitlab': False, 'READTHEDOCS': True, 'using_theme': (html_theme == "default"), 'new_theme': (html_theme == "sphinx_rtd_theme"), 'source_suffix': SUFFIX, 'ad_free': False, 'user_analytics_code': '', 'global_analytics_code': 'UA-17997319-2', 'commit': '981e0d65', } if 'html_context' in globals(): html_context.update(context) else: html_context = context # Add custom RTD extension if 'extensions' in globals(): # Insert at the beginning because it can interfere # with other extensions. # See https://github.com/rtfd/readthedocs.org/pull/4054 extensions.insert(0, "readthedocs_ext.readthedocs") else: extensions = ["readthedocs_ext.readthedocs"] [rtd-command-info] start-time: 2018-11-13T01:02:30.884484Z, end-time: 2018-11-13T01:04:08.042526Z, duration: 97, exit-code: 0 python sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html Running Sphinx v1.5.6 making output directory... loading translations [en]... done building [mo]: targets for 0 po files that are out of date building [readthedocs]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... 
[ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree done preparing documents... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... 
[ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... [ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... [ 61%] spinningup/rl_intro3 writing output... [ 64%] spinningup/rl_intro4 writing output... [ 67%] spinningup/spinningup writing output... [ 70%] user/algorithms writing output... [ 74%] user/installation writing output... [ 77%] user/introduction writing output... [ 80%] user/plotting writing output... [ 83%] user/running writing output... [ 87%] user/saving_and_loading writing output... [ 90%] utils/logger writing output... [ 93%] utils/mpi writing output... [ 96%] utils/plotter writing output... [100%] utils/run_utils generating indices... genindex py-modindex highlighting module code... [ 10%] spinup.algos.ddpg.ddpg highlighting module code... [ 20%] spinup.algos.ppo.ppo highlighting module code... [ 30%] spinup.algos.sac.sac highlighting module code... [ 40%] spinup.algos.td3.td3 highlighting module code... [ 50%] spinup.algos.trpo.trpo highlighting module code... [ 60%] spinup.algos.vpg.vpg highlighting module code... [ 70%] spinup.utils.logx highlighting module code... [ 80%] spinup.utils.mpi_tools highlighting module code... [ 90%] spinup.utils.mpi_tf highlighting module code... [100%] spinup.utils.run_utils writing additional pages... search copying images... [ 10%] images/spinning-up-in-rl.png copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg copying images... [ 40%] spinningup/../images/bench/bench_walker.svg copying images... [ 50%] spinningup/../images/bench/bench_swim.svg copying images... 
[ 60%] spinningup/../images/bench/bench_ant.svg copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg copying static files... WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done dumping search index in English (code: en) ... done dumping object inventory... done build succeeded, 16 warnings. [rtd-command-info] start-time: 2018-11-13T01:04:08.223032Z, end-time: 2018-11-13T01:05:25.993095Z, duration: 77, exit-code: 0 python sphinx-build -T -b readthedocssinglehtmllocalmedia -d _build/doctrees-readthedocssinglehtmllocalmedia -D language=en . _build/localmedia Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [readthedocssinglehtmllocalmedia]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... 
[ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree done /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree preparing documents... done assembling single document... user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author writing... done writing additional files... copying images... [ 12%] images/spinning-up-in-rl.png copying images... 
[ 25%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [ 37%] spinningup/../images/rl_algorithms_9_15.svg copying images... [ 50%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 62%] spinningup/../images/bench/bench_hopper.svg copying images... [ 75%] spinningup/../images/bench/bench_walker.svg copying images... [ 87%] spinningup/../images/bench/bench_swim.svg copying images... [100%] spinningup/../images/bench/bench_ant.svg copying static files... WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist done WARNING: favicon file 'openai_icon.ico' does not exist copying extra files... done dumping object inventory... done build succeeded, 16 warnings. [rtd-command-info] start-time: 2018-11-13T01:05:26.158417Z, end-time: 2018-11-13T01:05:31.758231Z, duration: 5, exit-code: 0 python sphinx-build -b latex -D language=en -d _build/doctrees . _build/latex Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... failed: source directory has changed building [mo]: targets for 0 po files that are out of date building [latex]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... 
[ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line. /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string. 
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree done processing SpinningUp.tex... index user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author resolving references... writing... done copying images... 
images/spinning-up-in-rl.png spinningup/../images/rl_diagram_transparent_bg.png spinningup/../images/rl_algorithms_9_15.svg spinningup/../images/bench/bench_halfcheetah.svg spinningup/../images/bench/bench_hopper.svg spinningup/../images/bench/bench_walker.svg spinningup/../images/bench/bench_swim.svg spinningup/../images/bench/bench_ant.svg copying TeX support files... done build succeeded, 14 warnings. [rtd-command-info] start-time: 2018-11-13T01:05:31.828494Z, end-time: 2018-11-13T01:05:33.183568Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx No file SpinningUp.aux. (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] [1] [2] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) LaTeX Warning: Hyper reference `user/introduction:introduction' on page 3 undef ined on input line 70. LaTeX Warning: Hyper reference `user/introduction:what-this-is' on page 3 undef ined on input line 73. 
LaTeX Warning: Hyper reference `user/introduction:why-we-built-this' on page 3 undefined on input line 76. LaTeX Warning: Hyper reference `user/introduction:how-this-serves-our-mission' on page 3 undefined on input line 79. LaTeX Warning: Hyper reference `user/introduction:code-design-philosophy' on pa ge 3 undefined on input line 82. LaTeX Warning: Hyper reference `user/introduction:support-plan' on page 3 undef ined on input line 85. [3] [4] [5] [6] Chapter 2. LaTeX Warning: Hyper reference `user/installation:installation' on page 7 undef ined on input line 209. LaTeX Warning: Hyper reference `user/installation:installing-python' on page 7 undefined on input line 212. LaTeX Warning: Hyper reference `user/installation:installing-openmpi' on page 7 undefined on input line 215. LaTeX Warning: Hyper reference `user/installation:ubuntu' on page 7 undefined o n input line 218. LaTeX Warning: Hyper reference `user/installation:mac-os-x' on page 7 undefined on input line 221. LaTeX Warning: Hyper reference `user/installation:installing-spinning-up' on pa ge 7 undefined on input line 226. LaTeX Warning: Hyper reference `user/installation:check-your-install' on page 7 undefined on input line 229. LaTeX Warning: Hyper reference `user/installation:installing-mujoco-optional' o n page 7 undefined on input line 232. [7] [8] [9] [10] Chapter 3. LaTeX Warning: Hyper reference `user/algorithms:algorithms' on page 11 undefine d on input line 359. LaTeX Warning: Hyper reference `user/algorithms:what-s-included' on page 11 und efined on input line 362. LaTeX Warning: Hyper reference `user/algorithms:why-these-algorithms' on page 1 1 undefined on input line 365. LaTeX Warning: Hyper reference `user/algorithms:the-on-policy-algorithms' on pa ge 11 undefined on input line 368. LaTeX Warning: Hyper reference `user/algorithms:the-off-policy-algorithms' on p age 11 undefined on input line 371. 
LaTeX Warning: Hyper reference `user/algorithms:code-format' on page 11 undefin ed on input line 376. LaTeX Warning: Hyper reference `user/algorithms:the-algorithm-file' on page 11 undefined on input line 379. LaTeX Warning: Hyper reference `user/algorithms:the-core-file' on page 11 undef ined on input line 382. [11] [12] [13] [14] Chapter 4. LaTeX Warning: Hyper reference `user/running:running-experiments' on page 15 un defined on input line 528. LaTeX Warning: Hyper reference `user/running:launching-from-the-command-line' o n page 15 undefined on input line 531. LaTeX Warning: Hyper reference `user/running:setting-hyperparameters-from-the-c ommand-line' on page 15 undefined on input line 534. LaTeX Warning: Hyper reference `user/running:launching-multiple-experiments-at- once' on page 15 undefined on input line 537. LaTeX Warning: Hyper reference `user/running:special-flags' on page 15 undefine d on input line 540. LaTeX Warning: Hyper reference `user/running:environment-flag' on page 15 undef ined on input line 543. LaTeX Warning: Hyper reference `user/running:shortcut-flags' on page 15 undefin ed on input line 546. LaTeX Warning: Hyper reference `user/running:config-flags' on page 15 undefined on input line 549. LaTeX Warning: Hyper reference `user/running:where-results-are-saved' on page 1 5 undefined on input line 554. LaTeX Warning: Hyper reference `user/running:how-is-suffix-determined' on page 15 undefined on input line 557. LaTeX Warning: Hyper reference `user/running:extra' on page 15 undefined on inp ut line 562. LaTeX Warning: Hyper reference `user/running:launching-from-scripts' on page 15 undefined on input line 567. LaTeX Warning: Hyper reference `user/running:using-experimentgrid' on page 15 u ndefined on input line 570. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. LaTeX Warning: Hyper reference `user/saving_and_loading:experiment-outputs' on page 21 undefined on input line 898. 
LaTeX Warning: Hyper reference `user/saving_and_loading:algorithm-outputs' on p age 21 undefined on input line 901. LaTeX Warning: Hyper reference `user/saving_and_loading:save-directory-location ' on page 21 undefined on input line 904. LaTeX Warning: Hyper reference `user/saving_and_loading:loading-and-running-tra ined-policies' on page 21 undefined on input line 907. LaTeX Warning: Hyper reference `user/saving_and_loading:if-environment-saves-su ccessfully' on page 21 undefined on input line 910. LaTeX Warning: Hyper reference `user/saving_and_loading:environment-not-found-e rror' on page 21 undefined on input line 913. LaTeX Warning: Hyper reference `user/saving_and_loading:using-trained-value-fun ctions' on page 21 undefined on input line 916. [21] LaTeX Warning: Hyper reference `user/saving_and_loading:details-below' on page 22 undefined on input line 961. [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. LaTeX Warning: Hyper reference `spinningup/rl_intro:part-1-key-concepts-in-rl' on page 29 undefined on input line 1277. LaTeX Warning: Hyper reference `spinningup/rl_intro:what-can-rl-do' on page 29 undefined on input line 1280. LaTeX Warning: Hyper reference `spinningup/rl_intro:key-concepts-and-terminolog y' on page 29 undefined on input line 1283. LaTeX Warning: Hyper reference `spinningup/rl_intro:optional-formalism' on page 29 undefined on input line 1286. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing \endgroup inserted. \endgroup l.1616 \end{align*} ! 
Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} [36] [37] [38] Chapter 8. LaTeX Warning: Hyper reference `spinningup/rl_intro2:part-2-kinds-of-rl-algorit hms' on page 39 undefined on input line 1676. LaTeX Warning: Hyper reference `spinningup/rl_intro2:a-taxonomy-of-rl-algorithm s' on page 39 undefined on input line 1679. LaTeX Warning: Hyper reference `spinningup/rl_intro2:links-to-algorithms-in-tax onomy' on page 39 undefined on input line 1682. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} LaTeX Warning: Hyper reference `spinningup/rl_intro2:citations-below' on page 3 9 undefined on input line 1698. [39] [40] [41] [42] Chapter 9. LaTeX Warning: Hyper reference `spinningup/rl_intro3:part-3-intro-to-policy-opt imization' on page 43 undefined on input line 1839. 
LaTeX Warning: Hyper reference `spinningup/rl_intro3:deriving-the-simplest-poli cy-gradient' on page 43 undefined on input line 1842. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-the-simplest- policy-gradient' on page 43 undefined on input line 1845. LaTeX Warning: Hyper reference `spinningup/rl_intro3:expected-grad-log-prob-lem ma' on page 43 undefined on input line 1848. LaTeX Warning: Hyper reference `spinningup/rl_intro3:don-t-let-the-past-distrac t-you' on page 43 undefined on input line 1851. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-reward-to-go- policy-gradient' on page 43 undefined on input line 1854. LaTeX Warning: Hyper reference `spinningup/rl_intro3:baselines-in-policy-gradie nts' on page 43 undefined on input line 1857. LaTeX Warning: Hyper reference `spinningup/rl_intro3:other-forms-of-the-policy- gradient' on page 43 undefined on input line 1860. LaTeX Warning: Hyper reference `spinningup/rl_intro3:recap' on page 43 undefine d on input line 1863. ! Undefined control sequence. l.1888 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... 
l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! 
Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. LaTeX Warning: Hyper reference `spinningup/spinningup:spinning-up-as-a-deep-rl- researcher' on page 53 undefined on input line 2251. LaTeX Warning: Hyper reference `spinningup/spinningup:the-right-background' on page 53 undefined on input line 2254. LaTeX Warning: Hyper reference `spinningup/spinningup:learn-by-doing' on page 5 3 undefined on input line 2257. LaTeX Warning: Hyper reference `spinningup/spinningup:developing-a-research-pro ject' on page 53 undefined on input line 2260. LaTeX Warning: Hyper reference `spinningup/spinningup:doing-rigorous-research-i n-rl' on page 53 undefined on input line 2263. LaTeX Warning: Hyper reference `spinningup/spinningup:closing-thoughts' on page 53 undefined on input line 2266. LaTeX Warning: Hyper reference `spinningup/spinningup:ps-other-resources' on pa ge 53 undefined on input line 2269. 
LaTeX Warning: Hyper reference `spinningup/spinningup:references' on page 53 un defined on input line 2272. [53] [54] [55] [56] [57] [58] Chapter 11. LaTeX Warning: Hyper reference `spinningup/keypapers:key-papers-in-deep-rl' on page 59 undefined on input line 2381. LaTeX Warning: Hyper reference `spinningup/keypapers:model-free-rl' on page 59 undefined on input line 2384. LaTeX Warning: Hyper reference `spinningup/keypapers:exploration' on page 59 un defined on input line 2387. LaTeX Warning: Hyper reference `spinningup/keypapers:transfer-and-multitask-rl' on page 59 undefined on input line 2390. LaTeX Warning: Hyper reference `spinningup/keypapers:hierarchy' on page 59 unde fined on input line 2393. LaTeX Warning: Hyper reference `spinningup/keypapers:memory' on page 59 undefin ed on input line 2396. LaTeX Warning: Hyper reference `spinningup/keypapers:model-based-rl' on page 59 undefined on input line 2399. LaTeX Warning: Hyper reference `spinningup/keypapers:meta-rl' on page 59 undefi ned on input line 2402. LaTeX Warning: Hyper reference `spinningup/keypapers:scaling-rl' on page 59 und efined on input line 2405. LaTeX Warning: Hyper reference `spinningup/keypapers:rl-in-the-real-world' on p age 59 undefined on input line 2408. LaTeX Warning: Hyper reference `spinningup/keypapers:safety' on page 59 undefin ed on input line 2411. LaTeX Warning: Hyper reference `spinningup/keypapers:imitation-learning-and-inv erse-reinforcement-learning' on page 59 undefined on input line 2414. LaTeX Warning: Hyper reference `spinningup/keypapers:reproducibility-analysis-a nd-critique' on page 59 undefined on input line 2417. LaTeX Warning: Hyper reference `spinningup/keypapers:bonus-classic-papers-in-rl -theory-or-review' on page 59 undefined on input line 2420. [59] [60] Overfull \vbox (108.35579pt too high) has occurred while \output is active [61] [62] Chapter 12. LaTeX Warning: Hyper reference `spinningup/exercises:exercises' on page 63 unde fined on input line 2509. 
LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-1-basics-of-im plementation' on page 63 undefined on input line 2512. LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-2-algorithm-fa ilure-modes' on page 63 undefined on input line 2515. LaTeX Warning: Hyper reference `spinningup/exercises:challenges' on page 63 und efined on input line 2518. [63] [64] [65] [66] Chapter 13. LaTeX Warning: Hyper reference `spinningup/bench:benchmarks-for-spinning-up-imp lementations' on page 67 undefined on input line 2657. LaTeX Warning: Hyper reference `spinningup/bench:performance-in-each-environmen t' on page 67 undefined on input line 2660. LaTeX Warning: Hyper reference `spinningup/bench:halfcheetah' on page 67 undefi ned on input line 2663. LaTeX Warning: Hyper reference `spinningup/bench:hopper' on page 67 undefined o n input line 2666. LaTeX Warning: Hyper reference `spinningup/bench:walker' on page 67 undefined o n input line 2669. LaTeX Warning: Hyper reference `spinningup/bench:swimmer' on page 67 undefined on input line 2672. LaTeX Warning: Hyper reference `spinningup/bench:ant' on page 67 undefined on i nput line 2675. LaTeX Warning: Hyper reference `spinningup/bench:experiment-details' on page 67 undefined on input line 2680. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2698 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2698 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2707 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.2707 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2716 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2716 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2725 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2725 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2734 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2734 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. LaTeX Warning: Hyper reference `algorithms/vpg:vanilla-policy-gradient' on page 71 undefined on input line 2757. LaTeX Warning: Hyper reference `algorithms/vpg:background' on page 71 undefined on input line 2760. LaTeX Warning: Hyper reference `algorithms/vpg:quick-facts' on page 71 undefine d on input line 2763. LaTeX Warning: Hyper reference `algorithms/vpg:key-equations' on page 71 undefi ned on input line 2766. LaTeX Warning: Hyper reference `algorithms/vpg:exploration-vs-exploitation' on page 71 undefined on input line 2769. LaTeX Warning: Hyper reference `algorithms/vpg:pseudocode' on page 71 undefined on input line 2772. LaTeX Warning: Hyper reference `algorithms/vpg:documentation' on page 71 undefi ned on input line 2777. 
LaTeX Warning: Hyper reference `algorithms/vpg:saved-model-contents' on page 71 undefined on input line 2780. LaTeX Warning: Hyper reference `algorithms/vpg:references' on page 71 undefined on input line 2785. LaTeX Warning: Hyper reference `algorithms/vpg:relevant-papers' on page 71 unde fined on input line 2788. LaTeX Warning: Hyper reference `algorithms/vpg:why-these-papers' on page 71 und efined on input line 2791. LaTeX Warning: Hyper reference `algorithms/vpg:other-public-implementations' on page 71 undefined on input line 2794. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2831 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2831 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2848 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2849 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2851 \begin{algorithmic} [1] ! Undefined control sequence. l.2852 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2853 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2854 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2855 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2856 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2857 \STATE Estimate policy gradient as ! Undefined control sequence. l.2861 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. 
l.2866 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2871 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2872 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2873 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2879--2879 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2879--2879 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. LaTeX Warning: Hyper reference `algorithms/trpo:trust-region-policy-optimizatio n' on page 77 undefined on input line 3079. LaTeX Warning: Hyper reference `algorithms/trpo:background' on page 77 undefine d on input line 3082. LaTeX Warning: Hyper reference `algorithms/trpo:quick-facts' on page 77 undefin ed on input line 3085. LaTeX Warning: Hyper reference `algorithms/trpo:key-equations' on page 77 undef ined on input line 3088. LaTeX Warning: Hyper reference `algorithms/trpo:exploration-vs-exploitation' on page 77 undefined on input line 3091. LaTeX Warning: Hyper reference `algorithms/trpo:pseudocode' on page 77 undefine d on input line 3094. LaTeX Warning: Hyper reference `algorithms/trpo:documentation' on page 77 undef ined on input line 3099. LaTeX Warning: Hyper reference `algorithms/trpo:saved-model-contents' on page 7 7 undefined on input line 3102. LaTeX Warning: Hyper reference `algorithms/trpo:references' on page 77 undefine d on input line 3107. 
LaTeX Warning: Hyper reference `algorithms/trpo:relevant-papers' on page 77 und efined on input line 3110. LaTeX Warning: Hyper reference `algorithms/trpo:why-these-papers' on page 77 un defined on input line 3113. LaTeX Warning: Hyper reference `algorithms/trpo:other-public-implementations' o n page 77 undefined on input line 3116. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3160 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3160 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3166 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3166 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3215 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3216 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3218 \begin{algorithmic} [1] ! Undefined control sequence. l.3219 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3220 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3221 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3222 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3223 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3224 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3225 \STATE Estimate policy gradient as ! 
Undefined control sequence. l.3229 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3234 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3239 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3244 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3245 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3246 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. 
LaTeX Warning: Hyper reference `algorithms/ppo:proximal-policy-optimization' on page 85 undefined on input line 3526. LaTeX Warning: Hyper reference `algorithms/ppo:background' on page 85 undefined on input line 3529. LaTeX Warning: Hyper reference `algorithms/ppo:quick-facts' on page 85 undefine d on input line 3532. LaTeX Warning: Hyper reference `algorithms/ppo:key-equations' on page 85 undefi ned on input line 3535. LaTeX Warning: Hyper reference `algorithms/ppo:exploration-vs-exploitation' on page 85 undefined on input line 3538. LaTeX Warning: Hyper reference `algorithms/ppo:pseudocode' on page 85 undefined on input line 3541. LaTeX Warning: Hyper reference `algorithms/ppo:documentation' on page 85 undefi ned on input line 3546. LaTeX Warning: Hyper reference `algorithms/ppo:saved-model-contents' on page 85 undefined on input line 3549. LaTeX Warning: Hyper reference `algorithms/ppo:references' on page 85 undefined on input line 3554. LaTeX Warning: Hyper reference `algorithms/ppo:relevant-papers' on page 85 unde fined on input line 3557. LaTeX Warning: Hyper reference `algorithms/ppo:why-these-papers' on page 85 und efined on input line 3560. LaTeX Warning: Hyper reference `algorithms/ppo:other-public-implementations' on page 85 undefined on input line 3563. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3672 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3673 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3675 \begin{algorithmic} [1] ! Undefined control sequence. l.3676 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3677 \FOR {$k = 0,1,2,...$} ! 
Undefined control sequence. l.3678 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3679 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3680 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3681 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3689 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3694 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3695 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3696 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3702--3702 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. LaTeX Warning: Hyper reference `algorithms/ddpg:deep-deterministic-policy-gradi ent' on page 91 undefined on input line 3922. LaTeX Warning: Hyper reference `algorithms/ddpg:background' on page 91 undefine d on input line 3925. LaTeX Warning: Hyper reference `algorithms/ddpg:quick-facts' on page 91 undefin ed on input line 3928. LaTeX Warning: Hyper reference `algorithms/ddpg:key-equations' on page 91 undef ined on input line 3931. LaTeX Warning: Hyper reference `algorithms/ddpg:the-q-learning-side-of-ddpg' on page 91 undefined on input line 3934. LaTeX Warning: Hyper reference `algorithms/ddpg:the-policy-learning-side-of-ddp g' on page 91 undefined on input line 3937. LaTeX Warning: Hyper reference `algorithms/ddpg:exploration-vs-exploitation' on page 91 undefined on input line 3942. 
LaTeX Warning: Hyper reference `algorithms/ddpg:pseudocode' on page 91 undefine d on input line 3945. LaTeX Warning: Hyper reference `algorithms/ddpg:documentation' on page 91 undef ined on input line 3950. LaTeX Warning: Hyper reference `algorithms/ddpg:saved-model-contents' on page 9 1 undefined on input line 3953. LaTeX Warning: Hyper reference `algorithms/ddpg:references' on page 91 undefine d on input line 3958. LaTeX Warning: Hyper reference `algorithms/ddpg:relevant-papers' on page 91 und efined on input line 3961. LaTeX Warning: Hyper reference `algorithms/ddpg:why-these-papers' on page 91 un defined on input line 3964. LaTeX Warning: Hyper reference `algorithms/ddpg:other-public-implementations' o n page 91 undefined on input line 3967. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4086 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4087 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4089 \begin{algorithmic} [1] ! Undefined control sequence. l.4090 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4091 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4092 \REPEAT ! Undefined control sequence. l.4093 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4094 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4095 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4096 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. 
l.4097 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4098 \IF {it's time to update} ! Undefined control sequence. l.4099 \FOR {however many updates} ! Undefined control sequence. l.4100 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4101 \STATE Compute targets ! Undefined control sequence. l.4105 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4109 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4113 \STATE Update target networks with ! Undefined control sequence. l.4118 \ENDFOR ! Undefined control sequence. l.4119 \ENDIF ! Undefined control sequence. l.4120 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4121 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4122 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4128--4128 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4128--4128 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. LaTeX Warning: Hyper reference `algorithms/td3:twin-delayed-ddpg' on page 97 un defined on input line 4343. LaTeX Warning: Hyper reference `algorithms/td3:background' on page 97 undefined on input line 4346. LaTeX Warning: Hyper reference `algorithms/td3:quick-facts' on page 97 undefine d on input line 4349. 
LaTeX Warning: Hyper reference `algorithms/td3:key-equations' on page 97 undefi ned on input line 4352. LaTeX Warning: Hyper reference `algorithms/td3:exploration-vs-exploitation' on page 97 undefined on input line 4355. LaTeX Warning: Hyper reference `algorithms/td3:pseudocode' on page 97 undefined on input line 4358. LaTeX Warning: Hyper reference `algorithms/td3:documentation' on page 97 undefi ned on input line 4363. LaTeX Warning: Hyper reference `algorithms/td3:saved-model-contents' on page 97 undefined on input line 4366. LaTeX Warning: Hyper reference `algorithms/td3:references' on page 97 undefined on input line 4371. LaTeX Warning: Hyper reference `algorithms/td3:relevant-papers' on page 97 unde fined on input line 4374. LaTeX Warning: Hyper reference `algorithms/td3:other-public-implementations' on page 97 undefined on input line 4377. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4434 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4434 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4438 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4438 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4462 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4463 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4465 \begin{algorithmic} [1] ! Undefined control sequence. l.4466 \STATE Input: initial policy parameters $\theta$, Q-function para... ! 
Undefined control sequence. l.4467 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4468 \REPEAT ! Undefined control sequence. l.4469 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4470 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4471 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4472 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4473 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4474 \IF {it's time to update} ! Undefined control sequence. l.4475 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4476 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4477 \STATE Compute target actions ! Undefined control sequence. l.4481 \STATE Compute targets ! Undefined control sequence. l.4485 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4489 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4490 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4494 \STATE Update target networks with ! Undefined control sequence. l.4499 \ENDIF ! Undefined control sequence. l.4500 \ENDFOR ! Undefined control sequence. l.4501 \ENDIF ! Undefined control sequence. l.4502 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4503 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4504 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4510--4510 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4510--4510 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. LaTeX Warning: Hyper reference `algorithms/sac:soft-actor-critic' on page 103 u ndefined on input line 4736. LaTeX Warning: Hyper reference `algorithms/sac:background' on page 103 undefine d on input line 4739. LaTeX Warning: Hyper reference `algorithms/sac:quick-facts' on page 103 undefin ed on input line 4742. LaTeX Warning: Hyper reference `algorithms/sac:key-equations' on page 103 undef ined on input line 4745. LaTeX Warning: Hyper reference `algorithms/sac:entropy-regularized-reinforcemen t-learning' on page 103 undefined on input line 4748. LaTeX Warning: Hyper reference `algorithms/sac:id1' on page 103 undefined on in put line 4751. LaTeX Warning: Hyper reference `algorithms/sac:exploration-vs-exploitation' on page 103 undefined on input line 4756. LaTeX Warning: Hyper reference `algorithms/sac:pseudocode' on page 103 undefine d on input line 4759. LaTeX Warning: Hyper reference `algorithms/sac:documentation' on page 103 undef ined on input line 4764. LaTeX Warning: Hyper reference `algorithms/sac:saved-model-contents' on page 10 3 undefined on input line 4767. LaTeX Warning: Hyper reference `algorithms/sac:references' on page 103 undefine d on input line 4772. LaTeX Warning: Hyper reference `algorithms/sac:relevant-papers' on page 103 und efined on input line 4775. 
LaTeX Warning: Hyper reference `algorithms/sac:other-public-implementations' on page 103 undefined on input line 4778. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4825 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4825 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4829 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4829 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4833 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4833 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4837 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4837 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4841 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4841 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. 
...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4883 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4883 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4922 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4923 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4925 \begin{algorithmic} [1] ! Undefined control sequence. l.4926 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. 
l.4927 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4928 \REPEAT ! Undefined control sequence. l.4929 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4930 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4931 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4932 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4933 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4934 \IF {it's time to update} ! Undefined control sequence. l.4935 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4936 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4937 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4942 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4946 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4950 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4955 \STATE Update target value network with ! Undefined control sequence. l.4959 \ENDFOR ! Undefined control sequence. l.4960 \ENDIF ! Undefined control sequence. l.4961 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4962 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4963 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4969--4969 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4969--4969 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. LaTeX Warning: Hyper reference `utils/logger:logger' on page 111 undefined on i nput line 5233. LaTeX Warning: Hyper reference `utils/logger:using-a-logger' on page 111 undefi ned on input line 5236. LaTeX Warning: Hyper reference `utils/logger:examples' on page 111 undefined on input line 5239. LaTeX Warning: Hyper reference `utils/logger:logging-and-mpi' on page 111 undef ined on input line 5242. LaTeX Warning: Hyper reference `utils/logger:logger-classes' on page 111 undefi ned on input line 5247. LaTeX Warning: Hyper reference `utils/logger:loading-saved-graphs' on page 111 undefined on input line 5250. [111] [112] [113] [114] LaTeX Warning: Hyper reference `utils/logger:spinup.utils.logx.Logger' on page 115 undefined on input line 5548. [115] [116] Chapter 21. [117] [118] Chapter 22. LaTeX Warning: Hyper reference `utils/mpi:mpi-tools' on page 119 undefined on i nput line 5693. LaTeX Warning: Hyper reference `utils/mpi:module-spinup.utils.mpi_tools' on pag e 119 undefined on input line 5696. LaTeX Warning: Hyper reference `utils/mpi:mpi-tensorflow-utilities' on page 119 undefined on input line 5699. [119] [120] Chapter 23. LaTeX Warning: Hyper reference `utils/run_utils:run-utils' on page 121 undefine d on input line 5826. 
LaTeX Warning: Hyper reference `utils/run_utils:experimentgrid' on page 121 und efined on input line 5829. LaTeX Warning: Hyper reference `utils/run_utils:calling-experiments' on page 12 1 undefined on input line 5832. [121] Underfull \hbox (badness 10000) in paragraph at lines 5984--5984 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. [129] [130] LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tf' on page 131 und efined on input line 6117. LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tools' on page 131 undefined on input line 6118. [131] No file SpinningUp.ind. (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were undefined references. LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (135 pages, 1117986 bytes). Transcript written on SpinningUp.log. [rtd-command-info] start-time: 2018-11-13T01:05:33.267074Z, end-time: 2018-11-13T01:05:33.390341Z, duration: 0, exit-code: 0 makeindex -s python.ist SpinningUp.idx This is makeindex, version 2.15 [TeX Live 2015] (kpathsea + Thai support). Scanning style file ./python.ist.......done (7 attributes redefined, 0 ignored). Scanning input file SpinningUp.idx....done (78 entries accepted, 0 rejected). Sorting entries....done (506 comparisons). Generating output file SpinningUp.ind....done (144 lines written, 0 warnings). Output written in SpinningUp.ind. Transcript written in SpinningUp.ilg. 
[rtd-command-info] start-time: 2018-11-13T01:05:33.450182Z, end-time: 2018-11-13T01:05:34.635337Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.tex This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/c heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in conf.py ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx (./SpinningUp.aux LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. ) (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (./SpinningUp.out) (./SpinningUp.out) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/pdftex.map}] [2] (./SpinningUp.toc [1] [2]) [3] [4] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. 
(/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) [3] [4] [5] [6] Chapter 2. [7] [8] [9] [10] Chapter 3. [11] [12] [13] [14] Chapter 4. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. [21] [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1529 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1548 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1555 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1562 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1569 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... 
l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1583 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1616 \end{align*} ! Missing } inserted. } l.1616 \end{align*} ! Missing \endgroup inserted. \endgroup l.1616 \end{align*} ! Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1616 \end{align*} ! Missing { inserted. { l.1616 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1616 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1616 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1623 \end{align*} ! Undefined control sequence. ...ma V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1623 \end{align*} [36] [37] [38] Chapter 8. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1697 ...ncludegraphics{{rl_algorithms_9_15}.svg} [39] [40] [41] [42] Chapter 9. ! Undefined control sequence. l.1888 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1919 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1931 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2080 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2097 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2105 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2113 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2165 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2169 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2183 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. 
...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2197 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. [53] [54] [55] [56] [57] [58] Chapter 11. [59] [60] Overfull \vbox (108.35579pt too high) has occurred while \output is active [61] [62] Chapter 12. [63] [64] [65] [66] Chapter 13. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2698 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2698 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2707 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2707 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2716 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2716 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2725 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2725 ...\sphinxincludegraphics{{bench_swim}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2734 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2734 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2831 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2831 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2848 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2849 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2851 \begin{algorithmic} [1] ! Undefined control sequence. l.2852 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2853 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2854 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2855 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2856 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2857 \STATE Estimate policy gradient as ! Undefined control sequence. l.2861 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. l.2866 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2871 \ENDFOR ! 
LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2872 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2873 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2879--2879 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2879--2879 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. [77] ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3160 },\end{split} ! Undefined control sequence. ...al L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3160 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3166 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3166 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3215 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3216 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3218 \begin{algorithmic} [1] ! Undefined control sequence. 
l.3219 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3220 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3221 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3222 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3223 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3224 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3225 \STATE Estimate policy gradient as ! Undefined control sequence. l.3229 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3234 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3239 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3244 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3245 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.3246 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3252--3252 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3672 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3673 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3675 \begin{algorithmic} [1] ! Undefined control sequence. l.3676 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3677 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. 
l.3678 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3679 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3680 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3681 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3689 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3694 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3695 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3696 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3702--3702 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4086 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4087 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4089 \begin{algorithmic} [1] ! Undefined control sequence. l.4090 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4091 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4092 \REPEAT ! Undefined control sequence. 
l.4093 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4094 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4095 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4096 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4097 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4098 \IF {it's time to update} ! Undefined control sequence. l.4099 \FOR {however many updates} ! Undefined control sequence. l.4100 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4101 \STATE Compute targets ! Undefined control sequence. l.4105 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4109 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4113 \STATE Update target networks with ! Undefined control sequence. l.4118 \ENDFOR ! Undefined control sequence. l.4119 \ENDIF ! Undefined control sequence. l.4120 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4121 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4122 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4128--4128 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4128--4128 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4434 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4434 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4438 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4438 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4462 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4463 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4465 \begin{algorithmic} [1] ! Undefined control sequence. l.4466 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4467 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4468 \REPEAT ! Undefined control sequence. l.4469 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. 
l.4470 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4471 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4472 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4473 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4474 \IF {it's time to update} ! Undefined control sequence. l.4475 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4476 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4477 \STATE Compute target actions ! Undefined control sequence. l.4481 \STATE Compute targets ! Undefined control sequence. l.4485 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4489 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4490 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4494 \STATE Update target networks with ! Undefined control sequence. l.4499 \ENDIF ! Undefined control sequence. l.4500 \ENDFOR ! Undefined control sequence. l.4501 \ENDIF ! Undefined control sequence. l.4502 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4503 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4504 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4510--4510 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4510--4510 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. [103] ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4825 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...it@tag \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4825 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4829 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4829 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4833 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4833 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4837 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4837 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4841 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4841 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4846 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4869 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. 
...it}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4877 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4883 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4883 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4900 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4904 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4922 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! 
LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4923 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4925 \begin{algorithmic} [1] ! Undefined control sequence. l.4926 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4927 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4928 \REPEAT ! Undefined control sequence. l.4929 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4930 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4931 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4932 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4933 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4934 \IF {it's time to update} ! Undefined control sequence. l.4935 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4936 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4937 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4942 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4946 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4950 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4955 \STATE Update target value network with ! Undefined control sequence. l.4959 \ENDFOR ! Undefined control sequence. l.4960 \ENDIF ! Undefined control sequence. l.4961 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.4962 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4963 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4969--4969 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4969--4969 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. [111] [112] [113] [114] [115] [116] Chapter 21. [117] [118] Chapter 22. [119] [120] Chapter 23. [121] Underfull \hbox (badness 10000) in paragraph at lines 5984--5984 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. 
[129] [130] [131] (./SpinningUp.ind [132] Underfull \hbox (badness 7522) in paragraph at lines 47--48 []\T1/ptm/m/n/10 add() (spinup.utils.run_utils.ExperimentGrid method), Overfull \hbox (5.61969pt too wide) in paragraph at lines 48--49 []\T1/ptm/m/n/10 apply_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer Overfull \hbox (17.83952pt too wide) in paragraph at lines 74--75 []\T1/ptm/m/n/10 compute_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer [133] Underfull \hbox (badness 10000) in paragraph at lines 103--104 []\T1/ptm/m/n/10 mpi_statistics_scalar() (in mod-ule Underfull \hbox (badness 10000) in paragraph at lines 119--120 []\T1/ptm/m/n/10 run() (spinup.utils.run_utils.ExperimentGrid method), Underfull \hbox (badness 10000) in paragraph at lines 140--141 []\T1/ptm/m/n/10 variant_name() (spinup.utils.run_utils.ExperimentGrid Underfull \hbox (badness 10000) in paragraph at lines 141--142 []\T1/ptm/m/n/10 variants() (spinup.utils.run_utils.ExperimentGrid [134]) (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were multiply-defined labels. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (140 pages, 1145929 bytes). Transcript written on SpinningUp.log. 
[rtd-command-info] start-time: 2018-11-13T01:05:34.709672Z, end-time: 2018-11-13T01:05:34.773722Z, duration: 0, exit-code: 0
mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/latex/SpinningUp.pdf /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/latest/sphinx_pdf/openai-education-spinningup.pdf
[rtd-command-info] start-time: 2018-11-13T01:05:34.833525Z, end-time: 2018-11-13T01:07:05.994784Z, duration: 91, exit-code: 0
python sphinx-build -T -b epub -d _build/doctrees-epub -D language=en . _build/epub
Running Sphinx v1.5.6
making output directory...
loading translations [en]... done
loading pickled environment... not yet created
building [mo]: targets for 0 po files that are out of date
building [epub]: targets for 31 source files that are out of date
updating environment: 31 added, 0 changed, 0 removed
reading sources... [ 3%] algorithms/ddpg
reading sources... [ 6%] algorithms/ppo
reading sources... [ 9%] algorithms/sac
reading sources... [ 12%] algorithms/td3
reading sources... [ 16%] algorithms/trpo
reading sources... [ 19%] algorithms/vpg
reading sources... [ 22%] etc/acknowledgements
reading sources... [ 25%] etc/author
reading sources... [ 29%] index
reading sources... [ 32%] spinningup/bench
reading sources... [ 35%] spinningup/exercise2_1_soln
reading sources... [ 38%] spinningup/exercise2_2_soln
reading sources... [ 41%] spinningup/exercises
reading sources... [ 45%] spinningup/extra_pg_proof1
reading sources... [ 48%] spinningup/extra_pg_proof2
reading sources... [ 51%] spinningup/keypapers
reading sources... [ 54%] spinningup/rl_intro
reading sources... [ 58%] spinningup/rl_intro2
reading sources... [ 61%] spinningup/rl_intro3
reading sources... [ 64%] spinningup/rl_intro4
reading sources... [ 67%] spinningup/spinningup
reading sources... [ 70%] user/algorithms
reading sources... [ 74%] user/installation
reading sources... [ 77%] user/introduction
reading sources... [ 80%] user/plotting
reading sources... [ 83%] user/running
reading sources... [ 87%] user/saving_and_loading
reading sources... [ 90%] utils/logger
reading sources... [ 93%] utils/mpi
reading sources... [ 96%] utils/plotter
reading sources... [100%] utils/run_utils
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercises.rst:3: WARNING: Duplicate explicit target name: "solution available here.".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:3: WARNING: Duplicate explicit target name: "on github".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro3.rst:380: WARNING: Line block ends without a blank line.
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:55: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/plotting.rst:63: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:144: WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/running.rst:285: WARNING: Inline strong start-string without end-string.
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:119: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/user/saving_and_loading.rst:146: WARNING: Duplicate explicit target name: "cmdoption-arg-default".
looking for now-outdated files... none found
pickling environment... done
checking consistency...
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_1_soln.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/exercise2_2_soln.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof1.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/extra_pg_proof2.rst:: WARNING: document isn't included in any toctree
/home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/spinningup/rl_intro4.rst:: WARNING: document isn't included in any toctree
done
preparing documents... done
writing output... [ 3%] algorithms/ddpg
writing output... [ 6%] algorithms/ppo
writing output... [ 9%] algorithms/sac
writing output... [ 12%] algorithms/td3
writing output... [ 16%] algorithms/trpo
writing output... [ 19%] algorithms/vpg
writing output... [ 22%] etc/acknowledgements
writing output... [ 25%] etc/author
writing output... [ 29%] index
writing output... [ 32%] spinningup/bench
writing output... [ 35%] spinningup/exercise2_1_soln
writing output... [ 38%] spinningup/exercise2_2_soln
writing output... [ 41%] spinningup/exercises
writing output... [ 45%] spinningup/extra_pg_proof1
writing output... [ 48%] spinningup/extra_pg_proof2
writing output... [ 51%] spinningup/keypapers
writing output... [ 54%] spinningup/rl_intro
writing output... [ 58%] spinningup/rl_intro2
writing output... [ 61%] spinningup/rl_intro3
writing output... [ 64%] spinningup/rl_intro4
writing output... [ 67%] spinningup/spinningup
writing output... [ 70%] user/algorithms
writing output... [ 74%] user/installation
writing output... [ 77%] user/introduction
writing output... [ 80%] user/plotting
writing output... [ 83%] user/running
writing output... [ 87%] user/saving_and_loading
writing output... [ 90%] utils/logger
writing output... [ 93%] utils/mpi
writing output... [ 96%] utils/plotter
writing output... [100%] utils/run_utils
generating indices... genindex py-modindex
writing additional pages...
copying images... [ 10%] images/spinning-up-in-rl.png
copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg
copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg
copying images... [ 40%] spinningup/../images/bench/bench_walker.svg
copying images... [ 50%] spinningup/../images/bench/bench_swim.svg
copying images... [ 60%] spinningup/../images/bench/bench_ant.svg
copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png
copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg
copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png
copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg
copying static files... WARNING: html_static_path entry '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx/_static' does not exist
done
copying extra files... WARNING: favicon file 'openai_icon.ico' does not exist
done
writing mimetype file...
writing META-INF/container.xml file...
writing content.opf file...
WARNING: unknown mimetype for _static/openai-favicon2_32x32.ico, ignoring
WARNING: unknown mimetype for _static/openai_icon.ico, ignoring
writing nav.xhtml file...
writing toc.ncx file...
writing SpinningUp.epub file...
build succeeded, 18 warnings.
[rtd-command-info] start-time: 2018-11-13T01:07:06.096583Z, end-time: 2018-11-13T01:07:06.158156Z, duration: 0, exit-code: 0
mv -f /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/checkouts/latest/docs/_build/epub/SpinningUp.epub /home/docs/checkouts/readthedocs.org/user_builds/openai-education-spinningup/artifacts/latest/sphinx_epub/openai-education-spinningup.epub