Read the Docs build information
Build id: 160601
Project: openai-education-spinningup
Version: latest
Commit: 606d935235b20fb5e0276ad2368d410696b6f6e5
Date: 2018-11-17T07:18:16.998636Z
State: finished
Success: True

[rtd-command-info] start-time: 2018-11-17T13:23:53.231262Z, end-time: 2018-11-17T13:23:53.238115Z, duration: 0, exit-code: 0
git remote set-url origin

[rtd-command-info] start-time: 2018-11-17T13:23:53.300680Z, end-time: 2018-11-17T13:23:53.668986Z, duration: 0, exit-code: 0
git fetch --tags --prune --prune-tags
From
ff610ce..606d935 master -> origin/master

[rtd-command-info] start-time: 2018-11-17T13:23:53.740053Z, end-time: 2018-11-17T13:23:53.772324Z, duration: 0, exit-code: 0
git checkout --force origin/master
Previous HEAD position was ff610ce Merge pull request #36 from maksay/master
HEAD is now at 606d935 Merge pull request #39 from siyavash/patch-1

[rtd-command-info] start-time: 2018-11-17T13:23:53.834810Z, end-time: 2018-11-17T13:23:53.861297Z, duration: 0, exit-code: 0
git clean -d -f -f
Removing docs/_build/doctrees-epub/
Removing docs/_build/doctrees-readthedocs/
Removing docs/_build/doctrees-readthedocssinglehtmllocalmedia/
Removing docs/_build/epub/
Removing docs/_build/html/_static/readthedocs-data.js
Removing docs/_build/json/
Removing docs/_build/latex/
Removing docs/_build/localmedia/

[rtd-command-info] start-time: 2018-11-17T13:23:53.942535Z, end-time: 2018-11-17T13:23:53.948125Z, duration: 0, exit-code: 0
git branch -r
origin/HEAD -> origin/master
origin/master

[rtd-command-info] start-time: 2018-11-17T13:23:54.809795Z, end-time: 2018-11-17T13:23:57.917961Z, duration: 3, exit-code: 0
python3.6 -mvirtualenv --no-site-packages --no-download
Using base prefix '/home/docs/.pyenv/versions/3.6.2'
New python executable in /home/docs/checkouts/
Not overwriting existing python script /home/docs/checkouts/ (you must use /home/docs/checkouts/
Installing setuptools, pip, wheel...done.
[rtd-command-info] start-time: 2018-11-17T13:23:57.987205Z, end-time: 2018-11-17T13:24:01.973146Z, duration: 3, exit-code: 0 python pip install --upgrade --cache-dir /home/docs/checkouts/ Pygments==2.2.0 setuptools<40 docutils==0.13.1 mock==1.0.1 pillow==2.6.1 alabaster>=0.7,<0.8,!=0.7.5 commonmark==0.5.4 recommonmark==0.4.0 sphinx<1.8 sphinx-rtd-theme<0.5 readthedocs-sphinx-ext<0.6 Requirement already up-to-date: Pygments==2.2.0 in /home/docs/checkouts/ Requirement already up-to-date: setuptools<40 in /home/docs/checkouts/ Requirement already up-to-date: docutils==0.13.1 in /home/docs/checkouts/ Requirement already up-to-date: mock==1.0.1 in /home/docs/checkouts/ Requirement already up-to-date: pillow==2.6.1 in /home/docs/checkouts/ Requirement already up-to-date: alabaster!=0.7.5,<0.8,>=0.7 in /home/docs/checkouts/ Requirement already up-to-date: commonmark==0.5.4 in /home/docs/checkouts/ Requirement already up-to-date: recommonmark==0.4.0 in /home/docs/checkouts/ Collecting sphinx<1.8 Using cached Collecting sphinx-rtd-theme<0.5 Using cached Requirement already up-to-date: readthedocs-sphinx-ext<0.6 in /home/docs/checkouts/ Requirement already up-to-date: packaging in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: imagesize in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: six>=1.5 in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: babel!=2.0,>=1.3 in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: snowballstemmer>=1.1 in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: requests>=2.0.0 in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: sphinxcontrib-websupport in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: Jinja2>=2.3 in /home/docs/checkouts/ (from sphinx<1.8) Requirement already up-to-date: pyparsing>=2.0.2 in /home/docs/checkouts/ (from packaging->sphinx<1.8) Requirement already up-to-date: 
pytz>=0a in /home/docs/checkouts/ (from babel!=2.0,>=1.3->sphinx<1.8)
Requirement already up-to-date: certifi>=2017.4.17 in /home/docs/checkouts/ (from requests>=2.0.0->sphinx<1.8)
Requirement already up-to-date: chardet<3.1.0,>=3.0.2 in /home/docs/checkouts/ (from requests>=2.0.0->sphinx<1.8)
Requirement already up-to-date: idna<2.8,>=2.5 in /home/docs/checkouts/ (from requests>=2.0.0->sphinx<1.8)
Requirement already up-to-date: urllib3<1.25,>=1.21.1 in /home/docs/checkouts/ (from requests>=2.0.0->sphinx<1.8)
Requirement already up-to-date: MarkupSafe>=0.23 in /home/docs/checkouts/ (from Jinja2>=2.3->sphinx<1.8)
Installing collected packages: sphinx, sphinx-rtd-theme
Found existing installation: Sphinx 1.5.6
Uninstalling Sphinx-1.5.6:
Successfully uninstalled Sphinx-1.5.6
Found existing installation: sphinx-rtd-theme 0.4.1
Uninstalling sphinx-rtd-theme-0.4.1:
Successfully uninstalled sphinx-rtd-theme-0.4.1
Successfully installed sphinx-1.7.9 sphinx-rtd-theme-0.4.2
You are using pip version 9.0.3, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[rtd-command-info] start-time: 2018-11-17T13:24:02.076070Z, end-time: 2018-11-17T13:24:05.072881Z, duration: 2, exit-code: 0 python pip install --exists-action=w --cache-dir /home/docs/checkouts/ -r docs/docs_requirements.txt Requirement already satisfied: cloudpickle==0.5.2 in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 1)) Requirement already satisfied: gym>=0.10.8 in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 2)) Requirement already satisfied: ipython in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 3)) Requirement already satisfied: joblib in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 4)) Requirement already satisfied: matplotlib in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 5)) Requirement already satisfied: numpy in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 6)) Requirement already satisfied: pandas in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 7)) Requirement already satisfied: pytest in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 8)) Requirement already satisfied: psutil in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 9)) Requirement already satisfied: scipy in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 10)) Requirement already satisfied: seaborn==0.8.1 in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 11)) Collecting sphinx==1.5.6 (from -r docs/docs_requirements.txt (line 12)) Using cached Requirement already satisfied: sphinx-autobuild==0.7.1 in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 13)) Collecting sphinx-rtd-theme==0.4.1 (from -r docs/docs_requirements.txt (line 14)) Using cached Requirement already satisfied: tensorflow>=1.8.0 in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 15)) Requirement already satisfied: tqdm in /home/docs/checkouts/ (from -r docs/docs_requirements.txt (line 16)) 
Requirement already satisfied: pyglet>=1.2.0 in /home/docs/checkouts/ (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: requests>=2.0 in /home/docs/checkouts/ (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: six in /home/docs/checkouts/ (from gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: pygments in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: pickleshare in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: traitlets>=4.2 in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: decorator in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: setuptools>=18.5 in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: prompt-toolkit<2.1.0,>=2.0.0 in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: pexpect; sys_platform != "win32" in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: jedi>=0.10 in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: backcall in /home/docs/checkouts/ (from ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/docs/checkouts/ (from matplotlib->-r docs/docs_requirements.txt (line 5)) Requirement already satisfied: cycler>=0.10 in /home/docs/checkouts/ (from matplotlib->-r docs/docs_requirements.txt (line 5)) Requirement already satisfied: python-dateutil>=2.1 in /home/docs/checkouts/ (from matplotlib->-r docs/docs_requirements.txt (line 5)) Requirement already satisfied: kiwisolver>=1.0.1 in 
/home/docs/checkouts/ (from matplotlib->-r docs/docs_requirements.txt (line 5)) Requirement already satisfied: pytz>=2011k in /home/docs/checkouts/ (from pandas->-r docs/docs_requirements.txt (line 7)) Requirement already satisfied: more-itertools>=4.0.0 in /home/docs/checkouts/ (from pytest->-r docs/docs_requirements.txt (line 8)) Requirement already satisfied: atomicwrites>=1.0 in /home/docs/checkouts/ (from pytest->-r docs/docs_requirements.txt (line 8)) Requirement already satisfied: pluggy>=0.7 in /home/docs/checkouts/ (from pytest->-r docs/docs_requirements.txt (line 8)) Requirement already satisfied: py>=1.5.0 in /home/docs/checkouts/ (from pytest->-r docs/docs_requirements.txt (line 8)) Requirement already satisfied: attrs>=17.4.0 in /home/docs/checkouts/ (from pytest->-r docs/docs_requirements.txt (line 8)) Requirement already satisfied: docutils>=0.11 in /home/docs/checkouts/ (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: Jinja2>=2.3 in /home/docs/checkouts/ (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: imagesize in /home/docs/checkouts/ (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: snowballstemmer>=1.1 in /home/docs/checkouts/ (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: babel!=2.0,>=1.3 in /home/docs/checkouts/ (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: alabaster<0.8,>=0.7 in /home/docs/checkouts/ (from sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: pathtools>=0.1.2 in /home/docs/checkouts/ (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: livereload>=2.3.0 in /home/docs/checkouts/ (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: tornado>=3.2 in /home/docs/checkouts/ (from 
sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: argh>=0.24.1 in /home/docs/checkouts/ (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: watchdog>=0.7.1 in /home/docs/checkouts/ (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: port-for==0.3.1 in /home/docs/checkouts/ (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: PyYAML>=3.10 in /home/docs/checkouts/ (from sphinx-autobuild==0.7.1->-r docs/docs_requirements.txt (line 13)) Requirement already satisfied: gast>=0.2.0 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: absl-py>=0.1.6 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: keras-preprocessing>=1.0.5 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: wheel>=0.26 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: keras-applications>=1.0.6 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: grpcio>=1.8.6 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: tensorboard<1.13.0,>=1.12.0 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: astor>=0.6.0 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: protobuf>=3.6.1 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: termcolor>=1.1.0 in /home/docs/checkouts/ (from tensorflow>=1.8.0->-r 
docs/docs_requirements.txt (line 15)) Requirement already satisfied: future in /home/docs/checkouts/ (from pyglet>=1.2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: certifi>=2017.4.17 in /home/docs/checkouts/ (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/docs/checkouts/ (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/docs/checkouts/ (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: idna<2.8,>=2.5 in /home/docs/checkouts/ (from requests>=2.0->gym>=0.10.8->-r docs/docs_requirements.txt (line 2)) Requirement already satisfied: ipython-genutils in /home/docs/checkouts/ (from traitlets>=4.2->ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: wcwidth in /home/docs/checkouts/ (from prompt-toolkit<2.1.0,>=2.0.0->ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: ptyprocess>=0.5 in /home/docs/checkouts/ (from pexpect; sys_platform != "win32"->ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: parso>=0.3.0 in /home/docs/checkouts/ (from jedi>=0.10->ipython->-r docs/docs_requirements.txt (line 3)) Requirement already satisfied: MarkupSafe>=0.23 in /home/docs/checkouts/ (from Jinja2>=2.3->sphinx==1.5.6->-r docs/docs_requirements.txt (line 12)) Requirement already satisfied: h5py in /home/docs/checkouts/ (from keras-applications>=1.0.6->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: werkzeug>=0.11.10 in /home/docs/checkouts/ (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 15)) Requirement already satisfied: markdown>=2.6.8 in /home/docs/checkouts/ (from tensorboard<1.13.0,>=1.12.0->tensorflow>=1.8.0->-r docs/docs_requirements.txt (line 
15)) Installing collected packages: sphinx, sphinx-rtd-theme Found existing installation: Sphinx 1.7.9 Uninstalling Sphinx-1.7.9: Successfully uninstalled Sphinx-1.7.9 Found existing installation: sphinx-rtd-theme 0.4.2 Uninstalling sphinx-rtd-theme-0.4.2: Successfully uninstalled sphinx-rtd-theme-0.4.2 Successfully installed sphinx-1.5.6 sphinx-rtd-theme-0.4.1 You are using pip version 9.0.3, however version 18.1 is available. You should consider upgrading via the 'pip install --upgrade pip' command. [rtd-command-info] start-time: 2018-11-17T13:24:05.544180Z, end-time: 2018-11-17T13:24:05.605300Z, duration: 0, exit-code: 0 cat docs/ #!/usr/bin/env python3 # -*- coding: utf-8 -*- # # Spinning Up documentation build configuration file, created by # sphinx-quickstart on Wed Aug 15 04:21:07 2018. # # This file is execfile()d with the current directory set to its # containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
#
import os
import sys

# Make sure spinup is accessible without going through
dirname = os.path.dirname
sys.path.insert(0, dirname(dirname(__file__)))

# Mock mpi4py to get around having to install it on RTD server (which fails)
from unittest.mock import MagicMock

class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        return MagicMock()

MOCK_MODULES = ['mpi4py']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)

# Finish imports
import spinup
from recommonmark.parser import CommonMarkParser

source_parsers = {
    '.md': CommonMarkParser,
}

# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.imgmath',
              'sphinx.ext.viewcode',
              'sphinx.ext.autodoc',
              'sphinx.ext.napoleon']

#'sphinx.ext.mathjax', ??

# imgmath settings
imgmath_image_format = 'svg'
imgmath_font_size = 14

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = ['.rst', '.md']
# source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = 'Spinning Up'
copyright = '2018, OpenAI'
author = 'Joshua Achiam'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ''
# The full version, including alpha/beta/rc tags.
release = ''

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'default' #'sphinx'

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = "sphinx_rtd_theme"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

html_logo = 'images/spinning-up-logo2.png'
html_theme_options = {
    'logo_only': True
}
#html_favicon = 'openai-favicon2_32x32.ico'
html_favicon = 'openai_icon.ico'

# -- Options for HTMLHelp output ------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'SpinningUpdoc'


# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

imgmath_latex_preamble = r'''
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{cancel}

\usepackage[verbose=true,letterpaper]{geometry}
\geometry{
    textheight=12in,
    textwidth=6.5in,
    top=1in,
    headheight=12pt,
    headsep=25pt,
    footskip=30pt
    }

\newcommand{\E}{{\mathrm E}}

\newcommand{\underE}[2]{\underset{\begin{subarray}{c}#1 \end{subarray}}{\E}\left[ #2 \right]}

\newcommand{\Epi}[1]{\underset{\begin{subarray}{c}\tau \sim \pi \end{subarray}}{\E}\left[ #1 \right]}
'''

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'SpinningUp.tex', 'Spinning Up Documentation',
     'Joshua Achiam', 'manual'),
]


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'spinningup', 'Spinning Up Documentation',
     [author], 1)
]


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'SpinningUp', 'Spinning Up Documentation',
     author, 'SpinningUp', 'One line description of project.',
     'Miscellaneous'),
]


def setup(app):
    app.add_stylesheet('css/modify.css')


###########################################################################
# auto-created specific configuration #
###########################################################################


#
# The following code was added during an automated build on
# It is auto created and injected for every build.
The result is based on the
# file found in the codebase:
#
#

import importlib
import sys
import os.path
from six import string_types

from sphinx import version_info

# Get suffix for proper linking to GitHub
# This is deprecated in Sphinx 1.3+,
# as each page can have its own suffix
if globals().get('source_suffix', False):
    if isinstance(source_suffix, string_types):
        SUFFIX = source_suffix
    else:
        SUFFIX = source_suffix[0]
else:
    SUFFIX = '.rst'

# Add RTD Static Path. Add to the end because it overwrites previous files.
if not 'html_static_path' in globals():
    html_static_path = []
if os.path.exists('_static'):
    html_static_path.append('_static')
html_static_path.append('/home/docs/checkouts/')

# Add RTD Theme only if they aren't overriding it already
using_rtd_theme = (
    (
        'html_theme' in globals() and
        html_theme in ['default'] and
        # Allow people to bail with a hack of having an html_style
        'html_style' not in globals()
    ) or 'html_theme' not in globals()
)
if using_rtd_theme:
    theme = importlib.import_module('sphinx_rtd_theme')
    html_theme = 'sphinx_rtd_theme'
    html_style = None
    html_theme_options = {}
    if 'html_theme_path' in globals():
        html_theme_path.append(theme.get_html_theme_path())
    else:
        html_theme_path = [theme.get_html_theme_path()]

if globals().get('websupport2_base_url', False):
    websupport2_base_url = ''
    websupport2_static_url = ''


#Add project information to the template context.
context = { 'using_theme': using_rtd_theme, 'html_theme': html_theme, 'current_version': "latest", 'version_slug': "latest", 'MEDIA_URL': "", 'STATIC_URL': "", 'PRODUCTION_DOMAIN': "", 'versions': [ ("latest", "/en/latest/"), ], 'downloads': [ ("pdf", "//"), ("htmlzip", "//"), ("epub", "//"), ], 'subprojects': [ ], 'slug': 'openai-education-spinningup', 'name': u'spinningup', 'rtd_language': u'en', 'programming_language': u'words', 'canonical_url': '', 'analytics_code': 'UA-129132782-1', 'single_version': False, 'conf_py_path': '/docs/', 'api_host': '', 'github_user': 'openai', 'github_repo': 'spinningup', 'github_version': 'master', 'display_github': True, 'bitbucket_user': 'None', 'bitbucket_repo': 'None', 'bitbucket_version': 'master', 'display_bitbucket': False, 'gitlab_user': 'None', 'gitlab_repo': 'None', 'gitlab_version': 'master', 'display_gitlab': False, 'READTHEDOCS': True, 'using_theme': (html_theme == "default"), 'new_theme': (html_theme == "sphinx_rtd_theme"), 'source_suffix': SUFFIX, 'ad_free': False, 'user_analytics_code': 'UA-129132782-1', 'global_analytics_code': 'UA-17997319-2', 'commit': '606d9352', } if 'html_context' in globals(): html_context.update(context) else: html_context = context # Add custom RTD extension if 'extensions' in globals(): # Insert at the beginning because it can interfere # with other extensions. # See extensions.insert(0, "readthedocs_ext.readthedocs") else: extensions = ["readthedocs_ext.readthedocs"] [rtd-command-info] start-time: 2018-11-17T13:24:05.681390Z, end-time: 2018-11-17T13:25:37.338265Z, duration: 91, exit-code: 0 python sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html Running Sphinx v1.5.6 making output directory... loading translations [en]... done building [mo]: targets for 0 po files that are out of date building [readthedocs]: targets for 31 source files that are out of date updating environment: 31 added, 0 changed, 0 removed reading sources... 
[ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/ WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/ WARNING: Line block ends without a blank line. /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/ WARNING: Inline strong start-string without end-string. 
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree done preparing documents... done writing output... [ 3%] algorithms/ddpg writing output... [ 6%] algorithms/ppo writing output... [ 9%] algorithms/sac writing output... [ 12%] algorithms/td3 writing output... [ 16%] algorithms/trpo writing output... [ 19%] algorithms/vpg writing output... [ 22%] etc/acknowledgements writing output... [ 25%] etc/author writing output... [ 29%] index writing output... [ 32%] spinningup/bench writing output... [ 35%] spinningup/exercise2_1_soln writing output... [ 38%] spinningup/exercise2_2_soln writing output... [ 41%] spinningup/exercises writing output... [ 45%] spinningup/extra_pg_proof1 writing output... [ 48%] spinningup/extra_pg_proof2 writing output... [ 51%] spinningup/keypapers writing output... [ 54%] spinningup/rl_intro writing output... [ 58%] spinningup/rl_intro2 writing output... [ 61%] spinningup/rl_intro3 writing output... [ 64%] spinningup/rl_intro4 writing output... [ 67%] spinningup/spinningup writing output... [ 70%] user/algorithms writing output... [ 74%] user/installation writing output... [ 77%] user/introduction writing output... [ 80%] user/plotting writing output... [ 83%] user/running writing output... [ 87%] user/saving_and_loading writing output... [ 90%] utils/logger writing output... [ 93%] utils/mpi writing output... [ 96%] utils/plotter writing output... 
[100%] utils/run_utils generating indices... genindex py-modindex highlighting module code... [ 10%] spinup.algos.ddpg.ddpg highlighting module code... [ 20%] spinup.algos.ppo.ppo highlighting module code... [ 30%] spinup.algos.sac.sac highlighting module code... [ 40%] spinup.algos.td3.td3 highlighting module code... [ 50%] spinup.algos.trpo.trpo highlighting module code... [ 60%] spinup.algos.vpg.vpg highlighting module code... [ 70%] spinup.utils.logx highlighting module code... [ 80%] spinup.utils.mpi_tools highlighting module code... [ 90%] spinup.utils.mpi_tf highlighting module code... [100%] spinup.utils.run_utils writing additional pages... search copying images... [ 10%] images/spinning-up-in-rl.png copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg copying images... [ 40%] spinningup/../images/bench/bench_walker.svg copying images... [ 50%] spinningup/../images/bench/bench_swim.svg copying images... [ 60%] spinningup/../images/bench/bench_ant.svg copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg copying static files... WARNING: html_static_path entry '/home/docs/checkouts/' does not exist WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done dumping search index in English (code: en) ... done dumping object inventory... done build succeeded, 16 warnings. [rtd-command-info] start-time: 2018-11-17T13:25:37.520689Z, end-time: 2018-11-17T13:26:54.486594Z, duration: 76, exit-code: 0 python sphinx-build -T -b readthedocssinglehtmllocalmedia -d _build/doctrees-readthedocssinglehtmllocalmedia -D language=en . _build/localmedia Running Sphinx v1.5.6 making output directory... loading translations [en]... 
done loading pickled environment... not yet created building [mo]: targets for 0 po files that are out of date building [readthedocssinglehtmllocalmedia]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... [ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/ WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/ WARNING: Line block ends without a blank line. /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". 
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/ WARNING: Inline strong start-string without end-string. /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree done preparing documents... /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree done assembling single document... user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author writing... done writing additional files... copying images... [ 12%] images/spinning-up-in-rl.png copying images... [ 25%] spinningup/../images/rl_diagram_transparent_bg.png copying images... [ 37%] spinningup/../images/rl_algorithms_9_15.svg copying images... [ 50%] spinningup/../images/bench/bench_halfcheetah.svg copying images... [ 62%] spinningup/../images/bench/bench_hopper.svg copying images... [ 75%] spinningup/../images/bench/bench_walker.svg copying images... [ 87%] spinningup/../images/bench/bench_swim.svg copying images... [100%] spinningup/../images/bench/bench_ant.svg copying static files... 
WARNING: html_static_path entry '/home/docs/checkouts/' does not exist WARNING: favicon file 'openai_icon.ico' does not exist done copying extra files... done dumping object inventory... done build succeeded, 16 warnings. [rtd-command-info] start-time: 2018-11-17T13:26:54.649728Z, end-time: 2018-11-17T13:27:00.162843Z, duration: 5, exit-code: 0 python sphinx-build -b latex -D language=en -d _build/doctrees . _build/latex Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... failed: source directory has changed building [mo]: targets for 0 po files that are out of date building [latex]: all documents updating environment: 31 added, 0 changed, 0 removed reading sources... [ 3%] algorithms/ddpg reading sources... [ 6%] algorithms/ppo reading sources... [ 9%] algorithms/sac reading sources... [ 12%] algorithms/td3 reading sources... [ 16%] algorithms/trpo reading sources... [ 19%] algorithms/vpg reading sources... [ 22%] etc/acknowledgements reading sources... [ 25%] etc/author reading sources... [ 29%] index reading sources... [ 32%] spinningup/bench reading sources... [ 35%] spinningup/exercise2_1_soln reading sources... [ 38%] spinningup/exercise2_2_soln reading sources... [ 41%] spinningup/exercises reading sources... [ 45%] spinningup/extra_pg_proof1 reading sources... [ 48%] spinningup/extra_pg_proof2 reading sources... [ 51%] spinningup/keypapers reading sources... [ 54%] spinningup/rl_intro reading sources... [ 58%] spinningup/rl_intro2 reading sources... [ 61%] spinningup/rl_intro3 reading sources... [ 64%] spinningup/rl_intro4 reading sources... [ 67%] spinningup/spinningup reading sources... [ 70%] user/algorithms reading sources... [ 74%] user/installation reading sources... [ 77%] user/introduction reading sources... [ 80%] user/plotting reading sources... [ 83%] user/running reading sources... [ 87%] user/saving_and_loading reading sources... [ 90%] utils/logger reading sources... 
[ 93%] utils/mpi reading sources... [ 96%] utils/plotter reading sources... [100%] utils/run_utils /home/docs/checkouts/ WARNING: Duplicate explicit target name: "solution available here.". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "on github". /home/docs/checkouts/ WARNING: Line block ends without a blank line. /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs". /home/docs/checkouts/ WARNING: Inline strong start-string without end-string. /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". /home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default". looking for now-outdated files... none found pickling environment... done checking consistency... /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree /home/docs/checkouts/ WARNING: document isn't included in any toctree done processing SpinningUp.tex... index user/introduction user/installation user/algorithms user/running user/saving_and_loading user/plotting spinningup/rl_intro spinningup/rl_intro2 spinningup/rl_intro3 spinningup/spinningup spinningup/keypapers spinningup/exercises spinningup/bench algorithms/vpg algorithms/trpo algorithms/ppo algorithms/ddpg algorithms/td3 algorithms/sac utils/logger utils/plotter utils/mpi utils/run_utils etc/acknowledgements etc/author resolving references... writing... done copying images... 
images/spinning-up-in-rl.png spinningup/../images/rl_diagram_transparent_bg.png spinningup/../images/rl_algorithms_9_15.svg spinningup/../images/bench/bench_halfcheetah.svg spinningup/../images/bench/bench_hopper.svg spinningup/../images/bench/bench_walker.svg spinningup/../images/bench/bench_swim.svg spinningup/../images/bench/bench_ant.svg copying TeX support files... done build succeeded, 14 warnings. [rtd-command-info] start-time: 2018-11-17T13:27:00.223309Z, end-time: 2018-11-17T13:27:01.372241Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/ This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/ heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx No file SpinningUp.aux. (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fonts/map/pdftex/updmap/}] [2] [1] [2] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) LaTeX Warning: Hyper reference `user/introduction:introduction' on page 3 undefined on input line 68. LaTeX Warning: Hyper reference `user/introduction:what-this-is' on page 3 undefined on input line 71.
LaTeX Warning: Hyper reference `user/introduction:why-we-built-this' on page 3 undefined on input line 74. LaTeX Warning: Hyper reference `user/introduction:how-this-serves-our-mission' on page 3 undefined on input line 77. LaTeX Warning: Hyper reference `user/introduction:code-design-philosophy' on page 3 undefined on input line 80. LaTeX Warning: Hyper reference `user/introduction:support-plan' on page 3 undefined on input line 83. [3] [4] [5] [6] Chapter 2. LaTeX Warning: Hyper reference `user/installation:installation' on page 7 undefined on input line 207. LaTeX Warning: Hyper reference `user/installation:installing-python' on page 7 undefined on input line 210. LaTeX Warning: Hyper reference `user/installation:installing-openmpi' on page 7 undefined on input line 213. LaTeX Warning: Hyper reference `user/installation:ubuntu' on page 7 undefined on input line 216. LaTeX Warning: Hyper reference `user/installation:mac-os-x' on page 7 undefined on input line 219. LaTeX Warning: Hyper reference `user/installation:installing-spinning-up' on page 7 undefined on input line 224. LaTeX Warning: Hyper reference `user/installation:check-your-install' on page 7 undefined on input line 227. LaTeX Warning: Hyper reference `user/installation:installing-mujoco-optional' on page 7 undefined on input line 230. [7] [8] [9] [10] Chapter 3. LaTeX Warning: Hyper reference `user/algorithms:algorithms' on page 11 undefined on input line 365. LaTeX Warning: Hyper reference `user/algorithms:what-s-included' on page 11 undefined on input line 368. LaTeX Warning: Hyper reference `user/algorithms:why-these-algorithms' on page 11 undefined on input line 371. LaTeX Warning: Hyper reference `user/algorithms:the-on-policy-algorithms' on page 11 undefined on input line 374. LaTeX Warning: Hyper reference `user/algorithms:the-off-policy-algorithms' on page 11 undefined on input line 377.
LaTeX Warning: Hyper reference `user/algorithms:code-format' on page 11 undefined on input line 382. LaTeX Warning: Hyper reference `user/algorithms:the-algorithm-file' on page 11 undefined on input line 385. LaTeX Warning: Hyper reference `user/algorithms:the-core-file' on page 11 undefined on input line 388. [11] [12] [13] [14] Chapter 4. LaTeX Warning: Hyper reference `user/running:running-experiments' on page 15 undefined on input line 534. LaTeX Warning: Hyper reference `user/running:launching-from-the-command-line' on page 15 undefined on input line 537. LaTeX Warning: Hyper reference `user/running:setting-hyperparameters-from-the-command-line' on page 15 undefined on input line 540. LaTeX Warning: Hyper reference `user/running:launching-multiple-experiments-at-once' on page 15 undefined on input line 543. LaTeX Warning: Hyper reference `user/running:special-flags' on page 15 undefined on input line 546. LaTeX Warning: Hyper reference `user/running:environment-flag' on page 15 undefined on input line 549. LaTeX Warning: Hyper reference `user/running:shortcut-flags' on page 15 undefined on input line 552. LaTeX Warning: Hyper reference `user/running:config-flags' on page 15 undefined on input line 555. LaTeX Warning: Hyper reference `user/running:where-results-are-saved' on page 15 undefined on input line 560. LaTeX Warning: Hyper reference `user/running:how-is-suffix-determined' on page 15 undefined on input line 563. LaTeX Warning: Hyper reference `user/running:extra' on page 15 undefined on input line 568. LaTeX Warning: Hyper reference `user/running:launching-from-scripts' on page 15 undefined on input line 573. LaTeX Warning: Hyper reference `user/running:using-experimentgrid' on page 15 undefined on input line 576. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. LaTeX Warning: Hyper reference `user/saving_and_loading:experiment-outputs' on page 21 undefined on input line 904.
LaTeX Warning: Hyper reference `user/saving_and_loading:algorithm-outputs' on page 21 undefined on input line 907. LaTeX Warning: Hyper reference `user/saving_and_loading:save-directory-location' on page 21 undefined on input line 910. LaTeX Warning: Hyper reference `user/saving_and_loading:loading-and-running-trained-policies' on page 21 undefined on input line 913. LaTeX Warning: Hyper reference `user/saving_and_loading:if-environment-saves-successfully' on page 21 undefined on input line 916. LaTeX Warning: Hyper reference `user/saving_and_loading:environment-not-found-error' on page 21 undefined on input line 919. LaTeX Warning: Hyper reference `user/saving_and_loading:using-trained-value-functions' on page 21 undefined on input line 922. [21] LaTeX Warning: Hyper reference `user/saving_and_loading:details-below' on page 22 undefined on input line 967. [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. LaTeX Warning: Hyper reference `spinningup/rl_intro:part-1-key-concepts-in-rl' on page 29 undefined on input line 1283. LaTeX Warning: Hyper reference `spinningup/rl_intro:what-can-rl-do' on page 29 undefined on input line 1286. LaTeX Warning: Hyper reference `spinningup/rl_intro:key-concepts-and-terminology' on page 29 undefined on input line 1289. LaTeX Warning: Hyper reference `spinningup/rl_intro:optional-formalism' on page 29 undefined on input line 1292. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1535 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1535 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1554 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence.
...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1554 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1561 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1561 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1568 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1568 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1575 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1575 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1589 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1589 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1622 \end{align*} ! Missing } inserted. } l.1622 \end{align*} ! Missing { inserted. { l.1622 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1622 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1622 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1622 \end{align*} ! Missing } inserted. } l.1622 \end{align*} ! Missing \endgroup inserted. \endgroup l.1622 \end{align*} ! 
Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1622 \end{align*} ! Missing { inserted. { l.1622 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1622 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1622 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1629 \end{align*} ! Undefined control sequence. V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1629 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1629 \end{align*} ! Undefined control sequence. V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1629 \end{align*} [36] [37] [38] Chapter 8. LaTeX Warning: Hyper reference `spinningup/rl_intro2:part-2-kinds-of-rl-algorithms' on page 39 undefined on input line 1682. LaTeX Warning: Hyper reference `spinningup/rl_intro2:a-taxonomy-of-rl-algorithms' on page 39 undefined on input line 1685. LaTeX Warning: Hyper reference `spinningup/rl_intro2:links-to-algorithms-in-taxonomy' on page 39 undefined on input line 1688. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1703 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1703 ...ncludegraphics{{rl_algorithms_9_15}.svg} LaTeX Warning: Hyper reference `spinningup/rl_intro2:citations-below' on page 39 undefined on input line 1704. [39] [40] [41] [42] Chapter 9. LaTeX Warning: Hyper reference `spinningup/rl_intro3:part-3-intro-to-policy-optimization' on page 43 undefined on input line 1845.
LaTeX Warning: Hyper reference `spinningup/rl_intro3:deriving-the-simplest-policy-gradient' on page 43 undefined on input line 1848. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-the-simplest-policy-gradient' on page 43 undefined on input line 1851. LaTeX Warning: Hyper reference `spinningup/rl_intro3:expected-grad-log-prob-lemma' on page 43 undefined on input line 1854. LaTeX Warning: Hyper reference `spinningup/rl_intro3:don-t-let-the-past-distract-you' on page 43 undefined on input line 1857. LaTeX Warning: Hyper reference `spinningup/rl_intro3:implementing-reward-to-go-policy-gradient' on page 43 undefined on input line 1860. LaTeX Warning: Hyper reference `spinningup/rl_intro3:baselines-in-policy-gradients' on page 43 undefined on input line 1863. LaTeX Warning: Hyper reference `spinningup/rl_intro3:other-forms-of-the-policy-gradient' on page 43 undefined on input line 1866. LaTeX Warning: Hyper reference `spinningup/rl_intro3:recap' on page 43 undefined on input line 1869. ! Undefined control sequence. l.1894 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}...
l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2086 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2086 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2103 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2103 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2111 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2111 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2119 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2119 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2171 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! 
Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2171 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2175 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2175 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2189 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2189 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2203 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2203 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. LaTeX Warning: Hyper reference `spinningup/spinningup:spinning-up-as-a-deep-rl-researcher' on page 53 undefined on input line 2257. LaTeX Warning: Hyper reference `spinningup/spinningup:the-right-background' on page 53 undefined on input line 2260. LaTeX Warning: Hyper reference `spinningup/spinningup:learn-by-doing' on page 53 undefined on input line 2263. LaTeX Warning: Hyper reference `spinningup/spinningup:developing-a-research-project' on page 53 undefined on input line 2266. LaTeX Warning: Hyper reference `spinningup/spinningup:doing-rigorous-research-in-rl' on page 53 undefined on input line 2269. LaTeX Warning: Hyper reference `spinningup/spinningup:closing-thoughts' on page 53 undefined on input line 2272. LaTeX Warning: Hyper reference `spinningup/spinningup:ps-other-resources' on page 53 undefined on input line 2275.
LaTeX Warning: Hyper reference `spinningup/spinningup:references' on page 53 undefined on input line 2278. [53] [54] [55] [56] [57] [58] Chapter 11. LaTeX Warning: Hyper reference `spinningup/keypapers:key-papers-in-deep-rl' on page 59 undefined on input line 2387. LaTeX Warning: Hyper reference `spinningup/keypapers:model-free-rl' on page 59 undefined on input line 2390. LaTeX Warning: Hyper reference `spinningup/keypapers:exploration' on page 59 undefined on input line 2393. LaTeX Warning: Hyper reference `spinningup/keypapers:transfer-and-multitask-rl' on page 59 undefined on input line 2396. LaTeX Warning: Hyper reference `spinningup/keypapers:hierarchy' on page 59 undefined on input line 2399. LaTeX Warning: Hyper reference `spinningup/keypapers:memory' on page 59 undefined on input line 2402. LaTeX Warning: Hyper reference `spinningup/keypapers:model-based-rl' on page 59 undefined on input line 2405. LaTeX Warning: Hyper reference `spinningup/keypapers:meta-rl' on page 59 undefined on input line 2408. LaTeX Warning: Hyper reference `spinningup/keypapers:scaling-rl' on page 59 undefined on input line 2411. LaTeX Warning: Hyper reference `spinningup/keypapers:rl-in-the-real-world' on page 59 undefined on input line 2414. LaTeX Warning: Hyper reference `spinningup/keypapers:safety' on page 59 undefined on input line 2417. LaTeX Warning: Hyper reference `spinningup/keypapers:imitation-learning-and-inverse-reinforcement-learning' on page 59 undefined on input line 2420. LaTeX Warning: Hyper reference `spinningup/keypapers:reproducibility-analysis-and-critique' on page 59 undefined on input line 2423. LaTeX Warning: Hyper reference `spinningup/keypapers:bonus-classic-papers-in-rl-theory-or-review' on page 59 undefined on input line 2426. [59] [60] Overfull \vbox (108.35579pt too high) has occurred while \output is active [61] [62] Chapter 12. LaTeX Warning: Hyper reference `spinningup/exercises:exercises' on page 63 undefined on input line 2515.
LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-1-basics-of-implementation' on page 63 undefined on input line 2518. LaTeX Warning: Hyper reference `spinningup/exercises:problem-set-2-algorithm-failure-modes' on page 63 undefined on input line 2521. LaTeX Warning: Hyper reference `spinningup/exercises:challenges' on page 63 undefined on input line 2524. [63] [64] [65] [66] Chapter 13. LaTeX Warning: Hyper reference `spinningup/bench:benchmarks-for-spinning-up-implementations' on page 67 undefined on input line 2663. LaTeX Warning: Hyper reference `spinningup/bench:performance-in-each-environment' on page 67 undefined on input line 2666. LaTeX Warning: Hyper reference `spinningup/bench:halfcheetah' on page 67 undefined on input line 2669. LaTeX Warning: Hyper reference `spinningup/bench:hopper' on page 67 undefined on input line 2672. LaTeX Warning: Hyper reference `spinningup/bench:walker' on page 67 undefined on input line 2675. LaTeX Warning: Hyper reference `spinningup/bench:swimmer' on page 67 undefined on input line 2678. LaTeX Warning: Hyper reference `spinningup/bench:ant' on page 67 undefined on input line 2681. LaTeX Warning: Hyper reference `spinningup/bench:experiment-details' on page 67 undefined on input line 2686. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2704 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2704 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2713 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ...
l.2713 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2722 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2722 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2731 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2731 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2740 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2740 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. LaTeX Warning: Hyper reference `algorithms/vpg:vanilla-policy-gradient' on page 71 undefined on input line 2763. LaTeX Warning: Hyper reference `algorithms/vpg:background' on page 71 undefined on input line 2766. LaTeX Warning: Hyper reference `algorithms/vpg:quick-facts' on page 71 undefine d on input line 2769. LaTeX Warning: Hyper reference `algorithms/vpg:key-equations' on page 71 undefi ned on input line 2772. LaTeX Warning: Hyper reference `algorithms/vpg:exploration-vs-exploitation' on page 71 undefined on input line 2775. LaTeX Warning: Hyper reference `algorithms/vpg:pseudocode' on page 71 undefined on input line 2778. LaTeX Warning: Hyper reference `algorithms/vpg:documentation' on page 71 undefi ned on input line 2783. 
LaTeX Warning: Hyper reference `algorithms/vpg:saved-model-contents' on page 71 undefined on input line 2786. LaTeX Warning: Hyper reference `algorithms/vpg:references' on page 71 undefined on input line 2791. LaTeX Warning: Hyper reference `algorithms/vpg:relevant-papers' on page 71 unde fined on input line 2794. LaTeX Warning: Hyper reference `algorithms/vpg:why-these-papers' on page 71 und efined on input line 2797. LaTeX Warning: Hyper reference `algorithms/vpg:other-public-implementations' on page 71 undefined on input line 2800. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2837 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2837 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2854 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2855 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2857 \begin{algorithmic} [1] ! Undefined control sequence. l.2858 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2859 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2860 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2861 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2862 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2863 \STATE Estimate policy gradient as ! Undefined control sequence. l.2867 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. 
l.2872 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2877 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2878 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2879 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2885--2885 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2885--2885 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. LaTeX Warning: Hyper reference `algorithms/trpo:trust-region-policy-optimizatio n' on page 77 undefined on input line 3085. LaTeX Warning: Hyper reference `algorithms/trpo:background' on page 77 undefine d on input line 3088. LaTeX Warning: Hyper reference `algorithms/trpo:quick-facts' on page 77 undefin ed on input line 3091. LaTeX Warning: Hyper reference `algorithms/trpo:key-equations' on page 77 undef ined on input line 3094. LaTeX Warning: Hyper reference `algorithms/trpo:exploration-vs-exploitation' on page 77 undefined on input line 3097. LaTeX Warning: Hyper reference `algorithms/trpo:pseudocode' on page 77 undefine d on input line 3100. LaTeX Warning: Hyper reference `algorithms/trpo:documentation' on page 77 undef ined on input line 3105. LaTeX Warning: Hyper reference `algorithms/trpo:saved-model-contents' on page 7 7 undefined on input line 3108. LaTeX Warning: Hyper reference `algorithms/trpo:references' on page 77 undefine d on input line 3113. 
LaTeX Warning: Hyper reference `algorithms/trpo:relevant-papers' on page 77 und efined on input line 3116. LaTeX Warning: Hyper reference `algorithms/trpo:why-these-papers' on page 77 un defined on input line 3119. LaTeX Warning: Hyper reference `algorithms/trpo:other-public-implementations' o n page 77 undefined on input line 3122. [77] ! Undefined control sequence. L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3166 },\end{split} ! Undefined control sequence. L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3166 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3172 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3172 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3221 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3222 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3224 \begin{algorithmic} [1] ! Undefined control sequence. l.3225 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3226 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3227 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3228 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3229 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3230 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3231 \STATE Estimate policy gradient as ! 
Undefined control sequence. l.3235 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3240 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3245 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3250 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3251 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3252 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. 
LaTeX Warning: Hyper reference `algorithms/ppo:proximal-policy-optimization' on page 85 undefined on input line 3532. LaTeX Warning: Hyper reference `algorithms/ppo:background' on page 85 undefined on input line 3535. LaTeX Warning: Hyper reference `algorithms/ppo:quick-facts' on page 85 undefined on input line 3538. LaTeX Warning: Hyper reference `algorithms/ppo:key-equations' on page 85 undefined on input line 3541. LaTeX Warning: Hyper reference `algorithms/ppo:exploration-vs-exploitation' on page 85 undefined on input line 3544. LaTeX Warning: Hyper reference `algorithms/ppo:pseudocode' on page 85 undefined on input line 3547. LaTeX Warning: Hyper reference `algorithms/ppo:documentation' on page 85 undefined on input line 3552. LaTeX Warning: Hyper reference `algorithms/ppo:saved-model-contents' on page 85 undefined on input line 3555. LaTeX Warning: Hyper reference `algorithms/ppo:references' on page 85 undefined on input line 3560. LaTeX Warning: Hyper reference `algorithms/ppo:relevant-papers' on page 85 undefined on input line 3563. LaTeX Warning: Hyper reference `algorithms/ppo:why-these-papers' on page 85 undefined on input line 3566. LaTeX Warning: Hyper reference `algorithms/ppo:other-public-implementations' on page 85 undefined on input line 3569. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3678 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3679 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3681 \begin{algorithmic} [1] ! Undefined control sequence. l.3682 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3683 \FOR {$k = 0,1,2,...$} !
Undefined control sequence. l.3684 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3685 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3686 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3687 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3695 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3700 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3701 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3702 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3708--3708 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. LaTeX Warning: Hyper reference `algorithms/ddpg:deep-deterministic-policy-gradi ent' on page 91 undefined on input line 3928. LaTeX Warning: Hyper reference `algorithms/ddpg:background' on page 91 undefine d on input line 3931. LaTeX Warning: Hyper reference `algorithms/ddpg:quick-facts' on page 91 undefin ed on input line 3934. LaTeX Warning: Hyper reference `algorithms/ddpg:key-equations' on page 91 undef ined on input line 3937. LaTeX Warning: Hyper reference `algorithms/ddpg:the-q-learning-side-of-ddpg' on page 91 undefined on input line 3940. LaTeX Warning: Hyper reference `algorithms/ddpg:the-policy-learning-side-of-ddp g' on page 91 undefined on input line 3943. LaTeX Warning: Hyper reference `algorithms/ddpg:exploration-vs-exploitation' on page 91 undefined on input line 3948. 
LaTeX Warning: Hyper reference `algorithms/ddpg:pseudocode' on page 91 undefine d on input line 3951. LaTeX Warning: Hyper reference `algorithms/ddpg:documentation' on page 91 undef ined on input line 3956. LaTeX Warning: Hyper reference `algorithms/ddpg:saved-model-contents' on page 9 1 undefined on input line 3959. LaTeX Warning: Hyper reference `algorithms/ddpg:references' on page 91 undefine d on input line 3964. LaTeX Warning: Hyper reference `algorithms/ddpg:relevant-papers' on page 91 und efined on input line 3967. LaTeX Warning: Hyper reference `algorithms/ddpg:why-these-papers' on page 91 un defined on input line 3970. LaTeX Warning: Hyper reference `algorithms/ddpg:other-public-implementations' o n page 91 undefined on input line 3973. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4092 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4093 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4095 \begin{algorithmic} [1] ! Undefined control sequence. l.4096 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4097 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4098 \REPEAT ! Undefined control sequence. l.4099 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4100 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4101 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4102 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. 
l.4103 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4104 \IF {it's time to update} ! Undefined control sequence. l.4105 \FOR {however many updates} ! Undefined control sequence. l.4106 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4107 \STATE Compute targets ! Undefined control sequence. l.4111 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4115 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4119 \STATE Update target networks with ! Undefined control sequence. l.4124 \ENDFOR ! Undefined control sequence. l.4125 \ENDIF ! Undefined control sequence. l.4126 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4127 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4128 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4134--4134 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4134--4134 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. LaTeX Warning: Hyper reference `algorithms/td3:twin-delayed-ddpg' on page 97 un defined on input line 4349. LaTeX Warning: Hyper reference `algorithms/td3:background' on page 97 undefined on input line 4352. LaTeX Warning: Hyper reference `algorithms/td3:quick-facts' on page 97 undefine d on input line 4355. 
LaTeX Warning: Hyper reference `algorithms/td3:key-equations' on page 97 undefined on input line 4358. LaTeX Warning: Hyper reference `algorithms/td3:exploration-vs-exploitation' on page 97 undefined on input line 4361. LaTeX Warning: Hyper reference `algorithms/td3:pseudocode' on page 97 undefined on input line 4364. LaTeX Warning: Hyper reference `algorithms/td3:documentation' on page 97 undefined on input line 4369. LaTeX Warning: Hyper reference `algorithms/td3:saved-model-contents' on page 97 undefined on input line 4372. LaTeX Warning: Hyper reference `algorithms/td3:references' on page 97 undefined on input line 4377. LaTeX Warning: Hyper reference `algorithms/td3:relevant-papers' on page 97 undefined on input line 4380. LaTeX Warning: Hyper reference `algorithms/td3:other-public-implementations' on page 97 undefined on input line 4383. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4444 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4444 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4468 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4469 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4471 \begin{algorithmic} [1] ! Undefined control sequence. l.4472 \STATE Input: initial policy parameters $\theta$, Q-function para... !
Undefined control sequence. l.4473 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4474 \REPEAT ! Undefined control sequence. l.4475 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4476 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4477 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4478 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4479 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4480 \IF {it's time to update} ! Undefined control sequence. l.4481 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4482 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4483 \STATE Compute target actions ! Undefined control sequence. l.4487 \STATE Compute targets ! Undefined control sequence. l.4491 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4495 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4496 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4500 \STATE Update target networks with ! Undefined control sequence. l.4505 \ENDIF ! Undefined control sequence. l.4506 \ENDFOR ! Undefined control sequence. l.4507 \ENDIF ! Undefined control sequence. l.4508 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4509 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4510 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4516--4516 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4516--4516 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. LaTeX Warning: Hyper reference `algorithms/sac:soft-actor-critic' on page 103 u ndefined on input line 4742. LaTeX Warning: Hyper reference `algorithms/sac:background' on page 103 undefine d on input line 4745. LaTeX Warning: Hyper reference `algorithms/sac:quick-facts' on page 103 undefin ed on input line 4748. LaTeX Warning: Hyper reference `algorithms/sac:key-equations' on page 103 undef ined on input line 4751. LaTeX Warning: Hyper reference `algorithms/sac:entropy-regularized-reinforcemen t-learning' on page 103 undefined on input line 4754. LaTeX Warning: Hyper reference `algorithms/sac:id1' on page 103 undefined on in put line 4757. LaTeX Warning: Hyper reference `algorithms/sac:exploration-vs-exploitation' on page 103 undefined on input line 4762. LaTeX Warning: Hyper reference `algorithms/sac:pseudocode' on page 103 undefine d on input line 4765. LaTeX Warning: Hyper reference `algorithms/sac:documentation' on page 103 undef ined on input line 4770. LaTeX Warning: Hyper reference `algorithms/sac:saved-model-contents' on page 10 3 undefined on input line 4773. LaTeX Warning: Hyper reference `algorithms/sac:references' on page 103 undefine d on input line 4778. LaTeX Warning: Hyper reference `algorithms/sac:relevant-papers' on page 103 und efined on input line 4781. 
LaTeX Warning: Hyper reference `algorithms/sac:other-public-implementations' on page 103 undefined on input line 4784. [103] ! Undefined control sequence. \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4831 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4831 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4835 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4835 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4843 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4843 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4847 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4847 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. 
...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence.}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence.}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4889 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4889 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4928 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4929 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4931 \begin{algorithmic} [1] ! Undefined control sequence. l.4932 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. 
l.4933 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4934 \REPEAT ! Undefined control sequence. l.4935 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4936 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4937 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4938 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4939 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4940 \IF {it's time to update} ! Undefined control sequence. l.4941 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4942 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4943 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4948 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4952 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4956 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4961 \STATE Update target value network with ! Undefined control sequence. l.4965 \ENDFOR ! Undefined control sequence. l.4966 \ENDIF ! Undefined control sequence. l.4967 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4968 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4969 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4975--4975 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4975--4975 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. LaTeX Warning: Hyper reference `utils/logger:logger' on page 111 undefined on input line 5239. LaTeX Warning: Hyper reference `utils/logger:using-a-logger' on page 111 undefined on input line 5242. LaTeX Warning: Hyper reference `utils/logger:examples' on page 111 undefined on input line 5245. LaTeX Warning: Hyper reference `utils/logger:logging-and-mpi' on page 111 undefined on input line 5248. LaTeX Warning: Hyper reference `utils/logger:logger-classes' on page 111 undefined on input line 5253. LaTeX Warning: Hyper reference `utils/logger:loading-saved-graphs' on page 111 undefined on input line 5256. [111] [112] [113] [114] LaTeX Warning: Hyper reference `utils/logger:spinup.utils.logx.Logger' on page 115 undefined on input line 5554. [115] [116] Chapter 21. [117] [118] Chapter 22. LaTeX Warning: Hyper reference `utils/mpi:mpi-tools' on page 119 undefined on input line 5699. LaTeX Warning: Hyper reference `utils/mpi:module-spinup.utils.mpi_tools' on page 119 undefined on input line 5702. LaTeX Warning: Hyper reference `utils/mpi:mpi-tensorflow-utilities' on page 119 undefined on input line 5705. [119] [120] Chapter 23. LaTeX Warning: Hyper reference `utils/run_utils:run-utils' on page 121 undefined on input line 5832.
LaTeX Warning: Hyper reference `utils/run_utils:experimentgrid' on page 121 und efined on input line 5835. LaTeX Warning: Hyper reference `utils/run_utils:calling-experiments' on page 12 1 undefined on input line 5838. [121] Underfull \hbox (badness 10000) in paragraph at lines 5990--5990 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. [129] [130] LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tf' on page 131 und efined on input line 6123. LaTeX Warning: Reference `utils/mpi:module-spinup.utils.mpi_tools' on page 131 undefined on input line 6124. [131] No file SpinningUp.ind. (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were undefined references. LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (135 pages, 1118694 bytes). Transcript written on SpinningUp.log. [rtd-command-info] start-time: 2018-11-17T13:27:01.449098Z, end-time: 2018-11-17T13:27:01.579067Z, duration: 0, exit-code: 0 makeindex -s SpinningUp.idx This is makeindex, version 2.15 [TeX Live 2015] (kpathsea + Thai support). Scanning style file ./ (7 attributes redefined, 0 ignored). Scanning input file SpinningUp.idx....done (78 entries accepted, 0 rejected). Sorting entries....done (506 comparisons). Generating output file SpinningUp.ind....done (144 lines written, 0 warnings). Output written in SpinningUp.ind. Transcript written in SpinningUp.ilg. 
[rtd-command-info] start-time: 2018-11-17T13:27:01.642465Z, end-time: 2018-11-17T13:27:02.841420Z, duration: 1, exit-code: 0 pdflatex -interaction=nonstopmode /home/docs/checkouts/ This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (/home/docs/checkouts/ heckouts/latest/docs/_build/latex/SpinningUp.tex LaTeX2e <2016/02/01> Babel <3.9q> and hyphenation patterns for 81 language(s) loaded. (./sphinxmanual.cls Document Class: sphinxmanual 2017/03/26 v1.5.4 Document class (Sphinx manual) (/usr/share/texlive/texmf-dist/tex/latex/base/report.cls Document Class: report 2014/09/29 v1.4h Standard LaTeX document class (/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu) (/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/cmap/cmap.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty (/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def)<>) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty For additional information on amsmath, use the `?' option. 
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty)) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amssymb.sty (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/amsfonts.sty)) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def))) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/times.sty) (/usr/share/texlive/texmf-dist/tex/latex/fncychap/fncychap.sty) (/usr/share/texlive/texmf-dist/tex/latex/tools/longtable.sty) (./sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphicx.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty) (/usr/share/texlive/texmf-dist/tex/latex/graphics/graphics.sty (/usr/share/texlive/texmf-dist/tex/latex/graphics/trig.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/graphics.cfg) (/usr/share/texlive/texmf-dist/tex/latex/pdftex-def/pdftex.def (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/infwarerr.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ltxcmds.sty)))) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def (/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu))) (/usr/share/texlive/texmf-dist/tex/latex/titlesec/titlesec.sty) (/usr/share/texlive/texmf-dist/tex/latex/etoolbox/etoolbox.sty) Package Sphinx Info: **** titlesec 2.10.1 successfully patched for bugfix **** (/usr/share/texlive/texmf-dist/tex/latex/tabulary/tabulary.sty (/usr/share/texlive/texmf-dist/tex/latex/tools/array.sty)) (/usr/share/texlive/texmf-dist/tex/latex/base/makeidx.sty) (/usr/share/texlive/texmf-dist/tex/latex/framed/framed.sty) 
(/usr/share/texlive/texmf-dist/tex/latex/xcolor/xcolor.sty (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg)) (/usr/share/texlive/texmf-dist/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.7a, with DG/SPQR fixes, and firstline=lastline fix <2008/02/07> (tvz)) (/usr/share/texlive/texmf-dist/tex/latex/threeparttable/threeparttable.sty) (./footnotehyper-sphinx.sty (/usr/share/texlive/texmf-dist/tex/latex/mdwtools/footnote.sty)) (/usr/share/texlive/texmf-dist/tex/latex/float/float.sty) (/usr/share/texlive/texmf-dist/tex/latex/wrapfig/wrapfig.sty) (/usr/share/texlive/texmf-dist/tex/latex/parskip/parskip.sty) (/usr/share/texlive/texmf-dist/tex/latex/base/alltt.sty) (/usr/share/texlive/texmf-dist/tex/latex/upquote/upquote.sty) (/usr/share/texlive/texmf-dist/tex/latex/capt-of/capt-of.sty) (./needspace.sty) (./sphinxhighlight.sty) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/kvsetkeys.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/etexcmds.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifluatex.sty)))) (/usr/share/texlive/texmf-dist/tex/generic/pdftex/pdfcolor.tex) ** (sphinx) defining (legacy) text style macros without \sphinx prefix ** if clashes with packages, set latex_keep_old_macro_names=False in ) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)) (/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty) (/usr/share/texlive/texmf-dist/tex/latex/eqparbox/eqparbox.sty (/usr/share/texlive/texmf-dist/tex/latex/environ/environ.sty (/usr/share/texlive/texmf-dist/tex/latex/trimspaces/trimspaces.sty))) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty 
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg) (/usr/share/texlive/texmf-dist/tex/latex/hyperref/puenc.def) (/usr/share/texlive/texmf-dist/tex/latex/url/url.sty)) Package hyperref Message: Driver (autodetected): hpdftex. (/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty)) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty) Writing index file SpinningUp.idx (./SpinningUp.aux LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. LaTeX Warning: Label `alg1' multiply defined. ) (/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1ptm.fd) (/usr/share/texlive/texmf-dist/tex/context/base/supp-pdf.mkii [Loading MPS to PDF converter (version 2006.09.02).] ) (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty (/usr/share/texlive/texmf-dist/tex/latex/oberdiek/grfext.sty) (/usr/share/texlive/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) *geometry* driver: auto-detecting *geometry* detected driver: pdftex (/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty)) (./SpinningUp.out) (./SpinningUp.out) (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1phv.fd)<><><><> (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsa.fd) (/usr/share/texlive/texmf-dist/tex/latex/amsfonts/umsb.fd) [1{/var/lib/texmf/fo nts/map/pdftex/updmap/}] [2] (./SpinningUp.toc [1] [2]) [3] [4] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/t1pcr.fd) [1 <./spinning-up-in-rl.png>] [2] Chapter 1. 
(/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1ptm.fd) [3] [4] [5] [6] Chapter 2. [7] [8] [9] [10] Chapter 3. [11] [12] [13] [14] Chapter 4. [15] [16] [17] [18] (/usr/share/texlive/texmf-dist/tex/latex/psnfss/ts1pcr.fd) [19] [20] Chapter 5. [21] [22] [23] [24] [25] [26] Chapter 6. [27] [28] Chapter 7. [29] [30 <./rl_diagram_transparent_bg.png>] [31] [32] [33] ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1535 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ... } P(\tau |\pi ) R(\tau ) = \underE {\tau \sim \pi }{R(\tau )}... l.1535 ...nderE{\tau\sim \pi}{R(\tau)}.\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1554 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{R(\tau )\... l.1554 ...R(\tau)\left| s_0 = s\right.}\end{split} [34] ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1561 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{R(\tau )\... l.1561 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1568 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...split}V^*(s) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1568 ...R(\tau)\left| s_0 = s\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1575 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...lit}Q^*(s,a) = \max _{\pi } \underE {\tau \sim \pi }{R(\tau )\... l.1575 ...eft| s_0 = s, a_0 = a\right.}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... 
l.1589 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a\sim \pi }{Q^{\pi }(s,a)... l.1589 ...erE{a\sim \pi}{Q^{\pi}(s,a)},\end{split} [35] ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1622 \end{align*} ! Missing } inserted. } l.1622 \end{align*} ! Missing { inserted. { l.1622 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1622 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1622 \end{align*} ! Undefined control sequence. V^{\pi }(s) &= \underE {a \sim \pi \\ s'\sim P}{r(s,a) + \gamma ... l.1622 \end{align*} ! Missing } inserted. } l.1622 \end{align*} ! Missing \endgroup inserted. \endgroup l.1622 \end{align*} ! Misplaced \omit. \math@cr@@@ ...@ \@ne \add@amps \maxfields@ \omit \kern -\alignsep@ \iftag@ ... l.1622 \end{align*} ! Missing { inserted. { l.1622 \end{align*} ! Undefined control sequence. ...}(s')}, \\ Q^{\pi }(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1622 \end{align*} ! Undefined control sequence. ... {s'\sim P}{r(s,a) + \gamma \underE {a'\sim \pi }{Q^{\pi }(s',... l.1622 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1629 \end{align*} ! Undefined control sequence. V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1629 \end{align*} ! Undefined control sequence. V^*(s) &= \max _a \underE {s'\sim P}{r(s,a) + \gamma V^*(s')}, \... l.1629 \end{align*} ! Undefined control sequence. V^*(s')}, \\ Q^*(s,a) &= \underE {s'\sim P}{r(s,a) + \gamma... l.1629 \end{align*} [36] [37] [38] Chapter 8. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1703 ...ncludegraphics{{rl_algorithms_9_15}.svg} ! LaTeX Error: Unknown graphics extension: .svg. 
See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.1703 ...ncludegraphics{{rl_algorithms_9_15}.svg} [39] [40] [41] [42] Chapter 9. ! Undefined control sequence. l.1894 ...ected return \(J(\pi_{\theta}) = \underE {\tau \sim \pi_{\theta}}{R... [43] ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} \log P(\tau | \theta ) &= \cancel {\nabla _{\theta } \log \r... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...} + \sum _{t=0}^{T} \bigg ( \cancel {\nabla _{\theta } \log P(... l.1925 ... \log \pi_{\theta}(a_t |s_t).\end{split} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...eta }) &= \nabla _{\theta } \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...Log-derivative trick} \\ &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} ! Undefined control sequence. ...heta } J(\pi _{\theta }) &= \underE {\tau \sim \pi _{\theta }}... l.1937 \end{align*} \end{sphinxadmonition} [44] [45] [46] [47] ! Undefined control sequence. \split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2086 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. 
\split@tag \begin {split}\underE {x \sim P_{\theta }}{\nabla _{\t... l.2086 ...eta} \log P_{\theta}(x)} = 0.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2103 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...eta }(x) \\ \therefore 0 &= \underE {x \sim P_{\theta }}{\nabl... l.2103 ...{\theta} \log P_{\theta}(x)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2111 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2111 ..._{\theta}(a_t |s_t) R(\tau)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2119 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2119 ...R(s_{t'}, a_{t'}, s_{t'+1})}.\end{split} [48] ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2171 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a_t \sim \pi _{\theta }}{\nabla... l.2171 ...\theta}(a_t|s_t) b(s_t)} = 0.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2175 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2175 ..., s_{t'+1}) - b(s_t)\right)}.\end{split} [49] ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2189 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. ...phi _k = \arg \min _{\phi } \underE {s_t, \hat {R}_t \sim \pi ... l.2189 ...(s_t) - \hat{R}_t \right)^2},\end{split} ! Undefined control sequence. 
...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2203 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2203 ...i_{\theta}(a_t |s_t) \Phi_t},\end{split} [50] [51] [52] Chapter 10. [53] [54] [55] [56] [57] [58] Chapter 11. [59] [60] Overfull \vbox (108.35579pt too high) has occurred while \output is active [61] [62] Chapter 12. [63] [64] [65] [66] Chapter 13. ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2704 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2704 ...includegraphics{{bench_halfcheetah}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2713 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2713 ...phinxincludegraphics{{bench_hopper}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2722 ...phinxincludegraphics{{bench_walker}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2722 ...phinxincludegraphics{{bench_walker}.svg} [67] ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2731 ...\sphinxincludegraphics{{bench_swim}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2731 ...\sphinxincludegraphics{{bench_swim}.svg} ! 
LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2740 ...t\sphinxincludegraphics{{bench_ant}.svg} ! LaTeX Error: Unknown graphics extension: .svg. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2740 ...t\sphinxincludegraphics{{bench_ant}.svg} [68] [69] [70] Chapter 14. [71] ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2837 },\end{split} ! Undefined control sequence. ...theta } J(\pi _{\theta }) = \underE {\tau \sim \pi _{\theta }}... l.2837 },\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2854 ...rithms/vpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2855 \caption {Vanilla Policy Gradient Algorithm} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2857 \begin{algorithmic} [1] ! Undefined control sequence. l.2858 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.2859 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.2860 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.2861 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.2862 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.2863 \STATE Estimate policy gradient as ! Undefined control sequence. l.2867 \STATE Compute policy update, either using standard gradient ascent, ! Undefined control sequence. l.2872 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.2877 \ENDFOR ! 
LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2878 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.2879 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 2885--2885 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 2885--2885 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 pi_l r=0.0003\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , [72] [73] [74] [75] [76] Chapter 15. [77] ! Undefined control sequence. L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3166 },\end{split} ! Undefined control sequence. L}(\theta _k, \theta ) = \underE {s,a \sim \pi _{\theta _k}... l.3166 },\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3172 }.\end{split} ! Undefined control sequence. ...{KL}(\theta || \theta _k) = \underE {s \sim \pi _{\theta _k}}{... l.3172 }.\end{split} [78] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3221 ...ithms/trpo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3222 \caption {Trust Region Policy Optimization} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3224 \begin{algorithmic} [1] ! Undefined control sequence. 
l.3225 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3226 \STATE Hyperparameters: KL-divergence limit $\delta$, backtrackin... ! Undefined control sequence. l.3227 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. l.3228 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3229 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3230 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3231 \STATE Estimate policy gradient as ! Undefined control sequence. l.3235 \STATE Use the conjugate gradient algorithm to compute ! Undefined control sequence. l.3240 \STATE Update the policy by backtracking line search with [79] ! Undefined control sequence. l.3245 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3250 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3251 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.3252 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 \T1/ptm/m/it/10 steps_per_epoch=4000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 epochs=50\ T1/ptm/m/n/10 , \T1/ptm/m/it/10 gamma=0.99\T1/ptm/m/n/10 , \T1/ptm/m/it/10 delt a=0.01\T1/ptm/m/n/10 , \T1/ptm/m/it/10 vf_lr=0.001\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 \T1/ptm/m/it/10 train_v_iters=80\T1/ptm/m/n/10 , \T1/ptm/m/it/10 damp-ing_coeff =0.1\T1/ptm/m/n/10 , \T1/ptm/m/it/10 cg_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/1 0 back-track_iters=10\T1/ptm/m/n/10 , \T1/ptm/m/it/10 back- Underfull \hbox (badness 10000) in paragraph at lines 3258--3258 \T1/ptm/m/it/10 track_coeff=0.8\T1/ptm/m/n/10 , \T1/ptm/m/it/10 lam=0.97\T1/ptm /m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 log-g er_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 save_freq=10\T1/ptm/m/n/10 , [80] Overfull \vbox (100.02797pt too high) has occurred while \output is active [81] [82] [83] [84] Chapter 16. [85] [86] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3678 ...rithms/ppo:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3679 \caption {PPO-Clip} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3681 \begin{algorithmic} [1] ! Undefined control sequence. l.3682 \STATE Input: initial policy parameters $\theta_0$, initial value... ! Undefined control sequence. l.3683 \FOR {$k = 0,1,2,...$} ! Undefined control sequence. 
l.3684 \STATE Collect set of trajectories ${\mathcal D}_k = \{\tau_i\}$ ... ! Undefined control sequence. l.3685 \STATE Compute rewards-to-go $\hat{R}_t$. ! Undefined control sequence. l.3686 \STATE Compute advantage estimates, $\hat{A}_t$ (using any method... ! Undefined control sequence. l.3687 \STATE Update the policy by maximizing the PPO-Clip objective: ! Undefined control sequence. l.3695 \STATE Fit value function by regression on mean-squared error: ! Undefined control sequence. l.3700 \ENDFOR ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3701 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.3702 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 3708--3708 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , [87] [88] [89] [90] Chapter 17. [91] [92] [93] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4092 ...ithms/ddpg:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4093 \caption {Deep Deterministic Policy Gradient} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4095 \begin{algorithmic} [1] ! Undefined control sequence. l.4096 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4097 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4098 \REPEAT ! Undefined control sequence. 
l.4099 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. l.4100 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4101 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4102 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4103 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4104 \IF {it's time to update} ! Undefined control sequence. l.4105 \FOR {however many updates} ! Undefined control sequence. l.4106 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4107 \STATE Compute targets ! Undefined control sequence. l.4111 \STATE Update Q-function by one step of gradient desc... ! Undefined control sequence. l.4115 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4119 \STATE Update target networks with ! Undefined control sequence. l.4124 \ENDFOR ! Undefined control sequence. l.4125 \ENDIF ! Undefined control sequence. l.4126 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4127 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4128 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4134--4134 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4134--4134 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , [94] [95] [96] Chapter 18. [97] ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 },\end{split} ! Undefined control sequence. ...}L(\phi _1, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4440 },\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4444 }.\end{split} ! Undefined control sequence. ...}L(\phi _2, {\mathcal D}) = \underE {(s,a,r,s',d) \sim {\mathc... l.4444 }.\end{split} [98] ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4468 ...rithms/td3:pseudocode}}\begin{algorithm} [H] ! LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4469 \caption {Twin Delayed DDPG} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4471 \begin{algorithmic} [1] ! Undefined control sequence. l.4472 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4473 \STATE Set target parameters equal to main parameters $\theta_{\t... ! Undefined control sequence. l.4474 \REPEAT ! Undefined control sequence. l.4475 \STATE Observe state $s$ and select action $a = \text{clip}(\... ! Undefined control sequence. 
l.4476 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4477 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4478 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4479 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4480 \IF {it's time to update} ! Undefined control sequence. l.4481 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4482 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4483 \STATE Compute target actions ! Undefined control sequence. l.4487 \STATE Compute targets ! Undefined control sequence. l.4491 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4495 \IF { $j \mod$ \texttt{policy\_delay} $ = 0$} ! Undefined control sequence. l.4496 \STATE Update policy by one step of gradient asce... ! Undefined control sequence. l.4500 \STATE Update target networks with ! Undefined control sequence. l.4505 \ENDIF ! Undefined control sequence. l.4506 \ENDFOR ! Undefined control sequence. l.4507 \ENDIF ! Undefined control sequence. l.4508 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4509 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... 
l.4510 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4516--4516 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4516--4516 \T1/ptm/m/it/10 pi_lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 q_lr=0.001\T1/ptm/m /n/10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_st eps=10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 act_noise=0.1\T1/ptm/m/n/10 , \T1/ptm /m/it/10 tar- [99] [100] [101] [102] Chapter 19. [103] ! Undefined control sequence. \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4831 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. \begin {split}H(P) = \underE {x \sim P}{-\log P(x)}.\en... l.4831 ...underE{x \sim P}{-\log P(x)}.\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4835 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...}\pi ^* = \arg \max _{\pi } \underE {\tau \sim \pi }{ \sum _{t... l.4835 ...pi(\cdot|s_t)\right) \bigg)},\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {\tau \sim \pi }{ \left . ... l.4839 ...ight) \bigg) \right| s_0 = s}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4843 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...egin {split}Q^{\pi }(s,a) = \underE {\tau \sim \pi }{ \left . ... l.4843 ...ght)\right| s_0 = s, a_0 = a}\end{split} ! Undefined control sequence. ...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4847 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. 
...\begin {split}V^{\pi }(s) = \underE {a \sim \pi }{Q^{\pi }(s,a... l.4847 ...ha H\left(\pi(\cdot|s)\right)\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...gin {split}Q^{\pi }(s,a) &= \underE {s' \sim P \\ a' \sim \pi ... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing } inserted. } l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Missing { inserted. { l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} ! Undefined control sequence. ...s')\right ) \right )} \\ &= \underE {s' \sim P}{R(s,a,s') + \g... l.4852 ...,a,s') + \gamma V^{\pi}(s')}.\end{split} [104] ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...begin {split}V^{\pi }(s) &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. ...pi (\cdot |s)\right ) \\ &= \underE {a \sim \pi }{Q^{\pi }(s,a... l.4875 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence.}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! 
Undefined control sequence.}L(\psi , {\mathcal D}) = \underE {s \sim \mathcal {D} \\ \t... l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing } inserted. } l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Missing { inserted. { l.4883 ...tilde{a}|s) \right)\Bigg)^2}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4889 ...s,a) - \alpha \log \pi(a|s)}.\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi }{Q^{\pi }(s,a) - \a... l.4889 ...s,a) - \alpha \log \pi(a|s)}.\end{split} [105] ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. \split@tag \begin {split}\underE {a \sim \pi _{\theta }}{Q^{\pi _... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...\log \pi _{\theta }(a|s)} = \underE {\xi \sim \mathcal {N}}{Q^... l.4906 ...\tilde{a}_{\theta}(s,\xi)|s)}\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Undefined control sequence. ...egin {split}\max _{\theta } \underE {s \sim \mathcal {D} \\ \x... l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing } inserted. } l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! Missing { inserted. { l.4910 ...tilde{a}_{\theta}(s,\xi)|s)},\end{split} ! LaTeX Error: Environment algorithm undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4928 ...rithms/sac:pseudocode}}\begin{algorithm} [H] ! 
LaTeX Error: \caption outside float. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4929 \caption {Soft Actor-Critic} ! LaTeX Error: Environment algorithmic undefined. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4931 \begin{algorithmic} [1] ! Undefined control sequence. l.4932 \STATE Input: initial policy parameters $\theta$, Q-function para... ! Undefined control sequence. l.4933 \STATE Set target parameters equal to main parameters $\psi_{\tex... ! Undefined control sequence. l.4934 \REPEAT ! Undefined control sequence. l.4935 \STATE Observe state $s$ and select action $a \sim \pi_{\thet... ! Undefined control sequence. l.4936 \STATE Execute $a$ in the environment ! Undefined control sequence. l.4937 \STATE Observe next state $s'$, reward $r$, and done signal $... ! Undefined control sequence. l.4938 \STATE Store $(s,a,r,s',d)$ in replay buffer $\mathcal{D}$ ! Undefined control sequence. l.4939 \STATE If $s'$ is terminal, reset environment state. ! Undefined control sequence. l.4940 \IF {it's time to update} ! Undefined control sequence. l.4941 \FOR {$j$ in range(however many updates)} ! Undefined control sequence. l.4942 \STATE Randomly sample a batch of transitions, $B = \... ! Undefined control sequence. l.4943 \STATE Compute targets for Q and V functions: [106] ! Undefined control sequence. l.4948 \STATE Update Q-functions by one step of gradient des... ! Undefined control sequence. l.4952 \STATE Update V-function by one step of gradient desc... ! Undefined control sequence. l.4956 \STATE Update policy by one step of gradient ascent u... ! Undefined control sequence. l.4961 \STATE Update target value network with ! Undefined control sequence. l.4965 \ENDFOR ! Undefined control sequence. l.4966 \ENDIF ! Undefined control sequence. l.4967 \UNTIL {convergence} ! LaTeX Error: \begin{document} ended by \end{algorithmic}. See the LaTeX manual or LaTeX Companion for explanation. 
Type H for immediate help. ... l.4968 \end{algorithmic} ! LaTeX Error: \begin{document} ended by \end{algorithm}. See the LaTeX manual or LaTeX Companion for explanation. Type H for immediate help. ... l.4969 \end{algorithm} Underfull \hbox (badness 10000) in paragraph at lines 4975--4975 []\T1/ptm/m/it/10 env_fn\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac-tor_critic=\T1/ptm/m/n/10 , \T1/ptm/m/it/10 ac_kwargs={}\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , Underfull \hbox (badness 10000) in paragraph at lines 4975--4975 \T1/ptm/m/it/10 lr=0.001\T1/ptm/m/n/10 , \T1/ptm/m/it/10 al-pha=0.2\T1/ptm/m/n/ 10 , \T1/ptm/m/it/10 batch_size=100\T1/ptm/m/n/10 , \T1/ptm/m/it/10 start_steps =10000\T1/ptm/m/n/10 , \T1/ptm/m/it/10 max_ep_len=1000\T1/ptm/m/n/10 , \T1/ptm/ m/it/10 log- [107] Overfull \vbox (77.80809pt too high) has occurred while \output is active [108] [109] [110] Chapter 20. [111] [112] [113] [114] [115] [116] Chapter 21. [117] [118] Chapter 22. [119] [120] Chapter 23. [121] Underfull \hbox (badness 10000) in paragraph at lines 5990--5990 []\T1/ptm/m/it/10 exp_name\T1/ptm/m/n/10 , \T1/ptm/m/it/10 thunk\T1/ptm/m/n/10 , \T1/ptm/m/it/10 seed=0\T1/ptm/m/n/10 , \T1/ptm/m/it/10 num_cpu=1\T1/ptm/m/n/1 0 , [122] [123] [124] Chapter 24. [125] [126] Chapter 25. [127] [128] Chapter 26. 
[129] [130] [131] (./SpinningUp.ind [132] Underfull \hbox (badness 7522) in paragraph at lines 47--48 []\T1/ptm/m/n/10 add() (spinup.utils.run_utils.ExperimentGrid method), Overfull \hbox (5.61969pt too wide) in paragraph at lines 48--49 []\T1/ptm/m/n/10 apply_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer Overfull \hbox (17.83952pt too wide) in paragraph at lines 74--75 []\T1/ptm/m/n/10 compute_gradients() (spinup.utils.mpi_tf.MpiAdamOptimizer [133] Underfull \hbox (badness 10000) in paragraph at lines 103--104 []\T1/ptm/m/n/10 mpi_statistics_scalar() (in mod-ule Underfull \hbox (badness 10000) in paragraph at lines 119--120 []\T1/ptm/m/n/10 run() (spinup.utils.run_utils.ExperimentGrid method), Underfull \hbox (badness 10000) in paragraph at lines 140--141 []\T1/ptm/m/n/10 variant_name() (spinup.utils.run_utils.ExperimentGrid Underfull \hbox (badness 10000) in paragraph at lines 141--142 []\T1/ptm/m/n/10 variants() (spinup.utils.run_utils.ExperimentGrid [134]) (./SpinningUp.aux) Package rerunfilecheck Warning: File `SpinningUp.out' has changed. (rerunfilecheck) Rerun to get outlines right (rerunfilecheck) or use package `bookmark'. LaTeX Warning: There were multiply-defined labels. ) (see the transcript file for additional information){/usr/share/texlive/texmf-d ist/fonts/enc/dvips/base/8r.enc}< /usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr5.pfb> Output written on SpinningUp.pdf (140 pages, 1146664 bytes). Transcript written on SpinningUp.log. [rtd-command-info] start-time: 2018-11-17T13:27:02.911803Z, end-time: 2018-11-17T13:27:02.969665Z, duration: 0, exit-code: 0 mv -f /home/docs/checkouts/ /home/docs/checkouts/ [rtd-command-info] start-time: 2018-11-17T13:27:03.029654Z, end-time: 2018-11-17T13:28:33.235642Z, duration: 90, exit-code: 0 python sphinx-build -T -b epub -d _build/doctrees-epub -D language=en . _build/epub Running Sphinx v1.5.6 making output directory... loading translations [en]... done loading pickled environment... 
not yet created
building [mo]: targets for 0 po files that are out of date
building [epub]: targets for 31 source files that are out of date
updating environment: 31 added, 0 changed, 0 removed
reading sources... [  3%] algorithms/ddpg
reading sources... [  6%] algorithms/ppo
reading sources... [  9%] algorithms/sac
reading sources... [ 12%] algorithms/td3
reading sources... [ 16%] algorithms/trpo
reading sources... [ 19%] algorithms/vpg
reading sources... [ 22%] etc/acknowledgements
reading sources... [ 25%] etc/author
reading sources... [ 29%] index
reading sources... [ 32%] spinningup/bench
reading sources... [ 35%] spinningup/exercise2_1_soln
reading sources... [ 38%] spinningup/exercise2_2_soln
reading sources... [ 41%] spinningup/exercises
reading sources... [ 45%] spinningup/extra_pg_proof1
reading sources... [ 48%] spinningup/extra_pg_proof2
reading sources... [ 51%] spinningup/keypapers
reading sources... [ 54%] spinningup/rl_intro
reading sources... [ 58%] spinningup/rl_intro2
reading sources... [ 61%] spinningup/rl_intro3
reading sources... [ 64%] spinningup/rl_intro4
reading sources... [ 67%] spinningup/spinningup
reading sources... [ 70%] user/algorithms
reading sources... [ 74%] user/installation
reading sources... [ 77%] user/introduction
reading sources... [ 80%] user/plotting
reading sources... [ 83%] user/running
reading sources... [ 87%] user/saving_and_loading
reading sources... [ 90%] utils/logger
reading sources... [ 93%] utils/mpi
reading sources... [ 96%] utils/plotter
reading sources... [100%] utils/run_utils
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "solution available here.".
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "on github".
/home/docs/checkouts/ WARNING: Line block ends without a blank line.
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption--ac_kwargs".
/home/docs/checkouts/ WARNING: Inline strong start-string without end-string.
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default".
/home/docs/checkouts/ WARNING: Duplicate explicit target name: "cmdoption-arg-default".
looking for now-outdated files... none found
pickling environment... done
checking consistency...
/home/docs/checkouts/ WARNING: document isn't included in any toctree
/home/docs/checkouts/ WARNING: document isn't included in any toctree
/home/docs/checkouts/ WARNING: document isn't included in any toctree
/home/docs/checkouts/ WARNING: document isn't included in any toctree
/home/docs/checkouts/ WARNING: document isn't included in any toctree
done
preparing documents... done
writing output... [  3%] algorithms/ddpg
writing output... [  6%] algorithms/ppo
writing output... [  9%] algorithms/sac
writing output... [ 12%] algorithms/td3
writing output... [ 16%] algorithms/trpo
writing output... [ 19%] algorithms/vpg
writing output... [ 22%] etc/acknowledgements
writing output... [ 25%] etc/author
writing output... [ 29%] index
writing output... [ 32%] spinningup/bench
writing output... [ 35%] spinningup/exercise2_1_soln
writing output... [ 38%] spinningup/exercise2_2_soln
writing output... [ 41%] spinningup/exercises
writing output... [ 45%] spinningup/extra_pg_proof1
writing output... [ 48%] spinningup/extra_pg_proof2
writing output... [ 51%] spinningup/keypapers
writing output... [ 54%] spinningup/rl_intro
writing output... [ 58%] spinningup/rl_intro2
writing output... [ 61%] spinningup/rl_intro3
writing output... [ 64%] spinningup/rl_intro4
writing output... [ 67%] spinningup/spinningup
writing output... [ 70%] user/algorithms
writing output... [ 74%] user/installation
writing output... [ 77%] user/introduction
writing output... [ 80%] user/plotting
writing output... [ 83%] user/running
writing output... [ 87%] user/saving_and_loading
writing output... [ 90%] utils/logger
writing output... [ 93%] utils/mpi
writing output... [ 96%] utils/plotter
writing output... [100%] utils/run_utils
generating indices... genindex py-modindex
writing additional pages...
copying images... [ 10%] images/spinning-up-in-rl.png
copying images... [ 20%] spinningup/../images/bench/bench_halfcheetah.svg
copying images... [ 30%] spinningup/../images/bench/bench_hopper.svg
copying images... [ 40%] spinningup/../images/bench/bench_walker.svg
copying images... [ 50%] spinningup/../images/bench/bench_swim.svg
copying images... [ 60%] spinningup/../images/bench/bench_ant.svg
copying images... [ 70%] spinningup/../images/ex2-1_trpo_hopper.png
copying images... [ 80%] spinningup/../images/ex2-2_ddpg_bug.svg
copying images... [ 90%] spinningup/../images/rl_diagram_transparent_bg.png
copying images... [100%] spinningup/../images/rl_algorithms_9_15.svg
copying static files...
WARNING: html_static_path entry '/home/docs/checkouts/' does not exist
done
copying extra files...
WARNING: favicon file 'openai_icon.ico' does not exist
done
writing mimetype file...
writing META-INF/container.xml file...
writing content.opf file...
WARNING: unknown mimetype for _static/openai-favicon2_32x32.ico, ignoring
WARNING: unknown mimetype for _static/openai_icon.ico, ignoring
writing nav.xhtml file...
writing toc.ncx file...
writing SpinningUp.epub file...
build succeeded, 18 warnings.
[rtd-command-info] start-time: 2018-11-17T13:28:33.334561Z, end-time: 2018-11-17T13:28:33.521282Z, duration: 0, exit-code: 0
mv -f /home/docs/checkouts/ /home/docs/checkouts/