CS 294-112 at UC Berkeley



CS189 or equivalent is a prerequisite for the course. This course will assume some familiarity with reinforcement learning, numerical optimization, and machine learning. For introductory material on RL and MDPs, see the CS188 EdX course, starting with Markov Decision Processes I, as well as Chapters 3 and 4 of Sutton & Barto.


Piazza will be used for announcements, general questions and discussions, clarifications about assignments, student questions to each other, and so on. If you are a UC Berkeley student enrolled in the course, and haven't already been added to Piazza, please email Sid.

Gradescope will be used to collect and grade assignments. If you are a UC Berkeley student enrolled in the course, and haven't already been added to Gradescope, please email Sid.

If you are not a UC Berkeley student or are not enrolled, but would like to follow and discuss the course, there is a subreddit that we will try to monitor: reddit.com/r/berkeleydeeprlcourse/


All materials can be found on the front page.


There will be five homeworks. For each homework, we will post a PDF on the front page and starter code on GitHub.


We will post slides on the front page after each lecture.


We will provide a live stream of the lectures, as well as video recordings. The course is not being offered as an online course, and the videos are provided only for your personal informational and entertainment purposes. They are not part of any course requirement or degree-bearing university program.


All homeworks should be done individually.

For the final project, you may work in groups of up to three people. Each group will submit the report and give the presentations together. Expectations for the project scope increase with the number of students in the group, and groups of two or three must also include, along with the final report, a short paragraph explaining the role of each group member. From past experience, groups of two tend to be the most effective, though you may work in a group of three or alone. Groups larger than three are not permitted without special permission from the course staff. You should form your group before Oct 9, though you are strongly encouraged to do so much sooner so that you can start on your project.

Late Policy

All assignments must be turned in via Gradescope on time. You have a total of five late days to use across all assignments. We will not make any additional allowances for late assignments: the late days are intended to provide for exceptional circumstances, and students should avoid using them unless absolutely necessary. Any assignment submitted late without enough late days remaining will not be graded.
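As a rough illustration of how the late-day budget plays out, here is a minimal Python sketch. The function name `apply_late_days` is hypothetical, and two details are assumptions that the policy does not state: lateness is rounded up to whole days, and an assignment that needs more late days than remain is simply not graded and consumes none of the budget. Check with the course staff for the actual accounting.

```python
import math

TOTAL_LATE_DAYS = 5  # total late-day budget stated in the policy


def apply_late_days(hours_late_per_assignment, budget=TOTAL_LATE_DAYS):
    """Return (graded_flags, late_days_remaining) for assignments in order.

    Assumptions (not stated in the policy): lateness is rounded up to
    whole days, and an assignment that exceeds the remaining budget is
    not graded and uses no late days.
    """
    remaining = budget
    graded = []
    for hours_late in hours_late_per_assignment:
        # An on-time submission needs zero late days.
        days_needed = math.ceil(hours_late / 24) if hours_late > 0 else 0
        if days_needed <= remaining:
            remaining -= days_needed
            graded.append(True)
        else:
            graded.append(False)
    return graded, remaining


if __name__ == "__main__":
    # Hypothetical student: HW2 is 30 hours late, HW4 is 49 hours late,
    # HW5 is 1 hour late, the rest are on time.
    print(apply_late_days([0, 30, 0, 49, 1]))
```

In this hypothetical case, HW2 uses two late days and HW4 uses the remaining three, so HW5 has no late days left to draw on and would not be graded under the assumptions above.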


  • Homework: 50% (10% per HW x 5 HWs)
  • Final Project: 50%
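
The weights above amount to a simple weighted sum. The following Python sketch shows the arithmetic; the function name `course_score` and the convention of expressing scores as fractions between 0 and 1 are assumptions made for illustration, not part of the course's grading tools.

```python
NUM_HOMEWORKS = 5
HOMEWORK_WEIGHT = 0.10  # 10% per homework, 50% in total
PROJECT_WEIGHT = 0.50   # final project


def course_score(homework_scores, project_score):
    """Combine per-homework and project scores (each in [0, 1]).

    Returns the weighted course score, also in [0, 1].
    """
    if len(homework_scores) != NUM_HOMEWORKS:
        raise ValueError(f"expected {NUM_HOMEWORKS} homework scores")
    return HOMEWORK_WEIGHT * sum(homework_scores) + PROJECT_WEIGHT * project_score


if __name__ == "__main__":
    # Hypothetical student: four full-credit homeworks, one ungraded
    # homework (counted here as 0, which is an assumption), and 80% on
    # the final project.
    print(round(course_score([1.0, 1.0, 1.0, 1.0, 0.0], 0.8), 2))
```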