CS 285 at UC Berkeley



CS189 or equivalent is a prerequisite for the course. This course will assume some familiarity with reinforcement learning, numerical optimization, and machine learning. For introductory material on RL and MDPs, see the CS188 EdX course, starting with Markov Decision Processes I, as well as Chapters 3 and 4 of Sutton & Barto.


Ed will be used for announcements, general questions and discussions, clarifications about assignments, student questions to each other, and so on. If you are a UC Berkeley student enrolled in the course, and haven't already been added to Ed, please email the staff.

Gradescope will be used to collect and grade assignments. If you are a UC Berkeley student enrolled in the course, and haven't already been added to Gradescope, please email the staff.

If you are not a UC Berkeley student or are not enrolled, but are interested in following and discussing the course, there is a subreddit that we will try to monitor: reddit.com/r/berkeleydeeprlcourse/


All materials can be found on the front page.


There will be five homeworks. For each homework, we will post a PDF on the front page and starter code on GitHub. We will roughly follow the schedule below:

  • HW1: Released 8/28, due 9/11
  • HW2: Released 9/11, due 9/25
  • HW3: Released 9/25, due 10/18
  • HW4: Released 10/16, due 11/1
  • HW5: Released 11/1, due 11/20
  • Final project: due during finals week


We will post slides on the front page after each lecture.


We will provide recordings of the lectures before each lecture slot. The lecture slot itself will be devoted to discussion of the material covered in the videos. We will post the YouTube links to the lecture recordings soon.


All homeworks should be done individually.

For the final project, you may work alone or in groups of up to three people. Each group will submit one report. Expectations for project scope increase with the number of students in the group, and groups of two or three must also include, along with the final report, a short paragraph explaining the role of each group member. From past experience, groups of two tend to be the most effective. Groups larger than three are not permitted without special permission from the course staff. You should form your group before Oct 9, though you are strongly encouraged to do so much sooner so that you can start on your project.

Late Policy

All assignments must be turned in via Gradescope on time. We will allow a total of five late days cumulatively. We will not make any additional allowances for late assignments: the late days are intended to provide for exceptional circumstances, and students should avoid using them unless absolutely necessary. Any assignments that are submitted late (with insufficient late days remaining) will not be graded.

Late days may be used only for the five homeworks. They may not be used for quizzes, final project proposals, final project milestone reports, final project reports, or any of the project peer review reports.
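To make the late-day accounting concrete, here is a minimal Python sketch of how a cumulative budget of five late days could be tracked across homeworks. It is an illustration only, not the staff's actual procedure. The function names are hypothetical, and two rules in it are assumptions that the policy above does not state: any fraction of a day late counts as a full late day, and a submission that is not graded does not consume late days.

```python
import math
from datetime import datetime, timedelta

LATE_DAY_BUDGET = 5  # total late days allowed across all five homeworks


def late_days_used(due, submitted):
    """Whole late days for one submission.

    Assumption: any fraction of a day late is rounded up to a full day.
    """
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 86400)


def graded_submissions(submissions, budget=LATE_DAY_BUDGET):
    """Process (name, due, submitted) tuples in order.

    Returns a list of (name, will_be_graded, late_days_remaining).
    """
    remaining = budget
    results = []
    for name, due, submitted in submissions:
        needed = late_days_used(due, submitted)
        if needed <= remaining:
            remaining -= needed
            results.append((name, True, remaining))
        else:
            # Not enough late days left: the submission is not graded.
            # Assumption: the remaining budget is left untouched.
            results.append((name, False, remaining))
    return results


if __name__ == "__main__":
    # Hypothetical deadline, used only to exercise the functions.
    due = datetime(2000, 1, 1, 23, 59)
    example = [
        ("HW1", due, due + timedelta(hours=30)),  # 2 late days
        ("HW2", due, due - timedelta(hours=1)),   # on time
    ]
    for row in graded_submissions(example):
        print(row)
```

The budget is passed as a parameter so the same check can be rerun with whatever number of late days remains.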


  • Homework: 50% (10% per HW x 5 HWs)
  • Final Project: 40%
  • Quizzes: 10%. Your quiz grade for each lecture will be the maximum of your first and second tries. If you take the quiz and don't like your grade, you can take the "second try" quiz (during the 48 hours after the first try due date) and replace your grade if you do better.
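
The weighting above can be sketched as a short Python calculation. This is an illustrative sketch rather than the official grading script. The function names are hypothetical, all scores are assumed to be on a 0 to 100 scale, and the per-lecture quizzes are assumed to be averaged with equal weight, which the breakdown above does not specify.

```python
def quiz_score(first_try, second_try=None):
    """Per-lecture quiz score: the better of the two attempts."""
    if second_try is None:
        return first_try
    return max(first_try, second_try)


def course_grade(homeworks, project, quizzes):
    """Weighted course grade on a 0-100 scale.

    homeworks: five scores, each worth 10% of the grade
    project:   final project score, worth 40%
    quizzes:   list of (first_try, second_try_or_None) pairs, worth 10% in total
    """
    if len(homeworks) != 5:
        raise ValueError("expected exactly five homework scores")
    hw_part = 0.10 * sum(homeworks)
    project_part = 0.40 * project
    # Assumption: each lecture's quiz counts equally toward the 10%.
    best = [quiz_score(first, second) for first, second in quizzes]
    quiz_part = 0.10 * (sum(best) / len(best)) if best else 0.0
    return hw_part + project_part + quiz_part


if __name__ == "__main__":
    grade = course_grade(
        homeworks=[90, 85, 100, 80, 95],
        project=88,
        quizzes=[(60, 90), (100, None)],  # the second try replaces 60 with 90
    )
    print(round(grade, 1))  # 45.0 + 35.2 + 9.5 -> 89.7
```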