CS 185/285

CS 185/285 at UC Berkeley

Syllabus

Prerequisites

CS189 or equivalent is a prerequisite for the course. This course assumes some familiarity with reinforcement learning, numerical optimization, and machine learning. For introductory material on RL and MDPs, see the CS188 edX course, starting with Markov Decision Processes I, as well as Chapters 3 and 4 of Sutton & Barto.

Technology

Ed will be used for announcements, general questions and discussions, clarifications about assignments, student questions to each other, and so on. If you are a UC Berkeley student enrolled in the course, and haven't already been added to Ed, please email the staff.

Gradescope will be used to collect and grade assignments. If you are a UC Berkeley student enrolled in the course, and haven't already been added to Gradescope, please email the staff.

If you are not a UC Berkeley student, or are not enrolled, but are interested in following and discussing the course, there is a subreddit that we will try to monitor: reddit.com/r/berkeleydeeprlcourse/

Materials

All materials can be found on the front page.

Homeworks

There will be five homeworks. For each homework, we will post a PDF on the front page and starter code on GitHub. The release and due dates for Spring 2026 are still TBD.

Slides

We will post slides on the front page 24 hours before each lecture.

Videos

Lectures will be recorded, and the recordings will be available on bCourses. Please check that you can access bCourses and see the Media Gallery.

Course Logistics

Sections

Sections start in Week 2 and are taught by the TAs. The section schedule will be posted shortly.

Midterm Exam

We will hold a midterm exam late in the semester, during the week of April 15. The exact date and time will be posted soon.

Mini-Quizzes

Starting with Lecture 2, each lecture has a short Gradescope mini-quiz due within 7 days of the lecture (for example, the Jan 23 lecture quiz is due Jan 30). Quizzes are quick, and you can retake them by completing the "second try" quiz if you want to improve your score.
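As a rough illustration of the 7-day window, the due date of a mini-quiz can be computed from the lecture date. This is only a sketch: the function name `quiz_due_date` and the year 2026 are assumptions made for the example, and the deadline shown on Gradescope is the one that counts.

```python
from datetime import date, timedelta

# From the syllabus: each mini-quiz is due within 7 days of its lecture.
QUIZ_WINDOW_DAYS = 7


def quiz_due_date(lecture_day: date) -> date:
    """Return the last day on which the quiz for this lecture is due."""
    return lecture_day + timedelta(days=QUIZ_WINDOW_DAYS)


# The syllabus example: the Jan 23 lecture quiz is due Jan 30.
# The year is an assumption for illustration.
print(quiz_due_date(date(2026, 1, 23)))  # 2026-01-30
```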

Collaboration

All homeworks should be done individually.

For the final project, you may work alone or in groups of up to three people. Each group will submit a report. The expected project scope increases with the number of students in the group, and groups of two or three must also include, along with the final report, a short paragraph explaining the role of each group member. From past experience, groups of two tend to be the most effective. Groups larger than three are not permitted without special permission from the course staff. You should form your group before TBD, though you are strongly encouraged to do so much sooner so that you can start on your project.

Late Policy

All assignments must be turned in via Gradescope on time. We allow a total of five late days, counted cumulatively across assignments. We will not make any additional allowances for late assignments: the late days are intended to cover exceptional circumstances, and students should avoid using them unless absolutely necessary. Any assignment submitted late without enough late days remaining will not be graded.

Late days apply only to the five homeworks. They may not be used for quizzes, final project proposals, final project milestone reports, final project reports, or any of the project peer review reports.
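The late-day budget can be sketched as a simple running tally. This is an illustration under stated assumptions, not the staff's actual bookkeeping: the function name `apply_late_days` is hypothetical, lateness is assumed to be counted in whole days, and a submission that is rejected for lack of late days is assumed not to consume any.

```python
# From the syllabus: five late days in total, usable only on homeworks.
TOTAL_LATE_DAYS = 5


def apply_late_days(days_late_per_homework):
    """Walk through homeworks in order and decide which ones get graded.

    days_late_per_homework: whole days late for each homework (0 = on time).
    Returns (graded flags, late days remaining).
    Assumption: an ungraded submission does not use up late days.
    """
    remaining = TOTAL_LATE_DAYS
    graded = []
    for days_late in days_late_per_homework:
        if days_late <= remaining:
            remaining -= days_late
            graded.append(True)
        else:
            # Not enough late days left: the assignment is not graded.
            graded.append(False)
    return graded, remaining


graded, remaining = apply_late_days([0, 2, 0, 4, 1])
print(graded, remaining)  # [True, True, True, False, True] 2
```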

Grading

The course grade will be determined by the programming homeworks (50%, 10% each), the final project (20%), the midterm exam (20%), and the mini-quizzes (10%).
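The weighting can be written out as a short calculation. This is a minimal sketch: the function name `course_grade`, the constant names, and the convention that every score is a fraction between 0 and 1 are assumptions for illustration, and the mapping to letter grades is not specified here.

```python
# Weights from the syllabus: five homeworks at 10% each, project 20%,
# exam 20%, mini-quizzes 10%.
HOMEWORK_WEIGHT_EACH = 0.10
NUM_HOMEWORKS = 5
PROJECT_WEIGHT = 0.20
EXAM_WEIGHT = 0.20
QUIZ_WEIGHT = 0.10


def course_grade(homework_scores, project, exam, quizzes):
    """Combine component scores (each assumed to be in [0, 1])."""
    if len(homework_scores) != NUM_HOMEWORKS:
        raise ValueError("expected exactly five homework scores")
    homework_part = HOMEWORK_WEIGHT_EACH * sum(homework_scores)
    return (homework_part
            + PROJECT_WEIGHT * project
            + EXAM_WEIGHT * exam
            + QUIZ_WEIGHT * quizzes)


# Made-up scores, with one ungraded homework counted as 0.
example = course_grade([1.0, 0.9, 0.8, 1.0, 0.0],
                       project=0.9, exam=0.8, quizzes=1.0)
print(round(example, 2))  # 0.81
```

Because each homework carries 10% on its own, a single ungraded homework in this sketch costs up to ten percentage points of the final grade.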