Information Package | Course Catalog

Information

Unit	INSTITUTE OF NATURAL AND APPLIED SCIENCES
	COMPUTER ENGINEERING (MASTER) (WITH THESIS) (ENGLISH)
Code	CENG045
Name	Reinforcement Learning
Academic Year	2024-2025
Term	Fall
Duration (T+A)	3-0 (T-A) (17 Weeks)
ECTS	6 ECTS
National Credit	3 National Credit
Teaching Language	English
Level	Master's Course
Type	Normal
Mode of study	Face-to-Face Instruction
Catalog Information Coordinator	Asst. Prof. Dr. Mehmet SARIGÜL
Course Instructor	Asst. Prof. Dr. Mehmet SARIGÜL (Fall) (Group A) (Instructor in Charge)

Course Goal / Objective

The goal of this course is to teach students the fundamentals of reinforcement learning, a subfield of machine learning concerned with how agents learn to make decisions in an environment in order to achieve a specific goal.

Course Content

This course covers the following topics. Introduction to Reinforcement Learning: basic concepts of reinforcement learning, comparison with supervised and unsupervised learning, and types of reinforcement learning problems. Markov Decision Processes (MDPs): formalism of MDPs, reward function, state transitions, policy, value function, and Bellman equations. Dynamic Programming (DP): policy evaluation, policy iteration, and value iteration, followed by Monte Carlo methods. Temporal Difference (TD) Learning: on-policy and off-policy learning, Q-learning, SARSA, and eligibility traces. Function Approximation: linear and non-linear function approximation, and deep reinforcement learning. Exploration and Exploitation: exploration strategies such as epsilon-greedy, softmax, and UCB.
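As an illustration of how several of the listed topics fit together, the following is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not taken from the course materials: the 5-state chain environment, the `step` function, and all hyperparameter values are assumptions chosen only to keep the example small.

```python
import random

# Hypothetical 5-state chain MDP (illustration only, not from the course):
# states 0..4, actions 0 (left) and 1 (right); reaching state 4 yields
# reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic transition: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table for the chain MDP; hyperparameters are assumed values."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy TD target: bootstrap from the best next action,
            # with no bootstrapping from the terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(greedy_policy)
```

Replacing `max(q[next_state])` in the target with the value of the action actually selected in the next state would turn this sketch into SARSA, the on-policy counterpart named in the content list.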

Course Precondition

Knowledge of basic programming, linear algebra, and probability theory.

Resources

Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.

Notes

Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.

Course Learning Outcomes

Order	Course Learning Outcomes
LO01	Understanding of the fundamentals of reinforcement learning
LO02	Ability to model problems as Markov Decision Processes (MDPs)
LO03	Ability to implement reinforcement learning algorithms
LO04	Ability to evaluate and compare reinforcement learning algorithms

Relation with Program Learning Outcome

Order	Type	Program Learning Outcomes	Level
PLO01	Knowledge - Theoretical, Factual	Building on the competencies gained at the undergraduate level, has advanced knowledge and understanding that provide the basis for original studies in the field of Computer Engineering.	3
PLO02	Knowledge - Theoretical, Factual	Accesses scientific knowledge in the field of engineering in breadth and depth; evaluates, interprets, and applies this knowledge.	3
PLO03	Competencies - Learning Competence	Is aware of new and developing practices of the profession, and examines and learns them when necessary.
PLO04	Competencies - Learning Competence	Formulates engineering problems, develops methods to solve them, and applies innovative methods in the solutions.
PLO05	Competencies - Learning Competence	Designs and carries out analytical, modeling-based, and experimental research; analyzes and interprets complex situations encountered in this process.
PLO06	Competencies - Learning Competence	Develops new and/or original ideas and methods; develops innovative solutions in system, component, or process design.
PLO07	Skills - Cognitive, Applied	Has learning skills.
PLO08	Skills - Cognitive, Applied	Is aware of new and emerging applications of Computer Engineering, and examines and learns them when necessary.
PLO09	Skills - Cognitive, Applied	Communicates the processes and results of their studies, in written or oral form, in national and international settings within or outside the field of Computer Engineering.
PLO10	Skills - Cognitive, Applied	Has comprehensive knowledge of current techniques and methods in Computer Engineering and of their limitations.
PLO11	Skills - Cognitive, Applied	Uses information and communication technologies at an advanced level, together with the computer software required by Computer Engineering.
PLO12	Knowledge - Theoretical, Factual	Observes social, scientific, and ethical values in all professional activities.

Week Plan

Week	Topic	Preparation	Methods
1	Introduction to reinforcement learning	Reading the lecture notes	Teaching Methods: Lecture
2	Markov Decision Processes (MDPs), reward function, state transitions	Reading the lecture notes	Teaching Methods: Lecture
3	Policy, value function, and Bellman equations	Reading the lecture notes	Teaching Methods: Lecture
4	Dynamic Programming (DP), policy evaluation, policy iteration	Reading the lecture notes	Teaching Methods: Lecture
5	Value iteration and Monte Carlo methods	Reading the lecture notes	Teaching Methods: Lecture
6	Temporal Difference (TD) learning, on-policy and off-policy learning	Reading the lecture notes	Teaching Methods: Lecture
7	Q-learning, SARSA, and eligibility traces	Reading the lecture notes	Teaching Methods: Lecture
8	Mid-Term Exam		Assessment Methods: Written Exam
9	Function approximation, linear and non-linear function approximation	Reading the lecture notes	Teaching Methods: Lecture
10	Exploration and exploitation, exploration strategies such as epsilon-greedy, softmax, and UCB	Reading the lecture notes	Teaching Methods: Lecture
11	Policy gradients, direct policy search methods	Reading the lecture notes	Teaching Methods: Lecture
12	REINFORCE algorithm, actor-critic methods, and A3C	Reading the lecture notes	Teaching Methods: Lecture
13	Multi-agent reinforcement learning, non-zero-sum games	Reading the lecture notes	Teaching Methods: Lecture
14	Nash equilibria and coordination in multi-agent systems	Reading the lecture notes	Teaching Methods: Lecture
15	Review	Reading the lecture notes	Teaching Methods: Discussion
16	Term Exams		Assessment Methods: Written Exam
17	Term Exams		Assessment Methods: Written Exam

Assessment (Exam) Methods and Criteria

Assessment Type	Midterm / Year Impact	End of Term / End of Year Impact
1. Project / Design	100	40
General Assessment
Midterm / Year Total	100	40
1. Final Exam	-	60
Grand Total	-	100
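
The weights in the table above can be read as a simple weighted sum. The sketch below assumes that the project makes up the whole midterm component, that the midterm component and the final exam contribute 40% and 60% of the course grade, and that scores are on a 0-100 scale. The function name `course_grade` and the sample scores are hypothetical.

```python
# End-of-term weights taken from the assessment table (percent).
MIDTERM_WEIGHT = 40  # "Midterm / Year Total"
FINAL_WEIGHT = 60    # "Final Exam"
PROJECT_SHARE_OF_MIDTERM = 100  # the project is the only midterm component


def course_grade(project_score, final_exam_score):
    """Combine scores (assumed 0-100 scale) into an end-of-term grade."""
    midterm_total = PROJECT_SHARE_OF_MIDTERM * project_score / 100
    return (MIDTERM_WEIGHT * midterm_total + FINAL_WEIGHT * final_exam_score) / 100


# Hypothetical scores, for illustration only.
print(course_grade(project_score=80, final_exam_score=70))
```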

Student Workload - ECTS

Works	Number	Time (Hour)	Workload (Hour)
Course Related Works
Class Time (Exam weeks are excluded)	14	3	42
Out of Class Study (Preliminary Work, Practice)	14	5	70
Assessment Related Works
Homeworks, Projects, Others	0	0	0
Mid-term Exams (Written, Oral, etc.)	1	14	14
Final Exam	1	28	28
Total Workload (Hour)			154
Total Workload / 25 (h)			6.16
ECTS			6 ECTS
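
The arithmetic behind the workload table can be checked with a short script. The rows and the divisor of 25 hours per credit come from the table. Rounding to the nearest whole credit is an assumption about how the final ECTS value is obtained.

```python
# Rows from the workload table: (activity, number, hours each).
workload_rows = [
    ("Class time", 14, 3),
    ("Out-of-class study", 14, 5),
    ("Homeworks, projects, others", 0, 0),
    ("Mid-term exams", 1, 14),
    ("Final exam", 1, 28),
]
HOURS_PER_ECTS = 25  # divisor stated in the table ("Total Workload / 25")

# Workload per row is number * hours; the total is their sum.
total_hours = sum(number * hours for _, number, hours in workload_rows)
ects_exact = total_hours / HOURS_PER_ECTS
ects_credits = round(ects_exact)  # assumption: round to the nearest whole credit

print(total_hours, ects_exact, ects_credits)
```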

Update Time: 13.05.2024 03:07
