Information
Code | CENG0046 |
Name | Text Vectorization |
Term | 2024-2025 Academic Year |
Semester | . Semester |
Duration (T+A) | 3-0 (T-A) (17 Weeks) |
ECTS | 6 ECTS |
National Credit | 3 National Credit |
Teaching Language | English |
Level | Doctoral Course |
Type | Normal |
Mode of study | Face-to-Face Instruction |
Catalog Information Coordinator |
Course Goal / Objective
The student learns text vectorization methods that provide numerical representations of words, sentences, and documents, and that improve performance on deep-learning-based natural language processing problems. In the process, the student learns why context matters and how low-sparsity computation methods work.
Course Content
Vector Space Model and One-hot Vectors, Sense Representation Problem and Synset Embedding, Sparsity Problem, Word Embedding Methods (Word2Vec, GloVe, FastText), Contextualized Text Embeddings (BERT, ELMo, GPT-x), Synset-Based Contextual Embedding (Generalized SemSpace)
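As a minimal sketch of two of the topics listed above (one-hot vectors and the sparsity problem), the following Python snippet builds one-hot vectors over a toy vocabulary and measures their similarity and sparsity. The four-word vocabulary and the helper names `one_hot`, `cosine`, and `sparsity` are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
import math

# Hypothetical toy vocabulary (an assumption for illustration only).
VOCAB = ["cat", "dog", "car", "truck"]
WORD_TO_INDEX = {word: i for i, word in enumerate(VOCAB)}


def one_hot(word):
    """Return the one-hot vector of `word` over VOCAB as a list of floats."""
    vec = [0.0] * len(VOCAB)
    vec[WORD_TO_INDEX[word]] = 1.0
    return vec


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sparsity(vec):
    """Fraction of entries in `vec` that are exactly zero."""
    return sum(1 for x in vec if x == 0) / len(vec)


if __name__ == "__main__":
    print(one_hot("dog"))                          # [0.0, 1.0, 0.0, 0.0]
    print(cosine(one_hot("cat"), one_hot("dog")))  # 0.0
    print(sparsity(one_hot("cat")))                # 0.75
```

Any two distinct words get a cosine similarity of 0, so "cat" is no closer to "dog" than to "truck", and the share of zero entries grows with vocabulary size. Dense word embeddings such as Word2Vec, GloVe, and FastText are designed to address both limitations.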
Course Precondition
None
Resources
State-of-the-art papers
Notes
Daniel Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2000.
Course Learning Outcomes
Order | Course Learning Outcomes |
---|---|
LO01 | Learns the concepts of word and sense embeddings |
LO02 | Understands contextualization |
LO03 | Discusses the advantages of text representations |
LO04 | Applies state-of-the-art methods to any text |
Relation with Program Learning Outcome
Order | Type | Program Learning Outcomes | Level |
---|---|---|---|
PLO01 | Knowledge - Theoretical, Factual | Building on the competencies gained at the undergraduate level, has advanced knowledge and understanding that provide the basis for original studies in the field of Computer Engineering. | 3 |
PLO02 | Knowledge - Theoretical, Factual | Accesses scientific knowledge in the field of engineering in breadth and depth, and evaluates, interprets, and applies it. | 3 |
PLO03 | Competencies - Learning Competence | Is aware of new and developing practices of the profession, and examines and learns them when necessary. | 2 |
PLO04 | Competencies - Learning Competence | Formulates engineering problems, develops methods to solve them, and applies innovative methods in solutions. | 4 |
PLO05 | Competencies - Learning Competence | Designs and conducts analytical, modeling-based, and experimental research, and analyzes and interprets complex situations encountered in this process. | |
PLO06 | Competencies - Learning Competence | Develops new and/or original ideas and methods, and develops innovative solutions in system, component, or process design. | 4 |
PLO07 | Skills - Cognitive, Applied | Has learning skills. | 3 |
PLO08 | Skills - Cognitive, Applied | Is aware of new and emerging applications of Computer Engineering, and examines and learns them when necessary. | 2 |
PLO09 | Skills - Cognitive, Applied | Communicates the processes and results of their studies in written or oral form in national and international settings, within or outside the field of Computer Engineering. | 3 |
PLO10 | Skills - Cognitive, Applied | Has comprehensive knowledge of current techniques and methods in Computer Engineering and of their limitations. | 4 |
PLO11 | Skills - Cognitive, Applied | Uses information and communication technologies at an advanced level, together with the computer software required by Computer Engineering. | 4 |
PLO12 | Knowledge - Theoretical, Factual | Observes social, scientific, and ethical values in all professional activities. | |
Week Plan
Week | Topic | Preparation | Methods |
---|---|---|---|
1 | Vector Space Model | Reading paper | Teaching Methods: Lecture |
2 | One-hot vectors | Reading paper | Teaching Methods: Lecture |
3 | Word-Sense Representation Problem | Reading paper | Teaching Methods: Lecture |
4 | Word-Synset Embedding | Reading paper | Teaching Methods: Lecture |
5 | Sparsity Problem | Reading paper | Teaching Methods: Lecture |
6 | Word Embedding Apps (Word2Vec, GloVe, FastText) | Reading paper | Teaching Methods: Experiment / Laboratory |
7 | Contextualized Text Embeddings | Reading paper | Teaching Methods: Lecture |
8 | Mid-Term Exam | Study all lecture notes | Assessment Methods: Written Exam |
9 | Contextualized Apps (BERT, ELMo, GPT-x) | Reading paper | Teaching Methods: Experiment / Laboratory |
10 | Synset-Based Contextual Embedding (Generalized SemSpace) | Reading paper | Teaching Methods: Lecture |
11 | Projects 1 | Prepare project | Assessment Methods: Project / Design |
12 | Projects 2 | Prepare project | Assessment Methods: Project / Design |
13 | Projects 3 | Prepare project | Assessment Methods: Project / Design |
14 | Projects 4 | Prepare project | Assessment Methods: Project / Design |
15 | Projects 5 | Prepare project | Assessment Methods: Project / Design |
16 | Term Exams | Study all lecture notes | Assessment Methods: Written Exam |
17 | Term Exams | Study all lecture notes | Assessment Methods: Written Exam |
Student Workload - ECTS
Works | Number | Time (Hour) | Workload (Hour) |
---|---|---|---|
Course Related Works | |||
Class Time (Exam weeks are excluded) | 14 | 3 | 42 |
Out of Class Study (Preliminary Work, Practice) | 14 | 2 | 28 |
Assessment Related Works | |||
Homework, Projects, Others | 3 | 15 | 45 |
Mid-term Exams (Written, Oral, etc.) | 1 | 15 | 15 |
Final Exam | 1 | 15 | 15 |
Total Workload (Hour) | 145 | ||
Total Workload / 25 (h) | 5.80 | ||
ECTS | 6 ECTS |
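The totals in the workload table can be re-derived with a short Python sketch. The counts, hours, and the divisor of 25 hours per ECTS credit come from the table itself. Rounding the quotient to the nearest whole credit is an assumption about how the quotient maps to 6 ECTS, and the names `WORKLOAD_ITEMS`, `total_workload`, and `ects_credits` are hypothetical.

```python
# (number of occurrences, hours per occurrence) for each row of the table.
WORKLOAD_ITEMS = {
    "Class time": (14, 3),
    "Out-of-class study": (14, 2),
    "Homework, projects, others": (3, 15),
    "Mid-term exams": (1, 15),
    "Final exam": (1, 15),
}

HOURS_PER_ECTS = 25  # divisor stated in the table


def total_workload(items):
    """Sum of count * hours over all workload rows."""
    return sum(count * hours for count, hours in items.values())


def ects_credits(total_hours):
    """Round hours / 25 to the nearest whole credit (assumed rounding rule)."""
    return round(total_hours / HOURS_PER_ECTS)


if __name__ == "__main__":
    total = total_workload(WORKLOAD_ITEMS)
    print(total)                # 145
    print(ects_credits(total))  # 6
```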