Intended learning outcomes
Students will gain a deep understanding of computer vision, video processing, and deep learning techniques for visual data analysis. They will develop practical skills in implementing and optimizing models with OpenCV and PyTorch, using architectures such as CNNs (AlexNet, ResNet, EfficientNet), object detection and segmentation models (YOLO, R-CNNs, U-Nets), GANs, and Vision Transformers. They will also explore multimodal models such as CLIP and BLIP for content-based video retrieval. Through hands-on projects, students will learn to evaluate, optimize, and deploy AI-driven vision systems for real-world applications, such as content-based search in a video archive.
Teaching methods
Lecture
Project assignments
Content
Content analysis with OpenCV
Video processing and shot detection
Deep learning with visual data
Machine learning
Neural networks
Convolutional neural networks (AlexNet, GoogLeNet, ResNet, EfficientNet)
PyTorch
R-CNNs, U-Nets, YOLO
GANs
Vision transformers
CLIP, BLIP
Content-based video retrieval
Curricular registration requirements
Students of 289 (BA IT) and 488 (MA ICE) should have completed "Fundamentals of Image Processing" (700.30x).