First he wanted me to give an overview of the topics of the lecture by drawing a mind map.

Then he wanted me to explain the Perceptron.
=> Schematic => XOR → MLP => Universal Function Approximation => Multiple layers: Deep Learning

Name a loss function that we discussed.
=> L2 for regression, CE for classification, both optimal wrt. MLE
=> Derivation for L2 by assuming Gaussian noise (sketched below)

How does the network learn now?
=> Backpropagation + gradient descent
=> Chain rule: multiplication of gradients + weight update
=> Exploding/vanishing gradients

What other method did we use to encode the information? (Not quite sure about the wording here)
=> Activation functions: Sigmoid/Tanh → ReLU => prevents vanishing gradients

What about the dying ReLU?
=> Leaky ReLU

What is a regularization alternative that fights the internal covariate shift?
=> Batch Normalization

Can you draw and explain the LSTM structure?

What are GANs?
=> Generator vs. Discriminator, trained via a minimax game

There was a problem called mode collapse, please explain it.
=> G collapses to a few modes: D focuses only on one feature → G also produces only that

Can you explain Cycle-Consistent GANs?
=> Principle (trainable inverse mapping) + combined loss function explained

What is an Autoencoder?
=> Encoder-Decoder
=> Undercomplete AE + Sparse AE

What is common to U-Net and the Autoencoder? What's the difference?
=> Same: Encoder-Decoder structure
=> Different: (conv layers), but mainly the skip connections

How good would U-Net be in comparison to the AE?
=> Thanks to the skip connections it could basically copy the input (see the sketch below)
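Sketch of the L2-from-Gaussian derivation mentioned above (my own reconstruction of the standard argument, not verbatim from the exam): assume the target is the network output plus i.i.d. Gaussian noise; maximizing the likelihood then reduces to minimizing the squared error, and a Bernoulli/categorical likelihood gives the cross-entropy loss in the same way.

```latex
% Assumption: y_i = f(x_i;\theta) + \epsilon_i with \epsilon_i \sim \mathcal{N}(0,\sigma^2), i.i.d.
p(y_i \mid x_i;\theta) = \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(y_i - f(x_i;\theta))^2}{2\sigma^2}\right)

% Negative log-likelihood over the dataset:
-\log \prod_i p(y_i \mid x_i;\theta)
  = \frac{1}{2\sigma^2} \sum_i \bigl(y_i - f(x_i;\theta)\bigr)^2 + \text{const}

% => MLE under Gaussian noise is equivalent to minimizing the L2 loss.
```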
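The backpropagation answer in formulas (standard notation, my reconstruction): the chain rule turns the gradient into a product of per-layer Jacobians, which is also where exploding/vanishing gradients come from.

```latex
% Layers: z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}, \; a^{(l)} = \sigma(z^{(l)}), output layer N.
\frac{\partial L}{\partial W^{(l)}}
  = \frac{\partial L}{\partial a^{(N)}}
    \left( \prod_{k=l+1}^{N} \frac{\partial a^{(k)}}{\partial a^{(k-1)}} \right)
    \frac{\partial a^{(l)}}{\partial W^{(l)}}

% Gradient descent update with learning rate \eta:
W^{(l)} \leftarrow W^{(l)} - \eta \, \frac{\partial L}{\partial W^{(l)}}

% Many Jacobian factors with norm < 1 (e.g. saturated Sigmoid/Tanh) => vanishing gradients;
% factors with norm > 1 => exploding gradients.
```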
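The activation-function discussion (Sigmoid/Tanh → ReLU → Leaky ReLU) as a minimal numpy sketch; function names and the alpha value are illustrative, not lecture code. The gradients show the point: Sigmoid saturates, ReLU has zero gradient for negative inputs (dying ReLU), Leaky ReLU keeps a small slope.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

# Gradients: sigmoid saturates (max 0.25), ReLU is exactly 0 for negative inputs
# (a unit stuck there never recovers -> "dying ReLU"), Leaky ReLU keeps a small slope.
def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)

def relu_grad(x):
    return (x > 0).astype(float)

def leaky_relu_grad(x, alpha=0.01):
    return np.where(x > 0, 1.0, alpha)

x = np.array([-3.0, -0.5, 0.5, 3.0])
print(sigmoid_grad(x))     # small values, saturates for large |x|
print(relu_grad(x))        # [0. 0. 1. 1.]  -> no gradient flows for negative inputs
print(leaky_relu_grad(x))  # [0.01 0.01 1. 1.]
```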
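Batch Normalization in formulas (the standard form, added by me): per mini-batch, each activation is normalized and then rescaled with the learnable parameters gamma and beta.

```latex
\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad
\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2

\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad
y_i = \gamma \hat{x}_i + \beta
```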
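No answer is recorded for the LSTM question; as a reminder, the standard gate equations (my addition, textbook formulation):

```latex
f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)            % forget gate
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)            % input gate
\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)     % candidate cell state
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t   % cell state update
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)            % output gate
h_t = o_t \odot \tanh(c_t)                        % hidden state
```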
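The minimax game mentioned in the GAN answer, in its standard form (my addition): D maximizes and G minimizes the same value function.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)]
  + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]
```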
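The "combined loss function" of Cycle-Consistent GANs in its standard form (my addition): two adversarial losses for the mappings G: X→Y and F: Y→X plus a cycle-consistency term, which is what makes F act as the trainable inverse mapping of G.

```latex
\mathcal{L}(G, F, D_X, D_Y) =
    \mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
  + \mathcal{L}_{\text{GAN}}(F, D_X, Y, X)
  + \lambda \, \mathcal{L}_{\text{cyc}}(G, F)

\mathcal{L}_{\text{cyc}}(G, F) =
    \mathbb{E}_{x}\bigl[\lVert F(G(x)) - x \rVert_1\bigr]
  + \mathbb{E}_{y}\bigl[\lVert G(F(y)) - y \rVert_1\bigr]
```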
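To illustrate the last two answers, a deliberately tiny PyTorch sketch (my own illustration, assuming PyTorch is available; layer sizes are arbitrary): both models share the encoder-decoder structure, but the U-Net-style variant concatenates the input back in via a skip connection, which is exactly why it could "basically copy the input".

```python
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    """Undercomplete autoencoder: the bottleneck forces a compressed code."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU())  # 28 -> 14
        self.dec = nn.Sequential(nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1))    # 14 -> 28

    def forward(self, x):
        return self.dec(self.enc(x))

class TinyUNet(nn.Module):
    """Same encoder-decoder, plus a skip connection that concatenates the input."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(8, 8, 4, stride=2, padding=1)
        # The decoder sees the upsampled features AND the raw input via the skip
        # connection, so reproducing the input becomes almost trivial.
        self.out = nn.Conv2d(8 + 1, 1, 3, padding=1)

    def forward(self, x):
        z = self.enc(x)
        up = self.up(z)
        return self.out(torch.cat([up, x], dim=1))

x = torch.randn(2, 1, 28, 28)
print(TinyAE()(x).shape, TinyUNet()(x).shape)  # both (2, 1, 28, 28)
```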