Examiner: Prof. Nöth

**Performance Evaluation**

* "Explain the ROC curve." (TP rate, FP rate, drawing it, what it means, area under the curve)
* "What do we need to be able to use a ROC curve?" (We need a binary yes/no decision for one class, not a problem with two or more classes.) That one was very vague and it took me some time until I said what he wanted to hear.

**Bayes Classifier**

* "Explain the Bayes classifier." (prior, posterior, Bayes formula, decide for argmax_y p(y|x))
* "How do we get p(x|y) and p(y)?" (By making assumptions about the kind of distribution; then we can estimate its parameters from the training data.)
* [I took the Gaussian distribution as an example of such an assumption, which led over to...]

**Gaussian Classifier**

* How to estimate the parameters.
* What does the decision boundary look like? (quadratic, or sometimes (when?) linear)
* "What can we do in order to get rid of the exp(-1/2 (x - \mu)^T \Sigma^{-1} (x - \mu)) part?" (What are logistic functions, how to formulate a two-class problem in terms of logistic functions, the role of F(x); the decision boundary is F(x) = 0.)
* I also explained what F(x) looks like in the Gaussian case and how this explains a linear decision boundary when the covariance matrices are equal (see the sketch at the end of this protocol). Not sure whether he wanted to hear that.
* "How large is the covariance matrix for 100-dimensional vectorial data?" (100 x 100 = 10000 entries, of which only about half are distinct because of symmetry; O(n^2))
* "Naive Bayes..." (...assumes independence of the components, so the covariance matrix is diagonal: 100 entries)
* "Something in between?" (a covariance matrix with only the main diagonal and a few minor diagonals, i.e. a band matrix)
* "When is this appropriate; why should only some, but not all, components be related to each other?" (time-sampled data or similar)

**Unrelated**

* "What can we do if the dimension is too high?" (e.g. PCA; no further questions)

**NN and kNN Classifiers**

* "What does the NN classifier do?"
* "What are the requirements on the data?" (It must be normalized; all components should span the same range.)
* "How does kNN work?" (see the sketch at the end of this protocol)
* "Explain the code." (A detailed explanation of the weird MATLAB syntax was needed!)

**EM Algorithm**

* "But if we don't have a Gaussian, what can we try?" (maybe a Gaussian mixture model, fitted with the EM algorithm; see the sketch at the end of this protocol)
* "Explain the formula."
* "Explain the steps."
* It converges only to a local maximum.

**Summary**

Nöth often expected me to continue his own sentences. Questions were extremely vague. I would definitely //not// call the atmosphere "kind". Nöth was pretty picky about minor mistakes, and the tone was rather condescending. Nevertheless, the grading was pretty student-friendly and forgiving.
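
**Appendix: sketches of the formulas and code referred to above**

For the Gaussian classifier question, this is roughly the F(x) argument I gave (the standard two-class log-odds derivation; the notation is mine, not necessarily Nöth's):

<code latex>
% Two-class decision via the log-odds F(x); decide class 1 iff F(x) > 0,
% decision boundary at F(x) = 0, posterior via the logistic function:
F(x) = \log\frac{p(x \mid y=1)\,p(y=1)}{p(x \mid y=0)\,p(y=0)},
\qquad
p(y=1 \mid x) = \frac{1}{1 + e^{-F(x)}}.

% With Gaussian class-conditional densities p(x|y=k) = N(x; \mu_k, \Sigma_k),
% each log-density contributes -1/2 (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) + const,
% so F(x) is in general quadratic in x.

% If \Sigma_0 = \Sigma_1 = \Sigma, the quadratic terms x^T \Sigma^{-1} x cancel:
F(x) = (\mu_1 - \mu_0)^T \Sigma^{-1} x
     - \tfrac{1}{2}\bigl(\mu_1^T \Sigma^{-1} \mu_1 - \mu_0^T \Sigma^{-1} \mu_0\bigr)
     + \log\frac{p(y=1)}{p(y=0)},
% which is linear in x, hence the linear decision boundary F(x) = 0.
</code>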
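
For the kNN part: the code in the exam was the MATLAB code from the exercises, so the following is only a generic NumPy sketch of the same kNN idea (function and variable names are my own, not from the course code):

<code python>
import numpy as np

def knn_classify(X_train, y_train, X_test, k=3):
    """Classify each test vector by majority vote among its k nearest
    training vectors (Euclidean distance). Assumes the features have
    already been normalized so all components span comparable ranges."""
    predictions = []
    for x in X_test:
        # squared Euclidean distances from x to all training samples
        dists = np.sum((X_train - x) ** 2, axis=1)
        # indices of the k closest training samples
        nearest = np.argsort(dists)[:k]
        # majority vote over their labels
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)

# toy usage
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(X_train, y_train, np.array([[0.05, 0.1], [0.95, 1.0]]), k=3))
</code>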
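
For the EM questions, these are the standard update equations for a Gaussian mixture model p(x) = \sum_k \pi_k N(x; \mu_k, \Sigma_k) (again textbook notation, not necessarily the exact formula he wanted to hear):

<code latex>
% E-step: responsibility of component k for sample x_i
\gamma_{ik} = \frac{\pi_k\, \mathcal{N}(x_i; \mu_k, \Sigma_k)}
                   {\sum_j \pi_j\, \mathcal{N}(x_i; \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibility-weighted samples
N_k = \sum_i \gamma_{ik}, \qquad
\pi_k = \frac{N_k}{N}, \qquad
\mu_k = \frac{1}{N_k} \sum_i \gamma_{ik}\, x_i, \qquad
\Sigma_k = \frac{1}{N_k} \sum_i \gamma_{ik}\, (x_i - \mu_k)(x_i - \mu_k)^T

% Iterating E- and M-steps never decreases the log-likelihood,
% but it only converges to a local maximum (depends on the initialization).
</code>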