FSI Informatik/pruefungen/hauptstudium/ls5/pa-2019-08-20

Du befindest dich hier: FSI Informatik » Prüfungsfragen und Altklausuren » Hauptstudiumsprüfungen » Lehrstuhl 5 » pa-2019-08-20 (Übersicht)

Grade: 1.0

The exam started off with Prof. Riess asking to give an overview of all topics in the course that we discussed.

Q. He asked me about density estimation and what the idea behind that was.

A. I started with what the goal behind density estimation is, the approaches we could follow for the task (parametric, non-parametric). He stopped me to ask what parametric and non-parametric meant. I gave an explanation and detailed the approaches we could follow for each of the above (ML/MAP estimate for the former, Parzen window estimate for the latter).

Q. Yeah the bare bones Parzen window estimate. Could you detail on that perhaps.

A. I started once again with the idea behind using a kernel function to estimate the data, and briefly discussed how it's formulated by writing a bunch of formulas.

Q. Great. So we implemented something like this in the first exercise where we were trying to visualize the results of using this approach. Say you have a friend that sends you an output image of a raccoon but you don't know whether they used a Gaussian or a box kernel. Is there a way to know this? He then drew a 1D feature space such that it had samples on the x-axis.

A. So I assume that with a Gaussian kernel, you'd have a smoother output visually but you might get block artifacts from a box kernel. He was clearly looking for something more, and gave me a bit of time with it, asking me what 'smoothing' would mean from a visual perspective, but then I drew a box-like pdf as a result of sliding a box kernel through the samples. For the Gaussian kernel, I drew a bell-shaped estimated pdf, and this was what he was looking for.

Q. So we also discussed clustering. What is the idea behind that?

A. I explained what the goal with a clustering algorithm is, and then briefly touched upon the two algorithms that we generally use (K-means and a GMM).

Q. We also discussed another algorithm in class that isn't a conventional clustering algorithm but can be used to perform the task.

A. Yes, we talked about the mean shift algorithm. I explained the idea behind the mean shift (how we are looking for modes of density), and how we can get to these modes by looking for the zero gradients of the density. I then explained how the mean shift works for a clustering algorithm.

Q. Say your friend sends you results from another project he is working on, and its basically an image of a dataset where some clustering has been performed. But your friend isn't too talkative and you want to find out how he might have arrived at the clusters.

A. I drew two cases where a conventional k-means algorithm doesn't work well (oblong data/unequal variance and unequal samples in clusters). K-means tends to produce unintuitive results in such cases, but mean shift would work. If the data has spherical clusters it perhaps wouldn't be possible to judge which algorithm the friend used by seeing the visual results. But for the former datasets, a judgement is possible.

Q. I would like to move on to a probabilistic graphical model. But before we go into details, tell me how we tackle the probability space in these problems? He wrote down a joint probability density function (similar to the lecture), with N input observations and M output observations.

A. I remarked how using the entire combinations leads to an NxM probabilistic space which perhaps is impossible to model. So we use the Naive Bayes assumption. I then wrote a bunch of formulas detailing how the original space could be broken down with this assumption and the model becomes tractable. Also drew the probabilistic graph for Naive Bayes.

Q. Great, but the Naive Bayes is just a little naive. Could you list some potential application-specific problems, and how we address them?

A. For data where sequence of events is important (listed an example from a reference: words and parts-of-speech as labels, the sequence information is important). We can use a Hidden Markov Model, wrote a few formulas to show how the Naive Bayes could be adapted for HMM.

Q. Could you draw a HMM graph?

A. Listed the elements of HMM, drew the graph, explained what a left-right HMM is.

Q. Great, we also have some tasks in HMM and specific algorithms for those tasks.

A. Explained Inference and Training, and the tasks and algorithms in each.

Q. Could you explain what happens in the training phase? You don't need to write the formulas.

A. Gave the algorithm name, its an iterative expectation maximization approach. Explained what we do in expectation, and what we do in maximization.

Q. So the solution isn't globally optimal?

A. I said it wasn't, to which he asked how we could get to a globally optimal solution? Perhaps use different initialization schemes and then perform cross-validation to choose most optimal solution.

Environment: The atmosphere was quite friendly. The entire exam was more of a discussion rather than a monologue. After I gave him a general overview of the topic, he had specific questions where we went into greater detail.

Preparation: What helps is to have an intuitive understanding of each topic, so you can explain what a particular concept means, after which you can go into the details. I used the notes from the class. I went back to Google in case I had trouble reproducing the intuition behind a certain technique. Five or six days before the exam is a good enough time to prepare if you have gone through 60 or 70% of the material once before. I created small web diagrams for each topic, and wrote the formulas for all of them. Revised using these one day before the exam. The references are a good source for developing a well-rounded understanding of the topics. Went through around 70% of the references that were compulsory, mostly skipped the additional ones.