You are here: FSI Informatik » Exam Questions and Past Exams » Minor Subjects » Braindump Business Intelligence SS 2020 (Overview)
pruefungen:nebenfach:bwl_bin20 [11.08.2020 17:01] – created nakami | pruefungen:nebenfach:bwl_bin20 [11.08.2020 19:10] (current) – nakami | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | Braindump SS 2020 | + | ====== |
- | taken from | + | Taken from | ||
+ | |||
+ | Time: 90 min; seems sufficient, | ||
+ | |||
+ | Number of pages: 18(!) | ||
+ | |||
+ | == 1. Preprocessing == | ||
+ | |||
+ | |||
+ | Given a dataset with 5,050 examples; the task is to predict whether students pass the exam or not | ||
- | 1. Preprocessing: | ||
- | given dataset with 5,050 examples, shall predict whether students pass exam or not | ||
a) Explain which preprocessing is necessary to use a neural network on that data set (conversion to numeric values, missing values, etc.) | a) Explain which preprocessing is necessary to use a neural network on that data set (conversion to numeric values, missing values, etc.) | ||
- | b) I. What is the problem with a data set that has label " | + | |
+ | b) | ||
+ | |||
+ | I. What is the problem with a data set that has label " | ||
II. Is accuracy a good metric here? | II. Is accuracy a good metric here? | ||
- | -> (no, | + | |
+ | Solution: | ||
c) Your boss wants to do the following tasks: | c) Your boss wants to do the following tasks: | ||
+ | |||
1. Calculate the profit in the future | 1. Calculate the profit in the future | ||
+ | |||
2. Put employees in predefined groups | 2. Put employees in predefined groups | ||
- | (In other wordsName | + | |
- | 2. Evaluation | + | Which technology can he use to solve BOTH problems? | ||
+ | |||
+ | (In other words: name 2 models that can do regression AND classification) | ||
+ | |||
+ | == 2. Evaluation | ||
Confusion matrix about people spending a lot in an online shop. | Confusion matrix about people spending a lot in an online shop. | ||
+ | |||
a) calculate precision, recall and F1 score | a) calculate precision, recall and F1 score | ||
+ | |||
b) Argue which metric one should optimize when classifying customers who are likely to review my product positively, so that I can send them my product for free (precision!) | b) Argue which metric one should optimize when classifying customers who are likely to review my product positively, so that I can send them my product for free (precision!) | ||
+ | |||
c) Your model shows a low training error and a high validation error. What might be the issue? What can you change to fix it? | c) Your model shows a low training error and a high validation error. What might be the issue? What can you change to fix it? | ||
- | -> overfitting | + | |
+ | Solution: | ||
d) Given 3 ROC curves, which one is best? | d) Given 3 ROC curves, which one is best? | ||
- | 3. Decision Trees | + | |
+ | == 3. Decision Trees == | ||
a) read classification from a given tree: What two groups are targeted? | a) read classification from a given tree: What two groups are targeted? | ||
(Decision tree has leaves "Will respond to campaign" | (Decision tree has leaves "Will respond to campaign" | ||
+ | |||
b) what to do if tree is too complex | b) what to do if tree is too complex | ||
+ | |||
c) given two trees, which is better | c) given two trees, which is better | ||
- | 4.Neural Networks | + | |
+ | == 4. Neural Networks | ||
Given a Neural Network | Given a Neural Network | ||
+ | |||
a) compute activation potential and activation value | a) compute activation potential and activation value | ||
+ | |||
b) calculate error signal and new weight | b) calculate error signal and new weight | ||
+ | |||
c) explain back propagation | c) explain back propagation | ||
+ | |||
d) explain black box property | d) explain black box property | ||
- | 5. SVMs | + | |
- | a) what happens if point x (= one support vector) is erased from data set | + | == 5. SVMs == |
- | 6. Social Media Mining | + | |
+ | (An x-y coordinate system with samples from two classes is shown; no decision boundary is drawn. Both classes are mostly separated to the left and right, but one sample (which looks like an outlier) lies in the middle and belongs to one of the two classes.) | ||
+ | |||
+ | a) | ||
+ | |||
+ | Solution: Boundary travels toward the class which had the outlier. | ||
+ | |||
+ | == 6. Social Media Mining | ||
+ | |||
a) What is the difference between Social Media Mining and Social Media Analytics? | a) What is the difference between Social Media Mining and Social Media Analytics? | ||
+ | |||
b) see two WoM values, explain which shows better result of a marketing campaign | b) see two WoM values, explain which shows better result of a marketing campaign | ||
+ | |||
(it's not mentioned whether the marketing campaign is intended to result in more direct clicks or clicks through recommendation) | (it's not mentioned whether the marketing campaign is intended to result in more direct clicks or clicks through recommendation) | ||
- | c) difference | + | |
- | d) calculate | + | c) Difference |
- | e) argue which network is better, all centrality and centralization measures (closeness centralization is from the last step) and the actual networks | + | |
- | 7. Association rules | + | Solution: Centrality refers to the position of an individual actor. Centralization characterizes the total network. |
+ | |||
+ | d) Calculate | ||
+ | |||
+ | e) Argue which network is better! (Given: all centrality and centralization measures (closeness centralization is from the last step) and the actual networks) | ||
+ | |||
+ | == 7. Association rules == | ||
10,000 shoe sales were tracked. Left side: occurrence of a single shoe pair (in a basket) in percent. Right side: occurrence of two shoe pairs together (in a basket) as counts. We are looking at Speedrunner, | 10,000 shoe sales were tracked. Left side: occurrence of a single shoe pair (in a basket) in percent. Right side: occurrence of two shoe pairs together (in a basket) as counts. We are looking at Speedrunner, | ||
- | a) Find the four 1-to-1-itemset- association rules ({A} -> {B}) from given data. calculate support & confidence | ||
- | b) describe | + | a) Find the four 1-to-1 itemset association rules ({A} -> {B}) from the given data. Calculate support & confidence. | ||
+ | |||
+ | b) Describe the steps for the FP-Growth |
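
For question 7a, support and confidence of a rule {A} -> {B} can be read off the two given tables. The following is only a minimal Python sketch of that calculation, not the exam's solution: the shoe name `Trailmaster`, the helper `rule_metrics`, and all numbers except the 10,000 tracked sales are made-up assumptions for illustration.

```python
def rule_metrics(n_baskets, single_share, pair_count, antecedent):
    """Return (support, confidence) for a rule {antecedent} -> {consequent}.

    n_baskets:    total number of tracked baskets
    single_share: dict mapping item -> share of baskets containing it (0..1)
    pair_count:   number of baskets containing both items of the rule
    antecedent:   the item on the left-hand side of the rule
    """
    # support = P(A and B): share of all baskets containing both items
    support = pair_count / n_baskets
    # confidence = P(B | A): joint count divided by count of the antecedent
    antecedent_count = single_share[antecedent] * n_baskets
    confidence = pair_count / antecedent_count
    return support, confidence


# Hypothetical example data (NOT the exam's numbers)
N_BASKETS = 10_000
single_share = {"Speedrunner": 0.20, "Trailmaster": 0.10}
pair_count = 500  # baskets containing both Speedrunner and Trailmaster

support_st, conf_st = rule_metrics(N_BASKETS, single_share, pair_count, "Speedrunner")
support_ts, conf_ts = rule_metrics(N_BASKETS, single_share, pair_count, "Trailmaster")

print(f"{{Speedrunner}} -> {{Trailmaster}}: support={support_st:.3f}, confidence={conf_st:.3f}")
print(f"{{Trailmaster}} -> {{Speedrunner}}: support={support_ts:.3f}, confidence={conf_ts:.3f}")
```

Note that both directions of a rule share the same support, while the confidence differs because it is normalized by the antecedent's own frequency.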