Differences

This shows the differences between two versions of the page.

--- pruefungen:nebenfach:bwl_bin20 [11.08.2020 15:01] – created nakami
+++ pruefungen:nebenfach:bwl_bin20 [11.08.2020 17:10] (current) – nakami
@@ Line 1: / Line 1: @@
-Braindump SS 2020
+====== Braindump Business Intelligence SS 2020======
-taken from: https://pad.stuve.fau.de/p/businessintelligencess2020_2
+Taken from: https://pad.stuve.fau.de/p/businessintelligencess2020_2
+Time: 90 min; seems sufficient, but you should keep an eye on the clock.
+Number of pages: 18(!)
+== 1. Preprocessing ==
+Given dataset with 5,050 examples, shall predict whether students pass exam or not
-. Preprocessing:
-given dataset with 5,050 examples, shall predict whether students pass exam or not
 a) explain what preprocessing is necessary for using a neural network on that data set (conversion into numeric, missing values, etc.)
-b) I. What is the problem with a data set that has label "passed" = 5,000 and "not passed" = 50
+b)
+I. What is the problem with a data set that has label "passed" = 5,000 and "not passed" = 50
 II. Is accuracy a good metric here?
--> (no,imbalanced class ratio)
+Solution: no, imbalanced class ratio
 c) Your boss wants to do the following tasks:
 . Calculate the profit in the future
 . Put employees in pre-defined groups
-(In other wordsName 2 models that can do regression AND classification)
-. Evaluation
+Which technology can he use that is able to solve BOTH problems?
+(In other words: name 2 models that can do regression AND classification)
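For question b) above, the following is a minimal sketch of why accuracy is misleading on this class ratio. It uses only the counts given in the question (5,000 "passed", 50 "not passed"); the always-predict-"passed" baseline is an assumption added here for illustration and is not part of the exam.

```python
# Class counts taken from question b): 5,000 "passed" and 50 "not passed".
n_passed = 5000
n_not_passed = 50

# Hypothetical baseline (an assumption for illustration):
# always predict "passed" and ignore every feature.
correct = n_passed
accuracy = correct / (n_passed + n_not_passed)

# Recall on the minority class: the baseline finds none of the
# "not passed" students, so the numerator is 0.
recall_not_passed = 0 / n_not_passed

print(f"accuracy            = {accuracy:.4f}")
print(f"recall (not passed) = {recall_not_passed:.4f}")
```

The baseline reaches about 99% accuracy while never detecting a single failing student, which is why the solution points to the imbalanced class ratio.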
+== 2. Evaluation ==
 Confusion matrix about people spending a lot in an online shop.
 a) calculate precision, recall and F1 score
 b) argue what should one choose to classify customers that will likely review my product positively, so that I will send them my product for free (precision!)
 c) Your model shows a low training error and a high validation error. What might be the issue? What can you change to fix it?
--> overfitting
+Solution: overfitting
 d) given 3 ROCs, which is better
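For question a) of this section, the standard definitions of precision, recall and F1 can be sketched as below. The confusion matrix from the exam is not reproduced in this braindump, so the counts `tp`, `fp` and `fn` in the usage line are hypothetical, and the function name `precision_recall_f1` is chosen here for illustration only.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the positive class.

    tp: true positives, fp: false positives, fn: false negatives.
    Zero denominators yield 0.0 instead of raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, NOT the matrix from the exam.
p, r, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

In the setting of question b), a false positive means sending a free product to someone who will not review it positively, which is the cost that precision measures.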
-. Decision Trees
+== 3. Decision Trees ==
 a) read classification from a given tree: What two groups are targeted?
 (Decision tree has leaves "Will respond to campaign" Yes, No)
 b) what to do if tree is too complex
 c) given two trees, which is better
-.Neural Networks
+== 4. Neural Networks ==
 Given a Neural Network
 a) compute activation potential and activation value
 b) calculate error signal and new weight
 c) explain back propagation
 d) explain black box property
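Questions a) and b) of this section can be illustrated with a single output neuron. This is only a sketch under assumptions: the network from the exam is not given here, so the inputs, weights, target and learning rate are hypothetical, and a sigmoid activation with a squared-error loss is assumed (the lecture may use a different activation function or sign convention).

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation function (assumed here, not confirmed by the exam)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values for one output neuron with two inputs.
inputs = [1.0, 0.5]
weights = [0.4, -0.2]
bias = 0.0
target = 1.0
learning_rate = 0.5

# a) Activation potential: weighted sum of the inputs plus bias.
potential = sum(w * x for w, x in zip(weights, inputs)) + bias
#    Activation value: the activation function applied to the potential.
value = sigmoid(potential)

# b) Error signal of an output neuron for squared error and sigmoid:
#    (target - output) * derivative of the sigmoid at the potential.
error_signal = (target - value) * value * (1.0 - value)
#    New weight: old weight + learning rate * error signal * input.
new_weights = [w + learning_rate * error_signal * x
               for w, x in zip(weights, inputs)]

print(f"potential={potential:.4f} value={value:.4f}")
print(f"error_signal={error_signal:.4f} new_weights={new_weights}")
```

Because the target is above the current output and both inputs are positive, both weights increase in this example.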
-. SVMs
-a) what happens if point x (= one support vector) is erased from data set
+== 5. SVMs ==
-. Social Media Mining
+(SVM x-y-coordinate system with samples from two classes is shown. There is no boundary drawn. Although both classes are mostly separated to either left or right, one sample (which looks like an outlier) is placed in the middle, but is part of one of those classes.)
+a) What happens if point x (= one support vector) is erased from data set?
+Solution: Boundary travels toward the class which had the outlier.
+== 6. Social Media Mining ==
 a) What is the difference between Social Media Mining and Social Media Analytics?
 b) see two WoM values, explain which shows better result of a marketing campaign
 (it's not mentioned whether the marketing campaign is intended to result in more direct clicks or clicks through recommendation)
-c) difference between centrality and centralization
-d) calculate closeness centralization of network
+c) Difference between centrality and centralization
-e) argue which network is better, all centrality and centralization measures (closeness centralization is from the last step) and the actual networks were given
-. Association rules
+Solution: Centrality refers to the position of an individual actor. Centralization characterizes the total network.
+d) Calculate closeness centralization of network
+e) Argue which network is better! (all centrality and centralization measures (closeness centralization is from the last step) and the actual networks are given)
+== 7. Association rules ==
 .000 shoes sales were tracked. Left side: Single-shoe-pair occurrence (in basket) in percentage. Right side: Two shoe pairings occurrence (in basket) in numbers. We are looking at Speedrunner, Endurance and Fighter (names of shoes).
-a) Find the four 1-to-1-itemset- association rules ({A} -> {B}) from given data. calculate support & confidence
-b) describe FP-Growth
+a) Find the four 1-to-1-itemset association rules ({A} -> {B}) from given data. Calculate support & confidence.
+b) Describe the steps for the FP-Growth algorithm.
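For question a) of this section, support and confidence of a rule {A} -> {B} can be computed as sketched below. The exam's actual numbers are not reproduced in this braindump, so the basket count, the single-item share and the pair count are hypothetical; the function names are chosen here for illustration.

```python
def support(count_ab: int, n_baskets: int) -> float:
    """Support of {A} -> {B}: share of all baskets containing both A and B."""
    return count_ab / n_baskets


def confidence(support_ab: float, support_a: float) -> float:
    """Confidence of {A} -> {B}: support(A and B) divided by support(A)."""
    return support_ab / support_a


# Hypothetical numbers, NOT the ones from the exam.
n_baskets = 1000   # total tracked sales (assumed)
support_a = 0.30   # single-item share of A, given as a percentage (assumed)
count_ab = 120     # baskets containing both A and B, given as a count (assumed)

s = support(count_ab, n_baskets)
c = confidence(s, support_a)
print(f"support={s:.2f} confidence={c:.2f}")
```

Note that {A} -> {B} and {B} -> {A} share the same support but generally differ in confidence, because the denominator changes from the share of A to the share of B. Since the task gives single items in percent and pairs as counts, the pair count has to be converted to a share before dividing.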