Braindump Business Intelligence SS 2020

taken from: https://pad.stuve.fau.de/p/businessintelligencess2020_2

1. Preprocessing
Given: a data set with 5,050 examples; the task is to predict whether students pass the exam or not.
  a) Explain what preprocessing is necessary to use a neural network on that data set (conversion to numeric values, missing values, etc.).
  b) I. What is the problem with a data set that has 5,000 examples labeled "passed" and 50 labeled "not passed"?
     II. Is accuracy a good metric here? → No (imbalanced class ratio).
  c) Your boss wants the following tasks done: 1. calculate the future profit, 2. put employees into predefined groups. (In other words: name 2 models that can do regression AND classification.)

2. Evaluation
Given: a confusion matrix about people spending a lot in an online shop.
  a) Calculate precision, recall and F1 score.
  b) Argue which metric one should choose to classify customers who will likely review my product positively, so that I can send them my product for free. → Precision!
  c) Your model shows a low training error and a high validation error. What might be the issue? What can you change to fix it? → Overfitting
  d) Given 3 ROC curves, which is better?

3. Decision Trees
  a) Read the classification from a given tree: which two groups are targeted? (The decision tree has leaves "Will respond to campaign": Yes, No.)
  b) What can be done if the tree is too complex?
  c) Given two trees, which is better?

4. Neural Networks
Given: a neural network.
  a) Compute the activation potential and the activation value.
  b) Calculate the error signal and the new weight.
  c) Explain backpropagation.
  d) Explain the black box property.

5. SVMs
  a) What happens if point x (= one support vector) is erased from the data set?

6. Social Media Mining
  a) What is the difference between Social Media Mining and Social Media Analytics?
  b) Given two WoM values, explain which one shows the better result of a marketing campaign. (It was not mentioned whether the marketing campaign is intended to result in more direct clicks or in clicks through recommendation.)
  c) What is the difference between centrality and centralization?
  d) Calculate the closeness centralization of a network.
  e) Argue which network is better. All centrality and centralization measures (closeness centralization taken from the previous step) and the actual networks were given.

7. Association rules
10,000 shoe sales were tracked. Left side: occurrence of single shoe pairs (in basket) as a percentage. Right side: occurrence of two-shoe pairings (in basket) as absolute counts. We are looking at Speedrunner, Endurance and Fighter (names of shoes).
  a) Find the four 1-to-1 itemset association rules ({A} → {B}) from the given data. Calculate support and confidence.
  b) Describe FP-Growth.
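
For 7 a), a minimal sketch of how support and confidence could be computed from that kind of table. The exam's actual numbers are not recorded here, so the shares and the pair count below are made-up placeholders, and `rule_metrics` is a hypothetical helper name. Only the total of 10,000 baskets comes from the question.

```python
def rule_metrics(count_a, count_ab, n_baskets):
    """Return (support, confidence) for the rule {A} -> {B}.

    support    = share of all baskets containing both A and B
    confidence = share of the baskets containing A that also contain B
    """
    support = count_ab / n_baskets
    confidence = count_ab / count_a
    return support, confidence


N_BASKETS = 10_000  # total number of tracked sales, as stated in the question

# Placeholder values (NOT the exam's numbers): single items are given as a
# percentage of all baskets, pairs are given as absolute counts.
percent_speedrunner = 20
percent_fighter = 10
count_speedrunner_and_fighter = 500

# Convert the percentages into absolute basket counts first.
count_speedrunner = percent_speedrunner * N_BASKETS // 100
count_fighter = percent_fighter * N_BASKETS // 100

# {Speedrunner} -> {Fighter}
sup_sf, conf_sf = rule_metrics(
    count_speedrunner, count_speedrunner_and_fighter, N_BASKETS
)

# {Fighter} -> {Speedrunner}: same support, but a different confidence,
# because the denominator is now the number of Fighter baskets.
sup_fs, conf_fs = rule_metrics(
    count_fighter, count_speedrunner_and_fighter, N_BASKETS
)

print(f"Speedrunner -> Fighter: support={sup_sf:.2f}, confidence={conf_sf:.2f}")
print(f"Fighter -> Speedrunner: support={sup_fs:.2f}, confidence={conf_fs:.2f}")
```

The support of a rule is the same in both directions. Only the confidence changes with the direction, which is why each pairing can yield two different rules.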