Hello Peers, today we are sharing the assessment and quiz answers for every week of the Mastering Data Analysis in Excel course, offered by Coursera free of cost ✅✅✅. This is a certification course open to every interested student.
If you cannot find this course for free, you can apply for financial aid to get it at no cost.
Check out this article: “How to Apply for Financial Aid?”
Coursera is one of the largest online learning platforms and offers a wide range of free courses for students. These courses come from recognized universities, where industry experts and professors teach in a clear and understandable way.
Here you will find the Mastering Data Analysis in Excel exam answers, given below in bold.
These recently updated answers are 100% correct ✅ and cover every week, assessment, and final exam of Mastering Data Analysis in Excel, the Coursera free certification course.
Use “Ctrl+F” to find the answer to any question. On mobile, tap the three dots in your browser and choose the “Find” option, then search for the question you need.
Apply Link – Mastering Data Analysis in Excel
Mastering Data Analysis in Excel Coursera Quiz Answer
Week- 4
Parametric Models for Regression (graded)
1. A University admissions test has a Gaussian distribution of test scores with a mean of 500 and standard deviation of 100. One student out-performed 97.4% of all test takers.
What was their test score (rounded to the nearest whole number)?
Hint: Refer to the Excel NormS Functions Spreadsheet.
Excel NormS Functions Spreadsheet.xlsx
- 694
- 502
- 306
- 972
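For readers who want to check question 1 outside Excel, here is a minimal Python sketch. It assumes the inverse normal CDF from the standard library plays the same role as the NormSInv-style functions in the course spreadsheet; the variable names are ours, not the course's.

```python
from statistics import NormalDist

# Test scores: Gaussian with mean 500 and standard deviation 100 (from the question).
scores = NormalDist(mu=500, sigma=100)

# Score that sits above 97.4% of all test takers (inverse CDF at 0.974).
cutoff = scores.inv_cdf(0.974)

print(round(cutoff))
```

The printed value can be compared with the options listed above.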
2. A carefully machined wire comes off an assembly line within a certain tolerance. Its diameter is 100 microns, and all the wires produced have a uniform distribution of error, between -11 microns and +29 microns.
A testing machine repeatedly draws samples of 180 wires and measures the sample mean. What is the distribution of sample means?
Hint: Use the CLT and Excel Rand() Spreadsheet.
CLT and Excel Rand.xlsx
- A Uniform Distribution with mean = 109 microns and standard deviation = .8607 microns.
- A Uniform Distribution with mean = 109 microns and standard deviation = 11.54 microns.
- A Gaussian distribution that, in Phi notation, is written, ϕ(109, 133.33).
- A Gaussian Distribution that, in Phi notation, is written ϕ(109, .7407).
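The Central Limit Theorem step in question 2 can be sketched as follows. This assumes the textbook formulas for a uniform distribution (mean = midpoint, variance = width squared over 12) and reads the second number in the Phi notation as a variance, which is our interpretation of the options.

```python
# Uniform error on [-11, +29] microns around a nominal 100-micron diameter.
low, high = 100 - 11, 100 + 29
n = 180  # wires per sample

mean = (low + high) / 2              # mean of a uniform distribution
variance = (high - low) ** 2 / 12    # variance of a uniform distribution
variance_of_mean = variance / n      # CLT: variance of the sample mean
std_of_mean = variance_of_mean ** 0.5

print(mean, round(variance, 2), round(variance_of_mean, 4), round(std_of_mean, 4))
```

By the CLT the sample means are approximately Gaussian, not uniform, so only the Gaussian options need to be checked against these numbers.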
3. A population of people suffering from Tachycardia (occasional rapid heart rate), agrees to test a new medicine that is supposed to lower heart rate. In the population being studied, before taking any medicine the mean heart rate was 120 beats per minute, with standard deviation = 15 beats per minute.
After being given the medicine, a sample of 45 people had an average heart rate of 112 beats per minute. What is the probability that this much variation from the mean could have occurred by chance alone?
Hint: Use the Typical Problem with NormSDist Spreadsheet.
Typical Problem_ NormSDist .xlsx
- .0173%
- 99.9827%
- 1.73%
- 29.690%
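A rough Python version of the NormSDist calculation in question 3 is shown below. It assumes a one-tailed test on the sample mean, with the standard error taken as the population standard deviation divided by the square root of the sample size.

```python
from math import sqrt
from statistics import NormalDist

pop_mean, pop_sd = 120, 15    # heart rate before the medicine
n, sample_mean = 45, 112      # sample after the medicine

standard_error = pop_sd / sqrt(n)
z = (sample_mean - pop_mean) / standard_error

# One-tailed probability of a sample mean this low (or lower) by chance alone.
p_chance = NormalDist().cdf(z)

print(f"z = {z:.3f}, p = {p_chance:.4%}")
```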
4. Two stocks have the following expected annual returns:
Oil stock – expected return = 9% with standard deviation = 13%
IT stock – expected return = 14% with standard deviation = 25%
The Stocks prices have a small negative correlation: R = -.22.
What is the Covariance of the two stocks?
Hint: Use the Algebra with Gaussians Spreadsheet.
Algebra with Gaussians.xlsx
- -.0286
- -.00715
- -.00573
- -.00219
5. Two stocks have the following expected annual returns:
Oil stock – expected return = 9% with standard deviation = 13%
IT stock – expected return = 14% with standard deviation = 25%
The Stocks prices have a small negative correlation: R = -.22.
Assume return data for the two stocks is standardized so that each is represented as having mean 0 and standard deviation 1. Oil is plotted against IT on the (x,y) axis.
What is the covariance?
Hint: Use the Standardization Spreadsheet.
Standardization Spreadsheet.xlsx
- -.22
- -.00573
- 0
- -1
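Questions 4 and 5 both rest on the identity covariance = correlation × (standard deviation of x) × (standard deviation of y). A minimal sketch, with variable names of our own choosing:

```python
sd_oil, sd_it = 0.13, 0.25   # standard deviations of annual returns
r = -0.22                    # correlation between the two stocks

# Covariance on the original (unstandardized) returns.
covariance = r * sd_oil * sd_it

# After standardization both standard deviations are 1,
# so the covariance equals the correlation itself.
standardized_covariance = r * 1.0 * 1.0

print(round(covariance, 5), standardized_covariance)
```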
6. Two stocks have the following expected annual returns:
Oil stock – expected return = 9% with standard deviation = 13%
IT stock – expected return = 14% with standard deviation = 25%
The Stocks prices have a small negative correlation: R = -.22.
What is the standard deviation of a portfolio consisting of 70% Oil and 30% IT?
Hint: Use either the Algebra with Gaussians or the Markowitz Portfolio Optimization Spreadsheet.
Algebra with Gaussians.xlsx
Markowitz Portfolio Optimization.xlsx
- 12.68%
- 10.44%
- 17.93%
- 11.79%
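The portfolio calculation in question 6 can be reproduced with the standard two-asset variance formula. This is a sketch under that assumption, not a copy of the course spreadsheet.

```python
from math import sqrt

sd_oil, sd_it, r = 0.13, 0.25, -0.22
w_oil, w_it = 0.70, 0.30     # portfolio weights from the question

cov = r * sd_oil * sd_it
# Var(w1*X + w2*Y) = w1^2 Var(X) + w2^2 Var(Y) + 2 w1 w2 Cov(X, Y)
variance = w_oil**2 * sd_oil**2 + w_it**2 * sd_it**2 + 2 * w_oil * w_it * cov
portfolio_sd = sqrt(variance)

print(f"{portfolio_sd:.2%}")
```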
7. Two stocks have the following expected annual returns:
Oil stock – expected return = 9% with standard deviation = 13%
IT stock – expected return = 14% with standard deviation = 25%
The Stocks prices have a small negative correlation: R = -.22.
Use MS Solver and the Markowitz Portfolio Optimization Spreadsheet to Find the weighted portfolio of the two stocks with lowest volatility.
Solver Add-In.xlsx
Markowitz Portfolio Optimization.xlsx
What is the minimum volatility?
- 10.43%
- 9.5%
- 10.36%
- 11.58%
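The question asks for MS Solver, but for two assets the minimum-volatility weights also have a closed form. The sketch below uses that closed form instead of Solver, so treat it as a cross-check rather than the course's method.

```python
from math import sqrt

sd_oil, sd_it, r = 0.13, 0.25, -0.22
cov = r * sd_oil * sd_it

# Closed-form minimum-variance weight for a two-asset portfolio.
w_oil = (sd_it**2 - cov) / (sd_oil**2 + sd_it**2 - 2 * cov)
w_it = 1 - w_oil

min_variance = w_oil**2 * sd_oil**2 + w_it**2 * sd_it**2 + 2 * w_oil * w_it * cov
min_sd = sqrt(min_variance)

print(f"weights: oil {w_oil:.2%}, IT {w_it:.2%}; volatility {min_sd:.2%}")
```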
8. You are a data-analyst for a restaurant chain and are asked to forecast first-year revenues from new store locations. You use census tract data to develop a linear model.
Your first model has a standard deviation of model error of $25,000 at a correlation of R = .30. Your boss asks you to keep working on improving the model until the new standard deviation of model error is $15,000 or less.
What positive correlation R would you need to have a model error of $15,000?
(Note: you can answer this question by making small additions to the Correlation and Model Error spreadsheet).
Correlation and Model Error.xlsx
- R = .428
- R = .8200
- R = .500
- R = .572
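Question 8 can be worked through in Python if we assume the usual linear-regression relationship: standard deviation of model error = (standard deviation of the output) × sqrt(1 − R²). That relationship is our assumption about what the Correlation and Model Error spreadsheet implements.

```python
from math import sqrt

error_now, r_now = 25_000, 0.30
error_target = 15_000

# Back out the standard deviation of the output from the first model.
sd_output = error_now / sqrt(1 - r_now**2)

# Solve error_target = sd_output * sqrt(1 - R^2) for the positive R.
r_needed = sqrt(1 - (error_target / sd_output) ** 2)

print(round(r_needed, 4))
```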
9. An automobile parts manufacturer uses a linear regression model to forecast the dollar value of the next years’ orders from current customers as a function of a weighted sum of their past-years’ orders. The model error is assumed Gaussian with standard deviation of $130,000.
If the correlation is R = .33, and the point forecast for orders is $5.1 million, what is the probability that the customer will order more than $5.3 million?
Hint: Use the Typical Problem with NormSDist Spreadsheet.
Typical Problem_ NormSDist .xlsx
- 93.8%
- 4.3%
- 6.2%
- 12.4%
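For question 9, the forecast can be treated as a Gaussian centred on the point forecast with the stated model error as its standard deviation. A short sketch under that reading (the correlation is not needed once the model error is given):

```python
from statistics import NormalDist

point_forecast = 5_100_000
model_error_sd = 130_000
order_level = 5_300_000

forecast = NormalDist(mu=point_forecast, sigma=model_error_sd)

# Probability that the actual order exceeds $5.3 million.
p_above = 1 - forecast.cdf(order_level)

print(f"{p_above:.1%}")
```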
10. An automobile parts manufacturer uses a linear regression model to forecast the dollar value of the next years’ orders from current customers as a function of a weighted sum of that customer’s past-years orders. The linear correlation is R = .33.
After standardizing the x and y data, what portion of the uncertainty about a customer’s order size is eliminated by their historical data combined with the model?
Hint: Use the Correlation and P.I.G. Spreadsheet.
Correlation and P.I.G..xlsx
- 4.2%
- 3.5%
- 4.5%
- 5.2%
11. A restaurant offers different dinner “specials” each weeknight. The mean cash register receipt per table on Wednesdays is $75.25 with standard deviation of $13.50. The restaurant experiments one Wednesday with changing the “special” from blue fish to lobster. The average amount spent by 85 customers is $77.20.
How probable is it that Wednesday receipts are better than average by chance alone?
Hint: Use the Typical Problem with NormSDist Spreadsheet.
Typical Problem_ NormSDist .xlsx
- 9.15%
- 9.05%
- 90.85%
- 8.30%
12. Your company currently has no way to predict how long visitors will spend on the Company’s web site. All that is known is that the average time spent is 55 seconds, with an approximately Gaussian distribution and a standard deviation of 9 seconds. It would be possible, after investing some time and money in analytics tools, to gather and analyze information about visitors and build a linear predictive model with a standard deviation of model error of 4 seconds.
How much would the P.I.G. of that model be?
Hint: Use the Correlation and P.I.G. Spreadsheet
How to use the AUC calculator.pdf
- 48.2%
- 61.5%
- 53.3%
- 57.2%
Mastering Data Analysis in Excel Quiz Answer
Week- 5
Probability, AUC, and Excel Linest Function
1. Keep the 125 outcomes in the Histogram Spreadsheet unchanged. Change the bin ranges so that bin 1 is [-3, -1), bin 2 is [-1, 1), and bin 3 is [1, 3).
Histograms Spreadsheet.xlsx
What is the approximate probability that a new outcome will fall within bin 1?
- .4
- 4%
- 5
- 5%
2. Use the Excel Probability Functions Spreadsheet.
Excel_Probability_Functions.xlsx
Assume a continuous uniform probability distribution over the range [47, 51.5].
What is the skewness of the probability distribution?
- 49.25
- 1.69
- 2.17
- 0
3. Use the Excel Probability Functions Spreadsheet, provided in question #2.
Assume a continuous uniform probability distribution over the range [-12, 20]
What is the entropy of this distribution?
- 5 bits
- 3 bits
- 6 bits
- 4 bits
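The entropy in question 3 can be checked with the formula for a continuous uniform distribution, H = log2(b − a) bits. We assume this is the definition used by the Excel Probability Functions Spreadsheet.

```python
from math import log2

a, b = -12, 20   # range of the uniform distribution

# Differential entropy of a continuous uniform distribution, in bits.
entropy_bits = log2(b - a)

print(entropy_bits)
```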
4. Use the Excel Probability Functions Spreadsheet that was previously provided.
Assume a Gaussian Probability function with mean = 3 and standard deviation = 4.
What is the value of f(x) at f(3.5)?
- 4.05
- .352
- .099
- .550
5. Use the Excel Probability Functions Spreadsheet previously provided in this quiz.
Assume a Gaussian Probability Distribution with mean = 3 and standard deviation = 4.
What is the cumulative distribution at x = 7?
- .960
- .841
- .060
- 1.00
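Questions 4 and 5 ask for the density and the cumulative distribution of the same Gaussian. A minimal sketch using Python's standard library in place of the Excel functions:

```python
from statistics import NormalDist

gaussian = NormalDist(mu=3, sigma=4)

density_at_3_5 = gaussian.pdf(3.5)     # f(x) at x = 3.5
cumulative_at_7 = gaussian.cdf(7)      # P(X <= 7), one standard deviation above the mean

print(round(density_at_3_5, 3), round(cumulative_at_7, 3))
```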
6. Use the AUC Calculator Spreadsheet.
AUC_Calculator and Review of AUC Curve.xlsx
If the “modification factor” in the original example given in the AUC Calculator Spreadsheet is changed from -1 to -2, what is the change in the actual Area Under the ROC Curve?
- No change
- The area increases
- The area decreases
7. Use the AUC Calculator Spreadsheet provided in question #6.
If the “modification factor” in the original example given in the AUC Calculator Spreadsheet is changed from -1 to -2, what is the threshold (row 10) that results in the lowest cost per event?
- .45
- 3.5
- .9
- 1.3
8. Refer to the AUC Calculator Spreadsheet previously provided.
Assume a binary classification model is trained on 200 ordered pairs of scores and outcomes and has an AUC of .91 on this “training set.” The same model, on 5,000 new scores and outcomes, has an AUC of .5.
Which statement is most likely to be correct?
- The model overfit the training set data and will need to be improved to work better on the new data.
- The original model is expected to perform worse on test set data and is functioning acceptably.
- The original model identified signal as noise and has no predictive value on new data.
9. Refer to the Excel Linest Function Spreadsheet.
Excel Linest Function.xlsx
If a multivariate linear regression gives a weight beta(1) of 0.4 on x(1) = “age in years,” and a new input x(7) of “age in months” is added to the regression data, which of the following statements is false?
- If the x(1) data are removed, the new beta(7) on the new x(7) data will be .033
- Using Excel linest, and including x(1) and x(7) data, the new beta(7) on the age in months will be 0.
- If the x(1) data are removed, the new beta(7) on the new x(7) data will be 0.4.
10. Use the Excel Linest Function Spreadsheet that was provided in question #9.
What is the Correlation, R for the linear regression shown in the example?
- .367
- .606
- .778 or – .778
Mastering Data Analysis in Excel Quiz Answer
Week- 6
Part 1: Building your Own Binary Classification Model
1. First Binary Classification Model
Data_Final Project.xlsx
You work for a bank as a business data analyst in the credit card risk-modeling department. Your bank conducted a bold experiment three years ago: for a single day it quietly issued credit cards to everyone who applied, regardless of their credit risk, until the bank had issued 600 cards without screening applicants.
After three years, 150, or 25%, of those card recipients defaulted: they failed to pay back at least some of the money they owed. However, the bank collected very valuable proprietary data that it can now use to optimize its future card-issuing process.
The bank initially collected six pieces of data about each person:
· Age
· Years at current employer
· Years at current address
· Income over the past year
· Current credit card debt, and
· Current automobile debt
In addition, the bank now has a binary outcome: default = 1, and no default = 0.
Your first assignment is to analyze the data and create a binary classification model to forecast future defaults.
You will combine data from the above six inputs to output a single “score.” Use the Soldier Performance spreadsheet for a simple example of combining multiple inputs.
Forecasting Soldier Performance.xlsx
The relative rank-ordering of scores will determine the model’s effectiveness. For convenience (in particular, so that you can use the AUC Calculator Spreadsheet), you are asked to use a scale for your score that has a maximum < 3.5 and a minimum > -3.5.
At first you are not told your bank’s own best estimate of its cost per False Negative (accepted applicant who becomes a defaulting customer) and False Positive (rejected customer who would not have defaulted) classification.
Therefore, the best you can do is to design your model to maximize the Area Under the ROC Curve, or AUC.
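To make the AUC idea concrete, here is a small Python sketch. It uses the pairwise definition of AUC (the chance that a randomly chosen defaulter scores higher than a randomly chosen non-defaulter), which may differ in mechanics from the AUC Calculator Spreadsheet. The scores and outcomes are made-up toy values, not course data.

```python
def auc(scores, outcomes):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    A higher score is taken to mean "more likely to default"; ties count half.
    """
    defaults = [s for s, y in zip(scores, outcomes) if y == 1]
    non_defaults = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(
        1.0 if d > n else 0.5 if d == n else 0.0
        for d in defaults
        for n in non_defaults
    )
    return wins / (len(defaults) * len(non_defaults))


# Hypothetical toy data for illustration only.
toy_scores = [2.1, 0.4, -0.3, 1.2, -1.5, 0.9]
toy_outcomes = [1, 1, 0, 0, 0, 1]   # 1 = default, 0 = no default

print(round(auc(toy_scores, toy_outcomes), 2))
```

Because only the rank-ordering matters, any rescaling of the scores that keeps their order leaves this value unchanged.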
You are told that if your model is effective (“high enough” AUC, not defined further) and “robust” (again not defined, but in general this means relatively little decrease in AUC across multiple sets of new data) then it may be adopted by the bank as its predictive model for default, to determine which future applicants will be issued credit cards.
You are first given a “Training Set” of 200 out of the 600 people in the experiment. The Data_For_Final_Project (below) has both the training set and test set you will need.
Design your model using the Training Set. Standardized versions of the input data are also provided for your convenience. You may combine the six inputs by adding them to, or subtracting them from, each other, taking simple ratios, etc. Exclude inputs that are not helpful and then experiment with how to combine the most informative inputs.
Note that you will need some of your quiz answers again later, so please write them down and keep track of them as you go along.
Question: What is your model? Give it as a function of the two or more of the six inputs. For example: (Age + Years at Current Address)/Income [not a great model!].
Your model should have at least two inputs.
no |
2. What is your model’s AUC on the Training Set? Use two digits to the right of the decimal place.
.70 |
3. Initial Assessment for Over-fitting (testing your model on new data)
Next test your model, without changing any parameters, on the Test Set of 200 additional applicants. See the Test Set spreadsheet. It is part of the Data_For_Final_Project (below) and has both the training and test set.
Data_Final Project.xlsx
Hint: Make and use a second copy of the AUC Calculator Spreadsheet so that you can compare Test Set and Training Set results easily.
AUC_Calculator and Review of AUC Curve.xlsx
What is your model’s new AUC on the Test Set? Give two digits to the right of the decimal place.
0.80 |
4. Finding the Cost-Minimizing Threshold for your Model
Now that you have, hopefully, developed your model to the point where it is relatively “robust” across the training set and test set, your boss at the bank finally gives you its current rough estimate of the bank’s average costs for each type of classification error.
[Note that all bank models here include only profits and losses within three years of when a card is issued, so the impact of out-years (years beyond 3) can be ignored.]
Cost Per False Negative: $5000
Cost Per False Positive: $2500
For the 600 individuals that were automatically given cards without being classified, the total cost of the experiment turned out to be 25%*($5000)*600 or $750,000. This is $1,250 per event.
Only models with lower cost per event than $1,250 should have any value.
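The cost figures above translate into a simple threshold search. The sketch below is our own illustration of the idea, not the AUC Calculator itself: everything scoring at or above the threshold is classified as a default, and the toy scores and outcomes are hypothetical.

```python
COST_FALSE_NEGATIVE = 5000   # accepted applicant who later defaults
COST_FALSE_POSITIVE = 2500   # rejected applicant who would not have defaulted

# No model: everyone gets a card, so every defaulter (25%) is a False Negative.
baseline_cost_per_event = 0.25 * COST_FALSE_NEGATIVE
baseline_total_cost = baseline_cost_per_event * 600


def cost_per_event(scores, outcomes, threshold):
    """Average cost when scores at or above the threshold are classified as default."""
    pairs = list(zip(scores, outcomes))
    false_negatives = sum(1 for s, y in pairs if y == 1 and s < threshold)
    false_positives = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    total = false_negatives * COST_FALSE_NEGATIVE + false_positives * COST_FALSE_POSITIVE
    return total / len(pairs)


def best_threshold(scores, outcomes):
    """Try each observed score as a threshold and keep the cheapest one."""
    return min(sorted(set(scores)), key=lambda t: cost_per_event(scores, outcomes, t))


# Hypothetical toy data for illustration only.
toy_scores = [2.1, 0.4, -0.3, 1.2, -1.5, 0.9]
toy_outcomes = [1, 1, 0, 0, 0, 1]

t = best_threshold(toy_scores, toy_outcomes)
print(baseline_cost_per_event, t, round(cost_per_event(toy_scores, toy_outcomes, t), 2))
```

A model is only worth using when its cost per event at the chosen threshold falls below the baseline figure.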
Question: What is the threshold score on the Training Set data for your model that minimizes Cost per Event? You will need this number to answer later questions.
Hint: Using the AUC Calculator Spreadsheet, identify which Column displays the same cost-per-event (row 17) as the overall minimum cost-per-event shown in Cell J2. The threshold is shown in row 10 of that Column. What the threshold means is that at and above this number everything is classified as a “default.”
AUC_Calculator and Review of AUC Curve.xlsx
3.5 |
5. Finding the Minimum Cost Per Event
Question: Again referring only to the Training Set data, what is the overall minimum cost-per-event?
Hint: You will need this number to answer later questions. If you used the AUC Calculator, the overall minimum cost per event will be displayed in Cell J2.
Note: for Coursera to interpret your answer correctly you must give your answer as an integer – no decimals or dollar sign.
For Example – enter $800.00 as “800”
600 |
6. Comparing the New Minimum Cost Per Event on Test Set Data
When you compared AUC for the Training and Test Sets, all that was necessary was to look up the two different values in Cell G8. But to get an accurate measure of the cost-savings using the original model on new data, you cannot automatically use the new threshold that results in the overall lowest cost-per-event on the Test Set.
Remember that your model is being tested for its ability to forecast – but the new optimal threshold will be known only after the outcomes for the entire Test Set are known.
All you can use is the model you developed on the Training Set data and the threshold from the Training Set that you should have recorded when answering Question 4.
Question: At that same threshold score (NOT the threshold score that would minimize costs for the new Test Set, but the “old” threshold score that minimized costs on the Training Set) what is the cost per event on the test set?
Hint: Using the AUC Calculator Spreadsheet previously provided, locate the column on the Training Set data that has the lowest-cost-per event. That same column and threshold in the Test Set copy of the AUC Calculator will have a new cost-per-event, displayed in row 17. This is almost always higher than the minimum cost-per-event on the Training Set, and also higher than what the minimal cost-per-event would be on the Test Set, if one could know the new optimal threshold in advance. This number is the actual cost per event when applying the model-and-threshold developed with the Training Set to the new, Test Set data.
Note: for Coursera to interpret your answer correctly you must give your answer as an integer – no decimals or dollar sign.
For Example – enter $800.00 as “800”
700.00 |
7. Putting a Dollar Value on Your Model Plus the Data
Assume your Test Set cost-per-event results from Question 6 are sustainable long term.
Question: How much money does the bank save, per event, using your model and its data-inputs, instead of issuing credit cards to everyone who asks?
Hint: the cost of issuing credit cards to everyone (no model, no forecast) has been determined to be 25%*$5000 = $1,250 per event. Dollar value of the model-plus-data is the difference between $1,250 and your number.
Note: for Coursera to interpret your answer correctly you must give your answer as an integer – no decimals or dollar sign.
For Example – enter $800.00 as “800”
100 |
8. Payback Period for Your Model
Question: Given that it apparently cost the bank $750,000 to conduct the three-year experiment, if the bank processes 1000 credit card applicants per day on average, how many days will it take to ensure future savings will pay back the bank’s initial investment?
Give number rounded to the nearest day (integer value).
Hint: multiply your answer to Question 7 – the cost savings per applicant – by 1000 to get the savings per day.
3 |
9. Any model that is reducing uncertainty will have a True Positive Rate…
- …Less than the Test Incidence (% of outcomes classified as “default”)
- …Equal to the Test Incidence (% of outcomes classified as “default”)
- …Greater than the Test Incidence (% of outcomes classified as “default”)
10. Given that the base rate of default in the population is 25%, any test that is reducing uncertainty will have a Positive Predictive Value (PPV)…
- …Less than .25
- …Greater than .25
- …Equal to .25
11. Given that the base rate of default in the population is 25%, any test that is reducing uncertainty will have a Negative Predictive Value (NPV)…
- …Less than .75
- …Greater than .75
- …Equal to .75
12. Confusion Matrix Metrics. To determine all performance metrics for a binary classification, it is sufficient to have three values
The Condition Incidence (here the default rate of 25%)
The probability of True Positives (the True Positive rate multiplied by the Condition Incidence)
The “Test Incidence” (also called “classification incidence” – the sum of the probability of True Positives and False Positives)
These three values can all be obtained from the AUC Calculator Spreadsheet and then used as inputs to the Information Gain Calculator Spreadsheet to determine all other performance metrics.
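To show why three values are enough, here is a hedged Python sketch that rebuilds the full confusion matrix from them. The function name and the example inputs are hypothetical; the definitions follow the three items listed above.

```python
def confusion_metrics(condition_incidence, p_true_positive, test_incidence):
    """Derive the confusion matrix (as probabilities) and common rates.

    condition_incidence: share of events that are actual defaults
    p_true_positive:     probability an event is a default AND classified as one
    test_incidence:      share of events classified as default (TP + FP)
    """
    tp = p_true_positive
    fp = test_incidence - tp
    fn = condition_incidence - tp
    tn = 1 - tp - fp - fn
    return {
        "true_positive_rate": tp / condition_incidence,
        "false_positive_rate": fp / (1 - condition_incidence),
        "ppv": tp / test_incidence,
        "npv": tn / (1 - test_incidence),
    }


# Hypothetical inputs for illustration only (not anyone's quiz answers).
metrics = confusion_metrics(condition_incidence=0.25, p_true_positive=0.15, test_incidence=0.30)
for name, value in metrics.items():
    print(name, round(value, 3))
```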
AUC_Calculator and Review of AUC Curve.xlsx
Information Gain Calculator.xlsx
Question: What is your model’s True Positive Rate?
Save this answer as it will be needed again for Part 3 (Quiz 3)
.30 |
13. Question: What is your model’s “test incidence”?
Save this answer as it will be needed again for Part 3 (Quiz 3)
.20 |
Mastering Data Analysis in Excel Quiz Answer
Part 2: Should the Bank Buy Third-Party Credit Information?
1. Introduction
Part 2 is intended to illustrate how binary classification performance metrics make it possible for you to put an exact value, in dollars per event, on new information that relates to a predictive model.
Note that new information will be worth far more if it is compared to no forecasting model rather than the state of partial knowledge available from the current model. Sellers of information (and data science consultants!) love to take credit for any information gain they achieve over the base rate.
Very often some intermediate state of knowledge is already available for which no additional spending is required. Evaluating the realistic incremental financial gain from new information, whether licensing a third-party commercial database or collecting new data internally, is therefore of great practical value, as this sets an upper bound on what your Company should be willing to pay to license or create the new information.
In this case study, your boss has been in discussions with an advanced machine-learning predictive-analytics credit-risk analytics company that claims to score individual probability of default with very high information gain. Let’s call the company Eggertopia. Eggertopia sales representatives claim their pre-processed risk-scores can achieve AUC values as high as .85 or even higher. However, Eggertopia scores are sold per-event, and they are expensive!
Your boss asks you to determine the incremental financial value to the bank of purchasing Eggertopia risk scores on future credit-card applicants.
Eggertopia agrees to apply its algorithms to generate credit scores for the 400 individuals in the Training and Test Sets. Eggertopia scores do not need to be combined with anything else to make a model. However, since the scores range from approximately -600 (best credit risk) to 4900 (most likely to default) they will need to be standardized and adjusted to fit the -3.5 to 3.5 range of the AUC Calculator Spreadsheet (below)
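One possible way to standardize and adjust raw scores is sketched below. It assumes a z-score transform followed by a uniform shrink when any value would fall outside the allowed range; the function name, the 0.99 safety margin, and the sample scores are our own choices, and the course may intend a different adjustment.

```python
from statistics import mean, pstdev


def standardize_to_range(raw_scores, limit=3.5):
    """Z-score the raw scores, then shrink them if any would reach +/- limit.

    Both steps are linear, so the rank-ordering of the scores is preserved.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)   # population standard deviation (an assumption)
    z_scores = [(s - mu) / sigma for s in raw_scores]

    largest = max(abs(z) for z in z_scores)
    if largest >= limit:
        shrink = 0.99 * limit / largest   # keep a small margin inside the limit
        z_scores = [z * shrink for z in z_scores]
    return z_scores


# Hypothetical raw scores spanning roughly the range mentioned in the text.
raw = [-600, 0, 150, 300, 4900]
print([round(z, 2) for z in standardize_to_range(raw)])
```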
AUC_Calculator and Review of AUC Curve.xlsx
You will determine the sustainable AUC of the Eggertopia scores, the sustainable cost-per-event, and the savings per event, when comparing Eggertopia data to the base rate forecast.
You will then calculate the incremental savings per event if you compare use of Eggertopia data to use of your current model developed in Part 1.
Question: What is the AUC of the Eggertopia Scores on the Training Set? Give your answer to two digits to the right of the decimal point.
- .83
- .85
- .88
- .95
2. What is the optimum threshold on the training set to minimize the average cost per test?
- .15
- .1
- .25
- .2
3. What is the average cost-per-event at the Training Set optimum threshold?
- $640
- $600
- $500
- $540
4. What is the AUC of the Eggertopia scores on the Test Set?
- .85
- .88
- .80
- .75
5. Using the same threshold as used on the training set, what is the cost per event of the Eggertopia scores on the Test Set? Round to the nearest dollar.
- $838
- $803
- $833
- $823
6. If the bank did not have your model, or any other way of forecasting default, what is the maximum (break-even) price per event that the bank could theoretically pay for Eggertopia scores? In other words, what are Eggertopia’s scores’ absolute savings-per-event?
Hint: Calculate the difference between the cost-per-event at a 25% default rate, and the cost-per-event using Eggertopia scores
- $423
- $425
- $412
- $418
7. What is the True Positive rate of the forecasting model using Eggertopia Scores?
- .70
- .72
- .76
- .74
8. What is the Positive Predictive Value (PPV) of the forecasting model using Eggertopia scores?
Hint: To calculate the PPV, divide the portion of True Positives by the total number of Positive Classifications. Review confusion matrix definitions and letter designations on the Information Gain Spreadsheet, [PPV is defined at Cell G41], obtain True Positive and False Positive Rates from the AUC Calculator Spreadsheet, and use algebra to solve.
Information Gain Calculator.xlsx
- .52
- .50
- .48
- .54
9. Incremental Financial Value of Eggertopia Scores
You calculated a cost per event for your own predictive model on Test Set data to answer Quiz 1 – Part 1, Question 6.
Question: Assuming that the performance of the Eggertopia model and your model both remain stable on any future data (a big assumption), what is the maximum, or break-even, price that the bank could pay per score for Eggertopia, given that it already has your model and data?
no |
Mastering Data Analysis in Excel Quiz Answer
Part 3: Comparing the Information Gain of Alternative Data and Models
1. Comparing the Information Gain of Eggertopia Scores and Your Model
Both the Eggertopia Scores and your binary classification model can be thought of as tools to reduce uncertainty about future default outcomes of credit card applicants.
Your own model, developed in Part 1, identifies dependencies between, on the one hand, the six types of input data collected by the bank, and on the other hand, the binary outcome default/no default.
If we assume that the dependencies identified by Eggertopia Scores and by your model on the Test Set are stable and representative of all future data (a big assumption) we can draw some further conclusions about how much information gain, or reduction in uncertainty, is provided by each.
Definitions are given in the Information Gain Calculator Spreadsheet, provided below.
Information Gain Calculator.xlsx
Question: On your model’s Test Set results, what is the conditional entropy of default, given your test classifications?
Hint: you need your model’s true positive rate from Part 1, Question 12, and “test incidence” [proportion of events your model classifies as default] from Part 1, question 13. Use the condition incidence of 25% and your model’s True Positive rate to calculate the portion of TPs. Then you have the inputs needed to use the Information Gain Calculator Spreadsheet.
no |
2. Recall that the entropy of the original base rate, minus the conditional entropy of default given your test classification, equals the Mutual Information between default and the test.
I(X;Y) = H(X) – H(X|Y).
The population of potential credit card customers consists of 25% future defaulters. The base rate incidence of default (.25, .75) has an uncertainty, or entropy, of H(.25, .75) = .25*log2(4) + .75*log2(1.333) = .8113 bits.
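The entropy and mutual-information formulas above can be sketched in Python as follows. The function names are ours, and we assume the Information Gain Calculator Spreadsheet uses these same base-2 definitions; the inputs match the three values described in Part 1.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def mutual_information(condition_incidence, p_true_positive, test_incidence):
    """I(X;Y) = H(X) - H(X|Y) for a binary outcome X and binary classification Y."""
    tp = p_true_positive
    fp = test_incidence - tp
    fn = condition_incidence - tp
    tn = 1 - test_incidence - fn

    h_given_positive = entropy([tp / test_incidence, fp / test_incidence])
    h_given_negative = entropy([fn / (1 - test_incidence), tn / (1 - test_incidence)])
    conditional = test_incidence * h_given_positive + (1 - test_incidence) * h_given_negative

    return entropy([condition_incidence, 1 - condition_incidence]) - conditional


base_entropy = entropy([0.25, 0.75])
print(round(base_entropy, 4))
```

Dividing the mutual information by `base_entropy` gives the Percentage Information Gain, and dividing a savings-per-event figure by the mutual information gives dollars saved per bit.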
Question: On your test set results, what is the Mutual Information, or information Gain, in average bits per event?
no |
3. Recall that Percentage Information Gain (P.I.G.) is the ratio of I(X;Y)/H(X).
Question: on your Test Set results, what is the Percentage Information Gain (P.I.G.) of your model?
no |
4. Since you have, for your model on the Test Set, a savings-per-event, and a bits-per-event (Mutual Information) you can calculate a savings-per-bit. This is a powerful concept, because it places a financial value directly on the information content of a model (or additional data source, like the Eggertopia scores).
Question: How many dollars does the bank save, for every bit of information gain achieved by your model?
no |
5. Information Gain of Eggertopia Scores over the Base Rate
For questions in this section, assume your model and the data it uses are not available – the bank’s choice is between Eggertopia scores and the base rate.
Question: What is the Mutual Information of the Eggertopia Scores?
In other words, on the Test Set, What is the information gain, in average bits per event, over the base rate of (.25, .75) offered by the Eggertopia Scores?
- .1305 bits per event
- .1255 bits per event
- .1243 bits per event
- .1205 bits per event
6. On the test set, what is the Eggertopia scores’ Percentage Information Gain (PIG)?
- 14.85%
- 15.35%
- 15.25%
- 13.95%
7. If Eggertopia data were free, and your model was unavailable, what would the dollar savings per bit of information extracted be?
Dollar savings are $412 rounded to the nearest dollar- from quiz 2, question 6
- Value would be $427 per bit.
- Value would be $3,427 per bit.
- Value would be $3,627 per bit.
8. Incremental Information Gain of Eggertopia Scores Compared to Your Model and Available Data (any answer scores)
(For this section, assume your Model and the Data it uses are available).
Question: What is the incremental information gain of the Eggertopia scores, over your model from Part 1, in average bits per event, if any?
no |
9. What is the maximum (break-even) price the bank should pay for Eggertopia scores, per score, if your model from Part 1 and data are already available?
no |
10. At the above maximum (break-even) price per score, what would be the value per bit of incremental information gained from the Eggertopia scores? Give your answer in $/bit.
no |
Mastering Data Analysis in Excel Quiz Answer
Part 4: Modeling Profitability Instead of Default
1. Modeling Profitability Instead of Default
Modeling Profitability Level as a Continuous Output (Instead of Binary Classification Default/No Default)
Introduction
Both your own model and the forecast based on Eggertopia scores are binary classifications: they forecast one of just two outcomes: “Default” or “No Default.” Your boss is interested in the idea that it might be preferable instead to model and forecast profits and losses as continuous values, using a multivariate linear regression model on the same six input variables. This idea has arisen because the bank has been reviewing individual profit and loss numbers for each customer over the three-year period and has made an interesting discovery: some defaulting customers carried so much debt for so long, and paid so much interest on it, that they were profitable for the bank even though they defaulted! Many customers who seem to have risky spending behaviors are also among the most profitable for a lending business. And, at the opposite extreme, customers who always paid off their cards in full each month never defaulted but were not very profitable: the bank barely broke even, or even lost money, on its “safest” borrowers.
Your boss asks you to forecast each applicant’s expected profitability, in dollars, before deciding whether or not to issue them a credit card. He wants to know how reliable this type of forecast would be: what is the range above and below the point estimate that will be correct 90% of the time?
Although it might be possible to combine the six inputs in other ways, in the interests of time and focusing on the key learning objectives, we will use only a simple linear combination of the six input variables for Part 4 of this Project. (You should not include the Eggertopia Scores as an input variable).
Question 1 is about the coefficients or “betas” used to combine the standardized inputs to get the best-fit-line on standardized outputs on the Training Set. We then use those fixed betas to measure the observed residual error of the model on the Test Set.
Questions 2 through 6 concern the forecasts on the Test Set.
Questions 7 through 11 look at the Training Set results so that they can be compared (for possible over-fitting) against the Test Set Results.
Questions 12 through 14 are about the uncertainty that remains in a new individual forecast of profitability.
Use the Excel “LINEST” function on the six inputs and the profitability output for the 200 Training Set applicants to calculate the coefficients (the “betas”) that result in the best-fit line.
Question: Do you feel prepared to take this quiz?
- Yes
- No
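The best-fit betas described in the introduction can also be sketched outside Excel. The snippet below is a minimal illustration, not the course's spreadsheet: it standardizes each column and solves the least-squares normal equations, which is the same kind of fit LINEST performs. The helper names (`standardize`, `solve`, `fit_betas`) and the toy data are hypothetical, and using the population standard deviation is an assumption. Note that LINEST itself returns its coefficients in reverse column order.

```python
from statistics import mean, pstdev


def standardize(column):
    """Convert raw values to z-scores (assumes population std dev)."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_betas(input_columns, output_column):
    """Least-squares betas of standardized inputs against standardized output."""
    xs = [standardize(col) for col in input_columns]
    y = standardize(output_column)
    k = len(xs)
    # Normal equations (X^T X) beta = X^T y; no intercept term is needed
    # because every standardized column has mean zero.
    xtx = [[sum(p * q for p, q in zip(xs[i], xs[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(xs[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Toy data (hypothetical): the output depends only on the first input.
toy_inputs = [[1, 2, 3, 4, 5], [5, 3, 8, 1, 4]]
toy_output = [2, 4, 6, 8, 10]
betas = fit_betas(toy_inputs, toy_output)
print(betas)
```

With the real workbook, the six Training Set input columns and the profitability column would take the place of the toy lists.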
2. Question: What are your values for each “beta” on the Training Set?
Age
Years at current employer
Years at current address
Income over the past year
Current credit card debt
Current automobile debt
- .01, .19, -.07, .64, -.06, 0
- 01, -.19, -.07, -.64, -.06, 0
- .01, .19, .07, .64, .06, 0
3. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet.
Question: What is the root-mean-square residual (the standard deviation of model error) on Standardized output for the Test Set?
- .5835
- .8109
- 0.6750
- .6875
- .3250
4. For this question, use the Linear Regression Forecasting Explanation and Spreadsheet.
Question: What is the observed correlation R on the Test Set?
- 0.7378
- .8095
- .7590
- .7332
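Questions 3 and 4 both compare the model's standardized forecasts with the actual standardized outputs. As a rough sketch of the two metrics (the function names and toy numbers are hypothetical, not taken from the course spreadsheet), the root-mean-square residual and the correlation R can be computed like this:

```python
from math import sqrt


def rms_residual(actual, predicted):
    """Root-mean-square of (actual - predicted): the std dev of model error."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def correlation(actual, predicted):
    """Pearson correlation R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


# Toy standardized values (hypothetical), not the Test Set data.
actual = [-1.0, 0.0, 1.0]
predicted = [-0.5, 0.0, 0.5]
print(rms_residual(actual, predicted), correlation(actual, predicted))
```

On the data used to fit the line, with standardized outputs, these two numbers are tied together by RMS residual = sqrt(1 - R^2); on a held-out Test Set that relationship only holds approximately.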
5. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet.
Question: What is the Standard deviation of model error, in Dollars, for the Test Set?
- $3,996.81
- $3,411.80
- $3,885.14
- $3,379.36
6. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet:
Question: What is the 90% confidence interval, in dollars, for the Test Set?
- $6,390.49 above the point estimate, and $6,390.49 below the point estimate
- $5,611.91 above the point estimate, and $5,611.91 below the point estimate
- $6,574.17 above the point estimate, and $6,574.17 below the point estimate
- $5,558.55 above the point estimate, and $5,558.55 below the point estimate
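Questions 5 and 6 are linked: a standardized error becomes a dollar error when multiplied by the standard deviation of profitability, and under the Gaussian assumption a symmetric interval is the z-value for the chosen confidence times that dollar error. The sketch below assumes this standard approach; the function names are hypothetical, and the two input figures are the ones the quiz itself quotes later (.6750 and $5,755.91).

```python
from statistics import NormalDist


def error_in_dollars(std_error_standardized, output_std_dollars):
    """Rescale a standardized model error back to dollars."""
    return std_error_standardized * output_std_dollars


def interval_half_width(std_error_dollars, confidence=0.90):
    """Distance above and below the point estimate for a two-sided interval."""
    # A 90% two-sided interval leaves 5% in each tail, so use the 95th percentile.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * std_error_dollars


sigma_dollars = error_in_dollars(0.6750, 5755.91)
half_width = interval_half_width(sigma_dollars, confidence=0.90)
print(round(sigma_dollars, 2), round(half_width, 2))
```

Small differences from the spreadsheet are to be expected, since the quoted inputs are rounded.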
7. What is the Percentage Information Gain (P.I.G.) on the Test Set?
- 27.7%
- 18.9%
- 26.4%
- 37.2%
8. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet:
Question: What is the Correlation, R, of your model on the Training Set?
- .7505
- .7805
- .8095
9. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet:
You need to quantify the uncertainty in a regression model forecast of applicants’ future profitability. Assume that both the forecast profits and the errors have a Gaussian distribution. You will calculate the standard deviation of model error on standardized data, the standard deviation in dollars of the model error, and the 90% confidence interval for profitability estimates.
Question: What is the standard deviation of your model error on the standardized Training Set output?
- .587
- .487
- -.487
- -.587
10. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet.
Question: What is the standard deviation of model error in dollars on the Training Set?
**This may seem similar to question 5, but Q5 refers to the Test Set.
- $3,379.36
- $4,379.36
- $5,500.87
- $4,312.91
11. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet.
Question: What is the 90% confidence interval, in dollars, on the Training Set?
**This may seem similar to question 6, but Q6 refers to the Test Set.
- $5,558.55
- $6,211.18
- $5,328.93
- $7,128.55
12. For this question, use the Linear Regression Forecasting explanation and Excel spreadsheet.
Question: What is the Percentage Information Gain (P.I.G.) on the Training Set?
**This may seem similar to question 7, but Q7 refers to the Test Set.
- 36.5%
- 37.5%
- 41.4%
- 32.4%
13. Questions 13 through 15 use the same example applicant.
The following data are known about the sample applicant:
Age: 42.00
Years at Employer: 12.44
Years at Address: 0.9
Income: $121,400
CC debt: -34,228
Auto debt: -23,411
To convert above inputs to standardized form, locate the Training Set Spreadsheet (first bottom tab of workbook) in the Data for Final Project Workbook.
Data_for_Final_Project.xlsx
Use the input means [Cells C207:H207] and standard deviations [Cells C209:H209].
Use the Training Set profitability mean [$1,905.51] and standard deviation [$5755.91] from the Profit and Loss (last bottom tab) Spreadsheet.
Use the Test Set standard deviation of error on standardized outputs of .6750
Question: What is the point estimate of profitability, in dollars?
- $10,683.61
- $11,109.61
- $8,451.61
- -$10,683.61
14. The following data are known about the sample applicant:
Age: 42.00
Years at Employer: 12.44
Years at Address: 0.9
Income: $121,400
CC debt: -34,228
Auto debt: -23,411
To convert above inputs to standardized form, locate the Training Set Spreadsheet (first bottom tab) in the Data for Final Project Workbook.
Use those means [Cells C207:H207] and standard deviations [Cells C209:H209].
Use the Training Set profitability mean [$1,905.51] and standard deviation [$5755.91] from the Profit and Loss (last tab on bottom) Spreadsheet
Use the Test Set standard deviation of error on standardized outputs of .6750
Question: With 50% confidence, what is the range of profitability?
- Range from $13,304.16 to $8,063.06.
- Range from $12,962.61 to $10,683.61
- Range from $11,823.28 to $9,543.94
- Range from $10,683.61 to – $2,278.99
15. The following data are known about the sample applicant:
Age: 42.00
Years at Employer: 12.44
Years at Address: 0.9
Income: $121,400
CC debt: -34,228
Auto debt: -23,411
To convert above inputs to standardized form, locate the Training Set Spreadsheet (bottom tab) in the Data for Final Project Workbook.
Use those means [Cells C207:H207] and standard deviations [Cells C209:H209].
Use the Training Set profitability mean [$1,905.51] and standard deviation [$5755.91] from the Profit and Loss (bottom tab) Spreadsheet
Use the Test Set standard deviation of error on standardized outputs of .6750 .
Question: With 99% confidence, what is the range of profitability?
- Range from $10,683.61 to -$8,704.31
- Range from $19,388.27 to 10,683.61.
- Range from $16,388.27 to -$7,704.31
- Range from $20,691.32 to $675.90.
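The arithmetic behind questions 13 through 15 can be sketched as follows: standardize each input with the Training Set mean and standard deviation, combine the results with the betas, convert the standardized forecast back to dollars, and then widen it into a range for the chosen confidence level. This is an illustrative outline only. The function names are hypothetical, and the single-input example uses placeholder numbers, because the workbook cells C207:H207 and C209:H209 are not reproduced here.

```python
from statistics import NormalDist


def point_estimate(raw_inputs, input_means, input_stds, betas,
                   profit_mean, profit_std):
    """Forecast profitability in dollars from raw (unstandardized) inputs."""
    z_inputs = [(x - m) / s
                for x, m, s in zip(raw_inputs, input_means, input_stds)]
    z_output = sum(b * z for b, z in zip(betas, z_inputs))
    return profit_mean + z_output * profit_std


def profit_range(estimate, std_error_standardized, profit_std, confidence):
    """Symmetric (low, high) range around the estimate, assuming Gaussian error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * std_error_standardized * profit_std
    return estimate - half_width, estimate + half_width


# Placeholder single-input example (hypothetical values, not the workbook's).
estimate = point_estimate([12.0], [10.0], [2.0], [0.5], 100.0, 50.0)
low, high = profit_range(estimate, 0.6750, 50.0, confidence=0.50)
print(estimate, low, high)
```

A higher confidence level gives a larger z-value, so the 99% range is always wider than the 50% range around the same point estimate.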
16. Comparing Test Set and Training Set Performance
Question: Between the Training Set and the Test Set, the dollar value of the standard deviation of model error…
- Increased by more than 50%, which leads to the conclusion of model over-fitting.
- Increased by more than 25%, which suggests possible model over-fitting.
- Decreased by about 15%, which suggests a very strong model on Test Set data.
- Increased by less than 20%, which suggests minimal model over-fitting.
Peer-graded Assignment: Part 5: Modeling Credit Card Default Risk and Customer Profitability
Project Title *
Give your project a descriptive title
Modeling Credit Card Default Risk and Customer Profitability |
What is your predictive model?
a. Describe the arithmetic clearly so that another learner could implement your model on new standardized input data if they wished.
b. Give an example of the score you would assign the following applicant, whether they would be approved or rejected for a credit card and why.
a) The first thing we should do is examine the correlation of the variables and identify which are the most important in the model. Then we should determine the parameters or coefficients that accompany the variables of the model, using the linear regression technique in Excel covered in the course. The most relevant parameter values are: Years at current employer: – 0.19 Income over the past year: – 0.08 Current credit card debt: – 0.19 Current automobile debt: – 0.07 Then, with these coefficients and considering the correlation, we build our model, which is: SCORE = 0.19 * Years at current employer – 0.08 * Income over the past year – 0.19 * Current credit card debt 0.07 * Current automobile debt b) By optimizing AUC, we obtained the threshold for the minimum cost per event as 0.25. A score below – 0.04, for example, will be classified as a negative test, which translates to a financially profitable person who could be approved for a credit card. |
Give an example of the score you would assign the following applicant, whether they would be approved or rejected for a credit card and why.
b) By optimizing AUC, we obtained the threshold for the minimum cost per event as 0.25. A score below – 0.04, for example, will be classified as a negative test, which translates to a financially profitable person who could be approved for a credit card. |
What would the bank’s average profit per applicant be (net profits divided by 200) when using your predictive model on the Training Set?
The average profit per applicant will be $794 on the Training Set. |
What is the incremental financial value per applicant of your model over no model on the Training Set?
The incremental financial value per applicant of your model over no model on the Training Set is $654.41 |
Evaluate your model on the Test Set data. How confident are you that your model does not over-fit the Training Set data? The only basis to evaluate over-fitting is to give the same metrics on the Test Set and Training Set, and compare them.
The model performs very well on both data sets: the correlation is well applied and the parametric coefficients found are correct, which implies that the AUC is high and does not change considerably, while the estimated costs per event are maintained. |
Evaluate your model on the Test Set data. How confident are you that your model does not over-fit the Training Set data?
A. Choose between three broad degrees of confidence: “very” “somewhat” or “not at all.” (Note that “not at all” is still an acceptable answer if you give persuasive reasons for why you chose this answer).
B. Explain the evidence your degree of confidence is based upon. Your explanation should include the test set profits and training set profits per applicant.
How much confidence to have in the model must relate to the relationship between the profits-per-applicant on the Training Set and the Test Set
a) Very. b) Because the AUC on both data sets is high and stable, the efficiency of the model is clear. It also maintains a good estimated profit margin, because the costs per event do not change significantly. |
Conclusion
Hopefully, this article helps you find all the weekly, final assessment, and peer-graded assessment answers for Mastering Data Analysis in Excel on Coursera and pick up some valuable knowledge with less effort. If it helped you in any way, share it with your friends on social media so they also know about this training. You can also check out our other course answers. Stay with us: we will share many more free courses and their exam/quiz solutions, and follow our Techno-RJ Blog for more updates.