Getting and Cleaning Data Coursera Quiz Answers 2022 | All Weeks Assessment Answers [💯Correct Answer]

Hello Peers, Today we are going to share all week’s assessment and quiz answers of the Getting and Cleaning Data course launched by Coursera totally free of cost✅✅✅. This is a certification course for every interested student.

In case you didn’t find this course for free, then you can apply for financial ads to get this course for totally free.

Check out this article “How to Apply for Financial Ads?”

About The Coursera

Coursera, India’s biggest learning platform launched millions of free courses for students daily. These courses are from various recognized universities, where industry experts and professors teach in a very well manner and in a more understandable way.


Here, you will find Getting and Cleaning Data Exam Answers in Bold Color which are given below.

These answers are updated recently and are 100% correct✅ answers of all week, assessment, and final exam answers of Getting and Cleaning Data from Coursera Free Certification Course.

Use “Ctrl+F” To Find Any Questions Answer. & For Mobile User, You Just Need To Click On Three dots In Your Browser & You Will Get A “Find” Option There. Use These Option to Get Any Random Questions Answer.

About Getting and Cleaning Data Course

The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks.

Course Apply Link – Getting and Cleaning Data

Getting and Cleaning Data Quiz Answers

Getting and Cleaning Data Quiz 1 Answers

Question 1

The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv

and load the data into R. The code book, describing the variable names is here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdf

How many housing units in this survey were worth more than $1,000,000?

# fread url requires curl package on mac 
# install.packages("curl")

library(data.table)
housing <- data.table::fread("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv")

# VAL attribute says how much property is worth, .N is the number of rows
# VAL == 24 means more than $1,000,000
housing[VAL == 24, .N]

# Answer: 
# 53

Question 2

Use the data you loaded from Question 1. Consider the variable FES in the code book. Which of the “tidy data” principles does this variable violate?

Answer

Tidy data one variable per column

Question 3

Download the Excel spreadsheet on Natural Gas Aquisition Program here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx

Read rows 18-23 and columns 7-15 into R and assign the result to a variable called:

dat

What is the value of:

sum(dat$Zip*dat$Ext,na.rm=T)

(original data source: http://catalog.data.gov/dataset/natural-gas-acquisition-program)

fileUrl <- "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx"
download.file(fileUrl, destfile = paste0(getwd(), '/getdata%2Fdata%2FDATA.gov_NGAP.xlsx'), method = "curl")

dat <- xlsx::read.xlsx(file = "getdata%2Fdata%2FDATA.gov_NGAP.xlsx", sheetIndex = 1, rowIndex = 18:23, colIndex = 7:15)
sum(dat$Zip*dat$Ext,na.rm=T)

# Answer:
# 36534720

Question 4

Read the XML data on Baltimore restaurants from here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml

How many restaurants have zipcode 21231?

Use http instead of https, which caused the message Error: XML content does not seem to be XML: ‘https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml‘.

# install.packages("XML")
library("XML")
fileURL<-"https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml"
doc <- XML::xmlTreeParse(sub("s", "", fileURL), useInternal = TRUE)
rootNode <- XML::xmlRoot(doc)

zipcodes <- XML::xpathSApply(rootNode, "//zipcode", XML::xmlValue)
xmlZipcodeDT <- data.table::data.table(zipcode = zipcodes)
xmlZipcodeDT[zipcode == "21231", .N]

# Answer: 
# 127

Question 5

The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv

using the fread() command load the data into an R object

DT

Which of the following is the fastest way to calculate the average value of the variable pwgtp15 broken down by sex using the data.table package?

DT <- data.table::fread("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv")

# Answer (fastest):
system.time(DT[,mean(pwgtp15),by=SEX])

Getting and Cleaning Data Quiz 3

Question 1

The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv

and load the data into R. The code book, describing the variable names is here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdf

Create a logical vector that identifies the households on greater than 10 acres who sold more than $10,000 worth of agriculture products. Assign that logical vector to the variable agricultureLogical. Apply the which() function like this to identify the rows of the data frame where the logical vector is TRUE. which(agricultureLogical)

What are the first 3 values that result?

download.file('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv'
              , 'ACS.csv'
              , method='curl' )

# Read data into data.frame
ACS <- read.csv('ACS.csv')

agricultureLogical <- ACS$ACR == 3 & ACS$AGS == 6
head(which(agricultureLogical), 3)

# Answer: 
# 125 238 262

Question 2

Using the jpeg package read in the following picture of your instructor into R

https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg

Use the parameter native=TRUE. What are the 30th and 80th quantiles of the resulting data?

# install.packages('jpeg')
library(jpeg)

# Download the file
download.file('https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg'
              , 'jeff.jpg'
              , mode='wb' )

# Read the image
picture <- jpeg::readJPEG('jeff.jpg'
                          , native=TRUE)

# Get Sample Quantiles corressponding to given prob
quantile(picture, probs = c(0.3, 0.8) )

# Answer: 
#       30%       80% 
# -15259150 -10575416 

Question 3

Load the Gross Domestic Product data for the 190 ranked countries in this data set:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csv

Load the educational data from this data set:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csv

Match the data based on the country shortcode. How many of the IDs match? Sort the data frame in descending order by GDP rank. What is the 13th country in the resulting data frame?

Original data sources: http://data.worldbank.org/data-catalog/GDP-ranking-table http://data.worldbank.org/data-catalog/ed-stats

# install.packages("data.table)
library("data.table")


# Download data and read FGDP data into data.table
FGDP <- data.table::fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csv'
                          , skip=4
                          , nrows = 190
                          , select = c(1, 2, 4, 5)
                          , col.names=c("CountryCode", "Rank", "Economy", "Total")
                          )

# Download data and read FGDP data into data.table
FEDSTATS_Country <- data.table::fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csv'
                                      )
                                      
mergedDT <- merge(FGDP, FEDSTATS_Country, by = 'CountryCode')

# How many of the IDs match?
nrow(mergedDT)

# Answer: 
# 189

# Sort the data frame in descending order by GDP rank (so United States is last). 
# What is the 13th country in the resulting data frame?
mergedDT[order(-Rank)][13,.(Economy)]

# Answer: 

#                Economy
# 1: St. Kitts and Nevis

Question 4

What is the average GDP ranking for the “High income: OECD” and “High income: nonOECD” group?

# "High income: OECD" 
mergedDT[`Income Group` == "High income: OECD"
         , lapply(.SD, mean)
         , .SDcols = c("Rank")
         , by = "Income Group"]

# Answer:
#
#         Income Group     Rank
# 1: High income: OECD 32.96667

# "High income: nonOECD"
mergedDT[`Income Group` == "High income: nonOECD"
         , lapply(.SD, mean)
         , .SDcols = c("Rank")
         , by = "Income Group"]

# Answer
#            Income Group     Rank
# 1: High income: nonOECD 91.91304

Question 5

Cut the GDP ranking into 5 separate quantile groups. Make a table versus Income.Group. How many countries are Lower middle income but among the 38 nations with highest GDP?

# install.packages('dplyr')
library('dplyr')

breaks <- quantile(mergedDT[, Rank], probs = seq(0, 1, 0.2), na.rm = TRUE)
mergedDT$quantileGDP <- cut(mergedDT[, Rank], breaks = breaks)
mergedDT[`Income Group` == "Lower middle income", .N, by = c("Income Group", "quantileGDP")]

# Answer 
#           Income Group quantileGDP  N
# 1: Lower middle income (38.6,76.2] 13
# 2: Lower middle income   (114,152]  9
# 3: Lower middle income   (152,190] 16
# 4: Lower middle income  (76.2,114] 11
# 5: Lower middle income    (1,38.6]  5

Getting and Cleaning Data Quiz 4

Question 1

The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:

https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv

and load the data into R. The code book, describing the variable names is here:https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdfCreate a logical vector that identifies the households on greater than 10 acres who sold more than $10,000 worth of agriculture products. Assign that logical vector to the variable agricultureLogical. Apply the which() function like this to identify the rows of the data frame where the logical vector is TRUE. which(agricultureLogical)What are the first 3 values that result?

download.file('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv'               , 'ACS.csv'               , method='curl' ) # Read data into data.frame ACS <- read.csv('ACS.csv') agricultureLogical <- ACS$ACR == 3 & ACS$AGS == 6 head(which(agricultureLogical), 3) # Answer:  # 125 238 262

Question 2

Using the jpeg package read in the following picture of your instructor into Rhttps://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpgUse the parameter native=TRUE. What are the 30th and 80th quantiles of the resulting data?

# install.packages('jpeg') library(jpeg) # Download the file download.file('https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg'               , 'jeff.jpg'               , mode='wb' ) # Read the image picture <- jpeg::readJPEG('jeff.jpg'                           , native=TRUE) # Get Sample Quantiles corressponding to given prob quantile(picture, probs = c(0.3, 0.8) ) # Answer:  #       30%       80%  # -15259150 -10575416 

Question 3

Load the Gross Domestic Product data for the 190 ranked countries in this data set:https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csvLoad the educational data from this data set:https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csvMatch the data based on the country shortcode. How many of the IDs match? Sort the data frame in descending order by GDP rank. What is the 13th country in the resulting data frame?Original data sources: http://data.worldbank.org/data-catalog/GDP-ranking-table http://data.worldbank.org/data-catalog/ed-stats

# install.packages("data.table) library("data.table") # Download data and read FGDP data into data.table FGDP <- data.table::fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csv'                           , skip=4                           , nrows = 190                           , select = c(1, 2, 4, 5)                           , col.names=c("CountryCode", "Rank", "Economy", "Total")                           ) # Download data and read FGDP data into data.table FEDSTATS_Country <- data.table::fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csv'                                       )                                        mergedDT <- merge(FGDP, FEDSTATS_Country, by = 'CountryCode') # How many of the IDs match? nrow(mergedDT) # Answer:  # 189 # Sort the data frame in descending order by GDP rank (so United States is last).  # What is the 13th country in the resulting data frame? mergedDT[order(-Rank)][13,.(Economy)] # Answer:  #                Economy # 1: St. Kitts and Nevis

Question 4

What is the average GDP ranking for the “High income: OECD” and “High income: nonOECD” group?

# "High income: OECD"  mergedDT[`Income Group` == "High income: OECD"          , lapply(.SD, mean)          , .SDcols = c("Rank")          , by = "Income Group"] # Answer: # #         Income Group     Rank # 1: High income: OECD 32.96667 # "High income: nonOECD" mergedDT[`Income Group` == "High income: nonOECD"          , lapply(.SD, mean)          , .SDcols = c("Rank")          , by = "Income Group"] # Answer #            Income Group     Rank # 1: High income: nonOECD 91.91304

Question 5

Cut the GDP ranking into 5 separate quantile groups. Make a table versus Income.Group. How many countries are Lower middle income but among the 38 nations with highest GDP?

# install.packages('dplyr') library('dplyr') breaks <- quantile(mergedDT[, Rank], probs = seq(0, 1, 0.2), na.rm = TRUE) mergedDT$quantileGDP <- cut(mergedDT[, Rank], breaks = breaks) mergedDT[`Income Group` == "Lower middle income", .N, by = c("Income Group", "quantileGDP")] # Answer  #           Income Group quantileGDP  N # 1: Lower middle income (38.6,76.2] 13 # 2: Lower middle income   (114,152]  9 # 3: Lower middle income   (152,190] 16 # 4: Lower middle income  (76.2,114] 11 
# 5: Lower middle income (1,38.6] 5

More About This Course

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.

This course is part of multiple programs

This course can be applied to multiple Specializations or Professional Certificates programs. Completing this course will count towards your learning in any of the following programs:

WHAT YOU WILL LEARN

  • Understand common data storage systems
  • Apply data cleaning basics to make data “tidy”
  • Use R for text and date manipulation
  • Obtain usable data from the web, APIs, and databases

SKILLS YOU WILL GAIN

  • Data Manipulation
  • Regular Expression (REGEX)
  • R Programming
  • Data Cleansing

Conclusion

Hopefully, this article will be useful for you to find all the Week, final assessment, and Peer Graded Assessment Answers of Getting and Cleaning Data Quiz of Coursera and grab some premium knowledge with less effort. If this article really helped you in any way then make sure to share it with your friends on social media and let them also know about this amazing training. You can also check out our other course Answers. So, be with us guys we will share a lot more free courses and their exam/quiz solutions also, and follow our Techno-RJ Blog for more updates.

Leave a Comment

Ads Blocker Image Powered by Code Help Pro

Ads Blocker Detected!!!

We have detected that you are using extensions to block ads. Please support us by disabling these ads blocker.

Powered By
Best Wordpress Adblock Detecting Plugin | CHP Adblock