Hello peers, today we are sharing all the weekly assessment and quiz answers for the Getting and Cleaning Data course launched by Coursera, totally free of cost ✅. This is a certification course for every interested student.
In case you can't enroll in this course for free, you can apply for financial aid to take it at no cost.
Check out this article – “How to Apply for Financial Aid?”
About The Coursera
Coursera, one of the largest online learning platforms, offers thousands of courses to students. These courses come from various recognized universities, where industry experts and professors teach in a clear and understandable way.
Here you will find the Getting and Cleaning Data exam answers, marked in bold, which are given below.
These answers were updated recently and are 100% correct ✅, covering all weekly quizzes, assessments, and the final exam of the Getting and Cleaning Data free certification course from Coursera.
Use “Ctrl+F” to find the answer to any question. On mobile, tap the three dots in your browser and you will find a “Find” option there; use it to search for any question.
About Getting and Cleaning Data Course
The course will cover obtaining data from the web, from APIs, from databases, and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speeds up downstream data analysis tasks.
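As a quick illustration of what “tidy” means in practice, here is a minimal sketch using the tidyr package and made-up data (not course material): each variable ends up in its own column and each observation in its own row.

# Minimal tidy-data sketch with hypothetical data (not from the course)
# install.packages("tidyr")
library(tidyr)

untidy <- data.frame(
  student = c("A", "B"),
  quiz1   = c(90, 85),
  quiz2   = c(78, 92)
)

# Reshape so that "quiz" and "score" are each a single column
tidy <- pivot_longer(untidy, cols = c("quiz1", "quiz2"),
                     names_to = "quiz", values_to = "score")
tidy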
Course Apply Link – Getting and Cleaning Data
Getting and Cleaning Data Quiz Answers
Getting and Cleaning Data Quiz 1 Answers
Question 1
The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv
and load the data into R. The code book, describing the variable names is here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdf
How many housing units in this survey were worth more than $1,000,000?
# fread from a URL requires the curl package on macOS
# install.packages("curl")
library(data.table)

housing <- data.table::fread("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv")

# The VAL attribute says how much the property is worth; .N is the number of rows.
# VAL == 24 means "more than $1,000,000".
housing[VAL == 24, .N]

# Answer:
# 53
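If you prefer base R over data.table, the same count can be checked with read.csv(); this is just an alternative sketch, using the same method = "curl" download as the other answers.

# Alternative check in base R (same answer expected)
download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv",
              destfile = "housing.csv", method = "curl")
housingDF <- read.csv("housing.csv")

# VAL == 24 is the code-book level for "more than $1,000,000"
sum(housingDF$VAL == 24, na.rm = TRUE)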
Question 2

Use the data you loaded from Question 1. Consider the variable FES in the code book. Which of the “tidy data” principles does this variable violate?

Answer:

Tidy data has one variable per column.
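FES packs two pieces of information (family type and labor-force status) into a single code, which is why it breaks the “one variable per column” principle. A minimal sketch of splitting such a combined variable with tidyr::separate(), using hypothetical labels rather than the actual numeric FES codes:

library(tidyr)

# Hypothetical combined labels (the real FES variable is a numeric code)
fes_like <- data.frame(FES = c("MarriedCouple_BothInLaborForce",
                               "MarriedCouple_NeitherInLaborForce"))

# Split one combined column into two separate variables
separate(fes_like, FES, into = c("familyType", "laborForceStatus"), sep = "_")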
Question 3

Download the Excel spreadsheet on the Natural Gas Acquisition Program here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx
Read rows 18-23 and columns 7-15 into R and assign the result to a variable called:
dat
What is the value of:
sum(dat$Zip*dat$Ext,na.rm=T)
(original data source: http://catalog.data.gov/dataset/natural-gas-acquisition-program)
fileUrl <- "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx"
download.file(fileUrl,
              destfile = paste0(getwd(), "/getdata%2Fdata%2FDATA.gov_NGAP.xlsx"),
              method = "curl")

dat <- xlsx::read.xlsx(file = "getdata%2Fdata%2FDATA.gov_NGAP.xlsx",
                       sheetIndex = 1,
                       rowIndex = 18:23,
                       colIndex = 7:15)

sum(dat$Zip * dat$Ext, na.rm = TRUE)

# Answer:
# 36534720
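If the Java-dependent xlsx package is not available on your machine, the readxl package can read the same cells; this is an alternative sketch, not the graded answer, and it assumes (as the read.xlsx call above does) that the first row of the range supplies the column names.

# Alternative using readxl (no Java dependency)
# install.packages("readxl")
library(readxl)

download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx",
              destfile = "NGAP.xlsx", mode = "wb")

# range "G18:O23" corresponds to rows 18-23 and columns 7-15
dat2 <- read_excel("NGAP.xlsx", sheet = 1, range = "G18:O23")
sum(dat2$Zip * dat2$Ext, na.rm = TRUE)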
Question 4

Read the XML data on Baltimore restaurants from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml
How many restaurants have zipcode 21231?
Note: use http instead of https here; with https, XML::xmlTreeParse fails with the message "Error: XML content does not seem to be XML: 'https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml'".
# install.packages("XML") library("XML") fileURL<-"https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml" doc <- XML::xmlTreeParse(sub("s", "", fileURL), useInternal = TRUE) rootNode <- XML::xmlRoot(doc) zipcodes <- XML::xpathSApply(rootNode, "//zipcode", XML::xmlValue) xmlZipcodeDT <- data.table::data.table(zipcode = zipcodes) xmlZipcodeDT[zipcode == "21231", .N] # Answer: # 127
Question 5

The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv
Using the fread() command, load the data into an R object called
DT
Which of the following is the fastest way to calculate the average value of the variable pwgtp15 broken down by sex using the data.table package?
DT <- data.table::fread("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv")

# Answer (fastest):
system.time(DT[, mean(pwgtp15), by = SEX])
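To convince yourself that the data.table expression is the fastest choice, you can time it against common base-R ways of computing the same group means (assuming DT from above); exact timings vary by machine, so treat this as an illustrative sketch.

# Timing the data.table answer against base-R alternatives
system.time(DT[, mean(pwgtp15), by = SEX])
system.time(sapply(split(DT$pwgtp15, DT$SEX), mean))
system.time(tapply(DT$pwgtp15, DT$SEX, mean))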
Getting and Cleaning Data Quiz 3 Answers
Question 1
The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv
and load the data into R. The code book, describing the variable names is here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdf
Create a logical vector that identifies the households on greater than 10 acres who sold more than $10,000 worth of agriculture products. Assign that logical vector to the variable agricultureLogical. Apply the which() function like this to identify the rows of the data frame where the logical vector is TRUE:

which(agricultureLogical)
What are the first 3 values that result?
download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv",
              destfile = "ACS.csv",
              method = "curl")

# Read data into a data.frame
ACS <- read.csv("ACS.csv")

# From the code book: ACR == 3 means a lot of ten or more acres,
# and AGS == 6 means agricultural sales of $10,000 or more.
agricultureLogical <- ACS$ACR == 3 & ACS$AGS == 6
head(which(agricultureLogical), 3)

# Answer:
# 125 238 262
Question 2

Using the jpeg package, read the following picture of your instructor into R:
https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg
Use the parameter native=TRUE. What are the 30th and 80th quantiles of the resulting data?
# install.packages("jpeg")
library(jpeg)

# Download the file
download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg",
              destfile = "jeff.jpg",
              mode = "wb")

# Read the image
picture <- jpeg::readJPEG("jeff.jpg", native = TRUE)

# Sample quantiles corresponding to the given probabilities
quantile(picture, probs = c(0.3, 0.8))

# Answer:
#       30%       80%
# -15259150 -10575416
Question 3

Load the Gross Domestic Product data for the 190 ranked countries in this data set:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csv
Load the educational data from this data set:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csv
Match the data based on the country shortcode. How many of the IDs match? Sort the data frame in descending order by GDP rank. What is the 13th country in the resulting data frame?
Original data sources: http://data.worldbank.org/data-catalog/GDP-ranking-table http://data.worldbank.org/data-catalog/ed-stats
# install.packages("data.table) library("data.table") # Download data and read FGDP data into data.table FGDP <- data.table::fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csv' , skip=4 , nrows = 190 , select = c(1, 2, 4, 5) , col.names=c("CountryCode", "Rank", "Economy", "Total") ) # Download data and read FGDP data into data.table FEDSTATS_Country <- data.table::fread('https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csv' ) mergedDT <- merge(FGDP, FEDSTATS_Country, by = 'CountryCode') # How many of the IDs match? nrow(mergedDT) # Answer: # 189 # Sort the data frame in descending order by GDP rank (so United States is last). # What is the 13th country in the resulting data frame? mergedDT[order(-Rank)][13,.(Economy)] # Answer: # Economy # 1: St. Kitts and Nevis
Question 4

What is the average GDP ranking for the “High income: OECD” and “High income: nonOECD” groups?
# "High income: OECD" mergedDT[`Income Group` == "High income: OECD" , lapply(.SD, mean) , .SDcols = c("Rank") , by = "Income Group"] # Answer: # # Income Group Rank # 1: High income: OECD 32.96667 # "High income: nonOECD" mergedDT[`Income Group` == "High income: nonOECD" , lapply(.SD, mean) , .SDcols = c("Rank") , by = "Income Group"] # Answer # Income Group Rank # 1: High income: nonOECD 91.91304
Question 5

Cut the GDP ranking into 5 separate quantile groups. Make a table versus Income.Group. How many countries are Lower middle income but among the 38 nations with the highest GDP?
# install.packages("dplyr")
library(dplyr)

# Quantile breaks for 5 groups
breaks <- quantile(mergedDT[, Rank], probs = seq(0, 1, 0.2), na.rm = TRUE)
mergedDT$quantileGDP <- cut(mergedDT[, Rank], breaks = breaks)

mergedDT[`Income Group` == "Lower middle income", .N,
         by = c("Income Group", "quantileGDP")]

# Answer
#           Income Group quantileGDP  N
# 1: Lower middle income (38.6,76.2] 13
# 2: Lower middle income   (114,152]  9
# 3: Lower middle income   (152,190] 16
# 4: Lower middle income  (76.2,114] 11
# 5: Lower middle income    (1,38.6]  5
#
# The first quantile group corresponds to the 38 highest-GDP nations, so the answer is 5.
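Another common way to build the same kind of quantile groups in one step is Hmisc::cut2(); an alternative sketch, assuming mergedDT from above:

# Alternative using Hmisc::cut2 to make five quantile groups
# install.packages("Hmisc")
library(Hmisc)
library(data.table)

mergedDT[, rankGroup := cut2(Rank, g = 5)]
table(mergedDT$rankGroup, mergedDT$`Income Group`)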
More About This Course
Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases, and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speeds up downstream data analysis tasks. The course will also cover the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.
This course is part of multiple programs
This course can be applied to multiple Specializations or Professional Certificate programs, and completing it will count toward your learning in those programs.
WHAT YOU WILL LEARN
- Understand common data storage systems
- Apply data cleaning basics to make data “tidy”
- Use R for text and date manipulation
- Obtain usable data from the web, APIs, and databases
SKILLS YOU WILL GAIN
- Data Manipulation
- Regular Expression (REGEX) (a short sketch follows this list)
- R Programming
- Data Cleansing
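As a small taste of the text, date, and regular-expression skills listed above, here is a minimal, self-contained sketch using made-up strings (not course data):

# Illustrative text and date manipulation in base R
messy <- c("Location_1 2012-06-05", "location_2 2013-11-30")

tolower(messy)                              # normalise case
sub("_", "", messy)                         # drop the first underscore
grepl("^Location", messy)                   # regular-expression match
dates <- as.Date(sub("^\\S+ ", "", messy))  # keep only the date part
format(dates, "%B %d, %Y")                  # reformat the dates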
Conclusion
Hopefully, this article helped you find all the weekly, final assessment, and peer-graded assignment answers for the Getting and Cleaning Data quizzes on Coursera, and let you pick up some premium knowledge with less effort. If this article helped you in any way, please share it with your friends on social media and let them know about this training. You can also check out our other course answers. Stay with us; we will share many more free courses and their exam/quiz solutions, and follow our Techno-RJ Blog for more updates.