In this post, we will explore how to perform various join operations in R with dplyr, using two data sets, 'flights' and 'weather'. The 'flights' data set contains information about flights that departed from New York City in 2013, while the 'weather' data set provides hourly weather data for each NYC airport.
1. Return all rows from 'flights' and all columns from 'flights' and 'weather'. Output by the columns year and month. Return only the first 10 rows of the output and save it in the variable quest_1.
2. Return all rows from 'weather' and all columns from 'weather' and 'flights'. Output by the columns year, month, day, and hour. Return only the first 10 rows of the output and save it in the variable quest_2.
3. Return only the rows in which the flights have matching keys in the 'weather' data set. Output by the columns year, month, day, and hour. Return only the first 10 rows of the output and save it in the variable quest_3.
4. Combine two data sets, keeping rows and columns that appear in both 'flights' and 'weather'. Output by the columns year, month, and day. Return only the first 10 rows of the output and save it in the variable quest_4.
5. Return only columns from 'flights'. Output by the columns year, month, day, and hour. Return only the first 10 rows of the output and save it in the variable quest_5. (Include the necessary libraries and read the data from the data set.)
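Before turning to the full data, it may help to see how the dplyr join verbs behave on two tiny tables. The sketch below is only an illustration: the data frames `fl` and `wx` are made-up stand-ins, not the real data sets, and it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy tables standing in for 'flights' and 'weather'
fl <- data.frame(day = c(1, 2, 3), flight = c("A1", "B2", "C3"))
wx <- data.frame(day = c(2, 3, 4), temp = c(40, 45, 50))

left  <- left_join(fl, wx, by = "day")   # all rows of fl; columns of both
right <- right_join(fl, wx, by = "day")  # all rows of wx; columns of both
inner <- inner_join(fl, wx, by = "day")  # only rows with a match in both
full  <- full_join(fl, wx, by = "day")   # all rows of both tables
semi  <- semi_join(fl, wx, by = "day")   # matching rows of fl; fl columns only

print(left)
print(semi)
```

In the left join, day 1 has no weather row, so its temp is NA; the semi join drops that row and never adds the temp column.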
Once you are in the Web IDE:
Open the prog.R file in JupyterLab and start coding by following the instructions in the notebook.
Once you are done with the solution, check it from the terminal.
Click File -> New -> Terminal and run the following commands:
>>> Rscript prog.R
>>> bash .score.sh
After running the test cases, click the submit button and click on Submit Test to end the assessment.
(1) The results of preliminary validation don't impact final scoring. In-depth scoring is done at a later stage.
(2) The Rough_Work.ipynb notebook can be used for rough work if necessary.
(3) If the terminal asks for a password, press Enter; after 3 incorrect attempts, the commands given above will run successfully.
Let's dive into the solutions for each of these tasks.
# Load necessary libraries (if not already loaded)
library(dplyr)
# Read 'flights' and 'weather' data sets
flights <- read.csv("flights.csv")
weather <- read.csv("weather.csv")
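# Task 1: Return all rows from 'flights' and all columns from 'flights' and 'weather' (a left join).
# Joined by the columns year and month. Return only the first 10 rows of the output.
# These keys are not unique, so the full join can be very large; because a left join keeps
# the row order of 'flights', trimming 'flights' first gives the same first 10 rows.
quest_1 <- flights %>%
  head(10) %>%
  left_join(weather, by = c("year", "month")) %>%
  head(10)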
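# Task 2: Return all rows from 'weather' and all columns from 'weather' and 'flights'.
# Joined by the columns year, month, day, and hour. Return only the first 10 rows of the output.
# A left join starting from 'weather' keeps every row of 'weather'.
quest_2 <- weather %>%
  left_join(flights, by = c("year", "month", "day", "hour")) %>%
  head(10)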
# Task 3: Return only the rows in which the flights have matching keys in the 'weather' data set.
# Output by the columns year, month, day, and hour. Return only the first 10 rows of the output.
quest_3 <- flights %>%
  inner_join(weather, by = c("year", "month", "day", "hour")) %>%
  select(year, month, day, hour) %>%
  head(10)
# Task 4: Combine two data sets, keeping rows and columns that appear in both 'flights' and 'weather'.
# Output by the columns year, month, and day. Return only the first 10 rows of the output.
quest_4 <- flights %>%
  inner_join(weather, by = c("year", "month", "day")) %>%
  select(year, month, day) %>%
  head(10)
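# Task 5: Return only columns from 'flights'.
# Joined by the columns year, month, day, and hour. Return only the first 10 rows of the output.
# A semi join keeps the rows of 'flights' that have a match in 'weather', without adding 'weather' columns.
quest_5 <- flights %>%
  semi_join(weather, by = c("year", "month", "day", "hour")) %>%
  head(10)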
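Several of the joins above use keys that do not uniquely identify a row in either table (for example, joining only on year and month). In that case every matching pair of rows is returned, so the result can be much larger than either input. The sketch below shows the effect on made-up data; the data frames `a` and `b` are hypothetical, and it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy tables: the key 'month' repeats in both
a <- data.frame(month = c(1, 1, 2), flight = c("A1", "B2", "C3"))
b <- data.frame(month = c(1, 1, 1, 2), temp = c(30, 32, 35, 40))

# How many rows share each key value in each table
print(count(a, month))
print(count(b, month))

# Recent dplyr versions warn about many-to-many matches; silenced here
joined <- suppressWarnings(inner_join(a, b, by = "month"))

# Month 1 contributes 2 * 3 rows and month 2 contributes 1 * 1 row
print(nrow(joined))
```

Counting rows per key with `count()` before joining is a cheap way to estimate how large the joined result will be.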
Once you've completed these tasks in your R script, run the script to produce the outputs and save them in the variables quest_1 to quest_5.
Don't forget to run the validation script as mentioned in the problem statement to check your solutions. This will ensure that your code meets the requirements of the tasks.
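Before running the validation script, a quick local sanity check can catch obvious slips such as too many rows or a missing column. The helper below, `check_quest`, is a hypothetical function written for this post. It is not what .score.sh does, and the script's actual checks are not shown here.

```r
# Hypothetical helper: checks that a result is a data frame with at most
# n_max rows and that it contains the expected columns.
check_quest <- function(df, expected_cols, n_max = 10) {
  is.data.frame(df) &&
    nrow(df) <= n_max &&
    all(expected_cols %in% names(df))
}

# Example on a made-up result; in prog.R one would pass quest_1, quest_2, ...
demo <- data.frame(year = 2013, month = 1:10, temp = 30:39)
print(check_quest(demo, c("year", "month")))
```

For instance, `check_quest(quest_3, c("year", "month", "day", "hour"))` would confirm that the key columns survived the join.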
Now you are well-equipped to perform join operations in R with confidence. Happy coding!