Data Handling in R
Delivery format
Remote
Duration
3 days
Price
2161 €
This three-day course is aimed at those who want to learn how to use R with Tidyverse packages to work with and handle data. Combined with our Introduction to Data Science course, it sets you up well to follow an R learning journey into Data Science, Machine Learning, and Artificial Intelligence.
During the programme you will be introduced to R and to the development environments and packages used for working with data, with a focus on Tidyverse packages such as dplyr, tidyr, stringr and ggplot2.
Along the way you will see how to clean and manipulate tabular data, apply simple statistical techniques and create engaging data visualisations.
Throughout the course you will take part in activities and discussions led by one of our Data Science technical specialists, and complete technical lab activities to practise the techniques you have learnt and develop ideas for further practice.
- To apply your knowledge of data practically, using R to handle data in roles that involve data analysis, data engineering, data science, machine learning and AI, and in data-related Ops roles.
- If you are in a software- or IT-related role where you work with R, this course will support you in learning how to work with data.
- To ensure you have the necessary prerequisite knowledge, when combined with Introduction to Data Science, should you wish to progress onto Data Science and Machine Learning with R (coming soon).
No prior experience with R is necessary, though it is assumed that you will be familiar with core data concepts such as simple table structures and data types. All the prerequisites you need are covered by our Data Fundamentals (coming soon) course.
Introduction to Programming for Data Handling
- Describe the pros and cons of using programming languages to work with data.
- Identify the languages most suitable for data handling.
- Explain the challenges of using programming languages versus data analysis tools.
Introduction to R, RStudio, and Quarto
- Describe the key attributes of the R programming language.
- Explain the role of RStudio and Quarto for R programming.
- Use RStudio to write a basic R program.
- Write a program which uses R's character (string), integer, double (float) and logical (boolean) data types.
Data Structures, Functions, and Pipes
- Construct dataframes and tibbles to solve data problems.
- Write reusable functions which can be used to alter data and automate repetitive tasks.
- Use a selection of R’s built-in functions and trusted packages, along with the base R pipe (|>) and the magrittr pipe (%>%) that dplyr re-exports.
Data Sources
- Read from CSV, Excel, and JSON files.
- Connect to databases using DBI paired with a database backend package.
Data Manipulation
- Create, manipulate, and alter dataframes and tibbles.
- Use base R and tidyverse methods for indexing, slicing, querying, filtering, grouping, pivoting, and merging tables.
Data Cleaning and Preparation
- Identify data quality metrics and missing data, and apply techniques to deal with them.
- Deduplicate, transform and replace values.
- Use string methods to manipulate text data.
- Write regular expressions which munge text data.
Methods for Visualising Data
- Construct and tailor data visualisations using ggplot2.
- Meaningfully visualise aggregate data.
If you enjoy this course and want to learn more about using R within a data- or software-related role, we recommend progressing onto one of the following:
- Statistics for Data Analysis in R (coming soon)
- Data Science and Machine Learning with R (coming soon)
- R Programming (coming soon)