Data Engineering with Databricks
Delivery format
Remote
Duration
2 days
Price
1367 €
This course provides an introduction to data engineering with Databricks, covering key tools and frameworks such as Delta Lake, Databricks Workflows, Delta Live Tables, and Unity Catalog. Participants will learn how to ingest, transform, and manage data using Delta Lake, deploy workloads with Databricks Workflows, build efficient pipelines with Delta Live Tables, and apply data governance principles using Unity Catalog. The course includes hands-on labs and real-world applications to ensure learners develop practical skills for working with Databricks effectively.
This course prepares learners for the Associate Data Engineering certification exam and provides the foundational knowledge required to advance to the Advanced Data Engineering with Databricks course.
By the end of this course, learners will be able to:
- Ingest, transform, and manage data using Delta Lake.
- Deploy and monitor data workloads with Databricks Workflows.
- Build scalable data pipelines using Delta Live Tables and the Medallion Architecture.
- Apply data governance principles and manage permissions using Unity Catalog.
- Troubleshoot, optimise, and monitor data workflows in Databricks.
Participants should have:
- Beginner-level familiarity with basic cloud concepts (virtual machines, object storage, identity management).
- Ability to perform basic code development tasks (e.g., creating compute instances, running code in notebooks, using basic notebook operations, and importing repositories from Git).
- Intermediate familiarity with SQL, including commands such as CREATE, SELECT, INSERT, UPDATE, DELETE, GROUP BY, and JOIN.
- Intermediate experience with SQL concepts such as aggregate functions, filters, sorting, indexes, tables, and views.
- Basic knowledge of Python programming, the Jupyter Notebook interface, and PySpark fundamentals.
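As a rough self-check of the SQL level listed above, the following sketch runs the named statements (CREATE, INSERT, UPDATE, DELETE, and a SELECT with JOIN and GROUP BY) against an in-memory SQLite database from Python's standard library. The table names and rows are made up for illustration, and SQLite's dialect differs from Databricks SQL in places, so treat it as a warm-up exercise rather than course material.

```python
import sqlite3


def run_prerequisite_queries():
    """Exercise the SQL statements named in the prerequisites on toy data."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # CREATE: two small, hypothetical tables
    cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )

    # INSERT: load a few rows into each table
    cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Aino"), (2, "Ben")])
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1, 10.0), (2, 1, 15.0), (3, 2, 7.5), (4, 2, 99.0)],
    )

    # UPDATE one order, DELETE another
    cur.execute("UPDATE orders SET amount = 20.0 WHERE id = 2")
    cur.execute("DELETE FROM orders WHERE id = 4")

    # SELECT with JOIN, GROUP BY, aggregate functions, and sorting
    cur.execute(
        """
        SELECT c.name, COUNT(*) AS n_orders, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY total DESC
        """
    )
    rows = cur.fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, n_orders, total in run_prerequisite_queries():
        print(name, n_orders, total)
```

If each statement here reads naturally and the result of the final query is predictable before running it, the SQL prerequisite is probably met.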
Target Audience
This course is designed for:
- Data Engineers who want to enhance their knowledge of Databricks and Delta Lake.
- Data Analysts looking to expand their expertise in data pipelines and transformation.
- Cloud Engineers and Developers working with big data frameworks.
- Professionals preparing for the Databricks Associate Data Engineering certification.
Data ingestion with Delta Lake
- Delta Lake and data objects
- Setting up and loading Delta tables
- Basic data transformations
- Lab: Loading data into Delta tables
- Cleaning and preparing data
- Complex transformations
- Using SQL UDFs
- Advanced Delta Lake features
- Lab: Manipulating Delta tables
Deploy workloads with Databricks Workflows
- Introduction to Databricks Workflows
- Jobs compute
- Scheduling tasks using the Jobs UI
- Lab: Creating and managing jobs in Databricks
- Exploring job features
- Conditional tasks and repairing runs
- Modular orchestration of workflows
- Best practices for Databricks Workflows
Build data pipelines with Delta Live Tables
- Understanding the Medallion Architecture
- Introduction to Delta Live Tables
- Using the Delta Live Tables UI
- Developing SQL pipelines
- Developing Python pipelines
- Running modes in Delta Live Tables
- Monitoring pipeline results and event logs
- Optional: Landing new data
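For orientation, the Medallion Architecture covered in this module layers data as bronze (raw, as ingested), silver (cleaned and validated), and gold (aggregated for consumption). The sketch below illustrates that flow in plain Python only. It does not use the Delta Live Tables API, and the function names, field names, and sample rows are hypothetical.

```python
from collections import defaultdict


def to_silver(bronze_rows):
    """Clean raw rows: skip incomplete records, deduplicate on event_id, cast types."""
    seen = set()
    silver = []
    for row in bronze_rows:
        # Skip rows without an id or without an amount
        if not row.get("event_id") or row.get("amount") in (None, ""):
            continue
        # Keep only the first occurrence of each event_id
        if row["event_id"] in seen:
            continue
        seen.add(row["event_id"])
        silver.append(
            {
                "event_id": row["event_id"],
                "country": row["country"].strip().upper(),
                "amount": float(row["amount"]),
            }
        )
    return silver


def to_gold(silver_rows):
    """Aggregate cleaned rows into a total amount per country."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


# Bronze layer: raw records exactly as they arrived (hypothetical sample data)
bronze = [
    {"event_id": "e1", "country": " fi", "amount": "10.5"},
    {"event_id": "e1", "country": " fi", "amount": "10.5"},  # duplicate
    {"event_id": "e2", "country": "SE", "amount": "4"},
    {"event_id": "e3", "country": "FI", "amount": ""},  # missing amount
    {"event_id": "e4", "country": "fi", "amount": "2.5"},
]

if __name__ == "__main__":
    silver = to_silver(bronze)
    print(to_gold(silver))
```

In a real pipeline each layer would be a managed table rather than a Python list, but the separation of raw, cleaned, and aggregated data is the same idea.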
Data management and governance with Unity Catalog
- Overview of data governance in Databricks
- Demo: Populating the Metastore
- Lab: Navigating the Metastore
- Organization and access patterns in Unity Catalog
- Demo: Upgrading tables to Unity Catalog
- Security and administration features
- Overview of Databricks Marketplace
- Managing privileges in Unity Catalog
- Demo: Controlling access to data
- Fine-grained access control
- Lab: Migrating and managing data with Unity Catalog
Exams and assessments
This course does not include formal assessments but does prepare learners for the Associate Data Engineering certification exam. The cost of the exam is not included with this course. Please speak to your account manager to add it to your order.
Hands-on learning
This course features:
- Interactive labs to apply concepts in a real-world Databricks environment.
- Guided exercises demonstrating how to configure and optimise Delta Lake, Workflows, and Unity Catalog.
- Real-world case studies showcasing best practices in data engineering with Databricks.
- Troubleshooting scenarios to develop problem-solving skills.
Price 1367 € + VAT
We reserve the right to make changes to the programme, trainers, and delivery format.