Data Integration Best Practices In Azure Synapse Analytics

Trainer(s): Abhishek Narain
Provider: DPS 2022 (Data Platform Summit)
Duration: 8 Hours
Subtitles: Yes
Price: USD 149.5

Abstract:

Azure Synapse contains the same Data Integration engine and experiences as Azure Data Factory, allowing you to create rich at-scale ETL pipelines without leaving Azure Synapse Analytics.

  • Ingest data from 100+ data sources
  • Code-free ETL with Data Flow activities
  • Orchestrate notebooks, Spark jobs, stored procedures, SQL scripts, and more (ELT)

This training covers best practices for using Synapse pipelines and is aimed at data engineers who are new to Azure or Azure Synapse Analytics.

Modules

  1. ADF/Synapse Pipelines Overview
  2. Best practices
    • Metadata encryption (Microsoft-managed and customer-managed keys)
    • Source Control
    • Secure authentication – Managed identity (MSI) and Azure Key Vault (AKV) integration
    • Access control/RBAC in Synapse
    • Managed Virtual Network-enabled workspace
    • Monitoring and alerting (Observability)
  3. Copy: High-performance data integration (Extract-load)
  4. Data flows: Code-free transformation (Extract-transform-load)
    • Data quality, data masking, slowly changing dimensions (SCD) Type 2
    • Data mapper and lake databases
  5. Change Data Capture-based incremental extractions from various sources
  6. Scripts, Spark notebook, and Stored Procedure: Code-based transformation (Transform)
  7. Continuous integration and delivery (CI/CD)
  8. Real-world case studies
  9. Challenge, Q&A