MCA Microsoft Certified Associate Azure Data Engineer Study Guide
Exam DP-203
Paperback by Benjamin Perkins
Language: English

68,85 €*

incl. VAT

Free shipping by post / DHL

Delivery time 1-2 weeks

Description
Prepare for the Azure Data Engineering certification--and an exciting new career in analytics--with this must-have study aid

In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech.

In the book, you'll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. You'll learn to integrate, transform, and consolidate data from various structured and unstructured data systems into a structure suitable for building analytics solutions, and you'll get up to speed quickly and efficiently with Sybex's easy-to-use study aids and tools.

This Study Guide also offers:
* Career-ready advice for anyone hoping to ace their first data engineering job interview and excel on their first day in the field
* Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety
* Complimentary access to Sybex's expansive online study tools, accessible across multiple devices, and offering access to hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms

A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.
About the Author

Benjamin Perkins is currently employed at Microsoft in Munich, Germany, as a Senior Escalation Engineer on the Azure team. He is a C# programming expert and cloud engineer who has been working professionally in the IT industry for almost three decades. His roles in IT have spanned the entire spectrum including programmer, system architect, technical support engineer, team leader, and mid-level management. While employed at Hewlett-Packard and Compaq Computer Corporation, he received numerous awards, degrees, and certifications.

Table of Contents

Introduction xxvii

Part I Azure Data Engineer Certification and Azure Products 1

Chapter 1 Gaining the Azure Data Engineer Associate Certification 3

The Journey to Certification 7

How to Pass Exam DP-203 8

Understanding the Exam Expectations and Requirements 9

Use Azure Daily 17

Read Azure Articles to Stay Current 17

Have an Understanding of All Azure Products 20

Azure Product Name Recognition 21

Azure Data Analytics 23

Azure Synapse Analytics 23

Azure Databricks 26

Azure HDInsight 28

Azure Analysis Services 30

Azure Data Factory 31

Azure Event Hubs 33

Azure Stream Analytics 34

Other Products 35

Azure Storage Products 36

Azure Data Lake Storage 37

Azure Storage 40

Other Products 42

Azure Databases 43

Azure Cosmos DB 43

Azure SQL Server Products 46

Additional Azure Databases 46

Other Products 47

Azure Security 48

Azure Active Directory 48

Role-Based Access Control 51

Attribute-Based Access Control 53

Azure Key Vault 53

Other Products 55

Azure Networking 56

Virtual Networks 56

Other Products 59

Azure Compute 59

Azure Virtual Machines 59

Azure Virtual Machine Scale Sets 60

Azure App Service Web Apps 60

Azure Functions 60

Azure Batch 60

Azure Management and Governance 60

Azure Monitor 61

Azure Purview 61

Azure Policy 62

Azure Blueprints (Preview) 62

Azure Lighthouse 62

Azure Cost Management and Billing 62

Other Products 63

Summary 64

Exam Essentials 64

Review Questions 66

Chapter 2 CREATE DATABASE dbName; GO 69

The Brainjammer 70

A Historical Look at Data 71

Variety 73

Velocity 74

Volume 74

Data Locations 74

Data File Formats 75

Data Structures, Types, and Concepts 83

Data Structures 83

Data Types and Management 92

Data Concepts 95

Data Programming and Querying for Data Engineers 125

Data Programming 126

Querying Data 143

Understanding Big Data Processing 169

Big Data Stages 169

ETL, ELT, ELTL 174

Analytics Types 175

Big Data Layers 176

Summary 177

Exam Essentials 177

Review Questions 179

Part II Design and Implement Data Storage 181

Chapter 3 Data Sources and Ingestion 183

Where Does Data Come From? 185

Design a Data Storage Structure 189

Design an Azure Data Lake Solution 190

Recommended File Types for Storage 198

Recommended File Types for Analytical Queries 199

Design for Efficient Querying 200

Design for Data Pruning 203

Design a Folder Structure That Represents the Levels of Data Transformation 203

Design a Distribution Strategy 205

Design a Data Archiving Solution 206

Design a Partition Strategy 207

Design a Partition Strategy for Files 209

Design a Partition Strategy for Analytical Workloads 210

Design a Partition Strategy for Efficiency and Performance 211

Design a Partition Strategy for Azure Synapse Analytics 211

Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2 212

Design the Serving/Data Exploration Layer 213

Design Star Schemas 214

Design Slowly Changing Dimensions 215

Design a Dimensional Hierarchy 219

Design a Solution for Temporal Data 220

Design for Incremental Loading 222

Design Analytical Stores 223

Design Metastores in Azure Synapse Analytics and Azure Databricks 224

The Ingestion of Data into a Pipeline 228

Azure Synapse Analytics 228

Azure Data Factory 268

Azure Databricks 275

Event Hubs and IoT Hub 301

Azure Stream Analytics 303

Apache Kafka for HDInsight 314

Migrating and Moving Data 316

Summary 317

Exam Essentials 317

Review Questions 319

Chapter 4 The Storage of Data 321

Implement Physical Data Storage Structures 322

Implement Compression 322

Implement Partitioning 325

Implement Sharding 328

Implement Different Table Geometries with Azure Synapse Analytics Pools 329

Implement Data Redundancy 331

Implement Distributions 341

Implement Data Archiving 342

Azure Synapse Analytics Develop Hub 346

Implement Logical Data Structures 360

Build a Temporal Data Solution 361

Build a Slowly Changing Dimension 365

Build a Logical Folder Structure 368

Build External Tables 369

Implement File and Folder Structures for Efficient Querying and Data Pruning 372

Implement a Partition Strategy 375

Implement a Partition Strategy for Files 376

Implement a Partition Strategy for Analytical Workloads 377

Implement a Partition Strategy for Streaming Workloads 378

Implement a Partition Strategy for Azure Synapse Analytics 378

Design and Implement the Data Exploration Layer 379

Deliver Data in a Relational Star Schema 379

Deliver Data in Parquet Files 385

Maintain Metadata 386

Implement a Dimensional Hierarchy 386

Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster 388

Recommend Azure Synapse Analytics Database Templates 389

Implement Azure Synapse Analytics Database Templates 389

Additional Data Storage Topics 390

Storing Raw Data in Azure Databricks for Transformation 390

Storing Data Using Azure HDInsight 392

Storing Prepared, Trained, and Modeled Data 393

Summary 394

Exam Essentials 395

Review Questions 396

Part III Develop Data Processing 399

Chapter 5 Transform, Manage, and Prepare Data 401

Ingest and Transform Data 402

Transform Data Using Azure Synapse Pipelines 404

Transform Data Using Azure Data Factory 410

Transform Data Using Apache Spark 414

Transform Data Using Transact-SQL 429

Transform Data Using Stream Analytics 431

Cleanse Data 433

Split Data 435

Shred JSON 439

Encode and Decode Data 445

Configure Error Handling for the Transformation 450

Normalize and Denormalize Values 451

Transform Data by Using Scala 461

Perform Exploratory Data Analysis 463

Transformation and Data Management Concepts 473

Transformation 473

Data Management 480

Azure Databricks 481

Data Modeling and Usage 485

Data Modeling with Machine Learning 486

Usage 494

Summary 500

Exam Essentials 500

Review Questions 502

Chapter 6 Create and Manage Batch Processing and Pipelines 505

Design and Develop a Batch Processing Solution 507

Design a Batch Processing Solution 510

Develop Batch Processing Solutions 512

Create Data Pipelines 538

Handle Duplicate Data 560

Handle Missing Data 569

Handle Late-Arriving Data 571

Upsert Data 572

Configure the Batch Size 578

Configure Batch Retention 581

Design and Develop Slowly Changing Dimensions 582

Design and Implement Incremental Data Loads 583

Integrate Jupyter/IPython Notebooks into a Data Pipeline 590

Revert Data to a Previous State 591

Handle Security and Compliance Requirements 592

Design and Create Tests for Data Pipelines 593

Scale Resources 593

Design and Configure Exception Handling 593

Debug Spark Jobs Using the Spark UI 594

Implement Azure Synapse Link and Query the Replicated Data 594

Use PolyBase to Load Data to a SQL Pool 595

Read from and Write to a Delta Table 595

Manage Batches and Pipelines 596

Trigger Batches 597

Schedule Data Pipelines 597

Validate Batch Loads 598

Implement Version Control for Pipeline Artifacts 604

Manage Data Pipelines 607

Manage Spark Jobs in a Pipeline 609

Handle Failed Batch Loads 610

Summary 610

Exam Essentials 611

Review Questions 612

Chapter 7 Design and Implement a Data Stream Processing Solution 615

Develop a Stream Processing Solution 617

Design a Stream Processing Solution 618

Create a Stream Processing Solution 630

Process Time Series Data 657

Design and Create Windowed Aggregates 658

Process Data Within One Partition 661

Process Data Across Partitions 663

Upsert Data 665

Handle Schema Drift 674

Configure Checkpoints/Watermarking During Processing 680

Replay Archived Stream Data 685

Design and Create Tests for Data Pipelines 688

Monitor for Performance and Functional Regressions 689

Optimize Pipelines for Analytical or Transactional Purposes 689

Scale Resources 690

Design and Configure Exception Handling 691

Handle Interruptions 694

Ingest and Transform Data 694

Transform Data Using Azure Stream Analytics 694

Monitor Data Storage and Data Processing 695

Monitor Stream Processing 695

Summary 695

Exam Essentials 696

Review Questions 697

Part IV Secure, Monitor, and Optimize Data Storage and Data Processing 699

Chapter 8 Keeping Data Safe and Secure 701

Design Security for Data Policies and Standards 702

Design a Data Auditing Strategy 711

Design a Data Retention Policy 716

Design for Data Privacy 717

Design to Purge Data Based on Business Requirements 719

Design Data Encryption for Data at Rest and in Transit 719

...
Details
Publication year: 2023
Genre: Imports, Mathematics
Category: Science & Technology
Format: Paperback
Length: 1008 pp.
ISBN-13: 9781119885429
ISBN-10: 1119885426
Language: English
Binding: Paperback / softcover
Author: Perkins, Benjamin
Manufacturer: John Wiley & Sons Inc
Responsible person for the EU: Wiley-VCH GmbH, Boschstr. 12, D-69469 Weinheim, amartine@wiley-vch.de
Dimensions: 234 x 186 x 53 mm
By/With: Benjamin Perkins
Publication date: 6 September 2023
Weight: 1.832 kg
Item ID: 121331760