Latest Databricks Certification Databricks-Certified-Data-Engineer-Associate exam question bank: download realistic practice questions for the Databricks Databricks-Certified-Data-Engineer-Associate course

Question 1
A data engineer needs to control access to data assets across multiple workspaces and enforce centralized governance policies. The organization wants fine-grained access control for tables, schemas, and catalogs. Which Databricks feature supports this requirement?

A. Unity Catalog

B. Delta Cache

C. MLflow

D. DBFS

Correct answer: A

Question 2
A data engineer streams customer orders into a Kafka topic (orders_topic) and is currently writing the ingestion script of a DLT pipeline. The data engineer needs to ingest the data from Kafka brokers to DLT using Databricks. What is the correct code for ingesting the data?

A.

B.

C.

D.

Correct answer: B
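
The answer options for this question did not survive extraction, so the sketch below is not a reconstruction of option B. It only illustrates, under stated assumptions, the general shape of a Kafka read with Structured Streaming that a DLT table function could return. The broker address `broker1:9092` and the helper names `kafka_source_options` and `read_orders_stream` are hypothetical; the option keys `kafka.bootstrap.servers`, `subscribe`, and `startingOffsets` belong to Spark's Kafka source.

```python
def kafka_source_options(bootstrap_servers: str, topic: str) -> dict:
    """Build the reader options for Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "earliest",  # assumption: replay the topic from the start
    }


def read_orders_stream(spark, bootstrap_servers: str, topic: str = "orders_topic"):
    """Return a streaming DataFrame over the Kafka topic.

    Needs a Spark runtime with the Kafka connector. Inside a DLT pipeline,
    a function decorated with @dlt.table would return this DataFrame.
    """
    options = kafka_source_options(bootstrap_servers, topic)
    return spark.readStream.format("kafka").options(**options).load()


# Hypothetical broker address, for illustration only.
print(kafka_source_options("broker1:9092", "orders_topic")["subscribe"])
```

Only the option-building helper runs without a cluster; `read_orders_stream` is meant to be called from a notebook or pipeline where `spark` is available.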

Question 3
What is the structure of an Asset Bundle?

A. A compressed archive (ZIP) that solely contains workspace assets without any accompanying metadata

B. A YAML configuration file that specifies the artifacts, resources, and configurations for the project

C. A single plain text file enumerating the names of assets to be migrated to a new workspace

D. A Docker image containing runtime environments and the source code of the assets

Correct answer: B

Question 4
A Python file is ready for production and the client wants to use the most efficient yet cost-effective type of cluster possible. The workload is quite small, processing only 10 GB of data with simple joins and no complex aggregations or wide transformations.
Which cluster meets the requirement?

A. Job cluster with spot instances disabled

B. Job cluster with Photon enabled

C. Job cluster with spot instances enabled

D. Interactive cluster

Correct answer: C

Question 5
A data engineer is writing a DataFrame to a Delta table and wants to physically divide the data into directories based on a specific column such as country. Which Spark DataFrame writer option should be used?

A. groupBy

B. partitionBy

C. distributeBy

D. orderBy

Correct answer: B
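
As a rough illustration of option B, the sketch below calls `partitionBy` on a DataFrame writer. The names `write_partitioned` and `partition_dir` and the path `/tmp/orders` are hypothetical. The helper only mimics the Hive-style `column=value` directory naming that a partitioned write produces, so the layout can be seen without a cluster.

```python
def partition_dir(base_path: str, column: str, value: str) -> str:
    """Return the Hive-style directory a partitioned write would use.

    Illustrative only: mimics the `column=value` naming convention.
    """
    return f"{base_path.rstrip('/')}/{column}={value}"


def write_partitioned(df, base_path: str, column: str = "country") -> None:
    """Write a Spark DataFrame in Delta format, partitioned by `column`.

    `df` is assumed to be a pyspark.sql.DataFrame; needs a Spark/Delta runtime.
    """
    (df.write
       .format("delta")
       .partitionBy(column)  # one subdirectory per distinct value of `column`
       .save(base_path))


print(partition_dir("/tmp/orders", "country", "US"))  # /tmp/orders/country=US
```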

Question 6
A data engineer needs to access a view created by the sales team, using a shared cluster. The data engineer has been granted usage permissions on the catalog and schema. What minimum additional permissions does the data engineer need in order to access the view?

A. Needs ALL PRIVILEGES on the VIEW

B. Needs ALL PRIVILEGES at the SCHEMA level

C. Needs SELECT permission on the VIEW and the underlying TABLE.

D. Needs SELECT permission only on the VIEW

Correct answer: D

Question 7
A data engineer needs to create a table in Databricks using data from a CSV file at location /path/to/csv. They run the following command:

Which of the following lines of code fills in the above blank to successfully complete the task?

A. USING DELTA

B. USING CSV

C. None of these lines of code are needed to successfully complete the task

D. FROM "path/to/csv"

E. FROM CSV

Correct answer: B

Question 8
A data engineer is loading a dataset into a Delta table but expects the schema to evolve over time as new columns are added. The pipeline should automatically handle new fields without failing ingestion jobs. Which Delta Lake option should the engineer enable during the write operation?

A. mergeSchema

B. overwriteSchema

C. autoCompact

D. optimizeWrite

Correct answer: A
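
A minimal sketch of option A, assuming an append to an existing Delta table. The function names and the sample column lists are hypothetical. The pure-Python helper only illustrates the expected result of schema merging (existing columns keep their position, new columns are added at the end); the actual merge is performed by Delta Lake when `mergeSchema` is set on the write.

```python
def merged_columns(table_cols, incoming_cols):
    """Illustrate the column order after schema merging.

    Existing columns stay in place; columns seen only in the incoming
    data are appended at the end.
    """
    return list(table_cols) + [c for c in incoming_cols if c not in table_cols]


def append_with_schema_evolution(df, table_name: str) -> None:
    """Append `df` to a Delta table, letting new columns extend its schema.

    `df` is assumed to be a pyspark.sql.DataFrame; needs a Spark/Delta runtime.
    """
    (df.write
       .format("delta")
       .mode("append")
       .option("mergeSchema", "true")  # accept columns missing from the table
       .saveAsTable(table_name))


print(merged_columns(["id", "name"], ["id", "name", "email"]))
# ['id', 'name', 'email']
```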

Question 9
Which of the following benefits of using the Databricks Lakehouse Platform is provided by Delta Lake?

A. The ability to collaborate in real time on a single notebook

B. The ability to support batch and streaming workloads

C. The ability to manipulate the same data using a variety of languages

D. The ability to distribute complex data operations

E. The ability to set up alerts for query failures

Correct answer: B

Question 10
The Delta transaction log for the 'students' table is shown using the 'DESCRIBE HISTORY students' command. A data engineer needs to query the table as it existed before the UPDATE operation listed in the log. Which commands should the data engineer use to achieve this? (Choose two.)

A. SELECT * FROM students@v4

B. SELECT * FROM students FROM HISTORY VERSION AS OF 3

C. SELECT * FROM students VERSION AS OF 5

D. SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:58.000+00:00'

E. SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:47.000+00:00'

Correct answers: A, E
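
The `DESCRIBE HISTORY` output referenced by the question is not included in the text, so the sketch below uses a made-up history purely for illustration. It shows the reasoning behind the answer: find the version committed just before the UPDATE, then query it with either `VERSION AS OF` or the `@v` shorthand. The helper names and the sample history are hypothetical; a `TIMESTAMP AS OF` query with that version's commit timestamp would be a third way to reach the same snapshot.

```python
def version_before(history, operation="UPDATE"):
    """Return the latest version committed before the first `operation` entry.

    Raises ValueError if the operation is absent or has no earlier version.
    """
    target = min(row["version"] for row in history if row["operation"] == operation)
    earlier = [row["version"] for row in history if row["version"] < target]
    if not earlier:
        raise ValueError(f"no version exists before the first {operation}")
    return max(earlier)


def time_travel_queries(table: str, version: int):
    """Build two equivalent Delta time-travel queries for `version`."""
    return [
        f"SELECT * FROM {table} VERSION AS OF {version}",
        f"SELECT * FROM {table}@v{version}",
    ]


# Hypothetical history, NOT the one shown in the original question.
history = [
    {"version": 0, "operation": "CREATE TABLE"},
    {"version": 1, "operation": "WRITE"},
    {"version": 2, "operation": "UPDATE"},
]

for query in time_travel_queries("students", version_before(history)):
    print(query)
# SELECT * FROM students VERSION AS OF 1
# SELECT * FROM students@v1
```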

Question 11
A data engineer needs to develop integration tests for an ETL process and deploy a version-controlled, packaged workflow into production using an external job scheduler. Which tool should the data engineer use for this job?

A. Databricks Connect

B. Databricks Asset Bundles

C. Databricks Software Development Kit

D. Databricks Command Line Interface

Correct answer: B

Question 12
A data engineer needs to read files from cloud object storage into a Spark DataFrame in Databricks. The files are stored in CSV format with headers and comma delimiters. Which Spark DataFrame reader option ensures that column names are correctly inferred from the first row?

A. header

B. inferSchema

C. delimiter

D. mode

Correct answer: A
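
A small sketch of option A, assuming a comma-delimited file with a header row. The function name `read_csv_with_header` and the sample data are hypothetical. Because the Spark call needs a running session, the runnable part swaps in Python's standard `csv.DictReader` as an analogy: it also takes column names from the first row, which is what `header` asks Spark's CSV reader to do.

```python
import csv
import io


def read_csv_with_header(spark, path: str):
    """Read a CSV file, using its first row as column names.

    `spark` is assumed to be an active SparkSession.
    """
    return (spark.read
                 .format("csv")
                 .option("header", "true")  # first row supplies column names
                 .option("delimiter", ",")
                 .load(path))


# Standard-library analogy (not Spark): the first row becomes the field names.
sample = "id,country\n1,US\n2,TW\n"
reader = csv.DictReader(io.StringIO(sample))
print(reader.fieldnames)  # ['id', 'country']
```

Without `header`, Spark would treat the first row as data and name the columns `_c0`, `_c1`, and so on.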

Question 13
A data engineer needs to optimize the data layout and query performance for an e-commerce transactions Delta table. The table is partitioned by "purchase_date", a date column, which helps with time-based queries but does not optimize searches on user statistics by "customer_id", a high-cardinality column.
The table is usually queried with filters on "customer_id" within specific date ranges, but because this data is spread across multiple files in each partition, these queries result in full partition scans and increased runtime and costs.
How should the data engineer optimize the data layout for efficient reads?

A. Alter table implementing liquid clustering on "customer_id" while keeping the existing partitioning.

B. Enable delta caching on the cluster so that frequent reads are cached for performance.

C. Alter the table to partition by "customer_id".

D. Alter the table implementing liquid clustering by "customer_id" and "purchase_date".

Correct answer: D

Question 14
Which two conditions are applicable for governance in Databricks Unity Catalog? (Choose two.)

A. You can have more than one metastore within a Databricks account console, but only one per region.

B. If a catalog is not associated with a location, it is mandatory to associate the schema with managed locations.

C. You can have multiple catalogs within a metastore, and one catalog can be associated with multiple metastores.

D. If the metastore is not associated with a location, it is mandatory to associate the catalog with managed locations.

E. Both catalog and schema must have a managed location in Unity Catalog, provided the metastore is not associated with a location.

Correct answers: A, D
