Stay ahead with 100% free Google Cloud Certified Professional Data Engineer (Professional-Data-Engineer) practice questions
Your company is currently setting up data pipelines for its campaign. For all the Google Cloud Pub/Sub streaming data, one of the important business requirements is to be able to periodically identify the inputs and their timings during the campaign. Engineers have decided to use windowing and transformation in Google Cloud Dataflow for this purpose. However, when testing this feature, they find that the Cloud Dataflow job fails for all streaming inserts. What is the most likely cause of this problem?
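For context on what windowing means here (a minimal sketch, not tied to any specific answer choice, with illustrative window sizes and timestamps): Dataflow groups unbounded streaming elements into windows before aggregating them. Conceptually, fixed windowing assigns each element's timestamp to a bucket like this:

```python
# Minimal sketch of fixed-window assignment, as used conceptually by
# Dataflow/Apache Beam when windowing streaming Pub/Sub data.
# The 60-second window size and the timestamps are illustrative.

def fixed_window_start(timestamp_s: int, window_size_s: int) -> int:
    """Return the start of the fixed window containing the timestamp."""
    return timestamp_s - (timestamp_s % window_size_s)

# Group some example event timestamps into 60-second windows.
events = [3, 59, 60, 125, 180]
windows = {}
for ts in events:
    windows.setdefault(fixed_window_start(ts, 60), []).append(ts)

print(windows)  # {0: [3, 59], 60: [60], 120: [125], 180: [180]}
```

Without a non-global window (or triggers) applied, an unbounded stream has no point at which aggregations can complete, which is the kind of configuration issue this question probes.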
You have a BigQuery dataset named "customers". All tables will be tagged by using a Data Catalog tag template named "gdpr". The template contains one mandatory field, "has sensitive data", with a boolean value. All employees must be able to do a simple search and find tables in the dataset that have either true or false in the "has sensitive data" field. However, only the Human Resources (HR) group should be able to see the data inside the tables for which "has sensitive data" is true. You give the all-employees group the bigquery.metadataViewer and bigquery.connectionUser roles on the dataset. You want to minimize configuration overhead. What should you do next?
You are developing a data pipeline that will run several data transformation programs on Compute Engine virtual machines. You do not want to use your own credentials for authenticating and authorizing these programs, and you want to follow Google Cloud recommended practices. How would you authenticate and authorize the data transformation programs?
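For background (a sketch under assumptions, not a statement of the exam's answer): the recommended practice is to attach a dedicated service account to the VM, so programs obtain short-lived credentials from the metadata server rather than using a user's credentials. The endpoint path and required header below follow the documented metadata-server conventions; the sketch only constructs the request and makes no network call:

```python
# Sketch: how a program on a Compute Engine VM would request an access
# token for the VM's attached service account from the metadata server.
# This builds the request object only; it does not actually contact the server.
import urllib.request

TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)

def build_token_request() -> urllib.request.Request:
    # The "Metadata-Flavor: Google" header is required by the metadata server
    # to reject accidental or spoofed requests.
    return urllib.request.Request(
        TOKEN_URL, headers={"Metadata-Flavor": "Google"}
    )

req = build_token_request()
print(req.full_url)
```

In practice, client libraries (Application Default Credentials) perform this lookup automatically when no explicit credentials are supplied.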
You want to schedule a number of sequential load and transformation jobs. Data files will be added to a Cloud Storage bucket by an upstream process; there is no fixed schedule for when the new data arrives. Next, a Dataproc job is triggered to perform some transformations and write the data to BigQuery. You then need to run additional transformation jobs in BigQuery. The transformation jobs are different for every table, and these jobs might take hours to complete. You need to determine the most efficient and maintainable workflow to process hundreds of tables and provide the freshest data to your end users. What should you do?
How would you query specific partitions in a BigQuery table?
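One common approach (table and date values below are hypothetical): filter on the table's partitioning column, or on the `_PARTITIONDATE`/`_PARTITIONTIME` pseudo-columns for ingestion-time partitioned tables, so BigQuery prunes the partitions it scans. A small sketch that builds such a query string:

```python
# Sketch: building a query that restricts a scan to specific partitions of
# an ingestion-time partitioned BigQuery table. Table and dates are
# hypothetical placeholders.

def partition_query(table: str, start_date: str, end_date: str) -> str:
    # _PARTITIONDATE is BigQuery's pseudo-column on ingestion-time
    # partitioned tables; filtering on it limits which partitions are read.
    return (
        f"SELECT * FROM `{table}` "
        f"WHERE _PARTITIONDATE BETWEEN '{start_date}' AND '{end_date}'"
    )

sql = partition_query("my_project.my_dataset.events", "2024-01-01", "2024-01-07")
print(sql)
```

For column-partitioned tables, the same idea applies with the partitioning column (e.g. a `DATE` column) in place of the pseudo-column.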
© Copyrights TheExamsLab 2025. All Rights Reserved