HPC/Big Data Certification Exam 2022 with complete solution

What qualifies as a Big Data workload?
>>>>>· Consists of semi-structured or unstructured data not suitable for relational databases ...
· Data volume is considered too large for other solutions (Petabyte scale)
· To process the data in a reasonable timeframe, a massively parallel solution is required

What are the most common Big Data workloads?
>>>>>· Batch processing
· In-memory processing
· ML (typically GPU based)

What are the challenges customers run into for Big Data workloads on premises?
>>>>>· Tracking growth patterns and scaling infrastructure to meet capacity requirements
· Time associated with procuring, deploying and maintaining infrastructure to meet demand
· The cost associated with processing this data using other methods
· The cost associated with Disaster Recovery when dealing with Petabytes of data
· The cost and complexity associated with hardware refresh

Customers running Big Data workloads on OCI can ___?
>>>>>· Dynamically scale capacity against demand
· Leverage Object Storage as a cost-effective Data Lake and for Disaster Recovery
· Take advantage of the best price/performance in the cloud
· Use OCI's managed service offerings to easily deploy and run common Big Data frameworks

What is Big Data Appliance (BDA)?
>>>>>· Single-tenant, Cloudera-based hardware appliance deployed on-prem

What does Big Data Appliance (BDA) include?
>>>>>· Cloudera Enterprise Data Hub (EDH) v5.12
· Big Data Manager
· Big Data SQL

What is Oracle Big Data Service (BDS)?
>>>>>· Multitenant, managed Cloudera EDH Hadoop deployment

What does Oracle Big Data Service (BDS) include?
>>>>>· Cloudera EDH v5.16.1 or v6.2.0
· Big Data Manager
· Big Data SQL

What is the difference between Big Data Cloud Service (BDCS) and Big Data Service (BDS)?
>>>>>· BDCS - Gen1, BDS - Gen2
· BDCS - Cloudera EDH v5.16.x, BDS - Cloudera EDH v5.16.1 or v6.2.0
· BDCS - deprecated, BDS - license included in consumption

What is Oracle Data Flow (ODF)?
>>>>>· Provides a serverless framework for running Spark-based workloads

Where do customers put data and application code for Oracle Data Flow (ODF) applications?
>>>>>· Object Storage

Oracle Data Flow (ODF) provides support for what type of applications?
>>>>>· Java
· Python
· SQL
· Scala

What is Oracle Data Science?
>>>>>· Platform for data scientists to create projects which run notebook-based modeling on demand

What services does Oracle Data Science use?
>>>>>· Compute
· Block Storage

What shapes are available for Oracle Data Science Notebook Sessions?
>>>>>· VM.Standard.E2.2, VM.Standard.E2.4, VM.Standard.E2.8
· VM.Standard2.1, VM.Standard2.2, VM.Standard2.4, VM.Standard2.6, VM.Standard2.8, VM.Standard2.16, VM.Standard2.24

What is Oracle Streaming Service?
>>>>>· Kafka-compatible producer/consumer service that ingests continuous streams of data

Which Hadoop distributions are supported on OCI?
>>>>>· Cloudera
· Hortonworks
· MapR

Some self-managed Big Data products are driven by ___?
>>>>>· OCI QuickStart program
· Marketplace

When deploying Hadoop on OCI, what should you consider?
>>>>>· Normalize either OCPU or Memory against the OCI shapes used as workers to meet workload requirements
· Use HDFS replication factor 3 when using DenseIO NVMe storage to mitigate hardware failure
· After normalizing OCPU or Memory, use heterogeneous storage on DenseIO workers, leveraging Block Storage to augment HDFS capacity - this allows you to scale HDFS capacity around the workload
· Segregate cluster and storage network traffic on BM hosts when using Block Volumes for HDFS by leveraging both physical VNICs - create a storage network for the primary interface and deploy Hadoop on the secondary interface
· Use private IP networks for Hadoop cluster hosts - enable cluster access through edge node(s), or through private connectivity such as VPN or FastConnect

What is Terasort?
>>>>>· Popular benchmark that measures the amount of time to sort 1TB of randomly distributed data on a given computer system
· Used to measure MapReduce performance of an Apache Hadoop cluster (all hardware layers - CPU, Memory, Storage, Network I/O)

What are the phases of Terasort?
>>>>>· TeraGen
· TeraSort
· TeraValidate

What is TeraGen?
>>>>>· Generates a random dataset of a specified size

What is TeraSort?
>>>>>· Map, Shuffle, Reduce the source data into smaller result set

What is TeraValidate?
>>>>>· Reads the result set and validates it

What is TeraGen heavily dependent on?
>>>>>· Write intensive

What is TeraSort heavily dependent on?
>>>>>· Read, process, write, and I/O intensive

What is TeraValidate heavily dependent on?
>>>>>· Read intensive

What is the Terasort benchmark?
>>>>>· Total time to run all 3 Terasort phases

Draw a diagram of what TeraSort looks like.
>>>>>· Look at study guide

What is the most important phase of Terasort?
>>>>>· TeraSort

What are some steps for sizing when considering deployment on OCI?
>>>>>· Always build in redundancy for DenseIO hosts to mitigate data loss - in the case of Hadoop, use an HDFS replication factor of 3 for local NVMe storage
· For low-risk environments, a replication factor of 2 can be used for Block Storage - more cost effective
· Block Storage throughput uses the same bandwidth available to each instance VNIC

What are some best practices for Big Data migration?
>>>>>· Object Storage
· Data Transfer Appliance
· FastConnect

What is Data Transfer Appliance?
>>>>>· An option for customers whose governance requirements restrict copying sensitive data over the wire, or who have too much data to transfer (not enough time or bandwidth): Oracle delivers an appliance, the customer loads data onto it, and Oracle then uploads the data to Object Storage
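
The sizing answers above (replication factor 3 on local NVMe, replication factor 2 on Block Storage) come down to simple arithmetic, since HDFS stores every block `replication` times. The following is a minimal sketch of that arithmetic only. The function names, the 25% headroom default, and the capacity figures in the usage note are hypothetical and are not taken from any OCI shape specification.

```python
import math


def usable_hdfs_tb(raw_tb_per_worker: float, workers: int, replication: int) -> float:
    """Usable HDFS capacity when every block is stored `replication` times."""
    return raw_tb_per_worker * workers / replication


def workers_needed(dataset_tb: float, raw_tb_per_worker: float,
                   replication: int, headroom: float = 0.25) -> int:
    """Smallest worker count whose raw storage holds the replicated dataset.

    `headroom` is an assumed extra fraction reserved for growth and
    temporary job output; it is an illustrative default, not a rule.
    """
    required_raw_tb = dataset_tb * (1 + headroom) * replication
    return math.ceil(required_raw_tb / raw_tb_per_worker)
```

For example, with a hypothetical 50 TB of raw storage per worker, `workers_needed(100, 50, replication=3)` gives 8 workers, while the same dataset at replication factor 2 gives 5, which is the cost difference the sizing answer points at.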
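
The three Terasort phases described above can be mimicked at toy scale in a single Python process. This sketch is only an analogue for study purposes. It is not Hadoop's implementation: the real benchmark runs as MapReduce jobs over HDFS, and the first-byte range partitioner used here is a simplifying assumption. All function names are hypothetical.

```python
import random
import time


def teragen(num_records: int, seed: int = 0) -> list:
    """Phase 1 (write heavy): generate records with random 10-byte keys."""
    rng = random.Random(seed)
    return [(bytes(rng.getrandbits(8) for _ in range(10)), i)
            for i in range(num_records)]


def terasort(records: list, num_partitions: int = 4) -> list:
    """Phase 2: range-partition by key (map/shuffle), then sort each partition (reduce)."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in records:
        # Keys with a smaller first byte always land in an earlier partition.
        partitions[key[0] * num_partitions // 256].append((key, payload))
    result = []
    for part in partitions:
        # Concatenating sorted partitions in order yields a total order.
        result.extend(sorted(part))
    return result


def teravalidate(result: list, expected_count: int) -> bool:
    """Phase 3 (read heavy): check record count and global key order."""
    if len(result) != expected_count:
        return False
    return all(result[i][0] <= result[i + 1][0] for i in range(len(result) - 1))


def run_benchmark(num_records: int) -> dict:
    """Time each phase; the benchmark figure is the total across all three."""
    timings = {}
    start = time.perf_counter()
    records = teragen(num_records)
    timings["teragen"] = time.perf_counter() - start
    start = time.perf_counter()
    result = terasort(records)
    timings["terasort"] = time.perf_counter() - start
    start = time.perf_counter()
    timings["valid"] = teravalidate(result, num_records)
    timings["teravalidate"] = time.perf_counter() - start
    timings["total"] = (timings["teragen"] + timings["terasort"]
                        + timings["teravalidate"])
    return timings
```

Calling `run_benchmark(100_000)` returns per-phase timings plus their sum, mirroring how the benchmark result is the total time for all three phases.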
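
The "not enough time/bandwidth" case in the Data Transfer Appliance answer can be checked with a rough estimate of how long an over-the-wire copy would take. This is a back-of-the-envelope sketch. The function name, the decimal unit convention (1 TB = 8,000 gigabits), and the 80% link utilization default are assumptions for illustration.

```python
def transfer_days(data_tb: float, link_gbps: float, utilization: float = 0.8) -> float:
    """Estimated days to copy `data_tb` terabytes over a `link_gbps` link.

    Uses decimal units (1 TB = 8,000 gigabits) and assumes only a
    `utilization` fraction of the link is available for the copy.
    """
    gigabits = data_tb * 8_000
    seconds = gigabits / (link_gbps * utilization)
    return seconds / 86_400


# 1 PB over a 1 Gbps link versus a 10 Gbps link, both at 80% utilization.
slow = transfer_days(1_000, 1)
fast = transfer_days(1_000, 10)
```

Under these assumptions a petabyte takes roughly 116 days at 1 Gbps and about 12 days at 10 Gbps, which is the kind of result that makes a shipped appliance worth considering.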

About the document


Uploaded On

Aug 31, 2022

Number of pages

19

Seller
Excel