HPC/Big Data Certification Exam 2022 with complete solution

What qualifies as a Big Data workload?
· Consists of semi-structured or unstructured data not suitable for relational databases...
· Data volume is considered too large for other solutions (petabyte scale)
· Processing the data in a reasonable timeframe requires a massively parallel solution

What are the most common Big Data workloads?
· Batch processing
· In-memory processing
· ML (typically GPU based)

What challenges do customers run into with Big Data workloads on premises?
· Tracking growth patterns and scaling infrastructure to meet capacity requirements
· Time associated with procuring, deploying and maintaining infrastructure to meet demand
· The cost associated with processing this data using other methods
· The cost associated with Disaster Recovery when dealing with petabytes of data
· The cost and complexity associated with hardware refresh

Customers running Big Data workloads on OCI can ___?
· Dynamically scale capacity against demand
· Leverage Object Storage as a cost-effective Data Lake and for Disaster Recovery
· Take advantage of the best price/performance in the cloud
· Use OCI's managed service offerings to easily deploy and run common Big Data frameworks

What is Big Data Appliance (BDA)?
· Single-tenant, Cloudera-based hardware appliance deployed on-prem

What does Big Data Appliance (BDA) include?
· Cloudera Enterprise Data Hub (EDH) v5.12
· Big Data Manager
· Big Data SQL

What is Oracle Big Data Service (BDS)?
· Multitenant, managed Cloudera EDH Hadoop deployment

What does Oracle Big Data Service (BDS) include?
· Cloudera EDH v5.16.1 or v6.2.0
· Big Data Manager
· Big Data SQL

What is the difference between Big Data Cloud Service (BDCS) and Big Data Service (BDS)?
· BDCS - Gen1, BDS - Gen2
· BDCS - Cloudera EDH v5.16.x, BDS - Cloudera EDH v5.16.1 or v6.2.0
· BDCS - deprecated, BDS - license included in consumption

What is Oracle Data Flow (ODF)?
· Provides a serverless framework for running Spark-based workloads

Where do customers put data and application code for Oracle Data Flow (ODF) applications?
· Object Storage

Oracle Data Flow (ODF) provides support for what types of applications?
· Java
· Python
· SQL
· Scala

What is Oracle Data Science?
· Platform for data scientists to create projects which run notebook-based modeling on demand

What services does Oracle Data Science use?
· Compute
· Block Storage

What shapes are available for Oracle Data Science Notebook Sessions?
· VM.Standard.E2.2, VM.Standard.E2.4, VM.Standard.E2.8
· VM.Standard2.1, VM.Standard2.2, VM.Standard2.4, VM.Standard2.6, VM.Standard2.8, VM.Standard2.16, VM.Standard2.24

What is Oracle Streaming Service?
· Kafka-compatible producer/consumer service that ingests continuous streams of data

Which Hadoop distributions are supported on OCI?
· Cloudera
· Hortonworks
· MapR

Some self-managed Big Data products are driven by ___?
· OCI QuickStart program
· Marketplace

When deploying Hadoop on OCI, what should you consider?
· Normalize either OCPU or Memory against the OCI shapes used as workers to meet workload requirements
· Use HDFS replication factor 3 when using DenseIO NVMe storage to mitigate hardware failure
· After normalizing OCPU or Memory, use heterogeneous storage on DenseIO workers, leveraging Block Storage to augment HDFS capacity - this allows you to scale HDFS capacity around the workload
· Segregate cluster and storage network traffic on BM hosts when using Block Volumes for HDFS by leveraging both physical VNICs - create a storage network on the primary interface and deploy Hadoop on the secondary interface
· Use private IP networks for Hadoop cluster hosts - enable cluster access through edge node(s) or through private connectivity such as VPN or FastConnect

What is Terasort?
· Popular benchmark that measures the amount of time to sort 1TB of randomly distributed data on a given computer system
· Used to measure MapReduce performance of an Apache Hadoop cluster (all hardware layers - CPU, Memory, Storage, Network I/O)

What are the phases of Terasort?
· TeraGen
· TeraSort
· TeraValidate

What is TeraGen?
· Generates a random dataset of the specified size

What is TeraSort?
· Map, Shuffle, Reduce the source data into a smaller result set

What is TeraValidate?
· Reads the result set and validates it

What is TeraGen heavily dependent on?
· Write performance (the phase is write intensive)

What is TeraSort heavily dependent on?
· Read, processing, write, and I/O performance (the phase is intensive in all of these)

What is TeraValidate heavily dependent on?
· Read performance (the phase is read intensive)

What is the Terasort benchmark?
· The total time to run all 3 Terasort phases

Draw a diagram of what TeraSort looks like
· Look at study guide

What is the most important phase of Terasort?
· TeraSort

What are some steps for sizing when considering deployment on OCI?
· Always build in redundancy for DenseIO hosts to mitigate data loss - in the case of Hadoop, use an HDFS replication factor of 3 for local NVMe storage
· For low-risk environments, you can use a replication factor of 2 for Block Storage - more cost effective
· Block Storage throughput shares the bandwidth available to each instance VNIC

What are some best practices for Big Data migration?
· Object Storage
· Data Transfer Appliance
· FastConnect

What is Data Transfer Appliance?
· An appliance that Oracle delivers to customers whose governance requirements restrict copying sensitive data over the wire, or who have too much data to transfer (not enough time or bandwidth): the customer loads data onto the appliance, then Oracle uploads it to Object Storage
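The Hadoop deployment and sizing answers above (normalize workers against OCPU or Memory, then account for the HDFS replication factor) can be sketched as simple arithmetic. This is a rough illustration only: the function names, the choice to take the larger of the OCPU-based and Memory-based worker counts, and every number in the example are assumptions, not values from the study guide or from any OCI shape.

```python
import math


def workers_needed(required_ocpus: float, required_memory_gb: float,
                   shape_ocpus: float, shape_memory_gb: float) -> int:
    """Worker count that satisfies both the OCPU and the Memory requirement.

    Assumption: the stricter of the two normalizations is used.
    """
    by_ocpu = math.ceil(required_ocpus / shape_ocpus)
    by_memory = math.ceil(required_memory_gb / shape_memory_gb)
    return max(by_ocpu, by_memory)


def usable_hdfs_tb(workers: int, raw_tb_per_worker: float,
                   replication_factor: int) -> float:
    """Usable HDFS capacity: raw capacity divided by the replication factor."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return workers * raw_tb_per_worker / replication_factor


# Hypothetical workload and worker shape (illustrative numbers only).
workers = workers_needed(required_ocpus=400, required_memory_gb=7000,
                         shape_ocpus=52, shape_memory_gb=768)
nvme_usable = usable_hdfs_tb(workers, raw_tb_per_worker=48.0, replication_factor=3)
block_usable = usable_hdfs_tb(workers, raw_tb_per_worker=48.0, replication_factor=2)
print(workers, nvme_usable, block_usable)
```

With the same raw capacity, replication factor 2 yields 1.5 times the usable space of replication factor 3, which is the cost trade-off the sizing answer refers to for low-risk environments on Block Storage.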
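The Terasort answers above define the benchmark result as the total time across the three phases. A minimal sketch of that bookkeeping follows; the class name, field names, and timings are hypothetical and are not measurements from any cluster.

```python
from dataclasses import dataclass


@dataclass
class TerasortRun:
    """Wall-clock seconds for each Terasort phase (hypothetical record type)."""
    teragen_s: float       # write-intensive: generate the random dataset
    terasort_s: float      # read, process, write: map, shuffle, reduce
    teravalidate_s: float  # read-intensive: check the result set

    def total_s(self) -> float:
        """Benchmark result: total time to run all three phases."""
        return self.teragen_s + self.terasort_s + self.teravalidate_s


# Illustrative timings only.
run = TerasortRun(teragen_s=600.0, terasort_s=1800.0, teravalidate_s=300.0)
print(f"Terasort benchmark total: {run.total_s()} s")
```

Recording the phases separately also shows which layer dominates a run, for example whether the TeraSort phase accounts for most of the total.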

Last updated: 2 years ago

Document information

Uploaded on: Aug 31, 2022
Number of pages: 19
Seller: Excel
