Upcoming Batch
- 25 to 29 Feb 2021, Online ‘Live’, 9:30am to 5:30pm Singapore Timezone, 5 Sessions / 40 Hours
- 1 to 5 Mar 2021, Online ‘Live’, 9:30am to 5:30pm Singapore Timezone, 5 Sessions / 40 Hours
- 26 to 30 Apr 2021, Online ‘Live’, 9:30am to 5:30pm Singapore Timezone, 5 Sessions / 40 Hours
- This is my second time to attend a training by Casugol and both times I was able to get something from it which can be applied to my current occupation. Dwayne is very patient and attends to your question, which is a good quality of how a trainer should be. Also, even at the last minute request, he extended to give way to our company's townhall. You ought to try them!
- Awesome to Learn, Thanks Sir Dwayne.
- Good beginner class to learn the foundation of the Data Analytics!
- Friendly trainer. Always stopping to ask if we have any questions and asking us questions to make the lesson interactive.
- Superb training experience!
- Very in depth seminar conducted by Mr Dwayne Ong on CDTP course. We appreciate your time and support for the past 2 days on the course.
- Mr Dwayne was very knowledgeable and offered very valuable information.
- Great tools to apply for organizations moving towards Digitalizations. Bahagian Pengambilan Pelajar IPT
- Very informative class. Participants able to learn latest trend in digital trend
- It was an amazing experience.
- A very interesting learning experience outside my expertise
- Nice experience on this training. Thank you.
- Machine learning is very interesting
- Learn enjoy with CASUGOL😁
Course Information
- Duration: 5-Day / 40 Hours
- Certification: Participants will receive a Certificate of Competency upon successfully completing the course and passing the examination
- Who Should Attend: Data Scientists, Analysts, Data Engineers, CIOs, CTOs, Software Programmers, and anyone interested in acquiring advanced knowledge and skills in Big Data, Hadoop and Python
Course Objective
To provide participants with advanced knowledge of Big Data Analytics.
Participants learn how Big Data Analytics is applied in real life through real-time demonstrations and scenario-based hands-on exercises.
Pre-Requisite
It is preferred that participants have successfully completed and passed Advanced Big Data Professional.
Examination
Participants are required to attempt an examination upon completion of the course. The exam tests a candidate’s knowledge and skills related to Big Data, Hadoop and Python, based on the syllabus covered.
Module 1
Overview of the Big Data Hadoop Ecosystem
Topics Covered
- Python Refresher
- Standard Toolkit for Hadoop and Analytics
- Understanding Relational, NoSQL, and Graph Databases
- Construction of Data Pipelines
- Data Modeling in Hadoop
- HDFS Schema Design
- HBase Schema Design
- Working with Metadata
Module 2
Advanced Hadoop Techniques
Topics Covered
- What is Data Ingestion?
- Different Ways to Perform Data Ingestion
- Data Extraction
- Data Processing in Hadoop
- Overview of MapReduce
- Working with Spark Components
- Pig and How It Is Used
- Overview of Hive
- Impala Speed-Oriented Design
Module 3
Introduction to Orchestration
Topics Covered
- Orchestration Frameworks in Hadoop
- Oozie Terminology and Workflow
- Windowing Analysis using Spark
- Parameterizing Workflows
- Scheduling Patterns
- Execution of Workflows
Module 4
Real-Time Processing with Hadoop
Topics Covered
- Stream Processing
- Integration of Apache Storm with HDFS and HBase
- Trident Overview
- Spark Streaming Overview
- Flume Interceptors
- Low-Latency Enrichment, Validation, Alerting, and Ingestion
- NRT Counting, Rolling Averages, and Iterative Processing
- Complex Data Pipelines
Module 5
Working with Big Data Framework using Python and Spark
Topics Covered
- Hadoop and Spark Refresher
- Spark SQL and Python Pandas DataFrame
- Improving Analysis Performance with Parquet and Partitions
- Working with Unstructured Data
- Working with Spark DataFrames
- Writing Output from Spark DataFrames
- Data Manipulation with Spark DataFrames
- Plotting Graphs in Spark
Module 6
Exploratory Data Analysis
Topics Covered
- Handling of Missing Values using Spark DataFrame
- Correlation Analysis with Python PySpark DataFrame
- Improving Analysis Performance with Parquet and Partitions
- Understanding Exploratory Data Analysis
- Identify Target Variable and Related KPIs
- Feature Importance of Target Variable
- Different Phases of an Analytics Project Life Cycle
- Gaussian Distribution of Numeric Features
Module 7
Advanced Big Data Analysis
Topics Covered
- Reproducible Approach to Gathering Data
- Understanding the Standards and Code Practices
- Segmentation of Workflow
- Missing Value Preprocessing with High Reproducibility
- Use of Functions / Loops to Optimize Coding
- Utilization of Libraries / Packages / Algorithms
- Normalization of Data
Module 8
Putting Everything Together
Topics Covered
- Reading Data from a CSV File with Python PySpark Object
- Reading JSON Data with Python PySpark Object
- Using Python PySpark Objects for SQL Operations
- Generating Statistical Measurements
- Visualisation Using Plotly
Module 9
Big Data and Machine Learning using Spark
Topics Covered
- Resilient Distributed Datasets with Spark
- Introduction to Spark MLlib
- Decision Tree with Spark MLlib
- K-Means Clustering with Spark
- Term Frequency – Inverse Document Frequency (TF-IDF)
- DataFrame API with Spark MLlib
- Understanding A/B Testing
Advanced Big Data Analytics Expert (ABDE) makes rigorous use of real-time case studies, hands-on exercises and group discussions.
What Past Participants Say
The knowledge acquired from CASUGOL program has allowed me to gain a deep understanding on the technology and will come handy when there is a requirement to kickstart future projects.
Syed Othman Abd Rahman
The hands-on exercises which is part of CASUGOL course syllabus are clearly explained, demonstrated and implemented in class with the guidance of experienced CASUGOL trainers.
Andrew Lim
Why CASUGOL
Customization of programmes for specific industries, organisations, government agencies, and statutory boards.
Flexible programmes designed to cater to the individual needs of participants, whether for professional upskilling, or for general interest.
Benefit from contributions by leading industry experts, academics, and researchers from across the world.
Opportunities for employers to develop their workforce and for individuals to enhance their career.
Dynamic learning environment that provides participants with professional networking opportunities.
Online support for participants after the training.
Need more information?
Let us help if you are planning to advance your career and further your education. Request more information.