During the “Introduction to Hadoop” training course, your delegates will become familiar with the major characteristics and functionality of the Apache Hadoop platform and its ecosystem of tools for Big Data processing and analysis. The course provides hands-on practical experience with the Hadoop Distributed File System (HDFS) and the MapReduce framework. Attendees will learn to design and run simple MapReduce programs that process data and calculate a set of statistics. The course can also serve as a gentle introduction to the basics of the Java programming language and the essential Unix-like Hadoop File System shell commands.
This training course is designed for clients who are considering migrating their Big Data workflows to the Hadoop ecosystem, or who wish to upskill their analytics team with essential Big Data processing knowledge.
Basic course information
Minimum recommended duration: 4-5 full days or 8-10 half-days (can be spread across multiple weeks)
Programming languages used: Java (also HDFS shell commands and basics of SQL for Hive querying)
Minimum number of attendees: 5
Course level: Beginner to intermediate; suitable for data engineers, data scientists and developers.
Pre-requisites: Good IT skills and practical experience in manipulating large datasets are recommended. Some knowledge of the Java language and Unix commands will be beneficial; however, these will be explained during the training.
IT recommendations: During the course the attendees will run several MapReduce jobs on a Linux-based Mind Project Hadoop cluster. To benefit fully from the course, attendees should have at least one of the following web browsers installed on their laptops (any operating system): Chrome, Safari, Mozilla Firefox or Internet Explorer. The laptops should also be equipped with a simple text editor suitable for writing code and scripts, e.g. Notepad++ (for Windows users) or TextWrangler (for Mac users). Please be advised that we do not recommend the following applications: WordPad, Gedit or TextEdit. Other IT requirements will apply depending on the agreed setup. Please contact us should you wish to use a different setup for your course.
The programme for each in-house training course is discussed and agreed individually with the client. The proposed contents of the course may include (but are not limited to) the following concepts and topics:
Understanding the features, major characteristics, architecture and operation of Hadoop and its ecosystem, including Yet Another Resource Negotiator (YARN), the Hadoop Distributed File System (HDFS), the MapReduce programming framework and other Hadoop-related tools, e.g. HBase, Hive, Cassandra, Mahout and Pig,
Monitoring and diagnostics of the performance of Apache Hadoop clusters and their resources using Apache Ambari,
Management of large datasets in Hadoop Distributed File System (HDFS) using Hadoop File System shell commands,
Introduction to the Java programming language, its data structures, syntax, classes and objects, and its use in Hadoop,
Design and execution of simple parallel MapReduce programs (written in Java) for computing various statistics, summaries, data aggregations and analyses, and monitoring of their performance in real time,
Practical application of the learnt skills to deploy and provision Hadoop-based Big Data applications.
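To give a flavour of the MapReduce exercises listed above, the following is a minimal plain-Java sketch of the map, shuffle and reduce phases for computing a per-key average. It deliberately does not use the Hadoop API and runs on a single machine; the class name MapReduceSketch, its method names and the sample station/temperature records are all hypothetical, chosen purely for illustration and not taken from the course materials.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map -> shuffle -> reduce flow.
// It does NOT use the Hadoop API; all names here are hypothetical.
class MapReduceSketch {

    // Map phase: turn one input line "station,temperature" into a (key, value) pair.
    static Map.Entry<String, Double> map(String line) {
        String[] fields = line.split(",");
        return new SimpleEntry<>(fields[0].trim(), Double.parseDouble(fields[1].trim()));
    }

    // Shuffle phase: group together all values that share the same key.
    static Map<String, List<Double>> shuffle(List<Map.Entry<String, Double>> pairs) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Map.Entry<String, Double> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values of one key into a single statistic (the mean).
    static double reduce(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Runs the three phases in sequence on an in-memory list of input lines.
    static Map<String, Double> run(List<String> lines) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample records, for illustration only.
        List<String> lines = List.of("london,12.0", "paris,15.0", "london,14.0");
        System.out.println(run(lines)); // prints {london=13.0, paris=15.0}
    }
}
```

In a real Hadoop job, the map and reduce steps would run in parallel across the nodes of the cluster and the framework itself would perform the shuffle; the sketch only mirrors that flow in a single process.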
Customise the course
We can adapt our in-house training courses to address your specific needs and requirements e.g.:
The course can be designed to include your own data. If this is not possible, e.g. due to data security issues, we can customise the course to contain exercises that address similar problems,
The course period can be spread across multiple weeks or months depending on your needs and availability; this will allow your delegates to revise and practise the learnt skills before the next session and give them additional time to internalise the material presented,
The course can include a custom project spread across several weeks or months, with a follow-up session at the end of the period.
As all our in-house training courses are quoted individually, the final cost quotation will be based on several factors: the number of attendees, days of training (plus additional support/project guidance if needed), location of the training, complexity of IT setup and the extent of course customisation.
Arrange this course at your organisation
If you are interested in this in-house training course, please press the Ask For Quote button at the top of the page to request a quote based on your specific needs and the desired outcomes of the training.
In your enquiry please include the following information:
contact details of the person who should receive the quote,
number of delegates you would like to train,
approximate number of days (or half-days) you would like to arrange the course for (including additional support/project guidance if needed),
location of the training venue,
any details on course customisation or specific topics you would like the course to address; most importantly, please indicate the desired outcomes of the course if they differ from those presented above,
any other questions you may have.