The “Applied Data Science with Python” course introduces your delegates to the most essential and practical applications of the Python programming language for data wrangling, management, analysis and basic visualisation. The course will provide them with practical skills in general Python programming for data science purposes and in a number of Python libraries specifically designed for scientific computing and data analysis, e.g. NumPy, pandas, matplotlib, IPython and SciPy.
The course is suitable for data and insights analysts/scientists, data engineers and data product developers who are responsible for pre-processing of data, analytics and reporting of findings.
The course covers a variety of topics related to data processing and analysis using the Python language. These include standard Python data structures and other data objects used for scientific and statistical computing available in the NumPy (multi-dimensional arrays) and pandas (Series, DataFrame) libraries; importing and exporting data from and to various file formats (Excel spreadsheets, CSV, tab-delimited, plain text etc.); basic and more advanced data transformations and essential data wrangling techniques; summaries, data aggregations, cross-tabulations, frequency and pivot tables; simple graphical representations of data (bar plots, histograms, box plots etc.) using the matplotlib and seaborn libraries; an introduction to hypothesis testing with correlations and t-tests; and the essentials of predictive modelling using multiple linear regression with the SciPy, statsmodels and scikit-learn packages.
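To give a flavour of the data objects and summaries mentioned above, here is a minimal sketch. The data values and the column names (`region`, `product`, `sales`) are invented purely for illustration and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented purely for illustration
scores = np.array([[72, 85], [64, 90], [88, 79]])  # 2-D NumPy array: 3 rows x 2 columns
column_means = scores.mean(axis=0)                  # mean of each column

df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "product": ["A", "A", "B", "B"],
    "sales": [120, 95, 80, 110],
})

# Data aggregation: total sales per region (returns a pandas Series)
sales_by_region = df.groupby("region")["sales"].sum()

# Pivot table: regions as rows, products as columns, summed sales as values
pivot = df.pivot_table(index="region", columns="product",
                       values="sales", aggfunc="sum")

# Cross-tabulation: how often each region/product combination occurs
counts = pd.crosstab(df["region"], df["product"])

print(column_means)
print(sales_by_region)
print(pivot)
print(counts)
```

The same pattern (load the data into a DataFrame, then group, pivot or cross-tabulate) applies to data imported from Excel or CSV files.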
Basic course information
Minimum recommended duration: 4-5 full days or 8-10 half-days (can be spread across multiple weeks)
Programming languages used: Python
Minimum number of attendees: 5
Course level: For beginners/novices; also suitable as a “refresher” for more advanced analysts.
Pre-requisites: No prior knowledge of Python is required from delegates attending this course; however, a keen interest in data analysis is assumed. It is recommended that attendees have practical experience in data processing or quantitative research – gathered from either professional work or university education/research. A good knowledge of statistics would be beneficial.
IT recommendations: To benefit fully from the course, it is recommended that attendees have the most recent version of the Anaconda distribution of Python (by Continuum Analytics) installed on their laptops (any operating system). Anaconda is a free and fully supported distribution, and you can download it directly from https://www.continuum.io/downloads. Please contact us should you have any questions related to the installation process or should you wish to use a different setup for your course.
The programme for each in-house training course is discussed and agreed individually with the client. The proposed contents of the course may include (but are not limited to) the following concepts and topics:
Use Python’s Anaconda distribution and its integrated development environment Spyder with Jupyter Notebooks to manage, develop and share a Python analytics project,
Understand and differentiate between a variety of data structures within the core Python language as well as the highly efficient and optimised data structures from the NumPy and pandas libraries,
Perform basic mathematical operations and use more advanced control flow,
Import and export data from/to various data file formats e.g. Excel spreadsheets, CSV, tab-delimited, text files, and also SQL databases,
Prepare, transform and manage datasets and their variables, add/delete rows, create samples and subsets, identify specific cases based on conditional search, sort cases, add/edit value and variable labels, deal with missing data, standardise, normalise and reshape data, merge datasets and use joins,
Carry out an extensive Exploratory Data Analysis (EDA): inspect the structure of datasets and their variables, calculate cross-tabulations and descriptive statistics to summarise the data e.g. pivot tables, summary tables and data aggregations,
Introduction to EDA plotting and graphical visualisations: histograms, density plots, scatterplots, box plots, bar plots, line graphs etc.,
Perform simple hypothesis testing and inferential statistics: tests of differences and correlations. Run tests for normality assumptions, t-tests, analyses of variance (ANOVA), correlations and simple regressions,
Data modelling: ANOVA and multiple linear regressions – understanding multivariate inferential tests and statistical outputs; using regressions for predictions on test data.
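As an illustration of the last two topics, the sketch below runs a t-test, a correlation and a multiple linear regression with predictions on held-out test data. It is a minimal example on simulated data; the group means, the regression coefficients and the 75/25 train/test split are assumptions made for the example, not part of the course programme.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

# Simulated (hypothetical) data: two groups drawn with different means
group_a = rng.normal(loc=50, scale=5, size=40)
group_b = rng.normal(loc=55, scale=5, size=40)

# Independent-samples t-test for a difference in group means
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Simulated regression data: y depends on two predictors plus small noise
X = rng.normal(size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Pearson correlation between the first predictor and the outcome
r, r_p_value = stats.pearsonr(X[:, 0], y)

# Hold back 25% of the rows as test data, fit on the rest
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# Predict on the test data and report R-squared on that same held-out set
predictions = model.predict(X_test)
r_squared = model.score(X_test, y_test)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print(f"r = {r:.2f}")
print("coefficients:", model.coef_, "intercept:", model.intercept_)
print(f"R-squared on test data: {r_squared:.3f}")
```

scikit-learn is used here for the prediction step; statsmodels could be used instead where a full table of inferential statistics is needed.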
Customise the course
We can adapt our in-house training courses to address your specific needs and requirements e.g.:
The course can be designed to include your own data. If this is not possible, e.g. due to data security issues, we can customise the course to contain exercises that address similar problems,
The course period can be spread across multiple weeks/months depending on your needs and availability – this will allow your delegates to revise and practise the learnt skills before the next session and provide them with additional time to internalise all presented material,
The course can include a custom project spread across several weeks/months with a follow-up session at the end of the period.
As all our in-house training courses are quoted individually, the final cost quotation will be based on several factors: the number of attendees, days of training (plus additional support/project guidance if needed), location of the training, complexity of IT setup and the extent of course customisation.
Arrange this course at your organisation
If you are interested in this in-house training course, please press the Ask For Quote button at the top of the page to request a quote based on your specific needs and the desired outcomes of the training.
In your enquiry please include the following information:
contact details of the person who should receive the quote,
number of delegates you would like to train,
approximate number of days (or half-days) you would like to arrange the course for (including additional support/project guidance if needed),
location of the training venue,
any details on course customisation or specific topics you would like the course to address – most importantly, please indicate the desired outcomes of the course if different from those presented above,
any other questions you may have.