This hands-on 2-day, instructor-led, live, online training course comprehensively covers industry-standard and less common, more specialised clustering algorithms and customer segmentation methods along with their computational implementations in the Python programming language. It also serves as an in-depth introduction to the most recent, cutting-edge Big Data cluster analysis tools and methods, which provide greater computational efficiency and scalability than typical clustering methods. Through a series of coding tutorials, brief lectures and short practical exercises, the course demonstrates how to apply, optimise, visualise and evaluate clustering methods in academic, industrial and business settings, e.g. in social science, biological and medical sciences, genetics, customer/product segmentation and recommendation systems.
All clustering algorithms presented during this course will be implemented in the Python programming language, either through custom-made code or with functions and methods available in Python libraries, e.g. NumPy, pandas, SciPy, scikit-learn and statsmodels.
Who is this course for?
This instructor-led, live, online, short course is suitable for post-doctoral researchers, Master’s or PhD students, industry data analysts or enterprise data/ML scientists and ML engineers, who are currently using the Python programming language (preferably at intermediate level) and would like to expand their skills to include theoretical understanding and practical implementations of industry-standard and modern, cutting-edge, scalable clustering and customer segmentation algorithms in Python.
This is a 2-day instructor-led online training course with a week-long follow-up period. The course will run from 10:00 in the morning to ~16:30 in the afternoon (London, UK time) each day and will include a 45-minute break for lunch between morning and afternoon sessions. Following the course, you will be able to submit your solutions to practical exercises for which you will receive personal feedback from the tutor.
This training course is instructor-led – all online tutorials are presented live by our expert instructor, and you can ask questions, discuss the topic and interact with other learners. You can also email the tutor after the course if you have any questions related to the material presented during the course.
The course will be recorded – you will have access to the video recordings of the course and additional resources such as datasets, Python code, academic papers related to the topic of the workshop, and supplementary exercises via the Mind Project Learning Platform.
Course dates: Monday & Tuesday, 10th & 11th of October 2022, 10:00-16:30 London (UK) time
Deadline for registrations: Friday, 7th of October 2022 @ 17:00 London (UK) time
During this instructor-led live course you will:
- Implement industry-standard partitional, linkage-based, density-based or spectral clustering and customer segmentation methods such as k-means and hierarchical clustering approaches, as well as less common, but more specialised algorithms such as CLARA, affinity propagation, mean shift, DBSCAN, OPTICS, minimum spanning trees and spectral clustering to identify meaningful clusters in datasets from social science, biology/medical science/genetics, and business/finance fields,
- Learn about recent, cutting-edge cluster analysis techniques applicable to Big Data e.g. parallel implementations of density-based lightning connection clustering (LAPO-DBSCAN) and improved k-means (Wang, 2022) which rectify typical issues of common clustering approaches,
- Understand data processing requirements, computational efficiency and specific use cases for each presented method,
- Implement more complex and better optimised variants of typical methods by selecting their hyperparameters e.g. using different distance metrics, cluster linkage approaches and underlying mathematical algorithms,
- Evaluate and compare the clustering solutions returned by various methods and their variants through a number of metrics such as the Davies-Bouldin and Calinski-Harabasz scores, Dunn’s validation index, Silhouette Index, Jaccard Index, (Adjusted) Rand Index, Fowlkes-Mallows Index or (Adjusted) Mutual Information estimates,
- Visualise the obtained clusters in 2D and 3D plots, calculate profiling attributes for each cluster and interpret them,
- Discuss advanced methods of learning deep representations by reviewing Deep Learning clustering approaches of different types e.g. sequential multistep (e.g. Fast Spectral and Deep Sparse Subspace Clustering), joint (e.g. Task-Specific and Graph-Regularised Networks and Deep Clustering Networks) or closed-loop multistep (e.g. Deep Embedding Clustering – DEC) methods.
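As a rough illustration of the kind of workflow listed above (a minimal sketch, not taken from the course materials), the example below fits a few of the named algorithms with scikit-learn and compares them using some of the named evaluation metrics. The synthetic dataset, the `compare_clusterings` helper and all hyperparameter values (`n_clusters=3`, `eps=0.5`, `min_samples=5`) are assumptions chosen purely for demonstration.

```python
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def compare_clusterings(X, y_true, random_state=0):
    """Fit several clustering models and return a dict of evaluation scores.

    Hypothetical helper for illustration; hyperparameters are arbitrary.
    """
    models = {
        "k-means": KMeans(n_clusters=3, n_init=10, random_state=random_state),
        "hierarchical (Ward)": AgglomerativeClustering(n_clusters=3, linkage="ward"),
        "hierarchical (average)": AgglomerativeClustering(n_clusters=3, linkage="average"),
        "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
    }
    results = {}
    for name, model in models.items():
        labels = model.fit_predict(X)
        # DBSCAN marks noise points with the label -1; exclude them
        # before computing the internal validation metrics.
        mask = labels != -1
        n_clusters = len(set(labels[mask]))
        scores = {
            "n_clusters": n_clusters,
            # External metric: needs ground-truth labels.
            "ari": adjusted_rand_score(y_true, labels),
        }
        # Internal metrics need at least 2 clusters and fewer clusters than points.
        if 2 <= n_clusters < mask.sum():
            scores["silhouette"] = silhouette_score(X[mask], labels[mask])
            scores["davies_bouldin"] = davies_bouldin_score(X[mask], labels[mask])
            scores["calinski_harabasz"] = calinski_harabasz_score(X[mask], labels[mask])
        results[name] = scores
    return results


# Synthetic, well-separated data (an assumption; real datasets are rarely this clean).
X, y_true = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

results = compare_clusterings(X_scaled, y_true)
for name, scores in results.items():
    print(name, {k: round(float(v), 3) for k, v in scores.items()})
```

Higher Silhouette and Calinski-Harabasz values and lower Davies-Bouldin values generally indicate more compact, better-separated clusters, while the Adjusted Rand Index can only be computed when reference labels are available.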
Course pre-requisites and further instructions
We recommend that all attendees have the most recent version of Anaconda Individual Edition of Python installed on their PCs (any operating system). Anaconda’s Python is a free and fully-supported distribution and you can download it directly from https://www.anaconda.com/products/individual#Downloads. Please contact us should you have any questions or issues with the installation process. You may also use any other Python IDE of your choice and/or your own Python virtual environment. A list of Python libraries to pre-install before the course will be sent to the enrolled attendees in the Welcome Pack alongside other course Joining Instructions.
We recommend that attendees have practical experience in data processing/engineering or quantitative research with the Python programming language – gathered from either professional work or university education/research. A good knowledge of statistics or experience with ML techniques would be beneficial. We suggest that the course is preceded by our “Python for Data Analysis” instructor-led six-week online training course.
Your PC needs to be connected to a stable WiFi/Internet network (either home or office-based) and have the Zoom video-conferencing application installed.
You will need at least one commonly used web browser installed on your PC (e.g. Chrome, Safari, Firefox or Edge) to access our Mind Project Learning Platform.
Your course instructor
Your instructor for this course will be Simon Walkowiak. Simon is a director at Mind Project Limited and a Ph.D. researcher in Artificial Intelligence at the Bartlett Centre for Advanced Spatial Analysis (University College London) and the Alan Turing Institute in London. Simon holds a BSc (First Class Honours) in Psychology with Neuroscience and an MSc (Distinction) in Big Data Science. He conducts and manages research projects on implementation and computational optimisation of novel AI approaches applicable to large-scale datasets to predict human behaviour and spatial cognition. Simon is the author of “Big Data Analytics with R” (2016) – a widely used textbook on high-performance computing with the R language and its compatibility with the ecosystem of Big Data tools, e.g. SQL/NoSQL databases, Spark and Hadoop. Apart from research and data management consultancy, during the past several years, Simon has taught more than 150 in-house or open-to-public statistical training courses (in R, Python, SQL, Java and Scala languages) in the UK, Europe, Asia and the USA. His major clients include organisations from finance and banking (HSBC, RBS, GE Capital, European Central Bank, Credit Suisse etc.), research and academia (GSMA, CERN, UK Data Archive, Agri-Food Biosciences Institute, Newcastle University etc.), health (NHS), and government (Home Office, Ministry of Justice, Government Actuary’s Department etc.).
Discounts and multiple bookings
We offer 2 types of enrolment options:
- Regular Fee – full-priced enrolment for learners representing commercial organisations or self-funded individuals who do not meet our eligibility criteria for discounted rates (please see below),
- Discounted Fee – applicable to undergraduate and postgraduate students as well as representatives of registered charitable organisations and non-governmental organisations (NGOs) – this category also includes employees of the National Health Service (NHS).
Students and individuals who are eligible for the Discounted Fee should submit a copy of their student or organisation ID card (with their name and card expiry date visible) when purchasing their place on the course, so that their eligibility for the discount can be verified. Alternatively, eligibility can be verified by submitting either i.) a copy of a letter from the university registrar or student’s department confirming their status, or ii.) a copy of a letter from their employer (on company letter-headed paper with a charity/NGO registration number) which confirms their current position within the organisation.
Apart from the discounted fees for students or employees of charitable organisations and NGOs, we are able to offer further discounts on the overall cost of your training if you wish to attend multiple related courses or enrol several delegates on this specific course. Please note that this offer is only available through our website.
- If you book 3 or 4 tickets on any of our tutor-led open-to-public online training courses, you will receive a 5% discount on the total price of your booking.
- If you book 5 or more tickets on any of our tutor-led open-to-public online training courses, you will receive a 10% discount on the total price of your booking.
All discounts are calculated automatically when tickets are added to the Cart. For bookings of 6 or more delegates on one course, we recommend that you contact us directly – we may be able to arrange a separate course just for your delegates at a discounted rate.
Arrange this course at your organisation
If your delegates cannot attend this public course, or you are interested in arranging this training course exclusively for your delegates (or at your premises), or you simply need a bespoke, made-to-measure training solution, please request a quote for the in-house version of this course based on your specific needs and desired outcomes of the training.
You may email us directly at info(at)mindproject.io and include the following information in your enquiry:
- contact details of the person who should receive the quote,
- number of delegates you would like to train,
- approximate number of online sessions (or half-days / full days for an on-site in-house course) you would like to arrange the course for (including additional support/project guidance if needed),
- location of the training venue if not online,
- any details on course customisation or specific topics you would like the course to address – most importantly, please indicate the desired outcomes of the course if different from those presented above,
- any other questions you may have.
If you don’t know the answers to all of the questions above or you are at an early stage of the course planning process, we would be happy to arrange an informal chat and help you choose the most suitable and budget-efficient option.