Jupyter for Data Science Teams Training Course
Jupyter is an open-source, web-based interactive IDE and computing environment.
This instructor-led, live training (online or onsite) introduces the concept of collaborative development in data science and demonstrates how to use Jupyter to track and engage as a team in the "life cycle of a computational idea". It guides participants through creating a sample data science project built on the Jupyter ecosystem.
By the end of this training, participants will be able to:
- Install and configure Jupyter, including setting up and integrating a team repository on Git.
- Utilise Jupyter features such as extensions, interactive widgets, multiuser mode and more to enable project collaboration.
- Create, share and organise Jupyter Notebooks with team members.
- Choose from Scala, Python, or R to write and execute code against big data systems such as Apache Spark, all through the Jupyter interface.
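As an illustration of the Git-based notebook sharing named in the objectives above, the following minimal Python sketch clears code-cell outputs from a notebook before it is committed, which tends to keep diffs small and reviewable. It assumes the notebook follows the version 4 JSON layout (a top-level `cells` list in which code cells carry `outputs` and `execution_count`). The names `strip_outputs` and `sample_notebook` are hypothetical and not part of Jupyter; dedicated tools such as nbstripout handle more cases.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a v4 notebook dict with code-cell outputs cleared.

    Hypothetical helper for illustration only; not a Jupyter API.
    """
    # A JSON round trip gives a deep copy, so the caller's dict is untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the run counter
    return cleaned


# Hypothetical in-memory notebook standing in for a team member's .ipynb file.
sample_notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Team analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

cleaned = strip_outputs(sample_notebook)
# Serialise with stable key order so repeated runs produce identical files.
print(json.dumps(cleaned, indent=1, sort_keys=True))
```

In practice a team might run a step like this from a Git pre-commit hook, so that only code and Markdown changes reach the shared repository.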
Course Format
- Interactive lecture and discussion.
- Abundant exercises and practice opportunities.
- Hands-on implementation in a live-lab environment.
Course Customisation Options
- Jupyter Notebook supports over 40 languages, including R, Python, Scala, and Julia. To tailor this course to your preferred language(s), please contact us to make arrangements.
Course Outline
Introduction to Jupyter
- Overview of Jupyter and its ecosystem
- Installation and setup
- Configuring Jupyter for team collaboration
Collaborative Features
- Using Git for version control
- Extensions and interactive widgets
- Multiuser mode
Creating and Managing Notebooks
- Notebook structure and functionality
- Sharing and organising notebooks
- Best practices for collaboration
Programming with Jupyter
- Choosing and using programming languages (Python, R, Scala)
- Writing and executing code
- Integrating with big data systems (Apache Spark)
Advanced Jupyter Features
- Customising the Jupyter environment
- Automating workflows with Jupyter
- Exploring advanced use cases
Practical Sessions
- Hands-on labs
- Real-world data science projects
- Group exercises and peer reviews
Summary and Next Steps
Requirements
- Programming experience in a language such as Python, R, or Scala
- A background in data science
Audience
- Data science teams
Open Training Courses require 5+ participants.
Testimonials (1)
It is great to have the course custom made to the key areas that I have highlighted in the pre-course questionnaire. This really helps to address the questions that I have with the subject matter and to align with my learning goals.
Winnie Chan - Statistics Canada
Course - Jupyter for Data Science Teams
Related Courses
Introduction to Data Science and AI using Python
35 Hours
This is a five-day introduction to Data Science and Artificial Intelligence (AI).
The course is delivered using Python, with practical examples and hands-on exercises.
Apache Airflow for Data Science: Automating Machine Learning Pipelines
21 Hours
This instructor-led, live training in New Zealand (online or on-site) is designed for intermediate-level participants who wish to automate and manage machine learning workflows, including model training, validation, and deployment using Apache Airflow.
By the end of this training, participants will be able to:
- Set up Apache Airflow for machine learning workflow orchestration.
- Automate data preprocessing, model training, and validation tasks.
- Integrate Airflow with machine learning frameworks and tools.
- Deploy machine learning models using automated pipelines.
- Monitor and optimise machine learning workflows in production.
Anaconda Ecosystem for Data Scientists
14 Hours
This instructor-led, live training in New Zealand (online or on-site) is aimed at data scientists who wish to leverage the Anaconda ecosystem to capture, manage, and deploy packages and data analysis workflows within a single platform.
By the end of this training, participants will be able to:
- Install and configure Anaconda components and libraries.
- Understand the core concepts, features, and benefits of Anaconda.
- Manage packages, environments, and channels using Anaconda Navigator.
- Use Conda, R, and Python packages for data science and machine learning.
- Explore practical use cases and techniques for managing multiple data environments.
AWS Cloud9 for Data Science
28 Hours
This instructor-led, live training in New Zealand (online or on-site) is aimed at intermediate-level data scientists and analysts who wish to use AWS Cloud9 to streamline their data science workflows.
By the end of this training, participants will be able to:
- Set up a data science environment in AWS Cloud9.
- Perform data analysis using Python, R, and Jupyter Notebook in Cloud9.
- Integrate AWS Cloud9 with AWS data services such as S3, RDS, and Redshift.
- Use AWS Cloud9 for developing and deploying machine learning models.
- Optimise cloud-based workflows for data analysis and processing.
Introduction to Google Colab for Data Science
14 Hours
This instructor-led, live training in New Zealand (online or on-site) is designed for beginner-level data scientists and IT professionals who wish to learn the fundamentals of data science using Google Colab.
By the end of this training, participants will be able to:
- Set up and navigate Google Colab.
- Write and execute basic Python code.
- Import and manage datasets.
- Create visualisations using Python libraries.
A Practical Introduction to Data Science
35 Hours
Participants who complete this training will gain a practical, real-world understanding of Data Science and its related technologies, methodologies and tools.
Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class.
The course starts with an introduction to elemental concepts of Data Science, then progresses into the tools and methodologies used in Data Science.
Audience
- Developers
- Technical analysts
- IT consultants
Format of the Course
- Part lecture, part discussion, exercises and heavy hands-on practice
Note
- To request a customised training for this course, please contact us to arrange.
Data Science for Big Data Analytics
35 Hours
Big data refers to datasets so vast and complex that traditional data processing software cannot effectively handle them. Key challenges in big data include data capture, storage, analysis, search, sharing, transfer, visualisation, querying, updating, and information privacy.
Data Science essential for Marketing/Sales professionals
21 Hours
This course is designed for Marketing and Sales professionals who wish to deepen their understanding of applying data science within Marketing and Sales contexts. The course provides comprehensive coverage of various data science techniques used for “upselling”, “cross-selling”, market segmentation, branding, and Customer Lifetime Value (CLV).
Distinguishing Marketing from Sales - How do sales and marketing differ?
In simple terms, sales can be described as a process that targets individuals or small groups, whereas marketing focuses on larger audiences or the general public. Marketing encompasses research (identifying customer needs), product development (creating innovative solutions), and promotion (through advertising) to raise product awareness among consumers. Essentially, marketing is about generating leads or prospects. Once the product is available in the market, it becomes the salesperson's role to persuade customers to make a purchase. Sales converts leads into purchases and orders and focuses on shorter-term objectives, whereas marketing is geared towards long-term goals.
Introduction to Data Science
35 Hours
This instructor-led, live training (online or on-site) is designed for professionals looking to embark on a career in Data Science.
By the end of this training, participants will be able to:
- Install and configure Python and MySQL.
- Understand what Data Science is and how it can deliver value to virtually any business.
- Learn the fundamentals of coding in Python.
- Grasp supervised and unsupervised Machine Learning techniques, including how to implement them and interpret their results.
Course Format
- Interactive lectures and discussions.
- Abundant exercises and hands-on practice.
- Real-world implementation in a live-lab environment.
Course Customisation Options
- To request a customised training session for this course, please contact us to arrange.
Kaggle
14 Hours
This instructor-led, live training in New Zealand (available online or on-site) is tailored for data scientists and developers who aspire to learn and advance their careers in data science using Kaggle.
By the end of this training, participants will be able to:
- Gain a solid understanding of data science and machine learning.
- Explore the fundamentals of data analytics.
- Learn how Kaggle operates and leverage its features effectively.
Data Science with KNIME Analytics Platform
21 Hours
KNIME Analytics Platform is a leading open-source option for data-driven innovation, helping you uncover the potential hidden in your data, mine fresh insights, or predict new futures. With over 1,000 modules, hundreds of ready-to-run examples, a comprehensive suite of integrated tools, and the widest range of advanced algorithms available, KNIME Analytics Platform is the perfect toolbox for any data scientist and business analyst.
This course on KNIME Analytics Platform offers an ideal opportunity for beginners, advanced users, and KNIME experts to get introduced to KNIME, learn how to use it more effectively, and create clear, comprehensive reports based on KNIME workflows.
This instructor-led, live training (available online or on-site) is designed for data professionals who want to leverage KNIME to address complex business challenges.
It is tailored for audiences without programming experience who wish to utilise cutting-edge tools to implement analytics scenarios.
By the end of this training, participants will be able to:
- Install and configure KNIME.
- Build data science scenarios.
- Train, test, and validate models.
- Implement the end-to-end value chain of data science models.
Course Format
- Interactive lectures and discussions.
- Numerous exercises and practice sessions.
- Hands-on implementation in a live lab environment.
Course Customisation Options
- To request customised training for this course or to learn more about the program, please contact us to arrange.
MATLAB Fundamentals, Data Science & Report Generation
35 Hours
In the first part of this training, we cover the fundamentals of MATLAB and its role as both a programming language and a development platform. This discussion includes an introduction to MATLAB syntax, arrays and matrices, data visualisation, script development, and object-oriented principles.
In the second part, we demonstrate how to use MATLAB for data mining, machine learning, and predictive analytics. To give participants a clear and practical understanding of MATLAB's approach and capabilities, we draw comparisons between using MATLAB and other tools such as spreadsheets, C, C++, and Visual Basic.
In the third part of the training, participants learn how to streamline their workflows by automating data processing and report generation.
Throughout the course, participants will apply the concepts they have learned through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough understanding of MATLAB's capabilities and will be able to use it to solve real-world data science problems, as well as to streamline their work through automation.
Assessments will be conducted throughout the course to gauge progress.
Course Format
- The course includes both theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation.
Note
- Practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange them.
Machine Learning for Data Science with Python
21 Hours
This instructor-led, live training in New Zealand (online or onsite) is aimed at intermediate-level data analysts, developers, or aspiring data scientists who wish to apply machine learning techniques in Python to extract insights, make predictions, and automate data-driven decisions.
By the end of this course, participants will be able to:
- Understand and differentiate key machine learning paradigms.
- Explore data preprocessing techniques and model evaluation metrics.
- Apply machine learning algorithms to solve real-world data problems.
- Use Python libraries and Jupyter notebooks for hands-on development.
- Build models for prediction, classification, recommendation, and clustering.
Accelerating Python Pandas Workflows with Modin
14 Hours
This instructor-led, live training in New Zealand (available online or on-site) is tailored for data scientists and developers who wish to utilise Modin to build and implement parallel computations with Pandas, enabling faster data analysis.
By the end of this training, participants will be able to:
- Set up the necessary environment to begin developing scalable Pandas workflows with Modin.
- Understand the features, architecture, and advantages of Modin.
- Identify the key differences between Modin, Dask, and Ray.
- Execute Pandas operations more efficiently using Modin.
- Implement the full Pandas API and its associated functions.
GPU Data Science with NVIDIA RAPIDS
14 Hours
This instructor-led, live training in New Zealand (available online or on-site) is intended for data scientists and developers who wish to use RAPIDS to build GPU-accelerated data pipelines, workflows, and visualisations, applying machine learning libraries such as XGBoost and cuML.
By the end of this training, participants will be able to:
- Set up the necessary development environment to build data models with NVIDIA RAPIDS.
- Understand the features, components, and benefits of RAPIDS.
- Leverage GPUs to accelerate end-to-end data and analytics pipelines.
- Implement GPU-accelerated data preparation and ETL processes using cuDF and Apache Arrow.
- Learn how to perform machine learning tasks with XGBoost and cuML algorithms.
- Build data visualisations and carry out graph analysis using cuXfilter and cuGraph.