Web Scraping with Python Training Course
Web scraping is a technique used to extract data from websites and save it to a local file or database.
This instructor-led, live training (available online or on-site) is designed for developers who want to use Python to automate the process of crawling multiple websites to extract data for processing and analysis.
By the end of this training, participants will be able to:
- Install and configure Python along with all relevant packages.
- Retrieve and parse data stored across numerous websites.
- Understand how websites function and how their HTML is structured.
- Build spiders to crawl the web at scale.
- Use Selenium to crawl AJAX-driven web pages.
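As a rough illustration of what "building a spider" involves, the sketch below performs a breadth-first crawl limited to a single domain. It is not taken from the course material: the FAKE_SITE dictionary, the example.test URLs, and the crawl and LinkParser names are all assumptions made up for this example, and the page fetcher is passed in as a parameter so the code runs offline. A real spider would swap in an HTTP client or a framework such as Scrapy.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory "website" so the sketch runs without network access.
FAKE_SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="http://other.test/x">ext</a>',
    "http://example.test/b": '<a href="/">home</a>',
}


class LinkParser(HTMLParser):
    """Collect the href attribute of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl that stays on the start URL's domain.

    `fetch` is any callable that returns a page's HTML, or None if the
    page is unavailable (an assumption of this sketch).
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}            # URLs already queued, to avoid revisits
    queue = deque([start_url])
    order = []                    # pages actually fetched, in visit order
    while queue and len(order) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        order.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return order


visited = crawl("http://example.test/", FAKE_SITE.get)
print(visited)
```

The `seen` set and the domain check are the two pieces that keep a crawler from looping forever or wandering off-site. Crawling at scale typically also needs rate limiting and robots.txt handling, which this sketch leaves out.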
Course Format
- Interactive lectures and discussions.
- Ample exercises and hands-on practice.
- Real-time implementation in a live-lab environment.
Course Customisation Options
- This course assumes prior programming knowledge.
- To request a customised training session for this course, please contact us to arrange.
Course Outline
Introduction
Setting up the Development Environment
Python Primer: Data Structures, Conditionals, File Handling, etc.
Python Packages for Web Scraping: Scrapy and BeautifulSoup
How a Website Works
How HTML is Structured
Making a Web Request
Scraping an HTML Page
Working with XPath and CSS
Filtering Data Using Regular Expressions
Creating a Web Crawler
Crawling AJAX and JavaScript Pages with Selenium
Web Scraping Best Practices
Troubleshooting
Summary and Conclusion
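To give a flavour of the middle of the outline (scraping an HTML page, filtering with regular expressions, and saving the result), here is a minimal sketch that uses only the Python standard library. It is an illustration under stated assumptions, not the course's own exercise: the SAMPLE_HTML page, the "item" and "price" class names, and the ItemParser, scrape, and to_csv helpers are all hypothetical. The course itself lists Scrapy and BeautifulSoup for this kind of work.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this over HTTP.
SAMPLE_HTML = """
<html><body>
  <ul id="products">
    <li class="item"><a href="/p/1">Widget</a> <span class="price">$19.99</span></li>
    <li class="item"><a href="/p/2">Gadget</a> <span class="price">$5.00</span></li>
    <li class="item"><a href="/p/3">Gizmo</a> <span class="price">call us</span></li>
  </ul>
</body></html>
"""

# Accept only prices of the form $12.34 (an assumption about the data).
PRICE_RE = re.compile(r"^\$(\d+\.\d{2})$")


class ItemParser(HTMLParser):
    """Collect href, name and raw price text for each <li class="item">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None   # dict for the <li> being parsed, if any
        self._field = None     # which key incoming text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "item":
            self._current = {"href": "", "name": "", "price": ""}
        elif self._current is not None and tag == "a":
            self._current["href"] = attrs.get("href", "")
            self._field = "name"
        elif (self._current is not None and tag == "span"
              and attrs.get("class") == "price"):
            self._field = "price"

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def scrape(html):
    """Parse the page and keep only rows whose price matches PRICE_RE."""
    parser = ItemParser()
    parser.feed(html)
    results = []
    for row in parser.rows:
        match = PRICE_RE.match(row["price"])
        if match:
            results.append({"href": row["href"], "name": row["name"],
                            "price": float(match.group(1))})
    return results


def to_csv(rows):
    """Serialise the rows as CSV text (written to memory, not to disk)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["href", "name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = scrape(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

The third product is dropped because its price text does not match the regular expression. Separating parsing (ItemParser) from filtering (scrape) and output (to_csv) keeps each step easy to change when a site's markup does.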
Requirements
- Programming experience, preferably in Python. If participants have programming experience in a language other than Python, the training can be extended to include more introductory Python exercises.
Audience
- Developers
Open Training Courses require 5+ participants.
Testimonials (1)
Many different examples and topics have been covered, from basic investigation to login management and dynamic page management.
Daniele Tagliaferro - Creditsafe Italia Srl
Course - Web Scraping with Python
Related Courses
Advanced Python: Best Practices and Design Patterns
28 Hours
This intensive, hands-on course covers advanced Python techniques, engineering best practices, and commonly used design patterns to build maintainable, testable, and high-performance Python applications. It emphasises modern tooling, typing, concurrency models, architecture patterns, and deployment-ready workflows.
This instructor-led, live training (online or onsite) is aimed at intermediate-level to advanced-level Python developers who wish to adopt professional practices and patterns for production-grade Python systems.
By the end of this training, participants will be able to:
- Apply Python typing, dataclasses, and type-checking to increase code reliability.
- Use design patterns and architecture principles to structure robust applications.
- Implement concurrency and parallelism correctly using asyncio and multiprocessing.
- Build well-tested code with pytest, property-based testing, and CI pipelines.
- Profile, optimise, and harden Python applications for production.
- Package, distribute, and deploy Python projects using modern tools and containers.
Course Format
- Interactive lectures and short demos.
- Hands-on labs and coding exercises each day.
- Capstone mini-project integrating patterns, testing, and deployment.
Course Customisation Options
- To request a customised training or focus area (data, web, or infra), please contact us to arrange.
Agentic AI Engineering with Python — Build Autonomous Agents
21 Hours
This course teaches practical engineering techniques to design, build, test, and deploy agentic (autonomous) systems using Python. It covers the agent loop, tool integrations, memory and state management, orchestration patterns, safety controls, and production considerations.
This instructor-led, live training (online or onsite) is aimed at intermediate to advanced-level ML engineers, AI developers, and software engineers who wish to build robust, production-ready autonomous agents using Python.
By the end of this training, participants will be able to:
- Design and implement the agent loop and decision-making workflows.
- Integrate external tools and APIs to extend agent capabilities.
- Implement short-term and long-term memory architectures for agents.
- Coordinate multi-step orchestrations and agent composability.
- Apply safety, access control, and observability best practices for deployed agents.
Course Format
- Interactive lectures and discussions.
- Hands-on labs building agents with Python and popular SDKs.
- Project-based exercises that produce deployable prototypes.
Course Customisation Options
- To request a customised training session for this course, please contact us to arrange.
Introduction to Data Science and AI using Python
35 Hours
This is a five-day introduction to Data Science and Artificial Intelligence (AI).
The course is delivered using Python, with practical examples and hands-on exercises.
Artificial Intelligence with Python (Intermediate Level)
35 Hours
Artificial Intelligence with Python involves building intelligent systems by leveraging Python’s rich ecosystem of AI and machine learning libraries.
This instructor-led, live training (available online or on-site) is designed for intermediate-level Python programmers who want to design, implement, and deploy AI solutions using Python.
By the end of this training, participants will be able to:
- Implement AI algorithms using Python’s core AI libraries.
- Work with supervised, unsupervised, and reinforcement learning models.
- Integrate AI solutions into existing applications and workflows.
- Evaluate model performance and optimise for accuracy and efficiency.
Course Format
- Interactive lectures and discussions.
- Abundant exercises and practical sessions.
- Hands-on implementation in a live-lab environment.
Course Customisation Options
- To request a customised version of this course, please contact us to arrange.
Algorithmic Trading with Python and R
14 Hours
This instructor-led, live training in New Zealand (online or onsite) is aimed at business analysts who wish to automate trading using algorithmic strategies, Python, and R.
By the end of this training, participants will be able to:
- Use algorithms to buy and sell securities rapidly in specialised increments.
- Reduce trading-related costs through the application of algorithmic trading.
- Automatically monitor stock prices and execute trades.
Applied AI from Scratch in Python
28 Hours
This is a four-day course introducing AI and its application using the Python programming language. An optional additional day is available for an AI project after completing the course.
AWS Cloud9 and Python: A Practical Guide
14 Hours
This instructor-led, live training in New Zealand (online or onsite) is designed for intermediate-level Python developers who wish to enhance their Python development skills using AWS Cloud9.
By the end of this training, participants will be able to:
- Set up and configure AWS Cloud9 for Python development.
- Understand the AWS Cloud9 IDE interface and its features.
- Write, debug, and deploy Python applications within AWS Cloud9.
- Collaborate with other developers using the AWS Cloud9 platform.
- Integrate AWS Cloud9 with other AWS services for advanced deployment scenarios.
Building Chatbots in Python
21 Hours
Chatbots are computer programs that automatically simulate human responses via chat interfaces. Chatbots help organisations maximise operational efficiency by providing simpler and faster options for user interactions.
In this instructor-led, live training, participants will learn how to build chatbots in Python.
By the end of this training, participants will be able to:
- Understand the fundamentals of building chatbots
- Build, test, deploy, and troubleshoot various chatbots using Python
Audience
- Developers
Format of the course
- Part lecture, part discussion, exercises and heavy hands-on practice
Note
- To request a customised training for this course, please contact us to arrange.
GPU Programming with CUDA and Python
14 Hours
This instructor-led, live training in New Zealand (online or onsite) is aimed at intermediate-level developers who wish to use CUDA to build Python applications that run in parallel on NVIDIA GPUs.
By the end of this training, participants will be able to:
- Use the Numba compiler to accelerate Python applications running on NVIDIA GPUs.
- Create, compile and launch custom CUDA kernels.
- Manage GPU memory.
- Convert a CPU-based application into a GPU-accelerated application.
Scaling Data Analysis with Python and Dask
14 Hours
This instructor-led, live training in New Zealand (delivered online or on-site) is aimed at data scientists and software engineers who wish to leverage Dask within the Python ecosystem to build, scale, and analyse large datasets.
By the end of this training, participants will be able to:
- Set up the environment to begin building big data processing solutions with Dask and Python.
- Explore the features, libraries, tools, and APIs available in Dask.
- Understand how Dask accelerates parallel computing in Python.
- Learn how to scale the Python ecosystem (NumPy, SciPy, and Pandas) using Dask.
- Optimise the Dask environment to maintain high performance when handling large datasets.
Data Analysis with Python, Pandas and Numpy
14 Hours
This instructor-led, live training in New Zealand (online or on-site) is designed for intermediate-level Python developers and data analysts seeking to enhance their skills in data analysis and manipulation using Pandas and NumPy.
By the end of this training, participants will be able to:
- Set up a development environment including Python, Pandas, and NumPy.
- Create a data analysis application using Pandas and NumPy.
- Perform advanced data wrangling, sorting, and filtering operations.
- Conduct aggregate operations and analyse time series data.
- Visualise data using Matplotlib and other visualisation libraries.
- Debug and optimise their data analysis code.
FARM (FastAPI, React, and MongoDB) Full Stack Development
14 Hours
This instructor-led, live training (available online or on-site) is designed for developers eager to harness the FARM (FastAPI, React, and MongoDB) stack to build dynamic, high-performance, and scalable web applications.
By the conclusion of this training, participants will be able to:
- Establish the essential development environment integrating FastAPI, React, and MongoDB.
- Grasp the core concepts, features, and advantages of the FARM stack.
- Learn how to construct REST APIs using FastAPI.
- Master the art of designing interactive applications with React.
- Develop, test, and deploy both front-end and back-end applications using the FARM stack.
Developing APIs with Python and FastAPI
14 Hours
This instructor-led, live training in New Zealand (online or on-site) is designed for developers who wish to leverage FastAPI with Python to build, test, and deploy RESTful APIs more efficiently and rapidly.
By the end of this training, participants will be able to:
- Set up the necessary development environment to build APIs using Python and FastAPI.
- Create APIs more quickly and easily using the FastAPI library.
- Learn how to define data models and schemas based on Pydantic and OpenAPI.
- Connect APIs to a database using SQLAlchemy.
- Implement security and authentication mechanisms in APIs using FastAPI tools.
- Build container images and deploy web APIs to a cloud server.
Fraud Detection with Python and TensorFlow
14 Hours
This instructor-led, live training in New Zealand (available online or on-site) is designed for data scientists who wish to leverage TensorFlow to analyse potential fraud data.
By the end of this training, participants will be able to:
- Develop a fraud detection model using Python and TensorFlow.
- Construct linear regression models to predict fraudulent activity.
- Build an end-to-end AI application for analysing fraud data.
Accelerating Python Pandas Workflows with Modin
14 Hours
This instructor-led, live training in New Zealand (available online or on-site) is tailored for data scientists and developers who wish to utilise Modin to build and implement parallel computations with Pandas, enabling faster data analysis.
By the end of this training, participants will be able to:
- Set up the necessary environment to begin developing scalable Pandas workflows with Modin.
- Understand the features, architecture, and advantages of Modin.
- Identify the key differences between Modin, Dask, and Ray.
- Execute Pandas operations more efficiently using Modin.
- Implement the full Pandas API and its associated functions.