Course Outline

Introduction to Biren GPU Architecture

  • Biren overview and use cases.
  • Hardware layout: cores, memory, compute clusters.
  • Comparison with NVIDIA and AMD GPUs.

Setting Up the Biren Programming Environment

  • Installing the Biren SDK and runtime.
  • Understanding the toolchain and compiler model.
  • Basic project structure and build process.

GPU Programming with the Biren Stack

  • Thread and block models.
  • Memory management and data transfers.
  • Kernel development and launch patterns.

Porting from CUDA to Biren

  • Translation techniques for CUDA code.
  • Common API mappings and adaptations.
  • Code conversion labs and practice.

Debugging and Profiling

  • Using Biren’s debugger and profiler.
  • Identifying bottlenecks.
  • Memory access patterns and optimisation.

Optimisation Techniques

  • Thread scheduling and instruction pipelining.
  • Loop unrolling and shared memory utilisation.
  • Advanced kernel tuning for throughput.

Case Study and Application Examples

  • Training a model with Biren accelerators.
  • Porting and profiling a vision or NLP model.
  • Comparing performance against CUDA on NVIDIA GPUs.

Summary and Next Steps

Requirements

  • An understanding of GPU architecture and parallel processing.
  • Experience with CUDA, OpenCL, or similar GPU programming environments.
  • Familiarity with deep learning frameworks such as PyTorch or TensorFlow.

Audience

  • HPC developers.
  • AI infrastructure engineers.
  • Performance optimisation specialists.

Duration

  • 21 hours.
