Let's reproduce NanoGPT with JAX! (Part 1) | Towards Data Science

Part 1: Build the 124M-parameter GPT-2 with JAX. Part 2: Optimize training speed on a single GPU. Part 3: Multi-GPU training in JAX.


Source: Towards Data Science
