
Unlock New LLM Architectural Capabilities By Retraining

Key Takeaways:
  • Understand how different attention (and retention) mechanisms affect LLM behaviour, performance, and cost.
  • Learn the process of retraining large language models by repurposing pretrained weights and exploring new architectures.
  • Discover the latest research techniques—including long-context inference, hardware-efficient kernels, and scalable LLM families—and how to apply them in your projects.
Friday, November 21, 11 AM ET


Description

Large language models (LLMs) are evolving fast—and retraining offers a powerful path to new architectural capabilities, lower cost, and better performance. Recent research on the Brumby-14B-Base model shows how attention-free retention layers, efficient retraining from pretrained weights, and long-context processing can reshape what’s possible in generative AI. This session dives deep into these techniques and how you can apply them in your own work.
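As a simplified illustration of the idea behind attention-free retention layers, the sketch below implements a generic linear-recurrence update in PyTorch. It is not the power retention mechanism covered in the session; it only shows why retention-style layers can process long contexts with a fixed-size state instead of a key-value cache that grows with the sequence.

```python
# Illustrative sketch only: a generic linear-recurrence "retention" update,
# not Manifest AI's power retention implementation.
import torch

def recurrent_retention(q, k, v):
    """q, k, v: tensors of shape (seq_len, d). Returns outputs of shape (seq_len, d).

    The state has fixed shape (d, d), independent of sequence length,
    unlike an attention KV cache, whose memory grows with the context.
    """
    seq_len, d = q.shape
    state = torch.zeros(d, d)                    # constant-size recurrent state
    outputs = []
    for t in range(seq_len):
        state = state + torch.outer(k[t], v[t])  # accumulate key-value outer products
        outputs.append(q[t] @ state)             # read out with the current query
    return torch.stack(outputs)

q, k, v = (torch.randn(16, 64) for _ in range(3))
out = recurrent_retention(q, k, v)               # (16, 64); memory does not grow with seq_len
```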

In this code-along, Jacob Buckman, CEO of Manifest AI, will guide you through cutting-edge methods for LLM design, retraining, and deployment. You’ll explore how alternatives to standard attention, such as power retention, change model behaviour; how to repurpose pretrained models for new architectures; and how to engineer large-scale training pipelines. Whether you’re building foundation models or pushing the boundaries of generative AI research, this session gives you the tools and insights to lead the next wave.
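To make the retraining idea concrete, here is a minimal, hypothetical PyTorch sketch of repurposing pretrained weights for a new architecture. The tiny transformer block, the placeholder RetentionMixer, and the attribute names are illustrative stand-ins, not the code used in the webinar. The point is that everything except the attention sublayer keeps its pretrained parameters, and only the new sequence mixer is trained from scratch.

```python
# Hedged sketch (not the webinar's code): reuse pretrained transformer weights
# while swapping the attention sublayer for a different sequence mixer.
import copy
import torch
import torch.nn as nn

class Block(nn.Module):
    """Toy stand-in for one layer of a pretrained transformer."""
    def __init__(self, d):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x):
        y, _ = self.attn(x, x, x)
        return x + self.mlp(x + y)

class RetentionMixer(nn.Module):
    """Placeholder attention-free mixer; a real retention layer would go here."""
    def __init__(self, d):
        super().__init__()
        self.proj = nn.Linear(d, d)

    def forward(self, x, *_args, **_kwargs):
        return self.proj(x), None        # mimic MultiheadAttention's (output, weights) return

pretrained = nn.ModuleList([Block(64) for _ in range(2)])   # stand-in "pretrained" stack

# Keep the MLP weights, replace only the attention sublayer, then continue training.
retrained = copy.deepcopy(pretrained)
for block in retrained:
    block.attn = RetentionMixer(64)

x = torch.randn(1, 8, 64)
for block in retrained:
    x = block(x)                          # forward pass works with the swapped-in mixer
```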

Presenter Bio

Jacob Buckman, CEO at Manifest AI

Jacob runs Manifest AI, an AI research company. He is a co-creator of the power attention mechanism for long-context LLMs and an expert in deep learning and reinforcement learning. Previously, he was a resident at Google Brain.
