Our Philosophy
Pico makes the process of designing model architectures and pretraining paradigms less of an art and more of a science. By understanding how models learn, we can inform and improve the way we build and train them.
Pico is a family of decoder-only language models trained under identical conditions and differing only in scale, accompanied by rich training checkpoints that capture activations, gradients, and other signals for interpretability research. It also provides a streamlined codebase for training and analyzing your own model suite.
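Because every model in the suite is trained the same way, comparing behavior across scales reduces to loading checkpoints side by side. A minimal sketch, assuming the suite is hosted on the Hugging Face Hub with one revision per training step; the repo ids and branch name below are illustrative assumptions, not a confirmed layout:

```python
# Hedged sketch: load two scales of the suite at the same training step.
# The repo ids and per-step branch naming are assumptions, not Pico's
# confirmed Hub layout.
from transformers import AutoModelForCausalLM, AutoTokenizer

SMALL = "pico-lm/pico-decoder-small"  # hypothetical repo id
LARGE = "pico-lm/pico-decoder-large"  # hypothetical repo id
STEP = "step-1000"                    # hypothetical per-checkpoint branch

small = AutoModelForCausalLM.from_pretrained(SMALL, revision=STEP, trust_remote_code=True)
large = AutoModelForCausalLM.from_pretrained(LARGE, revision=STEP, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(SMALL)
```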
Small-Scale Focus
Train and study models from 1M to 1B parameters, making experimentation with training paradigms practical and accessible.
Advanced Checkpointing
Access model activations, gradients, and other rich information throughout training for mechanistic interpretability research.
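To give a concrete sense of the kind of signal these checkpoints carry, the sketch below captures per-layer activations with standard PyTorch forward hooks and reads parameter gradients after a backward pass. This is generic PyTorch for illustration; Pico's own collection code and on-disk format differ.

```python
import torch
import torch.nn as nn

# Toy model standing in for a transformer block stack.
model = nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 16))
activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Register a forward hook on every Linear layer.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(save_activation(name))

loss = model(torch.randn(4, 16)).sum()
loss.backward()

# After backward(), every parameter's gradient is available in p.grad.
gradients = {n: p.grad.clone() for n, p in model.named_parameters()}
```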
Easy Retraining
Simple, modular codebase designed for researchers to modify and retrain the entire model suite with custom training paradigms.
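As a toy example of what swapping in a custom training paradigm can look like, the sketch below implements a sequence-length curriculum as one small function. Every name here is hypothetical and not part of Pico's API; the point is that a paradigm change can be a local, testable edit.

```python
def sequence_length(step: int, start: int = 128, end: int = 2048,
                    warmup_steps: int = 10_000) -> int:
    """Grow the context window linearly over the first warmup_steps."""
    if step >= warmup_steps:
        return end
    return start + step * (end - start) // warmup_steps

assert sequence_length(0) == 128
assert sequence_length(5_000) == 1_088
assert sequence_length(10_000) == 2_048
```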
PyTorch Lightning
Built on PyTorch Lightning for efficient, scalable training with minimal boilerplate code.
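For orientation, here is the Lightning pattern in miniature: model, loss, and optimizer live in one LightningModule, while the Trainer owns the loop, device placement, and checkpointing. This is a generic sketch, not Pico's actual training module.

```python
import torch
import torch.nn as nn
import lightning as L

class TinyLM(L.LightningModule):
    """A minimal next-token predictor showing the Lightning layout."""

    def __init__(self, vocab_size: int = 256, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def training_step(self, batch, batch_idx):
        tokens = batch  # (batch, seq) tensor of token ids
        logits = self.head(self.embed(tokens[:, :-1]))
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
        )
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=3e-4)

# Usage: L.Trainer(max_steps=1_000).fit(TinyLM(), train_dataloaders=loader)
```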
Minimal Dependencies
Lightweight framework with only essential dependencies, making it easy to install and modify.
Research Ready
Designed with researchers in mind, providing the tools and flexibility needed for academic exploration.