Discord Server

Join the Discord server if you are interested in LLM architecture or distributed training/inference research.

Efficient GPU kernels written in both CUDA and Triton.

Cute Inductor

CuteInductor makes it easier to inject the kernels contained in this repository into any PyTorch module.