rwkv_simple

Training and inference scripts for the RWKV x060 model.

Inspired by https://github.com/yuunnn-w/RWKV_Pytorch and https://github.com/BlinkDL/RWKV-LM

TODO
- parallel forward with attention mask
- tokenizer apply_chat_template method