This is a port of BlinkDL/RWKV-LM to ggerganov/ggml. Besides the usual FP32, it supports FP16, quantized INT4, INT5 and INT8 inference. This project is focused on CPU, but cuBLAS is also supported.
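As a rough illustration of what quantized inference stores, here is a minimal sketch of symmetric block-wise INT8 quantization. It is not the format ggml or this project actually uses; the function names `quantize_int8` and `dequantize_int8` and the single-scale-per-block layout are assumptions chosen for clarity.

```python
from typing import List, Tuple


def quantize_int8(block: List[float]) -> Tuple[float, List[int]]:
    """Quantize one block of floats to signed 8-bit integers (hypothetical helper).

    Uses one shared scale per block so that the largest magnitude maps to 127.
    """
    max_abs = max((abs(x) for x in block), default=0.0)
    if max_abs == 0.0:
        # An all-zero (or empty) block needs no scale.
        return 0.0, [0] * len(block)
    scale = max_abs / 127.0
    # Round to the nearest integer and clamp to the symmetric range [-127, 127].
    q = [max(-127, min(127, round(x / scale))) for x in block]
    return scale, q


def dequantize_int8(scale: float, q: List[int]) -> List[float]:
    """Recover approximate floats from the stored scale and integers."""
    return [scale * v for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
scale, q = quantize_int8(weights)
restored = dequantize_int8(scale, q)
```

Each restored value differs from the original by at most about half of `scale`, which is the accuracy cost paid for storing one byte per weight instead of four.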
Forked from rohitg00/ai-engineering-from-scratch. 84% of students already use AI tools. Only 18% feel prepared to use them professionally ...