Math Module in Python Examples

LUFFY: Learning to Reason Under Off‑Policy Guidance

LUFFY is a reinforcement learning framework that bridges the gap between zero-RL and imitation learning by incorporating off-policy reasoning traces into the training process. Built upon GRPO, LUFFY ...


SUOD: Accelerating Large-scale Unsupervised Heterogeneous Outlier Detection

Background: Outlier detection (OD) is a key data mining task for identifying abnormal objects among general samples, with numerous high-stakes applications including fraud detection and intrusion ...
