A website called “UK visa portal” has been quietly collecting passport scans, selfies, and personal data from thousands of travellers who thought they were applying through official channels.
SDPG is the main contribution. It extends GRPO with an exact per-token forward KL between the actor (without privileged context) and itself conditioned on privileged context c: ...