A Baseline Analysis of Reward Models’ Ability To Accurately Analyze Foundation Models Under Distribution Shift - Scale Labs