Ready to make a real impact in South Africa’s financial landscape? Our client, a leading and forward-thinking player in the Financial Services sector, is searching for a skilled and passionate SQL ...
Skill Eval Harness is a Python CLI for testing whether an Agent Skill changes observable output. It reads evals/shared-benchmark.json, emits answer-key-safe task rows, grades files under eval-runs/, ...