Paste or upload your examples. Modeljury reads how hard the task is, builds an evaluation tailored to it, shortlists the models capable enough (and tells you why it cut the rest), then bakes off the survivors to find the cheapest one that clears your bar.
Describe what you want in plain English (type or use voice). Next, add a few labelled examples below: generate them with AI, upload a file, or type your own. Then read the task difficulty below.
The bake-off grades each model against examples in the form input | expected, one per line. Add some in any of three ways:
This is what makes Modeljury different: it builds an evaluation tailored to your task — the test cases, what counts as correct, and how it's graded. Review it and add anything that's missing.
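For readers who want a concrete picture of the input | expected format, here is a minimal sketch of how such lines could be parsed and graded. It is not Modeljury's implementation: the function names, the choice to split on the last pipe, and the case-insensitive exact-match grading rule are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def parse_examples(text: str) -> List[Tuple[str, str]]:
    """Parse 'input | expected' lines, one example per line."""
    examples = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if "|" not in line:
            raise ValueError(f"line {number}: expected 'input | expected'")
        # Assumption: split on the LAST pipe so the input may contain '|'.
        task_input, expected = line.rsplit("|", 1)
        examples.append((task_input.strip(), expected.strip()))
    return examples


def accuracy(model: Callable[[str], str], examples: List[Tuple[str, str]]) -> float:
    """Fraction of examples where the model output matches the expected text.

    Assumption: grading is a case-insensitive exact match; a tailored
    evaluation could use a different rule.
    """
    if not examples:
        return 0.0
    correct = sum(
        1
        for task_input, expected in examples
        if model(task_input).strip().lower() == expected.lower()
    )
    return correct / len(examples)
```

Here `model` is any callable that maps an input string to an output string, so a stub function can stand in for a real model when trying the sketch out.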
Constraints prune the candidates before cost matters.
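The ordering described here (constraints first, then the quality bar, then cost) can be sketched as follows. This is an illustrative outline only: the `Candidate` fields, the single context-window constraint, and the model names in the usage note are hypothetical, not Modeljury's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    cost: float           # hypothetical price per unit of usage
    context_window: int   # hypothetical hard-constraint attribute
    score: float          # fraction of examples passed in the bake-off


def cheapest_passing(
    candidates: List[Candidate], min_context: int, bar: float
) -> Optional[Candidate]:
    """Return the cheapest candidate that meets the constraint and clears the bar."""
    # Step 1: hard constraints prune candidates regardless of price.
    eligible = [c for c in candidates if c.context_window >= min_context]
    # Step 2: keep only the candidates that clear the quality bar.
    passing = [c for c in eligible if c.score >= bar]
    # Step 3: cost matters only now; None means nothing qualified.
    return min(passing, key=lambda c: c.cost, default=None)
```

For example, given a cheap `model-a` that fails the context constraint and a pricier `model-b` that passes everything, the function returns `model-b`: the cheaper option was pruned before cost was ever compared.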