BIRD-bench Text-to-Efficient-SQL: BIRD is the first text-to-SQL benchmark designed to encourage semantic parsers to produce SQL queries that are not only correct but also efficient.
GitHub - bird-bench/BIRD-CRITIC-1: [NeurIPS 2025 Main] SWE-SQL ... BIRD-Critic 1.0 introduces a novel SQL benchmark designed to evaluate a key capability: can large language models (LLMs) diagnose and solve user issues within real-world database environments? The benchmark comprises 600 tasks for development and 200 held-out out-of-distribution (OOD) tests.