github-actions bot (Contributor) commented Oct 1, 2025

This PR adds benchmark results for the z-ai/glm-4.6 model.

The following files have been updated:

  • src/benchmark/results.json - Raw benchmark results
  • src/benchmark/validation-results.json - Validation results against human baseline

This PR was automatically generated by the benchmark workflow.

Note: If you don't want to merge this PR, close it and the model will be added to the untested list to prevent re-processing.

@alrocar

vercel bot commented Oct 1, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| llm-benchmark | Ready | Preview | Comment | Oct 1, 2025 0:28am |

"details": "Results don't match",
"humanRowCount": 10,
"llmRowCount": 10,
"sql": "\nSELECT \n repo_name,\n countIf(event_type, toYear(created_at) = 2016) as events_2016,\n countIf(event_type, toYear(created_at) = 2017) as events_2017,\n countIf(event_type = 'WatchEvent', toYear(created_at) = 2017) as stars_2017,\n events_2017 / events_2016 as stagnation_ratio\nFROM github_events\nWHERE toYear(created_at) IN (2016, 2017)\n AND repo_name != ''\nGROUP BY repo_name\nHAVING events_2016 > 0 \n AND stars_2017 >= 1\nORDER BY stagnation_ratio ASC\nLIMIT 10"

Suggested change
- "sql": "\nSELECT \n repo_name,\n countIf(event_type, toYear(created_at) = 2016) as events_2016,\n countIf(event_type, toYear(created_at) = 2017) as events_2017,\n countIf(event_type = 'WatchEvent', toYear(created_at) = 2017) as stars_2017,\n events_2017 / events_2016 as stagnation_ratio\nFROM github_events\nWHERE toYear(created_at) IN (2016, 2017)\n AND repo_name != ''\nGROUP BY repo_name\nHAVING events_2016 > 0 \n AND stars_2017 >= 1\nORDER BY stagnation_ratio ASC\nLIMIT 10"
+ "sql": "\nSELECT \n repo_name,\n countIf(toYear(created_at) = 2016) as events_2016,\n countIf(toYear(created_at) = 2017) as events_2017,\n countIf(event_type = 'WatchEvent', toYear(created_at) = 2017) as stars_2017,\n events_2017 / events_2016 as stagnation_ratio\nFROM github_events\nWHERE toYear(created_at) IN (2016, 2017)\n AND repo_name != ''\nGROUP BY repo_name\nHAVING events_2016 > 0 \n AND stars_2017 >= 1\nORDER BY stagnation_ratio ASC\nLIMIT 10"

The SQL query contains invalid ClickHouse syntax with incorrect countIf function usage that would cause execution errors.

Analysis

Invalid ClickHouse countIf syntax in z-ai/glm-4.6 benchmark result

What fails: At line 13878 of src/benchmark/validation-results.json, two countIf calls use a two-argument form, countIf(event_type, <condition>), instead of a single boolean condition.

How to reproduce: Execute the SQL query from the z-ai/glm-4.6 model result, which contains:

countIf(event_type, toYear(created_at) = 2016)
countIf(event_type, toYear(created_at) = 2017)

Result: ClickHouse would reject the query with a syntax error, because countIf accepts only one boolean condition argument.

Expected: According to the ClickHouse documentation, the correct syntax is:

countIf(toYear(created_at) = 2016)
countIf(toYear(created_at) = 2017)
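
The rule stated in this analysis (one boolean condition per countIf call) can be checked mechanically before a benchmark result is merged. The following is a minimal sketch, not part of the benchmark workflow: the helper name count_if_arg_counts is hypothetical, and the scanner assumes plain single-quoted SQL strings without escaped quotes. It only counts top-level arguments and does not decide whether ClickHouse itself would accept a given call.

```python
import re


def count_if_arg_counts(sql: str) -> list[tuple[str, int]]:
    """Return (call_text, top_level_argument_count) for each countIf(...) call.

    Hypothetical helper for illustration. Assumes single-quoted string
    literals with no escaped quotes; an empty call countIf() is reported
    as having one argument.
    """
    calls = []
    for match in re.finditer(r"\bcountIf\s*\(", sql):
        depth = 1          # we are just past the opening parenthesis
        commas = 0         # commas seen at the top level of this call
        in_string = False  # inside a '...' literal
        i = match.end()
        while i < len(sql) and depth > 0:
            ch = sql[i]
            if ch == "'":
                in_string = not in_string
            elif not in_string:
                if ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
                elif ch == "," and depth == 1:
                    commas += 1
            i += 1
        if depth == 0:  # skip calls whose parentheses never close
            calls.append((sql[match.start():i], commas + 1))
    return calls


if __name__ == "__main__":
    query = (
        "SELECT repo_name, "
        "countIf(event_type, toYear(created_at) = 2016) as events_2016, "
        "countIf(event_type, toYear(created_at) = 2017) as events_2017, "
        "countIf(event_type = 'WatchEvent', toYear(created_at) = 2017) as stars_2017 "
        "FROM github_events GROUP BY repo_name"
    )
    for call, n_args in count_if_arg_counts(query):
        print(n_args, call)
```

Run against the three countIf expressions from the query under review, this reports two top-level arguments for each of them. That includes the stars_2017 expression, countIf(event_type = 'WatchEvent', toYear(created_at) = 2017), which the suggested change leaves as it is, so it may deserve a second look under the same rule.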
