Releases: microsoft/BC-Bench

v0.5.0

07 Apr 13:25
143a74e

Major refactor of the Python evaluation codebase for extensibility across categories.

Versions

  • GitHub Copilot CLI 1.0.2
  • Claude Code 2.1.69
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.33.55542

v0.4.0

19 Mar 16:08

Fixed the AL MCP compile tool integration in BC-Bench #592

Improved and adjusted the options and settings for Claude Code and GitHub Copilot.

Repository setup now sparse-checks out only app folders, improving clone performance.
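
A sparse checkout of this kind can be sketched roughly as follows. This is a minimal illustration, not the actual BC-Bench setup code: the function names, the repository URL, and the folder names are all placeholders chosen for the example, and it assumes a Git version that supports `git sparse-checkout` and partial clone filters.

```python
import subprocess


def sparse_clone_commands(repo_url: str, dest: str, folders: list[str]) -> list[list[str]]:
    """Build the git commands for a blobless clone limited to `folders`.

    Hypothetical helper for illustration; not part of BC-Bench.
    """
    if not folders:
        raise ValueError("at least one folder is required")
    return [
        # Clone without downloading file contents up front, with sparse checkout enabled.
        ["git", "clone", "--filter=blob:none", "--sparse", repo_url, dest],
        # Limit the working tree to the requested folders (top-level files stay in cone mode).
        ["git", "-C", dest, "sparse-checkout", "set", *folders],
    ]


def run_sparse_clone(repo_url: str, dest: str, folders: list[str]) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for cmd in sparse_clone_commands(repo_url, dest, folders):
        subprocess.run(cmd, check=True)
```

Calling `run_sparse_clone("https://example.com/org/repo.git", "repo", ["apps/A", "apps/B"])` (placeholder values) would then check out only those two folders. Building the commands separately from running them keeps the logic easy to inspect without touching the network.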

Fixed two PNG file conversions for a dataset entry.

Extended the timeout from 90 minutes to 120 minutes because of the AL MCP compile tool.

Versions

  • GitHub Copilot CLI 1.0.2
  • Claude Code 2.1.69
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.33.55542

v0.3.2

14 Mar 16:08
19f541d

Updated AL MCP to the latest version.

Versions

  • GitHub Copilot CLI 1.0.2
  • Claude Code 2.1.47
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.33.55542

v0.3.1

11 Mar 11:59
94e4eeb

Updated Copilot and refreshed the list of supported models.

Versions

  • GitHub Copilot CLI 1.0.2
  • Claude Code 2.1.47
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.30.49729-beta

v0.3.0

27 Feb 12:00

Updated the Test-Generation category's prompt to state more explicitly that the agent is expected to create a new test case.

Versions

  • GitHub Copilot CLI 0.0.411
  • Claude Code 2.1.47
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.30.49729-beta

v0.2.2

18 Feb 23:47
ad87fff

Bumped the versions of both Claude Code and GitHub Copilot CLI to include claude-sonnet-4.6.

Versions

  • GitHub Copilot CLI 0.0.411
  • Claude Code 2.1.47
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.30.49729-beta

v0.2.1

18 Feb 22:13
085990e

Bumped the GitHub Copilot CLI version to include gpt-5.3-codex. This is a minor version bump, given the small version increase.

Versions

  • GitHub Copilot CLI 0.0.409
  • Claude Code 2.1.37
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.30.49729-beta

v0.2.0

07 Feb 21:25

After a few runs with v0.1.0, we manually went through the dataset and compared the gold patches with the agent output, then adjusted a few things:

  • Fixed mistakes in problem statements (e.g. links that the coding agent cannot reach)
  • Updated the problem statement for microsoftInternal__NAV-182354 to include additional changes made in the gold patch (not mentioned in the original bug description)
  • Updated the problem statement for microsoftInternal__NAV-180484 to clarify the expected result (two options are described in the bug)
  • Added a focus on the W1 localization to the prompt
  • Added PASS_TO_PASS test for microsoftInternal__NAV-185696

Versions

  • GitHub Copilot CLI 0.0.406
  • Claude Code 2.1.37
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.30.49729-beta

v0.1.0

29 Jan 08:49

Background

We are starting to create releases (versioning) to track potential changes between runs. This way, we don't have to re-run everything after every small change (e.g. a GitHub Copilot CLI version bump).

More on versioning policy
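
One way to picture how release tags help here is to record the tag with every run and only compare runs that share it. The sketch below is purely illustrative and assumes nothing about how BC-Bench stores results: `RunRecord`, `group_by_release`, `comparable`, and the agent and run identifiers are all hypothetical names invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record of one evaluation run (not a BC-Bench type)."""
    release: str  # benchmark release tag the run was produced with, e.g. "v0.1.0"
    agent: str    # placeholder agent label
    run_id: str   # placeholder run identifier


def group_by_release(runs: list[RunRecord]) -> dict[str, list[RunRecord]]:
    """Bucket runs by release tag so each bucket holds runs made under the same setup."""
    grouped: dict[str, list[RunRecord]] = defaultdict(list)
    for run in runs:
        grouped[run.release].append(run)
    return dict(grouped)


def comparable(a: RunRecord, b: RunRecord) -> bool:
    """Treat two runs as directly comparable only if they share a release tag."""
    return a.release == b.release
```

Under this scheme, a run made after a small change simply lands in a new bucket, while earlier runs stay valid for comparison within their own release.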

Release v0.1.0

We have reached our target dataset size: there are now 101 tasks for both the Bug-Fixing and Test-Generation categories. All tasks have passed our Dataset Validation and Verification.

Versions

  • GitHub Copilot CLI 0.0.382
  • Claude Code 2.0.76
  • Microsoft.Dynamics.BusinessCentral.Development.Tools 17.0.30.49729-beta