
Releases: LessUp/tiny-llm

Tiny-LLM v2.0.1

21 Apr 19:41


Tiny-LLM v2.0.1 — Bug Fixes

Release Date: April 16, 2026



🔵 Fixed

Critical: Scale Dimension Calculation Error

Severity: Critical
Impact: Test utility only
File: tests/test_integration.cu

The createRandomWeight function had an incorrect scale tensor dimension calculation:

// ❌ INCORRECT (rows and cols swapped)
int num_groups = (cols + group_size - 1) / group_size;
w.scales = randomDeviceFP16(rows * num_groups, ...);

// ✅ CORRECT  
int num_groups = (rows + group_size - 1) / group_size;
w.scales = randomDeviceFP16(num_groups * cols, ...);

Why this matters: the W8A16 matmul indexes the scale tensor as [rows/group_size, cols], so it requires ceil(rows/group_size) * cols elements — the swapped formula allocates the wrong size whenever rows != cols.

Code Cleanup: Removed 12 lines of unused q_reg array loading code in kernels/attention.cu.

✅ Verification

$ ctest --output-on-failure
100% tests passed, 0 tests failed


Installation

git clone https://github.com/LessUp/tiny-llm.git
cd tiny-llm
git checkout v2.0.1
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

Documentation | API Reference

Tiny-LLM v2.0.0

21 Apr 19:41


Tiny-LLM v2.0.0 — Major Refactoring

Release Date: March 9, 2026



⚠️ Breaking Changes

KVCache API Redesign: The previous appendKV() implementation had fragile layer-order dependencies: calling layers in a different order could produce incorrect cache writes.

Solution: New stateless design with explicit length advancement.

// After (v2.0+)
for (int i = 0; i < num_layers; i++) {
    layers[i]->forward(hidden_states, kv_cache, seq_id, position, stream);
}
// Explicitly advance length once after all layers
kv_cache.advanceSeqLen(seq_id, num_tokens);

🟢 Added

  • GitHub Actions workflow for continuous integration
  • Automated clang-format checking
  • CMake modernization with target exports (tiny_llm::tiny_llm)
  • Improved compiler warning flags

🟡 Changed

  • Minimum CMake version: 3.18
  • CUDA architecture auto-detection, with fallback to a set of common architectures

📊 Performance

| Metric | v1.0.0 | v2.0.0 | Change |
| --- | --- | --- | --- |
| Build time | 45s | 38s | -15% |
| Test runtime | 2.1s | 1.8s | -14% |


Installation

git clone https://github.com/LessUp/tiny-llm.git
cd tiny-llm
git checkout v2.0.0
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

Documentation | API Reference