qwen-test

Author	SHA1	Message	Date
16337	682063abf1	feat: switch to 4-bit NF4 GPU-only inference and disable thinking mode - Load the model with bitsandbytes 4-bit NF4 quantization, device_map={"":0} (GPU only) - Disable Qwen3.5 thinking mode (enable_thinking=False) - Accuracy improved from 60% to 90%; inference speed 1-2 tokens/s - GPU memory usage 7.13GB/8GB; output quality is normal - Update all test results and the comprehensive report Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 17:38:33 +08:00
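16337	42db2b0ca9	feat: update GPU requirements analysis, add measured test results and comprehensive report - Update GPU requirement recommendations based on measured results on an RTX 3050 8GB - Record bitsandbytes compatibility issues - Generate the comprehensive test report REPORT.md with measured data Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 13:09:39 +08:00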
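16337	4ac406572e	fix: fix model loading, switch to FP16 + CPU offload. The RTX 3050 8GB cannot fully load Qwen3.5-9B, even with quantization: - bitsandbytes 4-bit does not support CPU offload - bitsandbytes 8-bit has version compatibility issues with accelerate - FP16 + CPU offload loads, but inference quality is very poor (garbled output) - Inference speed is only 0.4 tokens/s. Conclusion: the RTX 3050 8GB is not suitable for running Qwen3.5-9B Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 13:05:20 +08:00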
16337	f7174464d5	feat: add one-click run script run_all.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:45:52 +08:00
16337	fd0d6b05b5	feat: add GPU requirements analysis and comprehensive report generation scripts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:45:51 +08:00
16337	837bf407e1	feat: add concurrent load test script Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:45:51 +08:00
16337	1c52b15a18	feat: add accuracy evaluation script (knowledge/math/logic/code/translation) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:45:50 +08:00
16337	8f5b495ed3	feat: add performance benchmark script (speed + GPU memory) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:45:50 +08:00
16337	1a96de6058	feat: add basic inference test script (4-bit quantization) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:45:49 +08:00
16337	c2ce4f0a78	feat: add model download script (ModelScope) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:31:09 +08:00
16337	f29443ffb0	feat: add dependency configuration and environment check script Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:30:31 +08:00

11 Commits
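
The memory figures quoted in the commit messages can be sanity-checked with rough arithmetic. The sketch below assumes that Qwen3.5-9B has about 9 billion parameters (inferred from the model name only) and counts weight storage alone. Quantization constants, activations, the KV cache and the CUDA context are ignored, so real usage, such as the 7.13GB reported for the 4-bit run, will be higher. The name `weight_gib` is a hypothetical helper, not something from the repository.

```python
# Rough weight-only memory estimate; an illustration, not a measurement.
ASSUMED_PARAMS = 9e9   # assumption: "9B" in the model name means ~9 billion parameters
GPU_MEMORY_GIB = 8.0   # RTX 3050 8GB, as stated in the commit messages


def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Return the size of the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit NF4", 4)]:
    size = weight_gib(ASSUMED_PARAMS, bits)
    verdict = "fits" if size < GPU_MEMORY_GIB else "does not fit"
    print(f"{label:10s} ~{size:5.2f} GiB of weights, {verdict} in {GPU_MEMORY_GIB:.0f} GiB")
```

Under these assumptions only the 4-bit weights fit entirely in GPU memory, which is consistent with the FP16 run needing CPU offload and the 4-bit NF4 run working with GPU-only placement.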