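Running the fine-tuning script raised two TrainingArguments parameter errors.

max_seq_length parameter

Error message:
TypeError: TrainingArguments.__init__() got an unexpected keyword argument 'max_seq_length'
Cause: TrainingArguments does not accept a max_seq_length parameter. Sequence length should be set during data preprocessing, not in the training arguments.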
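To illustrate setting the sequence length during preprocessing, here is a minimal sketch. The helper name `tokenize_with_limit` is hypothetical and is not part of the project. It assumes a Hugging Face style tokenizer whose call accepts `max_length`, `truncation`, and `padding`; how InstructionDataset applies its own max_length internally is not shown in this document.

```python
from typing import Any, Callable, Dict


def tokenize_with_limit(
    tokenizer: Callable[..., Dict[str, Any]],
    text: str,
    max_length: int = 512,
) -> Dict[str, Any]:
    """Tokenize one example, enforcing the length limit at preprocessing time.

    `tokenizer` is assumed to behave like a Hugging Face tokenizer, i.e. its
    call accepts `max_length`, `truncation`, and `padding`.
    """
    return tokenizer(
        text,
        max_length=max_length,   # the limit lives here, not in TrainingArguments
        truncation=True,         # cut sequences longer than max_length
        padding="max_length",    # pad shorter sequences up to max_length
    )
```

With a real tokenizer this would be called as `tokenize_with_limit(tokenizer, sample_text, max_length=512)`, where `tokenizer` comes from `AutoTokenizer.from_pretrained(...)`.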
Solution:
- Remove the max_seq_length parameter from the setup_training() method.
- Pass max_length when creating the InstructionDataset instead.

evaluation_strategy parameter

Error message:
TypeError: TrainingArguments.__init__() got an unexpected keyword argument 'evaluation_strategy'
Cause: In newer versions of the Transformers library, the parameter was renamed from evaluation_strategy to eval_strategy.
Solution:
- Rename evaluation_strategy to eval_strategy.

finetunex/trainer/trainer.py

Before:
def setup_training(
    self,
    output_dir: str = "./outputs",
    num_train_epochs: float = 3.0,
    per_device_train_batch_size: int = 1,
    gradient_accumulation_steps: int = 4,
    learning_rate: float = 2e-4,
    max_seq_length: int = 512,  # ❌ remove
    warmup_ratio: float = 0.03,
    weight_decay: float = 0.01,
    logging_steps: int = 10,
    save_steps: int = 100,
    evaluation_strategy: str = "no",  # ❌ old parameter name
    save_total_limit: int = 3,
    fp16: bool = True,
    **kwargs
):
    self.training_args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=num_train_epochs,
        per_device_train_batch_size=per_device_train_batch_size,
        gradient_accumulation_steps=gradient_accumulation_steps,
        learning_rate=learning_rate,
        max_seq_length=max_seq_length,  # ❌ remove
        warmup_ratio=warmup_ratio,
        weight_decay=weight_decay,
        logging_steps=logging_steps,
        save_steps=save_steps,
        evaluation_strategy=evaluation_strategy,  # ❌ rename to eval_strategy
        save_total_limit=save_total_limit,
        fp16=fp16,
        optim="paged_adamw_32bit",
        lr_scheduler_type="cosine",
        report_to="none",
        **kwargs
    )
After:
def setup_training(
    self,
    output_dir: str = "./outputs",
    num_train_epochs: float = 3.0,
    per_device_train_batch_size: int = 1,
    gradient_accumulation_steps: int = 4,
    learning_rate: float = 2e-4,
    warmup_ratio: float = 0.03,
    weight_decay: float = 0.01,
    logging_steps: int = 10,
    save_steps: int = 100,
    eval_strategy: str = "no",  # ✅ new parameter name
    save_total_limit: int = 3,
    fp16: bool = True,
    **kwargs
):
    self.training_args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=num_train_epochs,
        per_device_train_batch_size=per_device_train_batch_size,
        gradient_accumulation_steps=gradient_accumulation_steps,
        learning_rate=learning_rate,
        warmup_ratio=warmup_ratio,
        weight_decay=weight_decay,
        logging_steps=logging_steps,
        save_steps=save_steps,
        eval_strategy=eval_strategy,  # ✅ use the new parameter name
        save_total_limit=save_total_limit,
        fp16=fp16 if torch.cuda.is_available() else False,  # ✅ safety check (requires `import torch`)
        optim="paged_adamw_32bit",
        lr_scheduler_type="cosine",
        report_to="none",
        remove_unused_columns=False,  # ✅ added
        **kwargs
    )
examples/qwen3.5_0.8b_local_finetune.py

Before:
trainer.setup_training(
    output_dir=config.output_dir,
    num_train_epochs=config.num_train_epochs,
    per_device_train_batch_size=config.per_device_train_batch_size,
    gradient_accumulation_steps=config.gradient_accumulation_steps,
    learning_rate=config.learning_rate,
    max_seq_length=config.max_seq_length,  # ❌ remove
    warmup_ratio=0.03,
    weight_decay=0.01,
    logging_steps=10,
    save_steps=50,
    fp16=True,
)
After:
trainer.setup_training(
    output_dir=config.output_dir,
    num_train_epochs=config.num_train_epochs,
    per_device_train_batch_size=config.per_device_train_batch_size,
    gradient_accumulation_steps=config.gradient_accumulation_steps,
    learning_rate=config.learning_rate,
    warmup_ratio=0.03,
    weight_decay=0.01,
    logging_steps=10,
    save_steps=50,
    fp16=True,
)
fp16 safety check

# Before
fp16=fp16
# After
fp16=fp16 if torch.cuda.is_available() else False

remove_unused_columns

remove_unused_columns=False  # prevents dataset columns from being dropped unexpectedly
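The fp16 safety check can also be factored into a small helper. The following is a minimal sketch, assuming a hypothetical function named `resolve_fp16` that is not part of the original code. Unlike the inline expression, it also returns False when torch is not installed.

```python
def resolve_fp16(requested: bool) -> bool:
    """Return True only if fp16 was requested and a CUDA device is available."""
    if not requested:
        return False
    try:
        import torch  # imported lazily so the helper works without torch installed
    except ImportError:
        return False
    return bool(torch.cuda.is_available())
```

Inside setup_training() this could be used as `fp16=resolve_fp16(fp16)`.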
Run the test script:
python test_training_args.py
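You should see:
Testing TrainingArguments parameters...
✓ TrainingArguments parameter validation passed!
Output directory: ./test_output
Training epochs: 3
FP16: True/False
Once the fix is in place, rerun the fine-tuning script: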
python examples/qwen3.5_0.8b_local_finetune.py
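Parameter names can differ between Transformers versions:

| Parameter | Older versions (<4.41) | Newer versions (>=4.41) |
|---|---|---|
| Evaluation strategy | evaluation_strategy | eval_strategy |
| Max sequence length | ❌ Not supported | ❌ Not supported |

Recommendation: always check the official documentation for the Transformers version you are using.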
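When the same code has to run against more than one Transformers version, the accepted keyword can be detected at runtime instead of hard-coded. The following is a minimal sketch, assuming a hypothetical helper named `eval_strategy_kwarg` that is not part of the original project. It inspects the constructor signature of whatever class it is given and prefers the newer name.

```python
import inspect
from typing import Any, Dict


def eval_strategy_kwarg(args_cls: type, value: str = "no") -> Dict[str, Any]:
    """Build the evaluation-strategy keyword under the name args_cls accepts."""
    accepted = inspect.signature(args_cls).parameters
    # Prefer the newer name; fall back to the older one if only that exists.
    for name in ("eval_strategy", "evaluation_strategy"):
        if name in accepted:
            return {name: value}
    raise TypeError(
        f"{args_cls.__name__} accepts neither 'eval_strategy' nor 'evaluation_strategy'"
    )
```

A call site could then look like `TrainingArguments(output_dir="./outputs", **eval_strategy_kwarg(TrainingArguments, "steps"))`.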
A: The Transformers library is under continuous development, and parameters are sometimes renamed to improve consistency or clarity.
Fix date: 2026-03-30
Fix version: 0.1.1