2025, Nov 25 03:00

How to resolve SFTTrainer 'token not found' eos_token issues after migrating to TRL SFTConfig with Unsloth and Qwen2.5

Getting 'token not found' in TRL SFTTrainer with Qwen2.5? Import order can swap eos_token—import Unsloth before TRL. Qwen2TokenizerFast details and exact fix.

When moving a finetuning pipeline to the newer trl stack, an unexpected failure can surface in eos handling. After upgrading trl and switching from TrainingArguments to SFTConfig inside SFTTrainer for a Qwen2.5 setup, the configured eos_token can be silently replaced with <EOS_TOKEN>, and the trainer aborts with a token-not-found error.

Problem setup

The finetuning flow relies on SFTTrainer, a Qwen2TokenizerFast passed via processing_class, and an explicit eos_token that matches the tokenizer’s configuration. The initialization looks roughly like this, with the imports shown in the pre-fix order (trl ahead of unsloth):

# pre-fix import order: trl is loaded before unsloth, which is what triggers the issue below
from trl import SFTTrainer, SFTConfig
from transformers import DataCollatorForSeq2Seq
from unsloth import is_bfloat16_supported

fine_tuner = SFTTrainer(
    model=base_model,
    processing_class=tok,
    train_dataset=train_split,
    eval_dataset=dev_split,
    data_collator=DataCollatorForSeq2Seq(tokenizer=tok),
    callbacks=[log_cb],
    args=SFTConfig(
        per_device_train_batch_size=batch_per_device,
        gradient_accumulation_steps=grad_accum_steps,
        warmup_steps=num_warmup,
        num_train_epochs=epochs_cap,
        max_steps=steps_cap,
        max_seq_length=seq_block,
        learning_rate=lr_value,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=log_every,
        optim="paged_adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=seed_value,
        eval_strategy="epoch" if dev_split is not None else "no",
        save_strategy="no",
        output_dir="models",
        save_steps=50,
        report_to="none",
        packing=False,
        dataset_text_field="text",
        eos_token="<|im_end|>",
    ),
)

Despite explicitly setting eos_token to <|im_end|> to align with Qwen2TokenizerFast, SFTTrainer’s construction can fail because the string being validated is no longer the one you provided.

What actually breaks

During SFTTrainer construction, eos_token is pulled from args and checked against the tokenizer vocabulary using processing_class.convert_tokens_to_ids. If the lookup returns None, the trainer raises. The surprising part here is that eos_token can arrive as <EOS_TOKEN> instead of <|im_end|>, so the lookup fails and the initialization stops with a “token not found” style error.
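
Conceptually, the failing check behaves like the simplified sketch below. This is an illustration of the behavior described above, not the exact TRL source, and the error message is paraphrased:

eos = args.eos_token  # expected "<|im_end|>", but arrives as "<EOS_TOKEN>"
eos_id = processing_class.convert_tokens_to_ids(eos)
if eos_id is None:
    # this is where SFTTrainer aborts with the "token not found" style error
    raise ValueError(f"eos_token {eos!r} not found in the tokenizer vocabulary")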

The trigger for this behavior is unsloth, which patches TRL’s trainer classes when it is imported. Because of that, the order in which the two libraries are imported determines the eos_token value that SFTTrainer sees during setup.

Fix

The resolution is to import unsloth before trl. With that order, the trainer receives the intended eos_token and proceeds without the lookup failure.

from unsloth import FastLanguageModel, is_bfloat16_supported  # unsloth first, so its patches are in place before trl loads
from trl import SFTTrainer, SFTConfig
from transformers import DataCollatorForSeq2Seq

fine_tuner = SFTTrainer(
    model=base_model,
    processing_class=tok,
    train_dataset=train_split,
    eval_dataset=dev_split,
    data_collator=DataCollatorForSeq2Seq(tokenizer=tok),
    callbacks=[log_cb],
    args=SFTConfig(
        per_device_train_batch_size=batch_per_device,
        gradient_accumulation_steps=grad_accum_steps,
        warmup_steps=num_warmup,
        num_train_epochs=epochs_cap,
        max_steps=steps_cap,
        max_seq_length=seq_block,
        learning_rate=lr_value,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=log_every,
        optim="paged_adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=seed_value,
        eval_strategy="epoch" if dev_split is not None else "no",
        save_strategy="no",
        output_dir="models",
        save_steps=50,
        report_to="none",
        packing=False,
        dataset_text_field="text",
        eos_token="<|im_end|>",
    ),
)

The working setup referenced here used trl==0.18.2 and unsloth==2025.6.2.

Why this matters

In finetuning pipelines, subtle changes to special tokens cascade into data collation, loss masking, and stopping criteria. If eos handling shifts unexpectedly, you won’t just see initialization failures; you also risk inconsistent training behavior. Knowing that import order can influence how these frameworks integrate helps keep settings like eos_token exactly as you configured them.

Takeaway

If SFTTrainer reports that your eos_token is missing from the vocabulary while you are confident it exists, check your import order. Import unsloth before trl, keep eos_token aligned with your Qwen2TokenizerFast configuration, and record the package versions used in a working run. This small ordering detail prevents <|im_end|> from being silently replaced by <EOS_TOKEN> and keeps your training loop predictable.
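
A small preflight check makes this kind of mismatch visible before training starts. A minimal sketch, assuming tok is the Qwen2TokenizerFast instance passed as processing_class:

expected_eos = "<|im_end|>"
eos_id = tok.convert_tokens_to_ids(expected_eos)
# the same lookup SFTTrainer performs; None means the token is missing from the vocabulary
assert eos_id is not None, f"{expected_eos!r} is not in the tokenizer vocabulary"
print(tok.eos_token, tok.eos_token_id)  # confirm the tokenizer's own eos configuration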