2025, Dec 01 13:00

How to Apply an Attention Mask in Keras Correctly: Convert Arrays to Tensors and Avoid DirectoryIterator Inputs

Learn why passing an ImageDataGenerator DirectoryIterator to a Keras layer fails, and how to apply an attention mask using tensors and tf.convert_to_tensor.

Building a custom Keras layer to apply an attention-like mask is a common need, but a subtle integration mistake can trigger a confusing exception. The crux of the issue is attempting to feed a Python iterator produced by ImageDataGenerator directly into a Layer, while layers expect tensors. Below is a concise walkthrough of what goes wrong and how to wire it correctly.

Problem setup

The goal is to multiply each image by a predefined attention map. The implementation defines a custom layer and then tries to call it on a dataset coming from a directory iterator.

import tensorflow as tf
from tensorflow.keras.layers import Layer

class FocusMask(Layer):
    def __init__(self, image_dim, attn_map):
        super().__init__()
        self.image_dim = image_dim
        self.attn_map = attn_map

    def call(self, inputs, *args, **kwargs):
        # Element-wise multiply the incoming batch by the attention map
        return tf.math.multiply(inputs, self.attn_map)

# img_dim and map_array are defined earlier; directory_iterator comes from
# ImageDataGenerator(...).flow_from_directory(...)
mask_layer = FocusMask(img_dim, map_array)
masked_batch = mask_layer(directory_iterator)

# Raises:
# ValueError: Only input tensors may be passed as positional arguments.

Why it fails

The object returned by ImageDataGenerator.flow_from_directory is a Python iterator (a DirectoryIterator), not a tensor. Keras layers operate on tensors (or array-like values that can be converted to tensors), so passing the iterator as a positional argument to the layer call raises the ValueError shown above.
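To see the mismatch directly, pull one batch from the iterator: it yields NumPy arrays, not tensors. The sketch below assumes directory_iterator was created with ImageDataGenerator(...).flow_from_directory(...) using the default categorical class_mode, as in the snippet above.

images, labels = next(directory_iterator)   # one batch: a tuple of NumPy arrays
print(type(directory_iterator))             # DirectoryIterator, not a tf.Tensor
print(type(images), images.shape)           # numpy.ndarray, (batch, H, W, 3)
# A single batch can be converted explicitly if you want to call the layer by hand
batch_tensor = tf.convert_to_tensor(images, dtype=tf.float32)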

Working solution

The fix is to ensure the layer receives a tensor and to convert the masking map to a tensor as well. The layer can then be composed in the Functional API, where the input is a tensor produced by tf.keras.Input. The masking map is prepared with tf.convert_to_tensor.

import tensorflow as tf
from tensorflow.keras.layers import Layer

class MapOverlay(Layer):
    def __init__(self, mask_tensor):
        super().__init__()
        # Store the attention map as a float32 tensor so it can be
        # multiplied with image batches inside the graph
        self.mask_tensor = tf.convert_to_tensor(mask_tensor, dtype=tf.float32)

    def call(self, inputs):
        return inputs * self.mask_tensor

# Build a model that applies the mask inside the graph
image_input = tf.keras.Input(shape=(img_size, img_size, 3))
masked_output = MapOverlay(mask_tensor)(image_input)
mask_model = tf.keras.Model(inputs=image_input, outputs=masked_output)

With this setup, the model accepts batches of image tensors. The masking happens inside the graph where both operands are tensors, so the error disappears.
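As a quick sanity check, the composed model can be called on any batch of image tensors. The random batch below is only a placeholder matching the assumed (img_size, img_size, 3) input shape.

dummy_batch = tf.random.uniform((8, img_size, img_size, 3))   # hypothetical batch of 8 images
masked = mask_model(dummy_batch)
print(masked.shape)   # (8, img_size, img_size, 3), mask applied element-wise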

Why this matters

Knowing the distinction between Python-side iterators and tensors helps avoid type mismatches at the Keras layer boundary. Layers are designed to compose tensor ops; dataset iterators should feed the model, not be passed directly to a layer call.
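In practice that means the DirectoryIterator goes to fit or predict, while the masking layer lives inside the model. Below is a minimal sketch, assuming the image_input, MapOverlay, and mask_tensor from above plus a hypothetical num_classes that matches the directory labels and the default categorical class_mode.

# Compose the masking layer with a small classifier head
x = MapOverlay(mask_tensor)(image_input)
x = tf.keras.layers.Conv2D(16, 3, activation="relu")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
classifier = tf.keras.Model(inputs=image_input, outputs=outputs)
classifier.compile(optimizer="adam", loss="categorical_crossentropy")
# The iterator feeds the model here, not a layer call
classifier.fit(directory_iterator, epochs=5)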

Final notes

Keep layer inputs as tensors, convert constant arrays such as attention maps with tf.convert_to_tensor, and integrate your custom logic through the Functional API. This keeps the data pipeline clean and the masking operation traceable within the model.