2025, Oct 23 03:00

Fix blurry NumPy array displays in Matplotlib: convert PIL crops to grayscale for crisp output and better EasyOCR

Learn why PIL images look sharp while NumPy arrays look blurry in Matplotlib, and how grayscale conversion fixes it for clean EasyOCR input and OCR results.

When you display a PIL image directly with Matplotlib and then render the same crop after converting it to a NumPy array, the two plots can look surprisingly different: the first is crisp, while the array-based view is barely legible. If that image is destined for EasyOCR, you understandably want the sharper version to go in.

What triggered the question

I am curious what the PIL library does value scaling and normalization wise to show me crisp image and why just doing matplotlib on the extracted numpy value looks really bad.

Reproducing the issue

The behavior shows up with a straightforward crop, display, array conversion, conditional replacement, and re-display. Apart from the replacement of the value 48 with 255, the only difference between the two visualizations is whether Matplotlib receives a PIL image or a NumPy array.

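from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# tmp_img_path is assumed to be defined elsewhere and to point at the source image.
img_handle = Image.open(tmp_img_path)
crop_roi = img_handle.crop((520, 965, 565, 1900))  # box is (left, upper, right, lower)

# First view: the untouched PIL crop.
plt.imshow(crop_roi, cmap='gray')
plt.show()

# Second view: the crop as an array, with every element equal to 48 set to 255.
arr_view = np.array(crop_roi, dtype=np.uint8)
arr_view[arr_view == 48] = 255  # boolean mask; np.where is not needed here
plt.imshow(arr_view, cmap='gray')
plt.show()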
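Before comparing the two plots, it can help to check what the array actually contains. The sketch below is only illustrative: `describe` is a hypothetical helper, and `demo_rgb` is a synthetic stand-in with the same dimensions as the crop, not the original image.

```python
from PIL import Image
import numpy as np

def describe(img):
    """Return (PIL mode, array shape, array dtype) for a PIL image."""
    arr = np.asarray(img)
    return img.mode, arr.shape, arr.dtype

# Synthetic stand-in for the real crop: 45 px wide, 935 px tall, RGB.
demo_rgb = Image.new("RGB", (45, 935), color=(48, 48, 48))
demo_gray = demo_rgb.convert("L")

print(describe(demo_rgb))   # three-dimensional: (height, width, channels)
print(describe(demo_gray))  # two-dimensional: (height, width)
```

If the real crop reports a mode other than `L`, the array has a trailing channel axis, and element-wise operations act on individual channel values.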

What is actually going on

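Matplotlib does not apply any special scaling to PIL input: imshow converts a PIL image to an array internally. The two plots differ because the second one shows modified data, and how that modification behaves depends on the image mode. A likely explanation, assuming the crop is a multi-channel image such as RGB or RGBA, is that np.array returns a (height, width, channels) array, so arr_view == 48 matches individual channel values rather than whole pixels. Setting those elements to 255 changes one channel of a pixel while leaving the others alone, which shows up as colored speckle around the text. Matplotlib also ignores cmap for RGB(A) data, so cmap='gray' has no effect on such an array. The practical fix in this case is to explicitly convert the cropped image to grayscale before further steps. The array is then two-dimensional, each element is one pixel, and both the replacement and the gray colormap behave as intended.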
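The difference between replacing values per channel and per pixel can be seen on a tiny synthetic example. This is a sketch under the assumption that the real crop is RGB; the two-pixel array below is made up for illustration.

```python
import numpy as np
from PIL import Image

# Hypothetical 1x2 RGB image: one uniformly dark pixel (48, 48, 48)
# and one pixel where only the red channel happens to equal 48.
rgb = np.array([[[48, 48, 48], [48, 200, 10]]], dtype=np.uint8)

# Element-wise replacement on the 3-channel array touches single channels.
per_channel = rgb.copy()
per_channel[per_channel == 48] = 255
print(per_channel[0, 1])  # only the red channel of the second pixel changed

# After grayscale conversion each element is a whole pixel.
gray = np.array(Image.fromarray(rgb).convert("L"))
gray_fixed = gray.copy()
gray_fixed[gray_fixed == 48] = 255
print(gray_fixed.shape)   # two-dimensional, one value per pixel
```

In the RGB case the second pixel ends up with a saturated red channel and unchanged green and blue values, which is the kind of color artifact that makes text harder to read. In the grayscale case only the pixel whose intensity is exactly 48 is replaced.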

The fix

Applying grayscale conversion to the PIL image makes the array rendering look as clear as the direct display.

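from PIL import Image, ImageOps
import numpy as np
import matplotlib.pyplot as plt

# tmp_img_path is assumed to be defined elsewhere and to point at the source image.
img_handle = Image.open(tmp_img_path)
crop_roi = img_handle.crop((520, 965, 565, 1900))  # box is (left, upper, right, lower)

# Convert to single-channel grayscale (equivalent to crop_roi.convert("L")).
crop_roi = ImageOps.grayscale(crop_roi)

plt.imshow(crop_roi, cmap='gray')
plt.show()

# The array is now 2D, so the comparison matches whole pixels.
arr_view = np.array(crop_roi, dtype=np.uint8)
arr_view[arr_view == 48] = 255
plt.imshow(arr_view, cmap='gray')
plt.show()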

Why this matters

In OCR workflows such as EasyOCR, small visual differences can translate into big accuracy differences. If the NumPy view is what you ultimately feed into the model, you want that representation to be as clean and consistent as the image you see when you preview results. Converting to grayscale before creating the array helps you avoid a mismatch between what looks right on screen and what the model actually receives.
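One way to keep the preview and the OCR input in sync is to route both through a single conversion step. The helper below is a hypothetical sketch, not part of Pillow or EasyOCR, and the EasyOCR call is shown only as a comment because it assumes the package and its models are installed.

```python
from PIL import Image
import numpy as np

def prepare_for_ocr(img):
    """Return a 2D uint8 array from a PIL image (hypothetical helper)."""
    if img.mode != "L":
        img = img.convert("L")  # same conversion ImageOps.grayscale performs
    arr = np.array(img, dtype=np.uint8)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2D array, got shape {arr.shape}")
    return arr

# Synthetic stand-in for the real crop.
demo = Image.new("RGB", (45, 935), color=(48, 48, 48))
ocr_input = prepare_for_ocr(demo)
print(ocr_input.shape, ocr_input.dtype)

# Assumed usage, not executed here:
#   import easyocr
#   reader = easyocr.Reader(["en"])
#   results = reader.readtext(ocr_input)
```

Displaying `ocr_input` with `plt.imshow(ocr_input, cmap='gray')` then previews the same data that the OCR step would receive.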

Takeaways

If you notice that a PIL image looks crisp when shown directly but the NumPy array view becomes hard to read, convert the image to grayscale with ImageOps.grayscale before displaying it or converting it to an array. This simple step aligns the output and gives you a reliable preview of exactly what downstream code, including EasyOCR, will process.
