2025, Dec 03 15:00

How to Recover Data in Milvus on Docker After Dropping and Recreating a Collection (using milvus-backup)

Dropped and recreated a Milvus collection in Docker? Learn why it looks empty and how to recover data using batch import with milvus-backup restore scripts.

Running standalone Milvus in Docker is a common setup: you mount a host volume, Milvus writes vectors and logs into that volume, and you get persistence across container restarts. The trouble starts when a collection is dropped and then created again with the same name. The UI and SDKs show an empty collection, yet the volume still consumes gigabytes. How do you get the data back?

Reproducing the situation

In practice, “recreate collection” often means a drop followed by a create with the same identifier. There is no dedicated recreate_collection command in Milvus SDKs.

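# illustrative pymilvus calls that mimic a "recreate" sequence
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # assumed local standalone endpoint
client.drop_collection(collection_name="my_vectors")
client.create_collection(collection_name="my_vectors", schema=same_schema)  # same_schema: the original schema object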

The Docker volume keeps the persisted artifacts. A typical layout might look like this:

volumes
  milvus
    data
    |  delta_log
    |  index_files
    |  insert_log
    |  mmap
    |    mmap_chunk_manager
    |  stats_log
    etcd
    rdb_data
    rdb_data_meta_kv
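
To confirm that the volume really still holds the old data, it can help to measure how much each subdirectory of data/ occupies. The following is a minimal sketch using only the Python standard library; the path volumes/milvus/data is an assumption based on the layout above, and the helper names are made up for illustration.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular (non-symlink) files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            fp = Path(dirpath) / name
            if fp.is_file() and not fp.is_symlink():
                total += fp.stat().st_size
    return total


def usage_by_subdir(data_dir: Path) -> dict[str, int]:
    """Map each immediate subdirectory of data_dir to its size in bytes."""
    return {
        child.name: dir_size_bytes(child)
        for child in sorted(data_dir.iterdir())
        if child.is_dir()
    }


if __name__ == "__main__":
    # Assumed host path of the mounted volume; adjust to your setup.
    data_dir = Path("volumes/milvus/data")
    if data_dir.is_dir():
        for name, size in usage_by_subdir(data_dir).items():
            print(f"{name:14s} {size / 1024**2:10.1f} MiB")
```

If insert_log still accounts for most of the space after the recreate, the original segments are most likely still on disk.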

What actually happened and why the collection looks empty

If you dropped a collection and created a new one with the same name and schema, there is no direct method to recover the data. The new collection's metadata no longer points to the old on-disk artifacts, even though the volume still contains files under data/ (insert_log, delta_log, index_files, and so on). That is why the collection appears empty while disk usage remains high.
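
One way to see the disconnect is to look at how the files under insert_log are grouped. The sketch below assumes that the first directory level under insert_log is named after an internal collection ID, so a dropped collection and its same-named replacement would show up as separate directories. That layout is an assumption to check against your own volume, not something the setup above guarantees, and the function name is hypothetical.

```python
from pathlib import Path


def files_per_top_level_dir(insert_log_dir: Path) -> dict[str, int]:
    """Count the files below each first-level directory of insert_log.

    Assumption: first-level directory names are internal collection IDs.
    """
    counts: dict[str, int] = {}
    for child in sorted(insert_log_dir.iterdir()):
        if child.is_dir():
            counts[child.name] = sum(1 for p in child.rglob("*") if p.is_file())
    return counts


if __name__ == "__main__":
    # Assumed host path; adjust to wherever the volume is mounted.
    insert_log = Path("volumes/milvus/data/insert_log")
    if insert_log.is_dir():
        for dir_name, n_files in files_per_top_level_dir(insert_log).items():
            print(f"{dir_name}: {n_files} files")
```

A directory holding many files that does not correspond to the current collection is a likely candidate for the data you want to restore.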

Practical way to proceed

Because the volume still holds the original data, the recommended path is a manual batch_import. Use the restore scripts from the milvus-backup project as a reference: https://github.com/zilliztech/milvus-backup. The idea is to ingest the persisted content into a new collection rather than trying to “reattach” the files to the metadata.

Sketch of a batch import call

The concrete steps depend on how you organize your environment and what exactly you choose from the restore scripts. Conceptually, the flow resembles the following:

# paths pointing to the persisted sources you plan to restore from
source_dirs = [
    "/path/to/your/volume/..."
]
# batch_import is a placeholder, not a real SDK function; it stands for the
# bulk ingestion step that the restore scripts perform
batch_import(source_dirs)

Use the repository above to determine which inputs are valid and how to perform the restore. It is built for backup and restore flows, which makes it the right reference for the manual batch_import.

Why this is important

Dropping and re-creating a collection severs the link between metadata and persisted files. Assuming that reusing the same name preserves data is risky; the on-disk artifacts remain, but the system does not automatically index or display them. Knowing that there is no direct recovery path helps you avoid blind attempts at reattachment and go straight to a supported restore approach.

Takeaways

If you dropped and then created a Milvus collection with the same name and schema, there is no direct method to recover the data. The Docker volume can still contain original files, but they won’t magically reappear in the new collection. To bring data back, rely on a manual batch_import and follow the restore scripts from the milvus-backup project. Plan accordingly when performing destructive operations, and treat backup/restore as part of your normal operational hygiene.
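
As a stopgap before any destructive operation, a file-level archive of the mounted volume gives you something to fall back on. This is a minimal sketch, not a substitute for milvus-backup: it assumes the Milvus container is stopped while the archive is taken, that the destination directory lies outside the volume, and the function name and paths are hypothetical.

```python
import shutil
import time
from pathlib import Path


def snapshot_volume(volume_dir: Path, dest_dir: Path, label: str) -> Path:
    """Write a timestamped .tar.gz of volume_dir into dest_dir and return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = dest_dir / f"{label}-{stamp}"
    # make_archive appends the .tar.gz suffix and returns the final file name.
    archive = shutil.make_archive(str(base), "gztar", root_dir=str(volume_dir))
    return Path(archive)


if __name__ == "__main__":
    # Assumed paths; stop the container first so files are not changing underneath.
    volume = Path("volumes/milvus")
    if volume.is_dir():
        print(snapshot_volume(volume, Path("backups"), "milvus-volume"))
```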