Create a collection

Here we will create a collection for the movie data, configuring its behavior and defining its data structure.

Code

Run the following code to create a collection called "Movies" in your Weaviate instance.

import weaviate
from weaviate.classes.config import Configure, Property, DataType
import os


# Instantiate your client (not shown). e.g.:
# client = weaviate.connect_to_weaviate_cloud(...) or
# client = weaviate.connect_to_local(...)

client.collections.create(
    name="Movies",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="overview", data_type=DataType.TEXT),
        Property(name="vote_average", data_type=DataType.NUMBER),
        Property(name="genre_ids", data_type=DataType.INT_ARRAY),
        Property(name="release_date", data_type=DataType.DATE),
        Property(name="tmdb_id", data_type=DataType.INT),
    ],
    # Define the vectorizer module
    vector_config=Configure.Vectors.text2vec_cohere(model="embed-v4.0"),
    # Define the generative module
    generative_config=Configure.Generative.cohere(model="command-a-03-2025")
)

client.close()

If the collection already exists

The code above raises an error if a collection with the same name already exists. If so, delete the collection with client.collections.delete(<COLLECTION_NAME>) before proceeding.

Deleting a collection also deletes all of its contents. Be very careful whenever you delete a collection.
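
One way to avoid the error is to check for the collection before creating it. The following is a minimal sketch, assuming the v4 Python client, where client.collections.exists() and client.collections.delete() are available. The helper name delete_if_exists is made up for this example and is not part of the Weaviate client.

```python
def delete_if_exists(client, name: str) -> bool:
    """Delete the named collection if it exists.

    Returns True if a collection was deleted, False if there was nothing to delete.
    `delete_if_exists` is a hypothetical helper written for this example.
    WARNING: deleting a collection also deletes every object stored in it.
    """
    if client.collections.exists(name):
        client.collections.delete(name)
        return True
    return False
```

With a connected client, calling delete_if_exists(client, "Movies") before client.collections.create(...) lets the creation code be re-run safely, at the cost of discarding any data already in the collection.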

Explain the code

Each collection definition must have a name. Beyond that, you can define additional parameters, as we've done in this example.

Properties

Properties are the object attributes that you want to store in the collection. Each property has a name and a data type.

In our movie database, we have properties such as title, release_date and genre_ids, with data types such as TEXT (string), DATE (date), or INT (integer). Array data types, such as TEXT_ARRAY or INT_ARRAY, are also available; genre_ids uses INT_ARRAY.
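
To make the mapping concrete, here is a sketch of a Python dictionary whose values line up with the property types defined above. All values are invented for illustration, and the check_types helper is hypothetical. It assumes the Python client accepts a timezone-aware datetime for a DATE property.

```python
from datetime import datetime, timezone

# Hypothetical sample values, chosen only to illustrate the data types.
sample_movie = {
    "title": "Example Movie",                                    # TEXT
    "overview": "A short plot summary.",                         # TEXT
    "vote_average": 7.5,                                         # NUMBER (floating point)
    "genre_ids": [18, 35],                                       # INT_ARRAY
    "release_date": datetime(2020, 1, 31, tzinfo=timezone.utc),  # DATE
    "tmdb_id": 12345,                                            # INT
}

# Python type expected for each property in this sketch.
expected_types = {
    "title": str,
    "overview": str,
    "vote_average": float,
    "genre_ids": list,
    "release_date": datetime,
    "tmdb_id": int,
}


def check_types(obj: dict, expected: dict) -> list:
    """Return the names of properties that are missing or have an unexpected type."""
    return [
        name
        for name, py_type in expected.items()
        if name not in obj or not isinstance(obj[name], py_type)
    ]


mismatches = check_types(sample_movie, expected_types)
```

An empty mismatches list means every property is present with the expected Python type. This is a local sanity check only; it does not replace Weaviate's own validation.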

Auto-schema

In this example, we explicitly define the data schema. However, Weaviate can also automatically infer the schema from data as needed. This is called auto-schema.

Vector configuration

We specify one vector in the collection, using the text2vec-cohere vectorizer integration.

    vector_config=Configure.Vectors.text2vec_cohere(model="embed-v4.0"),

This means that when an object vector is not provided, Weaviate will use this integration to generate the vector.
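
The difference shows up at insert time. The sketch below assumes the v4 Python client, whose collection.data.insert() accepts an optional vector argument. The add_movie wrapper is a hypothetical helper written for this example, and any vector you supply yourself would need to match the dimensionality of the configured model.

```python
from typing import Optional, Sequence


def add_movie(collection, properties: dict, vector: Optional[Sequence[float]] = None):
    """Insert one object and return whatever collection.data.insert() returns.

    `add_movie` is a hypothetical wrapper, not part of the Weaviate client.
    """
    if vector is None:
        # No vector supplied: Weaviate is expected to call the configured
        # vectorizer integration to generate one for the object.
        return collection.data.insert(properties=properties)
    # Vector supplied: it is passed along with the object, so the
    # vectorizer should not need to be called for this object.
    return collection.data.insert(properties=properties, vector=list(vector))
```

With a connected client, movies = client.collections.get("Movies") returns the collection object to pass as the first argument.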

Generative configuration

In this code example, we specify Cohere as the default AI model integration for generative tasks.

    generative_config=Configure.Generative.cohere(model="command-a-03-2025")

Python classes

The code example makes use of classes such as Property, DataType and Configure. They are defined in the weaviate.classes.config submodule and are used to define the collection.

from weaviate.classes.config import Configure, Property, DataType
import os

What's next?

You now know how to configure and create a collection in Weaviate. Next, you will learn how to add data to the collection.