Instant DeepSeek: One-click activation with OpenSearch
In an earlier blog post, we introduced OpenSearch’s support for the DeepSeek large language model (LLM). This post focuses on simplifying DeepSeek LLM integration using the OpenSearch Flow Framework plugin. With just one API call, you can provision the entire integration—creating connectors, registering models, deploying them, and setting up agents and tools. Automated templates handle the setup, eliminating the need to call multiple APIs or manage complex orchestration.
Manual setup
In our earlier blog post, setting up the DeepSeek model (or any other LLM) required four separate API calls, sketched below:
- Creating a connector for the DeepSeek model
- Creating a model group
- Registering the model using the connector ID
- Creating a search pipeline for retrieval-augmented generation (RAG)
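In condensed form, that manual sequence looked roughly like the following. This is a sketch based on the standard ML Commons and search pipeline APIs, not the exact requests from the earlier post; the credential name deepseek_key, the group and pipeline names, and the placeholder IDs are illustrative:
# 1. Create a connector for the DeepSeek model
POST /_plugins/_ml/connectors/_create
{
  "name": "DeepSeek Chat",
  "description": "Connector for the DeepSeek chat model",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "endpoint": "api.deepseek.com",
    "model": "deepseek-chat"
  },
  "credential": {
    "deepseek_key": "<YOUR DEEPSEEK API KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "headers": {
        "Content-Type": "application/json",
        "Authorization": "Bearer ${credential.deepseek_key}"
      },
      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
    }
  ]
}

# 2. Create a model group
POST /_plugins/_ml/model_groups/_register
{
  "name": "remote_model_group",
  "description": "Group for externally hosted models"
}

# 3. Register and deploy the model using the connector ID
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "DeepSeek Chat model",
  "function_name": "remote",
  "model_group_id": "<model group ID from step 2>",
  "connector_id": "<connector ID from step 1>"
}

# 4. Create a search pipeline with a RAG processor
PUT /_search/pipeline/rag_pipeline
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "deepseek_pipeline_demo",
        "description": "Demo pipeline using the DeepSeek connector",
        "model_id": "<model ID from step 3>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer for the question"
      }
    }
  ]
}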
With the OpenSearch Flow Framework plugin, this process is streamlined into a single API call. Next, we'll walk through a simplified setup of the conversational search example from the earlier blog post.
One-click deployment
In the following example, you will configure the conversational_search_with_llm_deploy workflow template to implement RAG with DeepSeek in OpenSearch. The workflow created using this template performs the following configuration steps:
- Creates a connector to an externally hosted DeepSeek model
- Registers and deploys the model
- Creates a search pipeline with a RAG processor
Step 1: Create and provision the workflow
Using the conversational_search_with_llm_deploy workflow template, you can provision the workflow by specifying the required fields. Specify your API key for the DeepSeek model in the create_connector.credential.key field:
POST _plugins/_flow_framework/workflow?use_case=conversational_search_with_llm_deploy&provision=true
{
  "create_connector.credential.key": "<PLEASE ADD YOUR DEEPSEEK API KEY HERE>",
  "create_connector.endpoint": "api.deepseek.com",
  "create_connector.model": "deepseek-chat",
  "create_connector.actions.url": "https://${parameters.endpoint}/v1/chat/completions",
  "create_connector.actions.request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }",
  "register_remote_model.name": "DeepSeek Chat model",
  "register_remote_model.description": "DeepSeek Chat",
  "create_search_pipeline.pipeline_id": "rag_pipeline",
  "create_search_pipeline.retrieval_augmented_generation.tag": "deepseek_pipeline_demo",
  "create_search_pipeline.retrieval_augmented_generation.description": "Demo pipeline Using DeepSeek Connector"
}
You can change the default values in the preceding request body based on your requirements.
OpenSearch responds with a unique workflow ID, simplifying the tracking and management of the setup process:
{
  "workflow_id": "204SuZQB3ZvYMDlU9PQh"
}
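If you'd like to review the generated workflow before creating any resources, you can omit provision=true from the create request and later provision the stored workflow in a separate call using this workflow ID:
POST _plugins/_flow_framework/workflow/204SuZQB3ZvYMDlU9PQh/_provision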
Use the Get Workflow Status API to verify that all resources were created successfully:
GET _plugins/_flow_framework/workflow/204SuZQB3ZvYMDlU9PQh/_status
{
  "workflow_id": "204SuZQB3ZvYMDlU9PQh",
  "state": "COMPLETED",
  "resources_created": [
    {
      "resource_id": "3E4SuZQB3ZvYMDlU9PRz",
      "workflow_step_name": "create_connector",
      "workflow_step_id": "create_connector",
      "resource_type": "connector_id"
    },
    {
      "resource_id": "3k4SuZQB3ZvYMDlU9PTJ",
      "workflow_step_name": "register_remote_model",
      "workflow_step_id": "register_model",
      "resource_type": "model_id"
    },
    {
      "resource_id": "3k4SuZQB3ZvYMDlU9PTJ",
      "workflow_step_name": "deploy_model",
      "workflow_step_id": "register_model",
      "resource_type": "model_id"
    },
    {
      "resource_id": "rag_pipeline",
      "workflow_step_name": "create_search_pipeline",
      "workflow_step_id": "create_search_pipeline",
      "resource_type": "pipeline_id"
    }
  ]
}
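Optionally, you can sanity-check the deployed model before using it in search by calling the ML Commons Predict API with the model_id listed under resources_created. In this sketch, the messages parameter fills the ${parameters.messages} placeholder defined in the connector, and the prompt is just an example:
POST /_plugins/_ml/models/3k4SuZQB3ZvYMDlU9PTJ/_predict
{
  "parameters": {
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }
}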
(Optional) Step 2: Create a conversation memory
Note: If you skip this step and don’t create a conversation memory, a new conversation will be created automatically.
Create a conversation memory to store all messages from a conversation:
POST /_plugins/_ml/memory/
{
  "name": "Conversation about NYC population"
}
The response contains a memory ID for the created memory:
{
  "memory_id": "znCqcI0BfUsSoeNTntd7"
}
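If you want to inspect the conversation history later, you can retrieve the messages stored in this memory:
GET /_plugins/_ml/memory/znCqcI0BfUsSoeNTntd7/messages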
Step 3: Use the pipeline for RAG
This step assumes that you have already created a k-NN index and ingested data to use for vector search. For more information about creating a k-NN index, see k-NN index. For more information about vector search, see Vector search. For more information about ingesting data, see Ingest RAG data into an index.
Send a query to OpenSearch and provide additional parameters in the ext.generative_qa_parameters object:
GET /my_rag_test_data/_search
{
  "query": {
    "match": {
      "text": "What's the population of NYC metro area in 2023"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "deepseek-chat",
      "llm_question": "What's the population of NYC metro area in 2023",
      "memory_id": "znCqcI0BfUsSoeNTntd7",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
If you skipped Step 2, omit the memory_id parameter.
The response contains the model output:
{
  ...
  "ext": {
    "retrieval_augmented_generation": {
      "answer": "The population of the New York City metro area in 2023 was 18,867,000.",
      "message_id": "p3CvcI0BfUsSoeNTj9iH"
    }
  }
}
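Because each exchange is stored in the memory you provided, follow-up questions can build on earlier answers. The following request is a hypothetical continuation that reuses the same memory_id:
GET /my_rag_test_data/_search
{
  "query": {
    "match": {
      "text": "NYC metro area population"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "deepseek-chat",
      "llm_question": "How has that changed since 2020?",
      "memory_id": "znCqcI0BfUsSoeNTntd7",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}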
Additional use cases
The preceding example represents just one of many possible workflows. The Flow Framework plugin comes with a variety of prebuilt templates designed for different scenarios. You can explore our substitution templates for various workflows and review their corresponding default configurations.
These resources will help you discover and implement other automated workflows that best suit your needs.
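For example, assuming the semantic_search_with_cohere_embedding_query_enricher default template is available in your OpenSearch version, a single call provisions a Cohere embedding connector and model along with the related pipeline configuration; only the API key is required:
POST _plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding_query_enricher&provision=true
{
  "create_connector.credential.key": "<PLEASE ADD YOUR COHERE API KEY HERE>"
}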
Conclusion
By using the Flow Framework plugin, we’ve transformed a complex, multi-step setup process into a single, simple API call. This simplification isn’t limited to DeepSeek—you can use the same streamlined approach to deploy models from other leading LLM providers like Cohere and OpenAI. Whether you’re experimenting with different models or setting up production environments, the Flow Framework plugin makes LLM integration faster and more reliable.