Snowflake’s Cortex AI brings powerful AI capabilities directly into your Snowflake environment. For example, by using Cortex’s access to industry-leading large language models (LLMs), we can build tools like AI chatbots that draw on internal enterprise data without exposing sensitive information outside the platform.
These integrations have driven a shift from traditional rule-based chatbots and automations to more flexible AI-powered assistants that better understand user inputs and provide more thorough, complete answers.
In this tutorial, I’ll walk you through the high-level steps on how to create a chatbot using Snowflake Cortex. You’ll learn how to prepare data, configure Cortex Search, use Document AI for unstructured data, build a chatbot interface in Streamlit, and I’ll provide ideas on how to integrate it with external applications. By the end, you’ll have both the practical skills and conceptual understanding needed to design conversational AI on Snowflake.
If you’re new to Snowflake, I recommend following the Snowflake Foundations skill track to get up to speed. You can also get a free trial of Snowflake to put your skills into practice.
What is Snowflake Cortex?
At its core, Snowflake Cortex brings AI and LLM-powered services into the Snowflake ecosystem. In this tutorial, we’ll integrate Cortex with your Snowflake data to create a simple Streamlit chatbot app: Document AI parses document files, Cortex Search indexes the results, and a retrieval augmented generation (RAG) workflow feeds that context to an LLM to generate responses.
This tutorial will cover:
- Setting up your overall Snowflake environment in preparation for building a Cortex Search chatbot
- Leveraging Cortex LLM functions and Document AI to extract insights from documents.
- Building a chatbot and analytics app in Streamlit in Snowflake
- Cost structure and considerations when using Cortex
The benefit of using Cortex is its native hybrid search, which combines vector and keyword search with semantic reranking, and its RAG-based response generation, which together improve accuracy and specificity.
Building chatbots and analytics apps with Streamlit in Snowflake leverages the flexibility of Python development while supporting data security and governance requirements by keeping sensitive data inside your secure Snowflake environment.
If you’d like an idea of Snowflake Cortex AI’s capabilities, look at this tutorial on using Cortex AI for basic NLP tasks such as summarization and sentiment analysis.
Prerequisites
Before we build, let’s get your Snowflake account ready. In this account, you will want the following permissions (a sample set of GRANT statements follows this list):
- Warehouse permissions: the ability to create warehouses, plus USAGE
- Database permissions: USAGE
- Schema permissions: USAGE, CREATE STREAMLIT, CREATE STAGE, and CREATE CORTEX SEARCH SERVICE (or OWNERSHIP)
- The ability to use Document AI with appropriate account-level permissions. In most cases, you’ll use Snowflake’s built-in roles (e.g., ACCOUNTADMIN, SYSADMIN) or a custom role with USAGE on the database and schema, CREATE CORTEX SEARCH SERVICE, and OPERATE on the search service object once it’s created
- For Cortex, you’ll typically need the CORTEX_USER role combined with object-level grants (USAGE, OPERATE) to interact with services
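As a rough sketch, the grants for a dedicated chatbot role might look like the following (chatbot_role is a hypothetical role name, and the warehouse, database, and schema are the ones we create in Step 1, so run this after those objects exist or adapt it to your environment):
-- Hypothetical role that will build and run the chatbot
CREATE ROLE IF NOT EXISTS chatbot_role;
-- Warehouse and database access
GRANT USAGE ON WAREHOUSE cortex_wh TO ROLE chatbot_role;
GRANT USAGE ON DATABASE cortex_chatbot_db TO ROLE chatbot_role;
-- Schema-level privileges for the stage, the Streamlit app, and the search service
GRANT USAGE, CREATE STAGE, CREATE STREAMLIT, CREATE CORTEX SEARCH SERVICE
  ON SCHEMA cortex_chatbot_db.public TO ROLE chatbot_role;
-- Cortex LLM functions are gated behind the SNOWFLAKE.CORTEX_USER database role
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE chatbot_role;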
Ideally, you have some familiarity with SQL, Python, and basic AI concepts. You will want to understand SQL in order to navigate the Snowflake environment and the data. You will need Python familiarity for building the Streamlit app. Having a basic understanding of AI concepts will help us understand the components that go into our Cortex Search Service.
Tools-wise, we will be focusing on Snowsight, Snowflake’s web-based interface. If you’d like to build this locally instead, you will need your own IDE/text editor.
While we will try to provide some data samples, ideally you have a test corpus and dataset you want to work with. If you do not have any, I recommend the Federal Open Market Committee meeting minutes for the tutorial (Sourced from Snowflake): FOMC minutes sample.
How to Build a Chatbot in Snowflake Cortex
Okay, let's get building! Some of these instructions are kept fairly generic so they can fit a wide variety of setups.
Step 1: Set Up the Snowflake Database, Schema, and Warehouse
First, we’ll create the necessary environment for our chatbot project.
-- Create a dedicated warehouse
CREATE OR REPLACE WAREHOUSE cortex_wh
WITH WAREHOUSE_SIZE = 'XSMALL'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE;
-- Create a database and stage
CREATE OR REPLACE DATABASE cortex_chatbot_db;
CREATE OR REPLACE STAGE cortex_chatbot_db.public.chatbot_stage;
-- Switch context
USE DATABASE cortex_chatbot_db;
USE WAREHOUSE cortex_wh;
This ensures all resources are isolated and cost-managed.
Step 2: Load and Prepare Data
Your chatbot needs data to respond to queries. This can be:
- Structured data: customer info, transaction logs, product catalogs.
- Unstructured data: PDFs, FAQs, policies, knowledge bases.
If you need some more info on data ingestion and how we get data into Snowflake, look over this guide on data ingestion.
Example: staging and loading a CSV file.
-- Upload data to the stage (done via the Snowsight UI or SnowSQL)
-- Example: product_faq.csv
-- Load into a table
CREATE OR REPLACE TABLE cortex_chatbot_db.public.product_faq (
    question STRING,
    answer STRING
);
COPY INTO cortex_chatbot_db.public.product_faq
FROM @cortex_chatbot_db.public.chatbot_stage/product_faq.csv
FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1); -- drop SKIP_HEADER if your file has no header row
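If you’re uploading from your local machine, the staging step itself is a single PUT command run through SnowSQL or another client (PUT doesn’t run in Snowsight worksheets); a minimal sketch, assuming the placeholder path below:
-- Run from SnowSQL; adjust the local path to wherever the file lives
PUT file:///path/to/product_faq.csv @cortex_chatbot_db.public.chatbot_stage AUTO_COMPRESS = TRUE;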
For unstructured data (e.g., PDFs), you can process them using Cortex Document AI. We will talk about how to get data out of unstructured data sources using Cortex Document AI in the next step.
We want to take both our structured and unstructured data, split it into chunks, and attach metadata as needed so retrieval is more efficient for RAG.
Step 3: Document AI Training and Document Parsing
Let’s go over using Document AI to build a tuned model for parsing documents. Note, this is optional, and I encourage you to try the default Snowflake models for parsing, as these are production-ready and handle OCR, layout extraction, and table detection. You don’t “train” custom models directly in Cortex Document AI today, but you can configure extraction modes (e.g., entity extraction, table extraction) to suit your data.
Start simple with the default parser, and only layer on more advanced configuration if your documents have complex or irregular formatting.
In Snowsight, open the navigation menu and go to AI & ML. Select our cortex_wh warehouse, click “Build,” and then select “Create.”
Once you have created the build, we upload our documents to it. Once it finishes processing, we can extract data using either entity extraction or table extraction. Entity extraction lets you ask questions of your document to get data out; table extraction parses data based on defined fields in your document.
Once the model is built, we want to evaluate its accuracy and then train the model as needed. Training the model fine-tunes it to your particular document’s structure. Now we can use this model or the default model for parsing documents. For now, we’ll stick to Snowflake’s default AI parser.
CREATE OR REPLACE TABLE cortex_chatbot_db.public.doc_data AS --create the table
SELECT
    RELATIVE_PATH,
    TO_VARCHAR(
        SNOWFLAKE.CORTEX.PARSE_DOCUMENT( --using the default Cortex parser
            '@cortex_chatbot_db.public.chatbot_stage',
            RELATIVE_PATH,
            {'mode': 'LAYOUT'}
        ):content
    ) AS EXTRACTED_LAYOUT
FROM
    DIRECTORY('@cortex_chatbot_db.public.chatbot_stage') --pull the files from the stage
WHERE
    RELATIVE_PATH LIKE '%.pdf'; --specifically just PDFs
Now our data will be in a table and ready to be transformed.
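Before transforming, it’s worth a quick sanity check that the parser actually pulled text out of each file. A simple inspection query (optional, not part of the pipeline) might be:
-- Preview the first few hundred characters of each parsed document
SELECT
    RELATIVE_PATH,
    LENGTH(EXTRACTED_LAYOUT) AS extracted_chars,
    LEFT(EXTRACTED_LAYOUT, 300) AS preview
FROM cortex_chatbot_db.public.doc_data;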
Step 4: Data Transformation and Feature Extraction
To prepare data for chatbot interactions, documents need to be chunked and enriched with metadata. Chunking makes Cortex Search more accurate.
CREATE OR REPLACE TABLE cortex_chatbot_db.public.doc_chunks AS
SELECT
    relative_path,
    BUILD_SCOPED_FILE_URL(@cortex_chatbot_db.public.chatbot_stage, relative_path) AS file_url,
    (
        relative_path || ':\n'
        || coalesce('Header 1: ' || c.value['headers']['header_1'] || '\n', '')
        || coalesce('Header 2: ' || c.value['headers']['header_2'] || '\n', '')
        || c.value['chunk']
    ) AS chunk,
    'English' AS language
FROM
    cortex_chatbot_db.public.doc_data,
    LATERAL FLATTEN(SNOWFLAKE.CORTEX.SPLIT_TEXT_MARKDOWN_HEADER(
        EXTRACTED_LAYOUT,
        OBJECT_CONSTRUCT('#', 'header_1', '##', 'header_2'),
        2000, -- chunks of 2000 characters
        300   -- 300 character overlap
    )) c;
Now, this SQL is doing a lot, so let me summarize. It takes each file and flattens the text into chunks of 2,000 characters with a 300-character overlap.
Chunking into smaller portions helps with search efficiency. The query also prepends header information (like titles) to each chunk so the model has context for where the data comes from.
This makes your data searchable and ready for retrieval-augmented generation (RAG).
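As with the parsing step, a quick look at the chunk table helps confirm the split behaved sensibly. An optional inspection query:
-- Check how many chunks each document produced and their average size
SELECT
    relative_path,
    COUNT(*) AS num_chunks,
    ROUND(AVG(LENGTH(chunk))) AS avg_chunk_chars
FROM cortex_chatbot_db.public.doc_chunks
GROUP BY relative_path;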
Step 5: Create and Configure the Cortex Search Service
Let’s build our Cortex Search Service, which is the main service that will search through the chunks and pull context for a response.
CREATE OR REPLACE CORTEX SEARCH SERVICE cortex_chatbot_db.public.cortex_serv
    ON chunk
    ATTRIBUTES language
    WAREHOUSE = cortex_wh
    TARGET_LAG = '1 hour'
    AS (
        SELECT
            chunk,
            relative_path,
            file_url,
            language
        FROM cortex_chatbot_db.public.doc_chunks
    );
This builds the initial service. The ON clause points at the column to search (chunk, which contains our chunked text), and ATTRIBUTES lists the columns you want available for filtering search results. Here I’ve used language, since that’s a column in our chunk table; if you add, say, a category column during chunking, you could filter results by product category instead.
Cortex Search defaults to a hybrid vector + keyword search, improving chatbot accuracy. On top of this, we can add further context using a semantic model. If you expect unique phrasing or technical abbreviations, you can configure a semantic model YAML file using the Cortex Analyst semantic model generator. This allows you to provide context and synonyms so the model can recognize that something like “CUST” might also mean “customer”.
This enables RAG workflows, where the chatbot retrieves relevant context before generating responses.
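Before wiring up the app, you can smoke-test the service straight from a SQL worksheet. One way is the SNOWFLAKE.CORTEX.SEARCH_PREVIEW function, which takes the service name and a JSON query and returns matching chunks (the question text below is just an example):
SELECT PARSE_JSON(
    SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
        'cortex_chatbot_db.public.cortex_serv',
        '{"query": "What did the committee decide about interest rates?", "columns": ["chunk", "relative_path"], "limit": 3}'
    )
)['results'] AS results;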
Step 6: Build the Chatbot Application with Streamlit
Now let’s build a super simple chatbot front-end inside Snowflake using Streamlit. You will want to use Snowsight to find the “Streamlit in Snowflake” app to create your chatbot application. This uses Snowpark to connect to your Snowflake.
If you want some information on building chatbots in general, I recommend first going through this course on building chatbots in Python.
First, we connect to Snowflake using Session.builder (if the app runs inside Streamlit in Snowflake, you can instead grab the existing connection with get_active_session()). Then, given the user’s input, we pass it as a query to our search service via the SNOWFLAKE.CORTEX.SEARCH_PREVIEW function and display the top matching chunk.
import json
import streamlit as st
from snowflake.snowpark import Session

# Connect to Snowflake (use get_active_session() instead if running in Streamlit in Snowflake)
session = Session.builder.configs({
    "account": "<account_id>",
    "user": "<username>",
    "password": "<password>",
    "role": "SYSADMIN",
    "warehouse": "CORTEX_WH",
    "database": "CORTEX_CHATBOT_DB",
    "schema": "PUBLIC"
}).create()

st.title("Snowflake Cortex Chatbot")
user_input = st.text_input("Ask me a question:")

if user_input:
    # Build the search request and escape single quotes for the SQL string literal
    payload = json.dumps({
        "query": user_input,
        "columns": ["chunk", "relative_path"],
        "limit": 3
    }).replace("'", "''")

    # Query the Cortex Search service
    result = session.sql(f"""
        SELECT SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
            'cortex_chatbot_db.public.cortex_serv',
            '{payload}'
        ) AS response
    """).collect()

    # SEARCH_PREVIEW returns a JSON string with a "results" array of matching chunks
    results = json.loads(result[0]["RESPONSE"])["results"]
    answer = results[0]["chunk"] if results else "I couldn't find anything relevant."
    st.write("**Chatbot:**", answer)
This basic chatbot retrieves relevant information from the Cortex Search index and returns answers in real time. You can also make it a multi-turn (conversational) chatbot by storing the chat history in the session state; this can be done natively in Streamlit with the same search service, as sketched below.
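Here’s a minimal sketch of that conversational pattern, assuming a recent Streamlit version with the chat elements (st.chat_input, st.chat_message) and that the app runs inside Streamlit in Snowflake so get_active_session() is available; search_chunks is a hypothetical helper wrapping the same SEARCH_PREVIEW call used above:
import json
import streamlit as st
from snowflake.snowpark.context import get_active_session

session = get_active_session()  # the session provided by Streamlit in Snowflake

def search_chunks(question: str) -> str:
    """Return the top matching chunk for a question (sketch using SEARCH_PREVIEW)."""
    payload = json.dumps({"query": question, "columns": ["chunk"], "limit": 3}).replace("'", "''")
    row = session.sql(f"""
        SELECT SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
            'cortex_chatbot_db.public.cortex_serv', '{payload}') AS response
    """).collect()[0]
    results = json.loads(row["RESPONSE"])["results"]
    return results[0]["chunk"] if results else "I couldn't find anything relevant."

st.title("Snowflake Cortex Chatbot (multi-turn)")

# Keep the running conversation in session state so it survives Streamlit reruns
if "messages" not in st.session_state:
    st.session_state.messages = []

for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

if prompt := st.chat_input("Ask me a question:"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    st.session_state.messages.append({"role": "assistant", "content": search_chunks(prompt)})
    st.rerun()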
Step 7: Implementation via REST API Endpoints
To integrate the chatbot into external apps (e.g., websites, Slack bots), use the Snowflake REST API. Generally, the URL will look something like this: https://<account_url>/api/v2/databases/<db_name>/schemas/<schema_name>/cortex-search-services/<service_name>:query.
These APIs do require authentication for security purposes, which means this is limited to users who have access to your Snowflake environment. Taking this API-first approach does allow for embedding chatbot capabilities more flexibly and to a wider audience.
A use case for this would be to build an internal tool that can be hosted on a web server, which allows users to ask questions about documents like quarterly financial reports or meeting notes. These integrations would require some development of tools that can make API requests.
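As a rough sketch, a Python client calling that endpoint might look like the following (the account URL, token, and column names are placeholders, and how you obtain the bearer token depends on your authentication setup, e.g. OAuth, key-pair JWTs, or programmatic access tokens):
import requests

# Placeholder values; substitute your account URL, service path, and auth token
ACCOUNT_URL = "https://<account_identifier>.snowflakecomputing.com"
SERVICE_PATH = "/api/v2/databases/cortex_chatbot_db/schemas/public/cortex-search-services/cortex_serv:query"
TOKEN = "<auth_token>"

response = requests.post(
    ACCOUNT_URL + SERVICE_PATH,
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    json={
        "query": "What did the committee decide about interest rates?",
        "columns": ["chunk", "relative_path"],
        "limit": 3,
    },
    timeout=30,
)
response.raise_for_status()

# The service responds with a JSON object containing a "results" array
for item in response.json().get("results", []):
    print(item.get("relative_path"), "->", item.get("chunk", "")[:200])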
Step 8: Testing and Validation
Before deploying widely, test the chatbot for:
- Accuracy – does it retrieve correct answers?
- Speed – is the latency acceptable?
- Usability – is the interface intuitive?
You can log queries and compare outputs against known ground truth for benchmarking. It is common to keep a set of questions with prescribed answers and use another model (or human review) to judge how closely the chatbot’s responses match; a small evaluation sketch follows.
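Here’s one way such a check could look, using SNOWFLAKE.CORTEX.COMPLETE as an LLM judge over a small hand-written test set (the model name, test question, and grading prompt are illustrative choices, not requirements, and answer_fn is whatever entry point your chatbot exposes, such as the search_chunks helper above):
# Hand-written test set: (question, expected "ground truth" answer); contents are illustrative
TEST_CASES = [
    ("What did the committee decide about the federal funds rate?",
     "The committee decided to maintain the target range for the federal funds rate."),
]

def judge(session, question: str, expected: str, actual: str) -> str:
    """Ask an LLM judge whether the chatbot's answer matches the expected answer."""
    prompt = (
        "You are grading a chatbot. Question: " + question +
        "\nExpected answer: " + expected +
        "\nChatbot answer: " + actual +
        "\nReply with exactly one word: CORRECT or INCORRECT."
    ).replace("'", "''")
    row = session.sql(
        f"SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-large', '{prompt}') AS verdict"
    ).collect()[0]
    return row["VERDICT"].strip()

def run_eval(session, answer_fn):
    """Run every test case through the chatbot and print the judge's verdict."""
    for question, expected in TEST_CASES:
        print(question, "->", judge(session, question, expected, answer_fn(question)))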
Step 9: Clean Up (Optional)
To manage costs and maintain governance, drop unused resources.
DROP DATABASE IF EXISTS cortex_chatbot_db; -- also removes the stage, tables, and search service inside it
DROP WAREHOUSE IF EXISTS cortex_wh;
Additional Concepts & Best Practices
Let’s add a little context to our chatbot’s function and some best practices.
Retrieval Augmented Generation (RAG) in Snowflake Cortex
RAG is a technique that enhances an LLM’s built-in knowledge with external data at query time.
Because Snowflake Cortex adds your enterprise data to the chatbot’s context, the chatbot can answer using that additional information. Using a mixture of vector and keyword search, it retrieves candidate documents for a prompt and applies semantic reranking to pick the most relevant ones for a contextualized response.
Cortex also simplifies RAG implementation: a single Cortex Search service replaces manually creating embeddings, vector stores, and similarity functions. The power of RAG allows us to build things like chatbots geared toward technical documentation.
Prompt Engineering for Snowflake Chatbots
Prompt engineering is important for making sure any chatbot provides appropriate responses to user questions, so a good understanding of it is critical for getting good answers out of your chat service.
One thing that improves responses is giving users a short usage guide that helps them craft better questions.
You can also add prompt templates to user queries prior to submitting them to the Cortex service. Prompt templates help provide consistent context to the model so that it uses the right information for providing an answer.
An example template for enterprise data might be: "You are a helpful assistant for the company’s historical data. Answer based only on the company knowledge base." This keeps the model focused on enterprise data when answering.
You will also occasionally get ambiguous questions or user input. Instead of forcing the model to answer, you can add to your template something like: "If the user asks a vague question, respond with: I don’t know the answer to that." This may prompt them to ask a more specific question. A sketch of applying such a template before generation is shown below.
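As a sketch of how the pieces fit together, the snippet below prepends a template to the user’s question, stuffs in the retrieved chunks (for example, the results of the search service query shown earlier), and asks SNOWFLAKE.CORTEX.COMPLETE for the final answer; the model name and wording are illustrative, and session is an authenticated Snowpark session:
def generate_answer(session, question: str, context_chunks: list[str]) -> str:
    """Combine a prompt template, retrieved context, and the user question, then generate."""
    template = (
        "You are a helpful assistant for the company's historical data. "
        "Answer based only on the company knowledge base below. "
        "If the question is vague or the context does not contain the answer, "
        "reply: I don't know the answer to that.\n\n"
    )
    prompt = (
        template
        + "Knowledge base:\n" + "\n---\n".join(context_chunks)
        + "\n\nQuestion: " + question
    ).replace("'", "''")
    row = session.sql(
        f"SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-large', '{prompt}') AS answer"
    ).collect()[0]
    return row["ANSWER"]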
Flexibility and Choice in Implementation
Snowflake Cortex supports both pre-built LLMs (fast and production-ready) and custom models (fine-tuned for specific domains), and these can be specified when creating the service. Adding your own custom semantic models also layers industry- or enterprise-specific context onto language that might otherwise be obscure to the model, without having to feed it a large corpus of text. This flexibility leaves room to swap in other models in the future.
Security, Governance, and Compliance
Cortex follows the same role-based access control, audit logging, and encryption offered by Snowflake. This ensures compliance with HIPAA, GDPR, and other regulations. The main responsibility on your end would be to ensure the right people have the right roles.
Cost Structure
Snowflake Cortex uses a consumption-based pricing model. Like other Snowflake products, you are charged for compute and storage, but there are a few extra layers specific to Cortex Search. You pay for:
- Virtual warehouse compute: the main compute usage, which processes queries against Snowflake data and consumes credits
- EMBED_TEXT token usage: during document processing, Snowflake Cortex handles text embedding, and you are charged when documents are added or changed
- Serving compute: the Cortex Search Service uses serving compute to provide a low-latency, high-throughput service to users. This is separate from virtual warehouse compute and accrues whenever the service is available, not just when it is used
- Storage for semantic indexes and staged data
- Cloud services compute: a Snowflake-level cost that lets Snowflake identify changes in base objects. It is only billed if it exceeds 10% of the daily warehouse cost
Try the following techniques for managing costs:
- Snowflake recommends using MEDIUM or smaller warehouses
- Monitor token usage and set usage quotas.
- Archive unused indexes or allow for a longer data freshness lag if the newest documents aren’t necessary
- Suspend your service when it is not needed, for example by scheduling it to turn off outside the hours it is used. A small cost-guardrail sketch for the warehouse side follows this list.
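For the warehouse portion of the bill, a resource monitor is a simple guardrail (resource monitors are typically created by ACCOUNTADMIN; the quota and thresholds below are arbitrary examples, and the search service’s serving compute is billed separately and isn’t covered by this):
-- Cap monthly credit usage for the chatbot warehouse
CREATE OR REPLACE RESOURCE MONITOR cortex_chatbot_monitor
    WITH CREDIT_QUOTA = 20
    FREQUENCY = MONTHLY
    START_TIMESTAMP = IMMEDIATELY
    TRIGGERS
        ON 80 PERCENT DO NOTIFY
        ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE cortex_wh SET RESOURCE_MONITOR = cortex_chatbot_monitor;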
Conclusion
Building a chatbot with Snowflake Cortex unlocks enterprise-specific conversational AI directly on the Data Cloud. From ingestion to retrieval to deployment, you can work entirely within Snowflake’s governed environment. These chatbots can help streamline tasks like onboarding, answering in-depth analytical questions, and navigating technical documentation.
Cortex enables RAG-powered chatbots that combine LLMs with your own data. Building a chatbot in Streamlit in Snowflake makes it easy to deploy user-friendly interfaces inside Snowflake. Security, governance, and cost management are built into the workflow. With Cortex, organizations can evolve from traditional BI dashboards to intelligent assistants that engage with data conversationally.
For more on Snowflake and its AI capabilities, check out the resources linked throughout this tutorial, such as the Snowflake Foundations skill track.
Snowflake Cortex Chatbot FAQs
What are the main benefits of using Snowflake Cortex for business intelligence?
Cortex simplifies analytics by allowing non-technical users to ask natural language questions and get SQL-driven answers. It eliminates manual query writing, reduces BI bottlenecks, and enables faster insights. Additionally, all results inherit Snowflake’s security, lineage, and access controls.
Can Snowflake Cortex Analyst handle complex multi-table queries?
Yes. Cortex Analyst supports multi-table joins and can reason across schemas, provided relationships are clear and metadata is well defined. Proper documentation, semantic models, and data governance practices improve the accuracy of complex queries.
How does Snowflake ensure the security and governance of data used by Cortex Analyst?
Cortex Analyst runs inside the Snowflake environment, ensuring that sensitive data never leaves the platform. It respects role-based access control (RBAC), object tagging, masking policies, and audit logging, so AI-powered insights follow the same compliance rules as traditional queries.
What are the limitations of Snowflake Cortex Analyst in handling follow-up questions?
Cortex Analyst is strong for single-turn queries but has limited memory for multi-turn conversations compared to full chatbot frameworks. For advanced dialogue or contextual follow-ups, pairing Cortex Analyst with Cortex Search or a custom chatbot built on Streamlit offers better results.
Can businesses customize or fine-tune the models behind Snowflake Cortex?
While Cortex primarily uses Snowflake-managed LLMs, it allows configuration through semantic models, prompt templates, and retrieval pipelines. For deeper customization, Cortex can integrate with external models using Snowpark Python or APIs, depending on enterprise needs.


