Snowflake Copilot: Everything We Know About This AI-Powered Assistant
Snowflake Copilot is an AI-powered SQL assistant that helps you write, optimize, and understand SQL queries to analyze your Snowflake data assets. You can also ask open-ended questions in plain English to understand the contents of your Snowflake databases and schemas.
This article explores Snowflake Copilot’s benefits and top use cases and addresses some of the most commonly asked questions.
Table of contents #
- What is Snowflake Copilot?
- What are the top use cases for Snowflake Copilot?
- How much does Snowflake Copilot cost?
- How does Snowflake Copilot work?
- Snowflake Copilot best practices: 3 tips to improve your outcomes
- Bottom line: Standardize your metadata for more accurate, reliable insights with Snowflake Copilot
- Snowflake Copilot: Related reads
What is Snowflake Copilot? #
Snowflake Copilot is an AI-powered assistant that comprehends the context of your Snowflake data assets. As a result, you can:
- Ask open-ended questions about your data and metadata
- Ask follow-up questions to understand the information stored in your Snowflake databases and schemas
- Refine and improve your own SQL queries
“Leveraging breakthroughs from Snowflake’s world-class AI research team, [Copilot] combines the strengths of Mistral’s latest state-of-the-art model, Mistral Large, with Snowflake’s proprietary SQL-generation model.” - Snowflake Blog
As of July 15, 2024, Snowflake Copilot is generally available in AWS us-east-1 (Virginia), AWS us-west-2 (Oregon), AWS eu-central-1 (Frankfurt), and Azure East US 2 (Virginia).
Snowflake Copilot is powered by Snowflake Cortex, Snowflake’s intelligent, fully managed AI service. Cortex’s underlying infrastructure and resources enable Copilot’s AI capabilities.
Additionally, just like Snowflake Cortex, Copilot fully supports role-based access control (RBAC) and bases its results only on the datasets you can access.
What are the top use cases for Snowflake Copilot? #
Snowflake Copilot democratizes data exploration and accelerates data analysis for Snowflake users. Some of the top use cases for Snowflake Copilot include:
- Engage in data exploration of your Snowflake assets in plain English – ask open-ended questions about how your data is structured, how to explore new datasets, etc.
- Enable Text-to-SQL — construct and run SQL statements that analyze your Snowflake data assets
- Build complex queries by having conversations with Snowflake Copilot – ask follow-up questions to refine the suggested SQL and delve deeper into the analysis
- Improve querying efficiency by asking Snowflake Copilot to analyze your queries and suggest improvements
- Get AI-powered explanations (and fixes) of SQL queries
- Learn more about Snowflake by asking any questions you have about SQL (what is a Join?) and Snowflake documentation — concepts (what is Snowflake Cortex?), features (how does the TRANSLATE function work?), and capabilities (how do I ingest data into Snowflake?)
- Provide custom instructions, including preferences (guidelines on tone and response structure, SQL schema) or specific business knowledge (data assets to consider), for Snowflake Copilot to incorporate when generating its responses.
How much does Snowflake Copilot cost? #
At the time of writing, Snowflake Copilot is free to use. According to Snowflake, the “details on pricing and billing are planned but you will be notified before any charges are applied for this feature.”
How does Snowflake Copilot work? #
According to Snowflake, you can interact with Copilot in SQL Worksheets and Snowflake Notebooks in Snowsight.
Snowflake Copilot doesn’t require any additional setup: open the Copilot panel, ask a question, and Snowflake Copilot responds in the panel.
Each chat session is associated with a particular worksheet or notebook, so suggested SQL queries can be executed directly in that worksheet or notebook. Copilot also needs a database and schema to be in use during your session in order to generate responses.
A step-by-step guide to getting started with Snowflake Copilot #
To start using Snowflake Copilot, follow these steps:
- Create a new worksheet or open an existing worksheet.
- Select Ask Copilot in the lower-right corner of the worksheet to open the Snowflake Copilot panel.
- Select a database and a schema for the current worksheet.
- In the message box, type in your question and then select the send icon or press Enter to submit it. Snowflake Copilot provides a response in the panel.
- If the response from Snowflake Copilot includes SQL statements:
  - Select Run to run the query
  - Select Add to edit the query before running it
- To add custom instructions, go to the Copilot menu at the top of the Snowflake Copilot panel and select Custom instructions from the drop-down menu.
- Enter your instructions in plain English and select Save when finished.
Snowflake Copilot limitations: 7 aspects to bear in mind #
Since Snowflake Copilot is still being developed and refined, there are some limitations to using it:
- Limited language support: Snowflake Copilot supports only English and SQL.
- No direct data access: Snowflake Copilot does not access the data within your tables. For example, if you ask it to filter rows where column A equals “X,” you must specify the value “X” in your request.
- No cross-database or schema queries: Queries spanning multiple databases or schemas are not supported. To work around this, you can create views that join data from different schemas or databases.
- Delayed responses: Depending on the complexity and length of the response, Snowflake Copilot may take a few seconds to complete its replies.
- Potentially invalid SQL suggestions: Occasionally, Copilot may suggest queries with incorrect SQL syntax or reference non-existent tables or columns. Use the thumbs-up or thumbs-down feedback options to report issues and help improve the feature.
- Delay in recognizing new structures: Newly created databases, schemas, or tables may take 3–4 hours to be recognized by Snowflake Copilot.
- Limited scope for tables and columns: When generating responses, Snowflake Copilot identifies the most relevant tables and columns, considering only the top 10 tables and the top 10 columns from each of those tables based on relevancy rankings.
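Because Copilot does not read the rows in your tables, any filter value has to come from you. One way to find it is to list the distinct values in a column first and then filter on an exact match. The sketch below illustrates that two-step pattern only; it uses Python's built-in sqlite3 module as a local stand-in for Snowflake, and the orders table, its columns, and its values are all hypothetical.

```python
import sqlite3

# SQLite is only a stand-in for Snowflake here; the table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "SHIPPED"), (2, "shipped"), (3, "PENDING"), (4, "SHIPPED")],
)

# Step 1: list the values that actually occur in the column.
distinct_statuses = [
    row[0]
    for row in conn.execute("SELECT DISTINCT status FROM orders ORDER BY status")
]
print(distinct_statuses)

# Step 2: filter on an exact value taken from that list.
shipped_ids = [
    row[0]
    for row in conn.execute(
        "SELECT order_id FROM orders WHERE status = ? ORDER BY order_id",
        ("SHIPPED",),
    )
]
print(shipped_ids)
```

Note that the distinct list surfaces both SHIPPED and shipped as separate values, which is exactly the kind of detail you would need to mention explicitly when asking Copilot for a filter.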
Snowflake Copilot best practices: 3 tips to improve your outcomes #
To make the most out of Snowflake Copilot and achieve the best results in your data analysis, consider these best practices:
- Create curated views to get better results. Make sure that the views you create have the following details:
  - Descriptive and easy-to-understand names for the columns: if a column contains the date for a specific sale, name the column sale_date
  - Columns with the appropriate data type: if a column contains the date for a specific sale, make sure it has the DATE type
  - Commonly used metrics/expressions as new columns: if profit is defined as revenue - cost, create a column (revenue - cost) AS profit in your view
  - Common and complex joins: if two tables, products and sales, are often joined, make sure that your view joins these tables
- Be as specific as possible when asking questions or requesting queries.
- To refine your search and filter on specific values within columns, actively guide Snowflake Copilot. For instance, you can request a query that retrieves all the unique values present in a given column.
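The curated-view advice above could look roughly like the following. This is a minimal sketch, not Snowflake's recommended implementation: it runs against SQLite through Python's built-in sqlite3 module so that it is self-contained, and the raw column names (nm, dt), the view name product_sales, and the sample rows are all hypothetical. In Snowflake you would write the same view in Snowflake SQL, for example using CAST(dt AS DATE) where this sketch uses SQLite's DATE() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical raw tables with terse column names.
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, nm TEXT);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        dt TEXT,          -- sale date stored as an ISO string
        revenue REAL,
        cost REAL
    );
    INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO sales VALUES
        (10, 1, '2024-07-01', 100.0, 60.0),
        (11, 2, '2024-07-02', 250.0, 200.0);

    -- Curated view: descriptive names, a date-typed column,
    -- a precomputed profit metric, and the common join built in.
    CREATE VIEW product_sales AS
    SELECT
        s.id AS sale_id,
        p.nm AS product_name,
        DATE(s.dt) AS sale_date,
        s.revenue,
        s.cost,
        (s.revenue - s.cost) AS profit
    FROM sales AS s
    JOIN products AS p ON p.product_id = s.product_id;
    """
)

# Queries against the view no longer need to know about the join or the formula.
rows = conn.execute(
    "SELECT product_name, sale_date, profit FROM product_sales ORDER BY sale_id"
).fetchall()
print(rows)
```

With a view like this selected in the worksheet, a question such as "what was the profit per product in July?" maps onto column names that already say what they contain.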
Bottom line: Standardize your metadata for more accurate, reliable insights with Snowflake Copilot #
Snowflake Copilot is a promising tool that has the potential to democratize data analysis within the Snowflake environment. Copilot’s ability to automate and streamline SQL query generation and refinement is already a significant asset for data teams.
Standardizing your data and metadata, so that they are consistently structured, labeled, and contextualized, is crucial for getting more accurate, reliable insights from Snowflake Copilot.
While creating customized views can help, a more scalable and sustainable solution is to establish a unified glossary. This centralized repository would act as a unified control plane, offering enriched information on your assets with:
- Standardized names and descriptions
- Added context (READMEs, summaries, ownership details, version history)
- A map of relationships between data assets across your data estate — for all data sources, tools, dashboards, etc.
Such a setup would provide contextual documentation for all data assets, ensure consistent understanding, and empower Snowflake Copilot to deliver accurate, reliable insights.
Snowflake Copilot: Related reads #
- Snowflake Cortex for AI & ML Analytics: Here’s Everything We Know So Far
- Snowflake Horizon for Data Governance: A Comprehensive Guide
- How to Set Up Data Governance for Snowflake: A Step-by-Step Guide
- Snowflake Data Cloud Summit 2024: Get Ready and Fit for AI
- Snowflake Data Lineage: A Step-by-Step How to Guide
- How to Set Up a Data Catalog for Snowflake: A Step-by-Step Guide
- Snowflake Data Catalog: What, Why & How to Evaluate
- Snowflake Data Mesh: Step-by-Step Setup Guide
- Glossary for Snowflake: Shared Understanding Across Teams
- Personalized Data Discovery for Snowflake Data Assets
- Snowflake Data Dictionary