
Codestral API Tutorial: Getting Started With Mistral’s API

To connect to the Codestral API, obtain your API key from Mistral AI and send authorized HTTP requests to the appropriate endpoint.
Jun 2024  · 9 min read

Codestral is a state-of-the-art generative model optimized for code generation tasks, including fill-in-the-middle (FIM) and code completion. Trained on more than 80 programming languages, Codestral handles both common and less commonly used languages, making it a versatile tool for developers.

By leveraging the Codestral API, we can automate code generation and integrate it into our workflows. In this tutorial, we’ll provide a step-by-step guide to using the Codestral API.

If you want to get an overview of Codestral, check out my article on What is Mistral’s Codestral.

Codestral API Endpoints

Codestral offers two main API endpoints to cater to different user needs:

  • Ideal for individual users and small-scale integrations. This endpoint is currently free until August 1, 2024, after which it will become a monthly subscription service.
  • Designed for business and high-demand use cases, this endpoint offers higher rate limits and robust support for large-scale applications.

Mistral recommends the first endpoint, the one aimed at individual users, if you intend to use Codestral as part of an IDE plugin or if you plan to build tools that expose endpoints directly to users, allowing them to bring their own API keys.

Mistral recommends the second endpoint for all other use cases due to its higher rate limits and suitability for business applications.

In this tutorial, we’ll use the endpoint to showcase Codestral's capabilities.

Getting Started With the Codestral API

Acquiring an API Key

To use the Codestral API, you need to sign up for a Mistral AI account and obtain an API key. Follow these steps:

Step 1: Sign up

Visit Mistral AI's sign-up page and create an account.

Step 2: Get an API key

For one of the two endpoints, navigate to the API Keys tab and click Create new key to generate your API key.

For the other, head to the Codestral tab (the one with a “New” badge below the API Keys tab) and follow the instructions on the page to obtain your API key. Note that there’s currently a waiting list, and you’ll need to provide a phone number to sign up. Once approved, you’ll be able to access your API key along with two specific endpoints on the same page.

This article uses the endpoints listed on the Codestral tab to showcase Codestral, so all the examples provided are tailored to them.


To authenticate our API requests, we'll create two Python functions using the requests library. These functions will handle the communication with Codestral's two API endpoints, incorporating the obtained API key into the request headers for authentication. Here are the two functions:

  1. call_chat_endpoint — to call the Chat Endpoint
  2. call_fim_endpoint — to call the Completion Endpoint

Here's how to do this in Python:

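import requests
import json

# Placeholder: replace with the API key obtained in the previous step.
api_key = "YOUR_API_KEY"


def call_chat_endpoint(data, api_key=api_key):
    # Set this to the chat endpoint URL shown alongside your API key.
    url = "<>"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json"
    }

    # Send the payload as a JSON-encoded POST body.
    response = requests.post(url, headers=headers, data=json.dumps(data))

    if response.status_code == 200:
        return response.json()
    else:
        return f"Error: {response.status_code}, {response.text}"


def call_fim_endpoint(data, api_key=api_key):
    # Set this to the fill-in-the-middle endpoint URL shown alongside your API key.
    url = "<>"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json"
    }

    # Send the payload as a JSON-encoded POST body.
    response = requests.post(url, headers=headers, data=json.dumps(data))

    if response.status_code == 200:
        return response.json()
    else:
        return f"Error: {response.status_code}, {response.text}"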
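
Hardcoding the key in a script makes it easy to leak through version control. One possible alternative is to read it from an environment variable. The sketch below assumes a variable named CODESTRAL_API_KEY and two helper names, load_api_key and build_headers; all three are illustrative choices rather than anything Mistral prescribes.

```python
import os


def load_api_key(env_var="CODESTRAL_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical convention used for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key


def build_headers(api_key):
    """Build the same request headers used by the two functions above."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
```

With this in place, the key can be passed explicitly, for example call_fim_endpoint(data, api_key=load_api_key()).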

Understanding Codestral's API Endpoints

Now that we've obtained our API key and set up the authentication functions, let's explore how to use Codestral's two API endpoints for code generation. We'll begin with the fill-in-the-middle (FIM) endpoint.

Fill-in-the-middle endpoint

The fill-in-the-middle endpoint is designed to generate code that fits between a given starting point (prompt) and an optional ending point (suffix). This is particularly useful for tasks requiring a specific piece of code to be generated within a larger code structure.

API Endpoint URL:


  • prompt: The starting point of the code.
  • suffix (optional): The ending point of the code.
  • stop (optional): A sequence of tokens to stop generation.


prompt = "def fibonacci(n: int):"
suffix = "n = int(input('Enter a number: '))\nprint(fibonacci(n))"
data = {
    "model": "codestral-latest",
    "prompt": prompt,
    "suffix": suffix,
    "temperature": 0
}

response = call_fim_endpoint(data)

Codestral API output.
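
The parameters listed above map directly onto the JSON request body. As a minimal sketch, the helper below adds suffix and stop only when they are provided. The name build_fim_payload is hypothetical, and the stop value in the usage lines is only an illustration.

```python
def build_fim_payload(prompt, suffix=None, stop=None,
                      model="codestral-latest", temperature=0):
    """Assemble a request body for the fill-in-the-middle endpoint.

    Optional fields are left out entirely when they are not provided.
    """
    payload = {"model": model, "prompt": prompt, "temperature": temperature}
    if suffix is not None:
        payload["suffix"] = suffix
    if stop is not None:
        # Accept either a single string or a list of stop sequences.
        payload["stop"] = [stop] if isinstance(stop, str) else list(stop)
    return payload


data = build_fim_payload(
    prompt="def fibonacci(n: int):",
    suffix="n = int(input('Enter a number: '))\nprint(fibonacci(n))",
    stop="\n\n",
)
```

The resulting data dictionary can be passed to call_fim_endpoint in the same way as the hand-written one.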

Instruct endpoint

The instruct endpoint lets us give the model natural-language instructions for generating code. It’s useful for more open-ended code-generation tasks.

API Endpoint URL:


  • messages: A list of chat messages; the user message carries the instruction for the code generation task.
  • temperature (optional): Controls the randomness of the generated code.
  • max_tokens (optional): Limits the length of the generated code.


prompt = "Please write me a function that adds up two numbers"
data = {
    "model": "codestral-latest",
    "messages": [
        {
            "role": "user",
            "content": prompt
        }
    ],
    "temperature": 0
}

response = call_chat_endpoint(data)
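
The call_chat_endpoint and call_fim_endpoint functions return either the parsed JSON or an error string, so the caller has to tell the two apart. The sketch below pulls out the generated text, assuming the response follows the usual chat-completions layout (choices[0].message.content). That layout and the helper name extract_content are assumptions to verify against a real response.

```python
def extract_content(response):
    """Return the generated text from a parsed response, or None.

    Assumes a chat-completions style body:
    {"choices": [{"message": {"content": "..."}}]}
    """
    if not isinstance(response, dict):
        # On failure the helper functions return a string like "Error: 401, ...".
        return None
    choices = response.get("choices") or []
    if not choices:
        return None
    return choices[0].get("message", {}).get("content")


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "choices": [
        {"message": {"role": "assistant",
                     "content": "def add(a, b):\n    return a + b"}}
    ]
}
print(extract_content(sample))
```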

Advanced API Usage

Rate limits

Each endpoint has different rate limits.

To ensure smooth operation and avoid exceeding usage restrictions, managing rate limits is crucial. One effective strategy is to implement retry logic, which pauses and resubmits a request after a specified delay. We can use Python's time library to introduce this delay.
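
As a rough sketch of that retry strategy, the helper below retries a request when the status code is 429 or 500. The name call_with_retry, its parameters, and the choice to double the delay after each failed attempt are assumptions made for illustration, not part of the Codestral API.

```python
import time


def call_with_retry(send, max_attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call send() until it stops reporting a rate-limit or server error.

    send must return an object with a status_code attribute, for example
    the result of requests.post. The delay doubles after each failed attempt.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in (429, 500):
            return response
        if attempt < max_attempts - 1:
            # Wait before the next attempt: 1x, 2x, 4x, ... the base delay.
            sleep(delay_seconds * (2 ** attempt))
    return response
```

A call might look like call_with_retry(lambda: requests.post(url, headers=headers, data=json.dumps(data))). The sleep parameter exists so the waiting behavior can be swapped out in tests.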

Alternatively, you can keep track of your API usage to ensure you stay within the allowed limits. You can log the number of requests made and compare them against the rate limits.
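
One way to track usage is a small sliding-window counter. In the sketch below, the class name RequestTracker and the limit and window values are hypothetical; substitute the limits that apply to your endpoint.

```python
import time
from collections import deque


class RequestTracker:
    """Count requests inside a sliding time window.

    The limit and window are placeholders; use the values that apply
    to your endpoint and plan.
    """

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.timestamps = deque()

    def _drop_expired(self):
        # Forget requests that have fallen out of the window.
        cutoff = self.clock() - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()

    def can_send(self):
        """Return True if another request fits in the current window."""
        self._drop_expired()
        return len(self.timestamps) < self.max_requests

    def record(self):
        """Log that a request was just sent."""
        self.timestamps.append(self.clock())
```

Before each API call, check can_send(), and call record() right after sending the request.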

Error handling

Common error codes include:

  • 401 Unauthorized: Invalid API key or missing authentication.
  • 429 Too Many Requests: Rate limit exceeded.
  • 500 Internal Server Error: Server issues on Mistral AI's side.

For a 401 Unauthorized error, check the API key and ensure it's correctly included in the request headers.

For a 429 Too Many Requests error, implement retry logic that waits and resubmits the request after a delay.

Retry logic is also useful for a 500 Internal Server Error: wait, then try the request again later.
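
The guidance above can be condensed into a small lookup. This is only a sketch: the function name next_action and the action labels it returns are made up for illustration.

```python
def next_action(status_code):
    """Map an HTTP status code to a suggested next step."""
    if status_code == 200:
        return "use_response"
    if status_code == 401:
        # Invalid or missing key: retrying will not help.
        return "check_api_key"
    if status_code in (429, 500):
        # Rate limit or server-side issue: wait, then try again.
        return "retry_after_delay"
    return "report_error"
```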

Customizing output

To control the output format and style, you can adjust parameters like prompt (making it more specific or general) and temperature (controlling the model’s creativity).

Experiment with these parameters to fine-tune the balance between creativity and predictability in the generated code.
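
To compare settings side by side, one option is to generate the same request at several temperatures. In this sketch, build_chat_payload is a hypothetical helper, and the temperature values and max_tokens limit are arbitrary choices.

```python
def build_chat_payload(prompt, temperature=0, max_tokens=None,
                       model="codestral-latest"):
    """Assemble a request body for the instruct endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


prompt = "Write a Python function that checks whether a string is a palindrome."

# One payload per temperature, so the outputs can be compared side by side.
payloads = [build_chat_payload(prompt, temperature=t, max_tokens=200)
            for t in (0, 0.4, 0.8)]
```

Each payload can then be sent with call_chat_endpoint and the results inspected for differences in style and variability.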

Integrating Codestral API

Codestral doesn't just function as a standalone API; we can also integrate it into our existing development workflows. Let's explore two primary ways to do this: through IDE and text editor integrations or by creating custom scripts.

IDEs and text editors

Codestral integrates with popular IDEs and text editors through plugins or extensions. For example, you can set up Codestral with the Continue extension for VS Code or JetBrains:

  1. Install the Continue extension for VS Code or JetBrains.
  2. Select Mistral API as a provider and choose Codestral as the model.
  3. Click Get API Key to get a Codestral API key.
  4. Click Add model, which will automatically populate the config.json.

Custom scripts

You can build custom scripts or applications that leverage the Codestral API for code generation tasks. Here’s a simple example of a script that generates test functions:

prompt="""Sure, here is a simple function in Python that adds up two numbers:

def add_two_numbers(num1, num2):
    return num1 + num2

You can use this function like this:
result = add_two_numbers(5, 3)
print(result)  # Outputs: 8

This function takes two arguments, num1 and num2, and returns their sum.\ndef test_add_two_numbers():""" 


suffix = ""  # nothing follows the test function, so the suffix is left empty
data = { "model": "codestral-latest", "prompt": prompt, "suffix": suffix, "temperature": 0 }

response = call_fim_endpoint(data)


Best Practices for Using the Codestral API

To get the most out of Codestral, let's explore some best practices for using the API effectively and responsibly.

Clear and concise prompts

When using the Codestral API, clarity is key. Ensure your prompts or instructions are well-defined and unambiguous to obtain the desired code output. You can guide the model towards generating code that aligns with your expectations by providing specific and contextual prompts.
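
To make the difference concrete, the sketch below contrasts a vague prompt with one that spells out the language and requirements. The helper build_prompt and the example task are illustrative only.

```python
def build_prompt(task, language="Python", constraints=()):
    """Combine a task with explicit context so the request is unambiguous."""
    lines = [f"Write {language} code for the following task: {task}"]
    if constraints:
        lines.append("Requirements:")
        lines.extend(f"- {item}" for item in constraints)
    return "\n".join(lines)


# A vague prompt leaves the language, inputs, and edge cases open.
vague = "Write a sort function"

# A specific prompt states the task and the constraints explicitly.
specific = build_prompt(
    "sort a list of dictionaries by a given key",
    constraints=[
        "do not modify the input list",
        "raise KeyError if any dictionary lacks the key",
        "include type hints and a docstring",
    ],
)
print(specific)
```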

Iterative refinement

Optimizing your interactions with the Codestral API involves an iterative process of refinement. Experiment with different parameters, prompts, and inputs to achieve the desired outcomes. Analyze the generated code outputs and use them to inform adjustments to your prompts. By continuously refining your approach, you can improve the quality and relevance of the generated code over time.

Responsible use

Responsible use of the Codestral API is important. Ensure that your use of the API aligns with ethical and legal standards. Avoid generating code that could be malicious or harmful. Adhere to best practices for secure and ethical programming, and respect the privacy and rights of others. Using the API responsibly contributes to a positive and sustainable development environment.


This tutorial provided a practical guide to using the Codestral API for code generation. We explored the different endpoints, best practices, and integration options.

I encourage you to experiment with the API and discover how it can enhance your own development workflow.

To learn more about Mistral, check out this Mistral 7B tutorial and this guide to working with the Mistral large model.

Ryan Ong

Ryan is a lead data scientist specialising in building AI applications using LLMs. He is a PhD candidate in Natural Language Processing and Knowledge Graphs at Imperial College London, where he also completed his Master’s degree in Computer Science. Outside of data science, he writes a weekly Substack newsletter, The Limitless Playbook, where he shares one actionable idea from the world's top thinkers and occasionally writes about core AI concepts.

