Chat Model

The Nebula Chat model supports both multi-turn and single-turn interactions between a user and an assistant.

The Nebula Chat model lets the user carry on a conversation with the model. It takes a list of messages as input and returns the model's generated response messages as output. The chat model is ideal for multi-turn chat, but it can also be used for single-turn tasks when no continued conversation is needed.

You can include your conversation transcript with instructions in one or more messages.

Here's an example of an API call to the Chat Model. Note that the last message in the payload includes the transcript.

export NEBULA_API_KEY="YOUR_API_KEY"
curl --location "https://api-nebula.symbl.ai/v1/model/chat" \
--header "ApiKey: $NEBULA_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "max_new_tokens": 1024,
    "system_prompt": "You are a sales coaching assistant. You help user to get better at selling. You are respectful, professional and you always respond politely.",
    "messages": [
        {
            "role": "human",
            "text": "Hello, I am Mark."
        },
        {
            "role": "assistant",
            "text": "Hello, Mark. I'\''m here to assist you with any sales-related questions or topics you'\''d like to discuss. How can I help you today?"
        },
        {
            "role": "human",
            "text": "Give me some tips on how do I improve in handling objections. Here is my transcript:\nMark: Hello, I am Mark from DataFlyte. How are you doing?\nCustomer: Hi, I am good. ....."
        }
    ]
}'

import requests
import json

NEBULA_API_KEY="YOUR_API_KEY"  # Replace with your API key

url = "https://api-nebula.symbl.ai/v1/model/chat"

payload = json.dumps({
  "max_new_tokens": 1024,
  "system_prompt": "You are a sales coaching assistant. You help user to get better at selling. You are respectful, professional and you always respond politely.",
  "messages": [
    {
      "role": "human",
      "text": "Hello, I am Mark."
    },
    {
      "role": "assistant",
      "text": "Hello, Mark. I'm here to assist you with any sales-related questions or topics you'd like to discuss. How can I help you today?"
    },
    {
      "role": "human",
      "text": "Give me some tips on how do I improve in handling objections. Here is my transcript:\nMark: Hello, I am Mark from DataFlyte. How are you doing?\nCustomer: Hi, I am good. ....."
    }
  ]
})
headers = {
  'ApiKey': NEBULA_API_KEY,
  'Content-Type': 'application/json'
}

response = requests.post(url, headers=headers, data=payload)
response.raise_for_status()  # Stop here if the API returned an HTTP error

print(response.json())

You can take a look at the API reference to learn more about the Chat API.

The input contains a messages field, which is an array of message objects. Each object has a role, either human or assistant, and the text of the message. The messages array must contain at least one message and can hold many back-and-forth turns. Human messages are requests, instructions, or comments for the assistant to reply to. Assistant messages represent previous assistant responses; you can also write your own example assistant messages to steer the model toward the desired behavior.
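
The message structure described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Nebula API: the helper name validate_messages and the VALID_ROLES constant are hypothetical, and the checks cover only the rules stated in this section (a non-empty array, a role of human or assistant, and a text string).

```python
VALID_ROLES = {"human", "assistant"}  # the two roles named in this section


def validate_messages(messages):
    """Raise ValueError if messages does not match the shape described above."""
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            raise ValueError(f"message {i} must be an object")
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"message {i} has invalid role: {msg.get('role')!r}")
        if not isinstance(msg.get("text"), str):
            raise ValueError(f"message {i} must have a string 'text' field")
    return messages


# Example: a single human message is a valid single-turn request.
validate_messages([{"role": "human", "text": "Hello, I am Mark."}])
```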

It's important to include the chat history when the user's messages refer to previous messages. Because the model has no memory of past requests or messages, all relevant information must be provided as chat history in the messages of each request. If the history does not fit within the model's token limit, it should be shortened.
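
One simple way to shorten the history is to drop the oldest messages first. The sketch below illustrates that idea under loud assumptions: estimate_tokens uses a rough guess of about 4 characters per token, which is not the model's actual tokenizer, and both function names and the max_tokens budget are hypothetical. Other strategies, such as summarizing older turns, may suit your use case better.

```python
def estimate_tokens(text):
    # Assumption: roughly 4 characters per token. This is NOT the model's real tokenizer.
    return max(1, len(text) // 4)


def trim_history(messages, max_tokens):
    """Keep the most recent messages whose estimated size fits in max_tokens."""
    kept = []
    used = 0
    # Walk from the newest message to the oldest.
    for msg in reversed(messages):
        cost = estimate_tokens(msg["text"])
        # Always keep the latest message, even if it alone exceeds the budget.
        if kept and used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

The original list is left unmodified, so the full history stays available if you later want to trim it with a different budget.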

The system_prompt is an optional field in the input request. It helps set the behavior of the assistant. In the above example, the system prompt sets the context for the assistant to behave like a sales coaching assistant and indicates that it should help the user get better at sales while being respectful, professional, and polite. If the system prompt is not provided, the assistant's behavior will be generic, grounded in being helpful and polite.
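
Because system_prompt is optional, a request body can be assembled so that the field is simply left out when no prompt is given. This is a minimal sketch: build_chat_payload is a hypothetical helper, and the default of 1024 for max_new_tokens is copied from the example above rather than from any documented default.

```python
import json


def build_chat_payload(messages, system_prompt=None, max_new_tokens=1024):
    """Return the JSON request body, omitting system_prompt when it is not set."""
    payload = {"max_new_tokens": max_new_tokens, "messages": messages}
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return json.dumps(payload)


# Example: the same messages, with and without a system prompt.
messages = [{"role": "human", "text": "Hello, I am Mark."}]
generic_body = build_chat_payload(messages)
coaching_body = build_chat_payload(messages, system_prompt="You are a sales coaching assistant.")
```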

If you want to receive output progressively as it is generated, you can use the /v1/model/chat/streaming endpoint.

Response Format

{
    "model": "nebula-chat-large",
    "messages": [
        {
            "role": "human",
            "text": "Hello, I am Mark."
        },
        {
            "role": "assistant",
            "text": "Hello, Mark. I'm here to assist you with any sales-related questions or topics you'd like to discuss. How can I help you today?"
        }
    ],
    "stats": {
        "input_tokens": 61,
        "output_tokens": 34,
        "total_tokens": 95
    }
}

You can access the assistant's response using:

response['messages'][-1]['text']
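
To continue the conversation, the assistant's reply can be appended to your own chat history together with the next human message, and the combined list sent as messages in the following request. The sketch below assumes the response has already been parsed into a dictionary with the shape shown under Response Format; next_turn_messages is a hypothetical helper, not part of the API.

```python
def next_turn_messages(history, response, user_text):
    """Return prior history plus the assistant's reply and the next human message."""
    reply = response["messages"][-1]
    if reply["role"] != "assistant":
        raise ValueError("expected the last response message to be from the assistant")
    # Build a new list so the caller's history is not modified in place.
    return history + [
        {"role": "assistant", "text": reply["text"]},
        {"role": "human", "text": user_text},
    ]


# Example using the shape of the response shown above.
history = [{"role": "human", "text": "Hello, I am Mark."}]
response = {
    "model": "nebula-chat-large",
    "messages": [
        {"role": "human", "text": "Hello, I am Mark."},
        {"role": "assistant", "text": "Hello, Mark. How can I help you today?"},
    ],
}
messages = next_turn_messages(history, response, "Give me tips on handling objections.")
```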

You can further tune other parameters to control the generation behavior of the model.