Note: this page describes how to build applications using websockets. If you prefer to use the webhooks API, please visit this page.
TLDR;
- Use
npx create-jambonz-ws-app
to scaffold a webhook application - See @jambonz/node-client-ws for Node.js API
The websocket API is functionally equivalent to the Webhook API; it is simply an alternative way for an application to interact with and drive jambonz call and message processing. We recommend using the websocket API for highly asynchronous applications.
When you create a jambonz application in the jambonz portal and you want to use the websocket API, simply provide a ws(s) URL for the calling webhook instead of an http(s) URL. The call status webhook can be the same ws(s) URL, in which case your application will get the call status notifications over the same websocket connections.
You can also have call status notifications sent to a completely separate http(s) webhook URL if you prefer.
The impact of specifying a ws(s) URL as the application calling webhook is that this causes jambonz to establish a websocket connection to that URL when an incoming call (or outbound call) is routed to the jambonz application, and then communicate with your application over that websocket connection.
In the documentation below, we refer to the websocket server as the "application".
The websocket connection will be established by jambonz to the specified websocket URL, The websocket subprotocol used shall be “ws.jambonz.org”. If jambonz fails to connect to the provided URL, there will be no retry and the call shall be rejected.
Once connected, jambonz will send an initial JSON text message to the your application with the same parameters as are provided in the webhook call. The full message set is described below, but for now we can simply say that:
- Only text frames are ever sent over the websocket connections; i.e. no binary frames.
- All text frames contain JSON-formatted data.
- The information content sent from jambonz to the your application is exactly the same content as that supplied via http webhooks.
The websocket should generally be closed only from the jambonz side, which happens when the call is ended. If the your application closes the socket, jambonz will attempt to reconnect, up to a configurable number of reconnection attempts. Upon reconnecting, jambonz will send an initial reconnect message containing only the callSid of the session. It is up to the your application to maintain the state of the application between reconnections for the same call.
As mentioned above, all messages will be JSON payloads sent as text frames. The following top-level properties will be commonly included:
- type: all messages must have a type property.
- Messages from jambonz to the your application will have the following types: [
session:new
,session:reconnect
,verb:hook
,call:status
,error
]. - Messages from the your application to jambonz will have the following types: [
ack
,command
].
- Messages from jambonz to the your application will have the following types: [
- msgid: every message sent from jambonz will include a unique message identifier. Messages from the your application application that are responses to jambonz messages (
ack
) must include the msgId that they are acknowledging.
Note that not all messages sent by jambonz need to be acknowledged. The message types which must be acknowledged are the session:new
, and verb:hook
messages.
In the sections that follow, we will describe each of the message types in detail. The tables below provides summary information for:
- client messages (sent from jambonz to your application) and,
- server messages (sent from your application to jambonz).
The following messages are sent by jambonz to your application
type | usage |
---|---|
session:new | sent when a new call arrives (or an outbound call generated via the REST API has been answered). This is analogous to the initial webhook sent by jambonz to gather an initial set of instructions for the call. |
session:redirect | sent when live call control has been used to retrieve a new application for either the parent or child call leg. |
session:reconnect | sent when the websocket connection was closed unexpectedly by the application and jambonz has successfully reconnected. |
call:status | sent any time the call status changes. |
verb:hook | sent when an action hook or event hook configured for a verb has been triggered (e.g. a “gather” verb has collected an utterance from the user). |
verb:status | sent when a verb has just started or completed executing. See “command” below; this message is only sent if the application includes “id” properties on the verbs provided. |
llm:event | sent when an LLM generates any kind of event; e.g. transcript, etc |
llm:tool-call | sent when an LLM agent makes a tool or function call that the application needs to invoke |
tts:tokens-result | sent in response to a tts:tokens message to indicate whether the tokens have been processed. The payload may indicate that the tokens were not processed due to a throttling limit, in which case the application is expected to queue the tokens and retry later (after a tts:tokens message is received indicating the stream has been resumed) |
tts:streaming-event | sent to notify an application that a tts stream has been paused or resumed due to throttling limits |
dial:confirm | sent when a dialed call has a confirmHook; the application should respond with a payload of verbs to play in the confirm call session |
jambonz:error | if jambonz encounters some sort of fatal error (i.e. something that would necessitate ending the call unexpectedly) jambonz will send an error event to the far end application describing the problem. |
The following messages can be sent from your application back to jambonz
type | usage |
---|---|
ack | the jambonz application must respond to any session:new or verb:hook message with an ack message indicating that the provided content in the message has been processed. The ack message may optionally contain a payload of new instructions for jambonz. |
command | the application will send this message when it wants to asynchronously provide a new set of instructions to jambonz. The application may include an id property in each of the verbs included in the command; if so, jambonz will send verb:status notifications back to the application when the verb is executed. The id property is a string value that is assigned by the application and is meaningful only to the application (i.e. to jambonz it is simply an opaque piece of tracking data). |
llm:tool-output | the application should send this when an LLM has invoked a tool and results are available |
llm:update | the apps sends when it wants to asynchronously provide new instructions or session state to the LLM |
tts:tokens | sent when the application wants to stream text, e.g. from an LLM being managed on the application side. This may be called multiple times as the LLM itself streams tokens |
tts:flush | sent periodically during TTS streaming when the application wants to cause audio to be generated |
tts:clear | sent during TTS streaming when the application wants to discard any queued audio and tokens, e.g. to handle an interruption |