Place call in conference on answer (replicating Twilio functionality)

Hi,

We’re trying to recreate our existing Twilio functionality in Jambonz and I don’t know if I’m hitting a fundamental difference/limitation or I’m doing something stupid. The behaviour we are looking for is to dial out and upgrade the call to a conference as soon as the call is answered.

In Twilio, when we make an outbound call, we dial the callee using <Dial action="/dialAction"><Number url="/receiverAnswers">0123456789</Number></Dial>. On the receiverAnswers hook, we call Twilio’s UpdateAsync API on the callee’s CallSid to redirect it into a conference. Twilio then fires the action callback for the origin, and we send the origin into the same conference via the action response. Both parties end up in a conference without either SIP session being torn down.
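For anyone reading along, here is a rough sketch of the two TwiML documents involved in that Twilio flow, built with Python's standard library. The /dialAction and /receiverAnswers paths and the number come from the post above; the room name is a made-up placeholder, and this is only an illustration, not our production code.

```python
import xml.etree.ElementTree as ET


def dial_twiml(number: str) -> str:
    """TwiML for the origin leg: dial the callee, with a url hook that fires on answer."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial", {"action": "/dialAction"})
    num = ET.SubElement(dial, "Number", {"url": "/receiverAnswers"})
    num.text = number
    return ET.tostring(response, encoding="unicode")


def conference_twiml(room: str) -> str:
    """TwiML that sends whichever leg receives it into the named conference."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    conf = ET.SubElement(dial, "Conference")
    conf.text = room  # hypothetical room name, chosen by the application
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(dial_twiml("0123456789"))
    print(conference_twiml("room-for-this-call"))
```

The callee is redirected to the conference document via the call update, and the origin receives the same document as the response to the Dial action callback.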

However, when I try to recreate similar behaviour in Jambonz it doesn’t work. I’ve tried two different approaches, both starting with a dial verb that has an actionHook and a confirmHook:

  • Using UpdateCall in the REST API in the confirmHook to send the callee into a conference - this doesn’t seem to do anything (I assume it’s because it’s waiting on the response from the confirmHook and this takes precedence?).

  • Returning a Conference verb on the confirmHook, which moves the callee to the conference. This doesn’t trigger the actionHook, so the origin stays in the dial state, and if we force it into a conference with UpdateCall on the API (from a conference status callback) it ends the call for the callee.

Does that all sound right or am I doing something wrong? Is the best option to get what we want here to start with a Conference verb instead of Dial, or is there another approach we could take? Thank you in advance!

Hi Matt,

My initial thought for doing this would be to avoid the dial verb altogether.
For your initial incoming call, I would put the caller into the conference but not set startConferenceOnEnter for them, and set the hold music to be a ringback tone.
Then use the REST API to create a new outbound call to the B leg, and when they answer, return them the same conference verb but with startConferenceOnEnter set to true.
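As a rough sketch of that conference-first flow, the payloads could look something like the following. The conference property names match the verb JSON posted later in this thread; waitHook (for the ringback audio) and the create-call field names (from, to, call_hook, call_status_hook) are my assumptions about the jambonz API, so check them against the docs. All URLs and numbers are placeholders.

```python
def a_leg_conference(conf_name: str, wait_hook: str) -> list:
    """Verbs for the incoming (A) leg: join the conference but do not start it."""
    return [{
        "verb": "conference",
        "name": conf_name,
        "startConferenceOnEnter": False,  # A leg waits until the B leg joins
        "waitHook": wait_hook,            # assumed: hook returning verbs that play a ringback tone
    }]


def b_leg_conference(conf_name: str) -> list:
    """Verbs returned when the B leg answers: same conference, started on entry."""
    return [{
        "verb": "conference",
        "name": conf_name,
        "startConferenceOnEnter": True,
        "endConferenceOnExit": True,
    }]


def create_call_body(from_number: str, to_number: str,
                     call_hook: str, status_hook: str) -> dict:
    """Body for the REST create-call request (field names are assumptions)."""
    return {
        "from": from_number,
        "to": {"type": "phone", "number": to_number},
        "call_hook": call_hook,            # should return b_leg_conference(...)
        "call_status_hook": status_hook,
    }
```

The idea is that the A leg hears the ringback from the wait hook until the B leg answers and starts the conference.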

Are you using webhooks or websockets?

Hi Sam,

Thanks for your response! I will take that approach and check it works as expected!

We are using webhooks because that is most similar to our existing Twilio functionality. Would websockets change the approach here?

Websockets wouldn’t change the overall scenario I described, but they do make it a bit easier to update the call verbs asynchronously.
You could even hold the incoming call in a ringing state until the outgoing leg is in the conference and then answer it.
Overall, websockets tend to give more flexibility when dealing with multiple legs of a call.

Good to know, thank you!

So using websockets we could leave the call in a ringing state (whilst we create a separate outbound call via the API and direct both to a conference on answer), but we can’t do this with webhooks? We’ve gone with the webhook approach because our existing service/infrastructure is built around answering Twilio webhooks but we could revisit that choice.

I’ve been asking about upgrading to a conference at answer time, but just to check: is it impossible to upgrade a dial to a conference at any point during the call?

To give a bit more context about what we’re trying to achieve: we’d like users to be able to make outbound calls which include call monitoring/whispering and warm transfers (including a 3-way conference step - I know you support warm transfers without conferences as long as one party is always on hold), all of which require conferences. But we’d also like the advantages of starting the call using Dial (destination country ringback, accurate call time, expected handling of busy/failed calls). These are things the user expects when placing a phone call, and we’d otherwise have to implement or fake them ourselves.

It sounds like our choices are:

  1. Start with a conference and fake ringback, calculate our own ring time instead of using the SIP one for display in our Softphone, etc.
  2. Switch to using websockets and leave the call in the ringing state - does this mean just not replying with any verbs on the websocket connection until the outbound B leg we create via the REST API is answered/rejected? What’s the ringback in this case - is there a way to pass through early media from the B leg?

Hi Matt,

I’ve been doing a bit of testing this morning and it is possible to get what you are aiming for with Webhooks.

Here’s the flow:

The initial incoming call should return a Dial verb with the B leg address in it as the destination.

When that B leg call is answered, you’ll get a status event on the app’s status hook with the direction as outbound and the call_status as in-progress. This tells you the B party has answered and the two parties are currently connected in a 1-1 call.

Using this event as your trigger, you can now update the initial leg using the update call API ("update a call"). Use the call_sid of the initial leg (in the B party event it will be called parent_call_sid).
The key thing to send here is the call_hook AND the child_call_hook.

These should both be to a URL that will then return a conference verb with the conf ID in it, I’d suggest using something like the A leg call_sid as the conf ID.

The 1-1 dial call will then convert into a conference. You’ll want to set a few params on the conference verb, like startConferenceOnEnter and endConferenceOnExit, so that if one leg hangs up the other is terminated, and maybe also DTMF passthrough so that if the A party sends DTMF it’s sent to the B party across the conference.
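A minimal sketch of that trigger logic, for illustration only. The event fields (direction, call_status, parent_call_sid) and the call_hook / child_call_hook names come from the steps above; the request path, the hook URL layout and the function names are assumptions, and the actual HTTP call to the update call API is left out.

```python
from typing import Optional


def conference_update_for(event: dict, account_sid: str, base_url: str) -> Optional[dict]:
    """Describe the update-call request to make when the B leg has just answered.

    Returns None for any status event that is not an answered outbound leg.
    """
    if event.get("direction") != "outbound":
        return None
    if event.get("call_status") != "in-progress":
        return None
    parent_sid = event.get("parent_call_sid")
    if not parent_sid:
        return None
    # Assumed URL layout: one endpoint that returns the conference verb,
    # using the A leg call_sid as the conference ID.
    hook = f"{base_url}/conference/{parent_sid}"
    return {
        "path": f"/v1/Accounts/{account_sid}/Calls/{parent_sid}",  # assumed path
        "body": {"call_hook": hook, "child_call_hook": hook},
    }


def conference_verbs(conf_id: str) -> list:
    """Verbs that the call_hook and child_call_hook URL would return."""
    return [{
        "verb": "conference",
        "name": conf_id,
        "startConferenceOnEnter": True,
        "endConferenceOnExit": True,
    }]
```

Sending both hooks in the same update is the important part: the call_hook moves the A leg and the child_call_hook moves the B leg, so neither is left behind in the dial.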

Thank you - the child_call_hook is what I was missing.

However I am now seeing some weird behaviour with the conference started like this. The conference seems to be created with both parties, and if I set endConferenceOnExit=false on one party the other party stays connected when they leave which seems to be conference-like behaviour. BUT I don’t receive anything on the conferenceStatus webhook and the /Accounts/:AccountSid/Conferences API endpoint doesn’t list the conference ID.

These both work as expected if I return a conference verb to the initial incoming call webhook.

Here are examples of the body of the responses returned from the call_hook and child_call_hook which I’ve logged:

[
  {
    "verb": "conference",
    "name": "DC.BdHp6f8uB191is1bSNdS",
    "beep": false,
    "memberTag": "CK.5v6s7xZwra9Ic6L3jYhH",
    "startConferenceOnEnter": true,
    "endConferenceOnExit": true,
    "statusEvents": [
      "start",
      "end",
      "join",
      "leave"
    ],
    "statusHook": "https://matt-router-local.ngrok.dev/api/jambonz/conference-status/DC.BdHp6f8uB191is1bSNdS"
  }
]
[
  {
    "verb": "conference",
    "name": "DC.BdHp6f8uB191is1bSNdS",
    "beep": false,
    "startConferenceOnEnter": true,
    "endConferenceOnExit": true,
    "statusEvents": [
      "start",
      "end",
      "join",
      "leave"
    ],
    "statusHook": "https://matt-router-local.ngrok.dev/api/jambonz/conference-status/DC.BdHp6f8uB191is1bSNdS"
  }
]

I’ve SSH’d into the box and looked at the pm2 logs and can see SipError: Sip non-success response: 603 System Tampering Detected!.

Hi Matt,

That’s strange. I see the conf status hooks in my test app, and your verb looks to be correct.

The 603 error is a bit odd too, that usually relates to the license but if calls are going through normally then it shouldn’t be an issue. Just double check the license still shows as valid in the web ui.

I just tested it again and that error appeared again at the point it should have moved to the conference. The license is showing as valid on the licensing site and in the Jambonz settings.

This is on our development cluster but we have a production cluster spun up that’s not in use yet - tomorrow I can try the same experiment on that with a new trial license and let you know how I get on!

Hi Sam,

I’ve tried it on the other cluster and got the same results: no conference webhooks, conference ID not appearing in API response, and same 603 error from FreeSwitch.

I’ve managed to pick up the FreeSwitch logs. There’s a WARNING about a missing X-Jambonz-Session-Token which appears at the point the call is answered - not sure if this is a big clue or a red herring?

Here is everything from /usr/local/freeswitch/log/freeswitch.log on the Feature Server at the point the call is answered:

6768ede9-ece0-4a83-ad0d-c9281906d69d 2026-04-30 16:28:43.649919 98.80% [NOTICE] mod_dptools.c:1374 Hangup sofia/drachtio_mrf/nobody@172.20.10.12:5060 [CS_EXCHANGE_MEDIA] [NORMAL_CLEARING]
6768ede9-ece0-4a83-ad0d-c9281906d69d 2026-04-30 16:28:43.669857 98.80% [NOTICE] switch_core_session.c:1762 Session 11 (sofia/drachtio_mrf/nobody@172.20.10.12:5060) Ended
6768ede9-ece0-4a83-ad0d-c9281906d69d 2026-04-30 16:28:43.669857 98.80% [NOTICE] switch_core_session.c:1766 Close Channel sofia/drachtio_mrf/nobody@172.20.10.12:5060 [CS_DESTROY]
688b144d-fa83-4a32-a0cc-69c274b1bb6d 2026-04-30 16:28:43.729929 98.80% [NOTICE] mod_dptools.c:1374 Hangup sofia/drachtio_mrf/nobody@172.20.10.12:5060 [CS_PARK] [NORMAL_CLEARING]
688b144d-fa83-4a32-a0cc-69c274b1bb6d [call_sid=e3b8a451-5033-446e-99fe-e8a032e0a15e] 2026-04-30 16:28:43.729929 98.80% [NOTICE] switch_core_session.c:1762 Session 10 (sofia/drachtio_mrf/nobody@172.20.10.12:5060) Ended
688b144d-fa83-4a32-a0cc-69c274b1bb6d 2026-04-30 16:28:43.729929 98.80% [NOTICE] switch_core_session.c:1766 Close Channel sofia/drachtio_mrf/nobody@172.20.10.12:5060 [CS_DESTROY]
2026-04-30 16:28:44.069856 98.80% [WARNING] sofia.c:2657 Rejecting INVITE with SDP but missing X-Jambonz-Session-Token header
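In case it helps anyone reproduce this, here is a small sketch for pulling only the WARNING-and-above lines out of a freeswitch.log excerpt like the one above. The line layout is inferred purely from these sample lines, so the regex is an assumption and may need adjusting for other log formats.

```python
import re

# Matches "[LEVEL] file.c:123 message" anywhere in a log line.
LEVEL_RE = re.compile(r"\[(?P<level>[A-Z]+)\]\s+(?P<source>\S+:\d+)\s+(?P<message>.*)$")

# Levels treated as worth a closer look.
WANTED_LEVELS = {"WARNING", "ERR", "CRIT", "ALERT"}


def warnings_and_errors(log_text: str) -> list:
    """Return (level, source, message) tuples for WARNING-or-worse lines."""
    hits = []
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match and match.group("level") in WANTED_LEVELS:
            hits.append((match.group("level"), match.group("source"), match.group("message")))
    return hits
```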

Note that this is a deployment in Azure from the Azure Terraform here (and the two clusters are mini and large): terraform/azure at main · jambonz-selfhosting/terraform · GitHub

Thanks!