
Question about Flex Consumption Function App Service Bus Trigger #10706

Open
dsdilpreet opened this issue Dec 19, 2024 · 8 comments
Labels
area: flex-consumption, Needs: Triage (Functions)

Comments

@dsdilpreet

Is your question related to a specific version? If so, please specify:

What language does your question apply to? (e.g. C#, JavaScript, Java, All)

C#

Question

Hi there, I asked this question over here, but this might be a more appropriate place to ask; if not, my apologies. The question is about Service Bus messages staying in the queue for longer on Flex Consumption compared to Consumption. The details are:

I have multiple functions which subscribe to Service Bus topics using the ServiceBusTrigger. All function apps are running on the .NET 8 isolated model. The Service Bus namespace is on the Standard tier. Each function app is pinged from Application Insights every 10 minutes.
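For reference, here is a minimal sketch of the kind of function involved. The topic, subscription, and connection setting names are placeholders, and the shape assumes the .NET 8 isolated worker model with the Microsoft.Azure.Functions.Worker.Extensions.ServiceBus extension:

using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class OrderUpdatedFunction
{
    private readonly ILogger<OrderUpdatedFunction> _logger;

    public OrderUpdatedFunction(ILogger<OrderUpdatedFunction> logger)
    {
        _logger = logger;
    }

    // Runs when a message arrives on the (hypothetical) 'orders' topic,
    // 'order-updated' subscription. 'ServiceBusConnection' is the name of
    // the app setting that holds the Service Bus connection.
    [Function(nameof(OrderUpdatedFunction))]
    public void Run(
        [ServiceBusTrigger("orders", "order-updated", Connection = "ServiceBusConnection")]
        string messageBody)
    {
        _logger.LogInformation("Processing message: {Body}", messageBody);
    }
}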

I have recently migrated from the Windows Consumption plan to the Flex Consumption plan.

On the Consumption plan, when the app scaled down to 0, Service Bus requests would also drop, whereas on Flex Consumption they don't drop when the function apps scale down to 0. They only drop when I turn the apps off.

I understand that Service Bus functions are now scaled independently of other trigger types on the Flex Consumption plan. What I am noticing on Flex is that there is a delay before a message is even picked up from the subscription by the trigger; I have seen delays of almost 2 minutes. I never observed delays of this kind on the Consumption plan.

Image

Is this expected? Is there any configuration or setting I can change in the function app so it checks for messages more frequently? Even with an always ready instance enabled, the delay still seems to be there, although it does seem somewhat reduced; I haven't tested this extensively, though.

Appreciate any insight into how this works internally.

Thank you!

My host.json file

{
    "version": "2.0",
    "logging": {
        "applicationInsights": {
            "samplingSettings": {
                "isEnabled": true,
                "excludedTypes": "Request"
            },
            "enableLiveMetricsFilters": true
        }
    }
}

and appsettings.json

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Function.GetHealthFunction": "Error",
      "Azure.Messaging.ServiceBus": "Warning",
      "Azure.Core": "Warning"
    }
  }
}

These files were not changed when migrating to Flex Consumption.
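For reference, the Service Bus extension also exposes per-instance settings in host.json, none of which I have set. A sketch is below (values are illustrative only, assuming the 5.x Service Bus extension); as I understand it, these control how many messages each running instance processes concurrently rather than how quickly the platform scales from zero:

{
    "version": "2.0",
    "extensions": {
        "serviceBus": {
            "prefetchCount": 100,
            "maxConcurrentCalls": 16,
            "autoCompleteMessages": true
        }
    }
}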

For context, I have randomly picked an event from App Insights from before the changeover to Flex.

Image

@satvu added the area: flex-consumption label on Dec 20, 2024
@dsdilpreet
Author

I have created a little sample to help reproduce this issue.
https://github.com/dsdilpreet/flex-consumption-service-bus-sample

The sample contains a Bicep script to deploy the relevant infrastructure and two function apps, one running Consumption and the other running Flex Consumption. As you can see from the results below, Flex Consumption is significantly slower: it took 8 and 11 seconds for Flex, and just milliseconds for Consumption, to trigger after a message was enqueued to the same topic. I made sure that in both cases exactly one instance was running for each function app, so there was no cold start involved.

Test 1
Consumption
Image
Flex Consumption
Image

Test 2
Consumption
Image
Flex Consumption
Image

@dsdilpreet
Author

Hi @nzthiago! Is this your area of expertise? I would really appreciate your input; this is holding up our Flex Consumption deployment, unfortunately.

@nzthiago
Member

Hi @dsdilpreet - thank you for pinging me, and for sharing a repro. I also added a Linux Consumption app to the test, and for the initial message I can see the same results as you (with Linux Consumption being similar to Flex Consumption). There is likely a "scale from zero" optimization that was done to Windows Consumption that we need to bring to Flex Consumption here.

Can you share what you experience with subsequent messages? I.e., if you wait, say, 30 minutes, and send another message, and then a few more quickly, does the behavior and latency difference change for you? I believe Flex Consumption should be faster for those.

@dsdilpreet
Author

Hi @nzthiago - thanks for getting back.

I don't think it's just cold start at play here. I have run the test you suggested again on my end (still the same setup as the repo). I sent the first message (which should be a cold start, because I hadn't sent anything to the topic for days) and then sent 3 more messages within seconds of the first one.

Message          Windows Consumption    Flex Consumption   Notes
1 (cold start)   ~12s                   ~12s               roughly the same
2 (warm)         pretty much instant    ~9s
3 (warm)         pretty much instant    ~10s
4 (warm)         pretty much instant    ~10s

It seems like Flex Consumption doesn't poll Service Bus as frequently as Windows Consumption does. Do you have any insights on this?

Thanks for your help so far.

@nzthiago
Member

nzthiago commented Jan 14, 2025

@dsdilpreet thank you for the extra tests, appreciate it. We now understand why you are seeing these results. It is related both to how quickly Flex Consumption scales in and to how quickly it checks for new messages in the queue. Anything beyond 30 seconds between tests could have the Flex Consumption app scaled back to zero, which was the case in your tests. Once the queue or topic gets busy, Flex Consumption will scale out and perform faster than Consumption.

We will discuss internally how to improve this, either with faster checks for changes, or by taking longer to scale in, or both. In the meantime, if you need a very fast response to that very first message, it can be mitigated by enabling one Always Ready instance for that function.

Image
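For reference, the same Always Ready setting can also be applied outside the portal. A minimal sketch with the Azure CLI is below; the resource group, app name, and function name are placeholders, and the command assumes the current Flex Consumption scale-config CLI surface:

az functionapp scale config always-ready set \
    --resource-group <resource-group> \
    --name <function-app-name> \
    --settings function:<ServiceBusFunctionName>=1

Since Service Bus triggered functions scale per function on Flex Consumption, the setting is keyed as function:<FunctionName> rather than as an app-wide instance count.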

You would be able to see if the instance gets reused or if it's a new instance by looking at the cloud_RoleInstance field in an App Insights query against the traces table. Here's a sample query:

traces
| parse-where message with "Trigger Details: MessageId: " MessageId ", SequenceNumber: " SequenceNumber ", DeliveryCount: " DeliveryCount ", EnqueuedTimeUtc: " EnqueuedTimeUtc ", " *
| extend LatencyToTriggerMs = datetime_diff("millisecond", timestamp, todatetime(EnqueuedTimeUtc)) 
| project timestamp, EnqueuedTimeUtc, LatencyToTriggerMs, cloud_RoleName, cloud_RoleInstance, MessageId, SequenceNumber, DeliveryCount
| order by EnqueuedTimeUtc asc

With that one Always Ready instance, the Flex Consumption app triggers in milliseconds for the first and subsequent messages:

Image

@dsdilpreet
Author

@nzthiago, you are right. When I send a message very quickly after an instance has started, the Flex Consumption plan does process it pretty much instantly. But the app seems to scale in after about 30 seconds irrespective of traffic, i.e. even if I keep sending messages it will still scale in, and the odd message will start a new instance.

We also tried an always ready instance, and it does seem to remedy the problem, but we have a lot of subscriptions, so always ready instances would have a significant cost implication for our solution.

It would be great if you could make the polling/scaling configurable as you mentioned in your previous comment. Is there any way we can track the progress of this? We would like to use Flex Consumption going forward once this is fixed.

Thank you for replicating this on your end and all your help so far!

@nzthiago
Member

@dsdilpreet - we now have an item in our backlog to introduce a "last instance per function group / individual function remains for 10 minutes" feature, to mitigate the behavior you identified of the app scaling in too quickly. This will likely take a few months to implement and roll out, so the workaround shared above is recommended for now, even though it might not be the best fit for your implementation. I will update our documentation once it does roll out. Thank you for highlighting this! @pragnagopa @alrod FYI.

@dsdilpreet
Author

Thank you, @nzthiago! The rough timeline helps as well.

Would this also address the Service Bus polling frequency issue, i.e. how a message can sometimes sit on the bus for a while before an instance even begins to initialize?
