Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add azure blob scaler #514

Merged
merged 2 commits into from
Jan 16, 2020
Merged

Add azure blob scaler #514

merged 2 commits into from
Jan 16, 2020

Conversation

IsharaPannila
Copy link
Contributor

@IsharaPannila IsharaPannila commented Dec 12, 2019

Relates to #154
Docs via kedacore/keda-docs#58

@tomkerkhove
Copy link
Member

Docs ✔️

@ahmelsayed
Copy link
Contributor

@IsharaPannila this change assumes that the worker will clear the blob container specified when done with blobs, right?

@IsharaPannila
Copy link
Contributor Author

@IsharaPannila this change assumes that the worker will clear the blob container specified when done with blobs, right?

Nope. if a sub path specified, worker will clear the given sub path(folder) when done with all the blobs inside that sub path. but blob container (and any other sub folders if exists) will remains not affected.

https://github.com/IsharaPannila/sample-blob-consumer/blob/master/blob_consumer.cs
check the above sample code,
once all the blobs processed the optional BLOB_SUB_PATH will clear but container-name will remains.

@tomkerkhove
Copy link
Member

Can you have another look at this please @ahmelsayed? @IsharaPannila mentioned in standup yesterday that this PR should be finished.

@ahmelsayed
Copy link
Contributor

Sorry for the delay in getting back. @IsharaPannila what I meant is this line in your sample (https://github.com/IsharaPannila/sample-dotnet-blob-azure-function/blob/master/blob_consumer.cs#L28) What would happen if you don't delete the blob once you're done? If I'm reading the PR correctly, containerURL.ListBlobsHierarchySegment() returns the list of blobs in the container, right? So if you don't delete the blob once you're done processing it, IsActive will always return true, right?

@IsharaPannila
Copy link
Contributor Author

@ahmelsayed yes, you are right. In our use cases we actually move the file to another sub folder for archiving. unless the worker delete/move the file IsActive will return true.

@jeffhollan
Copy link
Member

Talked about this is standup and chatted some with @ahmelsayed as well. To clarify the main point of discussion is that this scaler will scale based on any items in a blob container. In order to scale back down, the user code would need to delete or move the item out of the container. This is different than how blob trigger works for solutions like Event Grid, Azure Functions, and Azure Logic Apps where code is only triggered on new blobs.

I'm ok with this behavior in the interim, but it does present a fair consideration of what happens if we end up creating a scaler that scales based on the way other Azure Services trigger on blob? It would potentially be a breaking change or add some strange if/then code paths on behavior.

I'm wondering if we can set the name of this scaler as something unique to how it behaves. So in the future may be azure-blob and this could be azure-blob-all or something? Not sure. But is worth considering - how would we rationalize that behavior with this scaler, and should we name this scaler something different, or at least have a good understand of how that decision would be clear to users if both options were available

@IsharaPannila
Copy link
Contributor Author

@jeffhollan , I got some idea based on the comment of @ahmelsayed that I can use the marker and achieve the expected behaviour. so the change seems to me like IsActive will begin with empty marker and always get subsequent containerURL.ListBlobsHierarchySegment with the previous marker. so if there are new items added we can scale based on that. @ahmelsayed , will it give the expected outcome?

@ahmelsayed
Copy link
Contributor

@IsharaPannila I think that's the basic idea the runtime does. Though the implementation can be tricky unfortunately since you'll need to store that cursor in something external (azure functions uses another blob in the default storage account, similar to how eventhubs works if you're familiar with that). It also takes into account edited/updated blobs by tracking time stamps and few other stuff.

I'm okay with Jeff's suggestion of leaving the behavior of this scaler as is, and call out in the docs explicitly that this is different from how azure functions in particular does it. and maybe in the future we can have another scaler that implements the azure functions particular contact. and in the meantime you can use this scaler even with azure functions if you do as you did in the sample.

@IsharaPannila
Copy link
Contributor Author

@ahmelsayed sounds good. yes , its better to leave this behaviour as is and meantime I will start working on the implementation of the particular azure functions contract as a seperate scalar and will reach out to you when progressing it to get your feedback. I will update the doc for this scalar to something similar to Azure Blob Storage† and mention the limitations(†) .

@ahmelsayed
Copy link
Contributor

Thanks @IsharaPannila, and thank you for adding an e2e test too :)

@ahmelsayed ahmelsayed merged commit 1aeb063 into kedacore:master Jan 16, 2020
@jeffhollan
Copy link
Member

Thanks all

@tomkerkhove
Copy link
Member

Created a follow-up issue for the discussion above #547

I'm wondering if we can set the name of this scaler as something unique to how it behaves. So in the future may be azure-blob and this could be azure-blob-all or something? Not sure. But is worth considering - how would we rationalize that behavior with this scaler, and should we name this scaler something different, or at least have a good understand of how that decision would be clear to users if both options were available

I'm happy to see that we've sticked with the original name as we can still swap the behavior internally. Over time, we can change it and maybe rev the blob scaler seperately from the runtime.

preflightsiren pushed a commit to preflightsiren/keda that referenced this pull request Nov 7, 2021
Signed-off-by: jorturfer <jorge_turrado@hotmail.es>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants