Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Programmatically create permanent session #744

Closed
jmilkiewicz opened this issue Nov 14, 2023 · 10 comments
Closed

Programmatically create permanent session #744

jmilkiewicz opened this issue Nov 14, 2023 · 10 comments

Comments

@jmilkiewicz
Copy link
Contributor

Hey

I am just wondering if you consider creating permanent session programmatically - via REST API.
The problem i am facing is that in my app i consider using a mix of permanent and "regular sessions" . Both of them require a huge number of submit-params to be provided - regular will differ from perm only by pyFiles passed on session creation. I would like to avoid storing/keeping common submit-params in 2 places: in my app and in lighter application*.yml file.
One of the solution would be to allow user to create perm session on the fly (via REST API), what do you think ?

Of course it would require solving some issues like:

  • a perm session with name 'foo' in conf file would conflict with REST-API perm session with the same name.
  • shall deleting of perm session is allowed ?
  • shall updating of perm sessions is allowed ?
@pdambrauskas
Copy link
Collaborator

Souds like a reasonable idea.

There are some thoughts on the issues you have listed:

a perm session with name 'foo' in conf file would conflict with REST-API perm session with the same name.

permanent sessions and all sessions in general are identified by their ids, same name can be used for multiple sessions. I think we allow to specify application id, when creating a new session/batch app, we should reject creation requests, when ids match. Curently it would probably be rejected by database conatraint violation.

shall deleting of perm session is allowed ?

hm, probably we should introduce a soft delete (deleted field on our database) to support it for both - preconfigured permanent sessions & permanent sessions created through the Rest API, otherwise preconfigured sessions would be recreated after restart, since we wouldnt be able to tell if they were deleted or not.

shall updating of perm sessions is allowed ?

I think in that case the user should delete a session and re-create a new one. Since Spark sessions cannot be modified at runtume, it would be nice to not suggest that it is possible throug our api endpoints :)

@jmilkiewicz
Copy link
Contributor Author

I started implementing this feature yesterday and faced a few challenges...
Given that the perm sessions can be created both in yaml and via REST API we will somehow have 2 sources of truth.

Even in the current implementation (yaml only) i see a following problem:

  • user defines perm session in yaml with id foo and conf parameter aaa -> bbb,
  • lighter is started and perm session defined in yaml is created,
  • user stops lighter ( perm session still goes on) and modifies conf parameter aaa -> ccc in yaml,
  • user starts lighter

According to what i see in a code this modification would be ignored but would give an impression to user that he/she made changes to perm session configuration

In case of allowing users to define perm sessions via REST api and yaml I can see even more hairy scenarios, like:

  • perm session created via REST , lighter stops, perm session with the same id created in yaml but with different params, lighter starts -> shall we throw exception on startup ?
  • perm session created via yaml, lighter starts, lighter stops, perm session removed from yaml, lighter starts -> shall we somehow remove this perm session ?
  • perm session created via yaml, lighter starts, perm session removed via REST, lighter stops, yaml update for perm session with different configuration params, lighter starts -> shall we ignore this modification ?

I am not sure if i am covering all scenarios above but valid implementation (and what's even more important sensible explanation in doc what is expected behavior) can be challenging... The simplest option i can think of now is to simply remove possibility to define perm sessions in yaml and rely on REST API. If REST API would only allow:

  • creating perm session,
  • deleting perm session

all problems mentioned above (plus implementation of soft delete) would be gone

@jmilkiewicz
Copy link
Contributor Author

A draft version in which we can have both, ie YAML and REST API for permanent sessions #748

@pdambrauskas
Copy link
Collaborator

Even in the current implementation (yaml only) i see a following problem:...

Yeah, Permanent session functionality was created with one specific use-case in mind, we didn't really thought, that others would find a use-case for it :). To recreate a session with new configuration we just deleted an existing one, and the Lighter took care of creating a new one with new submit params. That was enough in our case.

In case of allowing users to define perm sessions via REST api and yaml I can see even more hairy scenarios, like:...

maybe we should introduce additional configuration field for permanent sessions defined on yaml, something like creationMode: recreate | createOnMissing, so that the user defines, what should be done on lighter startup - weather the session should be recreated, or created only in cases, when session with the same id does not exist, I've seen this strategy on some other project.

Otherwise we'd have to somehow hash the submit params, and compare them with running sessions params, to know if the configs have changed, I'm not sure if thats worth the effort.

The simplest option i can think of now is to simply remove possibility to define perm sessions in yaml and rely on REST API.

@Minutis team is using permanent session configuration through the yaml file, I'm not sure if his team is willing to rework their solution :)

@jmilkiewicz
Copy link
Contributor Author

jmilkiewicz commented Nov 15, 2023

Yeah, Permanent session functionality was created with one specific use-case in mind, we didn't really thought, that others would find a use-case for it :). To recreate a session with new configuration we just deleted an existing one, and the Lighter took care of creating a new one with new submit params. That was enough in our case.

Manually deleting session, modifying perm session config and then restarting lighter sounds like a lot of work...

maybe we should introduce additional configuration field for permanent sessions defined on yaml, something like creationMode: recreate | createOnMissing, so that the user defines, what should be done on lighter startup - weather the session should be recreated, or created only in cases, when session with the same id does not exist, I've seen this strategy on some other project.

Given how it is implemented now why not give preferences to REST API in case of conflict ?

Otherwise we'd have to somehow hash the submit params, and compare them with running sessions params, to know if the configs have changed, I'm not sure if thats worth the effort.

The simplest option i can think of now is to simply remove possibility to define perm sessions in yaml and rely on REST API.

@Minutis team is using permanent session configuration through the yaml file, I'm not sure if his team is willing to rework their solution :)

Fine. That's why i introduced an option to define it via REST api and in case of conflict what you have in DB wins

@Minutis
Copy link
Member

Minutis commented Nov 16, 2023

Hello,

I think managing permanent session via API is a great idea. Although we should be able to think of a solution that would still allow us to create them not only via API but also in a declarative fashion using yaml as well. These two approaches serves two different use cases. One of them is when you know you need a specific limited permanent session pool and configuration hardly changes and another if you do not know the exact number of permanent sessions or when configurations is dynamic during runtime. I understand that @jmilkiewicz has use cases where he needs to dynamically manage permanent sessions and it sounds like a great new feature on the current permanent session implementation. But it would be best to avoid cluttering or adding limitations to other features while introducing new functionality.

Of course that current permanent session implementation is far from perfect needing to manually delete the job in case any configuration changes. But the same problem applies to API managed permanent sessions since you still need to kill existing session and create a new one to apply any parameters changes.

That being said, this is how I imagine we could get the best of both worlds - declarative and API managed permanent sessions:

  • API should not be able to mutate the permanent sessions that are defined in a declarative fashion.
  • Declarative permanent sessions should be fixed so that in case the parameters change (after the Lighter restart) we should automatically kill and re-create the session with new params to avoid current situation that @jmilkiewicz described.

If we agree on these two points we avoid all the hassle that is possible between API and declarative permanent sessions. Also both of the approaches do not limit each other.

What you guys think? @pdambrauskas @jmilkiewicz

@jmilkiewicz
Copy link
Contributor Author

jmilkiewicz commented Nov 16, 2023

I understand that @jmilkiewicz has use cases where he needs to dynamically manage permanent sessions and it sounds like a great new feature on the current permanent session implementation.

My use-case is slightly totally different. I wanna have option for

  • session pooling,
  • ad-hoc session creation.
    As both of them will require a few dozens of configuration params i do not wanna to keep them in 2 places:
  • in my app
  • in a lighter yaml

@Minutis I already started with a simple implementation #748 in which BD wins.

That being said, this is how I imagine we could get the best of both worlds - declarative and API managed permanent sessions:

  • API should not be able to mutate the permanent sessions that are defined in a declarative fashion.

It shall be easy. ALL yaml defined perm sessions which ARE not in DB will be inserted to DB, so on REST call to create permanent session with the same ID which is already in DB we can reject the request - probably CONFLICT

  • Declarative permanent sessions should be fixed so that in case the parameters change (after the Lighter restart) we should automatically kill and re-create the session with new params to avoid current situation that @jmilkiewicz described.

It does not solve the whole issue: what if you remove permanent session from YAML which had been created (and started before), how can you track it ?
My solution (albeit not being perfect) assumes that one can delete (soft delete) permanent session via REST and DB is source of truth. Have a look at https://github.com/exacaster/lighter/blob/183ceee506b59652875665b39a755a0ecb5f288a/server/src/test/groovy/com/exacaster/lighter/application/sessions/PermanentSessionHandlerTest.groovy for details

@Minutis
Copy link
Member

Minutis commented Nov 16, 2023

It does not solve the whole issue: what if you remove permanent session from YAML which had been created (and started before), how can you track it ?

This is a good point. Haven't thought about it.

Although a bigger issue that I see with declarative permanent sessions being saved in the same fashion as dynamic permanent sessions is that you should not be able to mutate or manage them in any other way but in the yaml itself. I do not see any good reason to have a permanent session defined in yaml and have it's params out of sync with what is actually running.

Not the ideal solution but in that case maybe we could have a trait in the database where we would see if permanent session was started via API or via yaml? In that case we would cover the point regarding the deletion of permanent session in case it was no longer defined in yaml after Lighter restart? What do you think?

Again I would like to separate management logic for declarative and API managed permanent session to a point because although it's the same type of object but lifetime and management should be treated differently.

@jmilkiewicz
Copy link
Contributor Author

Not the ideal solution but in that case maybe we could have a trait in the database where we would see if permanent session was started via API or via yaml? In that case we would cover the point regarding the deletion of permanent session in case it was no longer defined in yaml after Lighter restart? What do you think?

I see why not, on the other case for me it would be v2 of this feature. For now i will go for much simpler solution - DB wins. It is still a big progress comparing what is now.
@Minutis have you had a change to look at the cases expressed in https://github.com/exacaster/lighter/blob/183ceee506b59652875665b39a755a0ecb5f288a/server/src/test/groovy/com/exacaster/lighter/application/sessions/PermanentSessionHandlerTest.groovy ?

@pdambrauskas
Copy link
Collaborator

Addressed by #748

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants