[🐛 Bug]: org.openqa.selenium.NoSuchSessionException: Unable to find session with ID #14322

rishabhjain-qait · 2024-07-30T07:07:33Z

What happened?

I am intermittently getting org.openqa.selenium.NoSuchSessionException: Unable to find session with ID.

I have Selenium Grid version 4.21.0-20240517 up and running, with the following properties set for the browser pods:
TZ: "Asia/Kolkata"
SE_NODE_MAX_SESSIONS: "1"
SE_NODE_SESSION_TIMEOUT: "10800"
SE_NODE_OVERRIDE_MAX_SESSIONS: "true"
SE_SCREEN_HEIGHT: "1080"
SE_SCREEN_WIDTH: "1920"
SE_OPTS: "--log-level FINEST"

I am running one browser node per k8s pod, and I have autoscaling for the browser pods in place.

Autoscaling works fine, both upscaling and downscaling. The issue is not frequent, but it does occur from time to time and I am not sure why.

I am unable to reproduce it on demand. It is also not tied to a particular test: it shows up in different tests whenever it occurs.

I have integrated Jaeger with my Selenium Grid to look at traces and catch this kind of issue, but in the traces for this failure I don't see any localSessionMap.remove command being sent.

All I see is that at some point it suddenly threw a SessionNotAvailable exception. The session was working fine and was able to click on an element, and the next thing the trace shows is "Unable to find session with ID".
Adding screenshots of what I see in Jaeger.

Could you please help check what the reason for this issue could be, and whether there is a particular setting that needs to be changed to avoid it? Thanks in advance.

How can we reproduce the issue?

Adding the logs of what I see in my test output, and also the stack trace of the exception I see in Jaeger.

Relevant log output

Test Exception
 
Unable to find session with ID: 303f6c17713ba2fe4988d4ecd00194f5
Build info: version: '4.21.0', revision: '79ed462ef4'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '6.1.58+', java.version: '17.0.11'
Driver info: driver.version: unknown
Build info: version: '4.21.0', revision: '79ed462ef4'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-362.24.2.el9_3.x86_64', java.version: '11.0.12'
Driver info: org.openqa.selenium.remote.RemoteWebDriver
Command: [303f6c17713ba2fe4988d4ecd00194f5, get {url=https://space-prod0-automation.sprinklr.com/logout}]
Capabilities {acceptInsecureCerts: true, browserName: chrome, browserVersion: 125.0.6422.60, chrome: {chromedriverVersion: 125.0.6422.60 (3ac3319bff9f..., userDataDir: /tmp/.org.chromium.Chromium...}, fedcm:accounts: true, goog:chromeOptions: {debuggerAddress: localhost:34867}, goog:loggingPrefs: {browser: ALL}, networkConnectionEnabled: false, pageLoadStrategy: none, platformName: linux, proxy: Proxy(), se:bidiEnabled: false, se:cdp: wss://qa6-selenium-grid-soc..., se:cdpVersion: 125.0.6422.60, se:name: Governance_UI_Macro_Tests/164, se:vnc: wss://qa6-selenium-grid-soc..., se:vncEnabled: true, se:vncLocalAddress: ws://10.102.33.70:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: accept, webauthn:extension:credBlob: true, webauthn:extension:largeBlob: true, webauthn:extension:minPinLength: true, webauthn:extension:prf: true, webauthn:virtualAuthenticators: true}
Session ID: 303f6c17713ba2fe4988d4ecd00194f5
java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
org.openqa.selenium.remote.ErrorCodec.decode(ErrorCodec.java:167)
org.openqa.selenium.remote.codec.w3c.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:138)
org.openqa.selenium.remote.codec.w3c.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:50)
org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:190)
org.openqa.selenium.remote.TracedCommandExecutor.execute(TracedCommandExecutor.java:51)
org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:518)
org.openqa.selenium.remote.RemoteWebDriver.get(RemoteWebDriver.java:300)

Jaeger Exception

event	
exception
exception.message	
Unable to execute request for an existing session: Unable to find session with ID: 303f6c17713ba2fe4988d4ecd00194f5
Build info: version: '4.21.0', revision: '79ed462ef4'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '6.1.58+', java.version: '17.0.11'
Driver info: driver.version: unknown
exception.stacktrace	
org.openqa.selenium.NoSuchSessionException: Unable to find session with ID: 303f6c17713ba2fe4988d4ecd00194f5
Build info: version: '4.21.0', revision: '79ed462ef4'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '6.1.58+', java.version: '17.0.11'
Driver info: driver.version: unknown
	at org.openqa.selenium.grid.sessionmap.local.LocalSessionMap.get(LocalSessionMap.java:132)
	at org.openqa.selenium.grid.sessionmap.SessionMap.getUri(SessionMap.java:84)
	at org.openqa.selenium.grid.router.HandleSession.lambda$loadSessionId$4(HandleSession.java:223)
	at io.opentelemetry.context.Context.lambda$wrap$2(Context.java:224)
	at org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:180)
	at org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:397)
	at org.openqa.selenium.remote.http.Route.execute(Route.java:69)
	at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:360)
	at org.openqa.selenium.remote.http.Route.execute(Route.java:69)
	at org.openqa.selenium.grid.router.Router.execute(Router.java:87)
	at org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)
	at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)
	at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:360)
	at org.openqa.selenium.remote.http.Route.execute(Route.java:69)
	at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
	at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
	at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)
	at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
	at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)
	at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

Operating System

macOS

Selenium version

4.21.0-20240517

What are the browser(s) and version(s) where you see this issue?

Chrome

What are the browser driver(s) and version(s) where you see this issue?

ChromeDriver

Are you using Selenium Grid?

4.21.0-20240517

github-actions · 2024-07-30T07:07:53Z

@rishabhjain-qait, thank you for creating this issue. We will troubleshoot it as soon as we can.

Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

rishabhjain-qait · 2024-07-30T07:09:18Z

cc: @rookieInTraining

diemol · 2024-07-30T07:10:23Z

@VietND96 do you know?

VietND96 · 2024-07-30T07:17:53Z

> autoscaling works absolutely fine, both upscaling and downscaling,

May I know whether it is a ScaledObject or a ScaledJob?
If it is a ScaledObject, is the pod preStop hook executed to gracefully shut down the Node? If yes, how long is terminationGracePeriodSeconds set to, and is it long enough for the pod to stay in Terminating while it waits for the session to complete?

VietND96 · 2024-07-30T07:20:37Z

A similar error was also discussed here: SeleniumHQ/docker-selenium#2129 (comment)

rishabhjain-qait · 2024-07-30T07:31:08Z

hey @VietND96
thanks for looking at the issue,

I am not using KEDA for autoscaling. I have written a small Spring Boot application that does this work for me.

I drain a node in order to scale down whenever any of the Grid nodes has 0 sessions running.
Drain Node: https://www.selenium.dev/documentation/grid/advanced_features/endpoints/
Node drain command is for graceful node shutdown. Draining a Node stops the Node after all the ongoing sessions are complete. However, it does not accept any new session requests.

curl --request POST 'http://localhost:4444/se/grid/distributor/node/<node-id>/drain' --header 'X-REGISTRATION-SECRET;'
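
For readers driving the drain endpoint from a Java scaler, the request quoted above could be built roughly as follows. This is only a sketch: the class name `DrainRequests`, the method name, and its parameters are hypothetical and not taken from the scaler discussed in this thread. It uses the JDK's built-in `java.net.http` API and only constructs the request, so nothing is sent until a client is asked to send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; names are illustrative, not part of Selenium or the reporter's scaler.
final class DrainRequests {
    private DrainRequests() {}

    /**
     * Builds the POST request for the Grid drain endpoint.
     * gridUrl is the Hub/Distributor base URL, nodeId is the Node's ID as reported by the Grid,
     * registrationSecret is the value for the X-REGISTRATION-SECRET header.
     */
    static HttpRequest drain(String gridUrl, String nodeId, String registrationSecret) {
        // Drop a trailing slash so the path is not doubled up.
        String base = gridUrl.endsWith("/")
            ? gridUrl.substring(0, gridUrl.length() - 1)
            : gridUrl;
        return HttpRequest.newBuilder()
            .uri(URI.create(base + "/se/grid/distributor/node/" + nodeId + "/drain"))
            .header("X-REGISTRATION-SECRET", registrationSecret)
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    }
}
```

The resulting request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; checking the response status before treating the node as draining is left to the caller.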

VietND96 · 2024-07-30T07:31:17Z

Also, can you try upgrading the Docker image to tag 4.23.0-20240727 (Helm chart 0.33.0)? It contains the fix #14282 for a race condition in which a session can be assigned to a Node in DRAINING status.

VietND96 · 2024-07-30T07:37:52Z

> I am Draining the node in order to scale down if any of the nodes of sel grid is having 0 sessions running

Do you guard against the case where, at some point in time, 0 sessions are running and a node drain is triggered, but new requests suddenly arrive? Or where the drain and new requests arrive at the same time?

VietND96 · 2024-07-30T07:43:11Z

Also, I assume you rely on the GraphQL endpoint to get the number of running sessions. Suppose there is a glitch and the response returns an error. In that case, how does the script make its decision? Does it assume 0 and trigger the scale-down, or does it retry before deciding?

rishabhjain-qait · 2024-07-30T07:51:31Z

> > I am Draining the node in order to scale down if any of the nodes of sel grid is having 0 sessions running
>
> Do you guard against the case where, at some point in time, 0 sessions are running and a node drain is triggered, but new requests suddenly arrive? Or where the drain and new requests arrive at the same time?

https://www.selenium.dev/documentation/grid/advanced_features/endpoints/
As mentioned there, once a node is set to draining, no new requests will come to that node. Ideally, once its session finishes, a new node is spawned that can take any new requests present in the session queue, as per the autoscaling logic I have written.

Ideally, the node that is draining should not take any new requests and should be killed as soon as its current session completes.

> Also, I assume you rely on the GraphQL endpoint to get the number of running sessions. Suppose there is a glitch and the response returns an error. In that case, how does the script make its decision? Does it assume 0 and trigger the scale-down, or does it retry before deciding?

I haven't observed the GraphQL endpoint returning an error so far. If it did, the script would not assume 0 and scale down. Instead it would break out of the logic, hit the same GraphQL endpoint again after another 10 seconds to get the status, and then decide whether it needs to scale up or down.
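
The behaviour described here (a failed GraphQL query is never read as "0 sessions") might be expressed along these lines. It is a minimal sketch under assumptions: `ScaleDecision`, `Action`, and `decide` are hypothetical names, and the real scaler's logic is not shown in this thread. An empty `OptionalInt` stands for a query that failed or returned an error.

```java
import java.time.Duration;
import java.util.OptionalInt;

// Hypothetical sketch of the decision step; not the reporter's actual code.
final class ScaleDecision {
    private ScaleDecision() {}

    enum Action { SCALE_DOWN, WAIT_AND_RETRY, KEEP }

    // Polling interval mentioned in the thread (10 seconds).
    static final Duration RETRY_INTERVAL = Duration.ofSeconds(10);

    /**
     * sessionCount is empty when the GraphQL query failed or returned an error,
     * otherwise it holds the number of sessions running on the node.
     */
    static Action decide(OptionalInt sessionCount) {
        if (!sessionCount.isPresent()) {
            // Unknown state: do not assume 0, poll again after RETRY_INTERVAL.
            return Action.WAIT_AND_RETRY;
        }
        return sessionCount.getAsInt() == 0 ? Action.SCALE_DOWN : Action.KEEP;
    }
}
```

Keeping "unknown" as a separate outcome from "0" is the point of the sketch; it does not address the separate race in which the Hub assigns a session to a node that is already draining.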

VietND96 · 2024-07-30T08:17:44Z

> As mentioned there, once a node is set to draining, no new requests will come to that node.

I don't think the scaler is able to guard against this, since the Hub decides where to assign a session. So please try the new fix I mentioned and see whether it prevents a DRAINING node from picking up a new session.

> Ideally, once its session finishes, a new node is spawned that can take any new requests present in the session queue, as per the autoscaling logic I have written.

Again, a question about the scaler. Once the session is finished, how does the scaler scale down? Does it choose exactly which pod will be scaled down, or is the pod selected at random?

rishabhjain-qait · 2024-07-30T09:21:39Z

hey @VietND96

Yes, the scaler determines exactly which pod is to be scaled down; it does not select one at random.

For the pod that needs to be scaled down, I update only that pod's deletion cost with the payload below,
String payload = "{ \"metadata\": { \"annotations\": { \"controller.kubernetes.io/pod-deletion-cost\": \"-1\" } } }";

and then scale down, to ensure that the correct pod is removed and not any other.
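
As an illustration of how such a payload could be applied, the sketch below builds a merge-patch request against the Kubernetes pods endpoint. Everything except the annotation key and the "-1" value is an assumption: the thread does not say how the patch is sent, and `PodDeletionCost`, `apiServer`, and `bearerToken` are hypothetical names. It only constructs the request with the JDK's `java.net.http` API.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; the reporter's actual client code is not shown in the thread.
final class PodDeletionCost {
    private PodDeletionCost() {}

    static final String ANNOTATION = "controller.kubernetes.io/pod-deletion-cost";

    /** JSON merge patch that sets the deletion-cost annotation (the value must be a string). */
    static String mergePatch(int cost) {
        return "{\"metadata\":{\"annotations\":{\"" + ANNOTATION + "\":\"" + cost + "\"}}}";
    }

    /** PATCH request for one pod; apiServer and bearerToken are assumed inputs. */
    static HttpRequest patchRequest(String apiServer, String namespace, String pod,
                                    String bearerToken, int cost) {
        return HttpRequest.newBuilder()
            .uri(URI.create(apiServer + "/api/v1/namespaces/" + namespace + "/pods/" + pod))
            .header("Content-Type", "application/merge-patch+json")
            .header("Authorization", "Bearer " + bearerToken)
            .method("PATCH", HttpRequest.BodyPublishers.ofString(mergePatch(cost)))
            .build();
    }
}
```

A lower deletion cost makes the annotated pod the preferred candidate when its ReplicaSet is scaled down, which is why the patch is applied before the replica count is reduced.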

joerg1985 · 2024-08-02T12:20:09Z

@rishabhjain-qait Is this happening shortly after the session is started?
A small delay in processing the NodeRestartedEvent might cause this trouble.

edsherwin · 2024-08-27T10:17:42Z

@rishabhjain-qait have you resolved your issue with KEDA? If yes, can you please share how? Thanks

diemol · 2024-11-05T11:39:37Z

I will close this as the issue has not had any more activity.

github-actions · 2024-12-05T22:49:53Z

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.

rishabhjain-qait added I-defect needs-triaging labels Jul 30, 2024

joerg1985 mentioned this issue Aug 2, 2024

[grid] ensure the local_sessionmap.remove event is raised #14337

Merged

8 tasks

abhi-iac mentioned this issue Sep 2, 2024

[🐛 Bug]: Grid is going down everyday and I need to manually restart the hub #14467

Closed

joerg1985 mentioned this issue Oct 16, 2024

[🐛 Bug]: org.openqa.selenium.WebDriverException: Unable to route (POST) /session/0405a24bf1d06c6ae8402b8d12027934/... #13769

Closed

diemol closed this as not planned Nov 5, 2024

github-actions bot locked and limited conversation to collaborators Dec 5, 2024
