client::page_service: fails to parse Error tag #6298

Closed
Tracked by #5771
problame opened this issue Jan 8, 2024 · 0 comments · Fixed by #6302

problame commented Jan 8, 2024

2024-01-08T17:31:51.614277Z  INFO number of timelines:
200
2024-01-08T17:31:53.906346Z  INFO RPS: 707
2024-01-08T17:31:54.907109Z  INFO RPS: 11055
2024-01-08T17:31:55.908896Z  INFO RPS: 17344
2024-01-08T17:31:56.909262Z  INFO RPS: 23668
thread 'tokio-runtime-worker' panicked at pageserver/pagebench/src/cmd/getpage_latest_lsn.rs:342:14:
called `Result::unwrap()` on an `Err` value: getpage for a1815f4d2223f3c0cc0a6ef65aafdb47/0bed65b1d81dc8456b3b24bade5939bc

Caused by:
    remaining bytes in msg with tag=103: 36

Stack backtrace:
   0: pageserver_api::models::PagestreamBeMessage::deserialize
             at ./libs/pageserver_api/src/models.rs:832:13
   1: pageserver_client::page_service::PagestreamClient::getpage::{{closure}}
             at ./pageserver/client/src/page_service.rs:130:19
      pagebench::cmd::getpage_latest_lsn::client::{{closure}}::{{closure}}
             at ./pageserver/pagebench/src/cmd/getpage_latest_lsn.rs:340:14
      <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at /home/admin/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/instrument.rs:272:9
   2: pagebench::cmd::getpage_latest_lsn::client::{{closure}}
             at ./pageserver/pagebench/src/cmd/getpage_latest_lsn.rs:317:1

Corresponding pageserver log (I think the specific kind of error shouldn't really matter):

2024-01-08T17:31:56.988380Z  INFO page_service_conn_main{peer_addr=127.0.0.1:46556}:process_query{tenant_id=a1815f4d2223f3c0cc0a6ef65aafdb47 timeline_id=0bed65b1d81dc8456b3b24bade5939bc}:handle_pagerequests:handle_get_page_at_lsn_request{rel=1663/5/2608 blkno=3 req_lsn=0/1504210}: walredo failed, path:
- layer traversal: result Continue, cont_lsn 0/1504199, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001504199-0000000001504211
- layer traversal: result Continue, cont_lsn 0/1504121, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001504121-0000000001504199
- layer traversal: result Continue, cont_lsn 0/15040A9, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000015040A9-0000000001504121
- layer traversal: result Continue, cont_lsn 0/1504031, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001504031-00000000015040A9
- layer traversal: result Continue, cont_lsn 0/1503FA1, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001503FA1-0000000001504031
- layer traversal: result Continue, cont_lsn 0/1503F29, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001503F29-0000000001503FA1
- layer traversal: result Continue, cont_lsn 0/1503EB1, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001503EB1-0000000001503F29
- layer traversal: result Complete, cont_lsn 0/14DC061, layer: /mnt/test_output/test_getpage_throughput/repo/pageserver_1/tenants/a1815f4d2223f3c0cc0a6ef65aafdb47/timelines/0bed65b1d81dc8456b3b24bade5939bc/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000014DC061-0000000001503EB1

2024-01-08T17:31:56.988488Z ERROR page_service_conn_main{peer_addr=127.0.0.1:46556}:process_query{tenant_id=a1815f4d2223f3c0cc0a6ef65aafdb47 timeline_id=0bed65b1d81dc8456b3b24bade5939bc}:handle_pagerequests:handle_get_page_at_lsn_request{rel=1663/5/2608 blkno=3 req_lsn=0/1504210}: error reading relation or page version: Failed to reconstruct a page image:: launch walredo process: spawn process: Too many open files (os error 24)
@problame problame added the c/storage/pageserver Component: storage: pageserver label Jan 8, 2024
@problame problame self-assigned this Jan 8, 2024
problame added a commit that referenced this issue Jan 8, 2024
Before this PR, we wouldn't advance the underlying `Bytes`'s cursor.

fixes #6298
problame added a commit that referenced this issue Jan 9, 2024
Before this PR, we wouldn't advance the underlying `Bytes`'s cursor.

fixes #6298
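
To illustrate the failure mode described in the commit message, here is a minimal, std-only sketch. It is not the actual code in `libs/pageserver_api/src/models.rs`: the function `parse_error`, the `advance_cursor` flag, and the use of a plain `&[u8]` slice in place of `Bytes` are all assumptions made for illustration. The tag value 103 is taken from the panic message above and is assumed to be the Error tag, as the issue title suggests. The idea is that the parser reads a NUL-terminated string out of the buffer but, unless the cursor is advanced past it, the final "no remaining bytes" check still sees the whole string.

```rust
use std::ffi::CStr;

// Assumption: 103 is the Error tag, as implied by the panic message in this issue.
const TAG_ERROR: u8 = 103;

/// Hypothetical parser for an error response: one tag byte followed by a
/// NUL-terminated UTF-8 message. `advance_cursor = false` mimics the behavior
/// before the fix; `true` mimics the behavior after it.
fn parse_error(msg: &[u8], advance_cursor: bool) -> Result<String, String> {
    // Consume the tag byte; `rest` plays the role of the buffer's cursor.
    let (&tag, mut rest) = msg
        .split_first()
        .ok_or_else(|| "empty message".to_string())?;
    if tag != TAG_ERROR {
        return Err(format!("unexpected tag={tag}"));
    }

    // Read the NUL-terminated string. This only looks at the bytes; it does
    // not move `rest` forward on its own.
    let cstr = CStr::from_bytes_until_nul(rest).map_err(|e| e.to_string())?;
    let text = cstr.to_str().map_err(|e| e.to_string())?.to_owned();

    if advance_cursor {
        // The fix: skip past the string and its NUL terminator.
        rest = &rest[cstr.to_bytes_with_nul().len()..];
    }

    // Trailing-bytes check, modeled on the error text in the backtrace above.
    if !rest.is_empty() {
        return Err(format!(
            "remaining bytes in msg with tag={tag}: {}",
            rest.len()
        ));
    }
    Ok(text)
}
```

With a message consisting of the tag byte followed by `b"too many open files\0"`, calling `parse_error(&msg, false)` fails with a "remaining bytes" error even though the message is well-formed, while `parse_error(&msg, true)` returns the error text.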