Settings for read optimized graph? #5403

Open
porscheme opened this issue Mar 15, 2023 · 4 comments
Labels
type/question Type: question about the product

Comments

@porscheme

General Question

Our graph is read-only:

  • We never do online UPSERTs
  • We only update through SST files on a weekly cadence

I wanted to know whether there are any read-optimized Nebula or RocksDB settings.
Currently, graph walks are very slow.
Below is our query:

MATCH (p:Student)
WHERE id(p) IN [ ... ]

OPTIONAL MATCH (p)<-[:HAS_COURSE]-(a:CourseCodes)
OPTIONAL MATCH (p)<-[:STUDENT_HAS_SOCIAL]-(s:Social)
OPTIONAL MATCH (p)<-[:HAS_PROCEDURE]-(r:ProcedureCodes)

WITH
  p,
  COLLECT(DISTINCT a.CourseCodes.CodeId) as courses,
  COLLECT(DISTINCT s.Social.AttributeId) as social,
  COLLECT(DISTINCT r.ProcedureCodes.CodeId) as procedures

RETURN 
  p.Student.StudentId as Student, 
  p.Student.Gender as sex, 
  p.Student.Race as race, 
  p.Student.Ethnicity as ethnicity, 
  p.Student.MaritalStatus as maritalstatus,
  courses,
  procedures,
  social
@porscheme porscheme changed the title Settings read optimized graph? Settings for read optimized graph? Mar 15, 2023
@porscheme
Author

Anyone?

@Sophie-Xie
Contributor

@yixinglu Pls take a look, thanks.

@Sophie-Xie Sophie-Xie added the type/question Type: question about the product label Mar 27, 2023
@yixinglu
Contributor

Sorry for the late reply.

There are some optimization options for performance tuning. You could try updating the following flags separately:

  1. nebula-graphd.conf
--optimize_appendvertices=true
--max_job_size=10
  2. nebula-storaged.conf
--query_concurrently=true

In addition, you can profile the query above in your environment to see where the bottleneck is at runtime.

By the way, which version of NebulaGraph are you using? If possible, upgrade to the latest version, since MATCH performance has been improved there.
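
For the profiling suggestion, a minimal sketch of what that might look like is to prefix the statement with `PROFILE`, which runs it and prints the execution plan with per-operator statistics. The statement below is a reduced, hypothetical variant of the original query (one `OPTIONAL MATCH` only, with the VID list elided as in the post), so that each hop can be measured separately; the aliases `student` and `courses` are illustrative.

```ngql
# Sketch only: run one hop at a time under PROFILE and compare operator timings.
# The VID list is left elided exactly as in the original post.
PROFILE
MATCH (p:Student)
WHERE id(p) IN [ ... ]
OPTIONAL MATCH (p)<-[:HAS_COURSE]-(a:CourseCodes)
RETURN id(p) AS student, COLLECT(DISTINCT a.CourseCodes.CodeId) AS courses;
```

Repeating this for `STUDENT_HAS_SOCIAL` and `HAS_PROCEDURE`, and then for the full query, should indicate whether the time goes to a single expensive hop or to combining the three expansions. `EXPLAIN` can be used in place of `PROFILE` to view the plan without executing it.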

@porscheme
Author

porscheme commented Mar 28, 2023

Our cluster configuration:
version: v3.4.0
metad: 3 (16 cores, 128 GB, 2 TB SSD)
graphd: 3 (16 cores, 128 GB, 2 TB SSD)
storaged: 5 (16 cores, 128 GB, 2 TB SSD)
Replica Factor: 3
VID: String 40 character size

We did see a performance improvement after setting the flags below, but it was not enough. Is there anything else we can do?

  • nebula-graphd.conf
--optimize_appendvertices=true
--max_job_size=10
  • nebula-storaged.conf
# This is turned on by default in Nebula v3.4.0; set explicitly just in case
--query_concurrently=true
