Performance testing triplestores for Islandora community recommendations #30
Comments
Nice, @ruebot. I would also like to add some requirements for the possible candidates:
Quick list from google
Some existing work on benchmarks
Wanna throw BlazeGraph into the mix: http://www.blazegraph.com/ . It's what Wikipedia is using.
Nice addition, @daniel-dgi. BlazeGraph looks really good. ++ for testing that one first.
Shall we identify benchmarks and datasets from this RdfStoreBenchmarking list? Maybe we can coordinate with the Fedora community and get some input there as well? looks at @awoods
It would be good to identify the usage characteristics and expectations of the community in order to ensure that we are looking at the right metrics. As a side note, I believe @no-reply at DPLA is also planning such an analysis. Maybe we can extend the coordination.
Hi, do we have some stats on how many triples we will get for every FF object?
No, but that should be easy to determine. My guess is 20.
OK, that's less than what we have now in Fedora 3. A simple object with RELS-EXT + a full DC document gives me about 30.
You will want to check what the F4 triples look like from your specific data, of course. I was just throwing out a guess. 30 may be closer to the truth.
Thanks, @awoods! I just want to try to infer what the reality will be for the largest (and ever-growing) Islandora implementations we have in the community. @ruebot, do you think we could run a quick and dirty poll about this on the Google group? Something like "How many objects are you handling right now, and how fast are you growing every year?" I have read in the group about repos with over 250,000 objects. At about 30 triples per object, that's 7,500,000 triples. We could use this as a basis to identify the usage "characteristics and expectations", as @awoods correctly stated.
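The arithmetic above can be sketched in a few lines of Python. This is only a rough sizing aid, not a measurement: the function names are made up for illustration, the 20 and 30 triples-per-object figures are the guesses from this thread, and the growth rate is a placeholder for whatever a community poll would actually report.

```python
def estimate_triples(object_count: int, triples_per_object: int) -> int:
    """Rough triple count: number of objects times triples per object."""
    return object_count * triples_per_object


def project_triples(object_count: int, triples_per_object: int,
                    annual_growth_rate: float, years: int) -> int:
    """Project the triple count after `years` of compound object growth.

    `annual_growth_rate` is a fraction (0.10 means 10% more objects per year).
    It is an assumed input; the thread does not report any real growth figures.
    """
    projected_objects = object_count * (1 + annual_growth_rate) ** years
    return round(projected_objects * triples_per_object)


if __name__ == "__main__":
    # 250,000 objects at the two per-object guesses mentioned in the thread.
    for per_object in (20, 30):
        total = estimate_triples(250_000, per_object)
        print(f"{per_object} triples/object -> {total:,} triples")
```

Feeding real poll answers (current object count and yearly growth) into `project_triples` would give a target dataset size for whichever benchmark gets picked.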
Looks like LUBM (http://swat.cse.lehigh.edu/projects/lubm/) is a standard set of test data and tools used for benchmarking triple stores. At least Oracle thinks so!
FYI: OpenLink Virtuoso (I believe) is also used by the OSF for Drupal project.
Nice, Donald! OSF for Drupal looks like a nice addition. Reading quickly through the documentation, I see there are a lot of things we could do without having to write custom code, even importing whole ontologies. Also, version 3.2 does not require Virtuoso anymore; you can use any triple store, which is even better. Thanks a lot, this could be the bridge that brings Linked Data to Drupal.
This could be done as Fedora community Performance Scaling & Testing; relevant agenda item from this meeting.
Because sometimes we have a conversation on Twitter a year or so later: ...and a document now thanks to @cmh2166
For Ruby users, I've done some initial work on a benchmark suite for ruby-rdf at: https://github.com/ruby-rdf/rdf-benchmark My hope is that this will become a general purpose benchmark for RDF.rb, using the Berlin Benchmark data generator. It's early days, still, but the work might have more general usefulness.
Blazegraph GPU on AWS EC2 G2 Family :-)
Stardog is not open source, although in my experience @kendall at @Complexible is approachable and very willing to have discussions about favorable licensing terms. I had that experience in the context of work I did for @ddavis at @Smithsonian, so YMMV. |