Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

One app to rule them all #82

Merged
merged 19 commits into from
Dec 9, 2021
Merged

One app to rule them all #82

merged 19 commits into from
Dec 9, 2021

Conversation

whikloj
Copy link
Member

@whikloj whikloj commented Oct 19, 2021

GitHub Issue: Islandora/documentation#731

What does this Pull Request do?

Moves Alpaca from an OSGI bundle to a single runnable jar with all the various services inside.

What's new?

  • Removes OSGI bindings

  • Pulls SparqlUpdateProcessors in from fcrepo-camel code (it was causing problems with Jena libraries otherwise)

  • Create single runnable jar that takes configuration from a properties file.

  • Does this change require documentation to be updated? Docs are updated

  • Does this change add any new dependencies? No

  • Does this change require any other modifications to be made to the repository (i.e. Regeneration activity, etc.)? No

  • Could this change impact execution of existing code? Deployment will need to be altered.

How should this be tested?

Using a playbook or ISLE instance, build the jar and then copy it into the system (somewhere it can access Fedora, Blazegraph, ActiveMQ, the microservices).

Shutdown Karaf and copy the settings from all the various configuration files to a single properties file.

Start the jar with that properties file and it should perform all the various actions.

ie.

java -jar islandora-alpaca-app-2.0.0-all.jar -c config.properties

Additional Notes:

There are some stacktraces thrown that I left in place but are not actually a problem (for instance when you choose a media file, Drupal triggers us to send an event but it doesn't have all the necessary URLs yet. It can be ignored but could also be a sign of a problem.

Could leave the less verbose in the INFO level and move the stacktrace to the DEBUG logging level, but I don't want anyone to miss an actual problem. 🤷

Interested parties

@Islandora/8-x-committers (especially those you don't hate Java or at least will do Java, @dannylamb, @elizoller, @adam-vessey, etc)

@codecov
Copy link

codecov bot commented Oct 19, 2021

Codecov Report

Merging #82 (2b5e23d) into 2.x (1bb1918) will decrease coverage by 11.72%.
The diff coverage is n/a.

Impacted file tree graph

@@              Coverage Diff              @@
##                2.x      #82       +/-   ##
=============================================
- Coverage     89.80%   78.07%   -11.73%     
- Complexity       79      143       +64     
=============================================
  Files            11       30       +19     
  Lines           304      561      +257     
  Branches          1       27       +26     
=============================================
+ Hits            273      438      +165     
- Misses           30      101       +71     
- Partials          1       22       +21     
Impacted Files Coverage Δ
...lpaca/indexing/triplestore/TriplestoreIndexer.java 96.22% <0.00%> (-0.33%) ⬇️
...aca/http/client/StaticTokenRequestInterceptor.java 73.33% <0.00%> (ø)
...paca/connector/derivative/DerivativeConnector.java 100.00% <0.00%> (ø)
...java/ca/islandora/alpaca/support/event/AS2Url.java
.../islandora/alpaca/support/event/AS2Attachment.java
...a/ca/islandora/alpaca/support/event/AS2Object.java
...ora/alpaca/support/event/AS2AttachmentContent.java
...va/ca/islandora/alpaca/support/event/AS2Event.java
...va/ca/islandora/alpaca/support/event/AS2Actor.java
.../connector/derivative/RequestConfigConfigurer.java
... and 27 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1bb1918...2b5e23d. Read the comment docs.

@elizoller
Copy link
Member

can you be more specific about

Start the jar with that properties file
?
like literally java -jar ... or do i need to run it some other way?

@whikloj
Copy link
Member Author

whikloj commented Oct 20, 2021

Sorry @elizoller I didn't realize I hadn't put that in here.

I did update the README which (hopefully) has all the needed information. https://github.com/Islandora/Alpaca/blob/d92362fd418cd5ad2d306ef7901a536bd555907d/README.md

README.md Show resolved Hide resolved
@whikloj
Copy link
Member Author

whikloj commented Oct 22, 2021

Sorry folks I decided to pull in the changes from #66 as well. This should be ready for a full test.

@elizoller
Copy link
Member

I feel like I can't really speak to the code part of this directly but I can provide feedback on what I did to test this PR.
I pulled in this PR in alpaca (in a playbook) and used the gradle command in the readme to build it. Then I started the jar with the commands provided in the readme with the provided examples.properties file. Then I went into the playbook and created some items and media. I verified the following functionality works: indexing items and media in fedora, deleting items removes them from fedora, updating an item creates a version in fedora, homarus/ffmpeg derivs generated, houdini/imagemagick derivatives generated, hypercube/tesseract/pdf full text extracted, crayfits/fits derivatives generated.

@whikloj
Copy link
Member Author

whikloj commented Oct 27, 2021

No one can speak to Java code.

Like all code it is best glimpsed in one's peripheral vision. It is the fairy lights in the trees, the oasis on the horizon, it is real but subjective.

and like that...he's gone

@seth-shaw-unlv
Copy link
Contributor

Other than my question about connectionClose (see above), I don't see any obvious issues. I think we can probably trust your smoke tests on this, @elizoller.

@seth-shaw-unlv
Copy link
Contributor

Okay, as far as I'm concerned, if @elizoller is happy with his tests and @adam-vessey has his questions adequately addressed, 👍.

@adam-vessey adam-vessey changed the base branch from 1.x to 2.x October 27, 2021 17:26
@whikloj
Copy link
Member Author

whikloj commented Nov 3, 2021

Sorry for the delay on this, I have reverted the URL changes and then I had to work out a way of testing as the old way did not seem to work.

Copy link
Contributor

@adam-vessey adam-vessey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed PMD flagging things... should we be introducing/adding in some Github actions that check for these?

@DonRichards
Copy link
Member

@whikloj Sorry, I'm about to show how little I understand Alpaca and this pull request but is it possible to have gradle do the "build the jar and then copy it into the system (somewhere it can access Fedora, Blazegraph, ActiveMQ, the microservices)."?

@whikloj
Copy link
Member Author

whikloj commented Nov 30, 2021

Ok there are still 2 PMD rule issues we need to make a decision about, but otherwise they are cleaned up.

@DonRichards once Alpaca is finalized we can deploy a build jar to Maven and Github then people deploying it can just download the pre-built jar instead of having to compile it themselves.

@adam-vessey adam-vessey self-assigned this Dec 1, 2021
whikloj and others added 2 commits December 3, 2021 10:38
* Change up instantiation/initialization slightly.

* PostConstruct is allowed to apply to private methods...

... seems like a false-positive in PMD? Got example exception from:
https://stackoverflow.com/a/48679770

* Changing to camel's 3.7.6 LTS.
@adam-vessey
Copy link
Contributor

So, did realize something with the concurrent consumer stuff: It wouldn't really allow multiprocessing without the additional asyncConsumer parameter... throwing together another little PR to introduce it: whikloj#2

@whikloj
Copy link
Member Author

whikloj commented Dec 6, 2021

So I think it depends on what you mean by multiprocessing. For instance in our Islandora Legacy I use camel to kick off tesseract, it will run concurrentConsumers equal to the number of cores - 1, but each message is processed completely before the a new one is started. I'm not against the PR for including asyncConsumer, I'm just not completely clear how it decides when to grab a new item to start processing. Perhaps this is where the difference between concurrentConsumers and maxConcurrentConsumers` comes in?

@adam-vessey
Copy link
Contributor

Possibly... my interpretation of concurrentConsumers vs maxConcurrentConsumers is somewhat analogous to the -Xms and -Xmx values to the JVM... initial value when starting, and the max to which things can grow if called for.

Without asyncConsumer, it seems like only a single thing for each queue can be processed at a time (per instance of consumers listening, anyway... as in, running multiple instances of the jar could allow for multiple, in a round-about way), but that said... and it's entirely possible that I'm missing something, but given how it's described, I'm not sure of the value of being able to specify concurrentConsumers without also specifying asyncConsumer=true offers much... maybe a alleviating some amount of overhead?

@whikloj
Copy link
Member Author

whikloj commented Dec 7, 2021

I think we are in somewhat agreement, the only difference (I think) is that I now believe that without asyncConsumers the concurrentConsumers would never rise from its initial value and maxConcurrentConsumers might be irrelevant.

I do know that you can set concurrentConsumers to a number greater than 1 and have multiple messages processed at the same time. Async is (I think) that once the message has been pulled from the JMS queue and handed off (maybe just to the route or maybe to a secondary route) then a new message is grabbed from the JMS queue. In normal processing it would wait until the first message completed the entire route.

I'll give your PR a build today and I have no issue adding it as a configurable option.

@adam-vessey
Copy link
Contributor

Hmm... yeah, giving things a further poke, and seeing multiple convert instances at a time, from Houdini; however... I wonder... is it because of the streaming? Not really familiar with Camel's lifecycle WRT JMS messages... wondering if it's more, when the to() gets a response, that it would acknowledge the message (removing from the queue, allowing another thing to be handled), even though it's not quite done of the message?

@whikloj
Copy link
Member Author

whikloj commented Dec 7, 2021

Yeah I think that is why they don't guarantee the order if you use asyncConsumers. If you have a route like

from("input")
.to("processing")
.to("output");

and queue with

A, B, C

with a single consumer but asyncConsumers=true then depending on how long each thing takes in processing the output endpoint could see A, C, B, C, A, B or C, B, A...

* Fix up warning about the unclosed app context.

* Slap together async-consumer stuff.

* Add separator for asyncConsumer parameter.

* Update example as suggested.
@adam-vessey
Copy link
Contributor

Since... I've now got code in this, I'm not really supposed to merge it, so... yeah... but I would if I could. :P

@adam-vessey adam-vessey removed their assignment Dec 8, 2021
@whikloj
Copy link
Member Author

whikloj commented Dec 8, 2021

Yeah we'll have to drag some other @Islandora/8-x-committers in to this

@elizoller
Copy link
Member

@whikloj / @adam-vessey - do you want me to give it another round of smoke tests?

@jordandukart
Copy link
Member

Great job @whikloj and @adam-vessey for working through this!

@elizoller
Copy link
Member

sounds like @alxp already got to it here: Islandora-Devops/islandora-playbook#208 (comment)

@whikloj
Copy link
Member Author

whikloj commented Dec 8, 2021

@elizoller if you're able to have a look, please do.

Copy link

@alxp alxp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question about the "solr" config comment.

connection.timeout=-1
socket.timeout=-1

# Solr indexer options
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "Fedora indexer options"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOH

public static void main(final String[] args) {
final AlpacaDriver driver = new AlpacaDriver();
final CommandLine cmd = new CommandLine(driver);
//cmd.setExecutionExceptionHandler(new AppExceptionHandler(driver));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented-out code should usually be removed.

@seth-shaw-unlv
Copy link
Contributor

@alxp , do you want to merge it? Or are you blocked due to codecov?

@alxp alxp merged commit 7bf983d into Islandora:2.x Dec 9, 2021
@whikloj whikloj deleted the one-app branch December 9, 2021 22:20
@whikloj
Copy link
Member Author

whikloj commented Dec 10, 2021

Thank you all for seeing this through. I'm going to try and release it, see how that goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants