Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a GitHub actions build #25

Merged
merged 6 commits into from
Oct 9, 2024
Merged

Add a GitHub actions build #25

merged 6 commits into from
Oct 9, 2024

Conversation

timothyjward
Copy link
Contributor

No description provided.

Signed-off-by: Tim Ward <timothyjward@apache.org>
* Add the OSSRH repository
* Update to use Map<String, String> in launch properties

Signed-off-by: Tim Ward <timothyjward@apache.org>
Signed-off-by: Tim Ward <timothyjward@apache.org>
@timothyjward
Copy link
Contributor Author

This PR aims to publish the current snapshot so that it can be used in demos and elsewhere.

The overall set of changes seem to work, but there's a timing bug somewhere in the tests. @ideas-into-software can you look over the changes I've made, check that you're happy, and I'll try to stabilise things so that I can work on the demo code. Thanks!

@ideas-into-software
Copy link
Contributor

Hi @timothyjward

It looks like this is a limitation of number of cores on GitHub hosted runners available. This is a private repository on free plan, it appears, and such runners have only 2 virtual CPUs / only 2 threads ( https://github.blog/news-insights/product-news/github-hosted-runners-double-the-power-for-open-source/ ).

Programmatically launching and shutting down OSGi framework multiple times across test cases in suites run apparently does not keep up to number of threads available.

I did not see any of such issues in my local environment, where I have 6 CPUs / 12 threads and 64 GB of memory - otherwise, I would have fixed them or at least asked for your help.

Even temporarily removing bundle state assertions in com.kentyou.featurelauncher.impl.FeatureLauncherImplTest test cases ( i.e. assertEquals("ACTIVE", BundleStateUtil.getBundleStateString(bundles[1].getState()));, etc. ), where errors you observed were seen in several GitHub runs, did not make GitHub workflow pass, as it hit another issue when running integration tests ( com.kentyou.featurelauncher.impl.runtime.integration.FeatureRuntimeIntegrationTest ) - see logs of most recent runs for details.

I am sorry I can't help more on this - only increasing resources available ( number of vCPUs ) would make this pass, it seems.

@timothyjward timothyjward force-pushed the feature/actions branch 2 times, most recently from 440cb05 to 2ff93a3 Compare October 9, 2024 11:10
This commit uses JUnit's @tempdir annotation to provide a framework storage area per-test. It also removes the shutdown handler and simplifies framework cleanup a little.

Signed-off-by: Tim Ward <timothyjward@apache.org>
Signed-off-by: Tim Ward <timothyjward@apache.org>
@timothyjward timothyjward force-pushed the feature/actions branch 2 times, most recently from 53c87cb to 3c25b2b Compare October 9, 2024 14:10
@timothyjward
Copy link
Contributor Author

timothyjward commented Oct 9, 2024

I am sorry I can't help more on this - only increasing resources available ( number of vCPUs ) would make this pass, it seems.

With a little cleanup and some logging I was able to identify the root cause, namely the interaction of the Gogo shell with a headless build process. By configuring Gogo to be noshutdown we prevent it from stopping the framework during the middle of startup.

@ideas-into-software - I would like to merge this unless it will be excessively disruptive to you?

@ideas-into-software
Copy link
Contributor

Hi Tim,

This is a very constrained environment, and I do not know the internals as well as you do, but the exceptions thrown and apparent differences pointed at insufficient resources to handle this - as per my earlier message; apparently it is environment difference which causes this, as no such issues appear in less constrained environment. It's good to know gogo can be configured like this - I did not know this.

I see by commit history and changes in this PR:

  • two commits I made yesterday are missing for some reason, even though some stuff from those is there - what happened to those commits ? see screenshot attached;

  • in com.kentyou.featurelauncher.impl.FeatureLauncherImpl.LaunchBuilderImpl you removed createShutdownHook, so cleanup is only called on failure and there is no cleanup when framework starts with no problems and e.g. gogo shell is launched then user terminates with ^C, etc.; in numerous projects where there is programmatic interaction with OSGi framework I saw shutdown hooks used, specifically for this reason - was this causing issues as well ? see screenshot attached;

  • also in com.kentyou.featurelauncher.impl.FeatureLauncherImpl.LaunchBuilderImpl, I see call to #stopConfigurationAdminTracker() in finally block where #waitForConfigurationAdminTrackerIfNeeded is called - why is it not running until framework is stopped ? this results in configuration admin tracker to be immediately stopped right after it is started - is it deliberate ? see screenshot attached;

  • also, com.kentyou.featurelauncher.impl.FeatureLauncherImpl.LaunchBuilderImpl code is not formatted after these changes you applied and it contains unused imports;

  • lastly, in pom.xml I see you added bundles which are already part of features' definitions, and did not specify test scope for those - why are those necessary ? see screenshot attached;

You can obviously merge as you wish, but I wanted to clarify these items - does not have to happen first.

I am focusing on the CLI, so I will merge your changes before sharing new PR which contains it.


Screenshot from 2024-10-09 18-12-20
Screenshot from 2024-10-09 18-17-15
Screenshot from 2024-10-09 18-36-16
Screenshot from 2024-10-09 18-36-49
Screenshot from 2024-10-09 18-47-17
Screenshot from 2024-10-09 18-47-31
Screenshot from 2024-10-09 18-47-40
Screenshot from 2024-10-09 18-47-51

@timothyjward
Copy link
Contributor Author

Hi Tim,

This is a very constrained environment, and I do not know the internals as well as you do, but the exceptions thrown and apparent differences pointed at insufficient resources to handle this - as per my earlier message; apparently it is environment difference which causes this, as no such issues appear in less constrained environment. It's good to know gogo can be configured like this - I did not know this.

I was able to reliably reproduce by running mvn -B clean verify </dev/null which sets up a build with no stdin. This failed every time until I added the gogo configuration

I see by commit history and changes in this PR:

  • two commits I made yesterday are missing for some reason, even though some stuff from those is there - what happened to those commits ? see screenshot attached;

I cleared these commits as my understanding was that they contained temporary checks and you wanted them to be removed. I kept the id in the developers section of the pom.

  • in com.kentyou.featurelauncher.impl.FeatureLauncherImpl.LaunchBuilderImpl you removed createShutdownHook, so cleanup is only called on failure and there is no cleanup when framework starts with no problems and e.g. gogo shell is launched then user terminates with ^C, etc.; in numerous projects where there is programmatic interaction with OSGi framework I saw shutdown hooks used, specifically for this reason - was this causing issues as well ? see screenshot attached;

There were a few headaches here, the most significant being that there was no removal of the shutdown hook. Each test ended up with an additional hook registered and I kept seeing "odd" things at the end of all the tests where it tried to clean up a framework that had already been stopped (it was also a memory leak in the tests). This could be reinstated to simply stop the framework, as long as we remove the hook if the framework stops "normally". I don't see this as a big priority though.

  • also in com.kentyou.featurelauncher.impl.FeatureLauncherImpl.LaunchBuilderImpl, I see call to #stopConfigurationAdminTracker() in finally block where #waitForConfigurationAdminTrackerIfNeeded is called - why is it not running until framework is stopped ? this results in configuration admin tracker to be immediately stopped right after it is started - is it deliberate ? see screenshot attached;

Once the configurations are installed there shouldn't be any need to keep tracking should there?

  • also, com.kentyou.featurelauncher.impl.FeatureLauncherImpl.LaunchBuilderImpl code is not formatted after these changes you applied and it contains unused imports;

I've cleared the remaining import. I can't see any missing formatting

  • lastly, in pom.xml I see you added bundles which are already part of features' definitions, and did not specify test scope for those - why are those necessary ? see screenshot attached;

Good catch. I needed these at compile scope for debug breakpoints. That should be backed out.

Some of the Feature Launcher tests deploy the Gogo Shell. If the Gogo shell detects that stdin has closed then it shuts down the framework automatically. When running in CI (and elsewhere) we use batch mode with no terminal which disconnects the test run and gives it an empty stdin. This causes the tests to fail as gogo shuts things down before the test finishes. Instead we must configure gogo to be non-interactive during the tests.

Signed-off-by: Tim Ward <timothyjward@apache.org>
@timothyjward timothyjward merged commit 4d13cd8 into main Oct 9, 2024
1 check passed
@timothyjward timothyjward deleted the feature/actions branch October 9, 2024 17:30
@ideas-into-software
Copy link
Contributor

Hi Tim,

You merged this in the meantime, and I was focusing on delivering next PR ( #26 ) on time, so I only have time to reply now.

a) regarding stopping of FeatureLauncherConfigurationManager right after it is started - you replied (...) Once the configurations are installed there shouldn't be any need to keep tracking should there? (...); perhaps my understanding of 160 specification is wrong, but two sections ( 160.4.3.5 and 160.4.3.4 ) define this, specifically in 160.4.3.5 it mentions:

(...) The Feature Launcher implementation must delay returning control to the caller until all configurations have been created, subject to the timeout defined by CONFIGURATION_TIMEOUT. The default timeout is 5000 milliseconds, and it determines the maximum length of time that the Feature Launcher implementation should wait to begin creating the configurations. A value of -1 indicates that the Feature Launcher implementation must not wait, and must continue immediately, even if the configurations have not yet been created. If it is not possible to begin before the timeout expires then a LaunchException must be thrown. (...)

My understanding of CONFIGURATION_TIMEOUT is that it defines how long waiting for ConfigurationAdmin service should last - not after how long it should be stopped. What if service arrives and is in the middle of creating configurations / not finished yet ? Shouldn't there be some check of whether FeatureLauncherConfigurationManager is done with creating configurations before stopping it ?

b) regarding missing commits - there was actually some refactoring done as well, as I was right after delivering PR with FeatureRuntime implementation ( #24 ) and realized some things I did in FeatureLauncher could be done in a better way ( e.g. splitting into separate methods stopping and uninstalling, as well as checking if bundle is a fragment, etc. .. all of which I added in those commits );

c) regarding formatting and obsolete import - aside from that missing import you removed, there was also formatting issue; see screenshots attached - perhaps it's a matter of width set in Eclipse, mine is pre-set as standard, but there are clear differences, which is why I mentioned it;

d) regarding environment differences - thank you for the tip; I thought this was related to async issues I saw locally when several test cases are run / i.e. why I could not rely on waitForStop which I mentioned previously;


Screenshot from 2024-10-14 03-40-52
Screenshot from 2024-10-14 03-41-08
Screenshot from 2024-10-14 03-41-18

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants