-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow "non-repository" input to the analyzer #8803
Comments
What we have missed to discuss so far, but I believe is important to clarify is: |
Good point about a This relates a bit to @mnonnenmacher's questions about how follow-up ORT tools, that might run on a different machine / node, should get access to the provenance's / project's source code. |
Would you like me to incorporate those changes to |
I'm not sure. Your PR is already quite involving, so maybe we should not make it yet more complex, but roll out the changes in stages, if possible. |
Yes, I see the problem: either we have breaking changes after every small pull request, or we have a large pull request, which is impossible to review. Maybe we could create a feature branch for this topic and merge any PRs onto there first and only once it is finished you merge it into main? This way there would only be breaking changes in one release and you also had small PRs to review. FYI: I have taken a look at the changes required for |
@sschuberth Is there any news from the core developer meeting? |
No, because we had to skip it for this week due to appointment conflicts, sorry. |
Today, we had a meeting with @fviernau and discussed potential issues regarding the current refactoring approach. 1. Ambiguous
|
* Original VCS-related information as defined in the [Project]'s metadata. |
but the
vcs
from Repository
originates from the analyzer root in the working tree.ort/model/src/main/kotlin/Repository.kt
Line 32 in f3123a7
* Original VCS-related information from the working tree containing the analyzer root. |
Frank pointed out this ambiguity in the kdoc strings, which may have lead to inconsistent implementation regarding the source of the VcsInfo
objects.
Further, any LocalFileProvenance
would lack metadata and would not fit the given description. Frank suggests, the semantics of the VcsInfo
s source should be clarified, before further refactoring measures are taken.
2. KnownProvenance
sub-classes
Frank suggested to devide the KnownProvenance
class into two distinct sub-classes RemoteProvenance
and LocalProvenance
. This way a feature would be able to support RepositoryProvenance
and ArtifactProvenance
without the requirement to support all KnownProvenance
s including the new LocalProvance
s.
I believe that approach to be sensible. For one it separates the Provenance
s logically. It also allows granular control of which provenances are supported when, allowing distinction if necessary.
A scheme for such an inheritance could look like this:
- KnownProvenance
- RemoteProvenance
- VcsProvenance
- ArtifactProvenance
- LocalProvenance
FileProvenanceDirectoryProvenance
- RemoteProvenance
3. "Dummy" VersionControlSystem
Plugin
We also briefly talked about the alternative solution, @sschuberth already suggested before. Although it was not Frank's area of expertise, he assumes such an implementation would be less complex, as it only requires implementation of a single class with little to no interfaces to the rest of the code base.
I'm not sure I follow here about what the "ambiguity" is. Let me start by saying that I believe the properties to be documented correctly regarding their intent. That is, IMO the So for example, the On top of that, the That said, I believe the implementation of the above is partly wrong in some package managers. For example, we have The current
The proposed structure makes sense to me. |
Would that be something that needs to be fixed before our provenance implementation?
Sounds like the two are only indirectly related. |
IMO, no.
IMO, yes. |
This is until [1] is resolved. [1]: #8803 Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
This is until [1] is resolved. [1]: #8803 Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
Summarizing today's meeting between @sschuberth (ORT), @mnonnenmacher (ORT), @jens-erdmann (HELLA), and myself (HELLA). TL:DR: We are still looking to contribute on the 1. Ambiguous
|
In preparation to split the `KnownProvenance` hierarchy further, for now introduce a `RemoteProvenance` that bundles the `RepositoryProvenance` and `ArtifactProvance`. Later an additional `LocalProvance` will be introduced. See [1] for context. [1]: oss-review-toolkit#8803 (comment) Signed-off-by: Jens Keim <jens.keim@forvia.com>
I just officially published my Mock VCS Plugin for anyone to use. Any ORT users requiring the ability to scan non-version controlled, may use it as a temporary workaround, while we working on the clean provenance implementation. @heliocastro from #2896 I gather you might have use for the plugin. |
As ORT takes transparency and reproducibility of results serious, currently only local directories that are under version control can serve as the input to the analyzer. This is because for such "working trees" it is machine-readable where they originate from, and also the state of the source code is encoded into the VCS revision. If a remote VCS repository is to be analyzed, it either needs to be manually cloned with the respective VCS tool or the ORT Downloader first.
However, there are use-cases where the source code to analyze never was checked into version control, and committing it to a "fake" repository just to be able to analyze it defeats the purpose of capturing genuine provenance information.
Thus the proposal is to relax / extend the valid input types for the analyzer to the following:
Open questions:
Related issues / PRs:
The text was updated successfully, but these errors were encountered: