-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent duplicate and invalid titles #6326
base: main
Are you sure you want to change the base?
Conversation
e8c63dd
to
f23289c
Compare
@@ -367,7 +367,7 @@ public enum ParameterCore implements ParameterInterface { | |||
/** | |||
* Validation of process title via regular expression. | |||
*/ | |||
VALIDATE_PROCESS_TITLE_REGEX(new Parameter<>("validateProzessTitelRegex", "[\\w-]*")), | |||
VALIDATE_PROCESS_TITLE_REGEX(new Parameter<>("validateProzessTitelRegex", "^[a-zA-Z0-9_-]+$")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the new default value not even updated in the kitodo_config.properties
files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering: This is a hard limitation of the Kitodo software, some titles lead to problems later. Should we allow to change that in the config.properties at all? Or should we hardcode that in the application? Additionally one can think of allowing additional regexes which might be checked on top of the minimal requirements. See #6055
cc @solth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At SLUB we allow even the .
inside the process title as for historical reasons this was needed to create the processes. Maybe this can be changed now but this should be discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more note: the w
meta character in regular expressions stands for a-z, A-Z, 0-9, including _ (underscore)
. So the expression can be simplified to ^[\\w-]+$
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about chosing a new parameter name "validateProcessTitle" which avoids the current "denglisch" (English-German mix)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more note: the
w
meta character in regular expressions stands fora-z, A-Z, 0-9, including _ (underscore)
. So the expression can be simplified to^[\\w-]+$
?
Thanks i changed that. It turned out that [\\w-]+
in combination with Java's matches
als works (the problem was that the necessary validation is just not called during process creation), but i think the changed regex is more expressive.
What about chosing a new parameter name "validateProcessTitle" which avoids the current "denglisch" (English-German mix)?
I also have sympathy for that, but that would require institutions having to adapt their config. I think the Pull Request, if accepted, should probably be backported which would probably be easier, when we do not change the configuration.
Kitodo/src/main/java/org/kitodo/production/forms/createprocess/CreateProcessForm.java
Outdated
Show resolved
Hide resolved
f23289c
to
6037595
Compare
} | ||
|
||
@Test | ||
public void shouldNotAllowDuplicateTitles() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably have to take into account the configuration here. On some systems this is allowed...
Edit: maybe not necessary.
a035cde
to
849dadc
Compare
I think, for now, it would be less disruptive if institutions are allowed to override the regex in the configuration. This approach might introduce the possibility of configurations that Kitodo cannot handle (e.g., titles containing whitespace). However, one could consider introducing a baseline regex in the future—one that enforces basic constraints, such as disallowing whitespace (defined in the application and not changeable). The configuration value could then be redefined and renamed to build on top of this baseline regex, allowing institutions to apply additional rules. For example, institutions could enforce identifiers that always start with a specific prefix, such as "digi_". |
1e7c3a8
to
667867d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tested this change and let a few code comments inside.
The changes are working if you have access to all projects or at least to the same projects where a process is already existing. It did not work if the already process is an other project where the user has no access.
As in both referenced issues there is no mention if process duplication check should work over all projects for a client independent of the user as project access or not I'm can not judge if this is solution is in current state correct or not. In my opinion the duplication check should work over all projects of a client and not only over the projects where the user has access.
} | ||
|
||
// Helper to clean up database and file system | ||
private void cleanUpProcess(Process process) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note: this method is only executed if all method executions before calling this method was successful as this method is called at the end of test execution. This could lead to a dirty repository and tests which are executed after this could executed in a wrong way. Maybe an approach with @AfterEach
should be considered so the cleanup is done independent of a successful or failed test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, i tried to adapt the tests.
@@ -404,7 +404,7 @@ numberOfMetaBackups=8 | |||
useMetadatenvalidierung=true | |||
|
|||
# Validierung der Vorgangstitel ueber regulaeren Ausdruck | |||
validateProzessTitelRegex=[\\w-]* | |||
validateProzessTitelRegex=^[\\w-]+$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the other kitodo_config.properties
files with this entry be adjusted too?
For an other issue I switched to release 3.7.1 and tried by accident to create a new monographic process from a catalogue but I got the message, that this process is already existing after I clicked on the save button in the create process form. So at least some kind of title duplication check is done even without this change. |
I agree, the check is useless if it does not check every title of every project (in the same client); i adapted the logic and the tests to ensure that duplicate titles are not allowed on the same client. (No matter if the user has access to the projects or not) |
2c3ce88
to
b843df8
Compare
Integer processId = newProcess.getId(); | ||
processService.remove(processId); | ||
fileService.delete(URI.create(processId.toString())); | ||
createdProcess = underTest.getMainProcess(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The determination of the created process could be moved before the first assertion is executed so you have anytime the right process information before cleaning up. If adjusted here it should be adjusted on the other places too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good idea. I did that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general the changes are right. My two notes (adjustment of the other existing kitodo_config.properties
files and the determination of the created process inside the tests) can be done and even not they do nothing wrong.
// Second process creation with duplicate title | ||
CreateProcessForm underTestTwo = setupCreateProcessForm("Monograph"); | ||
underTestTwo.getMainProcess().setTitle("title"); | ||
underTestTwo.getMainProcess().setProject(ServiceManager.getProjectService().getById(secondProjectId)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my previous implementation, I used project 2 for both process creations in the shouldNotAllowProcessTitlesInProjectsTheUserDoesNotBelongTo
test. I changed that because the second process creation should happen in project 1, where user 2 has access. This ensures that the failure occurs due to a duplicate title, not because the user lacks access to the project. While it worked before, this change clarifies the test's intent.
I think this is a very good idea. This would prevent technically problematic titles while still keeping some flexibility for institutions that want or need slight variations from the default in their process titles! |
If we go down this road, we should maybe make this configurable in the project(?) settings and remove the value from the Kitodo config, to allow for more flexibility. |
Read I this correct that this value is moved from a global / general place to a project specific place? How should this value be transferred if you have a lot of projects which need the same value? Did we really need different values for the projects? |
Good point. Maybe a 'layererd' architecture
|
If we use a layered architecture, I would use
|
Yes, I just had the same idea. Not sure, if we need all these layers, though - perhaps just a global parameter in Tbh, I also think it is not asked too much to update one string manually in multiple projects, even if an institution has more than just a handful of projects - this is just a simple copy & paste task, after all. |
I think we have to keep in mind that in special circumstances a user might see processes of projects that do not belong to one of the clients he is assigned to - specifically when he has the special permission to link new processes to processes of unassigned projects, see #5869 - and if and how a duplicate check would have to take this into account. |
What i implemented right now is that the search for duplicate titles is done across all projects (of the current client/tenant), even in those the user does NOT have access to. In some setups having duplicate titles could have really bad consequences, so it should be prevented. Some user might not even see the process titles in the list, which he or she is not allowed to set, but that's the way it is. |
Fixes
#6303
#6055