Extend the network client #1269

MehmedGIT · 2024-08-06T14:18:45Z

A brief beginning of addressing the points in #1265:

Fix network-integration-test failing due to a bad env variable
Ocrd network client:
- can submit workflow jobs via the CLI
- can check processing and workflow job statuses
- can check the list of deployed processing workers/processor servers
- can check the ocrd tool json of a deployed processor
- can check the log file of a processing job
Two new env variables to control ocrd network client behavior: OCRD_NETWORK_CLIENT_POLLING_SLEEP and OCRD_NETWORK_CLIENT_POLLING_TIME
Integration tests for the ocrd network client
Reuse client logic in the processing server integration tests to avoid code duplication

MehmedGIT · 2024-08-06T18:52:01Z

Connections to the client callback URL would be more complicated than in the example @joschrew shared. For the CI/CD pipeline, we need an appropriate host defining and port mapping just like in the other services in the docker-compose file.

kba

So far so good, appreciate also the test.

We should really find a different mechanism than a global variable. I have proposed a way to at least scope such a variable to the server instance. Why do we need that mechanism anyway? IIUC callback_server.handle_request will handle exactly one request with no timeout, which is exactly what we want, is it not?

But I'm also wondering whether having an alternative based on polling for blocking processing might still be worth it. Because I imagine that a lot of firewalls won't allow the callback to a temporary server run on the user's machine. Polling is less efficient obviously but it does not require juggling IP addresses, ports and servers.

Makefile

src/ocrd_network/cli/client.py

Makefile

src/ocrd_network/client.py

src/ocrd_network/client_utils.py

src/ocrd_utils/config.py

Co-authored-by: Konstantin Baierer <kba@users.noreply.github.com>

joschrew · 2024-08-13T15:34:57Z

I started to review, or rather to test the client but I couldn't finish so far. I will continue tomorrow. Here is what is not working for me up until now:

OCRD_NETWORK_SERVER_ADDR_PROCESSING in not working as a default for --address. I think simply default is missing as argument for the click-decorator
ocrd network client processing processor ocrd-cis-ocropy-binarize --address "http://localhost:8000" -I DEFAULT -O BIN-2 -m /data/vd18test/mets.xml gives an error: TypeError: send_processing_job_request() got an unexpected keyword argument 'block'. The parameter block_till_job_end could be renamed to block to resolve this.
this is not related to this pr (but I came across this while testing) but in ocrd_network/server_utils.py in line 215 and 218 it must be HTTP_400_BAD_REQUEST instead of HTTP_404_BAD_REQUEST (which does not exist).
running a processing job (send_processing_job_request) with the client does not work for me currently. The problem seems to be parameters in the inputs, at least it works if I remove this parameter from job_input. While debugging I think I realized a list instead of a dict is provided. Here I had to stop.

When refactoring to a shorter flag I obviously missed that there are two places to adapt.

MehmedGIT · 2024-08-13T20:13:24Z

OCRD_NETWORK_SERVER_ADDR_PROCESSING in not working as a default for --address. I think simply default is missing as argument for the click-decorator

I'm not a big fan of copy-pasting the same default everywhere. Instead, I resolved it with 4de1e83. The --address default could probably be also a constant - the same default we use for the default processing server address.

The parameter block_till_job_end could be renamed to block to resolve this.

Resolved 8e7ba26.

this is not related to this pr (but I came across this while testing) but in ocrd_network/server_utils.py in line 215 and 218 it must be HTTP_400_BAD_REQUEST instead of HTTP_404_BAD_REQUEST (which does not exist).

You have caught something bigger than just that simple error. Resolved with 69808b6 and d1af85b. The HTTP exception was being caught with the general Exception catch which was duplicating the error output, and thus obfuscating it.

running a processing job (send_processing_job_request) with the client does not work for me currently. The problem seems to be parameters in the inputs, at least it works if I remove this parameter from job_input. While debugging I think I realized a list instead of a dict is provided. Here I had to stop.

That was not very pleasant to debug. The ocrd.decorators.parameter_option is returning ['{}'] and not {} by default. Which led to the parameters={'{': '}'} instead of parameters={}.

21:57:40.222 WARNING ocrd_network.processing_server - Job input: processor_name='ocrd-cis-ocropy-binarize' path_to_mets='/home/mm/repos/ocrd_network_tests/ws29/data/mets.xml' workspace_id=None description='OCR-D Network client request' input_file_grps=['DEFAULT'] output_file_grps=['BIN-TEST'] page_id=None parameters={'{': '}'} result_queue_name=None callback_url=None agent_type=<AgentType.PROCESSING_WORKER: 'worker'> job_id='d1e075e3-6cb0-40c7-b854-d2d77322cfbf' depends_on=None
21:57:40.222 ERROR ocrd_network.processing_server - Failed to validate processing job input against the tool json of processor: ocrd-cis-ocropy-binarize
["[] Additional properties are not allowed ('{' was unexpected)"]

Dirty fix: 50f73c5. Not sure how to better handle that.

~~Also, the -P and -p flags are currently broken with the client. Probably the parsing is incorrectly done.~~ fixed with #1270

joschrew

After the recent changes the client is working for me.

src/ocrd_network/client.py

kba · 2024-08-14T10:54:00Z

src/ocrd_network/server_utils.py

@@ -210,12 +212,12 @@ def validate_job_input(logger: Logger, processor_name: str, ocrd_tool: dict, job
        raise_http_exception(logger, status.HTTP_404_NOT_FOUND, message)
    try:
        report = ParameterValidator(ocrd_tool).validate(dict(job_input.parameters))
-        if not report.is_valid:
-            message = f"Failed to validate processing job input against the tool json of processor: {processor_name}\n"
-            raise_http_exception(logger, status.HTTP_400_BAD_REQUEST, message + report.errors)
    except Exception as error:


When would this exception be raised? Validators should not raise errors but return a report. If you don't have a specific use case, it might be better to just not do try/except so that potential errors are actually raised and fixed.

I wanted to be extra secure against unexpected errors. The workers or the server (network agents as a service) should ideally never crash due to some error. Especially with an HTTP 500 error on the client side and leaving the network agent in an unpredictable state. The exceptions from problematic requests are still logged.

Understood, but errors in

report = ParameterValidator(ocrd_tool).validate(dict(job_input.parameters))

would mean that something is broken in our implementation or in jsonschema. So this would likely not be a fluke that happens once, we would need to stop processing and fix the bug at the source. The only variable factor is the job_input.parameters dict and if user-provided input breaks the validator - which it must absolutely never do - we should fix that in the validator.

The only variable factor is the job_input.parameters dict and if user-provided input breaks the validator - which it must absolutely never do - we should fix that in the validator.

A failing validator would also probably fail most of the processing/workflow jobs. Input from the user breaking down the entire infrastructure is conceptually wrong to me. Even if all the processing jobs fail, the Processing Server should still respond to other requests such as checking logs of old jobs. We would rather need a mechanism to prevent further submission of processing jobs in case X amount of jobs fail in a row - potentially preventing further requests only to the processing endpoint till it is fixed.

There is also currently no graceful shutdown for the processing server. I.e., once the server dies, anything inside the internal processing cache of the server (not the RabbitMQ) will be lost.

kba · 2024-08-14T11:10:34Z

running a processing job (send_processing_job_request) with the client does not work for me currently. The problem seems to be parameters in the inputs, at least it works if I remove this parameter from job_input. While debugging I think I realized a list instead of a dict is provided. Here I had to stop.

That was not very pleasant to debug. The ocrd.decorators.parameter_option is returning ['{}'] and not {} by default. Which led to the parameters={'{': '}'} instead of parameters={}.
21:57:40.222 WARNING ocrd_network.processing_server - Job input: processor_name='ocrd-cis-ocropy-binarize' path_to_mets='/home/mm/repos/ocrd_network_tests/ws29/data/mets.xml' workspace_id=None description='OCR-D Network client request' input_file_grps=['DEFAULT'] output_file_grps=['BIN-TEST'] page_id=None parameters={'{': '}'} result_queue_name=None callback_url=None agent_type=<AgentType.PROCESSING_WORKER: 'worker'> job_id='d1e075e3-6cb0-40c7-b854-d2d77322cfbf' depends_on=None
21:57:40.222 ERROR ocrd_network.processing_server - Failed to validate processing job input against the tool json of processor: ocrd-cis-ocropy-binarize
["[] Additional properties are not allowed ('{' was unexpected)"]

The actual parsing of the -p and -P arguments was missing. Fixed in #1270.

ocrd network client: parse parameters and overrides

src/ocrd_network/cli/client.py

kba · 2024-08-22T14:54:27Z

@MehmedGIT I've created the changelog for this PR, anything essential missing?

# Conflicts: # CHANGELOG.md

MehmedGIT · 2024-08-23T11:19:06Z

IMO, good to be merged.

MehmedGIT added 6 commits August 6, 2024 16:15

add integration test for client

2927a28

fix the test dir path in docker

8e7cd3e

update network client

bd16dd7

integration test for client

b2c0675

Fix flag typo

db6e566

try docker host ip

bec81ba

kba reviewed Aug 7, 2024

View reviewed changes

MehmedGIT added 6 commits August 9, 2024 12:29

remove the client server

4815896

refactor status checks

cb3460f

fix test

920c1a9

fix: client processing request

2a843a8

add: client workflow run

3a238a7

add timeout and wait to configs

50794f9

kba approved these changes Aug 12, 2024

View reviewed changes

Makefile Show resolved Hide resolved

src/ocrd_network/client.py Show resolved Hide resolved

src/ocrd_network/client_utils.py Outdated Show resolved Hide resolved

src/ocrd_utils/config.py Outdated Show resolved Hide resolved

MehmedGIT and others added 6 commits August 12, 2024 13:01

Update src/ocrd_network/client_utils.py

cc06fc3

Co-authored-by: Konstantin Baierer <kba@users.noreply.github.com>

refine status check methods

4115937

add help for new env

0136db0

add cli job status check

734bbf0

add: help section to the cli

f86bc23

fix: required job id

4194f9f

MehmedGIT marked this pull request as ready for review August 13, 2024 13:33

MehmedGIT requested a review from bertsky August 13, 2024 13:49

add docstring to cli commands

97b3eea

MehmedGIT requested a review from joschrew August 13, 2024 13:53

MehmedGIT added 4 commits August 13, 2024 17:57

Fix: rename to block

8e7ba26

When refactoring to a shorter flag I obviously missed that there are two places to adapt.

Fix: server_utils.py > 404 to 400

69808b6

fix: set ps address if None in constructor

4de1e83

fix: check report validation outside try block

d1af85b

fix: the annoying string dict

50f73c5

MehmedGIT added 4 commits August 13, 2024 22:28

add: parameter_override

8f2861c

add sort to network agents

06a371c

add: discovery cli, processors and processor

4d85970

add: check processing job log file

bb3007d

joschrew approved these changes Aug 14, 2024

View reviewed changes

kba reviewed Aug 14, 2024

View reviewed changes

src/ocrd_network/client.py Outdated Show resolved Hide resolved

kba reviewed Aug 14, 2024

View reviewed changes

MehmedGIT and others added 2 commits August 14, 2024 13:04

fix: exception handling

ff4243f

ocrd network client: parse parameters and overrides

5f746c1

kba and others added 2 commits August 14, 2024 16:00

fix parameter parsing again

8fc8bff

Merge pull request #1270 from OCR-D/fix-parsing

d73cfaa

ocrd network client: parse parameters and overrides

joschrew mentioned this pull request Aug 21, 2024

Run ocrd network sample OCR-D/ocrd_all#449

Draft

kba reviewed Aug 22, 2024

View reviewed changes

src/ocrd_network/cli/client.py Outdated Show resolved Hide resolved

📝 changelog

15cea57

kba and others added 2 commits August 22, 2024 16:55

Merge branch 'master' into extend-network-client

18d743a

# Conflicts: # CHANGELOG.md

refactor client cli: process -> run

6608539

kba merged commit 6608539 into master Aug 23, 2024
22 checks passed

kba deleted the extend-network-client branch August 23, 2024 11:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend the network client #1269

Extend the network client #1269

MehmedGIT commented Aug 6, 2024 •

edited

Loading

MehmedGIT commented Aug 6, 2024

kba left a comment

joschrew commented Aug 13, 2024

MehmedGIT commented Aug 13, 2024 •

edited

Loading

joschrew left a comment

kba Aug 14, 2024

MehmedGIT Aug 14, 2024 •

edited

Loading

kba Aug 22, 2024

MehmedGIT Aug 23, 2024 •

edited

Loading

kba commented Aug 14, 2024

kba commented Aug 22, 2024

MehmedGIT commented Aug 23, 2024

Extend the network client #1269

Extend the network client #1269

Conversation

MehmedGIT commented Aug 6, 2024 • edited Loading

MehmedGIT commented Aug 6, 2024

kba left a comment

Choose a reason for hiding this comment

joschrew commented Aug 13, 2024

MehmedGIT commented Aug 13, 2024 • edited Loading

joschrew left a comment

Choose a reason for hiding this comment

kba Aug 14, 2024

Choose a reason for hiding this comment

MehmedGIT Aug 14, 2024 • edited Loading

Choose a reason for hiding this comment

kba Aug 22, 2024

Choose a reason for hiding this comment

MehmedGIT Aug 23, 2024 • edited Loading

Choose a reason for hiding this comment

kba commented Aug 14, 2024

kba commented Aug 22, 2024

MehmedGIT commented Aug 23, 2024

MehmedGIT commented Aug 6, 2024 •

edited

Loading

MehmedGIT commented Aug 13, 2024 •

edited

Loading

MehmedGIT Aug 14, 2024 •

edited

Loading

MehmedGIT Aug 23, 2024 •

edited

Loading