Run multi-dataset tests #104

Closed
gordonwatts opened this issue May 9, 2024 · 11 comments · Fixed by #121
Labels: perf test (log of running a performance test), performance (issue that is trying to improve or understand the performance of the workflows), servicex (related to SX tests)

Comments

@gordonwatts
Member

Run tests on SX with multiple datasets. See how fast we can go!

@gordonwatts added the performance and servicex labels May 9, 2024
@gordonwatts self-assigned this May 9, 2024
@gordonwatts
Member Author

Here is a run with 3000 workers. CPU usage was 40%, so the workers were starved for data.
image

And this was the switch activity:
image

@gordonwatts
Member Author

After "doing something" (which means adding a new switch into the mix), we got a bit faster, but not by the 50 or 60 Gbps that we were hoping for.

image

@gordonwatts added the perf test label May 12, 2024
@gordonwatts
Member Author

We can't even submit the multi-dataset run due to our timeout issue:

(venv) [bash][gwatts]:idap-200gbps-atlas > python servicex/servicex_materialize_branches.py -v --distributed-client scheduler --dask-scheduler 'tcp://dask-gwatts-2e1782e2-0.af-jupyter:8786' --dask-profile --dataset multi_data --query xaod_small --num-files 0
0000.0476 - INFO - root - Using release 22.2.107 for type information.
0000.0816 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0000.8357 - INFO - root - Running over 4 datasets, 142.636 TB and 19,074,862,754 events.
0000.8362 - INFO - root - Building ServiceX query
0000.8366 - INFO - root - Querying dataset data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026
0000.8367 - INFO - root - Querying dataset data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026
0000.8367 - INFO - root - Querying dataset data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026
0000.8367 - INFO - root - Querying dataset data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026
0000.8368 - INFO - root - Running on the full dataset(s).
0000.8368 - INFO - root - Starting ServiceX query
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  0002.3338 - INFO - servicex.query - ServiceX Transform speed_test_data16_13TeV:data16_13TeV.periodAllYear.phys
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?      
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/0 --:--0003.2684 - INFO - servicex.query - ServiceX Transform speed_test_data15_13TeV:data15_13TeV.periodAllYear.
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?      
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/0 --:--0028.4436 - INFO - servicex.query - ServiceX Transform speed_test_data17_13TeV:data17_13TeV.periodAllYear.
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?      
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/0 00:31
Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 508, in <module>
    main(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 181, in main
    dataset_files = query_servicex(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 148, in query_servicex
    results = sx.deliver(spec)
  File "/venv/lib/python3.9/site-packages/servicex/servicex_client.py", line 107, in deliver
    results = group.as_signed_urls()
  File "/venv/lib/python3.9/site-packages/make_it_sync/func_wrapper.py", line 63, in wrapped_call
    return _sync_version_of_function(fn, *args, **kwargs)
  File "/venv/lib/python3.9/site-packages/make_it_sync/func_wrapper.py", line 14, in _sync_version_of_function
    return loop.run_until_complete(r)
  File "/usr/AnalysisBaseExternals/25.2.2/InstallArea/x86_64-el9-gcc13-opt/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/venv/lib/python3.9/site-packages/servicex/dataset_group.py", line 76, in as_signed_urls_async
    return await asyncio.gather(*self.tasks)
  File "/venv/lib/python3.9/site-packages/servicex/query.py", line 521, in as_signed_urls_async
    return await self.submit_and_download(
  File "/venv/lib/python3.9/site-packages/servicex/query.py", line 260, in submit_and_download
    self.request_id = await self.servicex.submit_transform(sx_request)
  File "/venv/lib/python3.9/site-packages/servicex/servicex_adapter.py", line 120, in submit_transform
    raise RuntimeError("ServiceX WebAPI Error during transformation "
RuntimeError: ServiceX WebAPI Error during transformation submission: 504 - <Response [504 Gateway Time-out]>

@gordonwatts
Member Author

Changed the timeout to None to see how that does.

@gordonwatts
Member Author

Crash during the 4th dataset submission:

(venv) [bash][gwatts]:idap-200gbps-atlas > python servicex/servicex_materialize_branches.py -v --distributed-client scheduler --dask-scheduler 'tcp://dask-gwatts-2e1782e2-0.af-jupyter:8786' --dask-profile --dataset multi_data --query xaod_small --num-files 0
0000.0420 - INFO - root - Using release 22.2.107 for type information.
0000.0780 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0000.8158 - INFO - root - Running over 4 datasets, 142.636 TB and 19,074,862,754 events.
0000.8161 - INFO - root - Building ServiceX query
0000.8164 - INFO - root - Querying dataset data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026
0000.8165 - INFO - root - Querying dataset data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026
0000.8165 - INFO - root - Querying dataset data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026
0000.8166 - INFO - root - Querying dataset data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026
0000.8166 - INFO - root - Running on the full dataset(s).
0000.8166 - INFO - root - Starting ServiceX query
0000.8286 - INFO - servicex.servicex_client - Returning code generators from cache
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  0006.6497 - INFO - servicex.query - ServiceX Transform speed_test_data15_13TeV:data15_13TeV.periodAllYear.phys
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/10049 --:--0027.8295 - INFO - servicex.query - ServiceX Transform speed_test_data17_13TeV:data17_13TeV.period
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     1/10049 --:--0032.1429 - INFO - servicex.query - ServiceX Transform speed_test_data18_13TeV:data18_13TeV.period
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/55534 --:--
Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 508, in <module>
    main(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 181, in main
    dataset_files = query_servicex(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 148, in query_servicex
    results = sx.deliver(spec)
  File "/venv/lib/python3.9/site-packages/servicex/servicex_client.py", line 107, in deliver
    results = group.as_signed_urls()
  File "/venv/lib/python3.9/site-packages/make_it_sync/func_wrapper.py", line 63, in wrapped_call
    return _sync_version_of_function(fn, *args, **kwargs)
  File "/venv/lib/python3.9/site-packages/make_it_sync/func_wrapper.py", line 14, in _sync_version_of_function
    return loop.run_until_complete(r)
  File "/usr/AnalysisBaseExternals/25.2.2/InstallArea/x86_64-el9-gcc13-opt/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/venv/lib/python3.9/site-packages/servicex/dataset_group.py", line 76, in as_signed_urls_async
    return await asyncio.gather(*self.tasks)
  File "/venv/lib/python3.9/site-packages/servicex/query.py", line 521, in as_signed_urls_async
    return await self.submit_and_download(
  File "/venv/lib/python3.9/site-packages/servicex/query.py", line 260, in submit_and_download
    self.request_id = await self.servicex.submit_transform(sx_request)
  File "/venv/lib/python3.9/site-packages/servicex/servicex_adapter.py", line 120, in submit_transform
    raise RuntimeError("ServiceX WebAPI Error during transformation "
RuntimeError: ServiceX WebAPI Error during transformation submission: 504 - <Response [504 Gateway Time-out]>

@gordonwatts
Member Author

However, that run got to 200 Gbps:
image

@gordonwatts
Member Author

To work around the timeout issues, added retry code to the frontend:

    @retry(wait=wait_fixed(60), stop=stop_after_delay(60*60), reraise=True)
    async def submit_transform(self, transform_request: TransformRequest):
        async with httpx.AsyncClient() as client:
            headers = await self._get_authorization(client)
            r = await client.post(url=f"{self.url}/servicex/transformation",
                                  headers=headers,
                                  json=transform_request.dict(by_alias=True,
                                                              exclude_none=True),
                                  timeout=None)
            if r.status_code == 401:
                raise AuthorizationError(
                    f"Not authorized to access serviceX at {self.url}")
            elif r.status_code == 400:
                raise ValueError(f"Invalid transform request: {r.json()['message']}")
            elif r.status_code > 400:
                try:
                    error_message = r.json().get('message', str(r))
                except Exception:
                    error_message = str(r)
                raise RuntimeError("ServiceX WebAPI Error during transformation "
                                   f"submission: {r.status_code} - {error_message}")
        return r.json()['request_id']

The retry likely needs to be more specific (retry only on the timeout error).
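A minimal sketch of what "more specific" could look like, written with only the standard library so it runs on its own. `GatewayTimeoutError`, `retry_on`, and `flaky_submit` are hypothetical names for illustration and are not part of the ServiceX client. The decorator in the snippet above appears to come from tenacity; there the same restriction is usually expressed by passing `retry=retry_if_exception_type(...)` to `@retry`.

```python
import asyncio


class GatewayTimeoutError(RuntimeError):
    """Hypothetical exception standing in for an HTTP 504 from the web API."""


async def retry_on(exc_type, func, *, wait_seconds, max_attempts):
    """Await func() until it succeeds, retrying only when exc_type is raised.

    Any other exception (401, 400, ...) propagates on the first attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await func()
        except exc_type:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last timeout
            await asyncio.sleep(wait_seconds)


async def demo():
    calls = {"n": 0}

    async def flaky_submit():
        # Stand-in for submit_transform: times out twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise GatewayTimeoutError("504 Gateway Time-out")
        return "request-id-123"

    request_id = await retry_on(
        GatewayTimeoutError, flaky_submit, wait_seconds=0, max_attempts=5
    )
    return request_id, calls["n"]


result = asyncio.run(demo())
print(result)
```

To use this shape in the real method, the 504 branch would have to raise its own exception type instead of the generic `RuntimeError`, so that authorization and bad-request errors fail immediately rather than being retried for an hour.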

@gordonwatts
Member Author

Here is the first time all 4 datasets got submitted. Sadly, someone else was running continuously in the background:

image

2e1782e2-0.af-jupyter:8786' --dask-profile --dataset multi_data --query xaod_small --num-files 0
0000.0350 - INFO - root - Using release 22.2.107 for type information.
0000.0705 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0000.8200 - INFO - root - Running over 4 datasets, 142.636 TB and 19,074,862,754 events.
0000.8203 - INFO - root - Building ServiceX query
0000.8207 - INFO - root - Querying dataset data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026
0000.8208 - INFO - root - Querying dataset data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026
0000.8208 - INFO - root - Querying dataset data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026
0000.8209 - INFO - root - Querying dataset data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026
0000.8210 - INFO - root - Running on the full dataset(s).
0000.8210 - INFO - root - Starting ServiceX query
0000.8325 - INFO - servicex.servicex_client - Returning code generators from cache
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  0007.6675 - INFO - servicex.query - ServiceX Transform speed_test_data15_13TeV:data15_13TeV.periodAllY
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/10049 --:--0023.8078 - INFO - servicex.query - ServiceX Transform speed_test_data16_13TeV:data16_13Te
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/10049 --:--0028.1298 - INFO - servicex.query - ServiceX Transform speed_test_data17_13TeV:data17_13Te
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/45571 --:--0039.4060 - INFO - servicex.query - ServiceX Transform speed_test_data18_13TeV:data18_13Te
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━ 25804/55534 04:041106.4289 - WARNING - servicex.query - Transforms completed with failures 1 files failed o
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 55531/55534 --:--1994.0269 - INFO - servicex.query - Transforms completed successfully
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64815/64803 47:55
2884.0278 - INFO - root - Event rate for ServiceX: 00:48:03 time, 6615.85 kHz, Data rate: 395.77 Gbits/s
2884.0279 - INFO - root - Dataset speed_test_data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026 has 2887 files
2884.0279 - INFO - root - Dataset speed_test_data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026 has 4078 files
2884.0280 - INFO - root - Dataset speed_test_data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026 has 4550 files
2884.0280 - INFO - root - Dataset speed_test_data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026 has 4682 files

There is no way that number is right!
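A quick sanity check on where the logged figure may come from. This sketch assumes the rate is the full dataset size from the log (142.636 TB, taken as 10**12 bytes per TB) divided by the wall time (00:48:03); that is an inference from the numbers, not something confirmed by the script source.

```python
def data_rate_gbps(total_terabytes: float, elapsed_seconds: float) -> float:
    """Gbit/s if total_terabytes (10**12 bytes each) moved in elapsed_seconds."""
    total_gigabits = total_terabytes * 1e12 * 8 / 1e9
    return total_gigabits / elapsed_seconds


def event_rate_khz(n_events: int, elapsed_seconds: float) -> float:
    """Event rate in kHz for n_events processed in elapsed_seconds."""
    return n_events / elapsed_seconds / 1e3


elapsed = 48 * 60 + 3  # 00:48:03 from the log line
rate = data_rate_gbps(142.636, elapsed)
khz = event_rate_khz(19_074_862_754, elapsed)
print(f"{rate:.1f} Gbit/s, {khz:.0f} kHz")
```

Both values land very close to the logged 395.77 Gbits/s and 6615.85 kHz, so the reported rate looks like "size of the input datasets over wall time" rather than bytes actually moved over the network. If the transform reads only part of each file, the real network rate would be lower.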

@gordonwatts
Member Author

On a quiet cluster, the full run went through!!!

image

(venv) [bash][gwatts]:idap-200gbps-atlas > python servicex/servicex_materialize_branches.py -v --distributed-client scheduler --dask-scheduler 'tcp://dask-gwatts-2e1782e2-0.af-jupyter:8786' --dask-profile --dataset multi_data --query xaod_small --num-files 0
0000.0374 - INFO - root - Using release 22.2.107 for type information.
0000.0732 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0000.8248 - INFO - root - Running over 4 datasets, 142.636 TB and 19,074,862,754 events.
0000.8252 - INFO - root - Building ServiceX query
0000.8256 - INFO - root - Querying dataset data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026
0000.8256 - INFO - root - Querying dataset data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026
0000.8257 - INFO - root - Querying dataset data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026
0000.8257 - INFO - root - Querying dataset data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026
0000.8257 - INFO - root - Running on the full dataset(s).
0000.8258 - INFO - root - Starting ServiceX query
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  0006.8627 - INFO - servicex.query - ServiceX Transform speed_test_data15_13TeV:data15_13TeV.p
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/10049 --:--0025.7301 - INFO - servicex.query - ServiceX Transform speed_test_data16_13TeV:da
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/10049 --:--0030.2254 - INFO - servicex.query - ServiceX Transform speed_test_data17_13TeV:da
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     0/10049 --:--0034.7581 - INFO - servicex.query - ServiceX Transform speed_test_data18_13TeV:da
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?              
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64803/64803 30:43
1949.7755 - INFO - root - Event rate for ServiceX: 00:32:28 time, 9787.25 kHz, Data rate: 585.49 Gbits/s
1949.7756 - INFO - root - Dataset speed_test_data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026 has 2893 files
1949.7756 - INFO - root - Dataset speed_test_data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026 has 3981 files
1949.7757 - INFO - root - Dataset speed_test_data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026 has 4204 files
1949.7757 - INFO - root - Dataset speed_test_data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026 has 4591 files
Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 508, in <module>
    main(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 196, in main
    report, n_events = dask.compute(*calculate_n_events(dataset_files, steps_per_file))
  File "/venv/lib/python3.9/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/venv/lib/python3.9/site-packages/distributed/client.py", line 2232, in _gather
    raise exception.with_traceback(traceback)
distributed.scheduler.KilledWorker: Attempted to run task ('<dask-awkward.lib.core.ArgsKwargsPackedFunction ob-c9e5eefd8340b50f30ff99510d27817b', 1078) on 4 different workers, but all those workers died while running it. The last worker that attempt to run the task was tcp://172.16.160.82:37043. Inspecting worker logs is often a good next step to diagnose what went wrong. For more information see https://distributed.dask.org/en/stable/killed.html.

@gordonwatts
Member Author

Ran on DASK without errors. Had to limit it to 100 workers in order to make it work.

Duration: 16m 29s
Tasks Information
number of tasks: 100000
compute time: 4hr 57m
disk-read time: 22.87 ms
disk-write time: 13.36 ms
transfer time: 15m 27s

image

Clearly the 301-second-long green bars are timeouts. It would be good to reduce the number of timeouts!
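A rough way to put a number on how busy the workers were, using the task summary above. The sketch assumes one task slot per worker (the thread count per worker is not given in this thread) and that "compute time" is the sum of task durations, so time spent waiting on a timeout would count as compute.

```python
def worker_utilization(compute_seconds: float, wall_seconds: float,
                       n_workers: int, slots_per_worker: int = 1) -> float:
    """Fraction of available worker time that was spent inside tasks."""
    available = wall_seconds * n_workers * slots_per_worker
    return compute_seconds / available


compute = 4 * 3600 + 57 * 60  # "compute time: 4hr 57m"
wall = 16 * 60 + 29           # "Duration: 16m 29s"
util = worker_utilization(compute, wall, n_workers=100)
print(f"utilization: {util:.0%}")
```

Under those assumptions the workers spent well under a quarter of the wall time inside tasks, and part of that task time is itself the 301 s waits.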

@gordonwatts
Member Author

With 500 workers:

(venv) [bash][gwatts]:idap-200gbps-atlas > python servicex/servicex_materialize_branches.py -v --distributed-client scheduler --dask-scheduler 'tcp://dask-gwatts-2e1782e2-0.af-jupyter:8786' --dask-profile --dataset multi_data --query xaod_small --num-files 0
0000.0598 - INFO - root - Registering retry HTTPFileSystem and HTTPFile with fsspec on DASK cluster
0000.7733 - INFO - root - Using release 22.2.107 for type information.
0000.8078 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0001.5643 - INFO - root - Running over 4 datasets, 142.636 TB and 19,074,862,754 events.
0001.5646 - INFO - root - Building ServiceX query
0001.5650 - INFO - root - Querying dataset data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026
0001.5651 - INFO - root - Querying dataset data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026
0001.5651 - INFO - root - Querying dataset data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026
0001.5651 - INFO - root - Querying dataset data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026
0001.5652 - INFO - root - Running on the full dataset(s).
0001.5652 - INFO - root - Starting ServiceX query
0001.6020 - INFO - servicex.servicex_client - Returning code generators from cache
0001.6253 - INFO - servicex.query - Returning results from cache
0001.6438 - INFO - servicex.query - Returning results from cache
0001.6622 - INFO - servicex.query - Returning results from cache
0001.6810 - INFO - servicex.query - Returning results from cache

0001.6828 - INFO - root - Event rate for ServiceX not calculated since cached result was used
0001.6829 - INFO - root - Dataset speed_test_data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026 has 2893 files
0001.6829 - INFO - root - Dataset speed_test_data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026 has 3981 files
0001.6829 - INFO - root - Dataset speed_test_data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026 has 4204 files
0001.6830 - INFO - root - Dataset speed_test_data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026 has 4591 files
0001.6830 - INFO - root - Using `uproot.dask` to open files (splitting files 1 ways).
0409.5802 - INFO - root - Number of skimmed events: 87,463,684 (skim percent: 0.4585%)
0410.5958 - INFO - root - Starting build of DASK graphs
0415.0800 - INFO - root - Computing the total count
0855.2350 - INFO - root - Event rate for DASK Calculation: 00:07:20 time, 43336.73 kHz, Data rate: 2592.47 Gbits/s
0855.2352 - INFO - root - DASK event rate over actual events: 198.71 kHz
0855.2353 - INFO - root - speed_test_data15_13TeV:data15_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp15_v01_p6026: result = 16,310,072
0855.2353 - INFO - root - speed_test_data16_13TeV:data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp16_v01_p6026: result = 22,730,800
0855.2354 - INFO - root - speed_test_data17_13TeV:data17_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp17_v01_p6026: result = 25,561,943
0855.2354 - INFO - root - speed_test_data18_13TeV:data18_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_PHYSLITE.grp18_v01_p6026: result = 22,860,869
Duration: 432.72 s
Tasks Information
number of tasks: 51501
compute time: 5hr 11m
disk-read time: 5.53 ms
transfer time: 591.42 s

image
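The numbers in this log can be cross-checked against each other. The sketch below only uses values printed above; the 440 s elapsed time is the rounded 00:07:20 from the log, so the rates come out slightly above the logged ones.

```python
# Per-dataset results and total input events, copied from the log above.
results = {
    "data15": 16_310_072,
    "data16": 22_730_800,
    "data17": 25_561_943,
    "data18": 22_860_869,
}
total_events = 19_074_862_754

skimmed = sum(results.values())
skim_percent = 100 * skimmed / total_events

elapsed = 7 * 60 + 20  # 00:07:20, rounded to whole seconds in the log
full_khz = total_events / elapsed / 1e3  # rate counted over all input events
skim_khz = skimmed / elapsed / 1e3       # rate counted over skimmed events

print(f"skimmed={skimmed:,} ({skim_percent:.4f}%)")
print(f"full: {full_khz:.0f} kHz, skimmed: {skim_khz:.1f} kHz")
```

The per-dataset results add up to the logged 87,463,684 skimmed events, and the two rates differ by exactly the skim fraction. The 2592.47 Gbits/s figure therefore appears to be scaled to the original 142.636 TB rather than to the bytes DASK actually read.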
