
OutOfMemoryError when validating a GTFS Realtime feed on transport.data.gouv.fr/validation #2212

Closed
isabelle-dr opened this issue Mar 11, 2022 · 10 comments
Assignees
Labels
ops Server and production management

Comments

@isabelle-dr

Bug description
I got an OutOfMemoryError while testing the interface on transport.data.gouv.fr/validation.

How to reproduce

  1. Select GTFS-RT
  2. Enter the links for the MBTA dataset found on transitfeeds.com

Expected
The validation results are shown in the interface.

Observed
The following error message:
[Screenshot: Screen Shot 2022-03-11 at 9 13 05 AM]
The error as copy/paste, if needed 👇

[main] INFO edu.usf.cutr.gtfsrtvalidator.lib.batch.BatchProcessor - Starting batch processor...
[main] INFO edu.usf.cutr.gtfsrtvalidator.lib.batch.BatchProcessor - Reading GTFS data from /tmp/validation_165096_gtfs_rt/file.zip...
[main] INFO edu.usf.cutr.gtfsrtvalidator.lib.batch.BatchProcessor - file.zip read in 16.145 seconds
[main] INFO edu.usf.cutr.gtfsrtvalidator.lib.validation.GtfsMetadata - Building GtfsMetadata for /tmp/validation_165096_gtfs_rt/file.zip...
[main] INFO edu.usf.cutr.gtfsrtvalidator.lib.validation.GtfsMetadata - Processing trips and building trip shapes for /tmp/validation_165096_gtfs_rt/file.zip...
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at org.locationtech.spatial4j.shape.jts.JtsShapeFactory$CoordinatesAccumulator.pointXYZ(JtsShapeFactory.java:316)
	at org.locationtech.spatial4j.shape.jts.JtsShapeFactory$CoordinatesAccumulator.pointXY(JtsShapeFactory.java:310)
	at org.locationtech.spatial4j.shape.jts.JtsShapeFactory$JtsLineStringBuilder.pointXY(JtsShapeFactory.java:228)
	at edu.usf.cutr.gtfsrtvalidator.lib.validation.GtfsMetadata.<init>(GtfsMetadata.java:184)
	at edu.usf.cutr.gtfsrtvalidator.lib.batch.BatchProcessor.processFeeds(BatchProcessor.java:145)
	at edu.usf.cutr.gtfsrtvalidator.lib.Main.main(Main.java:62)

Environment
Mac M1
macOS Monterey version 12.2

Additional information
Running the validator from the terminal works fine with the MBTA links.
I also tried the interface with the Bibus (Brest Métropole) links found on transport.data.gouv.fr, and that works perfectly.
[Screenshot: Screen Shot 2022-03-11 at 9 10 28 AM]

@thbar
Contributor

thbar commented Mar 11, 2022

Thanks for the bug report! We'll look into it; our environment doesn't have a lot of RAM, which can cause problems.

@AntoineAugusti AntoineAugusti added the ops Server and production management label Mar 14, 2022
@AntoineAugusti
Member

See CUTR-at-USF/gtfs-realtime-validator#382; we need to decide which parameters to use depending on our machine and how much memory we want to dedicate to this.

So far the worker has not crashed when these errors occur.

@AntoineAugusti
Member

AntoineAugusti commented Mar 22, 2022

On the production worker, after reading StackOverflow, I ran:

$ java -XX:+PrintFlagsFinal -version | grep -iE 'HeapSize|PermSize|ThreadStackSize'
     intx CompilerThreadStackSize                  = 1024                                   {pd product} {default}
   size_t ErgoHeapSizeLimit                        = 0                                         {product} {default}
   size_t HeapSizePerGCThread                      = 43620760                                  {product} {default}
   size_t InitialHeapSize                          = 65011712                                  {product} {ergonomic}
   size_t LargePageHeapSizeThreshold               = 134217728                                 {product} {default}
   size_t MaxHeapSize                              = 1029701632                                {product} {ergonomic}
    uintx NonNMethodCodeHeapSize                   = 5830732                                {pd product} {ergonomic}
    uintx NonProfiledCodeHeapSize                  = 122913754                              {pd product} {ergonomic}
    uintx ProfiledCodeHeapSize                     = 122913754                              {pd product} {ergonomic}
   size_t ShenandoahSoftMaxHeapSize                = 0                                      {manageable} {default}
     intx ThreadStackSize                          = 1024                                   {pd product} {default}
     intx VMThreadStackSize                        = 1024                                   {pd product} {default}

That is a MaxHeapSize of roughly ~1 GB, and we only have 4 GB of RAM on our current worker. Should we raise the flags to allow a bit more?
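As a quick sanity check on those numbers (a sketch; the only input is the MaxHeapSize value from the flags dump above, plus the 4 GB worker RAM quoted in the comment):

```shell
# Convert the MaxHeapSize reported by -XX:+PrintFlagsFinal to MiB and
# compare it to the worker's RAM (4 GB, per the comment above).
max_heap_bytes=1029701632
heap_mib=$((max_heap_bytes / 1024 / 1024))
ram_mib=$((4 * 1024))
echo "JVM max heap: ${heap_mib} MiB out of ${ram_mib} MiB of RAM"
# prints: JVM max heap: 982 MiB out of 4096 MiB of RAM
```

So the ergonomic default heap is just under 1 GiB, about a quarter of the machine.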

We see a few heap-related errors in production for GTFS-RT validation of US / Swiss networks:

select date, details, on_the_fly_validation_metadata, on_the_fly_validation_metadata->>'state', on_the_fly_validation_metadata->>'error_reason'
from validations
where validations.on_the_fly_validation_metadata->>'type' = 'gtfs-rt'

The URLs of the GTFS feeds in question:

GTFS URL                                                                        count
https://cdn.mbta.com/MBTA_GTFS.zip                                              3
https://cdn.mbtace.com/MBTA_GTFS.zip                                            1
https://github.com/MobilityData/gtfs-validator/files/8232184/gtfs.zip           1
https://gtfs.mfdz.de/DELFI.BB.gtfs.zip                                          1
https://opentransportdata.swiss/de/dataset/timetable-2022-gtfs2020/permalink    1
https://svc.metrotransit.org/mtgtfs/gtfs.zip                                    1
https://transitfeeds.com/p/mbta/64/latest/download                              1
https://www.portauthority.org/developerresources/GTFS.zip                       1

@AntoineAugusti AntoineAugusti self-assigned this Mar 22, 2022
@thbar
Contributor

thbar commented Mar 22, 2022

Thanks for digging!

Should we raise the flags to allow a bit more?

Yes, that's going to be necessary. What could help is running a local test with the file that crashed, adding 100 MB on each attempt until it passes, so that we don't raise the limit completely at random, since RAM is precious in production.
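The stepwise test described above could be scripted along these lines (a sketch: the jar name and validator arguments are placeholders, not the real production invocation):

```shell
# Retry the validator with the heap raised by 100 MB per round until it
# passes, so the final -Xmx value is the smallest one that works.
heap=512   # starting -Xmx in MB (assumed starting point)
max=2048   # give up beyond this limit
while [ "$heap" -le "$max" ]; do
  echo "trying -Xmx${heap}m"
  # Placeholder invocation; uncomment and adapt to the real command:
  # java -Xmx"${heap}m" -jar gtfs-realtime-validator.jar ... && break
  heap=$((heap + 100))
done
```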

Note that we also still have the following queue sizes, which may (or may not) multiply the theoretical maximum:

queues: [default: 2, heavy: 1, on_demand_validation: 1],
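Back-of-the-envelope for that worst case (a sketch; it assumes every queue slot can run a validator JVM at the current ~982 MiB default MaxHeapSize simultaneously, which may not hold in practice):

```shell
# Theoretical peak heap usage if all queue slots run a JVM at once,
# using the queue sizes from the config line above.
default=2; heavy=1; on_demand_validation=1
slots=$((default + heavy + on_demand_validation))
peak_mib=$((slots * 982))
echo "worst case: ${peak_mib} MiB of heap on a 4096 MiB worker"
# prints: worst case: 3928 MiB of heap on a 4096 MiB worker
```

Under that assumption the theoretical peak is already close to the whole machine, before raising any flag.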

@AntoineAugusti
Member

@thbar I haven't had time yet to do what you suggest, but it's a good idea! I'll do it soon and look at the statistics.

@thbar
Contributor

thbar commented Mar 30, 2022

@AntoineAugusti no worries! On my side, I'll look into experimenting with memory monitoring, to see if we can get something more precise than the metrics on CC.

@barbeau

barbeau commented Apr 18, 2022

(Not following all of the above because I don't speak French, but hopefully the below is still useful 😄)

Typically you can fix java.lang.OutOfMemoryError by increasing the heap size using the following command-line parameters when running the batch validator:

java ... -Xmx512m -XX:MaxMetaspaceSize=512m

See https://stackoverflow.com/a/38336005/937715 for more details.

If you don't want to allocate more memory to the batch validator, note that you can typically avoid OOM errors on larger feeds by ignoring the rules that process the shapes.txt file, by adding the command-line parameter -ignoreShapes yes:
https://github.com/MobilityData/gtfs-realtime-validator/blob/master/gtfs-realtime-validator-lib/README.md#command-line-config-parameters

The above docs say:

  • -ignoreShapes (Optional) - If this argument is supplied (e.g., -ignoreShapes yes), the validator will ignore the shapes.txt file for the GTFS feed. If you are getting OutOfMemoryErrors when processing very large feeds, you should try setting this to true. Note that setting this to true will prevent the validator from checking rules like E029 that require spatial data. See this issue for details.

We specifically added the above command-line parameter to allow the validator to process huge country-sized datasets like the Netherlands, which still generated OOM errors even with large -Xmx values due to huge shapes.txt files. When using -ignoreShapes yes we were able to validate all real-world GTFS/GTFS-RT files we could find at the time on a USF lab computer.
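Put together, a batch run combining both suggestions might look like the following (a non-runnable sketch: the jar name, file paths, and -Xmx value are placeholder assumptions; see the README linked above for the authoritative parameter list):

```shell
# Larger heap plus -ignoreShapes to skip the memory-hungry shape rules.
java -Xmx2g \
  -jar gtfs-realtime-validator-lib-1.0.0-SNAPSHOT.jar \
  -gtfs /tmp/validation_165096_gtfs_rt/file.zip \
  -gtfsRealtimePath /tmp/validation_165096_gtfs_rt/ \
  -ignoreShapes yes
```

As the docs quoted above note, -ignoreShapes disables rules such as E029 that require spatial data, so it trades coverage for memory.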

@AntoineAugusti
Member

Looked at this error today: it happened 6 times out of 109 GTFS-RT on-demand validations, so it doesn't happen much. Looked at the GTFS URLs and they are all outside France, which makes this issue low priority for us for now.

select on_the_fly_validation_metadata->>'gtfs_url', count(1)
from validations 
where on_the_fly_validation_metadata->>'type' = 'gtfs-rt' and on_the_fly_validation_metadata->>'error_reason' like '%heap space%'
group by 1
GTFS URL                                                                        count
https://cdn.mbta.com/MBTA_GTFS.zip                                              2
https://gtfs.mfdz.de/DELFI.BB.gtfs.zip                                          1
https://opentransportdata.swiss/de/dataset/timetable-2022-gtfs2020/permalink    1
https://svc.metrotransit.org/mtgtfs/gtfs.zip                                    1
https://www.portauthority.org/developerresources/GTFS.zip                       1

@AntoineAugusti
Member

Latest data

select oban_args->>'state', count(1)
from multi_validation mv 
where mv.validator = 'gtfs-realtime-validator' and mv.validated_data_name is not null
group by 1

gives 325 rows with state = completed.

Closing for now, we'll revisit if needed

@AntoineAugusti AntoineAugusti closed this as not planned Nov 17, 2022
@AntoineAugusti
Member

My previous SQL was wrong.

select oban_args->>'state', oban_args->>'error_reason' like '%OutOfMemoryError%', count(1)
from multi_validation mv 
where oban_args->>'type' = 'gtfs-rt'
group by 1, 2

Gives

state      OutOfMemoryError error   count
completed                          359
error      FALSE                   203
error      TRUE                    19

so it happened 19 times.
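From that table, the overall failure rate works out as follows (a sketch; the counts are copied from the query result above):

```shell
# OutOfMemoryError rate across all GTFS-RT validations in the table.
completed=359; other_errors=203; oom=19
total=$((completed + other_errors + oom))
pct=$(awk "BEGIN { printf \"%.1f\", 100 * $oom / $total }")
echo "OutOfMemoryError in ${oom} of ${total} validations (~${pct}%)"
# prints: OutOfMemoryError in 19 of 581 validations (~3.3%)
```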
