Text2Process

General Information: Group organization

In this project, the Scrum approach was used. The sprint length was set at two weeks. At the end of each sprint, a short sprint review was held to discuss what went well in the last sprint, what went badly, which tasks were still open, and how to proceed. Afterwards, a short sprint planning decided which Epics should be dealt with in the following sprint and which resources (developers) should be assigned to each Epic.

During the project, the following roles were assigned:

Scrum Master: Jonas Trautmann

Group: Harun Bajric, Noah Colby, Jannik Steck, Jonas Trautmann

Period: 10.05.2021 - 12.07.2021

The last week of the project was used to prepare the documentation.

Epic: Work on parallel splits

(Sprint 1,2 / Members: Harun Bajric, Noah Colby, Jannik Steck, Jonas Trautmann)

BRANCH: ControlFlowStructures

Original problem:

Parallel workflows were either not recognized or interpreted incorrectly.

Main issues:

Outdated Stanford parser
Parallel workflows rely heavily on context --> the structure of the code does not support this
Some marker words for parallel workflows were not recognized, were treated differently from others, or were not specified

Information about the Stanford parser version:

The current build is from 2010 --> complex sentences with parallel workflows or loops are not handled correctly
An update would be very time-consuming --> the API changed in a major way --> T2P would possibly have to be rebuilt from the ground up

Findings:

The syntax tree returned by the Stanford parser build in use is incorrect --> update the version
Updating the Stanford parser will require a lot of time --> the newest version is much newer than the current build
The latest Stanford parser version returns the correct tree

Idea:

Workaround with a Python service

Python Service workaround:

To allow WoPeD to use the new Stanford parser version without any major changes to the codebase, a Python script was created that starts a separate server, which computes the tree using the new Stanford parser version. For each sentence, the service is called via a POST request and a Tree (Stanford CoreNLP API) is returned. After that, the program resumes its work as usual.
Input for the service: String
Output from the service: String
The CoreNLP version can be updated via the Dockerfile
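
The request flow described above (one POST per sentence, a tree string back) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual script: `placeholder_parse`, `start_server` and `request_tree` are hypothetical names, and the placeholder parser only stands in for the real call to the newer Stanford parser.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


def placeholder_parse(sentence: str) -> str:
    """Stand-in for the real parser call (assumption, not the project's code)."""
    # A real service would pass the sentence to the newer Stanford parser here
    # and return its bracketed tree as text.
    tokens = " ".join(f"(X {tok})" for tok in sentence.split())
    return f"(ROOT {tokens})"


def make_handler(parse_fn):
    """Build a request handler that answers each POST with parse_fn(body)."""

    class ParseHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Input: the sentence as a plain string in the request body.
            length = int(self.headers.get("Content-Length", 0))
            sentence = self.rfile.read(length).decode("utf-8")
            # Output: the tree as a plain string in the response body.
            body = parse_fn(sentence).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ParseHandler


def start_server(parse_fn=placeholder_parse, port=0):
    """Start the service in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), make_handler(parse_fn))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_tree(host: str, port: int, sentence: str) -> str:
    """Client side: send one sentence via POST and return the tree string."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", "/", body=sentence.encode("utf-8"))
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()
```

Keeping the parser behind a single function means the calling side only depends on the string-in, string-out contract, so the parser version behind the service can change without touching the caller.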

Work on loops:

Major problems with the parallel workflows --> loops were deprioritized to free up capacity for the work on parallel workflows
Generalized approach in Epic: Conception

Two transitions in one path:

Two transitions are shown in one path
Example sentence: The process is registered. The history is checked then the funds are generated while the authorization is tested. The process is completed.
Generalized approach and more information in Epic: Conception

Epic: Labels

(Sprint 2 / Members: Harun Bajric, Jonas Trautmann)

While using WoPeD, it was noticed that punctuation marks are occasionally included in the generated process models and that some labels contain too much information. The aim was to correct this.

To fix this, the first step was to identify why too much information was written into a label. It turned out that there is a connection with the control flow structures: if parallelism is misinterpreted, labels contain the words that signal parallelism. Since the interpretation of parallelism was improved, this problem has not occurred again.

To find out why labels protrude into other labels, the code responsible for the label size was located. It is in the NameModul.java class of the main project. There, the label size was determined via autosize, which causes labels with a lot of text to protrude into others. The labels were therefore set to a fixed size and centered. So that texts are not cut off abruptly, the text is truncated after a certain number of characters and the rest is replaced with a "..." abbreviation. The whole text of a label can still be viewed by double-clicking on it.

The last problem, punctuation marks being carried over arbitrarily, could not be reproduced and was not pursued further after consultation with the supervisor.
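
The truncation rule for long labels can be illustrated with a small sketch. It is written in Python for brevity, whereas the actual change lives in NameModul.java; the function name `shorten_label` and the limit of 20 characters are assumptions, since the source only speaks of "a certain number of chars".

```python
def shorten_label(text: str, max_chars: int = 20) -> str:
    """Return the label text, cut after max_chars characters with '...' appended.

    max_chars = 20 is an assumed value for illustration only.
    """
    if len(text) <= max_chars:
        return text  # short labels are shown unchanged
    # Drop trailing whitespace so the ellipsis follows a word character.
    return text[:max_chars].rstrip() + "..."
```

For example, `shorten_label("generate the funds for the customer", 12)` yields `generate the...`, while a label that already fits is returned as is.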

Epic: Different kinds of imports

(Sprint 3 / Members: Noah Colby, Jannik Steck)

BRANCH: FileExtensions

Idea

Extend the file formats accepted for the input text
As a first step, only PDF
Use of TikaParser --> the parser detects the format automatically --> only one method is needed
TikaParser: Link to TikaParser
Use of Aspose to convert docx and pptx to PDF, because Tika could not handle either format in this setup
Aspose: Link to Aspose

Problem

The parser did not recognize docx and pptx files

Solution

Workaround for the formats docx and pptx --> WoPeD converts these files into a PDF file
After the file has been read correctly, the converted PDF file is deleted
The following file types can now be imported on Mac and Windows: rtf, txt, doc, docx, ppt, pptx, pdf
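
The import flow from the solution above (convert docx and pptx to PDF first, read the text, then delete the temporary PDF) can be sketched as follows. This is an illustration in Python, not the WoPeD implementation: `import_text`, `extract_text` and `convert_to_pdf` are hypothetical names, with the two callables standing in for the Tika and Aspose calls.

```python
from pathlib import Path

# File types listed in the solution above.
SUPPORTED = {".rtf", ".txt", ".doc", ".docx", ".ppt", ".pptx", ".pdf"}
# Formats that take the detour via a temporary PDF.
NEEDS_PDF_CONVERSION = {".docx", ".pptx"}


def import_text(path, extract_text, convert_to_pdf):
    """Read the text of an input file, converting docx/pptx to PDF first.

    extract_text(path) -> str and convert_to_pdf(path) -> path are supplied
    by the caller; they are placeholders for the parser and the converter.
    """
    source = Path(path)
    suffix = source.suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"unsupported file type: {suffix}")
    if suffix not in NEEDS_PDF_CONVERSION:
        return extract_text(source)
    pdf_path = Path(convert_to_pdf(source))
    try:
        return extract_text(pdf_path)
    finally:
        # The converted PDF is only an intermediate file, so remove it.
        pdf_path.unlink(missing_ok=True)
```

Deleting the PDF in a `finally` block is a design choice of this sketch: the temporary file is removed even if reading fails, whereas the source only states that it is deleted after a successful read.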

Epic: Work on BPMN

(Sprint 3 / Members: Jonas Trautmann)

Initial situation: T2P provides the ability to create process models in an AngularJS-based frontend. At the beginning of this project, this was only possible for Petri nets. The target state was to support another common notation (BPMN).

To accomplish this, radio buttons were added to the frontend to choose between PNML and BPMN. Depending on this choice, the process model is displayed in the canvas either as PNML or as BPMN, and the other model is hidden using ng-if. This is done in index.html. The different representation is achieved by extending Petrinet.js with another component called BPMN. This component renders the created process model with the symbols typical for BPMN notation. To add more symbols, they have to be added in the bower_components (file vis.js). For example, https://www.bitdegree.org/learn/best-code-editor/html-canvas-tag-example-2 can be used as a reference for drawing these shapes. In addition, another button was added that allows the respective process to be downloaded as a TXT file containing the XML. radioService.js determines which XML file is downloaded. To create the BPMN XML, an additional endpoint (/generateBPMN) was added to the t2pcontroller, which returns the corresponding XML in response to a POST request.
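
A call to the /generateBPMN endpoint could look roughly like the sketch below. Only the endpoint path and the use of POST come from the description above. The base URL, the plain-text request body and the function name `build_bpmn_request` are assumptions for illustration, and the real controller may expect a different payload or a path prefix.

```python
from urllib.request import Request


def build_bpmn_request(base_url: str, process_text: str) -> Request:
    """Prepare a POST request for the /generateBPMN endpoint.

    Assumption: the process description is sent as a plain-text body.
    """
    return Request(
        url=base_url.rstrip("/") + "/generateBPMN",
        data=process_text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` against a running T2P instance would then return the response body, which according to the description is the BPMN XML.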

Additional work needed: Some classes should be split logically. For example, the BPMN component currently in petrinet.js should be moved to a separate file bpmn.js. In addition, a more up-to-date version of Angular should be used in the near future. The LTS of the currently used version expires at the end of 2021, after which security-relevant updates are no longer provided.

Epic: Conception

(Sprint 4 / Members: Jannik Steck)

There are concepts for loops and parallel splits
Information: The concepts are written in German
The two concepts are in the last comment: conception

Epic: Secondary task

(Sprint 4 / Members: Jannik Steck, Jonas Trautmann)

Wrong error message under the header "URL-Fehler" --> In the class T2PUI.java in the main WoPeD project, the wrong error message was mapped to a 500 error. This was replaced with the correct message T2PUI.500Error.Text. Furthermore, the text for T2PUI.GeneralError.Text in Messages.properties and Messages_en.properties was rewritten to be more meaningful.
Code refactoring

Further required work

Implement concepts (Epic: Conception)
Update the Stanford parser version
BPMN
Continue improving parallelism
