Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to read in template information from a separate external file #1152

Closed
pk-mitre opened this issue Sep 28, 2023 · 17 comments · Fixed by #1154
Closed

Add option to read in template information from a separate external file #1152

pk-mitre opened this issue Sep 28, 2023 · 17 comments · Fixed by #1154

Comments

@pk-mitre
Copy link
Contributor

Currently ROBOT template files as described here have to include the template strings for each column as the second line in the input template file.
An option to add the ability to read the template string definitions from a separate file would allow for a more flexible pipeline where the template strings can be separately created and input via a file.
A usage example would be
robot template --template animals_ext_template.tsv \
--ext-template animals_template.tsv \
--output results/animals_ext_template.owl

pk-mitre added a commit to pk-mitre/robot that referenced this issue Sep 28, 2023
@pk-mitre pk-mitre mentioned this issue Sep 28, 2023
5 tasks
@matentzn
Copy link
Contributor

I would love that.. The name you suggest is not expressive enough yet, but we can work on that. There is also the option (which is more along the lines of the notion of template) to go with --template animals_template.tsv --template-data animals_ext_template.tsv, so, factorting the data out into a separate parameter rather than the template.

@pk-mitre I see you are already working on a PR! Awesome!

@pk-mitre
Copy link
Contributor Author

pk-mitre commented Sep 29, 2023 via email

@jamesaoverton
Copy link
Member

I agree that this is a good idea, and a first glance at your branch looks good. Please open a Pull Request and we'll review it.

@pk-mitre
Copy link
Contributor Author

pk-mitre commented Sep 29, 2023 via email

@jamesaoverton
Copy link
Member

You created a fork of this repository, and you created a PR on your fork pk-mitre#1, but that's not what I mean.

It would be helpful if you can open a PR on this repository, so that it shows up in our list of PRs here https://github.com/ontodev/robot/pulls. Here is the GitHub documentation on that https://docs.github.com/en/pull-requests. If you have trouble, let us know and we can help.

Much appreciated!

@matentzn
Copy link
Contributor

Before we move to the PR, lets first agree on the whether the data or the template part should be factored out into a separate variable.

👍: --template template_header.tsv --template-data template_data.tsv
🎉 :--template template_data.tsv --template-header template_header.tsv

@pk-mitre
Copy link
Contributor Author

pk-mitre commented Sep 29, 2023 via email

@pk-mitre
Copy link
Contributor Author

pk-mitre commented Sep 29, 2023 via email

@pk-mitre pk-mitre mentioned this issue Sep 29, 2023
5 tasks
@jamesaoverton
Copy link
Member

👍: --template template_header.tsv --template-data template_data.tsv

How will this first option be backwards compatible?

🎉 :--template template_data.tsv --template-header template_header.tsv

This second option seems safer to me.

@matentzn
Copy link
Contributor

matentzn commented Oct 2, 2023

How will this first option be backwards compatible?

I think they both can be made backwards compatible, and both have their own issues in that.

If we go with the first option, where the rows that contain the template strings are in the --template option, we don't need to write any template parsing coat for the --template-data option (because we just append its contents under the table parsed by --template). If we go with option two, in order to be backwards compatible, we will have to write template handling code for the --template and --template-data: --template to be backwards compatible, and --template-data to be forwards compatible.

I think with option 2, the only thing that needs to change in the code is to add a single line that appends the data provided by --template-data to the base template --template. It even continues to be allowed to have actual template data in --template - we don't change its definition at all!

@jamesaoverton
Copy link
Member

With the current version of ROBOT, you can always manipulate your input tables before ROBOT sees them to add template strings or whatever. Call this the "default approach". I tend to use standard head and tail commands and shell pipes for this. If ROBOT sees a problem with the file, it will report the correct row numbers for the input that it saw, and the user can figure it out.

If we want ROBOT to support fancier template inputs, that feature needs to be better than the default approach. When there is a problem with the template(s), ROBOT needs to tell the user which file, row, and column to look at. The current PR will mess up those error reports, which makes it (in my opinion) worse than the default approach. I'm worried that doing better than the default will require deeper changes to the code than we want to make.

@dlutz2
Copy link
Contributor

dlutz2 commented Oct 12, 2023

If we can live with the restriction that a single Template command line run uses either the current template-in-2nd-row or the external-template-file approach for all incoming data (doesn't mix approaches), I believe the only change necessary to resolve the off-by-one in the error reporting would be to set Template.rowNum =2 (when using the current approach as currently coded) or rowNum=1 (when using external template approach). This starting offset would be set in TemplateOperations.template() when creating the template and using OptionsHelper to determine which templating approach was being used.
Am I missing something?

@dlutz2
Copy link
Contributor

dlutz2 commented Oct 24, 2023

Where did we leave this? If we can make it work such as the row/columns in the error reporting are correct, without major changes to the existing code, is the capability of interest? Or should we just close the PR?

@jamesaoverton
Copy link
Member

Sorry for the delay. @dlutz2 your proposal makes sense to me. From my perspective, the important things are backwards compatibility and clear communication to the user.

@pk-mitre
Copy link
Contributor Author

pk-mitre commented Nov 2, 2023

I've added some updates to take into account the rowNum offset for an external template file. A test data set is referenced in template.md (commented out ) to illustrate this.

@pk-mitre
Copy link
Contributor Author

Does this update for the keeping the line numbers for error reporting consistent meet the requirement ?

@matentzn
Copy link
Contributor

What is the state of this issue? would be great to have this feature :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants