[Feature request] customized suffixes or prefixes in join command #263

Closed
Jesson-mark opened this issue Dec 17, 2023 · 4 comments


@Jesson-mark

Hi, Wei,
Thanks for your great work on csvtk! It is really great software that I use every day!

I am wondering whether a new option could be added for supplying custom suffixes or prefixes when using join to merge two files. I know --prefix-filename and --prefix-trim-ext can add each filename as a prefix to each column name, but in some cases my filenames are long strings, so the resulting column names become too long to read or tell apart. In such cases, I only want to add two distinct strings (labels) to the merged columns.

It would be great if an option for supplying custom suffixes or prefixes were added.

Thanks again for your great work!

@shenwei356
Owner

Please show some simple examples.

@Jesson-mark
Author

Jesson-mark commented Dec 18, 2023

Sorry for my ambiguous explanation above. Below are some simple examples:

Suppose there are two files named phones.csv and region.csv (the same example files used for the join command):

$ cat phones.csv 
username,phone
gri,11111
rob,12345
ken,22222
shenwei,999999

$ cat region.csv 
name,region
ken,nowhere
gri,somewhere
shenwei,another
Thompson,there

When joining them with the --prefix-filename and --prefix-trim-ext options, the result is:

$ csvtk join -f 1 phones.csv region.csv --prefix-filename --prefix-trim-ext
username,phones-phone,region-region
gri,11111,somewhere
ken,22222,nowhere
shenwei,999999,another

If there were a new option, e.g. --suffix "label1,label2", where label1 is appended to the columns of file1 and label2 is appended to the columns of file2, the result would become:

$ csvtk join -f 1 phones.csv region.csv --suffix "A,B"
username,phone-A,region-B
gri,11111,somewhere
ken,22222,nowhere
shenwei,999999,another

Now A and B are appended to the merged columns, which is more readable than the previous output because the labels can be customized without renaming the files.

The option (--suffix) is just like the suffix parameter of the left_join function in the dtplyr package, but csvtk is more convenient than dtplyr, since using the latter requires writing a short script.
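
To make the requested behaviour concrete, here is a minimal Python sketch of the semantics described above. It is only an illustration, not how csvtk implements anything: the function name join_with_suffixes is hypothetical, the "-" separator is an assumption inferred from the example output, and the sketch assumes an inner join on the first column with unique keys in the second file.

```python
import csv
import io


def join_with_suffixes(left_text, right_text, suffixes, sep="-"):
    """Inner-join two CSV texts on their first column, labelling non-key columns.

    Hypothetical helper for illustration only; not part of csvtk.
    """
    left = list(csv.reader(io.StringIO(left_text)))
    right = list(csv.reader(io.StringIO(right_text)))
    left_label, right_label = suffixes

    # Header: keep the key name from the first file, suffix every other column.
    header = [left[0][0]]
    header += [f"{name}{sep}{left_label}" for name in left[0][1:]]
    header += [f"{name}{sep}{right_label}" for name in right[0][1:]]

    # Index the second file by key (assumes keys are unique in that file).
    right_by_key = {row[0]: row[1:] for row in right[1:]}

    rows = [header]
    for row in left[1:]:
        if row[0] in right_by_key:  # inner join: drop keys without a match
            rows.append(row + right_by_key[row[0]])
    return rows


phones = "username,phone\ngri,11111\nrob,12345\nken,22222\nshenwei,999999\n"
region = "name,region\nken,nowhere\ngri,somewhere\nshenwei,another\nThompson,there\n"

for row in join_with_suffixes(phones, region, ("A", "B")):
    print(",".join(row))
```

With the two example files, this prints the header username,phone-A,region-B followed by the rows for gri, ken and shenwei, matching the output proposed above.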

shenwei356 added a commit that referenced this issue Dec 18, 2023
@shenwei356
Owner

Added:

$ csvtk join -f 1 phones.csv region.csv --suffix "A,B"  | csvtk pretty 
username   phone-A   region-B 
--------   -------   ---------
gri        11111     somewhere
ken        22222     nowhere  
shenwei    999999    another 

@Jesson-mark
Author

Thank you for your prompt reply and modifications to csvtk! I have great respect for that!
