[Feature Request]: Support converting Nominal-Text -> Nominal/Ordinal #1633
Comments
Sorry if I'm just adding noise, as I'm out of my comfort zone, but maybe there is a better way than a conversion "based on the order at the moment of conversion"? It seems this could lead to confusing effects if a user changes, adds, or removes strings.
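To make the concern above concrete, here is a minimal R sketch, assuming the integer value comes from the alphabetical position of each label. The labels are hypothetical. It shows the code of an existing label shifting once a new label is added.

```r
# Hypothetical labels; codes come from the alphabetical level order.
before <- factor(c("b", "c"))
after  <- factor(c("a", "b", "c"))  # the user adds the label "a"

# Integer code of the label "b" before and after the addition.
code_before <- which(levels(before) == "b")
code_after  <- which(levels(after) == "b")

print(c(before = code_before, after = code_after))
```

Under that assumption, the same label no longer maps to the same integer once the set of strings changes.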
I'd do whatever R does when it converts character to factor:
set.seed(123)
c <- sample(letters, 5)
f <- factor(c)
# alphabetical
print(data.frame(
character = c,
factor = f,
integer = as.integer(f)
), row.names = FALSE)
#> character factor integer
#> o o 4
#> s s 5
#> n n 3
#> c c 1
#> j j 2
c <- c("汉", "字", letters[seq(3, 1, -1)])
f <- factor(c)
# no idea what determines the order for the Chinese characters
print(data.frame(
character = c,
factor = f,
integer = as.integer(f)
), row.names = FALSE)
#> character factor integer
#> 汉 汉 5
#> 字 字 4
#> c c 3
#> b b 2
#> a a 1
Perhaps that's what R does? I think it just sorts the unique values and uses that to assign integer values.
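A small sketch of that guess, using the same five letters as in the first example: sort the unique strings, then take each string's position in the sorted set as its integer value. This mirrors the default behaviour of `factor()` for character input as I understand it. The ordering of non-ASCII text depends on the locale's collation rules, so it is not asserted here.

```r
x <- c("o", "s", "n", "c", "j")

# Sort the unique values, then look up each value's position.
lv    <- sort(unique(x))
codes <- match(x, lv)

# The manual codes agree with what factor() assigns by default.
print(data.frame(character = x, manual = codes, factor = as.integer(factor(x))),
      row.names = FALSE)
```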
@vandenman Well, I meant a conversion based on the actual string "value", not its ordered position.
Well, actually this is how we already use them inside analyses: if you change the order of the labels in the variables window, that changes the order in the resulting factor. To make that a bit clearer: when we feed the nominal-text column to R now, it is in fact converted into a factor. So just using that when converting to nominal and ordinal should be alright. It also lets users decide the order of their scales and things like that, which I assume they want (and which we wouldn't get if we just ordered based on the strings).
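As a rough illustration of this idea (not JASP's actual implementation), the sketch below passes a user-chosen label order as `levels` and derives the integer values from it. The labels and the `user_order` vector are hypothetical stand-ins for whatever order the variables window holds at the moment of conversion.

```r
labels     <- c("low", "high", "medium", "low")
user_order <- c("low", "medium", "high")  # hypothetical order set by the user

# Nominal: integer values follow the user's label order, not the alphabet.
nominal <- factor(labels, levels = user_order)
values  <- as.integer(nominal)

# Ordinal: same levels, but flagged as ordered.
ordinal <- factor(labels, levels = user_order, ordered = TRUE)

print(data.frame(label = labels, value = values), row.names = FALSE)
```

Sorting on the strings alone would put "high" first, which is presumably not the scale order a user wants.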
That's OK. The order of Chinese characters is usually not considered, because in quantitative data analysis Chinese characters are generally used as labels and not treated as values. If ordering is to be considered, I would suggest ordering by value.
I think this makes sense.
Also makes sense. There is one edge case though that I would check for. In R, this situation can occur:
f <- factor(as.character(1:11))
f # order from sorting 1:11 as strings
#> [1] 1 2 3 4 5 6 7 8 9 10 11
#> Levels: 1 10 11 2 3 4 5 6 7 8 9
fSorted <- factor(f, levels = sort(as.numeric(levels(f))))
fSorted # order from sorting 1:11 as numbers
#> [1] 1 2 3 4 5 6 7 8 9 10 11
#> Levels: 1 2 3 4 5 6 7 8 9 10 11
where the default levels (first print) have order 1, 10, 11, 2, ..., 9, because the labels are sorted as strings rather than as numbers.
Also, I'd imagine this is just the default conversion from nominal text to nominal/ordinal. Afterward, people should be able to change the order and labels in any way they want.
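One way to handle the numeric-string edge case described above is sketched below. `default_levels` is a hypothetical helper, not an existing R or JASP function: it orders the labels numerically when every label parses as a number and otherwise falls back to plain string sorting.

```r
# Hypothetical helper: choose a default level order for text labels.
default_levels <- function(x) {
  lv  <- unique(as.character(x))
  num <- suppressWarnings(as.numeric(lv))  # NA for labels that are not numbers
  if (length(lv) > 0 && !anyNA(num)) {
    lv[order(num)]   # all labels are numeric: sort by numeric value
  } else {
    sort(lv)         # otherwise: sort as strings
  }
}

x <- as.character(1:11)
f <- factor(x, levels = default_levels(x))
print(levels(f))
```

This only picks the starting order; the user could still reorder or relabel the levels afterwards.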
Sure, but the issue is that we need a consistent way to assign values to text. That text may consist of Chinese characters, Hebrew symbols, or who knows what kind of characters. However, initially, there is no value we can use to order by.
@JorisGoosen
Indeed!
This has come up often and we added some feedback for users on why a column can't be converted to another type at jasp-stats/jasp-desktop@769383d for our internal issue https://github.com/jasp-stats/INTERNAL-jasp/issues/977
#1258 is perhaps related as well as #1581
But now also @EJWagenmakers was asking me about it and I think it shouldn't be so very hard to just do the following:
Support Nominal-Text -> Nominal/Ordinal
Where we drop the original strings (as found in, for instance, a CSV file) and assign an integer value based on the order at the moment of conversion. This will "lose" some information, but that is not so bad.
Converting to scalar would then still fail, because then even the labels would be lost, and I suppose that is not what one wants.
On the other hand, a message box asking the user whether they are OK with losing the data could also be done, I suppose.
And also this: https://github.com/jasp-stats/INTERNAL-jasp/issues/1397