expose a function in Lua to generate identifiers/sort keys from UTF8 text #9591

Closed
massifrg opened this issue Mar 19, 2024 · 6 comments

Comments

@massifrg

I need a function similar to the one that computes auto identifiers.

I need it for Lua filters/writers.

It should:

  • convert text to lowercase (this is already available in Lua thanks to pandoc.text)

  • remove accents from letters: "àéü" => "aeu"

  • collapse each run of non-alphanumeric chars into a single separator (a space, "-", or "_"): "foo (11/12), bar" => "foo-11-12-bar"

In particular, I need it to automatically generate sort keys for indices.
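
The three requirements above can be illustrated with a minimal sketch. It is written in Python rather than Lua purely for illustration, and `sort_key` is a hypothetical name, not a pandoc or pandoc Lua API function. It assumes that "removing accents" means decomposing characters (NFKD) and dropping the combining marks, and that only ASCII letters and digits should survive; letters from non-Latin scripts would be dropped under that assumption.

```python
import re
import unicodedata


def sort_key(text: str, sep: str = "-") -> str:
    """Hypothetical helper: lowercase, strip accents, collapse non-alphanumerics."""
    # Decompose characters so that accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace each run of characters outside a-z and 0-9 with one separator.
    collapsed = re.sub(r"[^a-z0-9]+", sep, stripped)
    # Trim any separator characters left at either end.
    return collapsed.strip(sep)


print(sort_key("àéü"))               # aeu
print(sort_key("foo (11/12), bar"))  # foo-11-12-bar
```

The separator is a parameter because the request leaves the choice between a space, "-", and "_" open.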

@tarleb
Collaborator

tarleb commented Mar 20, 2024

I think that's a bit too specific and hence out of scope, especially the "remove accents" part.

See this StackOverflow Q&A for a hack to generate auto-identifiers from text.

@tarleb closed this as not planned on Mar 20, 2024
@tarleb
Collaborator

tarleb commented Mar 20, 2024

Addendum: you might like sluaggo, it was even written specifically with pandoc Lua filters in mind.

@massifrg
Author

Thanks @tarleb for the hints.

I also found issue #6415 that's related to what I'm working on.

There @jgm cites his unicode-collation library, which is used in pandoc for citeproc.

Though not strictly related to generating sort keys, a pandoc.text.collate(lang, text1, text2) function in Lua may be useful in conjunction with pandoc.List:sort.
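
To show how a three-argument comparator of that shape could pair with a list sort, here is a rough sketch, again in Python for illustration only. The `collate` function below is a hypothetical stand-in, not the unicode-collation library or any existing pandoc function: it ignores `lang` and simply compares case-folded, accent-stripped text, whereas real collation would apply language-specific rules.

```python
import unicodedata
from functools import cmp_to_key


def collate(lang: str, text1: str, text2: str) -> int:
    """Hypothetical stand-in for a collation function; returns -1, 0, or 1.

    The lang argument is accepted but ignored in this simplified sketch.
    """
    def key(text: str) -> str:
        # Case-fold, decompose, and drop combining marks before comparing.
        decomposed = unicodedata.normalize("NFKD", text.casefold())
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    k1, k2 = key(text1), key(text2)
    return (k1 > k2) - (k1 < k2)


def sort_entries(entries, lang="en"):
    """Sort index entries by wrapping the comparator for use as a sort key."""
    return sorted(entries, key=cmp_to_key(lambda a, b: collate(lang, a, b)))


print(sort_entries(["zebra", "Étude", "apple"]))  # ['apple', 'Étude', 'zebra']
```

A Lua sort would expect a comparator that returns a boolean (true when the first element comes first), so a Lua wrapper would test whether the collation result is negative instead of returning it directly.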

@jgm
Owner

jgm commented Mar 20, 2024

I'll note that we do have textToIdentifier in Text.Pandoc.Shared. In principle it could be wrapped and made available in the Lua API. Though I share @tarleb's sense that this might be a bit special-purpose for us to include in the API.

@tarleb
Collaborator

tarleb commented Mar 30, 2024

@massifrg Having access to collate in Lua would certainly be helpful. Would you raise an issue for the collate feature request over at hslua? I'll look into it when I can, and it would find the way into pandoc from there.

@massifrg
Author

> Would you raise an issue for the collate feature request over at hslua?

Thanks @tarleb.
I did it.
