Borrow from biblatex #64
Comments
For sorting and dates in
Sorting
Sorting is, as you probably know, fully configurable via "sorting templates" and there are several predefined for common patterns (see the easily readable definitions in
The
Dates
This is all done with a custom EDTF parsing module in
Feel free to ask any other questions - it's a few years since I implemented most of this but I'm still actively supporting it all and so am fairly on top of it still. |
Thanks for the thorough reply @plk! Before I explain a bit more: the project is written in Rust, and JSON schemas are generated from that model. So the examples I'm using for illustration are YAML, since that's a valid format in this context.
On dates, I do have an EDTF parser I'm using, so the input end is covered. On the output end, I am currently using configuration options drawn from the JavaScript dates:
  month: long
In general, these "options" are defined globally in a style, and can be overridden in the local context of a citation or bibliography. And finally, in the templates, template "components" have a "form" property so one can do this:
- date: issued
  form: month
I think that part is sound. Do you agree? I hadn't, however, thought much about extended dates and times. From reviewing your manual it looks like it would be pretty easy, as I think you're suggesting, for me to add a few options on I guess some of that may need to be localized as well? On sorting, I'll look more closely at the sorting templates. |
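The "global options, overridden locally" idea above could look roughly like the sketch below. It is only an illustration: `MonthFormat`, `DateOptions`, `resolve` and `render_month` are hypothetical names, not the project's actual model, and the month names are hard-coded English rather than localized.

```rust
/// How a month is rendered; `Long` corresponds to the `month: long` option above.
/// The other variants are assumptions for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MonthFormat {
    Long,
    Short,
    Numeric,
}

/// Date options at one level (style, citation or bibliography).
/// `None` means "not set at this level".
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct DateOptions {
    pub month: Option<MonthFormat>,
}

impl DateOptions {
    /// A locally set value wins; otherwise fall back to the global one.
    pub fn resolve(global: DateOptions, local: DateOptions) -> DateOptions {
        DateOptions {
            month: local.month.or(global.month),
        }
    }
}

/// Render a month number (1-12) in the given format.
/// Returns `None` for numbers outside 1-12.
pub fn render_month(month: u8, format: MonthFormat) -> Option<String> {
    const NAMES: [&str; 12] = [
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December",
    ];
    let index = usize::from(month).checked_sub(1)?;
    let name = NAMES.get(index)?;
    Some(match format {
        MonthFormat::Long => name.to_string(),
        // Naive abbreviation: the first three characters of the English name.
        MonthFormat::Short => name.chars().take(3).collect(),
        MonthFormat::Numeric => month.to_string(),
    })
}
```

A real implementation would presumably delegate the rendering step to a localization library rather than a fixed table.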
One other, specific, question: why the EDIT: oh, I see you answered that. I guess normally it would be empty, but you sometimes need it? A related question is when you need In my in-progress code here, I'm defining some behavior that can be called like this: author.key() So a few different data types (dates, contributors, titles) will share that same trait. In that case, it will return a string just for sorting, like "doe-jane:smith-john". |
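A minimal sketch of the shared `key()` behavior described above, assuming a trait that several data types implement. The trait name `SortKeyed` and the `Name` and `Title` types are hypothetical; only the `key()` method and the `"doe-jane:smith-john"` output shape come from the comment.

```rust
/// Anything that can produce a string used only for sorting.
pub trait SortKeyed {
    fn key(&self) -> String;
}

/// Hypothetical contributor name with just two parts.
#[derive(Debug, Clone)]
pub struct Name {
    pub family: String,
    pub given: String,
}

impl SortKeyed for Name {
    /// "Doe", "Jane" becomes "doe-jane".
    fn key(&self) -> String {
        format!("{}-{}", self.family.to_lowercase(), self.given.to_lowercase())
    }
}

/// A contributor list joins the per-name keys with ':'.
impl SortKeyed for Vec<Name> {
    fn key(&self) -> String {
        self.iter()
            .map(|name| name.key())
            .collect::<Vec<_>>()
            .join(":")
    }
}

/// Hypothetical title type; its key is just the lowercased text.
#[derive(Debug, Clone)]
pub struct Title(pub String);

impl SortKeyed for Title {
    fn key(&self) -> String {
        self.0.to_lowercase()
    }
}
```

Lowercasing is a stand-in here; proper collation would need locale-aware comparison rather than plain string keys.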
The output
There have been requests to rewrite |
Honestly, I would generate sorting keys from names via a template and set a sensible default - you'll need to use templates eventually when non-Western language users start to request it. On the other hand, it's not so hard to retrofit, I found. |
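The suggestion above (build name sort keys from a template, with a sensible default) might be sketched as follows. All names here (`NamePart`, `PersonName`, `NameKeyTemplate`, `name_key`) are hypothetical, and the family-then-given default is an assumption, not biber's actual default template.

```rust
/// Parts of a name that a key template may reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NamePart {
    Prefix,
    Family,
    Given,
}

/// Hypothetical name model; any part may be missing.
#[derive(Debug, Clone, Default)]
pub struct PersonName {
    pub prefix: Option<String>,
    pub family: Option<String>,
    pub given: Option<String>,
}

/// Ordered list of parts that make up the sort key.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NameKeyTemplate(pub Vec<NamePart>);

impl Default for NameKeyTemplate {
    /// Assumed default: family name first, then given name.
    fn default() -> Self {
        NameKeyTemplate(vec![NamePart::Family, NamePart::Given])
    }
}

/// Build a sort key by pulling the parts named in the template, skipping missing ones.
pub fn name_key(name: &PersonName, template: &NameKeyTemplate) -> String {
    template
        .0
        .iter()
        .filter_map(|part| match part {
            NamePart::Prefix => name.prefix.as_deref(),
            NamePart::Family => name.family.as_deref(),
            NamePart::Given => name.given.as_deref(),
        })
        .map(str::to_lowercase)
        .collect::<Vec<_>>()
        .join("-")
}
```

Because the template is data, a style could swap in a different part order without any change to `name_key`.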
Currently, those are only defined in options, but those can be set either globally or locally. My assumption there is one wouldn't need a long month and a short one in the same bibliography? But it should be easy to extend if for some reason my assumption is wrong.
Right. And in a batch-oriented context like TeX, probably not worth the hassle? With this project, I started out with TypeScript, but switched to Rust because, while much more difficult in some ways, it makes other things I need much simpler; namely schema generation and serializing and deserializing that data. Also, I just think we need a CSL-ish processor that can work well in different contexts, including the web and desktop GUIs. While the compiler can be really annoying, things usually just work when I make it happy!
With CSL 1.0, we kind of took the approach to make some things fairly complicated and flexible upfront, not really knowing what we needed. So not only sorting is configured via templates (what we call there "macros"), but so are author substitutions, contributor role labeling, date formatting, etc. With this project, I'm trying to simplify wherever possible, moving much of that configuration to these options. But multi-lingual is definitely a goal here; am just trying to get there progressively. It may be the method for the keys and the sorting takes parameters to handle some of that, which can in turn be set in style options. |
Right, those options are basically global in
Not really although people do complain about
I looked at what would be needed in Rust for
I found that every time I tried to implement a simple option, in the end I had to extend it to be a fully configurable interface. Still basically an "option" but a complex one that's defined using TeX macros whose sole job is to output a complex option in XML in the
It's been an issue for |
Is Perl the kind of language where you can off-load pieces of performance-intensive processing to Rust code? I know it's often used for that. For example, in the neovim world, plugins are written in Lua, but some projects will rewrite pieces in Rust. Of course, if those key pieces would need to rely on crates that don't really exist ...
I saw the new Hayagriva project from the typst folks uses this crate. https://crates.io/crates/biblatex Do you know which ICU crate you were looking at? I guess there are two; the one recommended to me on a Rust forum was this one, which is pure Rust. But I did find it difficult, which is why I needed help from the forum to figure out localized date formatting (which I now need to implement). https://users.rust-lang.org/t/localized-date-time-formatting/94868
But you probably couldn't have figured out the latter without first doing the former? I'm currently thinking on sorting to make room for other configuration options. So this:
sort:
  - contributor: author
    order: ascending
  - date: issued
    order: ascending
... becomes something like:
sort:
  bar: x # new options
  foo:
    - contributor: author
      order: ascending
    - date: issued
      order: ascending
E.g. effectively define an area to put config parameters as I need them. |
I've not looked into Rust integration but there are ways to integrate C code. I've had a look at this sort of thing before and I think that a complete re-write is likely the best policy for performance. However, performance isn't really much of an issue: it's not slow, it's just that people who use
I may have looked at this, I'll have another look, just out of interest.
Can't remember offhand. ICU in general is more complex (and complete) than most Unicode libs ...
Good point - that's true to some extent but in retrospect, where there were hard-coded assumptions in the structure of some option (like the parts of names and number of characters etc. to take from a name to construct a name key), I think it's best to make a user-facing template and use the template to pull the data parts as you'll inevitably have to extend it.
It depends a bit on how many new options there will be. I'd say, assume "quite a few". Not all of the sort-relevant options have to be in the sorting template - we have the template itself (effectively what you have here) and then other complex options which determine other aspects of sorting (such as the sort exclusions, name key generation etc.). If you have a look at a sample For example, here you can see an example from the regression test files of a https://github.com/plk/biber/blob/dev/t/tdata/basic-misc.bcf You'll see that the sorting templates don't contain the name key generation template - that's a separate option. |
Oh WOW! |
OK, so looking at this example:
<bcf:sortingtemplate name="nty">
<bcf:sort order="1">
<bcf:sortitem order="1">presort</bcf:sortitem>
</bcf:sort>
<bcf:sort order="2" final="1">
<bcf:sortitem order="1">sortkey</bcf:sortitem>
</bcf:sort>
<bcf:sort order="3">
<bcf:sortitem order="1">sortname</bcf:sortitem>
<bcf:sortitem order="2">author</bcf:sortitem>
<bcf:sortitem order="3">editor</bcf:sortitem>
<bcf:sortitem order="4">translator</bcf:sortitem>
<bcf:sortitem order="5">sorttitle</bcf:sortitem>
<bcf:sortitem order="6">title</bcf:sortitem>
</bcf:sort>
<bcf:sort order="4">
<bcf:sortitem order="1">sorttitle</bcf:sortitem>
<bcf:sortitem order="2">title</bcf:sortitem>
</bcf:sort>
<bcf:sort order="5">
<bcf:sortitem order="1">sortyear</bcf:sortitem>
<bcf:sortitem order="2">year</bcf:sortitem>
</bcf:sort>
<bcf:sort order="6">
<bcf:sortitem order="1">volume</bcf:sortitem>
<bcf:sortitem literal="1" order="2">0</bcf:sortitem>
</bcf:sort>
</bcf:sortingtemplate>
Let me see if I understand this
<bcf:sort>
<bcf:sortitem>sorttitle</bcf:sortitem>
<bcf:sortitem>title</bcf:sortitem>
</bcf:sort>
So here, maybe something like this as a second cut:
#[derive(Default, Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Sort {
pub config: SortConfig,
pub template: Vec<SortTemplate>,
}
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct SortTemplate {
pub key: SortKey,
pub order: SortOrder,
}
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum SortOrder {
#[default]
Ascending,
Descending,
}
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum SortKey {
#[default]
Author, // by default, substitution rules apply
Editor,
IssuedYear,
Type,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct SortConfig {
/// Shorten name lists for sorting the same as for display.
pub shorten_names: bool,
/// Use same substitutions for sorting as for rendering.
pub render_substitutions: bool,
// etc
}
impl Default for SortConfig {
fn default() -> Self {
Self {
shorten_names: false,
render_substitutions: true,
}
}
}
So in YAML:
sort:
  template:
    - author
    - issued-year
Where the defaults for order, config and substitution are already set. |
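To show how a template like the one above might drive the actual comparison, here is a hedged sketch. It redefines trimmed-down versions of `SortKey`, `SortOrder` and `SortTemplate` so the block stands alone; `Reference`, `compare_by` and `sort_references` are hypothetical, and comparison is plain lowercase string ordering rather than locale-aware collation.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SortOrder {
    Ascending,
    Descending,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SortKey {
    Author,
    IssuedYear,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SortTemplate {
    pub key: SortKey,
    pub order: SortOrder,
}

/// Hypothetical, heavily simplified reference record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Reference {
    pub id: &'static str,
    pub author: String,
    pub issued_year: i32,
}

/// Compare two references on one template item, honoring its order.
fn compare_by(a: &Reference, b: &Reference, item: &SortTemplate) -> Ordering {
    let ordering = match item.key {
        SortKey::Author => a.author.to_lowercase().cmp(&b.author.to_lowercase()),
        SortKey::IssuedYear => a.issued_year.cmp(&b.issued_year),
    };
    match item.order {
        SortOrder::Ascending => ordering,
        SortOrder::Descending => ordering.reverse(),
    }
}

/// Sort in place: the first template item that distinguishes two references decides.
pub fn sort_references(refs: &mut [Reference], template: &[SortTemplate]) {
    refs.sort_by(|a, b| {
        template
            .iter()
            .map(|item| compare_by(a, b, item))
            .find(|ordering| *ordering != Ordering::Equal)
            .unwrap_or(Ordering::Equal)
    });
}
```

Since `sort_by` is stable, references that tie on every template item keep their input order.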
To make room for additional configuration options for sorting, move the template to a named template field, and add a couple of example parameters. Refs: #64 Signed-off-by: Bruce D'Arcus <bdarcus@gmail.com>
The general semantics is that the
This provides a fixed place in the sorting for when you want to give the sorting key for this part when there is no suitable field (for example, if there is no
Artefact of the library I use - it reads into a random-ordered hash so I have this to make sure of the order. Also was just in case of issues in the
Actually,
This looks nice, yes. |
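As a hedged reading of the `nty` template quoted earlier, based on my understanding of the biblatex documentation rather than on biber's code: each `sort` group contributes the first of its items that is defined, and a group marked `final` ends key construction once it yields a value. The types below (`SortItem`, `SortGroup`, `sort_key`) are hypothetical, and pushing an empty string for a group with no match is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// One item in a group: a field lookup or a literal fallback string.
pub enum SortItem {
    Field(&'static str),
    Literal(&'static str),
}

/// A group of alternatives; `is_final` mirrors the `final="1"` attribute.
pub struct SortGroup {
    pub items: Vec<SortItem>,
    pub is_final: bool,
}

/// Build the list of key parts for one entry.
pub fn sort_key(entry: &HashMap<String, String>, groups: &[SortGroup]) -> Vec<String> {
    let mut parts = Vec::new();
    for group in groups {
        // The first item that is defined wins; later items are not consulted.
        let found = group.items.iter().find_map(|item| match item {
            SortItem::Field(name) => entry.get(*name).map(|value| value.to_lowercase()),
            SortItem::Literal(text) => Some(text.to_string()),
        });
        match found {
            Some(value) => {
                parts.push(value);
                if group.is_final {
                    // A final group that produced a value ends the key.
                    break;
                }
            }
            // Assumption: an unmatched group contributes an empty part.
            None => parts.push(String::new()),
        }
    }
    parts
}
```

With this shape, an explicit `sortkey` on an entry overrides the rest of the template, which matches the role of the `final` group in the quoted example.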
OK, I merged the initial results of this very useful discussion; both the adjustments to the sort model, and added a couple of parameters for dates here (I'll need to figure out how to get the localized date-formatting + EDTF code working before figuring out what more I need; it's a much bigger hassle than in JS): Hopefully I can keep the sort model simple-ish :-) |
I missed earlier that this crate is actually written by the typst devs, so is also fairly newly-available. |
See also #61
Beyond CSL, the other excellent package first released around the same time, and similarly ambitious, is biblatex.
It has struck me that its design has some similarities to what I'm doing here.
Consider their long list of completely flat parameters, aka options (and see table I've attached below for how they map to scopes):
They've also been ahead of us on EDTF, and it looks like they've already figured it out.
biblatex-options-table.pdf