Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Translation Api #515

Merged
merged 40 commits into from
Apr 26, 2024
Merged

feat: Translation Api #515

merged 40 commits into from
Apr 26, 2024

Conversation

tomasciccola
Copy link
Contributor

this should close #466

@tomasciccola
Copy link
Contributor Author

I can't remember whats the rationale for omitting the actual translated value (called message) when hashing the doc. For now I'm using the whole doc for hashing, but there was probably a good reason for it

* start writing unit tests
* error handling for missing translation
* fix error in sql query
@gmaclennan
Copy link
Member

I can't remember whats the rationale for omitting the actual translated value (called message) when hashing the doc. For now I'm using the whole doc for hashing, but there was probably a good reason for it

Yes the reason is so that when you edit the translated value, the docId does not change, and so that you can know the docId without knowing the translated value (when reading a translated value, when you have the other fields)

src/translation-api.js Outdated Show resolved Hide resolved
src/translation-api.js Outdated Show resolved Hide resolved
src/mapeo-project.js Outdated Show resolved Hide resolved
* fix logic error when `put`
* handle unexisting key in the indexing (by creating a new Set)
* make `fieldRef` and `regionCode` optional when `get`
@tomasciccola
Copy link
Contributor Author

I'm wondering how are we actually using the cache (when calling translationApi.index(doc). If, lets say, I translate a field to spanish (by calling translationApi.put(translation), the index would look like

Map(1) { 'es' => Set(1) { 'field' } }

Which signals that basically we have fields translated to spanish. I would think that we would use this Map on subsequent calls to translationApi.get(), but I don't actually know how we would use that?? (I would imagine this would remove some filters from the sql query, but I don't actually know...)

@tomasciccola
Copy link
Contributor Author

Now that I think about it, shouldn't the #translatedLanguageCodeToSchemaNames, should be smth like

Map<languageCode,Map<schemaName,Map<fieldRef,message>>>

I know that it looks convoluted, but that way we're actually storing the translated values in memory that we can retrieve

@gmaclennan
Copy link
Member

Now that I think about it, shouldn't the #translatedLanguageCodeToSchemaNames, should be smth like

Map<languageCode,Map<schemaName,Map<fieldRef,message>>>

I know that it looks convoluted, but that way we're actually storing the translated values in memory that we can retrieve

In translationApi.get():

get(opts) {
  const docTypeIsTranslatedToLanguage = this.#translatedLanguageCodeToSchemaNames.get(opts.languageCode)?.has(opts.schemaNameRef)
  if (! docTypeIsTranslatedToLanguage) return []

The vast majority of records in the database will not have any translation. This provides a shortcut to avoid an unnecessary database query, with a fairly small memory footprint (the size of this map is limited to the number of languages the user has translated for, by the number of schema types they have translated, so fairly small).

@tomasciccola
Copy link
Contributor Author

owww ok yeah, that makes sense; its not to cache records, is to avoid queries for things we don't have translation for. I kinda knew that after our sync, but I guess I just forgot

@tomasciccola
Copy link
Contributor Author

tomasciccola commented Mar 12, 2024

Shouldn't the index method be called every time we do a put ? So, in that sense, it should basically be a private method of the api - and internally run every time we call put - ?
Is there a case where we want to put a translation but not index it??

* expose cache map through symbol and getter
* improve `get` signature (by making `fieldRef` and `regionCode`
  optional)
* add more unit tests
@tomasciccola
Copy link
Contributor Author

I took the liberty of setting index as a private method and running it when calling put. There's smth that I may be missing, so that change could be reverted.
I'll probably create e2e tests for this, so that we can test creating translations for real records in a project (probably using the configs that we use for testing), but in terms of unit tests I dunno if there's stuff missing (there may be...)

@gmaclennan
Copy link
Member

You're missing that there is more than one user/writer in a project 🙂

@tomasciccola
Copy link
Contributor Author

You're missing that there is more than one user/writer in a project 🙂

HEh! That's what I was missing. I'll revert that and update the tests

…ranslationApi

Also, have a fallback when no matching `regionCode` (can be reverted is
it not desired)
Comment on lines 48 to 50
if (doc) {
await this.#dataType.update(doc.versionId, value)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (doc) {
await this.#dataType.update(doc.versionId, value)
}
await this.#dataType.update(doc.versionId, value)

src/translation-api.js Show resolved Hide resolved
Comment on lines 49 to 53
// @ts-ignore how can this be improved (we can maybe set "useUnknownInCatchVariables": false)
if (e.message !== 'Not found')
throw new Error(`Error on translation ${e}`)
// TODO: throw if the error is different from 'Not found'
return await this.#dataType[kCreateWithDocId](docId, value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines have several issues, which (I think) I fixed in #590.

src/translation-api.js Show resolved Hide resolved
Copy link
Contributor

@EvanHahn EvanHahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for addressing all of my comments!

@tomasciccola tomasciccola merged commit 9fa05aa into main Apr 26, 2024
4 checks passed
@tomasciccola tomasciccola deleted the feat/translationApi branch April 26, 2024 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Translation API
3 participants