Unable to translate japanese #15

Closed
BtencateSphereon opened this issue Dec 6, 2021 · 13 comments
Labels
bug Something isn't working

Comments

@BtencateSphereon

BtencateSphereon commented Dec 6, 2021

Hey,

I am noticing a difference in translation between using the scheduler and using the translator directly.

Test string: こんにちは ("hello" in Japanese)

scheduler:
```js
const scheduler = new Scheduler(translator);

scheduler
  .translate('こんにちは', 'ja', 'en')
  .then((translate) => console.log(translate));
```
result: 縺 薙 s 縺 縺

translator:
```js
const translator = new GoogleTranslator();

translator
  .translate('こんにちは', 'ja', 'en')
  .then((translate) => console.log(translate));
```
result: hello

Is this something I am doing wrong, or something that can be fixed?

Thanks in advance

@vitonsky
Contributor

vitonsky commented Dec 6, 2021

Hi. Does it reproduce with the translateBatch method?
If yes, then the problem is not in the scheduler. I will check it soon.

@vitonsky added the bug (Something isn't working) label Dec 6, 2021
@BtencateSphereon
Author

Hey, thanks for the quick reply,

Yes, the error is the same with translateBatch:

translator
.translateBatch(['こんにちは'], 'ja', 'en')
.then((translate) => console.log(translate));

result: [ '縺 薙 s 縺 縺.' ]
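A side note on the garbled output: it resembles what you get when the UTF-8 bytes of こんにちは are read as Shift_JIS, which would suggest the input came back untranslated and mis-decoded rather than as a real translation. This is a guess at the symptom, not a confirmed diagnosis. A minimal sketch to check it in node, assuming a build with full ICU so that TextDecoder accepts the shift_jis label:

```javascript
// Encode the test string as UTF-8 bytes, then decode those same bytes as Shift_JIS.
const utf8Bytes = new TextEncoder().encode('こんにちは');
const garbled = new TextDecoder('shift_jis').decode(utf8Bytes);

// The output begins with the same characters as the reported result (縺 薙 ...).
console.log(garbled);
```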

@vitonsky
Contributor

vitonsky commented Dec 6, 2021

As far as I can see, the Google API returns this response: <pre><a i=\"0\">縺 薙 s 縺 縺.</a></pre>
Also, this phrase does not translate on translate.google.com for me (does it for you? Please test it).
But the phrase "縺 薙" is translated successfully as "Nagi".

If the Google service can't translate this, then unfortunately we can't fix it.
I will try to make the Yandex translator work in the node environment so that you have an alternative service, but it is not so easy, so I need time.

If you need to, you can override the translateBatch method of the GoogleTranslator class: when there is only one item, call the translate method, and otherwise call the super method. Something like:

class MyGoogleTranslator extends GoogleTranslator {
  // With a single text, use translate(); otherwise fall back to the batch API
  public async translateBatch(texts: string[], from: string, to: string) {
    if (texts.length === 1) {
      return [await this.translate(texts[0], from, to)];
    }
    return super.translateBatch(texts, from, to);
  }
}

Or always call the translate method and use your own implementation of multiplexing (you can use the Multiplexor helper).
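The second option could look roughly like this: a sketch that skips translateBatch entirely and sends each text through translate. It assumes only that the translator object has a translate(text, from, to) method returning a promise, as in the snippets in this thread. translateOneByOne and stubTranslator are made-up names for illustration, and this is not the library's Multiplexor helper.

```javascript
// Hypothetical helper: translate every text with its own translate() call.
async function translateOneByOne(translator, texts, from, to) {
  // Promise.all keeps the results in the same order as the input texts.
  return Promise.all(texts.map((text) => translator.translate(text, from, to)));
}

// Stub standing in for a real translator so the sketch runs offline.
const stubTranslator = {
  translate: async (text, from, to) => `${text} (${from}->${to})`,
};

translateOneByOne(stubTranslator, ['こんにちは', 'ありがとう'], 'ja', 'en')
  .then((translations) => console.log(translations));
```

With a real translator this presumably means one request per text, so it trades throughput for staying off the batch API.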

@BtencateSphereon
Author

Hmmm it is translating for me on https://translate.google.com/

image

@vitonsky
Contributor

vitonsky commented Dec 6, 2021

Hmmm, you know, I tried translating "こんにちは" and it translates successfully for me.
Earlier I had tried to translate "縺 薙 s 縺 縺." instead.

What environment are you using? Is it node? Maybe Google returns a wrong result for non-browser environments?
I will check it in the next 1-2 days.

@BtencateSphereon
Author

I am using node. The translator itself translates it properly, as described in the first post.

@vitonsky
Contributor

vitonsky commented Dec 7, 2021

translate and translateBatch use different Google APIs. I will research it, but it looks like Google does not want to translate this for non-browser environments.

@BtencateSphereon
Author

OK, that info really helped; I did not know two different APIs were used.

I did some digging and found the following (not sure how helpful it really is):

ssut/py-googletrans#268 (a discussion about some of the Google APIs)

Python example:

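import requests

word = 'こんにちは'
url = 'https://translate.googleapis.com/translate_a/t'
# Let requests URL-encode the query string, including the Japanese text
params = {'client': 'dict-chrome-ex', 'sl': 'ja', 'tl': 'en', 'q': word}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'
}

try:
    response = requests.get(url, params=params, headers=headers, timeout=10)
    response.raise_for_status()
    request_result = response.json()

    print('[In English]: ' + request_result['alternative_translations'][0]['alternative'][0]['word_postproc'])
    print('[Language Detected]: ' + request_result['src'])
except (requests.RequestException, KeyError, IndexError, ValueError) as error:
    # Report the failure instead of silently ignoring it
    print('Translation request failed:', error)

Result:
[In English]: Hello
[Language Detected]: ja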

@vitonsky
Contributor

vitonsky commented Dec 7, 2021

@BtencateSphereon it's useful, thank you.

I tried this code on https://translate.google.com/

var word = 'こんにちは';
var url = `https://translate.googleapis.com/translate_a/t?client=dict-chrome-ex&sl=ja&tl=en&q=${word}&q=${word}&q=${word}`;
await fetch(url).then(r=>r.json())

And I got a result with translations. As far as I can see, this API request supports batching, so we should probably use it.
I will work on it soon (in about 2 hours) and will either switch the batch method to this API or create a second implementation of the Google translator, provided it works fine and supports large texts.
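For illustration, the batched request URL above could also be built with URLSearchParams, which takes care of URL-encoding the Japanese text. This is only a sketch: the endpoint and the client=dict-chrome-ex value are copied from the snippet above, buildBatchUrl is a made-up helper name, and nothing here is confirmed as what the library ended up doing.

```javascript
// Hypothetical helper: build one request URL carrying several texts.
function buildBatchUrl(texts, from, to) {
  const params = new URLSearchParams({ client: 'dict-chrome-ex', sl: from, tl: to });

  // One repeated "q" parameter per text, as in the fetch example above.
  for (const text of texts) {
    params.append('q', text);
  }

  return `https://translate.googleapis.com/translate_a/t?${params.toString()}`;
}

const batchUrl = buildBatchUrl(['こんにちは', 'こんにちは'], 'ja', 'en');
console.log(batchUrl);
```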

@vitonsky
Contributor

vitonsky commented Dec 7, 2021

@BtencateSphereon I tried to use this API and it does not work for me at this time.

You can check it by running npm install @translate-tools/core@0.2.1-0 and then running this code:

const { GoogleTranslatorFree } = require('@translate-tools/core/translators/GoogleTranslator');

const translator = new GoogleTranslatorFree();

translator.translateBatch(['こんにちは'], 'ja', 'en').then((translate) => console.log(translate))

Maybe it will work with the ESM version? Try running it as an ES module:

import { GoogleTranslatorFree } from '@translate-tools/core/esm/translators/GoogleTranslator';

const translator = new GoogleTranslatorFree();

translator.translateBatch(['こんにちは'], 'ja', 'en').then((translate) => console.log(translate))

When I run the tests it works fine, but it does not work from the package.

@vitonsky
Contributor

vitonsky commented Dec 8, 2021

OK, I found a solution. We should set the User-Agent header when running in node:

const { GoogleTranslator } = require('@translate-tools/core/translators/GoogleTranslator');

const translator = new GoogleTranslator({
	headers: {
		'User-Agent':
			'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
	},
});

translator.translateBatch(['こんにちは', 'こんにちは'], 'ja', 'en').then(console.log)
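If the same code has to run both in the browser and in node, the header could be added only outside the browser. A sketch under assumptions: buildTranslatorOptions is a hypothetical helper, checking for window is just one common way to detect a browser, and the only thing taken from the fix above is the { headers } option shape.

```javascript
// Same browser-like User-Agent string as in the fix above.
const BROWSER_USER_AGENT =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36';

// Hypothetical helper: return constructor options with the header only outside a browser.
function buildTranslatorOptions(isBrowser = typeof window !== 'undefined') {
  return isBrowser ? {} : { headers: { 'User-Agent': BROWSER_USER_AGENT } };
}

console.log(buildTranslatorOptions());

// Assumed usage, mirroring the constructor call shown above:
// const translator = new GoogleTranslator(buildTranslatorOptions());
```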

@vitonsky
Contributor

vitonsky commented Dec 8, 2021

I will update the readme.

@BtencateSphereon
Author

Very nice, thanks for the quick fix. I tested 0.2.4 and it works \o/.

Will be closing this
