Different output in some languages when there is semicolon in header #87

Open
marusak opened this issue Jan 17, 2020 · 1 comment

marusak commented Jan 17, 2020

There seems to be an inconsistency in how a semicolon at the end of the Plural-Forms line in .po headers is handled. Its presence can change the structure of the output that po2json produces. Let me explain with examples:

Take this file (tmp.po):

  msgid ""                                                                        
  msgstr ""                                                                       
  "Project-Id-Version: PACKAGE VERSION\n"                                         
  "Language: ko\n"                                                                
  "MIME-Version: 1.0\n"                                                           
  "Content-Type: text/plain; charset=UTF-8\n"                                     
  "Content-Transfer-Encoding: 8bit\n"                                             
  "Plural-Forms: nplurals=1; plural=0\n"                                          
  "X-Generator: Weblate 3.10.1\n"                                                 
                                                                                  
  msgid "Combined usage of $0 CPU core"                                           
  msgid_plural "Combined usage of $0 CPU cores"                                   
  msgstr[0] "$0 CPU 코어의 총 사용량"

When I run ./node_modules/po2json/bin/po2json -p tmp.po tmp, the resulting tmp looks like this:

{                                                                               
   "": {                                                                        
      "project-id-version": "PACKAGE VERSION",                                  
      "language": "ko",                                                         
      "mime-version": "1.0",                                                    
      "content-type": "text/plain; charset=UTF-8",                              
      "content-transfer-encoding": "8bit",                                      
      "plural-forms": "nplurals=1; plural=0",                                   
      "x-generator": "Weblate 3.10.1"                                           
   },                                                                           
   "Combined usage of $0 CPU core": [                                           
      "Combined usage of $0 CPU cores",                                         
      "$0 CPU 코어의 총 사용량"                                                 
   ]                                                                            
}  

This is correct. Now let's add a semicolon to the end of the "Plural-Forms: nplurals=1; plural=0\n" line.
The tmp.po file then looks like this:

  msgid ""                                                                        
  msgstr ""                                                                       
  "Project-Id-Version: PACKAGE VERSION\n"                                         
  "Language: ko\n"                                                                
  "MIME-Version: 1.0\n"                                                           
  "Content-Type: text/plain; charset=UTF-8\n"                                     
  "Content-Transfer-Encoding: 8bit\n"                                             
  "Plural-Forms: nplurals=1; plural=0;\n"                                         
  "X-Generator: Weblate 3.10.1\n"                                                 
                                                                                                                                     
  msgid "Combined usage of $0 CPU core"                                           
  msgid_plural "Combined usage of $0 CPU cores"                                   
  msgstr[0] "$0 CPU 코어의 총 사용량" 

When I run the same command, the tmp output is:

{                                                                               
   "": {                                                                        
      "project-id-version": "PACKAGE VERSION",                                  
      "language": "ko",                                                         
      "mime-version": "1.0",                                                    
      "content-type": "text/plain; charset=UTF-8",                              
      "content-transfer-encoding": "8bit",                                      
      "plural-forms": "nplurals=1; plural=0;",                                  
      "x-generator": "Weblate 3.10.1"                                           
   },                                                                           
   "Combined usage of $0 CPU core": [                                           
      "Combined usage of $0 CPU cores",                                         
      [                                                                         
         "$0 CPU 코어의 총 사용량"                                              
      ]                                                                         
   ]                                                                            
} 

So the translation for the string is not an array of strings, but an array containing one string and one nested array.
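
As a stopgap on the consumer side, the nested array can be flattened after conversion. This is only a minimal sketch that assumes the JSON shape shown above; `flattenTranslations` is a hypothetical helper, not part of po2json, and it works around the symptom rather than fixing the cause:

```javascript
// Hypothetical helper (not part of po2json): flattens one level of nesting
// in every translation entry of an already-parsed catalog object.
function flattenTranslations(catalog) {
  const result = {};
  for (const [key, value] of Object.entries(catalog)) {
    // The "" key holds the header metadata as a plain object; only
    // array-valued entries (the translations) are flattened.
    result[key] = Array.isArray(value) ? value.flat() : value;
  }
  return result;
}
```

Applied to the second output above, this would turn the entry into the same flat array of strings as in the first output, while leaving the header object untouched.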

Interestingly enough, if I have a different file, like this:

  msgid ""                                                                        
  msgstr ""                                                                       
  "Project-Id-Version: PACKAGE VERSION\n"                                         
  "Language: cs\n"                                                                
  "MIME-Version: 1.0\n"                                                           
  "Content-Type: text/plain; charset=UTF-8\n"                                     
  "Content-Transfer-Encoding: 8bit\n"                                             
  "Plural-Forms: nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2\n"        
  "X-Generator: Weblate 3.10.1\n"                                                 
                                                                                  
  msgid "Combined usage of $0 CPU core"                                           
  msgid_plural "Combined usage of $0 CPU cores"                                   
  msgstr[0] "Kombinované využití $0 jádra procesoru"                              
  msgstr[1] "Kombinované využití $0 jader procesoru"                              
  msgstr[2] "Kombinované využití $0 jader procesoru"

The output is the same regardless of whether there is a semicolon on the Plural-Forms line.
According to the docs, it seems there should always be a semicolon (1, 2). This is likely a problem in some library that po2json uses, but I was not sure where it really comes from, so I am reporting it here.
(Side note: for years we had a mix of files, some with this semicolon and some without, and it seemed to work just fine. We were using Zanata to generate these files for us; now we have migrated to Weblate, and it adds this semicolon for some more languages (still not all). So maybe this is a known bug or documented somewhere.)
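
To find out which files in a mixed set carry the trailing semicolon, a small check along these lines might help. It is a rough sketch: `pluralFormsEndsWithSemicolon` is a hypothetical helper, it only inspects the raw .po text with a regular expression, and it assumes the Plural-Forms header sits on a single quoted line, as in the examples above:

```javascript
// Hypothetical helper: reports whether the Plural-Forms header in the text
// of a .po file ends with a semicolon. Returns null if no header is found.
// Assumption: the header is on one quoted line, not wrapped across several.
function pluralFormsEndsWithSemicolon(poText) {
  // A header line looks like: "Plural-Forms: nplurals=1; plural=0;\n"
  // where \n is the literal two-character escape inside the quoted string.
  const match = poText.match(/^\s*"Plural-Forms:\s*(.*?)(?:\\n)?"\s*$/im);
  if (!match) return null;
  return match[1].trim().endsWith(';');
}
```

Reading each file with `fs.readFileSync(path, 'utf8')` and passing the text to this function would give `true`, `false`, or `null` per file.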

marusak added a commit to cockpit-project/cockpit-weblate that referenced this issue Jan 17, 2020
marusak referenced this issue in cockpit-project/cockpit-weblate Feb 11, 2020
marusak added a commit to marusak/cockpit-machines that referenced this issue Mar 24, 2021
See mikeedwards/po2json#87
Done with bots that I edited to include every changed file
marusak added a commit to cockpit-project/cockpit-machines that referenced this issue Mar 24, 2021
See mikeedwards/po2json#87
Done with bots that I edited to include every changed file
hthetiot commented

This is most likely related to https://github.com/smhg/gettext-parser.
