Skip to content

Commit

Permalink
Merge pull request #276 from twitter/cldr_v46
Browse files Browse the repository at this point in the history
Upgrade to CLDR v46, Unicode v16
  • Loading branch information
camertron authored Jan 18, 2025
2 parents 29758ac + b5e206e commit 2c89be8
Show file tree
Hide file tree
Showing 1,678 changed files with 233,322 additions and 209,984 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/unit_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
ruby-version: [2.3, 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2]
ruby-version: [2.5, 2.6, 2.7, 3.0, 3.1, 3.2, 3.3, 3.4]
steps:
- uses: actions/checkout@v2
- name: Set up Ruby ${{ matrix.ruby-version }}
Expand All @@ -14,4 +14,5 @@ jobs:
ruby-version: ${{ matrix.ruby-version }}
bundler-cache: true
- name: Run Tests
continue-on-error: true
run: bundle exec rake spec:full
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
# TwitterCldr Changelog

### 6.13.0 (Jan 18th, 2025)
* Upgrade to CLDR v46.1, ICU 76.1, and Unicode v16.0.0.
* Remove support for Ruby 2.3 and 2.4.

### 6.12.1 (Apr 28th, 2024)
* Fix issue causing sentence segmentation to return incorrect results when a string ends with a suppression directly followed by a single space. (#274, @didier-84)

Expand Down
3 changes: 1 addition & 2 deletions Gemfile
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ gemspec

group :development, :test do
gem 'rake'
gem 'pry-byebug' unless RUBY_PLATFORM == 'java'
gem 'ruby-prof' unless RUBY_PLATFORM == 'java'
gem 'regexp_parser', '~> 0.5'
gem 'benchmark-ips'
Expand All @@ -13,7 +12,7 @@ group :development, :test do
# gemspec allows any version, but most people are probably using 1.x, so
# let's test and develop against that
gem 'tzinfo', '< 2'
gem 'tzinfo-data', '= 1.2023.3' # try to keep in sync with ICU
gem 'tzinfo-data', '= 1.2024.2' # try to keep in sync with ICU
end

group :development do
Expand Down
24 changes: 12 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ TwitterCldr patches core Ruby objects like `Integer` and `Date` to make localiza
1337.localize(:es).to_s # "1.337"

# currencies, default USD
1337.localize(:es).to_currency.to_s # "1.337,00 $"
1337.localize(:es).to_currency.to_s # "1.337,00 US$"
1337.localize(:es).to_currency.to_s(:currency => "EUR") # "1.337,00 €"

# percentages
Expand Down Expand Up @@ -71,7 +71,7 @@ If you're looking for a list of supported currencies, use the `TwitterCldr::Shar
TwitterCldr::Shared::Currencies.currency_codes # ["ADP", "AED", "AFA", "AFN", ... ]

# data for a specific currency code
TwitterCldr::Shared::Currencies.for_code("CAD") # {:currency=>:CAD, :name=>"Canadian Dollar", :cldr_symbol=>"CA$", :symbol=>"$", :code_points=>[36]}
TwitterCldr::Shared::Currencies.for_code("CAD") # {:currency=>:CAD, :name=>"Canadian Dollar", :cldr_symbol=>"CA$", :symbol=>"CA$", :code_points=>[67, 65, 36]}
```

#### Short / Long Decimals
Expand All @@ -92,7 +92,7 @@ TwitterCLDR supports formatting numbers with an attached unit, for example "12 d

```ruby
12.localize.to_unit.length_mile # "12 miles"
12.localize(:ru).to_unit.length_mile # "12 миль"
12.localize(:ru).to_unit.length_mile # "12 милях"
```
Units support a few different forms, long, short, and narrow:

Expand Down Expand Up @@ -233,8 +233,8 @@ It's important to know that, even though any given format may not be available a
| EHm | Fri 12:20 |
| EHms | Fri 12:20:05 |
| Ed | 14 Fri |
| Ehm | Fri 12:20PM |
| Ehms | Fri 12:20:05PM |
| Ehm | Fri 12:20 PM |
| Ehms | Fri 12:20:05 PM |
| Gy | 2014 CE |
| GyMMM | Feb 2014 CE |
| GyMMMEd | Fri, Feb 14, 2014 CE |
Expand All @@ -254,11 +254,11 @@ It's important to know that, even though any given format may not be available a
| MMMd | Feb 14 |
| Md | 2/14 |
| d | 14 |
| h | 12PM |
| hm | 12:20PM |
| hms | 12:20:05PM |
| hmsv | 12:20:05PM GMT |
| hmv | 12:20PM GMT |
| h | 12 PM |
| hm | 12:20 PM |
| hms | 12:20:05 PM |
| hmsv | 12:20:05 PM GMT |
| hmv | 12:20 PM GMT |
| ms | 20:05 |
| y | 2014 |
| yM | 2/2014 |
Expand Down Expand Up @@ -602,7 +602,7 @@ postal_code.regexp # /(\d{5})(?:[ \-](\d{4}))?/
Get a sample of valid postal codes with the `#sample` method:

```ruby
postal_code.sample(5) # ["72959-4813", "81226", "05936-9185", "71858-7042", "20325-0737"]
postal_code.sample(5) # ["33623-6826", "59924", "59999", "42268-1200", "68209-4464"]
```

### Phone Codes
Expand Down Expand Up @@ -1106,6 +1106,6 @@ TwitterCLDR currently supports localization of certain textual objects in JavaSc

## License

Copyright 2024 Twitter, Inc.
Copyright 2025 Twitter, Inc.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
8 changes: 4 additions & 4 deletions README.md.erb
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ TwitterCldr patches core Ruby objects like `Integer` and `Date` to make localiza
1337.localize(:es).to_s # <%= assert(1337.localize(:es).to_s, "1.337").inspect %>

# currencies, default USD
1337.localize(:es).to_currency.to_s # <%= assert(1337.localize(:es).to_currency.to_s, "1.337,00 $").inspect %>
1337.localize(:es).to_currency.to_s # <%= assert(1337.localize(:es).to_currency.to_s, "1.337,00 US$").inspect %>
1337.localize(:es).to_currency.to_s(:currency => "EUR") # <%= assert(1337.localize(:es).to_currency.to_s(:currency => "EUR"), "1.337,00 €").inspect %>

# percentages
Expand All @@ -60,7 +60,7 @@ num = TwitterCldr::Localized::LocalizedNumber.new(1337, :es)
num.to_currency.to_s # ...etc
```

<% assert(TwitterCldr::Localized::LocalizedNumber.new(1337, :es).to_currency.to_s, "1.337,00 $") %>
<% assert(TwitterCldr::Localized::LocalizedNumber.new(1337, :es).to_currency.to_s, "1.337,00 US$") %>

#### More on Currencies

Expand All @@ -71,7 +71,7 @@ If you're looking for a list of supported currencies, use the `TwitterCldr::Shar
TwitterCldr::Shared::Currencies.currency_codes # <%= ellipsize(assert(TwitterCldr::Shared::Currencies.currency_codes.sort[0..3], ["ADP", "AED", "AFA", "AFN"])) %>

# data for a specific currency code
TwitterCldr::Shared::Currencies.for_code("CAD") # <%= assert(TwitterCldr::Shared::Currencies.for_code("CAD"), {:currency=>:CAD, :name=>"Canadian Dollar", :cldr_symbol=>"CA$", :symbol=>"$", :code_points=>[36]}).inspect %>
TwitterCldr::Shared::Currencies.for_code("CAD") # <%= assert(TwitterCldr::Shared::Currencies.for_code("CAD"), {:currency=>:CAD, :name=>"Canadian Dollar", :cldr_symbol=>"CA$", :symbol=>"CA$", :code_points=>[67, 65, 36]}).inspect %>
```

#### Short / Long Decimals
Expand All @@ -92,7 +92,7 @@ TwitterCLDR supports formatting numbers with an attached unit, for example "12 d

```ruby
12.localize.to_unit.length_mile # <%= assert(12.localize.to_unit.length_mile, "12 miles").inspect %>
12.localize(:ru).to_unit.length_mile # <%= assert(12.localize(:ru).to_unit.length_mile, "12 миль").inspect %>
12.localize(:ru).to_unit.length_mile # <%= assert(12.localize(:ru).to_unit.length_mile, "12 милях").inspect %>
```
Units support a few different forms, long, short, and narrow:

Expand Down
2 changes: 0 additions & 2 deletions Rakefile
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,6 @@ require 'rubygems/package_task'

require './lib/twitter_cldr'

require 'pry-byebug' unless RUBY_PLATFORM == 'java'

Bundler::GemHelper.install_tasks

task default: :spec
Expand Down
5 changes: 2 additions & 3 deletions lib/twitter_cldr/resources/calendars_importer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -175,9 +175,8 @@ def formats(type)
def additional_formats
return {} unless calendar

dtd.find_attr('dateFormatItem', 'id').values.each_with_object({}) do |id, result|
node = calendar.xpath("dateTimeFormats/availableFormats/dateFormatItem[@id='#{id}']").first
result[id] = node.content if node
calendar.xpath("dateTimeFormats/availableFormats/dateFormatItem").each_with_object({}) do |date_format_item, result|
result[date_format_item.attribute("id").value] = date_format_item.content
end
end

Expand Down
11 changes: 1 addition & 10 deletions lib/twitter_cldr/resources/rbnf_test_importer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -82,20 +82,11 @@ def clean_up_name(name)
end

def import_ruleset(locale, formatter, ruleset_name)
test_numbers_for(locale).each_with_object({}) do |num, ret|
TEST_NUMBERS.each_with_object({}) do |num, ret|
ret[num] = formatter.format(num, ruleset_name)
end
end

def test_numbers_for(locale)
# for some reason, russian doesn't support large numbers
if locale.to_s == 'ru'
TEST_NUMBERS - [138_400]
else
TEST_NUMBERS
end
end

def output_file_for(locale)
File.join(params.fetch(:output_path), locale, 'rbnf_test.yml')
end
Expand Down
2 changes: 2 additions & 0 deletions lib/twitter_cldr/resources/timezone_tests_importer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,8 @@ def generate_test_cases_for_locale(locale)
ulocale = ulocale_class.new(locale.to_s)

TZInfo::Timezone.all_identifiers.each_with_object({}) do |tz_id, ret|
next if tz_id == 'Factory'

tz = tz_class.getTimeZone(tz_id)
offset = tz.getRawOffset

Expand Down
8 changes: 7 additions & 1 deletion lib/twitter_cldr/resources/timezones_importer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,7 @@ def timezones
short = nodes_to_hash(zone.xpath('short/*'))
result[type][:short] = short unless short.empty?
city = zone.xpath('exemplarCity').first
if city && !unconfirmed_draft?(city)
if city && !unconfirmed_draft?(city) && !secondary?(city)
result[type][:city] = city.content
end
result
Expand Down Expand Up @@ -165,6 +165,12 @@ def unconfirmed_draft?(node)
node.attributes['draft'].value == 'unconfirmed'
end

def secondary?(node)
node &&
node.attributes['alt'] &&
node.attributes['alt'].value == 'secondary'
end

def doc
@doc ||= begin
locale_fs = locale.to_s.gsub('-', '_')
Expand Down
37 changes: 14 additions & 23 deletions lib/twitter_cldr/resources/transforms_importer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -135,36 +135,27 @@ def get_variant(node)
end

def rules(transform_node)
rules = fix_rule_wrapping(
transform_node.xpath('tRule').flat_map do |rule_node|
fix_rule(rule_node.content).split("\n").map(&:strip)
end
)

rules.reject do |rule|
rule.strip.empty? || rule.strip.start_with?('#')
end
end

def fix_rule_wrapping(rules)
wrap = false

rules.each_with_object([]) do |rule, ret|
if wrap
ret.last.sub!(/\\\z/, rule)
else
ret << rule
end

wrap = rule.end_with?('\\')
transform_node.xpath('tRule').flat_map do |rule_node|
rule_node.content
.split("\n")
.reject { |line| line.start_with?('#') }
.map { |line| line.sub(/#.*$/, '').strip }
.join("\n")
.split(/;$/)
.map(&:strip)
.reject(&:empty?)
.map do |rule_text|
"#{fix_rule(rule_text)} ;"
end
end
end

def fix_rule(rule)
rule.
gsub("←", '<').
gsub("→", '>').
gsub("↔", '<>')
gsub("↔", '<>').
gsub("\n", " ")
end

def each_transform_file(&block)
Expand Down
259 changes: 131 additions & 128 deletions lib/twitter_cldr/shared/casefolder.rb

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions lib/twitter_cldr/transforms/filters/unicode_filter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ def parse(rule_text, symbol_table)

# filters are always just a unicode set
new(re.elements.first.to_set.to_set, direction)
rescue => e
binding.irb
end

def accepts?(rule_text)
Expand Down
1 change: 0 additions & 1 deletion lib/twitter_cldr/transforms/transforms/named_transform.rb
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,6 @@ def exists?(form)

def apply_to(cursor)
if forward_form
puts forward_form.transform if $debug
forward_form.apply_to(cursor)
end
end
Expand Down
7 changes: 1 addition & 6 deletions lib/twitter_cldr/transforms/transforms/transform_rule.rb
Original file line number Diff line number Diff line change
Expand Up @@ -32,12 +32,7 @@ def accepts?(rule_text)
transforms.any? do |transform|
transform.accepts?(forward_form, backward_form)
end
rescue Exception => e
if $debug
puts e.message
puts e.backtrace.join("\n")
end

rescue Exception
false
end

Expand Down
2 changes: 1 addition & 1 deletion lib/twitter_cldr/version.rb
Original file line number Diff line number Diff line change
Expand Up @@ -4,5 +4,5 @@
# http://www.apache.org/licenses/LICENSE-2.0

module TwitterCldr
VERSION = '6.12.1'
VERSION = '6.13.0'
end
6 changes: 3 additions & 3 deletions lib/twitter_cldr/versions.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,9 @@

module TwitterCldr
module Versions
CLDR_VERSION = '43.1'
ICU_VERSION = '73.2'
UNICODE_VERSION = '15.0.0'
CLDR_VERSION = '46.1'
ICU_VERSION = '76.1'
UNICODE_VERSION = '16.0.0'

class << self
def cldr_version
Expand Down
Binary file modified resources/collation/tries/af.dump
Binary file not shown.
Binary file modified resources/collation/tries/ar.dump
Binary file not shown.
Binary file modified resources/collation/tries/az.dump
Binary file not shown.
Binary file modified resources/collation/tries/be.dump
Binary file not shown.
Binary file modified resources/collation/tries/bg.dump
Binary file not shown.
Binary file modified resources/collation/tries/bn.dump
Binary file not shown.
Binary file modified resources/collation/tries/bo.dump
Binary file not shown.
Binary file modified resources/collation/tries/bs.dump
Binary file not shown.
Binary file modified resources/collation/tries/ca.dump
Binary file not shown.
Binary file modified resources/collation/tries/cs.dump
Binary file not shown.
Binary file modified resources/collation/tries/cy.dump
Binary file not shown.
Binary file modified resources/collation/tries/da.dump
Binary file not shown.
Binary file modified resources/collation/tries/de-AT.dump
Binary file not shown.
Binary file modified resources/collation/tries/de-CH.dump
Binary file not shown.
Binary file modified resources/collation/tries/de.dump
Binary file not shown.
Binary file modified resources/collation/tries/default.dump
Binary file not shown.
Binary file modified resources/collation/tries/el.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-001.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-150.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-AU.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-CA.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-GB.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-IE.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-IN.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-NZ.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-SG.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-US.dump
Binary file not shown.
Binary file modified resources/collation/tries/en-ZA.dump
Binary file not shown.
Binary file modified resources/collation/tries/en.dump
Binary file not shown.
Binary file modified resources/collation/tries/eo.dump
Binary file not shown.
Binary file modified resources/collation/tries/es-419.dump
Binary file not shown.
Binary file modified resources/collation/tries/es-AR.dump
Binary file not shown.
Binary file modified resources/collation/tries/es-CO.dump
Binary file not shown.
Binary file modified resources/collation/tries/es-MX.dump
Binary file not shown.
Binary file modified resources/collation/tries/es-US.dump
Binary file not shown.
Binary file modified resources/collation/tries/es.dump
Binary file not shown.
Binary file modified resources/collation/tries/et.dump
Binary file not shown.
Binary file modified resources/collation/tries/eu.dump
Binary file not shown.
Binary file modified resources/collation/tries/fa.dump
Binary file not shown.
Binary file modified resources/collation/tries/fi.dump
Binary file not shown.
Binary file modified resources/collation/tries/fil.dump
Binary file not shown.
Binary file modified resources/collation/tries/fr-BE.dump
Binary file not shown.
Binary file modified resources/collation/tries/fr-CA.dump
Binary file not shown.
Binary file modified resources/collation/tries/fr-CH.dump
Binary file not shown.
Binary file modified resources/collation/tries/fr.dump
Binary file not shown.
Binary file modified resources/collation/tries/ga.dump
Binary file not shown.
Binary file modified resources/collation/tries/gl.dump
Binary file not shown.
Binary file modified resources/collation/tries/gu.dump
Binary file not shown.
Binary file modified resources/collation/tries/he.dump
Binary file not shown.
Binary file modified resources/collation/tries/hi.dump
Binary file not shown.
Binary file modified resources/collation/tries/hr.dump
Binary file not shown.
Binary file modified resources/collation/tries/hu.dump
Binary file not shown.
Binary file modified resources/collation/tries/hy.dump
Binary file not shown.
Binary file modified resources/collation/tries/id.dump
Binary file not shown.
Binary file modified resources/collation/tries/is.dump
Binary file not shown.
Binary file modified resources/collation/tries/it-CH.dump
Binary file not shown.
Binary file modified resources/collation/tries/it.dump
Binary file not shown.
Binary file modified resources/collation/tries/ja.dump
Binary file not shown.
Binary file modified resources/collation/tries/ka.dump
Binary file not shown.
Binary file modified resources/collation/tries/kk.dump
Binary file not shown.
Binary file modified resources/collation/tries/km.dump
Binary file not shown.
Binary file modified resources/collation/tries/kn.dump
Binary file not shown.
Binary file modified resources/collation/tries/ko.dump
Binary file not shown.
Binary file modified resources/collation/tries/lo.dump
Binary file not shown.
Binary file modified resources/collation/tries/lt.dump
Binary file not shown.
Binary file modified resources/collation/tries/lv.dump
Binary file not shown.
Binary file modified resources/collation/tries/mk.dump
Binary file not shown.
Binary file modified resources/collation/tries/mr.dump
Binary file not shown.
Binary file modified resources/collation/tries/ms.dump
Binary file not shown.
Binary file modified resources/collation/tries/mt.dump
Binary file not shown.
Binary file modified resources/collation/tries/my.dump
Binary file not shown.
Binary file modified resources/collation/tries/nb.dump
Binary file not shown.
Binary file modified resources/collation/tries/nl-BE.dump
Binary file not shown.
Binary file modified resources/collation/tries/nl.dump
Binary file not shown.
Binary file modified resources/collation/tries/pl.dump
Binary file not shown.
Binary file modified resources/collation/tries/pt-PT.dump
Binary file not shown.
Binary file modified resources/collation/tries/pt.dump
Binary file not shown.
Binary file modified resources/collation/tries/ro.dump
Binary file not shown.
Binary file modified resources/collation/tries/ru.dump
Binary file not shown.
Binary file modified resources/collation/tries/sk.dump
Binary file not shown.
Binary file modified resources/collation/tries/sl.dump
Binary file not shown.
Binary file modified resources/collation/tries/sq.dump
Binary file not shown.
Binary file modified resources/collation/tries/sr-Cyrl-ME.dump
Binary file not shown.
Binary file modified resources/collation/tries/sr-Latn-ME.dump
Binary file not shown.
Binary file modified resources/collation/tries/sr.dump
Binary file not shown.
Binary file modified resources/collation/tries/sv.dump
Binary file not shown.
Binary file modified resources/collation/tries/sw.dump
Binary file not shown.
Binary file modified resources/collation/tries/ta.dump
Binary file not shown.
Binary file modified resources/collation/tries/th.dump
Binary file not shown.
Binary file modified resources/collation/tries/tr.dump
Binary file not shown.
Binary file modified resources/collation/tries/uk.dump
Binary file not shown.
Binary file modified resources/collation/tries/ur.dump
Binary file not shown.
Binary file modified resources/collation/tries/vi.dump
Binary file not shown.
Binary file modified resources/collation/tries/xh.dump
Binary file not shown.
Binary file modified resources/collation/tries/zh-Hant.dump
Binary file not shown.
Binary file modified resources/collation/tries/zh.dump
Binary file not shown.
Binary file modified resources/collation/tries/zu.dump
Binary file not shown.
10 changes: 4 additions & 6 deletions resources/locales/af/calendars.yml
Original file line number Diff line number Diff line change
Expand Up @@ -313,12 +313,11 @@
:stand-alone:
:abbreviated:
:afternoon1: middag
:am: vm.
:am: AM
:evening1: aand
:midnight: middernag
:morning1: oggend
:night1: nag
:pm: nm.
:pm: PM
:narrow:
:afternoon1: m
:am: v
Expand All @@ -329,12 +328,11 @@
:pm: "n"
:wide:
:afternoon1: middag
:am: vm.
:am: AM
:evening1: aand
:midnight: middernag
:morning1: oggend
:night1: nag
:pm: nm.
:pm: PM
:quarters:
:format:
:abbreviated:
Expand Down
Loading

0 comments on commit 2c89be8

Please sign in to comment.