-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for bill of the House of Representatives of Japan #184
Add support for bill of the House of Representatives of Japan #184
Conversation
The wareki gem will be convenient but it's overkill for us. Because it changes
Can we try this? Do we need to support only 昭和, 平成 and 令和? |
We can defer this to a follow-up task. |
Thanks for your review!
I agree. I would like to implement independently method only support 昭和, 平成, 令和. |
The fields capable of accepting dates only contained Heisei and Reiwa. This fact was confirmed with the following script on 24e251d. ```ruby require "datasets" house_of_representative = Datasets::HouseOfRepresentative.new japanese_date_strings = house_of_representative.collect { |record| [ record.house_of_representatives_of_accepted_bill_on_preliminary_consideration, record.house_of_representatives_of_preliminary_refer_on, record.house_of_representatives_of_accepted_bill_on, record.house_of_representatives_of_refer_on, record.house_of_representatives_of_finished_consideration_on, record.house_of_representatives_of_finished_deliberation_on, record.house_of_councillors_of_accepted_bill_on_preliminary_consideration, record.house_of_councillors_of_preliminary_refer_on, record.house_of_councillors_of_accepted_bill_on, record.house_of_councillors_of_refer_on, record.house_of_councillors_of_finished_consideration_on, record.house_of_councillors_of_finished_deliberation_on, record.promulgated_on, ].reject { |s| s.nil? || s == "" } }.flatten japanese_date_strings.collect { |s| s[0..1] } .uniq # => ["平成", "令和"] ``` The private method `parse_japanese_date` was asserted correctly with the following script. ```ruby require "date" require "test-unit" JAPANESE_DATE_MATCHER = /\A(平成|令和)\s*(\d{1,2}|元)年\s*\d{1,2}月\s*\d{1,2}日\z/.freeze def parse_japanese_date(field) return field unless field.to_s.match?(JAPANESE_DATE_MATCHER) Date.parse(normalize_jisx0301_format(field)) end def normalize_jisx0301_format(japanese_date_string) japanese_date_string.gsub(/平成/, "H") .gsub(/令和/, "R") .gsub(/元/, "01") .gsub(/\s/, "0") .gsub(/[年月]/, ".") .gsub(/[日]/, "") end class ParseJapaneseDateTest < Test::Unit::TestCase data("month and day with leading a space in Heisei", ["H10.01.01", "平成10年 1月 1日"]) data("month with leading a space in Heisei", ["H10.01.10", "平成10年 1月10日"]) data(" day with leading a space in Heisei", ["H10.10.01", "平成10年10月 1日"]) data(" without leading a space in Heisei", ["H10.10.10", "平成10年10月10日"]) data("year, month and day with leading a space in Reiwa", ["R02.01.01", "令和 2年 1月 1日"]) data("year, month with leading a space in Reiwa", ["R02.01.10", "令和 2年 1月10日"]) data("year, day with leading a space in Reiwa", ["R02.10.01", "令和 2年10月 1日"]) data("year, without leading a space in Reiwa", ["R02.10.10", "令和 2年10月10日"]) data("boundary within Heisei", ["H31.04.30", "平成31年 4月30日"]) data("boundary within Reiwa", ["R01.05.01", "令和元年 5月 1日"]) def test_parse_japanse_date(data) expected_jisx0301, japanese_date_string = data assert_equal(expected_jisx0301, parse_japanese_date(japanese_date_string).jisx0301) end end Test::Unit::AutoRunner.run # Loaded suite irb # Started # Finished in 0.002313 seconds. # ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- # 10 tests, 10 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications # 100% passed # ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- # 4323.39 tests/s, 4323.39 assertions/s # => true ```
a37b13e
to
1e15280
Compare
I've rebased on master and update expected, so the CI has successfully. I implement independently method only support 平成, 令和 (not 昭和, 平成, 令和). Because the fields capable of accepting dates only contained Heisei and Reiwa. This fact was confirmed with the following script on 24e251d. require "datasets"
house_of_representative = Datasets::HouseOfRepresentative.new
japanese_date_strings = house_of_representative.collect { |record|
[
record.house_of_representatives_of_accepted_bill_on_preliminary_consideration,
record.house_of_representatives_of_preliminary_refer_on,
record.house_of_representatives_of_accepted_bill_on,
record.house_of_representatives_of_refer_on,
record.house_of_representatives_of_finished_consideration_on,
record.house_of_representatives_of_finished_deliberation_on,
record.house_of_councillors_of_accepted_bill_on_preliminary_consideration,
record.house_of_councillors_of_preliminary_refer_on,
record.house_of_councillors_of_accepted_bill_on,
record.house_of_councillors_of_refer_on,
record.house_of_councillors_of_finished_consideration_on,
record.house_of_councillors_of_finished_deliberation_on,
record.promulgated_on,
].reject { |s| s.nil? || s == "" }
}.flatten
japanese_date_strings.collect { |s| s[0..1] }
.uniq
# => ["平成", "令和"] The private method require "date"
require "test-unit"
JAPANESE_DATE_MATCHER = /\A(平成|令和)\s*(\d{1,2}|元)年\s*\d{1,2}月\s*\d{1,2}日\z/.freeze
def parse_japanese_date(field)
return field unless field.to_s.match?(JAPANESE_DATE_MATCHER)
Date.parse(normalize_jisx0301_format(field))
end
def normalize_jisx0301_format(japanese_date_string)
japanese_date_string.gsub(/平成/, "H")
.gsub(/令和/, "R")
.gsub(/元/, "01")
.gsub(/\s/, "0")
.gsub(/[年月]/, ".")
.gsub(/[日]/, "")
end
class ParseJapaneseDateTest < Test::Unit::TestCase
data("month and day with leading a space in Heisei", ["H10.01.01", "平成10年 1月 1日"])
data("month with leading a space in Heisei", ["H10.01.10", "平成10年 1月10日"])
data(" day with leading a space in Heisei", ["H10.10.01", "平成10年10月 1日"])
data(" without leading a space in Heisei", ["H10.10.10", "平成10年10月10日"])
data("year, month and day with leading a space in Reiwa", ["R02.01.01", "令和 2年 1月 1日"])
data("year, month with leading a space in Reiwa", ["R02.01.10", "令和 2年 1月10日"])
data("year, day with leading a space in Reiwa", ["R02.10.01", "令和 2年10月 1日"])
data("year, without leading a space in Reiwa", ["R02.10.10", "令和 2年10月10日"])
data("boundary within Heisei", ["H31.04.30", "平成31年 4月30日"])
data("boundary within Reiwa", ["R01.05.01", "令和元年 5月 1日"])
def test_parse_japanse_date(data)
expected_jisx0301, japanese_date_string = data
assert_equal(expected_jisx0301, parse_japanese_date(japanese_date_string).jisx0301)
end
end
Test::Unit::AutoRunner.run
# Loaded suite irb
# Started
# Finished in 0.002313 seconds.
# --------------------------------------------------------------------------------------------
# 10 tests, 10 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
# 100% passed
# --------------------------------------------------------------------------------------------
# 4323.39 tests/s, 4323.39 assertions/s
# => true Could you review it again? |
How about the Japanese date parse method as a |
I see. It is testable! I'll try to extract |
Thanks for your review and suggest some clear code. I dealt with all of them you pointed so far. |
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Following a test is fail because return value was changed on b5e68f4. - Bill of the House of Representatives of Japan
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Creating a new `JapaneseDateParser` object for each parsing may be slow. Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Thank you so much. I'm excited to encounter some clear code I had not even thought of. I dealt with all of them you pointed out so far. Could you review it again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can merge this with some minor improvements. :-)
`end_with?` is simpler than regular expression. Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Thanks for your review! I've improved all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Fixes #142.
Notes for reviewers
Wareki can't parse to
Date
object (e.g.,"平成10年 3月 4日"
). I think the following options.String
(Not parsed)