This repository has been archived by the owner on Dec 11, 2020. It is now read-only.

Correct Chinese names #628

Merged
merged 3 commits into from
Jul 12, 2015

Conversation

phoenixgao
Contributor

The first part of a Chinese name is the family name, shared by males and females.

Update the last names with the top 300 family names, whose populations range from 0.2 to 90 million.
Separate the first names by gender.
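
A minimal sketch of the structure described above, assuming a standalone script rather than the actual provider class. The constant and function names (`FAMILY_NAMES`, `GIVEN_NAMES`, `nameParts`, `fullName`) are hypothetical, the surnames are taken from the diff excerpt in this PR, and the given names are placeholders rather than the lists from the PR.

```php
<?php
// Illustrative data only. The surnames appear in this PR's diff excerpt;
// the given names are placeholders, not the PR's actual lists.
const FAMILY_NAMES = ['钱', '陆', '韩', '马', '高', '黄'];
const GIVEN_NAMES = [
    'male'   => ['伟', '强', '磊'],
    'female' => ['芳', '娜', '静'],
];

/**
 * Pick a family name (shared by all genders) and a gender-specific given name.
 */
function nameParts(string $gender): array
{
    if (!array_key_exists($gender, GIVEN_NAMES)) {
        throw new InvalidArgumentException("Unknown gender: $gender");
    }
    $given = GIVEN_NAMES[$gender];

    return [
        'family' => FAMILY_NAMES[array_rand(FAMILY_NAMES)],
        'given'  => $given[array_rand($given)],
    ];
}

/**
 * Join the parts in Chinese order: family name first, then given name.
 */
function fullName(string $gender): string
{
    $parts = nameParts($gender);

    return $parts['family'] . $parts['given'];
}
```

Only the given-name list depends on gender here; the family-name list is a single shared array, which is the point of the change.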

The first part of a Chinese name is the family name
'钱', '阎', '陆', '陶', '韦', '韩', '顾', '马', '高', '魏', '黄', '黎',
'龚',
protected static $lastName = array(
// These are the top 300 lastnames.
Owner

Please add a link to the source of this data in the comment, as in the other locales.

@fzaninotto
Owner

I have no way to tell that your lists are better than the current ones. Please provide a reference link for each list to prevent other developers like you from contesting this data in the future.

@phoenixgao
Contributor Author

The previous logic is wrong: the first part of a name is not related to gender at all.

My 300 family names come from this list:
http://www.360doc.com/content/13/0624/20/2577690_295253771.shtml
Many pages carry the same list, such as this news article:
http://www.fjfdls.com/show.asp?id=76

The data comes from the sixth national population census.

Here is another top 400 list, which is slightly different. I believe its data is from a different time. You can see the population of each family name:
http://baike.baidu.com/picture/33020/33020/16790980/4b90f603738da977bb123226b451f8198618e3fa?fr=lemma&ct=cover#aid=16790980&pic=4b90f603738da977bb123226b451f8198618e3fa

For the given names, I just split your original list by gender and added some more popular ones so that each list has 100 names.

I don't want to update the names arbitrarily, because there is no perfect list for fake data, but the previous list is wrong: it mixes female and male names together, and it splits family names by gender, which is not right either.

@fzaninotto
Owner

Please, add these links to the source code.

@phoenixgao
Contributor Author

I found a more credible link and added it in the comment. There are several versions of the surname list, some of them published in books, but sorry, I could not find the original data on government websites.

Still, any of them is good enough for faking names.

Please help me fix any English grammar errors in the comments, thanks!

@phoenixgao
Contributor Author

I just took a look at the zh_TW names. The logic has no problem: it gets the order of first name and last name right, and it also sorts first names by gender correctly. But the list of last names is about a thousand years old (seriously, from around 960 A.D.), and a lot of the names in that list don't exist anymore. I don't know if you will be happy with those ancient names?

'钱', '阎', '陆', '陶', '韦', '韩', '顾', '马', '高', '魏', '黄', '黎',
'龚',
protected static $lastName = array(
// According to http://baike.baidu.com/view/6109935.htm,
Owner

Please use the phpDoc format (starting with /*) and move it one line up, before the array initialization.

Contributor

NB, /* isn't phpdoc, it's a "comment". "phpdoc" is /**. They actually tokenise to T_COMMENT and T_DOC_COMMENT.
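
The distinction can be checked with PHP's tokenizer extension. This is a small sketch using `token_get_all()` and `token_name()`; the helper name `commentTokenName` is made up for illustration.

```php
<?php
/**
 * Return the tokenizer name of the first comment token in a PHP snippet,
 * or null if the snippet contains no comment.
 */
function commentTokenName(string $source): ?string
{
    foreach (token_get_all($source) as $token) {
        // Tokens are arrays of [id, text, line]; single characters are plain strings.
        if (is_array($token) && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true)) {
            return token_name($token[0]);
        }
    }

    return null;
}

echo commentTokenName("<?php /* plain */"), "\n"; // prints "T_COMMENT"
echo commentTokenName("<?php /** doc */"), "\n";  // prints "T_DOC_COMMENT"
```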

@fzaninotto
Owner

I agree, the zh_TW list isn't ideal either.

@fzaninotto fzaninotto closed this Jul 6, 2015
@fzaninotto fzaninotto reopened this Jul 6, 2015
@fzaninotto
Owner

Sorry, closed by mistake

fzaninotto added a commit that referenced this pull request Jul 12, 2015
@fzaninotto fzaninotto merged commit fb1a614 into fzaninotto:master Jul 12, 2015
@fzaninotto
Owner

Thanks for your PR!
