Yarn support #69

Open
wants to merge 24 commits into
base: master

@batizty batizty commented Jun 7, 2017

Hi @rjagerman,

Could you please help review this pull request?

All of the code has been tested online in my cluster environment.

Any questions are welcome, and I appreciate your previous work.

Thanks

@batizty batizty closed this Jun 9, 2017
@batizty batizty reopened this Jun 9, 2017
@batizty
Author

batizty commented Jun 9, 2017

@rjagerman, could you please help me review the change? Thanks.

@rjagerman
Owner

Hi @batizty,

Thanks! This looks really nice! I haven't had the time yet to review it due to several projects and deadlines at work. I hope to review it some time next week.

@batizty
Author

batizty commented Jun 14, 2017

Hi @rjagerman,

Understood.

The YARN support feature is in use at weibo.com (you may or may not have heard of this site; it is a top-5 website in China, similar to Twitter but with more users in China), and it works well.

I have also developed some other features on Glint, including additional Save and Load operations that can be used to quickly store and read models in HDFS. I believe they would be useful for most Glint users who work on machine learning with large vectors and matrices.

If possible, I would like to become a contributor to Glint, because it is very simple and stable for large-scale machine learning.

Thank you for your work on Glint.

@rjagerman
Owner

Still haven't found the time to do it, too many deadlines unfortunately :-( I'll let you know when I get around to it.

@batizty
Author

batizty commented Jun 28, 2017

Got it.

Later I will send out another patch for Glint, which can be used to have each node store its parameters into HDFS independently.
I have tested this before: pulling an entire weight vector/matrix whose size is over 100m took more than 30 minutes. After I added a 'Save' operation that stores the weights directly from the parameter nodes, it took less than 1 minute.
I believe it will be useful for others who work on huge models.

Thanks.
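To illustrate the idea described above (each parameter node writing its own slice of the weights, instead of a client pulling the whole model first), here is a minimal Python sketch. It is not the code from this pull request and does not use Glint's API: `Shard`, `save_all`, `load_all`, and the `part-NNNNN` file naming are all hypothetical, and a local directory stands in for HDFS.

```python
import os
import tempfile


class Shard:
    """Hypothetical stand-in for one parameter node holding a slice of the weights."""

    def __init__(self, shard_id, start, values):
        self.shard_id = shard_id
        self.start = start      # global index of this shard's first weight
        self.values = values

    def save(self, directory):
        # The node writes only its own slice; no weights pass through the client.
        path = os.path.join(directory, f"part-{self.shard_id:05d}")
        with open(path, "w") as f:
            for offset, value in enumerate(self.values):
                f.write(f"{self.start + offset}\t{value}\n")
        return path


def save_all(shards, directory):
    # In a real cluster each node would run save() concurrently;
    # this sketch calls them one after another for simplicity.
    os.makedirs(directory, exist_ok=True)
    return [shard.save(directory) for shard in shards]


def load_all(directory):
    # Rebuild the full weight vector from the per-shard part files.
    weights = {}
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as f:
            for line in f:
                index, value = line.split("\t")
                weights[int(index)] = float(value)
    return [weights[i] for i in range(len(weights))]


if __name__ == "__main__":
    shards = [Shard(0, 0, [0.5, 1.5]), Shard(1, 2, [2.5])]
    out_dir = tempfile.mkdtemp()
    save_all(shards, out_dir)
    print(load_all(out_dir))
```

The point of the layout is that the write cost is spread across the nodes, so no single machine has to receive or hold the full model, which is consistent with the large speedup reported above.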

@baukloze

Hi, @batizty
I want to use Glint to store weights for machine learning algorithms, but it is too difficult to save the weights to a local file or an HDFS file. Fortunately, I found that you had already run into this problem and solved it. Could you please publish your branch? Thanks.

@batizty
Author

batizty commented Dec 26, 2017

Hi, @baukloze
Sorry, I forgot about this issue.

Could you please wait one or two days? I will send out my modifications ASAP.
I hope you like them.

By the way, @rjagerman, my colleagues and I have implemented basic ML algorithms on top of Glint, but they are not stable enough yet. When our data size reaches 1000B and the matrix/vector width reaches 500B, the heavy traffic load causes some of the Akka nodes to enter the Quarantined state. Do you have any suggestions or methods for fixing this problem?

@baukloze

@batizty ok, thanks.


3 participants