Docker autoscaling #15

Closed
wants to merge 249 commits into from
562d1bc
Fix up docker usage
richardliaw Jul 2, 2019
d7fab62
Merge branch 'master' into docker_autoscaling
richardliaw Jul 11, 2019
5f6fdca
Merge branch 'fix_autoscaling_tune' into docker_autoscaling
richardliaw Jul 11, 2019
81f8089
lint
richardliaw Jul 11, 2019
b5b23a5
lit
richardliaw Jul 11, 2019
aa43445
revert
richardliaw Jul 11, 2019
39a7b38
fix
richardliaw Jul 11, 2019
dd07cb5
lint
richardliaw Jul 11, 2019
3b42d5c
Track newly created actor's parent actor (#5098)
vipulharsh Jul 11, 2019
f46c555
Only get actor ID if actor task (#5180)
stephanie-wang Jul 12, 2019
0ec3a16
Fix Java MultithreadingTest (#5182)
raulchen Jul 12, 2019
1530389
[tune] Fast Node Recovery (#5053)
richardliaw Jul 12, 2019
65792ee
update mounting
richardliaw Jul 12, 2019
877b21f
lint
richardliaw Jul 12, 2019
b6509f4
Update wheels to 0.8.0dev2 (#5186)
richardliaw Jul 13, 2019
71b57cd
Merge branch 'master' into docker_autoscaling
richardliaw Jul 13, 2019
f5a87b8
Fix: ServerCallFactory's destructor not marked as virtual (#5185)
raulchen Jul 13, 2019
21b3cb5
config files
richardliaw Jul 13, 2019
322b516
Update arrow to include user defined status for plasma (#5156)
pcmoritz Jul 13, 2019
0c88733
nits
richardliaw Jul 13, 2019
69463b5
nit
richardliaw Jul 13, 2019
5b13a7e
Keep parameter space noise consistent with action space noise (Fix 51…
joneswong Jul 14, 2019
7342117
Fix a multithreading bug in grpc `ClientCall` (#5196)
raulchen Jul 15, 2019
ea6aa64
Reconstruct failed actors without sending tasks. (#5161)
raulchen Jul 15, 2019
fd71ffd
Improve release process 0.7.2 (#5187)
simon-mo Jul 15, 2019
e5be5fd
Remove dependencies from TaskExecutionSpecification (#5166)
edoakes Jul 16, 2019
8065243
[Java worker] Refactor object store and worker context on top of core…
kfstorm Jul 16, 2019
047f4cc
[rllib] Fix rollout.py with tuple action space (#5201)
ericl Jul 16, 2019
4fa2a60
[rllib] Remove nested import (#5204)
ericl Jul 16, 2019
3e0ad11
Add heartbeat test + Fix monitor.py (#5191)
richardliaw Jul 17, 2019
214f09d
[rllib] Make RLLib handle zero-length observation arrays (#5208)
qxcv Jul 17, 2019
ae03c42
Fixed inconsistent action placeholder (#5213)
joneswong Jul 17, 2019
81d297f
Remove redundant scaler of l2 reg (#5172)
joneswong Jul 17, 2019
63f49f9
Improve memory check (#5216)
MQQ Jul 18, 2019
0af07bd
Enable seeding actors for reproducible experiments (#5197)
joneswong Jul 18, 2019
b5b8c1d
[GCS] introduce new gcs client and refactor actor table (#5058)
micafan Jul 19, 2019
aa42328
[direct call] add local plasma provider (#5184)
zhijunfu Jul 19, 2019
28e5c55
[rllib] Move some inline defs to avoid deserialization errors (#5228)
ericl Jul 19, 2019
da7676c
Removed the implicit sync barrier at the end of each training iterati…
joneswong Jul 19, 2019
d58b986
[rllib] MultiCategorical shouldn't return array for kl or entropy (#5…
ericl Jul 19, 2019
b0c0de4
[tune] Fixup exception messages (#5238)
richardliaw Jul 21, 2019
f9043cc
[rllib] Remove experimental eager support
ericl Jul 21, 2019
53fb876
Improved KeyboardInterrupt Exception Handling (#5237)
richardliaw Jul 22, 2019
7fc15db
[autoscaler] Clean up error messages on setup failure (#5210)
richardliaw Jul 22, 2019
80b976e
Ray namespace added for k8s (#4111)
vakker Jul 22, 2019
fc58905
[sgd] Deprecate old distributed SGD implementation (#5160)
pschafhalter Jul 22, 2019
a3d4f9f
Fix the issue when passing multiple options in one string (#5241)
jovany-wang Jul 23, 2019
15959b0
Leave `ray.wait` calls open until the task or actor exits (#5234)
stephanie-wang Jul 23, 2019
9c651f4
Add regression test for actor load balancing (#5224)
stephanie-wang Jul 23, 2019
97c4328
[rllib] Fix trainer state restore (#5257)
ericl Jul 24, 2019
5b76238
Fix two types of eviction hangs (#5225)
ericl Jul 24, 2019
690b374
[rllib] Add Keras LSTM example with ModelV2 (#5258)
ericl Jul 24, 2019
60f5963
[rllib] Port DDPG to the build_tf_policy pattern (#5242)
ericl Jul 24, 2019
40395ac
[gRPC] Migrate raylet client implementation to grpc (#5120)
jiangzihao2009 Jul 25, 2019
bf9199a
[rllib] ModelV2 support for pytorch (#5249)
ericl Jul 25, 2019
3321555
Increase timeout for `ray.wait` test (#5273)
stephanie-wang Jul 25, 2019
6f682db
avoid copying ActorTableData when NodeMananger updates an actor to GC…
micafan Jul 26, 2019
8276182
[rllib] Configure learner queue timeout (#5270)
antoine-galataud Jul 26, 2019
6f737e6
Add CODEOWNERS file (#5259)
raulchen Jul 26, 2019
d9e81da
[tune] configurable maximum length of trial identifier (#5287)
llan-ml Jul 27, 2019
06fec63
[autoscaler] Add a 'request_cores' function for manual autoscaling (#…
ls-daniel Jul 27, 2019
7e71552
[sgd] Example for Training (#5292)
richardliaw Jul 27, 2019
5e15b36
[tune] experiment_analysis split to Analysis (#5115)
richardliaw Jul 27, 2019
9c00616
Retry and exception for hang on memory store full (#5143)
richardliaw Jul 27, 2019
a62c5f4
[rllib] Document ModelV2 and clean up the models/ directory (#5277)
ericl Jul 27, 2019
10cbcce
Correctly setting the input to Train (#3853)
LorenzoCevolani Jul 27, 2019
b4823d6
[autoscaler] Local YAML readability (#5290)
richardliaw Jul 27, 2019
341dbf6
[tune] support nested dictionaries for CSVLogger (#5295)
llan-ml Jul 27, 2019
6f2c5b2
Revert "[autoscaler] Clean up error messages on setup failure (#5210)…
ericl Jul 27, 2019
5ea859d
[sgd] hotfix example failure (#5297)
richardliaw Jul 28, 2019
1465a30
Fix releasing CPUs incorrectly when actor creation task blocked. (#5271)
jovany-wang Jul 28, 2019
3bdd114
[rllib] Better example rnn envs (#5300)
ericl Jul 28, 2019
3b00144
Bump version to 0.7.3 (#5301)
simon-mo Jul 29, 2019
1337c98
[rllib] Importance Sampling and KL Loss for APPO (#5051)
michaelzhiluo Jul 29, 2019
3ba8680
Bump version to 0.8.0.dev3 (#5308)
simon-mo Jul 30, 2019
b3bcf59
Rename ClientTableData to GcsNodeInfo (#5251)
micafan Jul 30, 2019
196495a
Fix Redis Test (#5302)
simon-mo Jul 30, 2019
eb307f9
Support direct actor call (#5183)
zhijunfu Jul 30, 2019
991e71d
Submit task asynchronously from raylet client (#5313)
raulchen Jul 30, 2019
63a6b0e
Fix bug in passing large arguments to tasks. (#5325)
robertnishihara Jul 31, 2019
e218e61
Lineage cache performance optimization to avoid duplicate GCS reques…
stephanie-wang Jul 31, 2019
b3c8091
Fix Tuple spaces in rollout.py (#5332)
flying-mojo Jul 31, 2019
1345802
[autoscaler] Change sys.exit(1) in update ssh_cmd (#5266)
hartikainen Jul 31, 2019
d762379
[Asyncio] Allow Async_API to init when loop is running (#5323)
simon-mo Jul 31, 2019
51b8915
Added CARLA Community Example (#5333)
layssi Aug 1, 2019
0391050
Fixed link in tune that was not working (#5331)
lukasfolle Aug 1, 2019
bd6dfc9
[sgd] Replaced class Resources in sgd with `use_gpu` (#5252)
jichan3751 Aug 1, 2019
20450a4
[rllib] Add rock paper scissors multi-agent example (#5336)
ericl Aug 1, 2019
3ae54a2
Fix log monitor read error (#5221)
ConeyLiu Aug 1, 2019
13fb9fe
[rllib] Feature/soft actor critic v2 (#5328)
hartikainen Aug 2, 2019
1eaa57c
[tune] Distributed example + walkthrough (#5157)
richardliaw Aug 2, 2019
25b5bd1
`ray stop` sends `SIGKILL` instead of `SIGTERM` (#5354)
simon-mo Aug 2, 2019
134c6bd
[direct call] In memory store (#5303)
zhijunfu Aug 5, 2019
67f9e22
[tune] Fix small bug in experiment_analysis (#5365)
Aug 5, 2019
955154a
Reduce Ray / RLlib startup messages (#5368)
ericl Aug 5, 2019
cc5c78b
Fix the issue of not initializing GLOG
jovany-wang Aug 5, 2019
32f2753
[tune] Pandas as soft dep
richardliaw Aug 6, 2019
384cbfb
Fix duplicated timeout logic in AbstractRayRuntime.get() (#5338)
kfstorm Aug 6, 2019
5d7afe8
[rllib] Try moving RLlib to top level dir (#5324)
ericl Aug 6, 2019
a08ea09
[docs] rewrite (#5175)
richardliaw Aug 6, 2019
0a3ff48
Send raylet error logs through the log monitor (#5351)
ericl Aug 6, 2019
02c5d2b
Add common preprocessing for each request in node manager. (#5296)
jiangzihao2009 Aug 6, 2019
94bff24
[docs] Hotfix for removing unneeded files (#5383)
richardliaw Aug 6, 2019
e3c9f7e
Custom action distributions (#5164)
mawright Aug 6, 2019
3ad2fe7
Cap concurrent requests (#5341)
raulchen Aug 6, 2019
e8d9cfc
Ray projects schema and validation (#5329)
pcmoritz Aug 6, 2019
094ec7a
[tune] Allow nested values in trial runner (#5346)
richardliaw Aug 6, 2019
281829e
MADDPG implementation in RLlib (#5348)
wsjeon Aug 6, 2019
d2e8331
[docs] remove table from walkthrough (#5389)
RehanSD Aug 7, 2019
50b93bf
Check upstream with `git remote` (#5377)
simon-mo Aug 7, 2019
d372f24
[ID Refactor] Refactor ActorID, TaskID and ObjectID (#5286)
jovany-wang Aug 7, 2019
7d747da
[rllib] [docs] Add some architecture diagrams (#5390)
ericl Aug 7, 2019
8d6c50c
Fix compiler warnings and make warnings fatal (#5375)
pcmoritz Aug 7, 2019
4a6ebe6
Fix setup (#5400)
ericl Aug 7, 2019
1f8ae17
Silence some installation process for build from source (#5396)
simon-mo Aug 7, 2019
ed89897
[tune,autoscaler] Test yaml, add better distributed docs (#5403)
richardliaw Aug 8, 2019
592f313
[rllib] Centralized critic / PPO example on TwoStepGame (#5392)
ericl Aug 8, 2019
d9b45cc
[Project] Implementing Project CLI (#5397)
simon-mo Aug 9, 2019
1a8fa5d
Clean up top level Ray dir (#5404)
ericl Aug 9, 2019
18f1e90
Bump 0.8.0.dev2 -> 0.8.0.dev3 (#5409)
simon-mo Aug 9, 2019
7e8a4a6
[tune] Add hyperopt warm start feature (#5372)
jredondopizarro Aug 9, 2019
df47bdf
Allow `address` instead of `redis_address` (#5412)
ericl Aug 10, 2019
de95117
[sgd] Tune interface for Pytorch MultiNode SGD (#5350)
jichan3751 Aug 10, 2019
8b6f0d3
[rllib] Fix output API when lz4 not installed (#5421)
ericl Aug 10, 2019
a1d2e17
[rllib] Autoregressive action distributions (#5304)
ericl Aug 10, 2019
983f3c8
[tune] Allow relative local_dir at tune.run() (#4734)
Aug 10, 2019
cc86271
[hotfix] fix Travis action dist test (#5428)
ericl Aug 11, 2019
b1e010f
Fix TestDirectActorTaskCrossNodesFailure test (#5406)
zhijunfu Aug 11, 2019
cff72d1
[minor][tune] update pbt docs (#5420)
richardliaw Aug 11, 2019
61b23a9
Don't stop Jupyter notebook in ray stop. (#5387)
robertnishihara Aug 11, 2019
158567b
Rename function to make actor example correct (#5432)
adamochayon Aug 12, 2019
3218ee3
[tune] Fix get_best_logdir behaviour (#5429)
TomVeniat Aug 12, 2019
79949fb
[rllib] RLlib in 60 seconds documentation (#5430)
ericl Aug 13, 2019
b7d0733
[tune] Implement BOHB (#5382)
lisadunlap Aug 13, 2019
1376f1a
[tune] Reporter crash fix (#5426)
nflu Aug 13, 2019
3a1e8d0
[tune] Fix Travis Blocking (#5448)
richardliaw Aug 14, 2019
16acd18
[tune] Quick Fix BOHB example (#5449)
richardliaw Aug 14, 2019
d7b3092
[tune] MLFlow Logger (#5438)
richardliaw Aug 14, 2019
3a85312
Update the pull request template (#5460)
stephanie-wang Aug 16, 2019
b1aae0e
[Java worker] Migrate task execution and submission on top of core wo…
kfstorm Aug 16, 2019
8ed353a
Fix json.loads compatibility issue for Python 3.5 (#5466)
mitchellstern Aug 17, 2019
bb31620
Add test for mutually recursive remote functions. (#5349)
robertnishihara Aug 17, 2019
47aa2b1
Make GCS Client thread-safe. (#5413)
micafan Aug 17, 2019
657ce4b
increase timeout for test_actor_lifetime_load_balancing (#5463)
raulchen Aug 17, 2019
03d05c8
Fix test_logging_to_driver and test_not_logging_to_driver (#5462)
raulchen Aug 17, 2019
599cc2b
Revert raylet to worker GRPC communication back to asio (#5450)
pcmoritz Aug 18, 2019
c7ae4e5
Check for dead processes in blocked ray start (#5458)
edoakes Aug 18, 2019
0440c00
Use subprocess.check_output in tests (#5465)
edoakes Aug 18, 2019
9d7e8c1
[docs] Added Instructions for Slurm (#5467)
gregSchwartz18 Aug 19, 2019
658e002
[rllib] Add autoregressive KL (#5469)
jon-chuang Aug 19, 2019
341c692
[Project] Add Basic Session CLI Commands (#5433)
simon-mo Aug 19, 2019
0916603
Fixed few broken links in docs (#5477)
holli Aug 19, 2019
cf98b1b
[autoscaler] Fix ssh control path length issue (#5476)
pcmoritz Aug 19, 2019
99a2f9f
Scale bazel HTTP timeout by 5x (#5482)
edoakes Aug 19, 2019
851c5b2
Add a script for benchmarking performance for Ray developers. (#5472)
robertnishihara Aug 20, 2019
da7bdac
support for subscription to an actor (#5269)
micafan Aug 20, 2019
f2b3c27
Fix direct actor transport not treating some tasks as failed (#5464)
raulchen Aug 20, 2019
e065f55
Fix impala stress test (#5491)
pcmoritz Aug 21, 2019
52a7c1d
modify ActorStateAccessor::AsyncGet callback (#5417)
micafan Aug 21, 2019
eab5957
Support multiple store providers in ObjectInterface (#5452)
zhijunfu Aug 21, 2019
c852213
[projects] Project examples and documentation (#5407)
pcmoritz Aug 21, 2019
e2e30ca
Ray, Tune, and RLlib support for memory, object_store_memory options …
ericl Aug 22, 2019
cdc9227
[tune] ASHA xgboost and lightgbm examples (#5500)
richardliaw Aug 22, 2019
f359333
Batch fetch requests in core worker get (#5342)
edoakes Aug 22, 2019
b520f61
[rllib] Adds eager support with a generic `TFEagerPolicy` class (#5436)
gehring Aug 23, 2019
7812dd5
[Java] Fix getCurrentActorId in multi-threading scenario. (#5506)
kfstorm Aug 23, 2019
239c177
[Java] Support calling functions returning void (#5494)
raulchen Aug 23, 2019
dbf7089
Bump version to 0.7.4 (#5474)
pcmoritz Aug 24, 2019
d2a6f79
[tune] Fix for keras threading (#5517)
richardliaw Aug 24, 2019
53fd66f
[Java] Destroy native core worker before killing ray processes (#5516)
kfstorm Aug 24, 2019
fab5ae6
[Java] Automatically clean up temp files. (#5507)
raulchen Aug 24, 2019
28623d2
Add docs for memory quota settings (#5441)
ericl Aug 25, 2019
d41963c
Fixed: missing brackets when appending proc info on OutOfMemory (#5530)
jcridev Aug 25, 2019
7d28bbb
[rllib] Document on traj postprocess (#5532)
ericl Aug 25, 2019
97ccd75
[rllib] Enable object store memory limit by default (#5534)
ericl Aug 26, 2019
f1dcce5
[projects] Add named commands to sessions (#5525)
pcmoritz Aug 26, 2019
948b1b0
Remove previous version of ray.serve (#5541)
simon-mo Aug 27, 2019
03a1b75
[rllib] Fix some eager execution regressions with 1.13 (#5537)
ericl Aug 27, 2019
ff73b67
fix test (#5544)
ericl Aug 27, 2019
d206963
Fix autoscaler format string for memory (#5542)
ericl Aug 27, 2019
52a6a1b
[tune] TF2.0 TensorBoard support (#5547)
idthanm Aug 27, 2019
ddfabab
Fix log files being opened as unicode files (#5545)
pcmoritz Aug 27, 2019
fadfa5f
[Java] `ObjectID::fromRandom` sets proper flags (#5548)
kfstorm Aug 28, 2019
411f30c
[docs] Second push of changes (#5391)
richardliaw Aug 29, 2019
e9d2d04
Make RAY_CHECK for actor re-creation non-fatal (#5553)
pcmoritz Aug 29, 2019
04b8696
Fix O(n^2) behavior in the log_monitor (#5569)
pcmoritz Aug 29, 2019
fb40787
[docs] Distributed Training Quickfix (#5571)
richardliaw Aug 29, 2019
fe5bd09
Fix rllib image in readme and doc typo (#5579)
ericl Aug 29, 2019
3823190
[rllib] Forgot to register param noise layer variables
ericl Aug 30, 2019
85a92bc
Bump version string to 0.8.0.dev4 (#5523)
pcmoritz Aug 30, 2019
93e1031
Update doc versions from 0.8.0.dev3 to 0.8.0.dev4. (#5585)
robertnishihara Aug 30, 2019
bea43c8
Ref count objects created with ray.put (#5590)
ericl Aug 30, 2019
3e70dab
Warn on resource deadlock; improve object store error messages (#5555)
ericl Aug 30, 2019
550c96b
[rllib] Add docs on policy.model (#5597)
ericl Aug 31, 2019
9f31cdf
[docs] Add Pull Request Template Check (#5600)
richardliaw Aug 31, 2019
747daff
Fix impala stress test (#5596)
pcmoritz Aug 31, 2019
daf38c8
[tune] Deprecate tune.function (#5601)
ericl Aug 31, 2019
0cc0abf
Create session_latest symlink for Ray sessions (#5580)
edoakes Sep 1, 2019
0292f99
Fix DeprecationWarning (#5608)
suquark Sep 1, 2019
c49b98c
[tune] Add encoder for PBT (#5599)
richardliaw Sep 1, 2019
a101812
Replace --redis-address with --address in test, docs, tune, rllib (#5…
ericl Sep 1, 2019
4cccfcc
Fix the rllib-stack image display problem (#5612)
suquark Sep 2, 2019
d37c09a
[docs] Add a feedback form (#5610)
richardliaw Sep 2, 2019
dfd2a45
Simplify symlinking and don't print warnings (#5615)
edoakes Sep 2, 2019
378757e
fix CallbackReply resize (#5589)
micafan Sep 3, 2019
c3b0c62
Remove the unused argument `cusomt_loggers` in Experiment. (#5619)
llan-ml Sep 3, 2019
4ed6ee0
Better error message for actor class inheritance (#5598)
edoakes Sep 3, 2019
0c68b4c
Clean up Wait() and Get() in the core worker (#5556)
edoakes Sep 3, 2019
1711e20
[training] Tensorflow interface for MultiNode SGD (#5440)
jichan3751 Sep 3, 2019
130b8f2
[tune] Global checkpointing for tune at end (#5499)
richardliaw Sep 3, 2019
6ab5714
Temporarily remove pytest-sugar dependency (#5627)
edoakes Sep 4, 2019
ad96d3c
[tune] Fix TB Memory Leak (#5629)
richardliaw Sep 4, 2019
8236936
Fix code style in unit test of GCS. (#5634)
micafan Sep 4, 2019
3ea9062
Lazily create summary writer for TF2 logger. (#5631)
llan-ml Sep 4, 2019
34f6d2f
[tune] Update trainable docs and support hparams (#5558)
richardliaw Sep 4, 2019
bb5609a
ignore object exists error for memory store provider (#5607)
zhijunfu Sep 5, 2019
f38bb28
Clean up Wait() in the core worker (#5628)
edoakes Sep 5, 2019
19bbf1e
[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern (#5626)
ericl Sep 5, 2019
edcc56e
Project fixes and cleanups (#5632)
stephanie-wang Sep 5, 2019
ddadc18
Fix bug in ray.errors and update its default behavior (#5576)
mitchellstern Sep 5, 2019
c33d666
Remove Modin from Ray wheels. (#5647)
devin-petersohn Sep 6, 2019
744f6e4
Update release documentation after 0.7.4 release (#5646)
pcmoritz Sep 6, 2019
8a352a8
`ray stop` kills processes more carefully (#5508)
kfstorm Sep 6, 2019
d89ceb3
[tune] Fix numerical error (#5653)
richardliaw Sep 7, 2019
732336f
[Java] Support multiple workers in Java worker process (#5505)
kfstorm Sep 7, 2019
1455a19
Consolidate and clean up documentation (#5645)
ericl Sep 7, 2019
cf90394
[rllib] Fix TF2 import of EagerVariableStore (#5625)
ericl Sep 7, 2019
d0125d4
Fix log monitor process error when attempting to read raylet PID. (#5…
pcmoritz Sep 7, 2019
cb7102f
[projects] Wrap ProjectDefinition in a class (#5654)
stephanie-wang Sep 8, 2019
d8f5804
Support metadata for passing by value task arguments (#5527)
kfstorm Sep 8, 2019
ebb431a
Add internal_api.pin_object_data() for pinning arbitrary object ids (…
ericl Sep 8, 2019
87adb5a
Remove timeout in test_actor_lifetime_load_balancing. (#5659)
robertnishihara Sep 8, 2019
74abeab
[rllib] Improve accessing model state docs (#5656)
ericl Sep 9, 2019
ed76190
[Java] Support direct actor call in Java worker (#5504)
kfstorm Sep 9, 2019
0010f54
Update Cloudpickle (#5643)
richardliaw Sep 10, 2019
147e7d4
[Flaky tests] FIx test fork (#5671)
simon-mo Sep 10, 2019
4d16677
Fix PyPI version in readme. (#5662)
robertnishihara Sep 10, 2019
336aef1
[tune] Save and Restore for bayesopt (#5623)
hershg Sep 10, 2019
9ce6dd9
[Projects] Add "session execute" (#5681)
pcmoritz Sep 11, 2019
2fdefe1
Take into account queue length in autoscaling (#5684)
ericl Sep 11, 2019
bc6a95d
[rllib] Eager execution for centralized critic example, fix simple op…
ericl Sep 11, 2019
faeaa34
Deflake cluster heartbeat test (#5552)
ericl Sep 11, 2019
946ebfa
[rllib] Validate that entropy coeff is not an integer (#5687)
kiddyboots216 Sep 11, 2019
0bf79cf
Properly short circuit core worker Get() on exception (#5672)
edoakes Sep 12, 2019
ee5db5b
Raise error if space in redis password (#5673)
edoakes Sep 12, 2019
1b88019
Replace NotImplementedException with UnsupportedOperationException (#…
kfstorm Sep 12, 2019
f2dee72
Merge branch 'master' into docker_autoscaling
richardliaw Sep 13, 2019
8eee418
autoscaler
richardliaw Sep 13, 2019
1ec1174
nit
richardliaw Sep 13, 2019
6 changes: 6 additions & 0 deletions .bazelrc
@@ -2,5 +2,11 @@
build --compilation_mode=opt
build --action_env=PATH
build --action_env=PYTHON_BIN_PATH
# Warnings should be errors
build --per_file_copt=-src/ray/thirdparty/hiredis/dict.c,-.*/arrow/util/logging.cc@-Werror
# Ignore warnings for protobuf generated files and external projects.
build --per_file_copt='\\.pb\\.cc$@-w'
build --per_file_copt='external*@-w'
# This workaround is needed due to https://github.com/bazelbuild/bazel/issues/4341
build --per_file_copt="external/com_github_grpc_grpc/.*@-DGRPC_BAZEL_BUILD"
build --http_timeout_scaling=5.0
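
The `--per_file_copt` lines above follow Bazel's documented `[+-]regex[,[+-]regex]...@option[,option]...` syntax, where `-` marks an exclude pattern. A minimal Python sketch of how such a filter could decide which files receive an option is shown below. It is only an illustration, not Bazel's implementation: the helper names are hypothetical, and it assumes unanchored (search-style) regex matching and that an empty include list makes every file a candidate.

```python
import re


def parse_per_file_copt(value):
    """Split '[+-]regex[,[+-]regex]...@option[,option]...' into its parts.

    Hypothetical helper; it does not handle commas or '@' inside a regex.
    """
    filters, _, options = value.partition("@")
    includes, excludes = [], []
    for item in filters.split(","):
        if item.startswith("-"):
            excludes.append(item[1:])   # '-' prefix: exclude pattern
        elif item.startswith("+"):
            includes.append(item[1:])   # '+' prefix: include pattern
        else:
            includes.append(item)       # no prefix: treated as include
    return includes, excludes, options.split(",")


def applies(value, path):
    """Return True if the option in `value` would be passed for `path`.

    Assumption: no include patterns means "all files are candidates".
    """
    includes, excludes, _ = parse_per_file_copt(value)
    included = not includes or any(re.search(p, path) for p in includes)
    excluded = any(re.search(p, path) for p in excludes)
    return included and not excluded


# The -Werror rule from the diff: every file except the two excluded ones.
WERROR = "-src/ray/thirdparty/hiredis/dict.c,-.*/arrow/util/logging.cc@-Werror"
# The protobuf rule, written with single backslashes as Python sees them.
PB_QUIET = r"\.pb\.cc$@-w"
```

Under these assumptions, `applies(WERROR, "src/ray/raylet/node_manager.cc")` is true while the hiredis and Arrow logging files are skipped, which matches the intent of the "Warnings should be errors" comment.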
38 changes: 38 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,38 @@
# See https://help.github.com/articles/about-codeowners/
# for more info about CODEOWNERS file

# It uses the same pattern rule for gitignore file,
# see https://git-scm.com/docs/gitignore#_pattern_format.

# ==== Ray core ====

# All C++ code.
/src/ray @ray-project/ray-core-cpp

# Python worker.
/python/ray/ @ray-project/ray-core-python
!/python/ray/tune/ @ray-project/ray-core-python
!/python/ray/rllib/ @ray-project/ray-core-python

# Java worker.
/java/ @ray-project/ray-core-java

# ==== Libraries and frameworks ====

# Ray tune.
/python/ray/tune/ @ray-project/ray-tune

# RLlib.
/python/ray/rllib/ @ray-project/rllib

# ==== Build and CI ====

# Bazel.
/BUILD.bazel @ray-project/ray-core
/WORKSPACE @ray-project/ray-core
/bazel/ @ray-project/ray-core

# CI scripts.
/.travis.yml @ray-project/ray-core
/ci/travis/ @ray-project/ray-core
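
GitHub's CODEOWNERS documentation states that when several patterns match a file, the last matching pattern takes precedence. The sketch below illustrates that rule for the directory entries in this file. It is a simplification under stated assumptions: the `owner_for` helper is hypothetical, it only models root-anchored directory prefixes rather than full gitignore-style globbing, and it leaves out the two `!` lines because GitHub's documentation lists `!` negation among the gitignore features that CODEOWNERS does not support.

```python
# (pattern, owner) pairs in file order, copied from the CODEOWNERS diff above.
RULES = [
    ("/src/ray", "@ray-project/ray-core-cpp"),
    ("/python/ray/", "@ray-project/ray-core-python"),
    ("/java/", "@ray-project/ray-core-java"),
    ("/python/ray/tune/", "@ray-project/ray-tune"),
    ("/python/ray/rllib/", "@ray-project/rllib"),
]


def owner_for(path, rules=RULES):
    """Return the owner of `path` (repo-relative, no leading slash).

    Simplified model: each pattern is treated as a root-anchored directory
    prefix, and a later matching rule overrides an earlier one.
    """
    owner = None
    for pattern, team in rules:
        prefix = pattern.strip("/")
        if path == prefix or path.startswith(prefix + "/"):
            owner = team  # last match wins
    return owner
```

With this ordering, `owner_for("python/ray/tune/trial.py")` resolves to `@ray-project/ray-tune` even though the broader `/python/ray/` rule also matches, because the Tune rule appears later in the file.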

7 changes: 4 additions & 3 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,13 +1,14 @@
<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. -->

## What do these changes do?

## Why are these changes needed?

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Linter
## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://ray.readthedocs.io/en/latest/.
3 changes: 0 additions & 3 deletions .gitignore
@@ -12,9 +12,6 @@
/src/ray/raylet/format/*_generated.h
/java/runtime/src/main/java/org/ray/runtime/generated/*

# Modin source files
/python/ray/modin

# Redis temporary files
*dump.rdb

9 changes: 0 additions & 9 deletions .travis.yml
@@ -149,9 +149,6 @@ install:
- ./ci/suppress_output ./ci/travis/install-cython-examples.sh

- ./ci/suppress_output bash src/ray/test/run_gcs_tests.sh
# stats test.
- ./ci/suppress_output bazel build //:stats_test -c opt
- ./bazel-bin/stats_test

# core worker test.
- ./ci/suppress_output bash src/ray/test/run_core_worker_tests.sh
@@ -175,12 +172,6 @@ script:
# `cluster_tests.py` runs on Jenkins, not Travis.
- if [ $RAY_CI_TUNE_AFFECTED == "1" ]; then python -m pytest --durations=10 --timeout=300 --ignore=python/ray/tune/tests/test_cluster.py --ignore=python/ray/tune/tests/test_tune_restore.py --ignore=python/ray/tune/tests/test_actor_reuse.py python/ray/tune/tests; fi

# ray rllib tests
- if [ $RAY_CI_RLLIB_AFFECTED == "1" ]; then ./ci/suppress_output python python/ray/rllib/tests/test_catalog.py; fi
- if [ $RAY_CI_RLLIB_AFFECTED == "1" ]; then ./ci/suppress_output python python/ray/rllib/tests/test_filters.py; fi
- if [ $RAY_CI_RLLIB_AFFECTED == "1" ]; then ./ci/suppress_output python python/ray/rllib/tests/test_optimizers.py; fi
- if [ $RAY_CI_RLLIB_AFFECTED == "1" ]; then ./ci/suppress_output python python/ray/rllib/tests/test_evaluators.py; fi

# ray tests
# Python3.5+ only. Otherwise we will get `SyntaxError` regardless of how we set the tester.
- if [ $RAY_CI_PYTHON_AFFECTED == "1" ]; then python -c 'import sys;exit(sys.version_info>=(3,5))' || python -m pytest -v --durations=5 --timeout=300 python/ray/experimental/test/async_test.py; fi
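
The version gate in the line above relies on an inverted exit code: `exit(True)` ends the interpreter with status 1, so on Python 3.5+ the first command "fails" and the shell's `||` goes on to run pytest, while on older interpreters it exits 0 and pytest is skipped. The following sketch reproduces just that gate with the current interpreter. The `gate_exit_code` helper is hypothetical and exists only for illustration.

```python
import subprocess
import sys


def gate_exit_code(minimum=(3, 5)):
    """Run the same one-liner as the Travis script and return its exit status.

    exit(True) maps to status 1 and exit(False) to status 0, so a status of 1
    means "interpreter is new enough" and lets the `||` branch run.
    """
    code = "import sys; exit(sys.version_info >= {!r})".format(minimum)
    return subprocess.run([sys.executable, "-c", code]).returncode


if __name__ == "__main__":
    status = gate_exit_code()
    print("pytest would run" if status != 0 else "pytest would be skipped")
```

The inversion is easy to misread, which is presumably why the original line carries the "Python3.5+ only" comment.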
132 changes: 105 additions & 27 deletions BUILD.bazel
@@ -52,7 +52,7 @@ proto_library(

cc_proto_library(
name = "node_manager_cc_proto",
deps = ["node_manager_proto"],
deps = [":node_manager_proto"],
)

proto_library(
@@ -62,7 +62,7 @@ proto_library(

cc_proto_library(
name = "object_manager_cc_proto",
deps = ["object_manager_proto"],
deps = [":object_manager_proto"],
)

proto_library(
@@ -87,11 +87,22 @@ cc_proto_library(
deps = ["core_worker_proto"],
)

proto_library(
name = "direct_actor_proto",
srcs = ["src/ray/protobuf/direct_actor.proto"],
deps = [":common_proto"],
)

cc_proto_library(
name = "direct_actor_cc_proto",
deps = ["direct_actor_proto"],
)

# === End of protobuf definitions ===

# === Begin of rpc definitions ===

# grpc common lib
# GRPC common lib.
cc_library(
name = "grpc_common_lib",
srcs = glob([
@@ -141,7 +152,7 @@ cc_grpc_library(
deps = [":object_manager_cc_proto"],
)

# Object manager server and client.
# Object manager rpc server and client.
cc_library(
name = "object_manager_rpc",
hdrs = glob([
@@ -157,14 +168,22 @@ cc_library(
],
)

# worker gRPC lib.
# Worker gRPC lib.
cc_grpc_library(
name = "worker_cc_grpc",
srcs = [":worker_proto"],
grpc_only = True,
deps = [":worker_cc_proto"],
)

# direct actor gRPC lib.
cc_grpc_library(
name = "direct_actor_cc_grpc",
srcs = [":direct_actor_proto"],
grpc_only = True,
deps = [":direct_actor_cc_proto"],
)

# worker server and client.
cc_library(
name = "worker_rpc",
@@ -173,6 +192,7 @@ cc_library(
]),
copts = COPTS,
deps = [
"direct_actor_cc_grpc",
":grpc_common_lib",
":ray_common",
":worker_cc_grpc",
@@ -201,6 +221,7 @@ cc_library(
copts = COPTS,
deps = [
":common_cc_proto",
":gcs_cc_proto",
":node_manager_fbs",
":ray_util",
"@boost//:asio",
@@ -326,6 +347,7 @@ cc_library(
[
"src/ray/core_worker/*.cc",
"src/ray/core_worker/store_provider/*.cc",
"src/ray/core_worker/store_provider/memory_store/*.cc",
"src/ray/core_worker/transport/*.cc",
],
exclude = [
@@ -336,6 +358,7 @@ cc_library(
hdrs = glob([
"src/ray/core_worker/*.h",
"src/ray/core_worker/store_provider/*.h",
"src/ray/core_worker/store_provider/memory_store/*.h",
"src/ray/core_worker/transport/*.h",
]),
copts = COPTS,
@@ -347,21 +370,36 @@ cc_library(
# should only depend on `raylet_client`, instead of the whole `raylet_lib`.
":raylet_lib",
":worker_rpc",
":gcs",
],
)

cc_binary(
name = "mock_worker",
srcs = ["src/ray/core_worker/mock_worker.cc"],
cc_library(
name = "mock_worker_lib",
srcs = ["src/ray/core_worker/test/mock_worker.cc"],
hdrs = glob([
"src/ray/core_worker/test/*.h",
]),
copts = COPTS,
deps = [
":core_worker_lib",
],
)

cc_binary(
name = "core_worker_test",
srcs = ["src/ray/core_worker/core_worker_test.cc"],
name = "mock_worker",
copts = COPTS,
deps = [
":mock_worker_lib",
],
)

cc_library(
name = "core_worker_test_lib",
srcs = ["src/ray/core_worker/test/core_worker_test.cc"],
hdrs = glob([
"src/ray/core_worker/test/*.h",
]),
copts = COPTS,
deps = [
":core_worker_lib",
@@ -370,6 +408,14 @@ cc_binary(
],
)

cc_binary(
name = "core_worker_test",
copts = COPTS,
deps = [
":core_worker_test_lib",
],
)

cc_test(
name = "lineage_cache_test",
srcs = ["src/ray/raylet/lineage_cache_test.cc"],
@@ -403,6 +449,16 @@ cc_test(
],
)

cc_test(
name = "id_test",
srcs = ["src/ray/common/id_test.cc"],
copts = COPTS,
deps = [
"ray_common",
"@com_google_googletest//:gtest_main",
],
)

cc_test(
name = "logging_test",
srcs = ["src/ray/util/logging_test.cc"],
@@ -585,10 +641,33 @@ cc_library(
],
)

# TODO(micafan) Replace cc_binary with cc_test for GCS test.
cc_binary(
name = "gcs_client_test",
name = "redis_gcs_client_test",
testonly = 1,
srcs = ["src/ray/gcs/client_test.cc"],
srcs = ["src/ray/gcs/redis_gcs_client_test.cc"],
copts = COPTS,
deps = [
":gcs",
"@com_google_googletest//:gtest_main",
],
)

cc_binary(
name = "actor_state_accessor_test",
testonly = 1,
srcs = ["src/ray/gcs/actor_state_accessor_test.cc"],
copts = COPTS,
deps = [
":gcs",
"@com_google_googletest//:gtest_main",
],
)

cc_binary(
name = "subscription_executor_test",
testonly = 1,
srcs = ["src/ray/gcs/subscription_executor_test.cc"],
copts = COPTS,
deps = [
":gcs",
@@ -647,13 +726,11 @@ pyx_library(
)

cc_binary(
name = "libraylet_library_java.so",
srcs = [
"src/ray/raylet/lib/java/org_ray_runtime_raylet_RayletClientImpl.h",
"src/ray/raylet/lib/java/org_ray_runtime_raylet_RayletClientImpl.cc",
"src/ray/common/id.h",
"src/ray/raylet/raylet_client.h",
"src/ray/util/logging.h",
name = "libcore_worker_library_java.so",
srcs = glob([
"src/ray/core_worker/lib/java/*.h",
"src/ray/core_worker/lib/java/*.cc",
]) + [
"@bazel_tools//tools/jdk:jni_header",
] + select({
"@bazel_tools//src/conditions:windows": ["@bazel_tools//tools/jdk:jni_md_header-windows"],
@@ -671,24 +748,23 @@ cc_binary(
linkshared = 1,
linkstatic = 1,
deps = [
"//:raylet_lib",
"@plasma//:plasma_client",
"//:core_worker_lib",
],
)

genrule(
name = "raylet-jni-darwin-compat",
srcs = [":libraylet_library_java.so"],
outs = ["libraylet_library_java.dylib"],
name = "core_worker-jni-darwin-compat",
srcs = [":libcore_worker_library_java.so"],
outs = ["libcore_worker_library_java.dylib"],
cmd = "cp $< $@",
output_to_bindir = 1,
)

filegroup(
name = "raylet_library_java",
name = "core_worker_library_java",
srcs = select({
"@bazel_tools//src/conditions:darwin": [":libraylet_library_java.dylib"],
"//conditions:default": [":libraylet_library_java.so"],
"@bazel_tools//src/conditions:darwin": [":libcore_worker_library_java.dylib"],
"//conditions:default": [":libcore_worker_library_java.so"],
}),
visibility = ["//java:__subpackages__"],
)
@@ -712,6 +788,8 @@ filegroup(
"python/ray/dashboard/res/main.js",
"python/ray/experimental/*.py",
"python/ray/internal/*.py",
"python/ray/projects/*.py",
"python/ray/projects/schema.json",
"python/ray/workers/default_worker.py",
]),
)