-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add safe_load #119
Comments
Can you please be more specific about the desired behavior? Is it acceptable to load Symbols? Are subclasses of Ruby primitives considered "primitive"? If someone sets an instance variable on a string, then dumps / loads that string, should the instance variable survive? Can you please demonstrate how to use YAML dump / load to execute arbitrary code? AFAIK it depends on objects other than YAML. |
Which is why I believe there should be a |
@postmodern does the |
Given these points, why not just use JSON? |
JSON does not support Symbols, does not support non-String-keyed Hashes and does not have special formatting such as:
Is there any reason YAML should not have a secure mode? |
@tenderlove @charliesome's PoC used These exploits would not work if the YAML parser was configured to only allow primitives (aka |
+1. Only primitives that YAML can deserialize without using No symbols IMO. I can't see a use case where you would want to allow symbols but still restrict other deserialization. |
👍 on what @postmodern and @charliesome are saying. I really wish it supported a way for it to skip over everything and raise what isn't in the specs by default, and certainly like what @charliesome said about skipping over symbols too... the symbols part is kind of important to me because it would make it easier to keep a consistent configuration file to API without having to run to_s on all the keys myself before I create an indifferent Hash. |
Instead of having a big |
There is no other option other than |
@postmodern so YAML didn't execute the code, but Rails did?
I don't particularly care whether it does or not. We just have to be specific about what it means. Also, we need someone to write the patch. :-) |
@tenderlove YAML just happened to call Please see the Rails PoC write up for a full walk through. |
Ah, so it's not specific to YAML, but anything that will feed strings to eval. Makes sense. Aaron Patterson On Jan 13, 2013, at 7:26 PM, Postmodern notifications@github.com wrote:
|
Marshal.load has this problem as well and so would any serialization mechanism that does the following with arbitrary Foo classes and arbitrary field values.
|
@benmmurphy it's in |
I think safe load can have a very simple definition: It simply loads the YAML as if no Psych can add a |
👍 on @trans tag callback idea. |
👍 @trans. The definition is easy to grok. |
I have been toying with my own ideas for a framework to do rpc/serialize objects for a while. I really don't know if this is the correct approach. Attribute assignment safety is not, and, in a language such as ruby, cannot ever be global. Indeed, some of the high-level classes (drb, tk, and the like) are the sorts of things that would make me nervous unless I examined them. (I count 82 instances of []= in /usr/local/lib/ruby/2.0.0.) On the other hand, there are a huge number of user classes for which []= is perfectly safe. My approach is to whitelist. You can examine the existing ruby base and stdlib classes and determine if there is a problem with any of them. There are several ways to proceed from there, but most likely, the data should be held in a data structure inside Psyche. All classes not registered as safe then are unsafe. A facility would be provided to register a class as safe. (HashWithIndifferentAccess comes to mind) Presumably, Psyche might probe a previously unknown class to see if it lists itself as safe. The next issue is attribute assignment generally. Again, if a whitelist approach is taken, (and exact matches required), it become fairly simple to proceed. Indeed, both of these facilities might well specify that some inputs are safe but not others. With this approach, you have the full ability to do attribute assignment where it is known to be safe. And if a maintainer does not want to bother, then their classes are simply not safe. A global switch that says, "while processing the following string, consider everything as safe" would bring back the existing behaviour--which is exactly what would be needed in some cases. |
@Student Protecting setters is your problem, not Pysch's problem. Psych is not your object-guardian. The goal is to block unserializing attributes. Example: If people don't want symbols then they come out as ":value" rather than |
The question to me seems to be how much pain a user must endure in order to make use of a tool. If YAML is to compete with JSON, then YAML cannot require the user to audit their entire object library for []= weirdness. What's more, I doubt that I'm the only one that did not know that YAML was calling []= until last week--people won't even know about the threat. You can sit back at say nasty things about people who do weird stuff with []= or you can make a library that doesn't regularly get implicated in security problems. Again, what I'm saying is that safety requires that unsafe actions be screened out by default. To me, that suggests an audit of Ruby's base classes and a facility for classes to register themselves as safe to whatever degree. |
Do we really needs this ? I have a strange feeling that making safe_load will be really hard and in the end nobody will use it because it will be so limited. Lets just not use YAML as something we accept from requests/external sources and just parse. If someone wants to do something like this he should explicitly say he wants to expose himself to problems that it carries with. In my opinion we should use only/mostly JSON and Ruby. For config files ruby is so expressive that we can have instead of databas.yml like things something that will just load .rb file. aka
Or something similar to this. Do we need YAML at all ? |
@JakubOboza How is it going to be limited? The fact of the matter is it already has an unserializer to move non-primates into their Objects, at that point it's a matter of having that Object tell you what kind of non-primative the Object wants to be and then you deciding if you want that non-primative, or giving you the ability to block non-primatives (like symbols) right off the bat. Since it has to have a decision engine for them to begin with (see lib/psych/visitors/to_ruby.rb) the logic for rejection is not really all that hard to add in. The callbacks might be hard, but only in the sense of how do you do it properly, not how do you implement it. |
Yeah right. So first of all do symbols get ever garbage collected ? No. So basically someone can makes series of reuqests with dozens of symbols and make your VM grow to insane size and die. Performing this DoS attack is not so hard. So making |
@JakubOboza Using Ruby is the exact opposite of the solution to this issue. It would create even more security issues --insurmountable ones. @Student Is your whitelist solution more complicated than we need? I wonder if whitelisting prior to loading is necessary when we can just safe-load plain YAML and then instantiate any nodes we need proactively. In other words, we can simply apply our own whitelist solution after loading given the plain YAML. That way Psych doesn't need to handle it and all the additional code and api it would entail. However, I could see a nice general-purpose method for traversing arrays and hashes to do these conversions would be helpful to make that dead simple. |
Why ? if you use ruby in config files and never autoparse yaml in requests ? I gave the example just to show that we don't need yaml. Everything you need from serialization format is list of objects thingy |
@JakubOboza I think you should take a second, recollect your thoughts and come back when you are more rational and not spewing trash. Psych is what makes the non-primitives. So what you are saying is broken and flawed logic because safe_load would prevent the symbols from ever being created and would keep them as primitive strings. |
First, an apology. I forgot that in general YAML is creating instances of objects, then making calls on these instances or setting attributes on them directly. Some of my first comments were coloured by that error. Clearly, it is safe to set attributes on a newly created instance, so there is no concern there. However, if a class uses a singleton object, this is no longer safe, as global state is being overwritten. (I observe no occurrences of "ingleton" in my ruby 1.9.3p327 source.) []=, is another matter entirely, however. []= is no more a setter than a method ending in ? is a boolean probe. In fact, I would argue that it is less so. There is nothing at all wrong with having some sort of generic class that implements []= as sending messages to other classes. It just breaks some assumptions. Now as I mentioned before, someone who wants to use gem X (which depends on gem Y) should not have to do research to figure out if it is safe to do so with YAML parsing external input. And just because I need to use a gem that needs a gem that implements an obscure class that is not safe, why should I be prevented from using YAML in the normal way for my entire application? I audited the rails lib classes yesterday, and as of 2.0.0.preview2, they are clean. It is easy enough to create a structure to hold this data. When a unknown classes is encountered, it can be queried to determine if it has a yaml_safe method. If it returns false, then not even attributes will be set. If it returns true, then []= will be called with abandon. If it returns a symbol, that method will be called instead of []=. If it returns something that responds to call.... Otherwise, the class is registered as unknown, and attributes will be set but []= will not be called. Of course, the global settings can also be used, and the end user can explicitly register whatever sort of bizarre rules make sense for them. Let the user decide! With such a facility, the community has an easy way to register the external-input safety of their classes, and 99.9% of the people can proceed in ignorant bliss. I REALLY want to be able to use YAML for parsing external input. JSON is entirely too lowest-common-denominator. XML is stilted at best. Marshall practically requires both ends to be running the same everything. DrB is for internal use only. |
This shit is bananas. Seriously though, this thread turned from useful into a bag full of "come again say what?" |
I'm affraid, my experience in the ruby world seems to suggest that ruby programmers tend to favor convenience over almost everything, including security. Given what you've already told about your project and concerns, I'm not sure the ruby/rails culture will fit very well. |
@blambeau I'll just leave this here:
Just because it says it can be used for the internet doesn't mean it's a good idea and doesn't mean "should". |
@Kimcha changing
I do not agree that this is a binary choice between convenience and security, that we must be forced into choosing one or the other. It is simply the matter of the right tool for the right job:
This is where one should take extreme measures, such as auditing their code-base for any
You assume that every developer (and user) would immediately upgrade to the latest version of Psych containing |
@Kimcha something I haven't told you about the ruby world: confusion. A nice example from @postmodern here:
Very strong confusion between functional requirements (loading data that might or might not contain only scalars) and non-functional requirements (keeping the operation safe). There is no such |
continuing #119 (comment) - I assume that every developer does not write Would it be hard to announce an deprecation today, add a warning in a year, allow flag in two years and switch the |
I think YAML.load should remain as it is, .load implies that we are trusting the input, just as in Marshal.load or CSV.load. Perhaps, by analogy, the primitives-only version should be YAML.parse? |
If you want to fall inline with JSON then sure, which I wouldn't mind since it means consistency. |
since it hasn't been mentioned, here's documentation that contains examples of python's / pyyaml's take on safe_load: http://pyyaml.org/wiki/PyYAMLDocumentation |
|
Thank you! |
w00t finally! |
We can't just switch to `YAML.safe_load()` ourselves since that would break backwards compatibility. For instance, `safe_load` returns nil for empty yaml documents where `load` returns `false`. Also, `safe_load` will refuse to parse Symbol keys since DoS attacks targeting symbols are a real thread. Finally, not every Ruby version has a Psych that supports `safe_laod`. ruby/psych#119 (comment) Fixes #92
There's a vulnerability in `YAML.load` which can enable arbitrary code to be run on our sytems. `YAML.safe_load` does not deserialize unsafe classes. Related reading: http://blog.codeclimate.com/blog/2013/01/10/rails-remote-code-execution-vulnerability-explained/ ruby/psych#119 http://docs.ruby-lang.org/en/2.1.0/Psych.html#method-c-safe_load
We can't just switch to `YAML.safe_load()` ourselves since that would break backwards compatibility. For instance, `safe_load` returns nil for empty yaml documents where `load` returns `false`. Also, `safe_load` will refuse to parse Symbol keys since DoS attacks targeting symbols are a real thread. Finally, not every Ruby version has a Psych that supports `safe_laod`. ruby/psych#119 (comment) Fixes #92
An issue from June 2014, lostisland#92, raises the risks of the current `FaradayMiddleware::ParseYaml` middleware which uses `YAML.load`. This method is very unsafe and exposes you to remote code execution - see ruby/psych#119 for discussion. At the time, @mislav decided not to make this change to avoid messing with backwards compatability. I would suggest that we should revisit this decision - the risks of this are very high, very few people are using this middleware most likely, and it doesn't seem unreasonable to break this as long as we are clear on the change in the changelog. This does that by installing the `safe_yaml` gem, which is compatible with all Ruby versions we support.
In lieu of the recent Rails YAML RCE vulnerability, Psych should provide a safe_load equivalent method, that only loads Ruby primitives.
The text was updated successfully, but these errors were encountered: