Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting service should return default value when json scheme is invalid #125422

Open
lramos15 opened this issue Jun 3, 2021 · 13 comments
Open

Setting service should return default value when json scheme is invalid #125422

lramos15 opened this issue Jun 3, 2021 · 13 comments
Assignees
Labels
config VS Code configuration, set up issues debt Code quality issues

Comments

@lramos15
Copy link
Member

lramos15 commented Jun 3, 2021

To protect against invalid values the setting service should provide some sort of sanitization. Reading back a broken user setting can cause unexpected behavior. This is particularly evident in the rare cases where we update a setting in insiders and it syncs to stable causing unrecoverable exceptions.

My proposal is to sanitize user input to ensure that the expected type is what is in the user's settings.json if not then we return the default value.

/cc @bpasero As this relates to our discussion earlier. Feel free to add anything

@lramos15 lramos15 added debt Code quality issues settings-editor VS Code settings editor issues labels Jun 3, 2021
@lramos15 lramos15 self-assigned this Jun 3, 2021
@bpasero
Copy link
Member

bpasero commented Jun 3, 2021

cc @sandy081, if possible the IConfigurationService should return the next valid value (up to the default), if the configured setting does not match the JSON schema.

In this particular case, Logan changed a setting from an array to a object and a lot of users got issues when syncing the same setting between stable and insiders.

@bpasero bpasero added config VS Code configuration, set up issues and removed settings-editor VS Code settings editor issues labels Jun 3, 2021
@sandy081
Copy link
Member

sandy081 commented Jun 3, 2021

Yeah, I got to know about this stable-insiders sync issue that break backward compatibility. Regarding validating the value by schema, how deep and broad you want to go as the JSON schema can go deeper (objects) and broader (multiple types, enums etc). It would be nice if we can reuse JSON language server instead of rewriting. Also how much cost it gonna add and impact performance. Is not it easy for the consumer to validate the value and go up the chain?

@sandy081 sandy081 added the under-discussion Issue is under discussion for relevance, priority, approach label Jun 3, 2021
@lramos15
Copy link
Member Author

lramos15 commented Jun 3, 2021

I don't think you would have to go very broad, but don't we already do JSON schema validation when we show the warnings and errors in the settings.json? The consumer could of course do the validation but it might be good to have a catch all inside the configuration service to at least ensure we are returning the default value they specify in the read if the value is incorrect. I don't think many think to do any validation. From a performance impact my guess would be not much since it's a very small data set and could be optimized easily. I go back and forth if it would be worth it, but I felt like the discussion was important to have as it would lead to more error resilient code.

@Tyriar
Copy link
Member

Tyriar commented Jun 3, 2021

Some concerns/thoughts:

  • Modifying what setting value is synced was mentioned in standup, this happening upon only when reading out seems like a much safer approach.
  • I like that the whole file is synced now, I wouldn't want it to be synced as a validated version and then stable might sync back and remove that setting. It would be complex internally and also for the user to understand.
  • Sometimes "invalid" settings values are still supported in code, for example a setting may have changed from boolean to an enum but still accept boolean values by design.
  • We shouldn't be spending much time optimizing for the case where insiders is using the stable settings service imo as it's an edge case (insiders users) of an edge case (sharing with stable).

@lramos15
Copy link
Member Author

lramos15 commented Jun 3, 2021

@Tyriar I apologize if I misspoke during standup. The intention is to do validation on read and not modify what is synced or what is contained in the user file.

Sometimes "invalid" settings values are still supported in code, for example a setting may have changed from boolean to an enum but still accept boolean values by design.

This is true and would pose a challenge. I would assume there would be an option to read a raw unvalidated setting if you wish. I found aggressively converting everyone's setting on startup / on write to the settings.json to be much more user friendly than supporting and allowing the user to have both formats. This has the additional benefit of only needing to support an old format for a few iterations because we can guarantee all users who have updated VS Code in the past few months have the new setting.

@sandy081
Copy link
Member

sandy081 commented Jun 4, 2021

Regarding settings sync I agree with @Tyriar comment.

Regarding configuration service returning the valid values always:

I don't think you would have to go very broad, but don't we already do JSON schema validation when we show the warnings and errors in the settings.json?

JSON schema validation is done by JSON language server which is an extension. Configuration service just provides the schema for JSON extension to validate. So it does not have access to JSON language server and so cannot reuse it in core.

The consumer could of course do the validation but it might be good to have a catch all inside the configuration service to at least ensure we are returning the default value they specify in the read if the value is incorrect.

@Tyriar made a good point that some times consumers accept invalid values and do migration. I would expect configuration service return what is there in the settings file by default and it is up to the consumer to check the validity of it.

From a performance impact my guess would be not much since it's a very small data set and could be optimized easily.

I think it does impact perf. If the underlying model validates the value then it impacts the start up perf. Configuration service is the very first service we instantiate and initialize as the complete workbench depends on it. Introducing validation will definitely add cost. If we do validation, we have to do it completely and in all settings.json files which is expensive for sure.

If we validate only the value that is being asked, I would think it is still expensive as workbench is heavy consumer of configuration service and is called large number of times and sometimes multiple times with same key. Also note that configuration service can return the complete settings tree if somebody does not provide any section/key in getValue. Another scenario is that editor ask for complete editor settings very often. I feel this is going to be a big change to do it efficiently.

May I know if this ripples down to extensions too? If so are not they broken now? How can they read the value that is there in the file?

So, I am not sure I would go in this direction. But I see a requirement here for those consumers who would like to validate value. For this, I would suggest to introduce another API / Service which can validate the value as per schema so that consumers can check before applying.

One thing which I have been thinking to do w.r.to following API in IConfigurationService

/**
* Fetches the value of the section for the given overrides.
* Value can be of native type or an object keyed off the section name.
*
* @param section - Section of the configuraion. Can be `null` or `undefined`.
* @param overrides - Overrides that has to be applied while fetching
*
*/
getValue<T>(): T;
getValue<T>(section: string): T;
getValue<T>(overrides: IConfigurationOverrides): T;
getValue<T>(section: string, overrides: IConfigurationOverrides): T;

Typing is bogus here. So I would like to remove the typing and just change the return type to any | undefined | null since IConfigurationService cannot guarantee on the value type and shape. It is up to the caller to type it or validate it etc.,

@bpasero
Copy link
Member

bpasero commented Jun 4, 2021

Well actually the right thing would be to return unknown and force the caller to assert the type, but that is a big big adoption. Maybe we could offer helpers for the simple cases that are primitives.

@lramos15
Copy link
Member Author

lramos15 commented Jun 4, 2021

Thanks for the detailed responses. I think I agree with you that this isn't worth pursuing beyond a typings / expectation change. I was only thinking in the context of validating a single setting entry and if we can't access the JSON language server our own validation logic wouldn't be worth writing.

I would say the biggest issue is what @sandy081 pointed out regarding the typings, they don't mean anything and therefore give the false sense of security that your returned value is type safe. Changing the typings (I do like @bpasero's idea of unknown but that might be a much larger work item) and adding a comment stating there no validation should be enough to at least notify the consumer that validation is required.

@Tyriar
Copy link
Member

Tyriar commented Jun 4, 2021

Well actually the right thing would be to return unknown and force the caller to assert the type, but that is a big big adoption. Maybe we could offer helpers for the simple cases that are primitives.

I've started doing this with terminal profiles because otherwise runtime exceptions happen. 👍 to validation helpers, using is in the function return makes this really nice:

https://github.com/Microsoft/vscode/blob/0f8c499d1014b9079f01a63c097cc2354400e9ed/src/vs/workbench/contrib/terminal/browser/terminalProfileResolverService.ts#L376-L394

Maybe we could even have a general function to validate the schema which would for the most part save needing to do what I'm doing above:

function matchesSchema(obj: unknown, schema: IJSONSchema): obj is IJSONSchema {
  // Check if primitive, recurse object props, etc.
}

EDIT: Actually the return type there wouldn't work, it would probably need to be generic and use obj is T or something.

@lramos15
Copy link
Member Author

Switching to unknown produces ~850 errors. They're fairly simple adoptions and fortify our codebase but it is something that would take time. If we think it's worth the time to force a validation mindset then I can start an engineering item, but it might not be worth our effort.

Something like matchesSchema<T>(obj: unknown, schema: IJSONSchema): obj is T would be interesting as well to explore, although we would have to write our own JSON validation logic in that case.

@sandy081
Copy link
Member

IMO since the API declaration is not truthful and reliable, it is good that we change it. I agree that it is a huge task but it is worth and make our code base rock solid and handles cases when settings values are invalid.

I will bring it up in our stand up and see what other team members feel about it.

@lramos15
Copy link
Member Author

After discussion with Kai, he also agrees this would be a worthwhile investment. Since it's a templated function we can do this iteratively by templating it as unknown i.e. this.configurationService.getValue<unknown>('foo'); that way when we flick the switch to unknown we would just remove the templating.

@lramos15 lramos15 added engineering VS Code - Build / issue tracking / etc. and removed under-discussion Issue is under discussion for relevance, priority, approach engineering VS Code - Build / issue tracking / etc. labels Jun 28, 2021
lramos15 added a commit that referenced this issue Jul 8, 2021
lramos15 added a commit that referenced this issue Jul 8, 2021
lramos15 added a commit that referenced this issue Jul 9, 2021
lramos15 added a commit that referenced this issue Jul 9, 2021
lramos15 added a commit that referenced this issue Jul 9, 2021
@bpasero
Copy link
Member

bpasero commented Nov 3, 2023

cc @benibenj who is working on validation in #197242 for editor options.

After a sync with @sandy081 here is an idea:

  • validation should happen in the configuration service and not be something that consumers need to do
  • for most simple settings, the configuration service can already do validation because when we register a setting, we indicate the valid types already and a default too
  • for complex settings we may optionally need to allow to pass in a validator callback when registering the setting, though that should be the exception
  • we should not validate any setting that is already the default, in other words only user changed settings should be validated
  • when a consumer calls getValue the setting is always validated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
config VS Code configuration, set up issues debt Code quality issues
Projects
None yet
Development

No branches or pull requests

4 participants