Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add analyzer: Prevent wrong usage of string Split #47927

Open
Tracked by #57797 ...
Mrxx99 opened this issue Feb 5, 2021 · 6 comments · May be fixed by dotnet/roslyn-analyzers#5506
Open
Tracked by #57797 ...

Add analyzer: Prevent wrong usage of string Split #47927

Mrxx99 opened this issue Feb 5, 2021 · 6 comments · May be fixed by dotnet/roslyn-analyzers#5506
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@Mrxx99
Copy link
Contributor

Mrxx99 commented Feb 5, 2021

Describe the problem you are trying to solve

Consider the following usage of Split on a string where you want to split on space and tab:
var split = s.Split(' ', '\t'); (which works fine)
If you now get to the conclusion to remove empty entries, it is tempting to do this:
var split = s.Split(' ', '\t', StringSplitOptions.RemoveEmptyEntries);
But this won't give the expected result and will result in a bug in the application, because the second char is converted to an int and used as the count argument for this overload:
public string[] Split(char separator, int count, StringSplitOptions options = StringSplitOptions.None);

Describe suggestions on how to achieve the rule

To prevent this, an analyzer should detect that a char is passed to the count parameter of this specific overload and should warn that it won't split on that char.
A codefix can provide a conversion to the following:
var split = s.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

(btw. it would be good to also have a template for analyzers in this repository, not only in the roslyn-analyzers repo)

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 5, 2021
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@GrabYourPitchforks GrabYourPitchforks added code-analyzer Marks an issue that suggests a Roslyn analyzer area-Meta labels Feb 5, 2021
@GrabYourPitchforks
Copy link
Member

A more generalized form of this analyzer would be "You used a char, but it got widened to an int when you passed it to this method."

Not sure how many false positives such an analyzer would produce. It's common to perform integer arithmetic on chars; e.g., (char)('a' + offset) to generate an output in the range a..z when offset is known to be 0..25. But I wouldn't imagine it's all that common to pass literal chars (or standalone char locals) to APIs that expect a wider data type.

See also #46103.

@joperezr joperezr removed the untriaged New issue has not been triaged by the area owner label Feb 9, 2021
@joperezr joperezr added this to the Future milestone Feb 9, 2021
@joperezr
Copy link
Member

joperezr commented Feb 9, 2021

cc: @carlossanlop FYI

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Feb 9, 2021
@joperezr joperezr removed the untriaged New issue has not been triaged by the area owner label Feb 9, 2021
@carlossanlop carlossanlop added the code-fixer Marks an issue that suggests a Roslyn code fixer label Feb 10, 2021
@carlossanlop
Copy link
Member

Since it's possible to pass a char where a method takes an int, and it gets casted silently, it makes sense that we warn the user to make sure they intended to pass a char array instead.

Analyzer: small
Fixer: small
Suggested category: Reliability
Suggested severity: Warning - The user should manually suppress it in the rare case where they actually intended to get a char casted as int, as described by @GrabYourPitchforks above.
Suggested milestone: Future

Example:

string txt = "bla bla";
string[] splitted1 = txt.Split(' ', '\t'); // OK because argument is "params char[] separators"

// BEFORE
// Second argument is passed as int because "params" can only be used for the last argument
string[] splitted2 = txt.Split(' ', '\t', StringSplitOptions.RemoveEmptyEntries);

// AFTER
// Suggested fix: Wrap chars in array of chars
string[] splitted2 = txt.Split(new char[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

// DO NOT FLAG
// Make sure to detect the data type of the second argument
char c = ' ';
int count = 5;
string[] splitted3 = txt.Split(c, count, StringSplitOptions.RemoveEmptyEntries);

@ghost ghost added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Feb 23, 2021
@carlossanlop carlossanlop removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Feb 23, 2021
@buyaa-n buyaa-n added the api-ready-for-review API is ready for review, it is NOT ready for implementation label Feb 23, 2021
@bartonjs
Copy link
Member

bartonjs commented Jul 27, 2021

Video

Using a messaging along the lines of "suspicious cast", having the rule identify calls to string.Split or StringBuilder..ctor that involves an implicit conversion from char to int seems like goodness (and we may add more specific API over time). All of the methods/constructors that we group together in this rule will use the same diagnostic ID.

There are some generalizations (any implicit char->int in an argument; or that but only when there's an overload that takes char/string) which would be good to follow up on, but they're out of scope for this issue.

  • Category: Correctness
  • Severity: Warning

@ghost ghost added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jul 27, 2021
@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Jul 27, 2021
@NewellClark
Copy link
Contributor

I would like to be assigned to this, please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants