Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bleeding edge and precise literal string param #1655

Merged
merged 6 commits into from
May 22, 2022

Conversation

VincentLanglet
Copy link
Member

@VincentLanglet VincentLanglet commented May 15, 2022

Subject

I am targeting this branch, because {reason}.

Closes #{put_issue_number_here}.

Changelog

### Fixed
- Precise Filter::filter() param as literal-string.

@@ -28,6 +28,9 @@ abstract class Filter extends BaseFilter implements GroupableConditionAwareInter

/**
* Apply the filter to the QueryBuilder instance.
*
* @phpstan-param literal-string $alias
* @phpstan-param literal-string $field
*/
abstract public function filter(ProxyQueryInterface $query, string $alias, string $field, FilterData $data): void;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the rfc https://wiki.php.net/rfc/is_literal about literal-string.

There are some PR on phpstan-doctrine to add literal-string check on join/where method,
like phpstan/phpstan-doctrine#327 and phpstan/phpstan-doctrine#326.

The issue is that if I write

public function filter(ProxyQueryInterface $query, string $alias, string $field, FilterData $data): void
{
        if (!$data->hasValue()) {
            return;
        }

        $queryBuilder = $query->getQueryBuilder()->andWhere($alias . '.' . $field);
}

phpstan will report an issue because I don't pass a literal-string to andWhere.
So both $alias and $field need to be literal-string.

This imply that association returns array{literal-string, literal-string}, and then method like
$filter->getFieldName() too. But this is coming from the option field_name provided by the developer. Same for the alias option provided in the CallbackFilter.

The issue is that we cannot use phpdoc to force the developer to pass a literal-string for the option field_name or alias, since the filters options are generics:
https://github.com/sonata-project/SonataAdminBundle/blob/4.x/src/Filter/FilterInterface.php#L58-L73

Currently I don't see another solution that adding phpdoc /** @var literal-string */. Because the other solution (doing nothing) will result into a tons of false positive errors for $queryBuilder = $query->getQueryBuilder()->andWhere($alias . '.' . $field); code.

WDYT @jordisala1991 ?

cc @craigfrancis what do you have in mind for those situations ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you have in mind for those situations?

I'm not familiar with this project, but anything that goes into andWhere() for the Doctrine QueryBuilder needs to be written by the developer (anything else is difficult to prove it can be trusted, see eiv.dev).

With your filter() method, I cannot see it in the current repo, but could you start by adding @param literal-string $alias and @param literal-string $field? then see where the next problem is?

There may be times where you temporarily use /** @var literal-string */, but the idea is to eventually remove them all, as a literal-string represents a string defined in the source code (it can be trusted to not cause an Injection Vulnerability), and anything else should be considered untrusted (e.g. user values) where they should be provided separately to the ORM (those values should be parameterised or escaped properly).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I started to add @param literal-string $alias and then it lead me to the following issue:

The alias is provided by a method called by the user this way

$this->setOption('alias', $alias);

But a lot of other options can be provided, so the phpdoc is

public function setOption(string $key, mixed $value) {}

So I cannot force the developer to pass a literal-string.

So far, I can just assume that he pass a literal-string and add /** @var literal-string */.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's a really useful example on the difference between the static analysis approach (e.g. a literal-string type is always required) vs the is_literal() native/runtime implementation (i.e. each string value has a literal flag, so you could check the individual option values as needed).

Anyway, the use of setOption(), is it always a sensitive-string or an array of sensitive-string/null values? If so, could you use something like literal-string|array<int, (literal-string|null)>?

Or, while it's not pretty, the sensitive options could have their own get/set methods?

Or, and probably a bit more of a pain, you could use a stringable value-object, so your filter() method could require it for $alias and $field?

class sensitive_value {
  private $value = NULL;
  /** @param literal-string $value */
  public function __construct($value) {
    $this->value = $value;
  }
  /** @return literal-string */
  public function __toString() {
    return $this->value;
  }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, the use of setOption(), is it always a sensitive-string or an array of sensitive-string/null values? If so, could you use something like literal-string|array<int, (literal-string|null)>?

No, it's not always a sensitive-string.

Or, while it's not pretty, the sensitive options could have their own get/set methods?

Unfortunately, when the CallbackFilter need an alias option, another Filter may not need one (so the getAlias method wouldn't make sens). And maybe another Filter could require an array or an object for the alias option.

Or, and probably a bit more of a pain, you could use a stringable value-object, so your filter() method could require it for $alias and $field?

class sensitive_value {
  private $value = NULL;
  /** @param literal-string $value */
  public function __construct($value) {
    $this->value = $value;
  }
  /** @return literal-string */
  public function __toString() {
    return $this->value;
  }
}

That's basically the implementation of a literalString, it would even be better to have this in php.

This could be possible but

  • Changing the typehint of the method is a BC-break and would require a new major version.
  • For a lot of user which doesn't care about literal-string it could be considered as extra-complexity to use this class.

One solution I see would be to add a check

if (!is_literal($this->getOption('alias')) {
    throw new \Exception();
}

but

  • the is_literal method doesn't exist so far
  • this is kinda restrictive because I force the developper to pass a literal-string, and he may want to pass a manipulated string (which is known secure still, or escaped).

Unfortunately, there is no method like

function toLiteral(string $string): literal-string;

or no way to cast a string into a literal string ^^.

That's why in this use-case I currently don't see any way to avoid static-analysis error without using a /** @var literal-string */ annotation which is not fully true.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right... start with /** @var literal-string */ for now, then we can think of tweaks to the libraries API to handle these sensitive values more clearly/safely (so they can be easily marked as a @param literal-string).

For example, while this isn't a perfect match to what you're doing, something like the following kinda works (with table_name being sensitive, and date_format being less so; not that they need to be in an array):

class my_example {

  /** @var array{table_name:literal-string,date_format:string} $config */
  private $config = [
      'table_name'  => 'article',
      'date_format' => 'jS M Y g:ia',
      // 'filters'     => [],
    ];

  // public function config_set(string $key, mixed $value): void {
  //   $this->config[$key] = $value;
  // }

  /** @param literal-string $value */
  public function table_name_set(string $value): void {
    $this->config['table_name'] = $value;
  }

  public function date_format_set(string $value): void {
    $this->config['date_format'] = $value;
  }

  /** @return array<string, string> */
  public function get(int $id): array {

    $qb = new QueryBuilder();

    $result = $qb
      ->select('*')
      ->from($this->config['table_name'])
      ->where('id', $id);

    $result['date_formatted'] = (new DateTime($result['date']))->format(strval($this->config['date_format']));

    return $result;

  }

}

$my_example = new my_example();
$my_example->table_name_set('my_table');
$my_example->table_name_set($_GET['table'] ?? 'default_table'); // INSECURE

print_r($my_example->get(123));

To respond to your previous points...

I'd also prefer the sensitive_value approach to be built into PHP (via a literal flag), because the is_literal() function would make this much easier, and libraries could allow developers to choose how non-literal strings are handled (e.g. the protection_level and unsafe_value value-object in this example). Unfortunately the static analysis approach doesn't give this flexibility easily (i.e. we simply say "this parameter accepts a literal-string").

While I appreciate many developers don't care about literal-string today (note how often SQL Injection Vulnerabilities still happen), the intention is to get to a position where it's the default way of working (with no thought given), where developers are made aware of their mistakes as soon as possible (e.g. IDE's could alert them as they type).

To do this, we need to move towards an approach where libraries (like Doctrine) do all the escaping... I know this sounds odd, but I think developers should never escape values/identifiers themselves, simply because it's easy to get wrong (lack of knowledge, or via simple mistakes). Unfortunately I don't think Doctrine has a way to safely provide identifiers to it yet (last year I proposed a setIdentifier() method, and I believe Sergei is proposing an RFC to improve handling of SQL object names, but no luck yet).

@VincentLanglet VincentLanglet marked this pull request as ready for review May 22, 2022 15:15
@VincentLanglet VincentLanglet requested review from jordisala1991 and a team May 22, 2022 15:30
@@ -36,10 +36,6 @@ public function testFilterEmpty(): void
'type' => null,
'value' => ['start' => null, 'end' => null],
]));
$filter->filter($proxyQuery, 'alias', 'field', FilterData::fromArray([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why you removed this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test were failing because I changed

if ($value['start'])

to

if ($value['start'] !== null)

'' is not supposed to be a supported value since the filter is used with DateTimeType which returns either a dateTime or null, isn't it ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove it, but it could cause some BC breaks

@VincentLanglet VincentLanglet merged commit 9ccf050 into sonata-project:4.x May 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants