Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate import and improve operations #525

Merged
merged 40 commits into from
Jan 25, 2022

Commits on Jan 25, 2022

  1. Get rid of added_advisory and batches

    useless after a530627
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    de518f7 View commit details
    Browse the repository at this point in the history
  2. Implement improver registry and management command

    Improver look strikingly similar to importers with some enhancements.
    (Eg: It doesn't use an importer_yielder alternative, see issue 501).
    
    Overview:
    Improvers maintain a contract (like Advisory) with improve_runner which
    is named Inference. Inference class embeds an advisory and a confidence
    score for that advisory. It is the job of an improver to fetch data to
    improve from the database (probably using some helper functions) then
    use whatever means necessary to improve that data sample and return with
    Inferences. Do note, that Inferences which have already been "imported"
    by importers would be totally discarded as redundant. Also, in case of
    two inferences on same data point, the one with highest confidence will
    be taken into the database.
    
    Food for thought:
    Pssst... Probably Inference class is useless and Advisory class can
    itself have that confidence score, but then the importers would have to
    mention that whatever they import have 100% confidence which might be
    susceptible to typo errors making some importers not mention their
    confidence thus zeroing on confidence. Anyway, importer and improvers
    should be different and separated. If not, then we could totally discard
    the idea of improvers and embed everything in an importer with a
    confidence score. Well, then, where goes the idea of modularity and
    keeping things simple ? Also, data coming from an "import"er should
    always be absolutely correct. This will also ensure that if downstream
    doesn't want any "improved" data then they don't get our guesses. The
    whole point of separating importers and improvers is that running
    improvers could be totally optional and based on downstream taste.
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    f783915 View commit details
    Browse the repository at this point in the history
  3. Implement improver

    This is work in progress, there are a few bugs and a few fixmes as well.
    Everything will be replaced before the final commit
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    fa74025 View commit details
    Browse the repository at this point in the history
  4. Use new Advisory model for importers

    Importing Advisory data is now 2 step process.
    1) Import Advisory
    2) Create relations (default improver)
    
    importers are now required to return AdvisoryData instead of Advisory.
    The data contained inside AdvisoryData is converted into a model named
    Advisory and then saved into the database. This will be useful later
    for improvers to work on existing Advisories.
    All the relationships for the Advisories are generated using a default
    improver at improvers/default.py.
    
    NOTE: There are errors in the UI due to models.py
        # TODO: Cannot resolve keyword 'resolved_vulnerabilities' into field
        # make vulnerabilities and resolved_vulnerabilities use the `fix` flag of PackageRelatedVulnerability
    
    Knows defects (in current PR):
    -[ ] UI break (see above)
    -[ ] might crash in multiple imports / improves
    -[ ] No improver than default improver is implemented yet
    -[ ] normalized function of ``AdvisoryData`` has no body
    -[ ] nginx importer still has remains of set_api etc
    -[ ] Inference -> AdvisoryData encapsulation
    -[ ] Duplicated data in database
    -[ ] ???
    
    Knows defects (to be solved in different PR):
    -[ ] inconsistent naming - will be resolved in a different PR
    -[ ] unordered imports
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    5fa3cd5 View commit details
    Browse the repository at this point in the history
  5. Remove Inference->Advisory encapsulation

    Earlier Inference used to nest Advisory class, this doesn't make sense
    logically. Now, the Inference class is of it's own and contains some
    attributes that could be present in an advisory but are not required.
    
    Technical reason behind this decoupling is that the PackageURL in
    AffectedPackage is NOT supposed to contain version information. This
    information must be present inside an Inference (that's the whole point).
    Improvers are required to read AdvisoryData, modify the relevant fields,
    add inference specific fields (confidence, source) and expand the
    VersionSpecifier contained inside AffectedPackage package as they deem
    fit into PackageURLs.
    
    Known defects (in current PR):
    -[ ] UI break (see above)
    -[ ] might crash in multiple imports / improves
    -[ ] No improver than default improver is implemented yet
    -[ ] normalized function of ``AdvisoryData`` has no body
    -[ ] nginx importer still has remains of set_api etc
    -[x] Inference -> AdvisoryData encapsulation
    -[ ] Duplicated data in database
    -[ ] ???
    
    Knows defects (to be solved in different PR):
    -[ ] inconsistent naming - will be resolved in a different PR
    -[ ] unordered imports
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    08605cd View commit details
    Browse the repository at this point in the history
  6. Fix UI break

    Recent model changes break the UI as now the PackageRelatedVulnerability
    contains a ``fix`` flag to mark the relationship as a fix.
    This is leveraged to eliminate multiple columns like patched_package or
    vulnerable_package.
    
    Known defects (in current PR):
    -[x] UI break
    -[ ] might crash in multiple imports / improves
    -[ ] No improver than default improver is implemented yet
    -[ ] normalized function of ``AdvisoryData`` has no body
    -[ ] nginx importer still has remains of set_api etc
    -[x] Inference -> AdvisoryData encapsulation
    -[ ] Duplicated data in database
    -[ ] ???
    
    Knows defects (to be solved in different PR):
    -[ ] inconsistent naming - will be resolved in a different PR
    -[ ] unordered imports
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    43c8ab9 View commit details
    Browse the repository at this point in the history
  7. Use to_dict, add docs, infer as function name

    The refactors are based on aboutcode-org#525 (review)
    
    - class Inference: order should be the logic order from most important to least important fields
    - class Improver: docs
    - data_source.py: Use to_dict() than json()
    - models: Advisory: docs
    - improvers/nginx.py: Removed, relevant improvers will now reside inside
      importer module. In this case, importers/nginx.py
    - black formatting
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    96a8cb3 View commit details
    Browse the repository at this point in the history
  8. Use OSV design for AffectedPackages

    AffectedPackges now contains all affected versions and one fix version.
    This is inspired from the design documented at https://docs.google.com/document/d/1sylBGNooKtf220RHQn1I8pZRmqXZQADDQ_TOABrKTpA
    Under "Format Overview", along the lines of:
    
        "affects": {
                "ranges": [ {
                    "type": string,
                    "repo": string,
                    "introduced": string,
                    "fixed": string
                } ],
                "versions": [ string ]
            },
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    c3f34e9 View commit details
    Browse the repository at this point in the history
  9. Modify nginx importer to adopt OSV design

    This is still not perfect because univers is not stable yet. The
    uncertainty about the structure of version_specifier needs to be
    resolved. As of now, there are many redundant AffectedPackage objects
    which would be gone after aboutcode-org/univers#8 is
    fixed.
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    c2ef79c View commit details
    Browse the repository at this point in the history
  10. Assert Inference fields

    There must be an affected or fixed purls if there's no vulnerability id.
    Otherwise, when vulnerability id is present, any (except summery for now)
    could be present. It does not make sense to neither have vulnerability id
    nor affected or fixed purls, such relation would be meaningless.
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    7c1e24c View commit details
    Browse the repository at this point in the history
  11. Process one advisory in one transaction

    Updated way:
    
    - Each Improver has a method to process a single Advisory model instance
      such as Improver.get_inferences(self, advisory): -> Inference
    
    - The framework then iterates on an Improver-provided QuerySet such as
      Improver.interesting_advisories that has the Advisories it is
      interested in.
    
    - In the framework, there is an atomic transaction that updates both the
      Advisory (e.g. date and later a log of improvements with select for
      update) and whatever is updated or create from the Inference
    
    This avoids failing the entire improver when only a single inference is
    erroneous. Also, the atomic transaction for every advisory and its
    inferences makes sure that improved_on date of advisory is consistent.
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    31c1054 View commit details
    Browse the repository at this point in the history
  12. Improve docs and cleanup code

    Git rid of extraneous methods with empty body and commented code.
    Now as AdvisoryData has all the parameters optional, make sure there's
    a __post_init__ to assert the constraints.
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    2694c4e View commit details
    Browse the repository at this point in the history
  13. Hotfix parse_version from univers

    aboutcode-org/univers#10
    - parse_version() should accept None and empty strings
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    c60e030 View commit details
    Browse the repository at this point in the history
  14. Disable failing importers, add docs and refactor

    This is based on PR review done at
    aboutcode-org#525 (review)
    Majorly removes commented code, adds some doctest
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    lchritik authored and Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    fd0f909 View commit details
    Browse the repository at this point in the history
  15. Use latest univers branch

    This adds univer from the current branch that support "vers".
    See aboutcode-org/univers#12
    
    Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
    pombredanne authored and Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    be31a0e View commit details
    Browse the repository at this point in the history
  16. Update TODOs and docstrings

    All of this comes from the review done at
    aboutcode-org#525 (review)
    
    Co-authored-by: Philippe Ombredanne <pombredanne@gmail.com>
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 and pombredanne committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    1971e5b View commit details
    Browse the repository at this point in the history
  17. Adopt new univers for importers and tiny refactor

    This commit adopts the new univers, this also requires to reset all the
    migrations as migration 0003 was using a univers method which is no
    longer available.
    There are also a few refactors based on aboutcode-org#525 (review)
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    1216244 View commit details
    Browse the repository at this point in the history
  18. Enable doctest in pytest

    because why not ?
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    8065e3c View commit details
    Browse the repository at this point in the history
  19. Rewrite nginx importer to use new univers design

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    5ec989d View commit details
    Browse the repository at this point in the history
  20. Refactor according to code review on 2021-12-04

    The following suggestions were acception plus some code refactor
    
    https://github.com/nexB/vulnerablecode/blob/main/vulnerabilities/migrations/0003_populate_patched_package.py
    Is this hand written migration ? Why ? I'm resetting migrations, this is breaks on changes in univers
    - > move to init migration, provide data dump
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/import_runner.py#L73
    - [x] name should be inside the importer class
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/import_runner.py#L108
    - [ ] Advisory. get or create in the loop itself
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/import_runner.py#L95
    - [x] Make it a list (or iterable is better), not set
    
    class AdvisoryData:
    """
    This data class expresses the contract between data sources and the import runner.
    """
    
    vulnerability_id: Optional[str] = None
    summary: str = None
    affected_packages: List[AffectedPackage] = dataclasses.field(default_factory=list)
    references: List[Reference] = dataclasses.field(default_factory=list)
    date_published: Optional[datetime.datetime] = None
    
    - [x] Use this to create an Advisory model and store List objects as json
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/data_source.py#L99
    - [x] affected_version_range
    
    - [x] https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/data_source.py#L120
    VersionRange.version_class to get the Version subclass
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L71
    - [ ] docify this
    
    - [x] advisory_data should return an iterable as a contract
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L77
    - [x] yield better
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L158
    - [ ] use getattr
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L162
    - [x] only for nginx advisory
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L188
    - [x] return a dict and use ** on 77
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L99
    - [x] _,_,fixed_versions =
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L86
    - [x] how does a paragraph look
    
    https://github.com/Hritik14/vulnerablecode/blob/ee0dba45f1d5b6680e121d91ce59b050325a5e67/vulnerabilities/importers/nginx.py#L116
    - [x] remove branch qualifier
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    18e3d4b View commit details
    Browse the repository at this point in the history
  21. Update improvers to accept new univers design

    Rewrite following improvers:
    	DefaultImprover
    Plus adjust the improver framework
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    22d7132 View commit details
    Browse the repository at this point in the history
  22. Add AdvisoryData.to_inference and AffectedPackage.merge

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    7cd4044 View commit details
    Browse the repository at this point in the history
  23. Add to_reper to DataSource

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    c7bc550 View commit details
    Browse the repository at this point in the history
  24. Apply formatting changes

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    f399af6 View commit details
    Browse the repository at this point in the history
  25. Adopt new vers spec

    VersionRange is always a flat list of constraints
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    73f8682 View commit details
    Browse the repository at this point in the history
  26. Implement NginxBasicImprover

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    d89e3d3 View commit details
    Browse the repository at this point in the history
  27. Introduce aliases

    An alias is a unique vulnerability identifier in some database, such as
    the NVD, PYSEC, CVE or similar. These databases guarantee that these
    identifiers are unique within their namespace.
    An alias may also be used as a Reference. But in contrast with some
    Reference may not be an identifier for a single vulnerability, for instance,
    security advisories such as Debian security advisory reference various
    vulnerabilities.
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    4988452 View commit details
    Browse the repository at this point in the history
  28. Reformat Alias structure and process_improver

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    299d6f1 View commit details
    Browse the repository at this point in the history
  29. Change __repr__ to qualified_name fn

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    b9ba26a View commit details
    Browse the repository at this point in the history
  30. Update .gitignore for junk files

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    e3a1ea5 View commit details
    Browse the repository at this point in the history
  31. Fix few tests to recent structure, dump ugettext_lazy

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    c3e71f4 View commit details
    Browse the repository at this point in the history
  32. Ignore outdated tests

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    e822c41 View commit details
    Browse the repository at this point in the history
  33. Disable outdated importers

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    e9f940d View commit details
    Browse the repository at this point in the history
  34. Reset migrations

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    318663e View commit details
    Browse the repository at this point in the history
  35. Partition without numerical index

    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    86d175e View commit details
    Browse the repository at this point in the history
  36. Add get_fixed_purl in AffectedPackage, fix from_dict

    from_dict is a factory, should be a classmethod
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    bd390d4 View commit details
    Browse the repository at this point in the history
  37. Add FIXME about wrong filters in PackageSearchView

    The entire view is messed up. Will probably need a rewrite
    
    Co-authored-by: Philippe Ombredanne <pombredanne@gmail.com>
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Hritik14 and pombredanne committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    6ab6f06 View commit details
    Browse the repository at this point in the history
  38. Teeny weeny fixes

    Fixes are in accordance with the final review for this PR done at :
    aboutcode-org#525 (review)
    
    Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
    Co-authored-by: Philippe Ombredanne <pombredanne@gmail.com>
    Hritik14 and pombredanne committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    33d0c0d View commit details
    Browse the repository at this point in the history
  39. Correct file format

    Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
    pombredanne committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    21c6178 View commit details
    Browse the repository at this point in the history
  40. Use univers 30+. Bump lxml

    Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
    pombredanne committed Jan 25, 2022
    Configuration menu
    Copy the full SHA
    0e74bea View commit details
    Browse the repository at this point in the history