-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ci: Update ruff and cruft template #291
Conversation
WalkthroughThe changes in this pull request primarily involve updates to configuration files, the introduction of a new GitHub Actions workflow for static type checks, and modifications to existing workflows. The Changes
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (11)
.github/workflows/static_type_checks.yml (3)
1-7
: Documentation looks good but could be more specific.The comments clearly state the purpose and setup requirements. However, consider adding more details about the specific type checker being used and any configuration files it depends on.
# THIS ACTION WILL: # 1. Install dependencies # 2. Run static type checker + # (using mypy/pyright configured in pyproject.toml) # SETUP: # None required except for the Makefile + # See pyproject.toml for type checker configuration
Line range hint
16-41
: Consider enhancing the job configuration for better efficiency and security.While the basic setup is functional, there are several opportunities for improvement:
- Pin the action versions more specifically for security:
- - uses: actions/checkout@v3 + - uses: actions/checkout@v3.5.3 - uses: actions/setup-python@v4 + uses: actions/setup-python@v4.7.1
- Add hash verification for dependencies:
- name: Install dependencies shell: bash run: | + pip install --require-hashes -r requirements.txt make install
- Add timeout to prevent hanging jobs:
static_type_checks: runs-on: ubuntu-latest + timeout-minutes: 10
- Consider adding error annotations:
- name: Run static type checker shell: bash run: | make static-type-check + continue-on-error: true + - name: Upload type check results + if: always() + uses: actions/upload-artifact@v3 + with: + name: typecheck-results + path: .mypy_cache/
Static type checks should be run against Python 3.9, 3.10, 3.11, and 3.12
The PyPI badge shows that this package officially supports Python 3.9, 3.10, 3.11, and 3.12. Additionally, the test workflow already runs against multiple Python versions (3.9 and 3.10). The static type checks should be expanded to cover all supported Python versions for consistency.
strategy: matrix: os: [ubuntu-latest] - python-version: ["3.9"] + python-version: ["3.9", "3.10", "3.11", "3.12"]🔗 Analysis chain
Line range hint
19-23
: Consider testing against multiple Python versions.The workflow only tests against Python 3.9. If the project supports multiple Python versions, consider expanding the matrix.
Let's check the supported Python versions:
strategy: matrix: os: [ubuntu-latest] - python-version: ["3.9"] + python-version: ["3.9", "3.10", "3.11"]🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check pyproject.toml for Python version requirements rg "python_requires|python_version" pyproject.tomlLength of output: 167
Script:
#!/bin/bash # Let's check for Python version requirements in multiple potential locations echo "=== Checking setup.py ===" fd "setup.py" --exec cat {} echo -e "\n=== Checking requirements.txt ===" fd "requirements.txt" --exec cat {} echo -e "\n=== Checking README.md for Python version info ===" fd "README.md" --exec grep -i "python" {} echo -e "\n=== Checking tox.ini for test environments ===" fd "tox.ini" --exec cat {} echo -e "\n=== Checking .python-version or .tool-versions ===" fd "^\.python-version$|^\.tool-versions$" --exec cat {}Length of output: 2521
Script:
#!/bin/bash # Let's check the PyPI badge URL to see supported Python versions echo "=== Checking PyPI badge info ===" curl -s "https://img.shields.io/pypi/pyversions/dacy" || true # Let's also check any CI configurations that might indicate Python version support echo -e "\n=== Checking all GitHub workflow files ===" fd "\.ya?ml$" .github/workflows/ --exec cat {}Length of output: 7783
src/dacy/score/input_length.py (3)
Line range hint
12-19
: Document the new "dep" scoring optionThe score_fn parameter's default value now includes "dep", but this option isn't documented in the docstring's description of valid score_fn values.
Update the docstring to include "dep" in the list of valid options:
score_fn (List[Union[str, Callable]], optional): A scoring function which takes in a list of examples and return a dictionary of the form {"score_name": score}. Four potiential strings are valid. "ents" for measuring the performance of entity spans. "pos" for measuring the performance of pos-tags. "token" for measuring the performance of tokenization. "nlp" for - measuring the performance of all components in the specified nlp pipeline. + measuring the performance of all components in the specified nlp pipeline. + "dep" for measuring the performance of dependency parsing.
Line range hint
57-58
: Consider improving type safety in pandas operationsThe pandas concatenation operation and return value are marked with
# type: ignore
. Consider adding proper type annotations:
- Define a type variable for the DataFrame structure
- Add proper return type annotation for the pandas DataFrame
Example improvement:
from typing import TypeVar import pandas as pd DataFrameType = TypeVar('DataFrameType', bound=pd.DataFrame) def n_sents_score(...) -> DataFrameType:
Type ignore comments indicate real typing issues that need attention
The codebase shows systematic use of
# type: ignore
comments across scoring functionality, particularly in handling complex types involving Callables, Iterables, and pandas operations. Key observations:
- No mypy configuration is present, but project uses
pyright
for type checking- Multiple instances of type ignores in score-related functions suggest structural typing issues
- Common patterns involve:
- Complex function signatures with Callable types
- Pandas DataFrame operations
- Dynamic attribute access with getattr
- Type coercion in data processing
Recommendations:
- Replace Union[int, List[int]] with TypeAlias or typing.Sequence
- Use proper type annotations for pandas operations
- Consider Protocol classes for Callable parameters
- Add proper return type hints instead of ignoring them
🔗 Analysis chain
Line range hint
12-19
: Consider addressing type ignore commentsThere are multiple
# type: ignore
comments in the function signature. This might indicate underlying typing issues that should be resolved.Let's check for similar patterns in other files:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Look for type ignore comments in similar scoring functions rg -l "# type: ignore" "src/dacy/score/" # Check if mypy is configured in pyproject.toml cat pyproject.toml | grep -A 10 "\[tool.mypy\]"Length of output: 145
Script:
#!/bin/bash # Let's examine the actual type ignore comments and their context rg -A 2 "# type: ignore" "src/dacy/score/" # Let's also check if there's a mypy configuration file fd "mypy.ini|\.mypy.ini" # And check the content of pyproject.toml for any typing-related configurations cat pyproject.tomlLength of output: 8259
src/dacy/download.py (4)
Line range hint
18-31
: LGTM! Consider documenting the None values.The models_url dictionary is well-structured with specific commit hashes for reproducibility. The None values for "small", "medium", "large" appear to serve as aliases for the latest versions.
Consider adding a comment explaining the purpose of the None values for clarity:
"da_dacy_small_trf-0.2.0": "https://huggingface.co/chcaa/da_dacy_small_trf/resolve/0eadea074d5f637e76357c46bbd56451471d0154/da_dacy_small_trf-any-py3-none-any.whl", "da_dacy_medium_trf-0.2.0": "https://huggingface.co/chcaa/da_dacy_medium_trf/resolve/e7dba91f855a1d26679dc1ef3aa49f7874b50543/da_dacy_medium_trf-any-py3-none-any.whl", "da_dacy_large_trf-0.2.0": "https://huggingface.co/chcaa/da_dacy_large_trf/resolve/963232f378190476503a1bfc35b520cb142e9e41/da_dacy_large_trf-any-py3-none-any.whl", + # Aliases for latest versions "small": None, "medium": None, "large": None,
Line range hint
35-50
: Enhance type safety and error handling.The function could benefit from:
- Proper typing instead of
# type: ignore
comments- Error handling for cases when no versions are found
Consider this improved implementation:
def get_latest_version(model: str) -> str: if model in {"small", "medium", "large"}: model = f"da_dacy_{model}_trf" versions = [mdl.split("-")[-1] for mdl in models_url if mdl.startswith(model)] + if not versions: + raise ValueError(f"No versions found for model: {model}") versions = sorted( versions, - key=lambda s: [int(u) for u in s.split(".")], # type: ignore + key=lambda s: tuple(int(u) for u in s.split(".")), reverse=True, ) - return versions[0] # type: ignore + return versions[0]
Line range hint
67-72
: Replace noqa with proper typing.The
noqa
comment can be replaced with proper type hints.class DownloadProgressBar(tqdm): - def update_to(self, b: int = 1, bsize: int = 1, tsize=None) -> None: # noqa + def update_to(self, b: int = 1, bsize: int = 1, tsize: int | None = None) -> None: if tsize is not None: self.total = tsize self.update(b * bsize - self.n)
Line range hint
102-137
: Improve type safety and error messages.The function could benefit from:
- Proper typing instead of
# type: ignore
comments- More specific error messages listing available models
Consider these improvements:
def download_model( model: str, force: bool = False, ) -> str: if model in {"small", "medium", "large"}: latest_version = get_latest_version(model) model = f"da_dacy_{model}_trf-{latest_version}" - mdl_version = model.split("-")[-1] # type: ignore + mdl_version = model.split("-")[-1] if model not in models_url: + available_models = "\n".join(f"- {m}" for m in sorted(models_url.keys())) raise ValueError( - f"The model '{model}' is not available in DaCy. Please use dacy.models() to see a" - + " list of all models", + f"The model '{model}' is not available in DaCy. Available models:\n{available_models}", ) - mdl = model.split("-")[0] # type: ignore + mdl = model.split("-")[0]src/dacy/score/score.py (1)
Line range hint
38-179
: Consider refactoring for improved maintainability.The
score
function is quite complex with multiple nested functions. Consider:
- Moving the scorer functions (
ents_scorer
,pos_scorer
, etc.) to module level- Extracting the core scoring logic from
__score
into a separate function- Addressing the
# type: ignore
comments with proper typingThis would improve maintainability and make the code easier to test.
Would you like me to provide a detailed refactoring proposal?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
.cruft.json
(1 hunks).github/workflows/static_type_checks.yml
(1 hunks).github/workflows/tests.yml
(1 hunks)makefile
(1 hunks)pyproject.toml
(3 hunks)src/dacy/download.py
(1 hunks)src/dacy/load.py
(1 hunks)src/dacy/score/input_length.py
(1 hunks)src/dacy/score/score.py
(2 hunks)tests/test_datasets.py
(1 hunks)tests/test_download.py
(1 hunks)tests/test_hate_speech.py
(1 hunks)tests/test_score.py
(1 hunks)tests/test_sentiment.py
(1 hunks)
✅ Files skipped from review due to trivial changes (6)
- .cruft.json
- .github/workflows/tests.yml
- tests/test_datasets.py
- tests/test_download.py
- tests/test_score.py
- tests/test_sentiment.py
🔇 Additional comments (12)
tests/test_hate_speech.py (2)
3-4
: LGTM! The dacy import is required for the test functions.
The import is necessary as the test functions use dacy components for hate speech detection and classification.
3-4
: Verify test environment dependencies.
Since this import was previously commented out, let's ensure the test environment is properly configured.
✅ Verification successful
Dependencies are properly configured for dacy
The verification shows that:
- dacy (version >=2.6.0) is listed in the requirements
- dacy components are properly registered in pyproject.toml
- The test file actively uses dacy's hate speech detection and classification components
- The import is necessary and actively used in the tests
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify dacy dependency is properly specified in project configuration
# Check if dacy is listed in project dependencies
echo "Checking for dacy in pyproject.toml..."
rg "dacy.*=" pyproject.toml
# Check if dacy is listed in test requirements
echo "Checking for dacy in test requirements..."
fd -t f "requirements.*txt" -x rg "dacy"
# Check if dacy is mentioned in CI configuration
echo "Checking CI configuration for test environment setup..."
fd -t f "^\.github/workflows/.*yml$" -x rg "dacy"
Length of output: 1283
Script:
#!/bin/bash
# Let's check the CI workflow files including hidden ones
fd -H -t f "^\.github/workflows/.*yml$" -x rg "dacy"
# Let's also check the test file to understand how dacy is being used
cat tests/test_hate_speech.py
Length of output: 766
.github/workflows/static_type_checks.yml (2)
Line range hint 9-15
: LGTM! Workflow triggers are well configured.
The workflow is properly set up to run on both pull requests and pushes to main, which is a good practice for catching type issues early.
Line range hint 32-36
: Verify Makefile installation command.
The workflow relies on make install
. Let's ensure this command exists and installs all necessary dependencies for type checking.
makefile (1)
12-14
: LGTM! Verify ruff version compatibility.
The changes correctly update the ruff commands to use the explicit check
subcommand, which is consistent with ruff's newer versions.
Let's verify the ruff version specification in pyproject.toml:
✅ Verification successful
Ruff version compatibility confirmed
The pyproject.toml specifies ruff==0.7.1
in the dev dependencies, which is exactly the version where the explicit check
subcommand became standard. The Makefile changes are correctly aligned with this version requirement.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify ruff version compatibility with the new command syntax
# Expect: Version should be >=0.7.1 where 'check' command became standard
rg -l 'ruff.*=.*".*"' pyproject.toml
Length of output: 38
Script:
#!/bin/bash
# Let's try different approaches to find ruff version specification
# 1. Look for any mention of ruff in pyproject.toml
cat pyproject.toml
# 2. As fallback, search for any files that might contain ruff configuration
fd -t f "pyproject.toml|requirements.*|setup.*|poetry.lock"
Length of output: 5194
src/dacy/load.py (2)
2-2
: LGTM: Clean import addition
The warnings
import is properly placed and necessary for the new warning functionality.
Line range hint 41-57
: Several improvements suggested for the warning implementation
- The
# type: ignore
comment suggests unresolved type issues. Consider fixing the type annotations instead of ignoring them. - Consider using a more specific warning category (e.g.,
DeprecationWarning
orUserWarning
) instead of the default one. - The warning message could be more readable using f-strings.
- The version information could be more explicit.
Here's a suggested improvement:
-def where_is_my_dacy(verbose: bool = True) -> Union[str, Path]: # type: ignore
+def where_is_my_dacy(verbose: bool = True) -> Path:
"""Returns a path to where DaCy models are located. The default the model
location can be configured with the environmental variable
`DACY_CACHE_DIR`.
Args:
verbose (bool, optional): Toggles the verbosity of the function. Defaults to
True.
Returns:
- str: path to the location of DaCy models
+ Path: path to the location of DaCy models
Example:
>>> import dacy
>>> dacy.where_is_my_dacy()
"""
if verbose is True:
- warnings.warn( # noqa: B028
- "DaCy pipelines above and including version 0.1.0 are installed as a "
- + "python module and are thus located in your python environment under "
- + "the respective names. To get a list of installed models use "
- + "spacy.util.get_installed_models()",
+ warnings.warn(
+ f"DaCy pipelines (v0.1.0+) are installed as Python modules in your environment. "
+ f"To find installed models, use spacy.util.get_installed_models()",
+ UserWarning,
)
return DEFAULT_CACHE_DIR
Let's verify the type annotation issue:
✅ Verification successful
Based on the results, I can now generate the final response:
Type annotation can be simplified and made more accurate
The # type: ignore
comment is unnecessary as we can see that:
DEFAULT_CACHE_DIR
is defined insrc/dacy/download.py
and is derived fromDACY_DEFAULT_PATH
which is explicitly aPath
object (Path.home() / ".dacy"
)- The function always returns
DEFAULT_CACHE_DIR
which is aPath
object Path
is properly imported in the module
The other suggestions about warning implementation are valid improvements for code readability and maintainability.
-def where_is_my_dacy(verbose: bool = True) -> Union[str, Path]: # type: ignore
+def where_is_my_dacy(verbose: bool = True) -> Path:
if verbose:
warnings.warn(
- "DaCy pipelines above and including version 0.1.0 are installed as a "
- + "python module and are thus located in your python environment under "
- + "the respective names. To get a list of installed models use "
- + "spacy.util.get_installed_models()", # noqa: B028
+ f"DaCy pipelines (v0.1.0+) are installed as Python modules in your environment. "
+ f"To find installed models, use spacy.util.get_installed_models()",
+ UserWarning,
)
return DEFAULT_CACHE_DIR
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if DEFAULT_CACHE_DIR is defined as Path elsewhere
rg -A 1 "DEFAULT_CACHE_DIR\s*=.*Path"
Length of output: 39
Script:
#!/bin/bash
# Let's check for DEFAULT_CACHE_DIR definition and its usage
rg "DEFAULT_CACHE_DIR" -B 2 -A 2
# Also check for Path imports and usage
rg "from pathlib import Path" -B 2 -A 2
# And check the actual return value usage
ast-grep --pattern 'where_is_my_dacy($$$)'
Length of output: 4869
src/dacy/score/input_length.py (1)
3-3
: LGTM: Type hint imports are properly added
The new typing imports align well with the type hints used in the function signature and annotations.
src/dacy/download.py (1)
Line range hint 1-8
: LGTM! Well-structured code organization.
The file is well-organized with clear sections, proper imports, and consistent type hints.
pyproject.toml (2)
51-51
: LGTM! Good practice to pin ruff version.
Pinning to exact version 0.7.1 ensures consistent linting behavior across all environments.
196-201
: LGTM! Well-structured rule configurations.
The configurations for flake8-annotations and mccabe follow best practices:
- Appropriate handling of type annotations with mypy integration
- Standard McCabe complexity threshold of 10
src/dacy/score/score.py (1)
Line range hint 38-76
: LGTM! Well-documented function signature.
The function signature is properly typed and the documentation is comprehensive, clearly explaining all parameters and their expected types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (8)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_utils.py (1)
1-1
: Consider importingIterable
fromcollections.abc
instead.Ruff suggests migrating away from
typing.Iterable
tocollections.abc.Iterable
to align with modern Python best practices.-from typing import Callable, Iterable +from collections.abc import Callable, Iterable🧰 Tools
🪛 Ruff (0.8.2)
1-1: Import from
collections.abc
instead:Iterable
Import from
collections.abc
(UP035)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_nerda.py (1)
3-3
: Avoid overshadowing the Python built-inlist
.Importing
list
fromtyping
shadows the built-in. Removelist
from the import, and update theIterable
import tocollections.abc
as recommended by Ruff.-from typing import Iterable, list +from collections.abc import Iterable🧰 Tools
🪛 Ruff (0.8.2)
3-3: Import from
collections.abc
instead:Iterable
Import from
collections.abc
(UP035)
3-3: Import
list
is shadowing a Python builtin(A004)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_daluke.py (1)
2-2
: MigrateIterable
import tocollections.abc
.Ruff recommends importing
Iterable
fromcollections.abc
in newer Python versions to align with best practices.-from typing import Iterable +from collections.abc import Iterable🧰 Tools
🪛 Ruff (0.8.2)
2-2: Import from
collections.abc
instead:Iterable
Import from
collections.abc
(UP035)
src/dacy/datasets/dane.py (1)
23-23
: Consider the Python 3.10 union operator.While
Union[list[Corpus], Corpus]
is correct, if your project supports Python 3.10 or higher, you might replace it withlist[Corpus] | Corpus
for readability.training/main/scripts/add_readme_metadata.py (4)
91-94
: Use ofdict[str, Any]
in_insert_value
.This transition away from
Dict[str, Any]
is well-aligned with the modern type hint approach.
Also, static analysis suggests replacingUnion
with the pipe operator (X | Y
), for exampleOptional[Any]
→Any | None
. This is optional but recommended for code clarity.-def _insert_value( - metadata: dict[str, Any], - name: str, - value: Optional[Any], -) -> dict[str, Any]: +def _insert_value( + metadata: dict[str, Any], + name: str, + value: Any | None, +) -> dict[str, Any]:🧰 Tools
🪛 Ruff (0.8.2)
93-93: Use
X | Y
for type annotationsConvert to
X | Y
(UP007)
102-105
: Use ofdict[str, list[Any]]
in_insert_values_as_list
.Similarly, you can replace
Optional[Any]
withAny | None
to align with the new type hinting style introduced in Python 3.10.-def _insert_values_as_list( - metadata: dict[str, Any], - name: str, - values: Optional[Any], -) -> dict[str, list[Any]]: +def _insert_values_as_list( + metadata: dict[str, Any], + name: str, + values: Any | None, +) -> dict[str, list[Any]]:🧰 Tools
🪛 Ruff (0.8.2)
104-104: Use
X | Y
for type annotationsConvert to
X | Y
(UP007)
116-116
: Consider using the pipe operator for the return type.For
_create_metric
, you might replaceUnion[str, float]
withstr | float
.-def _create_metric(name: str, t: str, value: float) -> dict[str, Union[str, float]]: +def _create_metric(name: str, t: str, value: float) -> dict[str, str | float]:🧰 Tools
🪛 Ruff (0.8.2)
116-116: Use
X | Y
for type annotationsConvert to
X | Y
(UP007)
125-125
: Adopt the pipe operator for the function_create_p_r_f_list
.Currently returning
list[dict[str, Union[str, float]]]
. Consider the modern union syntax forUnion[str, float]
.-def _create_p_r_f_list( - metric_name: str, - precision: float, - recall: float, - f_score: float, -) -> list[dict[str, Union[str, float]]]: +def _create_p_r_f_list( + metric_name: str, + precision: float, + recall: float, + f_score: float, +) -> list[dict[str, str | float]]:🧰 Tools
🪛 Ruff (0.8.2)
125-125: Use
X | Y
for type annotationsConvert to
X | Y
(UP007)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
docs/evaluation/datasets.py
(4 hunks)docs/evaluation/utils.py
(6 hunks)papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_daluke.py
(2 hunks)papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_nerda.py
(2 hunks)papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_utils.py
(2 hunks)pyproject.toml
(3 hunks)src/dacy/datasets/dane.py
(3 hunks)src/dacy/datasets/names.py
(7 hunks)src/dacy/hate_speech/wrapped_models.py
(2 hunks)src/dacy/score/input_length.py
(2 hunks)src/dacy/score/score.py
(4 hunks)src/dacy/sentiment/wrapped_models.py
(2 hunks)training/main/scripts/add_readme_metadata.py
(5 hunks)training/main/scripts/combine.py
(1 hunks)training/main/scripts/create_kb.py
(3 hunks)
✅ Files skipped from review due to trivial changes (2)
- training/main/scripts/combine.py
- src/dacy/datasets/names.py
🚧 Files skipped from review as they are similar to previous changes (3)
- src/dacy/score/input_length.py
- pyproject.toml
- src/dacy/score/score.py
🧰 Additional context used
🪛 Ruff (0.8.2)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_nerda.py
3-3: Import from collections.abc
instead: Iterable
Import from collections.abc
(UP035)
3-3: Import list
is shadowing a Python builtin
(A004)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_daluke.py
2-2: Import from collections.abc
instead: Iterable
Import from collections.abc
(UP035)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_utils.py
1-1: Import from collections.abc
instead: Iterable
Import from collections.abc
(UP035)
training/main/scripts/add_readme_metadata.py
93-93: Use X | Y
for type annotations
Convert to X | Y
(UP007)
104-104: Use X | Y
for type annotations
Convert to X | Y
(UP007)
116-116: Use X | Y
for type annotations
Convert to X | Y
(UP007)
125-125: Use X | Y
for type annotations
Convert to X | Y
(UP007)
training/main/scripts/create_kb.py
26-26: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
133-133: Missing type annotation for function argument qid
(ANN001)
133-133: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
150-150: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
🔇 Additional comments (21)
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_utils.py (1)
15-15
: Change for built-in type hints is correct.
Switching from List[str]
to list[str]
is aligned with modern Python type hinting. This looks good and maintains clarity.
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_nerda.py (1)
20-20
: Return type update is correct.
Using the built-in list[Example]
is a good modern practice. No further changes needed here.
papers/DaCy-A-Unified-Framework-for-Danish-NLP/apply_fns/apply_fn_daluke.py (1)
21-21
: Updated return type.
Shifting to list[Example]
is consistent with the modern Python approach to type hints. No issues found here.
docs/evaluation/datasets.py (2)
1-1
: Using from __future__ import annotations
is beneficial.
This helps optimize and simplify type hints in Python, especially with forward references. Great addition.
20-20
: Consistent use of built-in generic types is commendable.
The shift from Dict
& List
to dict
& list
is aligned with modern Python best practices. This unifies the codebase with a clearer syntax.
Also applies to: 39-39, 68-68, 73-73, 78-78, 86-86
src/dacy/datasets/dane.py (4)
8-8
: Consider removing unused imports from typing
.
Now that List
is replaced by the built-in list
, verify whether any imports like List
or others from typing
can be safely removed to avoid confusion.
18-18
: Modern type annotation usage.
Switching from List[str]
to list[str]
aligns well with Python 3.9+ conventions.
30-30
: Docstring type hint updated correctly.
This docstring accurately reflects the updated splits (list[str])
.
43-43
: Docstring return type updated.
This follows modern type notation (Union[list[Corpus], Corpus]
), so it’s consistent with the function signature.
src/dacy/hate_speech/wrapped_models.py (1)
78-83
: Transition to modern built-in types.
Lines 78, 79, and 83 correctly replace List[...]
with list[...]
, maintaining consistency. Nicely done.
src/dacy/sentiment/wrapped_models.py (1)
135-140
: Use of list[...]
type hints.
The updates at lines 135, 136, and 140 enhance clarity. This reflects best practices for Python 3.9+.
docs/evaluation/utils.py (6)
4-4
: Remove unused type imports if unnecessary.
If List
and Dict
are no longer used, consider removing them from typing
imports to keep things clean.
18-22
: Updated function parameter and return type to built-in generics.
Changing examples: list[Example]
and returning list[dict[str, Any]]
is consistent with modern type hints.
38-38
: Consistent usage of list[dict[str, Any]]
.
This improves readability and aligns with Python’s recommended approach.
119-119
: Use list[Example]
for “predictions_to_disk” function.
This update ensures a uniform typing style throughout the file.
202-202
: Change to built-in list[Example]
.
Maintains the codebase-wide shift to native type annotations.
215-215
: Refined signature for score_to_string
.
dict[str, Any]
is a clear and concise type hint, fully consistent with the rest of the changes.
training/main/scripts/add_readme_metadata.py (4)
8-9
: Modernized type annotations with from __future__ import annotations
.
Enabling this future import allows the usage of Python 3.9+ style annotations, which is consistent with the updated type hints below.
12-12
: Removed Dict
and List
from imports.
You no longer need Dict
or List
from the typing
module as you’re now using built-in collection types for type annotations. This is a clean way to modernize the codebase.
21-21
: Return type updated to use built-in dict[str, Any]
.
This is consistent with Python 3.9+ type hinting. Good enhancement for clearer type annotations.
132-132
: Use of list[dict[str, Any]]
in _create_model_index
.
Transitioning from List[Dict[str, Any]]
to list[dict[str, Any]]
is consistent with Python 3.9+ style. Looks great!
@@ -131,7 +130,7 @@ def _fetch_wikidata_description(qid: str): | |||
return item.description.get("da", item.description.get("en", "")) # type: ignore | |||
|
|||
|
|||
def _fetch_wikidata_aliases(qid, langs: List[str] = ["da", "en"]): | |||
def _fetch_wikidata_aliases(qid, langs: list[str] = ["da", "en"]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add missing type annotation for qid
; also avoid mutable defaults.
Here, qid
is untyped, and langs: list[str] = ["da", "en"]
can lead to mutable default issues. Consider declaring a type for qid
(e.g., str
) and replacing the default list argument.
-def _fetch_wikidata_aliases(qid, langs: list[str] = ["da", "en"]):
+def _fetch_wikidata_aliases(qid: str, langs: list[str] | None = None):
+ if langs is None:
+ langs = ["da", "en"]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _fetch_wikidata_aliases(qid, langs: list[str] = ["da", "en"]): | |
def _fetch_wikidata_aliases(qid: str, langs: list[str] | None = None): | |
if langs is None: | |
langs = ["da", "en"] |
🧰 Tools
🪛 Ruff (0.8.2)
133-133: Missing type annotation for function argument qid
(ANN001)
133-133: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
@@ -24,7 +23,7 @@ | |||
def main( | |||
trf_name: str, | |||
save_path_kb: Path = project_path / "assets/knowledge_bases/knowledge_base.kb", | |||
langs_to_fetch: List[str] = ["da", "en"], | |||
langs_to_fetch: list[str] = ["da", "en"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid using mutable lists as default arguments.
Static analysis warns that langs_to_fetch: list[str] = ["da", "en"]
could cause unexpected behavior if the default list is ever modified. Replace with None
and initialize inside the function.
-def main(
- trf_name: str,
- save_path_kb: Path = project_path / "assets/knowledge_bases/knowledge_base.kb",
- langs_to_fetch: list[str] = ["da", "en"],
-):
+def main(
+ trf_name: str,
+ save_path_kb: Path = project_path / "assets/knowledge_bases/knowledge_base.kb",
+ langs_to_fetch: list[str] | None = None,
+):
+ if langs_to_fetch is None:
+ langs_to_fetch = ["da", "en"]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
langs_to_fetch: list[str] = ["da", "en"], | |
def main( | |
trf_name: str, | |
save_path_kb: Path = project_path / "assets/knowledge_bases/knowledge_base.kb", | |
langs_to_fetch: list[str] | None = None, | |
): | |
if langs_to_fetch is None: | |
langs_to_fetch = ["da", "en"] |
🧰 Tools
🪛 Ruff (0.8.2)
26-26: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
@@ -148,7 +147,7 @@ | |||
return aliases | |||
|
|||
|
|||
def _load_ents_from_data(nlp: Language, splits: List[str] = ["dev", "train"]): | |||
def _load_ents_from_data(nlp: Language, splits: list[str] = ["dev", "train"]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid mutable default for splits
.
Similar concern regarding default list usage. Replace with None
and initialize within the function.
-def _load_ents_from_data(nlp: Language, splits: list[str] = ["dev", "train"]):
+def _load_ents_from_data(nlp: Language, splits: list[str] | None = None):
+ if splits is None:
+ splits = ["dev", "train"]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _load_ents_from_data(nlp: Language, splits: list[str] = ["dev", "train"]): | |
def _load_ents_from_data(nlp: Language, splits: list[str] | None = None): | |
if splits is None: | |
splits = ["dev", "train"] |
🧰 Tools
🪛 Ruff (0.8.2)
150-150: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation
Chores
ruff
in project configuration.