feat: add denied reasoning when trashing torrents and added adult parsing #888
Conversation
**Important: Review skipped**

Review was skipped due to path filters.

⛔ Files ignored due to path filters (1)

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters.

**Walkthrough**

The pull request introduces updates across several files.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (5)
pyproject.toml (1)
`36-36`: **Consider keeping version flexibility with caret prefix.**

The change from `^1.0.2` to `1.5.1` removes version flexibility. While this ensures reproducible builds, it might prevent getting important security updates. Consider using `^1.5.1` instead to allow compatible updates while maintaining stability:

```diff
-rank-torrent-name = "1.5.1"
+rank-torrent-name = "^1.5.1"
```

src/program/services/scrapers/shared.py (2)
`129-131`: **Consider structuring error messages for UI presentation.**

Since this PR aims to add denied reasoning when trashing torrents, consider enhancing the error handling to provide more structured error messages that can be easily displayed in the UI. This could involve:
- Creating an enum or constants for common denial reasons
- Structuring the error messages to include both a user-friendly message and a technical detail
- Adding context about which validation step failed (country mismatch, year mismatch, etc.)
Example structure:
```python
from enum import Enum

class TorrentDenialReason(Enum):
    GARBAGE = "garbage"
    COUNTRY_MISMATCH = "country_mismatch"
    YEAR_MISMATCH = "year_mismatch"

class StructuredTorrentError(Exception):
    def __init__(self, reason: TorrentDenialReason, user_message: str, technical_detail: str):
        self.reason = reason
        self.user_message = user_message
        self.technical_detail = technical_detail
```
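A call site could then catch the structured error and split it into a UI-facing message and a technical log line. The following is a hypothetical, self-contained sketch — the `check_year` helper and its values are invented for illustration, not project code (minimal definitions are repeated so the snippet runs on its own):

```python
from enum import Enum


class TorrentDenialReason(Enum):
    GARBAGE = "garbage"
    COUNTRY_MISMATCH = "country_mismatch"
    YEAR_MISMATCH = "year_mismatch"


class StructuredTorrentError(Exception):
    def __init__(self, reason: TorrentDenialReason, user_message: str, technical_detail: str):
        super().__init__(technical_detail)
        self.reason = reason
        self.user_message = user_message
        self.technical_detail = technical_detail


def check_year(torrent_year: int, expected_year: int) -> None:
    """Hypothetical validation step that raises a structured denial."""
    if torrent_year != expected_year:
        raise StructuredTorrentError(
            reason=TorrentDenialReason.YEAR_MISMATCH,
            user_message=f"Release year {torrent_year} does not match expected {expected_year}",
            technical_detail=f"year_mismatch: got={torrent_year} want={expected_year}",
        )


try:
    check_year(2019, 2021)
except StructuredTorrentError as exc:
    # The UI can show user_message while technical_detail goes to the logs
    denial = {"reason": exc.reason.value, "message": exc.user_message}
```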
Line range hint `1-181`: **Consider refactoring `_parse_results` for better maintainability.**

The `_parse_results` function is quite complex with multiple responsibilities. Consider breaking it down into smaller, focused functions. Suggested structure:

```python
def _parse_results(item: MediaItem, results: Dict[str, str], log_msg: bool = True) -> Dict[str, Stream]:
    """Parse and validate torrent results."""
    torrents = _parse_torrents(item, results, log_msg)
    if not torrents:
        return {}
    return _create_streams_dict(sort_torrents(torrents))


def _parse_torrents(item: MediaItem, results: Dict[str, str], log_msg: bool) -> Set[Torrent]:
    """Parse raw results into validated torrents."""
    torrents: Set[Torrent] = set()
    processed_infohashes: Set[str] = set()
    for infohash, raw_title in results.items():
        if infohash in processed_infohashes:
            continue
        torrent = _parse_single_torrent(item, infohash, raw_title, log_msg)
        if torrent and _validate_torrent(item, torrent, log_msg):
            torrents.add(torrent)
        processed_infohashes.add(infohash)
    return torrents


def _validate_torrent(item: MediaItem, torrent: Torrent, log_msg: bool) -> bool:
    """Validate torrent based on media type specific rules."""
    validators = {
        "movie": _validate_movie_torrent,
        "show": _validate_show_torrent,
        "season": _validate_season_torrent,
        "episode": _validate_episode_torrent,
    }
    return validators[item.type](item, torrent, log_msg)
```

This refactoring would:
- Improve code readability and maintainability
- Make the code easier to test
- Simplify adding new validation rules
- Make error handling more consistent
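One payoff of such a split is testability: a dispatch table can be exercised in isolation with stub validators. A minimal self-contained sketch — the helper name and the stub rules are illustrative, not the project's actual validators:

```python
from typing import Callable, Dict


def make_validator_dispatch(validators: Dict[str, Callable[[dict], bool]]) -> Callable[[str, dict], bool]:
    """Build a validate(item_type, torrent) function from a dispatch table."""
    def validate(item_type: str, torrent: dict) -> bool:
        if item_type not in validators:
            raise ValueError(f"no validator for item type {item_type!r}")
        return validators[item_type](torrent)
    return validate


# Stub validators standing in for _validate_movie_torrent and friends
validate = make_validator_dispatch({
    "movie": lambda t: t.get("year") == 2021,
    "episode": lambda t: "S01E01" in t.get("title", ""),
})
```

Adding a new media type then only requires registering one entry in the table, which keeps each rule small enough to unit-test on its own.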
src/program/settings/models.py (2)
Line range hint `299-304`: **Consider using Pydantic's Config for model validation.**

Both RTNSettingsModel and LoggingModel should leverage Pydantic's validation capabilities to ensure data integrity.
Example implementation:
```python
class RTNSettingsModel(SettingsModel, Observable):
    class Config:
        validate_assignment = True
        extra = "forbid"

    # Add your scene ranking fields here


class LoggingModel(Observable):
    class Config:
        validate_assignment = True
        extra = "forbid"

    level: str = "INFO"
    format: str = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    denied_reasons_enabled: bool = True
    file_output: bool = True
    console_output: bool = True
```
Line range hint `299-304`: **Add tests and update documentation.**

The changes to RTNSettingsModel and LoggingModel are significant and require:
- Unit tests for both models
- Integration tests with AppModel
- Updated documentation reflecting the new configuration options
- Migration guide for existing users
This aligns with the PR checklist items mentioned in the objectives.
Would you like help with:
- Generating unit tests for the models?
- Creating documentation templates?
- Writing a migration guide?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- `pyproject.toml` (1 hunks)
- `src/program/services/scrapers/shared.py` (1 hunks)
- `src/program/settings/models.py` (1 hunks)
🔇 Additional comments (5)
pyproject.toml (2)
`36-36`: **Verify breaking changes in rank-torrent-name v1.5.1.**
The major version jump from 1.0.2 to 1.5.1 might introduce breaking changes that need to be handled in the codebase.
✅ Verification successful
Let me check the actual usage of this package in the codebase to better understand potential impacts.
Let me check the package's repository directly to understand its changes.
Version upgrade appears safe to proceed
The version history from PyPI shows this is not actually a major version jump, but rather a series of minor version increments from 1.0.2 to 1.5.1 through several stable releases (1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0). Additionally, our codebase search shows minimal integration with this package, only appearing in dependency files. The gradual version progression suggests backward compatibility has been maintained.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check for changelog or release notes to identify breaking changes

# Check if there's a changelog in the package
gh api repos/rivenmedia/rank-torrent-name/contents/CHANGELOG.md 2>/dev/null || \
gh api repos/rivenmedia/rank-torrent-name/contents/changelog.md 2>/dev/null || \
gh api repos/rivenmedia/rank-torrent-name/releases/tags/v1.5.1 2>/dev/null

# Search for imports and usage patterns that might be affected
rg -l "rank.torrent.name" || rg -l "rank_torrent_name"
```
Length of output: 374
Script:
```bash
#!/bin/bash
# Check how rank-torrent-name is imported and used in the code
rg -A 5 "rank.torrent.name|rank_torrent_name" --type py

# Check the current version in poetry.lock to see what version we're upgrading from
rg -A 2 "rank-torrent-name" poetry.lock
```
Length of output: 202
Script:
```bash
#!/bin/bash
# Get the repository information and recent commits
gh api repos/rivenmedia/rank-torrent-name 2>/dev/null || \
gh api repos/rivenmedia/rank-torrent-name/commits 2>/dev/null

# Check PyPI for version history
curl -s https://pypi.org/pypi/rank-torrent-name/json | jq '.releases | keys[]'
```
Length of output: 792
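For context on the caret discussion above: a Poetry caret requirement permits any version that keeps the leftmost non-zero component fixed, so `^1.5.1` means `>=1.5.1,<2.0.0`, and the previous `^1.0.2` already admitted `1.5.1`. A small self-contained checker (illustrative only, not part of the codebase) makes the rule concrete:

```python
def caret_allows(requirement: str, candidate: str) -> bool:
    """Return True if `candidate` satisfies a caret requirement like '^1.5.1'.

    Caret semantics: allow upgrades that keep the leftmost non-zero
    version component unchanged (e.g. ^1.5.1 -> >=1.5.1,<2.0.0).
    """
    base = tuple(int(p) for p in requirement.lstrip("^").split("."))
    ver = tuple(int(p) for p in candidate.split("."))
    if ver < base:
        return False
    for i, part in enumerate(base):
        if part != 0:
            # Everything up to and including the first non-zero part must match
            return ver[: i + 1] == base[: i + 1]
    return ver == base  # ^0.0.0 only matches itself
```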
`36-36`: **Update documentation for scene ranking changes.**
Since this PR adds scene to ranking (as per PR objectives), ensure that any relevant documentation is updated to reflect the new functionality enabled by this dependency upgrade.
src/program/services/scrapers/shared.py (1)
`129-131`: **LGTM! Enhanced error logging for better debugging.**
The change improves error visibility by logging the actual exception message instead of just the raw title. This will help in debugging issues with garbage torrents more effectively.
Let's verify the GarbageTorrent exception usage across the codebase:
✅ Verification successful
Based on the results, I can now generate the final response:
Verified: Exception handling for GarbageTorrent is consistent and properly isolated
The codebase scan shows that:
- `GarbageTorrent` is properly imported from `RTN.exceptions`
- It's only actively used in `shared.py` for the scraping logic
- The commented import in `torbox.py` indicates it was intentionally removed from downloaders
- The exception handling pattern is unique to the scraping context, which is appropriate
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check GarbageTorrent exception usage and logging patterns
# This helps ensure consistent error handling across the codebase

# Check for GarbageTorrent imports and usage
rg -A 3 "GarbageTorrent"

# Check for similar exception logging patterns
ast-grep --pattern 'except GarbageTorrent as $_: $$$'
```
Length of output: 1049
src/program/settings/models.py (2)
Line range hint `304-304`: **LoggingModel implementation is missing.**
The LoggingModel appears to be related to the "denied reasoning when trashing torrents" feature, but it's currently empty. This could affect the system's ability to log denial reasons.
Let's verify the logging requirements:
```bash
#!/bin/bash
# Search for LoggingModel usage to understand required fields
rg -A 5 "LoggingModel" --type py

# Search for torrent trash-related logging
rg -B 5 -A 5 "trash.*torrent" --type py
```
Please implement the required logging configuration:
- Log level settings
- Log format configuration
- Specific fields for torrent denial reasons
- File/console output settings
`299-299`: **RTNSettingsModel implementation is missing.**
The RTNSettingsModel is crucial for the "scene to ranking" feature mentioned in the PR objectives, but it's currently empty. This could cause runtime issues as the model lacks the necessary attributes and validation.
Let's verify the usage of this model in the codebase:
Please implement the required attributes and validation rules. At minimum, we need:
- Scene ranking configuration
- Default values for ranking parameters
- Appropriate field validators
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (7)
src/routers/secure/webhooks.py (1)
`44-45`: **Consider enhancing the success response and logging.**

While the implementation is correct, consider these improvements for better observability:
- Include more context in the success response (e.g., the created item's ID)
- Add structured logging for successful item creation
Consider applying this enhancement:
```diff
- new_item = MediaItem({"imdb_id": imdb_id, "requested_by": "overseerr", "overseerr_id": req.request.request_id})
- request.app.program.em.add_item(new_item, service="Overseerr")
- return {"success": True}
+ new_item = MediaItem({"imdb_id": imdb_id, "requested_by": "overseerr", "overseerr_id": req.request.request_id})
+ request.app.program.em.add_item(new_item, service="Overseerr")
+ logger.info(f"Successfully created media item", imdb_id=imdb_id, overseerr_id=req.request.request_id)
+ return {
+     "success": True,
+     "item": {
+         "imdb_id": imdb_id,
+         "overseerr_id": req.request.request_id
+     }
+ }
```

src/program/apis/overseerr_api.py (1)
Line range hint `183-194`: **Consider using enums for status codes.**

The status codes are currently documented as comments. Consider converting them into enums for better type safety and maintainability:
```python
from enum import IntEnum

class RequestStatus(IntEnum):
    PENDING_APPROVAL = 1
    APPROVED = 2
    DECLINED = 3

class MediaStatus(IntEnum):
    UNKNOWN = 1
    PENDING = 2
    PROCESSING = 3
    PARTIALLY_AVAILABLE = 4
    AVAILABLE = 5
```

This would make the code more maintainable and provide better IDE support when checking status codes.
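Because `IntEnum` members compare equal to plain integers, adopting enums like these would not break existing call sites that still pass raw status codes — a quick self-contained sketch (the raw payload value is invented for illustration):

```python
from enum import IntEnum


class RequestStatus(IntEnum):
    PENDING_APPROVAL = 1
    APPROVED = 2
    DECLINED = 3


# A raw integer, e.g. parsed from an Overseerr JSON payload
raw_status = 2
status = RequestStatus(raw_status)
```

The readable `status.name` also improves log messages compared to a bare `2`.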
src/routers/secure/items.py (1)
`354-358`: **Consider implementing transaction management.**

The media item removal process involves multiple steps and external service calls. Consider implementing transaction management to ensure data consistency.
Suggestions:
- Use database transactions to ensure atomic operations
- Implement a rollback mechanism for failed external service calls
- Consider using the Unit of Work pattern to manage the complex removal process
Example approach:
```python
async def remove_item_transaction(item: MediaItem, request: Request, session: Session) -> None:
    try:
        # Start transaction
        with session.begin():
            # Cancel jobs
            request.app.program.em.cancel_job(item.id)

            # Clean up related items
            if item.type == "show":
                await cleanup_show_items(item, request)

            # Clean up streams and symlinks
            db_functions.clear_streams_by_id(item.id)
            if symlink_service := request.app.program.services.get(Symlinker):
                symlink_service.delete_item_symlinks_by_id(item.id)

            # Delete Overseerr request
            if item.overseerr_id:
                if overseerr := request.app.program.services.get(Overseerr):
                    try:
                        overseerr.delete_request(item.overseerr_id)
                    except Exception as e:
                        logger.error(f"Failed to delete Overseerr request: {e}")
                        raise

            # Delete from database
            db_functions.delete_media_item_by_id(item.id)
    except Exception as e:
        logger.error(f"Failed to remove item {item.id}: {e}")
        raise
```

src/program/program.py (4)
Line range hint `449-452`: **Consider using specific exception types.**

Instead of catching a generic Exception, consider catching specific exceptions that might occur during item enhancement (e.g., NetworkError, ValidationError) to provide more targeted error handling.
```diff
- except Exception as e:
-     logger.exception(f"Error processing {item.log_string}: {e}")
-     raise  # Re-raise to trigger rollback
+ except (NetworkError, ValidationError) as e:
+     logger.error(f"Failed to process {item.log_string}: {str(e)}")
+     errors.append(f"Error processing {item.log_string}: {str(e)}")
+ except Exception as e:
+     logger.exception(f"Unexpected error processing {item.log_string}: {e}")
+     raise  # Re-raise to trigger rollback
```
Line range hint `426-427`: **Consider making the chunk size configurable.**

The chunk size is hardcoded to 100 items. Consider making this configurable through settings to allow tuning based on system resources and library size.

```diff
- chunk_size = 100
+ chunk_size = settings_manager.settings.symlink.get('chunk_size', 100)
```
Line range hint `442-445`: **Enhance duplicate detection messages.**

The error messages for duplicate detection could be more informative by including the actual paths involved.

```diff
- errors.append(f"Duplicate symlink directory found for {item.log_string}")
+ errors.append(f"Duplicate symlink directory found for {item.log_string} at path: {item.path}")
- errors.append(f"Duplicate item found in database for id: {item.id}")
+ errors.append(f"Duplicate item found in database for {item.log_string} (id: {item.id})")
```
Thread pool size validation needed across multiple components
The codebase uses ThreadPoolExecutor in multiple locations with configurable worker counts:
- `src/program/program.py`: Uses the `SYMLINK_MAX_WORKERS` env var without bounds
- `src/program/managers/event_manager.py`: Uses a `max_workers` parameter
- `src/program/services/libraries/symlink.py`: Has two instances using `max_workers`
Add bounds validation to all thread pool configurations to prevent resource exhaustion. Consider implementing a shared configuration validator.
🔗 Analysis chain
Line range hint `429-430`: **Verify thread pool size configuration.**

The thread pool size is configured through an environment variable but there's no validation of the value. Consider adding bounds checking to prevent resource exhaustion.

```diff
- workers = int(os.getenv("SYMLINK_MAX_WORKERS", 4))
+ workers = min(max(int(os.getenv("SYMLINK_MAX_WORKERS", 4)), 1), 16)  # Bound between 1 and 16
```

🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check for thread pool size configuration in environment variables and settings files.

# Check environment variable
echo "Current SYMLINK_MAX_WORKERS value:"
printenv SYMLINK_MAX_WORKERS

# Search for any thread pool configuration in settings files
fd -e toml -e yaml -e json | xargs rg -i "workers|threads|pool"
```

Length of output: 244
Script:
```bash
#!/bin/bash
# Let's search for the thread pool configuration in Python files
rg -U "SYMLINK_MAX_WORKERS" -A 3 -B 3

# Search for ThreadPoolExecutor usage context
ast-grep --pattern 'ThreadPoolExecutor($$$)'
```

Length of output: 1204
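The bounds check suggested in the diff above could be factored into one helper shared by all three thread-pool call sites. A sketch with illustrative bounds — the 1–16 range and the helper name are assumptions, not project requirements:

```python
import os


def bounded_workers(env_var: str = "SYMLINK_MAX_WORKERS",
                    default: int = 4, low: int = 1, high: int = 16) -> int:
    """Read a worker count from the environment, clamped to [low, high]."""
    try:
        requested = int(os.getenv(env_var, default))
    except ValueError:
        requested = default  # malformed values fall back to the default
    return min(max(requested, low), high)


os.environ["SYMLINK_MAX_WORKERS"] = "64"  # simulate an oversized setting
workers = bounded_workers()  # clamped to the upper bound
```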
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- `pyproject.toml` (1 hunks)
- `src/program/apis/overseerr_api.py` (2 hunks)
- `src/program/program.py` (2 hunks)
- `src/routers/secure/items.py` (1 hunks)
- `src/routers/secure/webhooks.py` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🔇 Additional comments (5)
src/routers/secure/webhooks.py (1)
`44-44`: **Verify the field renaming impact across the codebase.**

The change from `requested_id` to `overseerr_id` in the MediaItem construction appears to be part of a broader refactoring. While the change itself is logical, we should verify its consistency across the codebase.
Let's verify the field usage across the codebase:
src/program/apis/overseerr_api.py (2)
`139-139`: **LGTM! Good simplification.**
The simplified boolean expression removes redundant comparison while maintaining the same functionality. The error handling and logging are comprehensive.
`79-79`: **Verify MediaItem model compatibility with overseerr_id.**

The field rename from `requested_id` to `overseerr_id` better reflects its source. Let's verify this change is consistent with the MediaItem model definition.
✅ Verification successful
**The field `requested_id` is still in use and hasn't been properly renamed.**

The codebase still uses `requested_id`:

- In the database schema (`baseline_schema.py`)
- In the `MediaItem` model definition (`src/program/media/item.py`)
- In attribute copying logic (`src/program/services/indexers/trakt.py`)
Let's verify if `overseerr_id` is a separate field or if this is an inconsistency:
**Both `overseerr_id` and `requested_id` fields coexist and serve different purposes.**

The verification reveals that both fields are intentionally separate in the MediaItem model:

- `overseerr_id` represents Overseerr's media ID (from `item.media.id`)
- `requested_id` is a different field, likely for internal request tracking

The change in `overseerr_api.py` is correct as it's specifically mapping Overseerr's media ID to the appropriate field. No issues found.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check if MediaItem model has been updated to use overseerr_id

# Search for MediaItem class definition and its usage
ast-grep --pattern 'class MediaItem {
  $$$
}'

# Search for any remaining usage of requested_id
rg 'requested_id' -A 3
```
Length of output: 1700
Script:
```bash
#!/bin/bash
# Check for overseerr_id usage in the codebase
rg 'overseerr_id' -A 3

# Check MediaItem model definition more thoroughly
ast-grep --pattern 'class MediaItem {
  $$$
  overseerr_id
  $$$
}'
```
Length of output: 4740
src/routers/secure/items.py (1)
`354-358`: **Verify Overseerr service integration.**
Let's verify the Overseerr service implementation and usage of overseerr_id.
✅ Verification successful
Overseerr integration is properly implemented
The verification confirms that:
- The `delete_request` method is correctly implemented in the `OverseerrAPI` class with proper error handling and logging
- The `overseerr_id` field is properly defined in the `MediaItem` model with appropriate database schema support
- The usage in `items.py` is consistent with the implementation
The integration is working as expected with:
- Proper API error handling and logging
- Database schema support with appropriate indexing
- Consistent usage across the codebase
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Verify Overseerr service implementation and overseerr_id usage

# Check Overseerr service implementation
echo "Checking Overseerr service implementation..."
ast-grep --pattern 'class Overseerr {
  $$$
  delete_request($id) {
    $$$
  }
  $$$
}'

# Check overseerr_id usage in models
echo "Checking overseerr_id in models..."
rg -l "overseerr_id" --type py

# Check for any TODO comments related to Overseerr
echo "Checking for Overseerr-related TODOs..."
rg -i "todo.*overseerr" --type py
```
Length of output: 649
Script:
```bash
#!/bin/bash
# Let's check the Overseerr service implementation and delete_request method
echo "Checking Overseerr service files..."
fd "overseerr.*\.py" --type f --exec cat {}

# Check the actual usage of delete_request method
echo "Checking delete_request usage..."
rg "delete_request" -B2 -A2 --type py

# Check Item model for overseerr_id field
echo "Checking Item model..."
rg "overseerr_id" src/program/media/item.py -B2 -A2
```
Length of output: 14259
src/program/program.py (1)
Line range hint `225-245`: **LGTM! Improved logging and error handling.**
The changes enhance the logging clarity and robustness:
- More concise debug logs without redundant prefixes
- Added exception handling to prevent single item failures from affecting the entire update process
- Clear visibility into the state transition process
```python
if item.overseerr_id:
    overseerr: Overseerr = request.app.program.services.get(Overseerr)
    if overseerr:
        overseerr.delete_request(item.overseerr_id)
        logger.debug(f"Deleted request from Overseerr with ID {item.overseerr_id}")
```
🛠️ Refactor suggestion
Enhance error handling for Overseerr request deletion
While the Overseerr integration looks good, the error handling could be improved to handle potential failures when deleting requests.
Consider wrapping the Overseerr request deletion in a try-catch block:
```diff
 if item.overseerr_id:
     overseerr: Overseerr = request.app.program.services.get(Overseerr)
     if overseerr:
-        overseerr.delete_request(item.overseerr_id)
-        logger.debug(f"Deleted request from Overseerr with ID {item.overseerr_id}")
+        try:
+            overseerr.delete_request(item.overseerr_id)
+            logger.debug(f"Deleted request from Overseerr with ID {item.overseerr_id}")
+        except Exception as e:
+            logger.error(f"Failed to delete Overseerr request {item.overseerr_id}: {str(e)}")
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if item.overseerr_id:
    overseerr: Overseerr = request.app.program.services.get(Overseerr)
    if overseerr:
        try:
            overseerr.delete_request(item.overseerr_id)
            logger.debug(f"Deleted request from Overseerr with ID {item.overseerr_id}")
        except Exception as e:
            logger.error(f"Failed to delete Overseerr request {item.overseerr_id}: {str(e)}")
```
LGTM
Pull Request Check List
Resolves: #issue-number-here
Description:
Summary by CodeRabbit

- **New Features**
  - Updated the `rank-torrent-name` dependency to a more recent version for improved functionality.
- **Bug Fixes**
- **Refactor**
  - `RTNSettingsModel` and `LoggingModel`, indicating potential future changes in their implementation.
  - `MediaItem` in various API methods to reflect updated naming conventions.