Enhance File Detection and Extend Compatibility for module read-bioinfo-metadata #253

Closed
10 tasks done
Daniel-VM opened this issue Feb 21, 2024 · 0 comments
Labels
enhancement New feature or request

Daniel-VM commented Feb 21, 2024

Refactor read_bioinfo_metadata.py to enhance the detection of target files (those generated during bioinformatics analysis) and expand its applicability to additional bioinformatics software and pipelines.

Task list

  • Code review and refactoring.

  • Create a bioinfo configuration file containing all properties required for tools used in the software or pipeline.

  • Implement automatic identification of bioinformatics analysis output files.

  • Update the method that verifies and validates the required files in the bioinformatics analysis input.

  • Add a new progress-log method and format that prints a formatted table to stdout, showing the warnings, errors, and valid specifications encountered while parsing metadata (at sample and field level).

  • Expand read-bioinfo-metadata to parse results from current and upcoming bioinformatics tools. Ideally, the module should identify the relevant metadata (which must be specified in config files) according to the specific tool used in the bioinformatics analysis.

    • Add a method to handle and map metadata in standard formats (CSV, TSV, ...).
    • Add a strategy to handle and map metadata in pipeline-specific scenarios (formats, data structures).
    • Create assets/pipeline_utils to hold pipeline-specific classes and functions.
  • Fix sample-name mapping in the method include_data_from_mapping_stats. An error occurs when no sample in the mapping stats matches a submitting_lab_sample_id in processed_metadata_lab.json (Managing viralrecon execution #240).
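
As a rough illustration of the config-driven file detection and required-file validation listed above, here is a minimal sketch. It assumes a hypothetical configuration that maps each tool to a filename pattern and a `required` flag. The tool names, patterns, and the helpers `find_output_files` and `missing_required` are invented for this example and are not taken from read_bioinfo_metadata.py.

```python
from pathlib import Path

# Hypothetical config: one entry per tool, with a glob pattern and a required flag.
# Tool names and patterns are placeholders, not the project's real ones.
BIOINFO_CONFIG = {
    "mapping_stats": {"pattern": "*mapping_stats*.tab", "required": True},
    "variants_table": {"pattern": "*variants_table*.csv", "required": False},
}


def find_output_files(analysis_dir, config):
    """Return {tool: sorted list of matching paths} found under analysis_dir."""
    root = Path(analysis_dir)
    return {
        tool: sorted(root.rglob(props["pattern"]))
        for tool, props in config.items()
    }


def missing_required(found, config):
    """Return the tools marked as required for which no file was found."""
    return sorted(
        tool
        for tool, props in config.items()
        if props["required"] and not found.get(tool)
    )
```

Keeping the patterns in a config rather than in code means that supporting a new tool would, under these assumptions, only require a new config entry.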

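For the sample-name mapping fix, one possible approach is to collect unmatched samples and report them instead of raising an error. The sketch below is not the actual include_data_from_mapping_stats method. The function name `merge_mapping_stats` and the data shapes (a list of metadata dicts and a dict of stats keyed by sample ID) are assumptions made for illustration.

```python
def merge_mapping_stats(metadata_rows, mapping_stats,
                        id_field="submitting_lab_sample_id"):
    """Merge per-sample mapping stats into metadata rows.

    Returns (updated_rows, unmatched_ids). Rows without a matching entry in
    mapping_stats are kept unchanged and their IDs are reported, rather than
    raising an error.
    """
    updated, unmatched = [], []
    for row in metadata_rows:
        sample_id = row.get(id_field)
        stats = mapping_stats.get(sample_id)
        if stats is None:
            # No match for this sample: record it and leave the row untouched.
            unmatched.append(sample_id)
            updated.append(dict(row))
        else:
            updated.append({**row, **stats})
    return updated, unmatched
```

The list of unmatched IDs could then feed the warnings column of the progress-log table.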