Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: creating molecules with invalid molfile using faulty smiles #2255

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

adambasha0
Copy link
Contributor

@adambasha0 adambasha0 commented Dec 6, 2024

ensure the status of request to pubchem for getting mol from smiles is OK.
Otherwise return nil

@adambasha0 adambasha0 self-assigned this Dec 6, 2024
@adambasha0 adambasha0 added the bug label Dec 6, 2024
@@ -42,15 +42,16 @@ class MoleculeAPI < Grape::API
molfile = babel_info[:molfile] if babel_info
begin
rw_mol = RDKitChem::RWMol.mol_from_smiles(smiles)
rd_mol = rw_mol.mol_to_mol_block unless rw_mol.nil?
rd_mol = rw_mol.mol_to_mol_block unless rw_mol.nil?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics/ClassLength: Class has too many lines. [291/200]

Copy link

github-actions bot commented Dec 6, 2024

LCOV of commit c5bf635 during Continuous Integration #4208

Summary coverage rate:
  lines......: 65.5% (14059 of 21452 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

let(:faulty_smiles) { Chemotion::OpenBabelService.get_smiles_from_molfile(invalid_molfile) }

before do
allow(Chemotion::PubchemService).to receive(:molfile_from_smiles).with(faulty_smiles).and_return(corrupt_molfile)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/LineLength: Line is too long. [121/120]

Copy link

github-actions bot commented Dec 9, 2024

LCOV of commit 648c42c during Continuous Integration #4221

Summary coverage rate:
  lines......: 66.4% (14233 of 21425 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

@adambasha0 adambasha0 force-pushed the prevent-creation-of-faulty-molecules-with-incorrect-smiles branch 2 times, most recently from d057483 to d2fb6be Compare December 10, 2024 08:16
Copy link

LCOV of commit d2fb6be during Continuous Integration #4226

Summary coverage rate:
  lines......: 66.4% (14232 of 21426 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

@adambasha0 adambasha0 force-pushed the prevent-creation-of-faulty-molecules-with-incorrect-smiles branch from d2fb6be to b17441a Compare December 12, 2024 09:15
@@ -42,15 +42,15 @@ class MoleculeAPI < Grape::API
molfile = babel_info[:molfile] if babel_info
begin
rw_mol = RDKitChem::RWMol.mol_from_smiles(smiles)
rd_mol = rw_mol.mol_to_mol_block unless rw_mol.nil?
rd_mol = rw_mol.mol_to_mol_block unless rw_mol.nil?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics/ClassLength: Class has too many lines. [290/200]

@adambasha0 adambasha0 force-pushed the prevent-creation-of-faulty-molecules-with-incorrect-smiles branch from b17441a to 16992a1 Compare December 12, 2024 09:27
Copy link

LCOV of commit b17441a during Continuous Integration #4237

Summary coverage rate:
  lines......: 66.5% (14249 of 21424 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

Copy link

LCOV of commit 16992a1 during Continuous Integration #4238

Summary coverage rate:
  lines......: 66.4% (14232 of 21425 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

@adambasha0
Copy link
Contributor Author

adambasha0 commented Dec 12, 2024

  • This bug fix introduces a routine to do the following:
  1. prevent the creation of faulty molecule (meaning invalid/corrupt molfile) using molecule :smiles API endpoint.

fix: return nil when response status is 400 for get_molfile_by_smiles

@adambasha0 adambasha0 force-pushed the prevent-creation-of-faulty-molecules-with-incorrect-smiles branch from 16992a1 to 4cd48c2 Compare January 9, 2025 20:43
@@ -80,7 +80,10 @@ def self.get_molfile_by_smiles(smiles)
@auth = { username: '', password: '' }
options = { timeout: 10, headers: { 'Content-Type' => 'text/json' } }
encoded_smiles = URI.encode(smiles, '[]/()+-.@#=\\')
HTTParty.get(http_s + PUBCHEM_HOST + '/rest/pug/compound/smiles/' + encoded_smiles + '/record/SDF', options).body
response = HTTParty.get(http_s + PUBCHEM_HOST + '/rest/pug/compound/smiles/' + encoded_smiles + '/record/SDF', options)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics/ModuleLength: Module has too many lines. [192/100]

@@ -80,7 +80,10 @@ def self.get_molfile_by_smiles(smiles)
@auth = { username: '', password: '' }
options = { timeout: 10, headers: { 'Content-Type' => 'text/json' } }
encoded_smiles = URI.encode(smiles, '[]/()+-.@#=\\')
HTTParty.get(http_s + PUBCHEM_HOST + '/rest/pug/compound/smiles/' + encoded_smiles + '/record/SDF', options).body
response = HTTParty.get(http_s + PUBCHEM_HOST + '/rest/pug/compound/smiles/' + encoded_smiles + '/record/SDF', options)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style/StringConcatenation: Prefer string interpolation to string concatenation.

@@ -80,7 +80,10 @@ def self.get_molfile_by_smiles(smiles)
@auth = { username: '', password: '' }
options = { timeout: 10, headers: { 'Content-Type' => 'text/json' } }
encoded_smiles = URI.encode(smiles, '[]/()+-.@#=\\')
HTTParty.get(http_s + PUBCHEM_HOST + '/rest/pug/compound/smiles/' + encoded_smiles + '/record/SDF', options).body
response = HTTParty.get(http_s + PUBCHEM_HOST + '/rest/pug/compound/smiles/' + encoded_smiles + '/record/SDF', options)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/LineLength: Line is too long. [123/120]

before do
allow(Chemotion::OpenBabelService).to receive(:molfile_clear_hydrogens)
.with(any_args)
.and_return("Status: 400\n OpenBabel01092512342D\nMessage: Unable to standardize the given structure - perhaps some special charac\n 0 0 0 0 0 0 0 0 0 0999 V2000\nM END\n")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/LineLength: Line is too long. [193/120]

end

JSON.parse(File.read('spec/fixtures/structures/bad_smiles.json')).each_value do |data|
context 'when testing SMILES entry' do
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RSpec/NestedGroups: Maximum example group nesting exceeded [4/3].

error_body = "<<~BODY
Status: 400
Code: PUGREST.BadRequest
Message: Unable to standardize the given structure - perhaps some special characters need to be escaped or data packed in a MIME form?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/LineLength: Line is too long. [140/120]

Copy link

github-actions bot commented Jan 9, 2025

LCOV of commit 4cd48c2 during Continuous Integration #4294

Summary coverage rate:
  lines......: 67.1% (14469 of 21571 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

@adambasha0 adambasha0 force-pushed the prevent-creation-of-faulty-molecules-with-incorrect-smiles branch from 4cd48c2 to 5930336 Compare January 10, 2025 08:22
@adambasha0 adambasha0 force-pushed the prevent-creation-of-faulty-molecules-with-incorrect-smiles branch from 5930336 to bbaa130 Compare January 10, 2025 08:43
Copy link

LCOV of commit bbaa130 during Continuous Integration #4296

Summary coverage rate:
  lines......: 67.1% (14471 of 21571 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

Copy link

LCOV of commit 5930336 during Continuous Integration #4295

Summary coverage rate:
  lines......: 67.1% (14470 of 21570 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate: n/a

@PiTrem PiTrem changed the title fix: prevent create molecules with invalid molfile using faulty smiles fix: creating molecules with invalid molfile using faulty smiles Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant