eval #1

Open
bujianbusan123-alt opened this issue Oct 26, 2021 · 11 comments

@bujianbusan123-alt

How do I evaluate the trained model? SOS

@DreamMemory001

I reckon you only need to run python ./scaffold3D.py eval 3D_Scaffold ./data/ ./model. Could you tell me where I can get the contents of xyz_files for the dbs_script folder? And have you trained successfully? Thank you.

@bujianbusan123-alt
Author

I have succeeded in training and generation, and split.npz is in the model directory.
I ran "python /home/drug/soft/3d-scaffold/3D_Scaffold/scaffold3D.py eval 3D_Scaffold ./data/ ./model/ --split test --cuda --batch_size 5 --draw_random_samples 5 --features 64 --interactions 6", which fails with the error "NameError: name 'main' is not defined". Thank you very much.

@DreamMemory001

Is your OS Windows 10, CentOS, or Ubuntu? Could you please post the specific error message from the run?
As the screenshot below shows, generate_3D_Scaffold.py in the dbs_script folder sets raw_path = './xyz_files/', but that folder contains no xyz_files/ directory or any contents for it. Could you please tell me how you did it?
image

@bujianbusan123-alt
Author

The OS is CentOS. I produced "xyz_files" myself from the QM9 database. My question is how to run "eval" mode correctly after training.
image

@DreamMemory001

DreamMemory001 commented Oct 27, 2021

image
This is the main function; can you check whether you renamed it? I cannot train right now, so I would like to know how you got "xyz_files", because the readme.md does not mention it.
I tried the commands from the readme.md to generate the DB, but did not succeed. Could you please tell me how you did it?
Thanks
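
The check suggested above (whether a function is still defined under the name main) can be done without running the script. The following is a minimal sketch using Python's standard ast module; the example source string is a hypothetical stand-in, not the real contents of scaffold3D.py, and the actual cause of the NameError in this thread has not been confirmed.

```python
import ast


def top_level_names(source):
    """Collect names bound at module top level: defs, classes, assignments, imports."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import x as y" binds "y"
                names.add((alias.asname or alias.name).split(".")[0])
    return names


# Hypothetical script: it calls main() but only defines run().
example = '''
def run(args):
    pass

if __name__ == "__main__":
    main(None)
'''

print("main" in top_level_names(example))  # False
```

To inspect a real file, its text can be read with open(path).read() and passed to top_level_names. Names defined only inside if blocks or other nested statements are not reported by this sketch.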

@bujianbusan123-alt
Author

xyz_files is the original database, which needs to be preprocessed; I downloaded it from the QM9 database.
However, I advise you to use the authors' database to train and generate directly in their directories. These steps are easy to complete.

@DreamMemory001

Thank you, but which .py file should I run, and where is it?

@bujianbusan123-alt
Author

Begin with the steps below.
Train a model. To train a model with the same hyperparameters as described in the paper:

python ./scaffold3D.py train 3D_Scaffold ./data/ ./model --split 2000 500 --cuda --batch_size 5 --draw_random_samples 5 --features 64 --interactions 6 --max_epochs 1000

Generate molecules with the trained model:

python ./scaffold3D.py generate 3D_Scaffold ./model/ 100 --functional_group 'C=CC(=O)N' --chunk_size 100 --max_length 65 --file_name scaffold

Filter the generated molecules:

python filter_generated.py ./model/generated/scaffold.mol_dict

Write the generated molecules to an xyz file:

python write_xyz.py
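
The four steps above can be collected into one place so the paths stay consistent between them. This is only a sketch: the script names and flags are copied from the commands in this comment and have not been checked against the scripts themselves, and pipeline_commands is a hypothetical helper name. It builds and prints the command lines without running anything.

```python
import shlex


def pipeline_commands(model_dir="./model", data_dir="./data/",
                      functional_group="C=CC(=O)N", file_name="scaffold"):
    """Return the train, generate, filter and write steps as argument lists."""
    train = ["python", "./scaffold3D.py", "train", "3D_Scaffold", data_dir, model_dir,
             "--split", "2000", "500", "--cuda", "--batch_size", "5",
             "--draw_random_samples", "5", "--features", "64",
             "--interactions", "6", "--max_epochs", "1000"]
    generate = ["python", "./scaffold3D.py", "generate", "3D_Scaffold",
                model_dir + "/", "100", "--functional_group", functional_group,
                "--chunk_size", "100", "--max_length", "65",
                "--file_name", file_name]
    # The filter step reads the file that the generate step is expected to write.
    filter_step = ["python", "filter_generated.py",
                   f"{model_dir}/generated/{file_name}.mol_dict"]
    write_step = ["python", "write_xyz.py"]
    return [train, generate, filter_step, write_step]


for cmd in pipeline_commands():
    # shlex.join quotes arguments such as the SMILES string for the shell
    print(shlex.join(cmd))
```

Each list could be passed to subprocess.run(cmd, check=True) in order, so that a failing step stops the sequence.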

@DreamMemory001

But the DB should be generated first, shouldn't it?
image

@bujianbusan123-alt
Author

The data directory already contains several data files, and they can be used as they are.
If you generate the database yourself, you will run into a lot of trouble from the very beginning.

@mcnaughtonadm mcnaughtonadm self-assigned this Mar 11, 2022
@mcnaughtonadm
Collaborator

Hi, sorry for the late response. I will try to replicate this "NameError: name 'main' is not defined" issue.

To evaluate the model, you are right that you need to run:
python ./scaffold3D.py 3D_Scaffold eval model_path
You can also see optional inputs by replacing model_path with help.

To answer some other questions in this thread: the repository does not ship with a ./xyz_files/ directory. If you wish to create a database from your own dataset of XYZ files, you can place the directory in the ./dbs_script/ directory and then run the generate_3D_Scaffold.py {path_to_xyz_files} script with a path to the XYZ files you want to build into a dataset.
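
Before pointing the script at a directory of your own XYZ files, it may help to sanity-check the directory. The sketch below assumes only that each file ends in .xyz and begins with a line holding the atom count, which is the usual XYZ convention; it does not reproduce any check done by generate_3D_Scaffold.py, and check_xyz_dir is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def check_xyz_dir(xyz_dir):
    """Split the *.xyz files in xyz_dir into (good, bad) by their first line."""
    root = Path(xyz_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"not a directory: {root}")
    good, bad = [], []
    for path in sorted(root.glob("*.xyz")):
        with path.open() as fh:
            first = fh.readline().strip()
        # The first line of an XYZ file is expected to be a positive atom count.
        if first.isdigit() and int(first) > 0:
            good.append(path.name)
        else:
            bad.append(path.name)
    return good, bad


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "water.xyz").write_text(
        "3\nwater\nO 0.0 0.0 0.0\nH 0.0 0.0 1.0\nH 0.0 1.0 0.0\n")
    Path(tmp, "broken.xyz").write_text("not a count\n")
    demo_result = check_xyz_dir(tmp)
    print(demo_result)  # (['water.xyz'], ['broken.xyz'])
```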

I am rewording some of this in the code right now, but if you want to run training with the QM9 dataset, just remove the files that currently exist in the ./data/ folder and then run the training with the empty ./data as the data path. This will download the data for training directly, and you will not need to do any of the database generation steps. I believe the downloaded QM9 data has 133885 molecules in it, so you can tailor your splits accordingly.
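
As a worked example of tailoring the splits, the arithmetic can be sketched as follows. It assumes that the two numbers given to --split are the training and validation set sizes and that the remaining molecules form the test set; that reading is an assumption and is not confirmed in this thread. The function names are hypothetical.

```python
QM9_SIZE = 133885  # molecule count quoted in the comment above


def split_sizes(n_total, n_train, n_val):
    """Return (train, val, test) sizes, with test being whatever is left over."""
    if n_train < 0 or n_val < 0:
        raise ValueError("split sizes must be non-negative")
    n_test = n_total - n_train - n_val
    if n_test < 0:
        raise ValueError("train + val exceeds the dataset size")
    return n_train, n_val, n_test


def split_from_percent(n_total, train_pct, val_pct):
    """Same as split_sizes, but from integer percentages (rounded down)."""
    return split_sizes(n_total, n_total * train_pct // 100, n_total * val_pct // 100)


print(split_sizes(QM9_SIZE, 2000, 500))      # (2000, 500, 131385)
print(split_from_percent(QM9_SIZE, 80, 10))  # (107108, 13388, 13389)
```

Under that assumption, the second result would correspond to --split 107108 13388.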

If you don't want to pull the whole QM9 dataset for training, you can use the small sample we already provide at ./dbs_script/scaffold3D.db. Just remove the existing files in ./data/ and copy the new db file in.

From the main directory:
cp ./dbs_script/scaffold3D.db ./data/scaffold3D.db

From here you can run the training on this small dataset. You may need to tweak the --split until it runs properly.
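
The two manual steps (clearing ./data/ and copying the sample database in) can also be scripted. This is a sketch using only the Python standard library, with the paths taken from this comment as defaults; use_sample_db is a hypothetical helper name. Deleting files is irreversible, so back up anything in ./data/ that you want to keep first.

```python
import shutil
from pathlib import Path


def use_sample_db(data_dir="./data", sample_db="./dbs_script/scaffold3D.db"):
    """Remove the files in data_dir, then copy sample_db into it."""
    data, src = Path(data_dir), Path(sample_db)
    if not src.is_file():
        raise FileNotFoundError(f"sample database not found: {src}")
    data.mkdir(parents=True, exist_ok=True)
    if src.resolve().parent == data.resolve():
        # Guard: otherwise the source would be deleted before it is copied.
        raise ValueError("sample_db must not live inside data_dir")
    for entry in data.iterdir():
        if entry.is_file():  # subdirectories are left untouched
            entry.unlink()
    return Path(shutil.copy2(src, data / src.name))


if __name__ == "__main__":
    print(use_sample_db())
```

Run from the main directory of the repository, this has the same effect as the cp command above, plus the cleanup of ./data/.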
