
Results about dataset E3-thiea #14

Closed
Acomand opened this issue May 25, 2024 · 6 comments

Acomand commented May 25, 2024

I'd like to know which file is the test set of the theia dataset.

If I use the graphs you provided, I can reproduce the results.

But if I run the preprocessing myself with "python trace_parser.py --dataset theia", the generated file "test0.pkl" differs from the one you provide, and the result is far from the reported one (whether I use the checkpoint you provided or retrain the model).

So I'd like to know: is "ta1-theia-e3-official-6r.json.8" the test set? (I have also tried using other files as the test set, but those results are also bad.) Could you please rerun trace_parser.py to check the code?

(The results on trace and cadets have no such problem.)

Thanks a lot.

@zmkzmkzmkzmkzmk

Do you have any problems reproducing the results on the cadets dataset? After I processed the data and used it for model training, I got the following results during validation:
(screenshot of validation results)

I repeated the experiment five times and the results were similar.


Acomand commented May 26, 2024

When I use the E3-cadets dataset, I get a result similar to yours.

But I cannot reproduce the result on E3-theia. I run the code as:
```
cd utils && python trace_parser.py --dataset theia
cd ..
python train.py --dataset theia
python eval.py --dataset theia
```
And the result is:
(screenshot of evaluation results)

@Jimmyokok
Collaborator

The correct way: put ta1-theia-e3-official-6r.json through ta1-theia-e3-official-6r.json.8 (including ALL of 0-8) into the data directory. The parser will then read entities from 0-8, parse 0-3 into training graphs, and parse 8 into the test graph.
I tried parsing E3-THEIA without 4-7 present, and many entities were lost, including several thousand malicious entities, because they are only defined in 4-7.
Additionally, the peak-detection threshold is pre-defined in the eval script. If you are training and detecting from scratch, you can adjust that threshold to make the confusion matrix look normal, or simply refer to the AUC for a threshold-insensitive evaluation.
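As a quick sanity check before parsing, a minimal sketch (the `data` directory name and the `.json.N` suffix convention are assumptions based on the filenames quoted in this thread) that verifies all nine splits are present:

```python
# Sketch only: check that all nine E3-THEIA splits are in the data
# directory before running trace_parser.py. Directory name and suffix
# convention are assumptions, not taken from the MAGIC codebase.
from pathlib import Path

data_dir = Path("data")  # adjust to wherever the parser reads raw logs
names = ["ta1-theia-e3-official-6r.json"] + [
    f"ta1-theia-e3-official-6r.json.{i}" for i in range(1, 9)
]
missing = [n for n in names if not (data_dir / n).is_file()]
print("all 9 splits present" if not missing else f"missing: {missing}")
```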

@Jimmyokok
Collaborator

> Do you have any problems reproducing the results of the cadets data set? After I processed the data and used it for model training, I got the following results during verification: (screenshot)
>
> I repeated the experiment five times and the results were similar.

Similar to the above, the printed confusion matrix can look quite different as the threshold varies. Please refer to the AUC for a threshold-insensitive evaluation.
Meanwhile, n_neighbors (i.e. k) also affects detection performance, and it is more sensitive on E3-CADETS than on the other datasets.
We are also planning to release a new version of MAGIC, which is more stable, more efficient and easier to reproduce from scratch.
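The threshold-vs-AUC point can be illustrated on synthetic scores (hypothetical data, not MAGIC's actual output): the confusion matrix swings with the cut-off, while the AUC is a single number for the whole score ranking.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Hypothetical anomaly scores and ground-truth labels (1 = malicious).
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.3, 0.1, 900),   # benign entities
                         rng.normal(0.7, 0.1, 100)])  # malicious entities
labels = np.concatenate([np.zeros(900), np.ones(100)])

# AUC is threshold-insensitive: it depends only on the score ranking.
auc = roc_auc_score(labels, scores)

# The confusion matrix, by contrast, depends entirely on the cut-off.
for thr in (0.4, 0.5, 0.6):
    preds = (scores >= thr).astype(int)
    print(f"threshold={thr}:\n{confusion_matrix(labels, preds)}")
print(f"AUC = {auc:.4f}")
```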

@Jimmyokok
Collaborator

> Do you have any problems reproducing the results of the cadets data set? After I processed the data and used it for model training, I got the following results during verification: (screenshot)
>
> I repeated the experiment five times and the results were similar.

For reference, I repeated the evaluation with two different k values: k = 100 yields AUC = 0.9933 and k = 400 yields AUC = 0.9968.
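A toy sketch of why n_neighbors matters, using a generic mean k-NN-distance anomaly score on synthetic embeddings (this is an illustration of k sensitivity in general, not MAGIC's exact detector):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
# Hypothetical embeddings: a dense benign cluster plus a small,
# sparser malicious cluster shifted away from it.
benign = rng.normal(0.0, 1.0, size=(500, 16))
malicious = rng.normal(3.0, 1.0, size=(25, 16))
X = np.vstack([benign, malicious])
y = np.array([0] * 500 + [1] * 25)

# Score each point by its mean distance to its k nearest neighbors;
# changing k reshapes the scores and shifts the resulting AUC.
# (k must stay below the number of samples.)
for k in (10, 100, 400):
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    dist, _ = nn.kneighbors(X)
    score = dist.mean(axis=1)
    print(f"k={k}: AUC={roc_auc_score(y, score):.4f}")
```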


Acomand commented May 26, 2024

It works, thanks a lot
(screenshot of reproduced results)

@Acomand Acomand closed this as completed May 26, 2024
@Jimmyokok Jimmyokok pinned this issue May 26, 2024