Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially missing test cases generated from a simple loop #88

Closed
thuanpv opened this issue Oct 11, 2021 · 4 comments
Closed

Potentially missing test cases generated from a simple loop #88

thuanpv opened this issue Oct 11, 2021 · 4 comments

Comments

@thuanpv
Copy link

thuanpv commented Oct 11, 2021

Hello SymCC maintainers,

I have been trying to test how SymCC works on loops. To that end, I wrote a simple C program based on this example: https://github.com/AdaLogics/adacc/blob/master/examples/afl-sample.c. Following is my modified program (stored in example.c) and the function of interest is foo.

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>

int foo(char *arr, int count) {
    int i = 0;

    for (i = 0; i < count; i++) {
      if (arr[i] == 'A' + i) return i;
    }

    if (*(int*)(arr) != 0x4d4f4f42) {
        return i;
    }
    return 0;
}

int main(int argc, char* argv[]) {
    // open file
    FILE *f = fopen(argv[1], "rb");

    // get file size
    fseek(f, 0, SEEK_END);
    long fsize = ftell(f);

    // read file contents
    fseek(f, 0, SEEK_SET);
    char *string = (char*)malloc(fsize + 1);
    fread(string, 1, fsize, f);
    fclose(f);

    // call into target
    int retval = foo(string, fsize);

    free(string);
    return retval;
}

I compiled this program and ran SymCC using the following commands

symcc example.c -o example
printf "xxxx" > seed
SYMCC_INPUT_FILE=seed ./example seed

I expected that SymCC would generate 5 test cases ("Axxx", "xBxx", "xxCx", "xxxD", and "BOOM"). However, somehow SymCC only generated 4 test cases, the string "xxxD" was missing.

It looked strange to me so I thought something had gone wrong. I modified the program and manually unrolled the loop 4 times; the modified foo function is listed below

int foo(char *arr, int count) {
    int i = 0;
    if (i < count) {
      if (arr[i] == 'A' + i) return i;
      i++;
    }

    if (i < count) {
      if (arr[i] == 'A' + i) return i;
      i++;
    }

    if (i < count) {
      if (arr[i] == 'A' + i) return i;
      i++;
    }

    if (i < count) {
      if (arr[i] == 'A' + i) return i;
      i++;
    }

    if (*(int*)(arr) != 0x4d4f4f42) {
        return i;
    }
    return 0;
}

Interestingly, when I ran SymCC with the new program, which was saved into example_loop_unrolled.c, SymCC generated 5 test cases as expected. The commands are similar to the above ones

symcc example_loop_unrolled.c -o example_loop_unrolled
SYMCC_INPUT_FILE=seed ./example_loop_unrolled seed

So I suspect that there are some issues in the loop-handling code of SymCC. Could you please confirm?

Thanks,

Thuan

@julihoh
Copy link

julihoh commented Oct 11, 2021

If you are running SymCC (or SymQEMU) with the QSYM backend, then QSYM may be pruning the iterations of the loop. This would also explain why the unrolled loop does not show this behaviour: since it will (presumably) generate multiple branches (instead of one in a loop), each branch will be considered new by QSYM. See here.

@thuanpv
Copy link
Author

thuanpv commented Oct 12, 2021

Thanks @julihoh for your explanation. It makes sense to me now. It seems the QSYM backend applies some heuristics like the "loop bucketing" idea of AFL -- AFL only retains branches having specific "hit counts" like 1, 2, 3, 4, 8, 16, 32, ....

@julihoh
Copy link

julihoh commented Oct 12, 2021

Exactly. Always happy to help :)

@thuanpv
Copy link
Author

thuanpv commented Sep 12, 2022

Just a quick update. I recently re-ran the same example using the simple backend instead of QSYM and it works as expected: the Z3-based backend generates exactly 5 test cases.

@thuanpv thuanpv closed this as completed Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants