Fix mime types of responses. #21

Merged: 1 commit from firebase-mimetype into carbon-language:master, May 21, 2020
Conversation

@jonmeow (Contributor) commented May 21, 2020

No description provided.

@jonmeow merged commit b369da2 into carbon-language:master on May 21, 2020
@jonmeow deleted the firebase-mimetype branch on May 27, 2020
chandlerc added a commit that referenced this pull request Dec 8, 2020
The only change here is to update the fuzzer build extension path.

The main original commit message:

> Add an initial lexer. (#17)
>
> The specific logic here hasn't been updated to track the latest
> discussed changes, much less to implement features such as Unicode
> support.
>
> However, this should lay out a reasonable framework and set of APIs.
> It gives an idea of the overall lexer architecture being proposed. The
> actual lexing algorithm is a relatively boring and naive hand-written
> loop. It may make sense to replace this with a generated or otherwise
> more advanced approach in the future; getting the implementation
> right was not the primary goal here. Instead, the focus was entirely
> on the architecture, encapsulation, APIs, and the testing
> infrastructure.
>
> The architecture of the lexer differs from "classical"
> high-performance lexers in compilers. A high-level summary:
>
> -   It is eager rather than lazy, lexing an entire file.
> -   Tokens intrinsically know their source location.
> -   Grouping lexical symbols are tracked within the lexer.
> -   Indentation is tracked within the lexer.
>
> Tracking of grouping and indentation is intended to simplify the
> strategies used to recover from mismatched grouping tokens and,
> eventually, to make use of indentation in that recovery.
>
> Folding source location into the token itself simplifies the data
> structures significantly and, because there is no preprocessor with
> token pasting, doesn't lose any fidelity.
>
> Making this an eager rather than a lazy lexer is intended to simplify
> the implementation and testing of the lexer (and subsequent
> components). There is no reason to expect Carbon to lex so many
> tokens that lazy lexing would offer significant locality advantages.
> Moreover, if we want comparable performance benefits, I think
> pipelining is a much more promising architecture than laziness. For
> now, the simplicity is a huge win.
>
> Being eager also makes it easy for us to use extremely dense memory
> encodings for the information about lexed tokens. Everything is
> created in a dense array, and small indices are used to identify each
> token within the array.
>
> There is a fuzzer included here that we have run extensively over the
> code, but toolchain bugs and Bazel limitations currently prevent it
> from building easily. I'm hoping that I or someone else can push on
> this soon and enable the fuzzer to at least build, if not run fuzz
> tests automatically. We have a significant fuzzing corpus that I'll
> add in a subsequent commit as well.
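
For concreteness, the dense token encoding described in the quoted message might look roughly like the C++ sketch below. All names here (`TokenizedBuffer`, `TokenInfo`, `Token`) are illustrative assumptions, not the actual APIs from the commit:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical token kinds; a real lexer defines many more.
enum class TokenKind : uint8_t { Identifier, IntegerLiteral, OpenParen, CloseParen };

// Each token's information lives in one small, fixed-size record.
struct TokenInfo {
  TokenKind kind;
  int32_t line;        // Source location folded directly into the token.
  int32_t column;
  int32_t text_begin;  // Offset of the token's text in the source buffer.
  int32_t text_size;
};

// A token handle is just a small index into the dense array below.
struct Token {
  int32_t index;
};

class TokenizedBuffer {
 public:
  auto AddToken(TokenInfo info) -> Token {
    token_infos_.push_back(info);
    return Token{static_cast<int32_t>(token_infos_.size() - 1)};
  }

  auto GetKind(Token t) const -> TokenKind { return token_infos_[t.index].kind; }
  auto GetLine(Token t) const -> int32_t { return token_infos_[t.index].line; }

 private:
  // Everything is created in one dense array, as the message describes.
  std::vector<TokenInfo> token_infos_;
};
```

Because a `Token` is just a 4-byte index, handles are cheap to copy and compare, and the per-token data stays contiguous in memory.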

This also includes the fuzzer whose commit message was:

> Add fuzz testing infrastructure and the lexer's fuzzer. (#21)
>
> This adds a fairly simple `cc_fuzz_test` macro that is specialized for
> working with LLVM's LibFuzzer. In addition to building the fuzzer
> binary with the toolchain's `fuzzer` feature, it also sets up the test
> execution to pass the corpus as file arguments, a simple mechanism
> for regression testing against the fuzz corpus.
>
> I've included an initial fuzzer corpus as well. To run the fuzzer in
> an open-ended fashion and build up a larger corpus:
> ```shell
> mkdir /tmp/new_corpus
> cp lexer/fuzzer_corpus/* /tmp/new_corpus
> ./bazel-bin/lexer/tokenized_buffer_fuzzer /tmp/new_corpus
> ```
>
> You can parallelize the fuzzer by adding `-jobs=N` to run N fuzzing
> jobs. For more details about running fuzzers, see the documentation:
> http://llvm.org/docs/LibFuzzer.html
>
> To minimize and merge any interesting new inputs:
> ```shell
> ./bazel-bin/lexer/tokenized_buffer_fuzzer -merge=1 \
>     lexer/fuzzer_corpus /tmp/new_corpus
> ```

Co-authored-by: Jon Meow <46229924+jonmeow@users.noreply.github.com>
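
The harness that a LibFuzzer-based `cc_fuzz_test` wraps has a well-known shape. As a hedged sketch, with `LexSource` as a hypothetical stand-in for the actual lexer entry point:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical placeholder for the code under test; the real fuzzer
// lexes the input into a tokenized buffer.
static void LexSource(const std::string &source) {
  (void)source;  // ... run the lexer over `source` ...
}

// LibFuzzer's required entry point. LibFuzzer calls this repeatedly
// with mutated inputs; when corpus files are passed as arguments (as
// the macro's test execution does), each file is fed through here
// once, giving regression testing against the corpus.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  LexSource(std::string(reinterpret_cast<const char *>(data), size));
  return 0;  // Returning 0 means the input was handled without error.
}
```

Linking this with `-fsanitize=fuzzer` (typically what a Clang toolchain's `fuzzer` feature enables) produces a standalone binary that can both fuzz and replay corpus files.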
chandlerc added a commit that referenced this pull request Jun 28, 2022
(The commit message is identical to that of the Dec 8, 2020 commit above, including both quoted messages and the Co-authored-by trailer.)