Merge branch 'mity:master' into master
pbek authored Feb 17, 2024
2 parents 8c23869 + 3299546 commit c64ee9a
Showing 51 changed files with 5,826 additions and 4,637 deletions.
79 changes: 79 additions & 0 deletions .github/workflows/ci-build.yml
@@ -0,0 +1,79 @@
name: Build and Test

on:
  - pull_request
  - push

jobs:
  # Linux builds.
  #
  # gcc sometimes warns (e.g. about potentially uninitialized variables) only
  # when some optimizations are enabled. So we build Debug as well as Release
  # on Linux. The Debug build also collects and uploads test coverage.
  linux-debug:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Configure
        run: CFLAGS='--coverage -Werror' cmake -DCMAKE_BUILD_TYPE=Debug -G 'Unix Makefiles' .
      - name: Build
        run: make VERBOSE=1
      - name: Test
        run: python3 ./scripts/run-tests.py
      - name: Create coverage report
        run: |
          sudo apt-get install -y lcov
          lcov --directory . --capture --output-file coverage.info
          lcov --remove coverage.info '/usr/*' --output-file coverage.info
          lcov --list coverage.info
      - name: Upload coverage report
        uses: codecov/codecov-action@v3

  linux-release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Configure
        run: CFLAGS='--coverage -Werror' cmake -DCMAKE_BUILD_TYPE=Release -G 'Unix Makefiles' .
      - name: Build
        run: make VERBOSE=1
      - name: Test
        run: python3 ./scripts/run-tests.py

  # Windows builds.
  #
  # We do both 32 and 64-bit builds. Also note 32-bit does Debug build while
  # 64-bit one does Release build. (Full matrix would likely be an overkill.)
  windows-32-debug:
    runs-on: windows-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Dev command prompt
        uses: ilammy/msvc-dev-cmd@v1
        with:
          arch: x86
      - name: Configure
        run: cmake -DCMAKE_BUILD_TYPE=Debug -G "NMake Makefiles" .
      - name: Build
        run: nmake
      - name: Test
        run: python .\scripts\run-tests.py

  windows-64-release:
    runs-on: windows-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Dev command prompt
        uses: ilammy/msvc-dev-cmd@v1
        with:
          arch: x64
      - name: Configure
        run: cmake -DCMAKE_BUILD_TYPE=Release -G "NMake Makefiles" .
      - name: Build
        run: nmake
      - name: Test
        run: python .\scripts\run-tests.py
37 changes: 37 additions & 0 deletions .github/workflows/ci-fuzz.yml
@@ -0,0 +1,37 @@
name: Fuzz Test
on:
  pull_request:
    paths:
      - '**.c'
      - '**.h'
jobs:
  Fuzzing:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - name: Build Fuzzers
        id: build
        uses: google/oss-fuzz/infra/cifuzz/actions/build_fuzzers@master
        with:
          oss-fuzz-project-name: 'md4c'
          language: c
      - name: Run Fuzzers
        uses: google/oss-fuzz/infra/cifuzz/actions/run_fuzzers@master
        with:
          oss-fuzz-project-name: 'md4c'
          fuzz-seconds: 600
          output-sarif: true
      - name: Upload Crash
        uses: actions/upload-artifact@v3
        if: failure() && steps.build.outcome == 'success'
        with:
          name: artifacts
          path: ./out/artifacts
      - name: Upload Sarif
        if: always() && steps.build.outcome == 'success'
        uses: github/codeql-action/upload-sarif@v2
        with:
          # Path to SARIF file relative to the root of the repository
          sarif_file: cifuzz-sarif/results.sarif
          checkout_path: cifuzz-sarif
36 changes: 0 additions & 36 deletions .travis.yml

This file was deleted.

208 changes: 207 additions & 1 deletion CHANGELOG.md
@@ -2,6 +2,212 @@
# MD4C Change Log


## Next Version (Work in Progress)

Fixes:

- [#236](https://github.com/mity/md4c/issues/236):
Fix quadratic time behavior caused by one-by-one walking over block lines
instead of calling `md_lookup_line()`.

- [#238](https://github.com/mity/md4c/issues/238):
Fix quadratic time and output size behavior caused by malicious misuse of
link reference definitions.


## Version 0.5.2

Changes:

* Changes mandated by CommonMark specification 0.31:

- The specification expands the set of Unicode characters treated by the
Markdown parser as punctuation. Namely, all characters in Unicode general
categories P (punctuation) and S (symbols) are now treated as such.

- The definition of HTML comment has been changed so that `<!-->` and
`<!--->` are also recognized as HTML comments.

- The set of HTML tags recognized by the HTML block start condition of type 6
has been updated: the tag `<source>` has been removed, whereas `<search>`
has been added.

Refer to [CommonMark 0.31.2](https://spec.commonmark.org/0.31.2/) for full
specification.
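
The revised HTML comment rule can be illustrated with a small sketch. This is not MD4C's actual code: `html_comment_len` is a hypothetical helper written only for illustration, and it assumes a NUL-terminated input. It follows the CommonMark 0.31.2 wording, where a comment is `<!-->`, `<!--->`, or `<!--` followed by text up to the first `-->`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of MD4C): returns the length of the HTML
 * comment starting at s, or 0 if s does not start with one. */
static size_t html_comment_len(const char *s)
{
    assert(s != NULL);

    if (strncmp(s, "<!--", 4) != 0)
        return 0;

    /* The two short forms recognized since CommonMark 0.31. */
    if (s[4] == '>')
        return 5;                       /* "<!-->"  */
    if (s[4] == '-' && s[5] == '>')
        return 6;                       /* "<!--->" */

    /* General form: everything up to and including the first "-->". */
    const char *end = strstr(s + 4, "-->");
    return end != NULL ? (size_t)(end - s) + 3 : 0;
}
```

With this sketch, `<!-->` yields 5 and `<!--->` yields 6, while an unterminated `<!-- open` yields 0.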

Fixes:

- [#230](https://github.com/mity/md4c/issues/230):
The fix [#223](https://github.com/mity/md4c/issues/223) in 0.5.1 release
was incomplete and one corner case remained unfixed. This is now addressed.

- [#231](https://github.com/mity/md4c/issues/231):
`md2html --full-html` now emits `<meta charset="UTF-8">` in the HTML header.


## Version 0.5.1

Changes:

* LaTeX math extension (`MD_FLAG_LATEXMATHSPANS`) now requires that the opener
mark is not immediately preceded by an alphanumeric character and similarly
that the closer mark is not immediately followed by an alphanumeric character.

So for example `foo$ x + y = z $` is no longer recognized as a LaTeX equation
because there is no space between `foo` and the opening `$`.

* Table extension (`MD_FLAG_TABLES`) now recognizes only tables with no more
than 128 columns. This limit has been imposed to prevent a pathological
case of quadratic output size explosion which could be used as a DoS attack
vector.

* We are now more strict with `MD_FLAG_PERMISSIVExxxAUTOLINKS` family of
extensions with respect to non-alphanumeric characters, with the aim to
mitigate false positive detections.

Only a few selected non-alphanumeric characters are now allowed in permissive
e-mail auto-links (`MD_FLAG_PERMISSIVEEMAILAUTOLINKS`):
- `.`, `-`, `_`, `+` in user name part of e-mail address; and
- `.`, `-`, `_` in host part of the e-mail address.

Similarly for URL and WWW auto-links (`MD_FLAG_PERMISSIVEURLAUTOLINKS` and
`MD_FLAG_PERMISSIVEWWWAUTOLINKS`):
- `.`, `-`, `_` in host part of the URL;
- `/`, `.`, `-`, `_` in path part of the URL;
- `&`, `.`, `-`, `+`, `_`, `=`, `(`, `)` in the query part of the URL
(additionally, if present, `(` and `)` must form balanced pairs); and
- `.`, `-`, `+`, `_` in the fragment part of the URL.

Furthermore these characters (with some exceptions like where they serve as
delimiter characters, e.g. `/` for paths) are generally accepted only when
an alphanumeric character both precedes and follows them (i.e. these cannot
be "stacked" together).
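
The e-mail rules above can be approximated by the following sketch. It is not MD4C's implementation: `part_ok` and `looks_like_permissive_email` are hypothetical names, and the sketch assumes the candidate has already been isolated as a NUL-terminated string. It only demonstrates the two stated constraints: a small per-part set of allowed punctuation, and an alphanumeric character on both sides of each punctuation character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: s[0..len) may contain alphanumerics plus characters
 * from `allowed`; each allowed punctuation character must have an
 * alphanumeric neighbor on both sides, so punctuation cannot be stacked. */
static int part_ok(const char *s, size_t len, const char *allowed)
{
    assert(s != NULL && allowed != NULL);

    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)s[i];
        if (isalnum(ch))
            continue;
        if (ch == '\0' || strchr(allowed, ch) == NULL)
            return 0;                   /* character not in the allowed set */
        if (i == 0 || i + 1 == len)
            return 0;                   /* punctuation at either edge */
        if (!isalnum((unsigned char)s[i - 1]) ||
            !isalnum((unsigned char)s[i + 1]))
            return 0;                   /* stacked punctuation */
    }
    return 1;
}

/* Hypothetical check of a whole candidate: user part, '@', host part. */
static int looks_like_permissive_email(const char *s)
{
    assert(s != NULL);

    const char *at = strchr(s, '@');
    if (at == NULL)
        return 0;
    return part_ok(s, (size_t)(at - s), ".-_+") &&
           part_ok(at + 1, strlen(at + 1), ".-_");
}
```

Under these assumptions `john.doe+tag@example.com` passes, while `john..doe@example.com` (stacked dots) and `john@exa+mple.com` (`+` in the host part) are rejected.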

Fixes:

* Fix several bugs where we haven't properly respected already resolved spans
of higher precedence level in handling of permissive auto-links extensions
(family of `MD_FLAG_PERMISSIVExxxAUTOLINKS` flags), LaTeX math extension
(`MD_FLAG_LATEXMATHSPANS`) and wiki-links extension (`MD_FLAG_WIKILINKS`)
of the form `[[label|text]]` (with pipe `|`). In some complex cases this
could lead to invalid internal parser state and memory corruption.

Identified with [OSS-Fuzz](https://github.com/google/oss-fuzz).

* [#222](https://github.com/mity/md4c/issues/222):
Fix strike-through extension (`MD_FLAG_STRIKETHROUGH`) which did not respect
the same rules for pairing opener and closer marks as other emphasis spans.

* [#223](https://github.com/mity/md4c/issues/223):
Fix incorrect handling of new-line character just at the beginning and/or
end of a code span where we were not following CommonMark specification
requirements correctly.


## Version 0.5.0

Changes:

* Changes mandated by CommonMark specification 0.30.

Actually there are only very minor changes to recognition of HTML blocks:

- The tag `<textarea>` now triggers HTML block (of type 1 as per the
specification).

- HTML declaration (HTML block type 4) is no longer required to begin with an
upper-case ASCII character after the `<!`. Any ASCII character is now
allowed. Also it now doesn't require a whitespace before the closing `>`.

Other than that, the newest specification mainly improves test coverage and
clarifies its wording in some cases, without affecting the implementation.

Refer to [CommonMark 0.30](https://spec.commonmark.org/0.30/) for full
specification.

* Make Unicode-specific code compliant with Unicode 15.1.

* Update list of entities known to the HTML renderer from
https://html.spec.whatwg.org/entities.json.

New Features:

* Add an extension that treats all soft breaks as hard ones. It has to be
explicitly enabled with `MD_FLAG_HARD_SOFT_BREAKS`.

Contributed by [l-m](https://github.com/l1mey112).

* Structure `MD_SPAN_A_DETAIL` now has a new member `is_autolink`.

Contributed by [Jens Alfke](https://github.com/snej).

* `md2html` utility now supports command line options `--html-title` and
`--html-css`.

Contributed by [Andreas Baumann](https://github.com/andreasbaumann).

Fixes:

* [#163](https://github.com/mity/md4c/issues/163):
Make the HTML renderer emit `'\n'` after the root tag when in the XHTML mode.

* [#165](https://github.com/mity/md4c/issues/165):
Make the HTML renderer not percent-encode `'~'` in URLs. Although encoding it
does work, it's not needed, and it can actually be confusing with URLs such as
`http://www.example.com/~johndoe/`.

* [#167](https://github.com/mity/md4c/issues/167),
[#168](https://github.com/mity/md4c/issues/168):
Fix multiple instances of various buffer overflow bugs, found mostly using
fuzz testing. Contributed by [dtldarek](https://github.com/dtldarek) and
[Thierry Coppey](https://github.com/TCKnet).

* [#169](https://github.com/mity/md4c/issues/169):
Table underline no longer requires 3 characters per table column.
One dash (optionally with a leading and/or trailing `:`) is now
sufficient. This improves compatibility with GFM.

* [#172](https://github.com/mity/md4c/issues/172):
Fix quadratic time behavior caused by unnecessary lookup for link reference
definition even if the potential label contains nested brackets.

* [#173](https://github.com/mity/md4c/issues/173),
[#174](https://github.com/mity/md4c/issues/174),
[#212](https://github.com/mity/md4c/issues/212),
[#213](https://github.com/mity/md4c/issues/213):
Multiple bugs identified with [OSS-Fuzz](https://github.com/google/oss-fuzz)
were fixed.

* [#190](https://github.com/mity/md4c/issues/190),
[#200](https://github.com/mity/md4c/issues/200),
[#201](https://github.com/mity/md4c/issues/201):
Multiple fixes of incorrect interactions of indented code block with a
preceding block.

* [#202](https://github.com/mity/md4c/issues/202):
We were not correctly calling `enter_block()` and `leave_block()` callbacks
if multiple HTML blocks followed one after another; instead previously
such blocks were merged into one.

(This may likely impact only applications interested in Markdown's AST,
and not just converting Markdown to other formats like HTML.)

* [#210](https://github.com/mity/md4c/issues/210):
The `md2html` utility now handles nested images with optional titles
correctly.

* [#214](https://github.com/mity/md4c/issues/214):
Tags `<h2>` ... `<h6>` incorrectly did not trigger HTML block.

* [#215](https://github.com/mity/md4c/issues/215):
The parser incorrectly did not accept optional tabs after setext header
underline.

* [#217](https://github.com/mity/md4c/issues/217):
The parser incorrectly resolved emphasis in some situations, if the emphasis
marks were enclosed by punctuation characters.
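
As an illustration of the relaxed table underline rule from #169, the sketch below validates a single underline cell. It is not taken from MD4C: `is_underline_cell` is a hypothetical helper, and it assumes the cell text between two pipes has already been extracted, with optional surrounding spaces or tabs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if cell[0..len) is a valid table underline
 * cell, i.e. optional whitespace, optional ':', one or more '-',
 * optional ':', optional whitespace. */
static int is_underline_cell(const char *cell, size_t len)
{
    assert(cell != NULL);

    size_t i = 0;
    size_t dashes = 0;

    while (i < len && (cell[i] == ' ' || cell[i] == '\t'))
        i++;                            /* leading whitespace */
    if (i < len && cell[i] == ':')
        i++;                            /* optional left alignment mark */
    while (i < len && cell[i] == '-') {
        i++;
        dashes++;
    }
    if (i < len && cell[i] == ':')
        i++;                            /* optional right alignment mark */
    while (i < len && (cell[i] == ' ' || cell[i] == '\t'))
        i++;                            /* trailing whitespace */

    /* A single dash is enough; the whole cell must be consumed. */
    return dashes >= 1 && i == len;
}
```

A cell such as `-` or `:-:` is accepted by this sketch, whereas a lone `:` is not, since at least one dash is required.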


## Version 0.4.8

Fixes:
@@ -11,7 +217,7 @@ Fixes:
the block) could eat 1 line of actual contents.

* [#150](https://github.com/mity/md4c/issues/150):
Fix md2html utility to output proper DOCTYPE and HTML tags when `--full-html`
Fix `md2html` to output proper DOCTYPE and HTML tags when `--full-html`
command line option is used, according to the expected output format
(HTML or XHTML).
