Support for bookmarks and fields (references, forms) in docx reader #6781

Open
mjfs opened this issue Oct 26, 2020 · 2 comments

mjfs commented Oct 26, 2020

There is currently no support for converting bookmarks or fields (references, forms) from docx to markdown. The primary use case is embedding invisible metadata in specific docx segments, which would enable bidirectional collaboration between docx and markdown users.

Docx's invisible metadata options are more limited than markdown's and include:
a) bookmarks
b) custom properties
c) fields (references, forms - including XML bound ones)

Since bookmark creation is already implemented for conversion from markdown to docx (e.g. [This is a bookmark in docx]{#bookmark1}), it perhaps makes most sense for that to also be the first candidate for reader implementation. In addition, bookmarks also exist in other formats.

A combination of bookmarks and custom properties (e.g. as already discussed in #3034) would enable lossless conversion between docx and markdown. Anything not supported by docx could be stored as custom properties linked to bookmark locations, which would allow an almost perfect reverse conversion without the source markdown document.

@tstenner

I've created a test document with custom properties, a bookmark and a link to the bookmark:

test.docx

The docProps/custom.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="2" name="test_bool">
    <vt:bool>1</vt:bool>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="3" name="test_date">
    <vt:filetime>2021-04-15T00:00:00Z</vt:filetime>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="4" name="test_datetime">
    <vt:filetime>2021-04-30T02:04:05Z</vt:filetime>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="5" name="test_duration">
    <vt:lpwstr>P1Y2M3DT4H5M6.000000007S</vt:lpwstr>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="6" name="test_number">
    <vt:r8>2</vt:r8>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="7" name="test_string">
    <vt:lpwstr>teststring</vt:lpwstr>
  </property>
</Properties>
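
To illustrate how a reader might consume this part, here is a minimal Python sketch (not pandoc code) that maps each `<property>` to a name, its `vt:` type, and a converted value. The function names, the shortened sample, and the choice to leave `vt:filetime` values as plain strings are assumptions made for illustration only.

```python
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"

# Shortened copy of the custom.xml shown above (assumption: same structure).
SAMPLE = """<Properties
    xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
    xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="2" name="test_bool">
    <vt:bool>1</vt:bool>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="3" name="test_date">
    <vt:filetime>2021-04-15T00:00:00Z</vt:filetime>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="6" name="test_number">
    <vt:r8>2</vt:r8>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="7" name="test_string">
    <vt:lpwstr>teststring</vt:lpwstr>
  </property>
</Properties>"""


def read_custom_properties(xml_data):
    """Return {name: (vt_type, value)} for every <property> element."""
    root = ET.fromstring(xml_data)
    props = {}
    for prop in root.findall(f"{{{CP_NS}}}property"):
        if len(prop) == 0:
            continue  # no typed value child, nothing to record
        value_el = prop[0]                       # the single vt:* child
        vt_type = value_el.tag.split("}", 1)[1]  # strip the namespace part
        text = value_el.text or ""
        if vt_type == "bool":
            value = text in ("1", "true")
        elif vt_type in ("r4", "r8"):
            value = float(text)
        elif vt_type in ("i1", "i2", "i4", "i8", "int"):
            value = int(text)
        else:
            value = text  # lpwstr, filetime, ... kept as strings here
        props[prop.get("name")] = (vt_type, value)
    return props


def read_docx_custom_properties(path):
    """Read docProps/custom.xml straight out of a .docx (zip) file."""
    with zipfile.ZipFile(path) as docx:
        return read_custom_properties(docx.read("docProps/custom.xml"))


if __name__ == "__main__":
    for name, (vt_type, value) in read_custom_properties(SAMPLE).items():
        print(name, vt_type, value)
```

Keeping the `vt:` type next to the value is deliberate: a writer would need it to round-trip the property back into the docx.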

The xml for a bookmark:

    <w:p>
      <w:r>
        <w:rPr/>
        <w:t xml:space="preserve">This is a </w:t>
      </w:r>
      <w:bookmarkStart w:id="0" w:name="Bookmark_name"/>
      <w:r>
        <w:rPr/>
        <w:t>bookmarked text</w:t>
      </w:r>
      <w:bookmarkEnd w:id="0"/>
    </w:p>
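
As a rough sketch of what the reader side could do with this, the Python below turns the paragraph into the span syntax from the issue description (`[text]{#name}`). It is an illustration, not pandoc's implementation; the function name is hypothetical, and it assumes that `bookmarkStart` and `bookmarkEnd` sit in the same paragraph and that the text needs no markdown escaping.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# The bookmark paragraph from above, with the namespace declared so that it
# parses on its own.
SAMPLE = """<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:r>
    <w:rPr/>
    <w:t xml:space="preserve">This is a </w:t>
  </w:r>
  <w:bookmarkStart w:id="0" w:name="Bookmark_name"/>
  <w:r>
    <w:rPr/>
    <w:t>bookmarked text</w:t>
  </w:r>
  <w:bookmarkEnd w:id="0"/>
</w:p>"""


def bookmarks_to_spans(paragraph):
    """Render one <w:p> as markdown, wrapping bookmarked runs in spans."""
    out = []
    open_names = {}  # bookmark id -> bookmark name, for bookmarks still open
    for child in paragraph:
        if child.tag == f"{{{W}}}bookmarkStart":
            open_names[child.get(f"{{{W}}}id")] = child.get(f"{{{W}}}name")
            out.append("[")
        elif child.tag == f"{{{W}}}bookmarkEnd":
            name = open_names.pop(child.get(f"{{{W}}}id"), None)
            if name is not None:  # ignore ends without a matching start
                out.append("]{#" + name + "}")
        elif child.tag == f"{{{W}}}r":
            out.append("".join(t.text or "" for t in child.iter(f"{{{W}}}t")))
    return "".join(out)


if __name__ == "__main__":
    print(bookmarks_to_spans(ET.fromstring(SAMPLE)))
```

Matching start and end by `w:id` rather than by position matters because the end element carries no name of its own.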

And for the link to the bookmark (implemented as a field):

    <w:p>
      <w:pPr>
        <w:pStyle w:val="Normal"/>
        <w:rPr/>
      </w:pPr>
      <w:r>
        <w:rPr/>
        <w:t xml:space="preserve">This is a link to the bookmark: </w:t>
      </w:r>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="Internetverknpfung"/>
        </w:rPr>
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="Internetverknpfung"/>
        </w:rPr>
        <w:instrText> REF Bookmark_name \h </w:instrText>
      </w:r>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="Internetverknpfung"/>
        </w:rPr>
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="Internetverknpfung"/>
        </w:rPr>
        <w:t>bookmarked text</w:t>
      </w:r>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="Internetverknpfung"/>
        </w:rPr>
        <w:fldChar w:fldCharType="end"/>
      </w:r>
    </w:p>
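
The field is spread over several runs (`begin`, the instruction, `separate`, the cached result, `end`), so a reader has to track state across runs. The Python sketch below shows one possible way to do that and to turn a `REF ... \h` field into an internal markdown link. It is only an illustration: the function names are hypothetical, run properties are dropped from the sample for brevity, nested fields are not handled, and any field other than a hyperlinked `REF` falls back to its cached result text.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# The field paragraph from above without run properties (raw string: "\h").
SAMPLE = r"""<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:r><w:t xml:space="preserve">This is a link to the bookmark: </w:t></w:r>
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText> REF Bookmark_name \h </w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:t>bookmarked text</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>"""


def render_field(instruction, result):
    """Turn 'REF name \\h' into a link; otherwise keep the cached result."""
    tokens = instruction.split()
    if len(tokens) >= 2 and tokens[0] == "REF" and "\\h" in tokens[2:]:
        return f"[{result}](#{tokens[1]})"
    return result


def fields_to_markdown(paragraph):
    """Render one <w:p>, resolving begin/separate/end field sequences."""
    out, instr, result = [], [], []
    state = "text"  # "text" -> "instr" -> "result" -> back to "text"
    for run in paragraph.iter(f"{{{W}}}r"):
        fld = run.find(f"{{{W}}}fldChar")
        if fld is not None:
            kind = fld.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                state, instr, result = "instr", [], []
            elif kind == "separate":
                state = "result"
            elif kind == "end":
                out.append(render_field("".join(instr), "".join(result)))
                state = "text"
            continue
        if state == "instr":
            instr.extend(t.text or "" for t in run.iter(f"{{{W}}}instrText"))
        else:
            text = "".join(t.text or "" for t in run.iter(f"{{{W}}}t"))
            (result if state == "result" else out).append(text)
    return "".join(out)


if __name__ == "__main__":
    print(fields_to_markdown(ET.fromstring(SAMPLE)))
```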

samtuke commented Nov 11, 2022

I'm also missing this. As a heavy user of custom fields for template ODT files, converting them correctly in Pandoc would be very useful. Currently those fields are simply blank, making the converted documents unusable for me.
