Embedding PDF.JS in CEFSharp

aimcosoftware edited this page May 25, 2023 · 19 revisions

Using PDF.JS as default PDF viewer

PDF.JS is Mozilla's open source PDF renderer, the same one that is used in Firefox. It is available from the Mozilla PDF.js project page.

There are several advantages to using PDF.JS:

  • It has a complete API for manipulating the viewer and the PDF, accessible from JavaScript.
  • It is loaded entirely from local HTML, CSS and JS files, so it can be easily modified.
  • Page navigation from the index is added to the browser history.
  • The last scroll position is restored when revisiting the document.

To use it in CEFSharp:

  • Unzip PDF.JS into a convenient folder for your project
  • Add a sub folder to download PDFs to and read them from
  • Register the folder in a custom scheme handler
  • Add JavaScript to post a message when the viewer has loaded
  • Disable the default PDF extension in CEFSharp

Basic code flow

  • Disable default handling so PDFs trigger a download
  • Use download and request handlers to trigger PDF handling
  • Download PDF to sub folder of the custom scheme
  • Load PDF.JS viewer from custom scheme when download starts
  • Show file progress on PDF.JS viewer while downloading
  • When complete, open PDF using JS from custom scheme
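
The last two steps depend on two events that can arrive in either order: the viewer finishing loading and the download completing. The sketch below models that gate in plain JavaScript. It is only an illustration of the logic, not CefSharp or PDF.JS code; `createPdfGate`, `onViewerLoaded` and `onDownloadComplete` are hypothetical names, and the `opened` guard is an addition to avoid opening twice.

```javascript
// Illustrative model only: these names are hypothetical and are not
// part of CefSharp or PDF.JS.
function createPdfGate(openPdf) {
	let viewerLoaded = false;
	let downloadComplete = false;
	let opened = false;

	// Open once, and only after both events have been seen.
	function tryOpen() {
		if (viewerLoaded && downloadComplete && !opened) {
			opened = true;
			openPdf();
		}
	}

	return {
		onViewerLoaded() { viewerLoaded = true; tryOpen(); },
		onDownloadComplete() { downloadComplete = true; tryOpen(); },
	};
}
```

Whichever event fires second triggers the open, so the host does not need to know which one will be slower.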

Reloading and history

The PDF.JS viewer is loaded with the original URL as its query string, e.g. pdfjs://viewer/web/viewer.html?https://original.com/pdffile.pdf. The viewer itself does not use this query string; it is only there so that the original URL can be recovered on reload or history navigation.

When the browser reloads or navigates through history to the viewer, cancel the navigation and load the original PDF URL instead. This triggers the download again, which in turn loads the viewer.
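
As a rough illustration of how the original URL can be recovered from the viewer URL, here is a small JavaScript sketch. The helper name `originalUrlFromViewerUrl` is hypothetical. It assumes the original URL is appended unencoded after the first "?", as in the example above, and it splits on the first "?" so that a source URL with its own query string stays intact.

```javascript
// Hypothetical helper: returns the original PDF URL carried after the
// first "?" of a viewer URL, or null if there is none.
function originalUrlFromViewerUrl(viewerUrl) {
	const queryStart = viewerUrl.indexOf("?");
	if (queryStart === -1 || queryStart === viewerUrl.length - 1) {
		return null;
	}
	// Take everything after the FIRST "?", so a source URL that has its
	// own query string (for example ...pdf?id=5) is kept whole.
	return viewerUrl.slice(queryStart + 1);
}

console.log(originalUrlFromViewerUrl(
	"pdfjs://viewer/web/viewer.html?https://original.com/pdffile.pdf"
)); // prints "https://original.com/pdffile.pdf"
```

The same first-"?" rule applies on the host side when the viewer URL is turned back into the source URL.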

Browser address

As the browser will report the custom scheme URL as its address, you may need to provide your own address property that returns the original PDF URL instead, e.g.

Readonly Property Address() As String
	Get
		Return If(Browser.Address.StartsWith("pdfjs"), PDFSourceUrl, Browser.Address)
	End Get
End Property

Download PDF.JS and copy it to a convenient folder

Example folder structure with cache folder added for PDF downloads:

build         pdf.js code
cache         add folder for PDFs
web           pdf.js viewer 

Create custom scheme handler for root folder

Friend PDFEnabled As Boolean
Friend PDFCachePath As String

Friend Sub RegisterPDFJS(ByRef Settings As CefSettings, RootPath As String)

	'Disable default handling - PDFs will be downloaded
	Settings.CefCommandLineArgs.Add("disable-pdf-extension")

	Dim Scheme As New CefCustomScheme
	Scheme.SchemeName = "pdfjs"
	Scheme.DomainName = "viewer"
	Scheme.SchemeHandlerFactory = New SchemeHandler.FolderSchemeHandlerFactory(RootPath)
	Settings.RegisterScheme(Scheme)
	PDFCachePath = $"{RootPath}\cache\"
	PDFEnabled = True
End Sub

Add Javascript to post message when viewer loaded

addEventListener("webviewerloaded", function(){
	CefSharp.PostMessage("ViewerLoaded")
})

You could instead do this in Browser.LoadingStateChanged, but the viewer initializes asynchronously, so it may not be ready yet when the page has finished loading.

Private Sub Browser_LoadingStateChanged(Sender As Object, e As LoadingStateChangedEventArgs) Handles Browser.LoadingStateChanged
	If Not e.IsLoading Then
		PDFViewer = True
		PDFOpenInViewer()
	End If
End Sub
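
The viewer-side listener can also be written a little more defensively. The sketch below wraps the `CefSharp.PostMessage` call from above in a hypothetical `notifyHost` helper that first checks whether the `CefSharp` object exists, on the assumption that you may sometimes open viewer.html outside CefSharp (for example in a normal browser while modifying the viewer).

```javascript
// Hypothetical helper: post a message to the host if it is available.
// The typeof guards are an addition for this sketch; they keep the page
// usable where the CefSharp object does not exist.
function notifyHost(message) {
	if (typeof CefSharp !== "undefined" && typeof CefSharp.PostMessage === "function") {
		CefSharp.PostMessage(message);
		return true;
	}
	return false;
}

// Guarded so the snippet can also be loaded outside a browser page.
if (typeof addEventListener === "function") {
	addEventListener("webviewerloaded", function () {
		notifyHost("ViewerLoaded");
	});
}
```

The return value is only there to make the helper easy to check; the host still receives the same "ViewerLoaded" message.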

Add download and request handlers

'Set download and request handlers
Browser.DownloadHandler = Me
Browser.RequestHandler = Me

' Locals to control PDF download and viewing
Private PDFSourceUrl As String = ""
Private PDFCacheUrl As String = ""
Private PDFSavePath As String = ""
Private PDFLoading As Boolean
Private PDFComplete As Boolean
Private PDFViewer As Boolean

Private Sub OnBeforeDownload(chromiumWebBrowser As IWebBrowser, browser As IBrowser, downloadItem As DownloadItem, callback As IBeforeDownloadCallback) Implements IDownloadHandler.OnBeforeDownload
	If PDFBeforeDownload(downloadItem) Then
		callback.Continue(PDFSavePath, False)
	End If
End Sub

Private Sub OnDownloadUpdated(chromiumWebBrowser As IWebBrowser, browser As IBrowser, downloadItem As DownloadItem, callback As IDownloadItemCallback) Implements IDownloadHandler.OnDownloadUpdated
	PDFDownloadUpdated(downloadItem)
End Sub

Private Sub Browser_JavascriptMessageReceived(sender As Object, e As JavascriptMessageReceivedEventArgs) Handles Browser.JavascriptMessageReceived
	If e.Message = "ViewerLoaded" Then
		' Handle loaded message from viewer
		PDFViewer = True
		PDFOpenInViewer()
	End If
End Sub

Handle history navigation or reload of PDF

' Redirect to the original PDF url on history navigation or reload
Function OnBeforeBrowse(WebBrowser As IWebBrowser, browser As IBrowser, frame As IFrame, request As IRequest, userGesture As Boolean, isRedirect As Boolean) As Boolean Implements IRequestHandler.OnBeforeBrowse
	' Returning True cancels the navigation to the viewer when it is a reload
	Return PDFIsReload(request.Url)
End Function

Helper functions to handle PDFs

' Check and navigate to viewer if needed
Private Function PDFBeforeDownload(Item As DownloadItem) As Boolean
	If PDFIsDownload(Item) Then
		PDFComplete = False
		PDFViewer = False
		PDFLoading = True
		PDFSourceUrl = Item.Url
		PDFSavePath = $"{PDFCachePath}{Item.SuggestedFileName}"
		PDFCacheUrl = $"pdfjs://viewer/cache/{Item.SuggestedFileName}"

		' Navigate viewer with original url used to reload when needed
		Browser.LoadUrl($"pdfjs://viewer/web/viewer.html?{Item.Url}")
		Return True
	Else
		Return False
	End If
End Function

' Update viewer progress or open when complete
Private Function PDFDownloadUpdated(Item As DownloadItem) As Boolean
	If PDFIsDownload(Item) Then
		If Item.IsComplete Then
			PDFComplete = True
			PDFOpenInViewer()
		ElseIf PDFViewer Then
			Browser.ExecuteScriptAsync($"PDFViewerApplication.loadingBar.percent={Item.PercentComplete}")
		End If
		Return True
	Else
		Return False
	End If
End Function

' If viewer ready and PDF is complete, open it
Private Sub PDFOpenInViewer()
	If PDFViewer AndAlso PDFComplete Then
		Browser.ExecuteScriptAsync($"PDFViewerApplication.open({{url:'{PDFCacheUrl}'}})")
	End If
End Sub

' Is the download a PDF?
Private Function PDFIsDownload(Item As DownloadItem) As Boolean
	Return PDFEnabled AndAlso Item.MimeType = "application/pdf"
End Function

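' Is the navigation a reload of PDF viewer?
Private Function PDFIsReload(Url As String) As Boolean
	If Not PDFLoading AndAlso Url.StartsWith("pdfjs") Then
		' Trigger a download of the source url: everything after the first "?",
		' so a source url with its own query string is kept whole
		Browser.LoadUrl(Url.Substring(Url.IndexOf("?"c) + 1))
		Return True
	Else
		' Navigation started by PDFBeforeDownload, or a normal page
		PDFLoading = False
		Return False
	End If
End Function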
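
The progress update in PDFDownloadUpdated writes the percentage straight into `PDFViewerApplication.loadingBar.percent`. If you would rather validate the value on the viewer side, a helper along these lines could be added to the viewer page and called from the host script instead. `setDownloadProgress` is a hypothetical name, and the handling of negative values assumes the download item may report one while the total size is unknown, which is worth verifying against your CefSharp version.

```javascript
// Hypothetical viewer-side helper. It is passed the viewer application
// object so it can be exercised without a real PDF.JS page.
function setDownloadProgress(app, percent) {
	const value = Number(percent);
	// Skip values that are not usable as a percentage, such as a negative
	// number reported while the total size is unknown (an assumption).
	if (!Number.isFinite(value) || value < 0) {
		return false;
	}
	// Cap at 100 and round to a whole number before updating the bar.
	app.loadingBar.percent = Math.min(100, Math.round(value));
	return true;
}
```

In the viewer page this would be called as `setDownloadProgress(PDFViewerApplication, 42)`.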

Other things you may want to implement

  • Handle IsCancelled and IsValid properties in DownloadItem
  • Keep reference to callback handler to cancel download externally
  • Implement stop command to cancel download of the PDF

These depend on your implementation and on how you want the PDF.JS viewer to behave when the download fails or is cancelled.