<?xml version='1.0' encoding='UTF-8' ?><!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'><html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'><head> <title>Pretty Diff - The difference tool</title> <meta name="robots" content="index, follow"/> <meta name="DC.title" content="Pretty Diff - The difference tool"/> <link rel='canonical' href='http://prettydiff.com/documentation.php' type='application/xhtml+xml' /> <link rel="icon" type="image/x-icon" href="http://prettydiff.com/images/favicon.ico"/> <link rel="meta" href="http://prettydiff.com/labels.rdf" type="application/rdf+xml" title="ICRA labels"/> <meta http-equiv="pics-Label" content='(pics-1.1 "http://www.icra.org/pics/vocabularyv03/" l gen true for "http://prettydiff.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 1) gen true for "http://www.prettydiff.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 1))'/> <meta name="author" content="Austin Cheney"/> <meta name="description" content="Pretty Diff tool can minify, beautify (pretty-print), or diff between minified and beautified code. 
This tool can even beautify and minify HTML."/> <meta name="distribution" content="Global"/> <meta http-equiv="Content-Language" content="en"/> <meta http-equiv="Content-Type" content="application/xhtml+xml;charset=UTF-8"/> <meta http-equiv="Page-Enter" content="blendTrans(Duration=0)"/> <meta http-equiv="Page-Exit" content="blendTrans(Duration=0)"/> <meta http-equiv="content-style-type" content="text/css"/> <meta http-equiv="content-script-type" content="application/javascript"/> <meta name="google-site-verification" content="qL8AV9yjL2-ZFGV9ey6wU3t7pTZdpD4lIetUSiNen7E" /> <link rel="stylesheet" type="text/css" href="diffview.css" media="all"/> </head><body id="doc" class="canvas"><h1><a href="http://prettydiff.com/">Pretty Diff</a> - Documentation</h1><p>Find <a href="https://github.com/austincheney/Pretty-Diff">Pretty Diff on GitHub</a>.</p><p><strong>Since this tool is client-side JavaScript only, meaning that execution occurs on the local computer only and resulting output is not transmitted or stored, it is safe for processing classified information.</strong></p><div><h2>Known Issues</h2><!--p>If you encounter any problems please <a href="bug.php">submit a bug</a>.</p--><ol><li>In the difference report the character sequences "$#lt;" and "$#gt;" may appear as "%#lt;" and "%#gt;" respectively if those sequences are interrupted by the start or end of a difference.</li><li>Server-side parsing tags, such as PHP or ASP, are mutilated during markup beautification and markup minification only if placed inside segments of JavaScript or CSS.</li><li>Markup beautification will output flawed data if less-than characters, "<", are included in sections of content and not escaped. This error occurs regardless of whether the less-than characters embedded within content are quoted.</li></ol></div><div><h2>Web Tool URI Parameters</h2><div><ol><li><strong>s</strong> - This parameter receives a URI as a value that points to a code source. 
If the value of this URI contains ampersand characters, &, or question mark characters, ?, please escape the ampersand characters as "%26" and the question mark characters as "%3F". The tool executes this code automatically on page load for beautify and minify modes, but only for diff mode if a source is provided with the <em>d</em> parameter.</li><li><strong>d</strong> - This parameter receives a URI as a value that points to a difference code source. If the value of this URI contains ampersand characters, &, or question mark characters, ?, please escape the ampersand characters as "%26" and the question mark characters as "%3F".</li><li><strong>m</strong> - This parameter receives a value of <em>beautify</em>, <em>minify</em>, or <em>diff</em>. This parameter sets the mode of the tool.</li><li><strong>l</strong> - This parameter receives a value of <em>markup</em>, <em>html</em>, <em>auto</em>, <em>javascript</em>, <em>js</em>, <em>css</em>, <em>csv</em>, or <em>text</em>. This bypasses all other language settings and determinations, thereby forcing the language to the value supplied to this parameter. The value of "html" is identical to the value "markup" except that it forces the option "Presume SGML type HTML" for all modes while "markup" unsets this option. The values "javascript" and "js" are treated equally.</li><li><strong>c</strong> - This parameter receives the name of a supported color scheme.</li></ol> <p>The parameters are optional and are provided solely for portability. The parameters may occur in any order. 
Examples:</p> <ol><li><a href="http://prettydiff.com/?l=html&s=http://google.com/&m=beautify">http://prettydiff.com/?l=html&s=http://google.com/&m=beautify</a></li><li><a href="http://prettydiff.com/?s=http://www.amazon.com/Definitive-XML-Schema-Priscilla-Walmsley/dp/0130655678/ref=sr_1_1%3Fie=UTF8%26qid=1312890971%26sr=8-1&html&m=beautify">http://prettydiff.com/?s=http://www.amazon.com/Definitive-XML-Schema-Priscilla-Walmsley/dp/0130655678/ref=sr_1_1%3Fie=UTF8%26qid=1312890971%26sr=8-1&html&m=beautify</a></li></ol></div></div><div><h2>Pretty Diff Function</h2><div><h3>Overview</h3><p>Pretty Diff is an application written entirely in JavaScript and expressed as a single function named 'prettydiff()'. This application was originally written as a means to algorithmically difference two similar pieces of code regardless of minification. The result is a fast difference engine offering many options that allows access to the world's most advanced markup beautification algorithm.</p><p>While the Pretty Diff application is expressed as a single function it is actually a collection of several components, many of which were started by different authors for unrelated projects. Every one of those components that began externally has been heavily modified for increased processing efficiency, expanded functionality, or JSLint compliance. As a result each of those components must be considered a fork of the originally maintained code that is unique to Pretty Diff.</p><p>The Pretty Diff application is completely isolated from DOM methods with only one exception. This means the prettydiff() function receives all data from its one argument and returns output. At this time the <a href="charDecoder.js">charDecoder.js</a> code is entirely reliant upon DOM access to allow an external application to transform character entity references into character literals. 
The only solution to allow similar functionality without DOM access is to include a character map with the charDecoder function. There are no plans, at this time, to write a JavaScript formulation of the Unicode character map.</p><p>The Pretty Diff function receives input from a single argument that is an object literal containing named properties as discussed in detail a few sections down. If the input is valid Pretty Diff will return an array of two values. The first value is the beautified source code if in "Beautify" mode, the minified source code if in "Minify" mode, or an HTML formatted diff report if in "Pretty Diff" mode. The second output value is an HTML formatted analysis report of the submitted code.</p></div><div><h3>Practices of Pretty Diff</h3><p>Pretty Diff hopes to encourage conventions of efficiency, but not at the cost of recursion, regression testing, or altered functionality. For example consider the situation of minifying markup. In a typical scenario the practice of code minification is to remove all code comments and all white space characters not absolutely necessary for syntax interpretation. If the most impactful form of minification is exercised upon markup the functionality of the code is certainly changed. Consider these two examples:</p><ol><li><p>This is a paragraph with a text field. <input type="text"/></p></li><li><p>This is a paragraph with a text field.<input type="text"/></p></li></ol><p>The difference between the two examples above is a single space character between the period and the input tag. In markup white space characters are tokenized when the code is parsed to output by default, and rarely is this default challenged. Tokenized white space means each white space character is converted to a space character and then sequential space characters are converted to a single space character. This means the presence of some white space characters is completely trivial, while the presence of others is not. 
A single space character separating words of content is not trivial if it is an isolated space. In the above sample the difference of a space separating the input tag from the content is also not trivial since it alters how tokenized content is interpreted.</p><p>If markup code were fully minified then all white space characters outside of syntax containers, such as tags, would be removed, thereby making the content illegible. Markup can still be correctly minified, but only when rendering of tokenized white space is fully considered. The opposite of this problem is accidental addition of white space characters from flawed beautification schemes. Consider the following two examples:</p><ol><li><p>This is a statement with a <a href="#">hyperlink</a>.</p></li><li><p><span>This is a statement with a<span><a href="#"><span>hyperlink</span></a></span>.</span></p></li></ol><p>The difference between the two prior examples is that the second example introduces white space tokens where they do not exist in the first example. In the first example there are no characters between the opening <p> tag and the text or the opening <a> and the text while this is not true of the second example. Therefore these two statements are not similar enough for a logical comparison. A well crafted beautifier, or pretty printer, will take these differences into account so far as to alter the entirety of a code base for easier reading but not at the cost of manipulating how the code is parsed by a given interpreter.</p><p>Code must never be minified if it cannot be automatically recovered into an easily readable form and must never be beautified if such beautification changes how the code is parsed. This is the importance of regression. The most extreme form of minification is referred to as obfuscation. 
Obfuscation removes all code comments and all white space characters not absolutely required for syntax compliance, but goes one step further and changes all variable and command names to the shortest available character length. Pretty Diff considers the practice of obfuscation to be harmful as its practice eliminates the possibility of regression. Without the possibility of regression, recursive practices are improbable.</p><p>An instantiation of a pattern where the pattern's presence is available in the given instance without regard for multiplication is said to be idempotent. A recursive practice is the ability to replicate an action where the replication does not harm the potential of further replication, or the idempotent nature of a pattern, upon or resulting from that action. In the case of Pretty Diff, code that begins unminified should be capable of being minified, beautified, minified again, and so on without harm or difference to the functional integrity of the supplied code. Any process that prevents such recursive practices, such as obfuscation, is harmful and must be avoided.</p><p>Comments are the regression exception. There is no way to efficiently reduce code while retaining comments and documentation. Pretty Diff <strong>strongly</strong> recommends that documentation be separated from production code into either a redundant development version or into a separated documentation archive so that it can be preserved apart from the production code.</p><p>There is one extremely limited exception to functional interference observed by Pretty Diff. The Cascading Style Sheets language provides a syntax and vocabulary that are limited and fully known. Therefore functional changes can be, and are, supplied to CSS code during minification because in this one narrow instance there is no harm to regression. 
Superior minification can be performed by supplying minor functional changes to the code which can be easily and intelligently reversed without error or prior knowledge of the code sample.</p></div><div><h3>Properties of Pretty Diff's sole function argument</h3><p><strong>Please note:</strong> As of 19 June 2011 the Pretty Diff application no longer accepts formal function arguments. It only accepts a single argument, named <em>api</em>, which is an object literal that stores the application options as the following object properties.</p><ol><li>source - The string value of this property is code that is to be beautified, minified, or represents the base text in a diff operation. Any string data is accepted.</li><li>diff - The string value of this property stores the data that is to be differenced against the value of the source property in a diff operation. Any string is accepted, but this property is only used if the mode property is set to <em>"diff"</em>.</li><li>mode - This property determines the operation to be performed. Acceptable values are: <strong>"beautify"</strong>, <strong>"minify"</strong>, and <strong>"diff"</strong>.</li><li>lang - This property tells the diff program which language it is receiving. Acceptable values are: <strong>"auto"</strong>, <strong>"javascript"</strong>, <strong>"css"</strong>, <strong>"csv"</strong>, <strong>"markup"</strong>, and <strong>"text"</strong>. The value "auto" allows the application to determine between CSS, JavaScript, and Markup without human effort.</li><li>csvchar - This property stores the string value used as a data separator for the <em>"csv"</em> language. Any string is accepted, but if the value of the lang property is not set to <em>"csv"</em> the value of this property is ignored.</li><li>insize - This property stores the character length of an indentation. An integer or string containing an integer is accepted. 
If anything else is supplied this property defaults to a value of <strong>"4"</strong>.</li><li>inchar - This property stores the character literal used for an indentation. Any string is accepted. A single indentation is the result of this value repeated <em>insize</em> times.</li><li>comments - This property determines whether comments should be indented. This property is only used if the mode property is set to <em>"beautify"</em>. Acceptable values are: <strong>indent</strong> and <strong>noindent</strong>.</li><li>indent - This property sets the style of indentation during JavaScript beautification. The default value <strong>"knr"</strong> sets a JSLint compliant beautification scheme and the other value <strong>"allman"</strong> puts opening curly braces on their own line.</li><li>style - This property determines whether CSS and JavaScript code should be indented according to the surrounding markup or if they should be indented starting from 0. This property is only used if the lang property is set to <em>"markup"</em> and the mode property is set to <em>"beautify"</em> or <em>"diff"</em>. Acceptable values are <strong>"indent"</strong> and <strong>"noindent"</strong>.</li><li>html - This property determines if the application must presume markup input to be an SGML form of the HTML language. This property is only used if the lang property is set to <em>"markup"</em>. Acceptable values are <strong>"html-no"</strong> and <strong>"html-yes"</strong>.</li><li>context - The value of this property stores a number or empty string so as to trigger the <em>Context Size</em> diff option. This property is only used if the mode property is set to <em>"diff"</em>.</li><li>content - This property determines if string literals in JavaScript and content in markup should be normalized to a value of <em>text</em> prior to a diff operation. This property is only used if the mode property is set to <em>"diff"</em>. 
Acceptable values are boolean <strong>false</strong> and <strong>true</strong>.</li><li>quote - This property determines if single quote characters should be converted to double quote characters prior to a diff operation. This property is only used if the mode property is set to <em>"diff"</em>. Acceptable values are boolean <strong>false</strong> and <strong>true</strong>.</li><li>semicolon - This property determines if semicolon characters at the end of a line should be removed prior to a diff operation. This property is only used if the mode property is set to <em>"diff"</em>. Acceptable values are boolean <strong>false</strong> and <strong>true</strong>.</li><li>diffview - This property determines if the diff report should be expressed as a side-by-side comparison or a single inline view. This property is only used if the mode property is set to <em>"diff"</em>. Acceptable values are: <strong>"sidebyside"</strong> and <strong>"inline"</strong>.</li><li>sourcelabel - The value of this property sets a label describing the value of the source property in the diff report. Any string is accepted.</li><li>difflabel - The value of this property sets a label describing the value of the diff property. Any string is accepted.</li><li>topcoms - This property determines whether minification should ignore comments at the top of JavaScript or CSS input before any code. This property is only used if the mode property is set to <em>"minify"</em>. It accepts values of boolean <strong>"true"</strong> or <strong>"false"</strong>.</li><li>force_indentation - This property determines if every piece of code and content must be indented or if the source should be beautified according to its semantics so that white space tokens are not added or removed. This property accepts boolean values and defaults to <strong>"false"</strong>.</li></ol></div><div><h3>Option String</h3><p>The Pretty Diff option string is similar in convention to the JSLint option string. 
Pretty Diff will only process the first option string it encounters, but it will search the diff code if the option string cannot be found in the source code. In order for the option string to be recognized it must start with <strong>/*prettydiff.com</strong> and end with <strong>*/</strong>. The options are listed in this string as colon separated name value pairs, with each pair separated by a comma. The options match the exact value definition for the Pretty Diff application properties above, and options that allow arbitrary values must have their values enclosed in either single or double quotes. The options can be listed in any order. To prevent possible corruption of the option string it should be separated from other code comments or content. These are examples of appropriate option strings:<ul><li>/*prettydiff.com api.mode:beautify, api.lang:javascript, api.inchar:"abc"*/</li><li><!--/*prettydiff.com api.mode: beautify, api.lang: javascript, api.inchar: "abc", api.html: html-yes */--></li></ul></p></div><div><h3>Input and Output</h3><p>The prettydiff function accepts all code input from JavaScript variables supplied to the function's source property and diff property. The code supplied to the diff property is used only during diff operations and is otherwise ignored.</p><p>The function outputs an array of two indexes. The first array index is always the processed data and the second array index contains some metadata. In the case of the "beautify" and "minify" operations the first index of the output array is the processed source code as text and the second array index is the code report, as seen generated on the client side tool, formatted as HTML. The diff operation returns an HTML table of the actual diff output as the first array index and some minor metadata about the number of errors in the second array index, with both indexes formatted as HTML and neither comprising a complete HTML document. 
To form the diff output of the prettydiff function into a single HTML document I typically supply the following extra code:</p><ol><li>output = prettydiff(...);</li><li>heading = '<?xml version="1.0" encoding="UTF-8" ?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head><title>Pretty Diff</title><link rel="canonical" href="http://prettydiff.com/" type="application/xhtml+xml"/><meta http-equiv="Content-Type" content="application/xhtml+xml;charset=UTF-8"/><meta name="robots" content="index, follow"/><meta name="DC.title" content="Pretty Diff - The difference tool"/><link rel="icon" type="image/x-icon" href="http://prettydiff.com/images/favicon.ico"/><link rel="meta" href="http://prettydiff.com/labels.rdf" type="application/rdf+xml" title="ICRA labels"/><meta http-equiv="pics-Label" content=\'(pics-1.1 "http://www.icra.org/pics/vocabularyv03/" l gen true for "http://prettydiff.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 1) gen true for "http://www.prettydiff.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 1))\'/><meta name="author" content="Austin Cheney"/><meta name="description" content="Pretty Diff tool can minify, beautify, or diff between minified and beautified code. This tool can even beautify and minify HTML."/><meta name="distribution" content="Global"/><meta http-equiv="Page-Enter" content="blendTrans(Duration=0)"/><meta http-equiv="Page-Exit" content="blendTrans(Duration=0)"/><meta http-equiv="content-style-type" content="text/css"/><meta http-equiv="content-script-type" content="text/javascript"/><meta name="google-site-verification" content="qL8AV9yjL2-ZFGV9ey6wU3t7pTZdpD4lIetUSiNen7E"/><link rel="stylesheet" type="text/css" href="http://prettydiff.com/diffview.css" media="all"/></head><body><h1>Pretty Diff - The difference tool</h1><span class="clear"></span><div id="diffoutput">';</li><li>return heading + 
output[1] + "</div>" + output[0] + "</body></html>";</li></ol></div><div><h3>Executing with <a href="http://nodejs.org/">Node.js</a></h3><p>Node.js is a command line run time operating on *nix type environments. Node.js execution could be the result of a one-time execution or part of a scripted automation process. To execute the prettydiff function with Node.js the following code needs to be added after the prettydiff function. Nothing else needs to occur for compatibility or integration with Node.js.</p><p><code>if(typeof exports!=="undefined"){exports.api=function(x){"use strict";return prettydiff(x);};}</code></p></div><div><h3>Executing with <a href="http://msdn.microsoft.com/en-us/library/9bbdkx3k%28v=VS.85%29.aspx">Windows Script Host</a> (WSH)</h3><p>Windows Script Host allows for a JavaScript run time in Windows environments from a command line with output directly returned to the command line or in a debugger window. To execute JavaScript with WSH a file is needed to supply the JavaScript function call, to pass arguments from the command line in a JavaScript compatible format, and to request dependencies.</p><p>Example code for operating the markup_beauty.js application is supplied in a recently written <a href="wsh.wsf">wsh.wsf</a> file. A WSF file follows basic XML syntax and may allow multiple operations of different languages to execute in tandem so long as each operation is confined to a <em>job</em> tag. 
The <em>named</em> elements in the example file are used to intercept arguments supplied via command line.</p><p>The example file would be operated for HTML compatibility using the following command: <code>wsh.wsf /source:"my_source_file" /html:true</code></p><p>Writing the output of a WSH task into a file would require an additional ActiveX instruction in the wsh.wsf file or would require the automation of script execution in the context of the PowerShell language.</p></div></div><div><h2>Beautification</h2><div><h3>JavaScript</h3><p>JavaScript beautification uses the <a href="js-beautify.js">js-beautify</a> function. This was originally written by Einars Lielmanis. I have modified this code for more strict conformance to JSLint. This code is used in accordance with licensing provided in documentation embedded within the original code.</p><p>JavaScript is beautified in accordance with conventions expressed by the JSLint beautification scheme. This essentially means that spaces are added after commas, object contents are indented, and a new line is added after each statement termination and object closing.</p><h4>Summary</h4><p>An unassigned anonymous function creates the summary report and assigns its output to a variable named <em>summary</em>. Variable <em>summary</em> is not provided a scope by js-beautify.js, so it must be provided a scope by a consuming function or it will become an implied global, or an undeclared variable error in strict mode. It is intended to be provided to js-beautify as a closure so that it can access the interior of js-beautify and be accessed outside of js-beautify.</p></div><div><h3>CSS</h3><p>CSS beautification uses the <a href="cleanCSS.js">cleanCSS</a> function originally written by Anthony Lieuallen. I have modified it for better conformity with JSLint and also for some minor beautification tweaks and customized indentation. I can no longer remember any specific errors I have found and corrected from modification of this function. 
I am using this code with permission from the original author.</p><p>CSS is beautified so that the contents of each object are indented and a new line is added after termination of each property declaration.</p><h4>CSS Summary</h4><p>An unassigned anonymous function creates the summary report by assigning its output to a variable named <em>summary</em>. Variable <em>summary</em> is not provided a scope by cleanCSS so it must be provided a scope by the consuming application or it will become an implied global, or an undeclared variable error in strict mode. It is intended to be supplied as a closure so that it may access the interior of cleanCSS, but can be accessed outside cleanCSS. This summary reports the number of HTTP requests and what those requests are.</p></div><div><h3>CSV</h3><p>CSV typically stands for <em>comma separated values</em>, but in this tool it stands for character separated values. The <a href="csvbeauty.js">csvbeauty</a> function takes a sequence of characters and splits the input upon that supplied sequence onto new lines. Prior existing line breaks, if they were quoted, are converted to a space contained by braces: <em>{ }</em>. Unquoted line breaks are converted into two consecutive line breaks. If the final character(s) match the user supplied character sequence, after charDecoder processing, then those characters are converted into <em>{|}</em> so that csvmin will know a character sequence must exist at the extreme end of input. Escaped double quote characters, escaped using the formal CSV method by immediately preceding the characters with an extra double quote character, are converted into a single double quote character to improve legibility.</p><p>CSV beautification uses <a href="charDecoder.js">charDecoder</a> to decode Unicode character entities. The charDecoder function accepts any combination of HTML decimal Unicode entities and Unicode hexadecimal entities. 
HTML decimal entities must begin with an ampersand and pound character '&#', be immediately followed by between one and six decimal digits, and be immediately terminated by a semicolon ';'. Examples of accepted HTML entities are:</p><ul><li>&#9;</li><li>&#37;</li><li>&#10279;</li></ul><p>The Unicode hexadecimal entities must begin with a lowercase u and plus character 'u+', be immediately followed by a four or five digit hexadecimal value, and be immediately terminated by a plus character. Hexadecimal values smaller than four digits must be padded with as many 0 characters as necessary to achieve four digits. Examples of accepted Unicode entities are:</p><ul><li>u+0009+</li><li>u+003c+</li><li>u+1037a+</li></ul><p>Please be aware that charDecoder is reliant upon the interpreting application's HTML character rendering engine to map entity values to character maps, which means if the browser does not support the entity supplied the browser will return a generic character marker instead of the intended character. The content will then be separated in accordance with the rendered sequence value, which means a generic character marker will be used in the separation instead of the character referenced by the supplied entity. In summary, if your browser has limited support for Unicode characters you must expect equally limited results when using entity references.</p></div><div><h3>Markup</h3><p>Markup beautification uses the <a href="markup_beauty.js">markup_beauty</a> function of the application. This function operates upon a pattern based logic of referential integrity. This means decisions are made through exposure to the pattern as established so far. Unfortunately, this requires defined logic to consider all possible combinations of patterns. At this point the beautification appears to work for more than 99.9% of pattern combinations, but undefined combinations are continually being discovered. 
If an error in my logic is discovered please <a href="mailto:austin.cheney@us.army.mil">contact me</a> so that I can supply you with a corrected application.</p><p>The markup beautification is based upon syntax conventions only and absolutely not upon vocabulary. The two exceptions are the script tag and the style tag. The contents of a script tag are presumed to be JavaScript if the tag does not contain a type attribute or the type attribute contains one of these values: text/javascript, text/ecmascript, application/javascript, application/x-javascript, application/ecmascript. If the contents of a script tag are presumed to be JavaScript they are beautified accordingly. The contents of a style tag are presumed to be CSS if the tag contains a type attribute with a value of text/css or if the type attribute is not present, and those contents are beautified as CSS. The presumed CSS and JavaScript do not inherit indentation from the markup. Since the beautification is not based upon vocabulary any language that uses angle brackets for delimiters should work assuming the conditions of the next paragraph are met. The supplied markup does not have to be valid or well formed by any means.</p><p>Content in the markup is represented by whether or not it begins or ends with any whitespace. If content does begin and/or end with whitespace then new line characters are added and the content is indented. This means tags that butt up directly against content are treated as an extension of that content and are indented as such. Singleton tags are expected to be terminated as XML singleton tags, which means a forward slash character prior to the closing angle bracket. If a singleton tag is not properly closed the beautifier believes the tag to be a start tag, which expects an end tag. 
Singleton tags may represent an indication of content in the form of media or form controls, and so they are indented in the same manner as content.</p><p>PHP tags are expected to open with "<em><?php</em>" and XML parsing declarations are expected to open with "<em><?xml</em>". Tags that begin with only "<em><?</em>" are not supported, and so they are believed to be start tags missing a closing tag. This convention is no longer supported by PHP, even if tolerated, and will generate errors in an XML parser. I don't support this and neither should you.</p><p>Start tags expect to receive an end tag. End tags will be indented exactly like their starting pair unless they are directly next to content and the same is true for start tags. The beautification logic is smart enough to compensate and correct itself in adjustment for start tags or end tags that are not indented due to content.</p><p>The <a href="markup_beauty.js">markup_beauty</a> function also supports nested tags. Some server side processing languages, such as JSTL, use an XML based tag syntax for application processing and allow the direct embedding of HTML and XML tags. The following tag is an example of something that can be beautified: <em><c:out value="<strong>variable text output</strong>"/></em>. The only limitation is that the nested tags must be quoted in either double or single quotes.</p><p>On 12 April 2011 support was added for complex nested SGML tags. This support is limited to the points of conflict with the terse syntax requirements of XML. In the case of such conflicts markup_beauty chooses to comply with XML syntax in opposition to SGML flexibility. This is of particular concern to tags that use the "<em><?</em>" syntax.</p><h4>Markup Summary</h4><p>The markup_beauty function contains, at its end, an unassigned anonymous function that creates the summary report and assigns its output to a variable named <em>summary</em>. 
The <em>summary</em> variable is not declared within markup_beauty, because it is meant to be available to markup_beauty through closure. The consuming application must declare this variable or it will become an implied global, or throw an undeclared variable error in strict mode.</p><p>The markup_summary creates a report of the number of parts comprising the markup, the weight of each of those parts, and a score using a math formula to compute a performance rating that reinforces reliance upon structure and elaboration of content. This function also displays each HTML element making an HTTP request.</p></div><div><h3>Beautification Options</h3><p>The <em>Indentation Size</em> is merely a multiplier of the specified indentation character. Each component of the total application independently computes indentation by multiplying the indentation size by the indentation character, so that the components may be used without the others and with minimal or no modification.</p><p>The <em>Indentation Characters</em> option allows a choice of space, tab, new line, or a string of text characters from the HTML tool. The code will literally allow anything that can be expressed as a character or series of characters.</p><p>The <em>Indent Comments</em> option determines if comments should be indented in accordance with the neighboring code or if comments should not be indented at all.</p><p>The <em>Indent Style/Script</em> option is only available for markup beautification. This option determines if the contents of style or script should be indented to match the indentation of their parent tag or if their indentation should begin from the left.</p><p>The <em>Presume SGML type HTML</em> option is only available for the markup type of code. 
When this option is enabled all tag names that are "presumed" as singleton tags in HTML 4.01 are treated as singletons regardless of their syntax.</p></div></div><div><h2>Minification</h2><div><h3>JavaScript</h3><p>A heavily modified <a href="fulljsmin.js">JSMin</a> is used to perform minification of both JavaScript and CSS. This code was originally written in C by Douglas Crockford and was converted to JavaScript by Franck Marcia. I am using this code in accordance with the licensing expressed in the JavaScript form of this code from Mr. Marcia. This code is modified to recognize differences in requirements for JavaScript and CSS, to conform better to the JSLint tool, and to include a semicolon insertion mechanism for sloppy JavaScript. This function is also enhanced so that it will always minify code down to a single line, which makes the optional semicolon insertion mechanism necessary.</p><p>This custom fork of JSMin contains an automated semicolon insertion (ASI) mechanism. This mechanism is applied in the Pretty Diff tool by default, but only for minification operations and only for the JavaScript language. To apply ASI using the custom fork of JSMin directly the following arguments must evaluate as follows:</p><ul><li>type !== "css"</li> <li>alter === true</li> <li>level === 2</li></ul></div><div><h3>CSS</h3><p>CSS uses the same modified <a href="fulljsmin.js">JSMin</a> application described for JavaScript. The minification is largely identical except that "-", ".", and "\" are recognized as string characters and not operators or comments. The "$" and "/" characters are removed from this list. Some extra whitespace is inserted to preserve naming conventions that do not exist in JavaScript.</p></div><div><h3>CSV</h3><p>CSV typically stands for <em>comma separated values</em>, but in this tool it stands for character separated values. 
The <a href="csvmin.js">csvmin</a> function reverts all changes made by the <a href="csvbeauty.js">csvbeauty</a> function.</p><p>CSV beautification uses <a href="charDecoder.js">charDecoder</a> to decode Unicode character entities. The charDecoder function accepts any combination of HTML decimal Unicode entities and Unicode hexadecimal entities. HTML decimal entities must begin with an ampersand and pound character '&amp;#', be immediately followed by between one and six decimal digits, and be immediately terminated by a semicolon ';'. Examples of accepted HTML entities are:</p><ul><li>&amp;#9;</li><li>&amp;#37;</li><li>&amp;#10279;</li></ul><p>The Unicode hexadecimal entities must begin with a lowercase u and plus character 'u+', be immediately followed by a four or five digit hexadecimal value, and be immediately terminated by a plus character. Hexadecimal values shorter than four digits must be padded with as many leading 0 characters as necessary to achieve four digits. Examples of accepted Unicode entities are:</p><ul><li>u+0009+</li><li>u+003c+</li><li>u+1037a+</li></ul><p>Please be aware that charDecoder relies upon the interpreting application's HTML character rendering engine to map entity values to characters, which means if the browser does not support the entity supplied the browser will return a generic character marker instead of the intended character. The content will then be separated according to the rendered sequence value, which means a generic character marker will be used in the separation instead of the character referenced by the supplied entity. In summary, if your browser has limited support for Unicode characters you must expect equally limited results when using entity references. csvmin does not revert any changes supplied by the charDecoder function.</p></div><div><h3>Markup</h3><p>Markup is minified using <a href="markupmin.js">markupmin</a>, a function that I wrote recently. 
This function does little more than collapse each run of whitespace characters into a single space character and scrub comments. It does, however, preserve whitespace inside ASP and PHP tags and preserve SSI tags. It also assumes the contents of a script tag are JavaScript and minifies them accordingly, and assumes the contents of style tags are CSS and minifies them as such.</p></div><div><h3>Minification Options</h3> <p>The <em>Presume SGML type HTML</em> option is only available for the markup type of code. When this option is enabled all tag names that are "presumed" as singleton tags in HTML 4.01 are treated as singletons regardless of their syntax.</p></div></div><div><h2>Pretty Diff</h2><div><h3>Diff code</h3><p>The diff engine uses three separate functions: <a href="difflib.js">difflib</a>, <a href="diffview.js">diffview</a>, and <a href="charcomp.js">charcomp</a>. The first two components are originally by Snowtide Informatics Systems and I wrote the third. difflib is altered in order to achieve stricter JSLint compliance, but is otherwise not significantly altered. diffview is almost entirely rewritten from scratch so that JavaScript arrays are used to store the dynamic output instead of DOM objects. This change has resulted in a 3.5x faster response rate. charcomp is the function used to highlight per character differences.</p></div><div><h3>Diff process</h3><p>JavaScript code is first minified using <a href="fulljsmin.js">JSMin</a> and then beautified using <a href="js-beautify.js">js-beautify</a>. This prevents differences in comments or whitespace from interfering with the analysis of the code. It also allows beautified code to be flawlessly compared with minified code. CSV is first minified with <a href="csvmin.js">csvmin</a> and then beautified with <a href="csvbeauty.js">csvbeauty</a>. 
CSS is first minified using <a href="fulljsmin.js">JSMin</a> for CSS and then beautified using <a href="cleanCSS.js">cleanCSS.js</a> for the same reasons mentioned for JavaScript. Markup is first minified using <a href="markupmin.js">markupmin</a> and then beautified using <a href="markup_beauty.js">markup_beauty</a>. Plain text is diffed without any minification or beautification. If code that needs to be compared is not compatible with the other processes then use the plain text mode.</p><p>Only after the automated beautification does the diff process begin. difflib finds differences per line and sends its results, as an array of numeric values indicating where to look in the code, over to diffview. diffview takes the opcodes supplied by difflib and then builds an array where the code is pumped into HTML table cell code. Once the view is completely built it is immediately inserted into the page using innerHTML. You cannot see the output at this point because its display is set to none until charcomp finishes.</p><p>charcomp finds the table cells with a class of "replace" and only works on those cells. Before performing any comparison it converts non-breaking space references into actual spaces to reduce processing requirements, converts angle brackets into entity references, and converts entity references for quotes into actual quotes. In JavaScript a single quote compares to true against a double quote even if both are string literals, so I invented character references that I could convert back to quotes within the context of the comparison function so that they actually do become string literals that are comparable. Once a difference is located it is wrapped in an em tag.</p></div><div><h3>Diff options</h3><p>The <em>Print or Save Output</em> option dumps the HTML output as raw text into a textarea. Please copy the text out of this new textarea at the bottom of the page and paste it into a new text file to be saved with the file extension of 'html'. 
If the output is to be printed please use landscape instead of portrait orientation for the paper in order to achieve the best results.</p><p>The <em>Context Size</em> option pads the lines of code that differ with surrounding lines of code that do not. This option expects to receive a number or empty value only. If anything else is entered an empty value will be processed. An empty value negates this option by returning all supplied code with differences highlighted. If no differences are discovered this option is negated.</p><p><em>JavaScript Toleration</em> is exactly the same as described in Minification Options above. <em>Indentation Size</em>, <em>Indentation Characters</em>, <em>Indent Style/Script</em>, and <em>Presume SGML type HTML</em> are exactly the same as described in Beautification Options above.</p><p>The <em>Diff View Type</em> option provides two choices. The first choice, Side by Side View, reports the output in two columns that display a side by side comparison of the differences. The second choice, Inline View, displays the output in a single column so that the differences can be seen in a vertical comparison.</p><p><em>Diff Quotes</em> is an option to normalize all single quote characters, ', to double quote characters, ". This option is off by default and it runs after the input code is beautified and before the diff operation executes.</p><p><em>Trailing Semicolons</em> is an option to remove semicolon characters just before a new line character. 
This option is off by default and it runs after the input code is beautified and before the diff operation executes.</p></div></div><div><h2>External Resources</h2><div><h3>Validation</h3><ul><li><a href="http://jslint.com/">JSLint</a> is recommended for JavaScript as the most complete source of automated JavaScript validation.</li><li><a href="http://jshint.com/">JSHint</a> is a fork of JSLint.</li><li><a href="http://validator.w3.org/">W3C Markup Validation Service</a> is recommended as the most complete validation source for all variants of (X)HTML.</li><li><a href="http://jigsaw.w3.org/css-validator/">W3C CSS Validation Service</a> is recommended as the most accurate CSS validation application.</li></ul></div><div><h3>Documentation</h3><ul><li><a href="http://naturaldocs.org/">Natural Docs</a> is a tool for automatically grabbing, archiving, sorting, and formatting code documentation.</li><li><a href="http://openjsan.org/">JSAN</a> is an online archive of JavaScript application modules and documentation schemes.</li></ul></div><div><h3>Editors</h3><ul><li><a href="http://www.miscutil.com/apps/xmleditor/">aws XML Editor</a> is an XML tree viewer and editor written in JavaScript and proxy for conversion to JSON.</li><li><a href="http://notepad-plus-plus.org/">Notepad++</a> is an open source code editor that is lightweight and supports syntax highlighting for more than 30 languages.</li></ul></div></div><div><h2>Fragmented Components</h2><table summary="components, authors, and dates of modification"><caption>A list of code components, author information, and dates of revision.</caption><thead><tr><th>Component</th><th>Author(s)</th><th>Summary</th><th>Revised</th></tr></thead><tbody><tr><td><a href="diffview.css">diffview.css</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>The CSS that powers everything to do with the form, diff output, and this documentation.</td><td>28 Jun 12</td></tr><tr><td><a 
href="lib/diffview.js">diffview.js</a></td><td><a href="http://snowtide.com/">Snowtide Informatics</a><span>revised by <a href="http://prettydiff.com/">Austin Cheney</a></span></td><td>Builds the HTML diff output.</td><td>28 Jun 12</td></tr><tr><td>documentation.php</td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>This documentation page.</td><td>28 May 12</td></tr><tr><td><a href="lib/fulljsmin.js">fulljsmin.js</a></td><td><span>Original - <a href="http://www.crockford.com/">Douglas Crockford</a></span><span>JavaScript adaptation - <a href="http://fmarcia.info/jsmin/test.html">Franck Marcia</a></span><span>revised by <a href="http://prettydiff.com/">Austin Cheney</a></span></td><td>Minifies JavaScript code and CSS code.</td><td>28 Jun 12</td></tr><tr><td><a href="http://prettydiff.com/">index.php</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>Actual Pretty Diff tool HTML file.</td><td>28 Jun 12</td></tr><tr><td><a href="pd.js">pd.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>A supplemental JavaScript file providing DOM access and interaction with the web tool.</td><td>28 Jun 12</td></tr><tr><td><a href="prettydiff.js">prettydiff.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>Actual Pretty Diff application code.</td><td>28 Jun 12</td></tr><tr><td><a href="lib/js-beautify.js">js-beautify.js</a></td><td><a href="http://jsbeautifier.org/">Einars Lielmanis</a><span>Revised by <a href="http://prettydiff.com/">Austin Cheney</a></span></td><td>Beautifies JavaScript code.</td><td>14 Jun 12</td></tr><tr><td><a href="lib/cleanCSS.js">cleanCSS.js</a></td><td><a href="http://arantius.com/">Anthony Lieuallen</a><span>revised by <a href="http://prettydiff.com/">Austin Cheney</a></span></td><td>Beautifies CSS code.</td><td>13 Jun 12</td></tr><tr><td><a href="lib/csvmin.js">csvmin.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>The function that minifies 
character sequence values.</td><td>19 May 12</td></tr><tr><td><a href="lib/markup_beauty.js">markup_beauty.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>Beautifies markup code.</td><td>19 May 12</td></tr><tr><td><a href="lib/csvbeauty.js">csvbeauty.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>The function that beautifies character sequence values.</td><td>7 May 12</td></tr><tr><td><a href="lib/markupmin.js">markupmin.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>Minifies markup code.</td><td>7 May 12</td></tr><tr><td><a href="lib/charDecoder.js">charDecoder.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>The function that decodes Unicode character entities for <a href="csvbeauty.js">csvbeauty.js</a> and <a href="csvmin.js">csvmin.js</a>.</td><td>11 Feb 12</td></tr><tr><td><a href="api/api.js">api.js</a></td><td><a href="http://prettydiff.com/">Austin Cheney</a></td><td>The code for processing the server side API for Node.js. This file is no longer maintained.</td><td>19 Jun 11</td></tr></tbody></table><p>* I only claim to be a revision author where I completely rewrote or extended functional output, as opposed to merely reorganizing the original logic of the code for JSLint compliance.</p><p>Please send comments, feedback, and requests to <a href="mailto:austin.cheney@us.army.mil">austin.cheney@us.army.mil</a>.</p></div><script type="application/javascript">var _gaq=_gaq||[];_gaq.push(["_setAccount","UA-27834630-1"]);_gaq.push(["_trackPageview"]);(function(){var ga=document.createElement("script"),s=document.getElementsByTagName("script")[0];ga.setAttribute("type",s.getAttribute("type"));ga.setAttribute("async",true);ga.setAttribute("src",("https:"===document.location.protocol?"https://ssl":"http://www")+".google-analytics.com/ga.js");s.parentNode.insertBefore(ga,s);}());</script></body></html>