Integrate Docile.jl and Markdown.jl into Base #8514
Comments
Also pinging @johnmyleswhite, who originally suggested Docile.jl to me as a good starting point. @dmbates has also been looking for this. |
@MichaelHatherly I recommend waiting until I've worked through overhauling Markdown.jl before finalising anything / setting up PRs. There will probably be some technical changes to work out within Docile/Markdown to get the string macros etc. working smoothly. |
Yes, just saw your |
Docile does look quite good. I'm wondering about
For point (2), I feel the system should be as lazy as possible, just populating metadata with strings until interaction and display happen. |
Special syntax would be nice. The Docile doesn't really have any dependencies, it's just harvesting strings and metadata. Lexicon's providing the presentation layer. I agree about the laziness -- I don't have any hard numbers, but when I was parsing docstrings during |
Another thing that needs to be hashed out is what non-standard form of markdown we wish to support. Inline latex, tables, and cross references seem necessary. |
@jakebolewski CommonMark [1, 2] looks reasonably promising. Inline math is a must-have feature -- not sure whether that would be part of the spec though. [1] MichaelHatherly/Docile.jl#33 |
HttpServer now uses Docile (thanks to @astrieanna), which could be another interesting case study: |
That's cool, thanks @astrieanna. Guess I can't make breaking changes now! |
Hah, doesn't stop anyone else ;) |
I think this is the right way to go. I agree with Jeff's point that special syntax would make Docile nicer to work with. |
CommonMark doesn't have any standard for embedded equations; see this discussion. Pandoc's |
My understanding in #3988 was always that there would eventually be a special syntax for this; macros are only for prototyping. |
Was syntax ever agreed upon for this? Something along the lines of:
    doc """
    ...
    """
    function foo(x)
    end
Where Or just without the |
Jeff and I just talked about this today and a bare string literal in void context followed by a definition seems like the way to go. This should be lowered by the parser something like this:
    "`frob(x)` frobs the heck out of `x`."
    function frob(x)
        # commence frobbing
    end
becomes the moral equivalent of this:
    let doc = "`frob(x)` frobs the heck out of `x`."
        if haskey(__DOC__, :frob)
            __DOC__[:frob] *= doc
        else
            __DOC__[:frob] = doc
        end
    end
    function frob(x)
        # commence frobbing
    end
Important points about this approach:
An open issue is how to handle adding methods to functions from other modules. Does the definition go into the current module's |
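For anyone who wants to poke at the concatenation behaviour, here is a minimal runnable sketch of the lowered form. It is written for current Julia, and `DOCS` and `adddoc!` are hypothetical stand-ins for the proposed `__DOC__` table and the parser-generated `let` block, not anything that exists in Base.

```julia
# Hypothetical stand-in for the per-module `__DOC__` table from the proposal.
const DOCS = Dict{Symbol,String}()

# Record `doc` under `name`, concatenating with any existing entry.
# This mirrors the `haskey` / `*=` branch of the lowered form.
function adddoc!(docs::Dict{Symbol,String}, name::Symbol, doc::String)
    docs[name] = haskey(docs, name) ? docs[name] * doc : doc
    return docs
end

adddoc!(DOCS, :frob, "`frob(x)` frobs the heck out of `x`.")
function frob(x)
    # commence frobbing
    return x
end

# A second docstring for another method is appended to the same entry.
adddoc!(DOCS, :frob, "\n`frob(x, y)` frobs both arguments.")
frob(x, y) = (x, y)
```

Because the key is only the name, every docstring attached to any `frob` method ends up in one string, in definition order.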
Just to comment that it's super-exciting to see momentum on this. Looking forward to seeing what emerges. |
Is using If
    "`frob(x)` frobs the heck out of `x`."
    function frob(x)
        # commence frobbing
    end
is instead translated to
    function frob(x)
        # commence frobbing
    end
    let doc = "`frob(x)` frobs the heck out of `x`."
        if haskey(__DOC__, frob)
            __DOC__[frob] *= doc
        else
            __DOC__[frob] = doc
        end
    end
then you could use the For adding docs to methods that are being extended from those in a different module, I'd be in favour of adding them to the current module's |
Stefan's proposal looks good to me, but +1 for either being aware of methods properly or being limited to one docstring per function (as opposed to concatenating each successive docstring regardless). Another way to do this might be something like
    __DOC__[:frob][(Int, String...)] = "`frob(x)` frobs the heck out of `x`."
    function frob(x::Int, ys::String...)
        # ...
i.e. indexing doc strings by type as well as name. Key points in this approach:
(1.i) is my main concern – redefining functions messing up their own docs is something we could probably live with / work around / ignore, but if we can solve this early it will make for a much better interactive experience, I think. |
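To make the signature-indexed idea concrete, here is a small sketch in current Julia, where the tuple-of-types key `(Int, String...)` would be spelled `Tuple{Int,Vararg{String}}`. `SIGDOCS`, `setdoc!` and `finddocs` are hypothetical names chosen for illustration.

```julia
# Hypothetical table: function name => (argument signature => docstring).
const SIGDOCS = Dict{Symbol,Dict{Type,String}}()

# Store `doc` for one method signature of `name`, replacing any earlier entry.
function setdoc!(name::Symbol, sig::Type{<:Tuple}, doc::String)
    get!(SIGDOCS, name, Dict{Type,String}())[sig] = doc
    return nothing
end

setdoc!(:frob, Tuple{Int,Vararg{String}}, "`frob(x, ys...)` frobs `x` with each of `ys`.")
frob(x::Int, ys::String...) = (x, ys)

# Re-evaluating the same definition replaces its docs instead of appending.
setdoc!(:frob, Tuple{Int,Vararg{String}}, "`frob(x, ys...)`: updated docs.")

# Collect the docs of every documented signature that accepts `argtypes`.
finddocs(name::Symbol, argtypes::Type{<:Tuple}) =
    [doc for (sig, doc) in get(SIGDOCS, name, Dict{Type,String}()) if argtypes <: sig]
```

Redefining a method at the REPL then overwrites one entry rather than growing the docstring, which is the interactive case raised in this comment.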
Key problems with this approach:
One possibility would be to make the |
Documentation specific to argument signature is definitely better than concatenation, +1 for documentation being anything with a
    __DOC__[:frob][(Int, String...)] = () -> "`frob(x)` frobs the heck out of `x`."
    function frob(x::Int, ys::String...)
        # ...
would also let us evaluate documentation objects only when they are needed. e.g. |
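A minimal sketch of the thunk idea, assuming current Julia. `LAZYDOCS`, `BUILDS` and `lookupdoc` are hypothetical names, and the counter is there only to show when the documentation is actually built.

```julia
# Hypothetical table mapping a name to a zero-argument function that builds the doc.
const LAZYDOCS = Dict{Symbol,Function}()

# Counts how many times a doc has been built, purely for demonstration.
const BUILDS = Ref(0)

LAZYDOCS[:frob] = () -> begin
    BUILDS[] += 1
    "`frob(x)` frobs the heck out of `x`."
end

# Nothing is evaluated when the entry is stored; the thunk runs only on request.
lookupdoc(name::Symbol) = LAZYDOCS[name]()
```

Storing the closure costs almost nothing at load time, so expensive work such as parsing or reading files can be deferred until someone asks for help.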
I know that this is probably not a popular opinion but I really think we should consider using reStructuredText, at least for the default markup in Base. It supports everything we will want (inline math / code, cross-links, tables, etc.), supports extensions in the standard for functionality we would want to add, and would allow us to reuse all the tooling developed in the Python world (Sphinx, ReadTheDocs, etc.), which imo is the best out there. Otherwise I see us developing yet another superset of Markdown to support our needs, which may or may not be consumable by other tools. I guess if we pick a superset with better tooling support (such as Pandoc Markdown with all the extensions) we might be able to mitigate this problem. |
These are good points. Having an API for this is key, as that will allow even more flexibility than a keyword. For fancy documentation needs, use the API instead of the special syntax. It's probably also true that we'll want to associate docs with particular type signatures. I think associating arbitrary metadata with every docstring is overengineering at this point. Where we are, we can't even ask for help for a simple function in a package. |
ReStructured Text is awful. I wrote most of the original manual and writing it in Markdown was a pleasure. Writing documentation has been a painful chore ever since we switched from Markdown to RST. Having complicated formatting types for documentation is overkill and something that we can consider, if at all, only if there's strong evidence of a real need in practice. I don't think there will be any such need. There should be essentially no choice about documentation – the worst possible situation is one where everyone writes docs in their personal favorite format and there are a dozen of them. There should be one reasonable way to write docs that works well and that everyone is familiar with. What we generate during parsing should be simple and easy for the parser to construct – i.e. just strings – and these strings should look decent if you just show them as is. Markdown fits the bill perfectly – it is already (by design) how people intuitively markup plain text content. |
@shashi, as I've discussed in the abovementioned Docile issue, the plan for typical documentation objects (e.g. Markdown text) is to store only the unparsed string when the file is loaded. Parsing of the AST, generation of HTML, etcetera, is only performed "lazily" when the help is requested in some format. @jakebolewski, the choice of format is orthogonal to this feature if my suggestion is adopted. Markdown documentation would be |
@JeffBezanson, we absolutely have to have some kind of metadata if you want to have any possibility of generating offline documentation, because you can't just have a long list of 3000 functions in Base, sorted alphabetically. At the very least, you have to be able to mark what section and subsection of the manual they should appear in. |
Let's cross that bridge when we get there. |
I am extremely hawkish about load time but I'm not really worried about the slowdown from metadata dicts on docstrings, for reasons that have been discussed already: (1) they're not all that slow, (2) not all docstrings will have them, (3) they can be shared among docstrings. My main concern is getting something simple working first so we can have help and docs for packages ASAP. After that there are concerns about complexity and where various information should be stored, but we can continue to discuss that while enjoying the availability of package help :) |
+1 to having something that works for packages asap. |
Yes, +1 to having something vaguely like what's been discussed in this thread soon. I'm happy to adjust Docile to match whatever makes it into Base so that 0.3 packages can have documentation too. |
I agree that we should get something asap, with the caveat that major flaws and disagreements should be things that are resolvable later without much breakage. Adding documentation metadata is something that can be done later without breakage, because most docstrings won't have metadata so we will want an optional syntax anyway. Changing |
Regarding the |
@StefanKarpinski, note that we'll need a string macro anyway in order to easily use LaTeX equations in Markdown (otherwise you have to backslash like crazy). |
That would be true if we couldn't change the parser ;-) |
I would prefer format-agnostic documentation (requiring only writemime). I don’t think “getting something out fast” is affected by which of these we choose. Making something work for special Julia Markdown strings only vs. an equivalent MarkdownString type doesn’t seem like a big difference as far as implementation effort. Forcing everyone to use the same format seems unfortunate. I agree with having a strong default (i.e. shipping and using only one format in base), but choosing not to support any other format is actively preventing anyone from ever using a different format. There is always some dissent about formats, and if someone strongly prefers rst for their project (for the toolchain, or whatever), then there’s no reason to actively prevent them from doing so. An example of using different types of documentation in one package: some documentation might be in a separate file, so those functions would just like to refer to the file path & have the file actually read lazily. This could be accomplished with a different type (FileDocString or whatever) that behaves appropriately. Allowing user-defined documentation formats would also allow users to define their own extensions to Julia Markdown -- and try them out without forcing them on anyone else or needing to modify the Julia parser. |
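As a sketch of what format-agnostic documentation objects could look like: the thread talks about `writemime`, which later Julia versions replaced with three-argument `show(io, mime, x)`, so that is what is used here. `AbstractDoc`, `PlainDoc`, `FileDoc` and `content` are hypothetical names for illustration only.

```julia
# Hypothetical supertype for anything that can act as documentation.
abstract type AbstractDoc end

# Documentation held directly as a string.
struct PlainDoc <: AbstractDoc
    text::String
end

# Documentation that lives in a separate file and is read only on display.
struct FileDoc <: AbstractDoc
    path::String
end

content(d::PlainDoc) = d.text
content(d::FileDoc) = read(d.path, String)   # deferred until someone asks

# The only requirement on a documentation type: know how to show itself.
Base.show(io::IO, ::MIME"text/plain", d::AbstractDoc) = print(io, content(d))
```

A package preferring another markup would add its own subtype and `show` methods for the MIME types it can produce, without any change to the parser.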
FWIW I'm in violent agreement with @stevengj w.r.t allowing whatever system we end up with to store arbitrary metadata, not just strings. My impression is that the clojure community (e.g.) has benefited tremendously from this and built some really cool stuff (core.typed anyone?) on top of it, and it seems uncharacteristically restrictive (for what I see as the "Julian" attitude about this sort of thing) to not allow it. |
What @porterjamesj said! Just learned about how Clojure does this: http://en.wikibooks.org/wiki/Learning_Clojure/Meta_Data - very neat! IMO a good implementation would make documentation a special case of a general mechanism to attach metadata to certain kinds of objects. (at least under the hood while providing sufficient syntactic sugar.) ref: #3988 |
which, unless I'm mistaken, is exactly what @stevengj has been arguing for. |
I like the idea of having "..." / """...""" be Julia's default Markdown, whatever flavor that is, so we and our tools don't have to think very hard about how to deal with basic comments. I'd also like to see provision, even if just a placeholder for now, to add flexible metadata. Although most docs right now are either plain text or rich text, there are plenty of areas where a picture or equation would really help, and with tools like IJulia and Juno we already have much of the infrastructure required to serve rich help. |
Note also that if we support attaching an arbitrary "documentation" object with output via
(Where, as I mentioned above, we probably need an optional |
FWIW, I find @stevengj's suggestion really compelling. It seems much easier to make an initial pass that's very vague about what "should" go in a MetaDoc object and flesh it out, than to take a stricter rule about strings and later replace it with MetaDoc objects. |
I'll just note as a minor point that using some kind of clue, like |
We already have a concept for creating new syntactic elements: macros and string macros. Having different rules for I'd argue that two extra letters to type for Markdown parsing isn't a big problem. If you use Markdown for formatting your documentation, you'll probably have a multiline doc, and two characters seem like a small annoyance. While I agree that it is poor style to mix different documentation formats in a single file, it might sometimes be useful: that way you can gradually change format in a file without having to fix all the issues at once. Design discussions in Julia have usually not been won by the argument "someone is going to use this feature to write horrible unreadable code". |
I have to disagree with this. Firstly, a lot of docstrings are likely to look like "`push!(object, x)`: Append x to the object." i.e. not multiline. That said, it's not really about the two character overhead. The fact is that most people will use the most convenient documentation form available, so defaulting to plain docstrings amounts to endorsing them. I'm all for supporting richer formats ( |
This is a good idea. One of the problems with Python docstrings is that they are plain text, and you can't get people to use anything else unless it's endorsed by the language implementation. TIMTOWTDI leads to everyone using the lowest common denominator, i.e. plain text. Unambiguously going with one default markup language in Julia makes it better. Markdown is a good choice, especially as IJulia is the de facto "more than plaintext" display environment for Julia. |
Putting myself in the loop to make sure Lint can check through doc string correctly. |
I think the problem with Python docstrings is rather that there is no standard way of specifying the format. That means that when you aggregate documentation from docstrings, you have to guess the format, and computers are bad at guessing, so the feature is little used. @one-more-minute Maybe that is a valid case, but if I want to save characters to type, I'd rather not have to repeat the signature inside the docstring, but have it automatically captured from the actual signature on the next line. |
By the way, another reason to support (a) plain-text strings and (b) non-literal documentation strings is importing help from other languages. e.g. in PyPlot I define various functions which are wrappers around Python functions, and I want their help to be automatically imported from the Python docstring (which is plain text). If we have a
    const bar_py = pyplot["bar"]
    doc convert(String, bar_py["__doc__"])
    function bar(...)
    end
Note also that if you make the
    doc "*foo*" foo(x) = ...
    # versus:
    const foodoc = "*foo*"
    doc foodoc foo(x) = ...
Whereas if you interpret |
Just checking in here to see if we have something usable to start with. Are we still waiting on |
Markdown.jl is already in that other PR (which is good to go as far as I'm concerned, though I'm happy to make any changes if I've missed anything of course). |
Oh yeah we can totally close this |
Some good discussion started here.
This is to more formally track integrating the necessary parts into Base, since some good consensus seems to be building.
@one-more-minute
@MichaelHatherly