Skip to content

Commit

Permalink
Move handling comments in sequences into SequenceBuilder.
Browse files Browse the repository at this point in the history
I had an idea that it would make sense to consolidate all comment
handling in CommentWriter. But with the forthcoming argument list
support, it looks like there will be three separate styles of comments.
At that point, it probably makes more sense to have the comment handling
for each construct with that construct.

So I moved the code in SequencePiece that incrementally builds it and
the code in CommentWriter that handles sequence comments into a new
SequenceBuilder class.

I think this is overall a cleaner design. I like that SequencePiece is
now immutable.
  • Loading branch information
munificent committed Sep 24, 2023
1 parent 9d4a93f commit 0d5f83c
Show file tree
Hide file tree
Showing 6 changed files with 174 additions and 149 deletions.
23 changes: 11 additions & 12 deletions lib/src/front_end/ast_node_visitor.dart
Original file line number Diff line number Diff line change
Expand Up @@ -6,11 +6,11 @@ import 'package:analyzer/dart/ast/visitor.dart';
import 'package:analyzer/source/line_info.dart';

import '../dart_formatter.dart';
import '../piece/sequence.dart';
import '../source_code.dart';
import 'comment_writer.dart';
import 'piece_factory.dart';
import 'piece_writer.dart';
import 'sequence_builder.dart';

/// Visits every token of the AST and produces a tree of [Piece]s that
/// corresponds to it and contains every token and comment in the original
Expand Down Expand Up @@ -166,33 +166,32 @@ class AstNodeVisitor extends ThrowingAstVisitor<void>

@override
void visitCompilationUnit(CompilationUnit node) {
var sequence = SequencePiece();
var sequence = SequenceBuilder(this);

if (node.scriptTag case var scriptTag?) {
addToSequence(sequence, scriptTag);
sequence.add(scriptTag);
sequence.addBlank();
}

// Put a blank line between the library tag and the other directives.
Iterable<Directive> directives = node.directives;
if (directives.isNotEmpty && directives.first is LibraryDirective) {
addToSequence(sequence, directives.first);
sequence.add(directives.first);
sequence.addBlank();
directives = directives.skip(1);
}

for (var directive in directives) {
addToSequence(sequence, directive);
sequence.add(directive);
}

// TODO(tall): Handle top level declarations.
if (node.declarations.isNotEmpty) throw UnimplementedError();

// Write any comments at the end of the file.
beforeSequenceNode(sequence);
writeCommentsAndBlanksBefore(node.endToken.next!);
sequence.addCommentsBefore(node.endToken.next!);

writer.push(sequence);
writer.push(sequence.build());
}

@override
Expand Down Expand Up @@ -488,15 +487,15 @@ class AstNodeVisitor extends ThrowingAstVisitor<void>

@override
void visitLabeledStatement(LabeledStatement node) {
var sequence = SequencePiece();
var sequence = SequenceBuilder(this);

for (var label in node.labels) {
addToSequence(sequence, label);
sequence.add(label);
}

addToSequence(sequence, node.statement);
sequence.add(node.statement);

writer.push(sequence);
writer.push(sequence.build());
}

@override
Expand Down
136 changes: 42 additions & 94 deletions lib/src/front_end/comment_writer.dart
Original file line number Diff line number Diff line change
Expand Up @@ -7,128 +7,72 @@ import 'package:analyzer/dart/ast/token.dart';
import 'package:analyzer/source/line_info.dart';

import '../comment_type.dart';
import '../piece/sequence.dart';
import 'piece_writer.dart';

/// Functionality used by [AstNodeVisitor] to build text and pieces from the
/// comment tokens between meaningful tokens used by AST nodes.
/// Functionality used by [AstNodeVisitor] and [SequenceBuilder] to build text
/// and pieces from the comment tokens between meaningful tokens used by AST
/// nodes.
///
/// Also handles preserving discretionary blank lines in places where they are
/// allowed. These are handled with comments because both comments and
/// whitespace are found between the linear series of [Token]s produced by the
/// analyzer parser. Likewise, both are output as whitespace (in the sense of
/// not being executable code) interleaved with the [Piece]-building code that
/// walks the actual AST and processes the code tokens.
/// Also handles tracking newlines between tokens and comments so that
/// information can be used to preserve discretionary blank lines in places
/// where they are allowed. These are handled along with comments because both
/// comments and whitespace are found between the linear series of [Token]s
/// produced by the analyzer parser. Likewise, both are output as whitespace
/// (in the sense of not being executable code) interleaved with the
/// [Piece]-building code that walks the actual AST and processes the code
/// tokens.
///
/// Comments are a challenge because they confound the intuitive tree-like
/// structure of the code. A comment can appear between any two tokens, and a
/// line comment can force the formatter to insert a newline in places where
/// one wouldn't otherwise make sense. When that happens, the formatter then
/// has to decide how to indent the next line.
///
/// To deal with that, there are two styles or ways that comments are handled:
/// At the same time, comments appearing in idiomatic locations like between
/// statements should be formatted gracefully and give users control over the
/// blank lines around them. To support all of that, comments are handled in a
/// couple of different ways.
///
/// ### Sequence comments
///
/// Most comments appear around statements in a block, members in a class, or
/// at the top level of a file. At the point the comment appears, the formatter
/// is in the middle of building a [SequencePiece]. For those, [CommentWriter]
/// treats the comments almost like their own statements or members and inserts
/// them into the surrounding sequence as their own separate pieces.
///
/// Sequences already support allowing discretionary blank lines between child
/// pieces, so this lets us use that same functionality to control blank lines
/// between comments as well.
///
/// ### Non-sequence comments
/// Comments between top-level declarations, member declarations inside types,
/// and statements are handled directly by [SequenceBuilder].
///
/// All other comments occur inside the middle of some expression or other
/// construct. These get directly embedded in the [TextPiece] of the code being
/// written. When that [TextPiece] is output later, it will include the comments
/// as well.
// TODO(tall): When argument lists and their comment handling is supported,
// mention that here.
mixin CommentWriter {
PieceWriter get writer;

LineInfo get lineInfo;

/// If the next token written is the first token in a sequence element, this
/// will be that sequence.
SequencePiece? _pendingSequence;

/// Call this before visiting an AST node that will become a piece in a
/// [SequencePiece].
void beforeSequenceNode(SequencePiece sequence) {
_pendingSequence = sequence;
}
/// The tokens whose preceding comments have already been taken by calls to
/// [takeCommentsBefore()].
final Set<Token> _takenTokens = {};

/// Writes comments that appear before [token].
void writeCommentsAndBlanksBefore(Token token) {
if (_pendingSequence case var sequence?) {
_pendingSequence = null;
_writeSequenceComments(sequence, token);
} else {
_writeNonSequenceComments(token);
}
}

/// Writes [comments] to [sequence].
/// Returns the comments that appear before [token].
///
/// This is used when the token is the first token in a node inside a
/// sequence. In that case, any comments that belong on their own line go as
/// separate elements in the sequence. This lets the sequence handle blank
/// lines before and/or after them.
void _writeSequenceComments(SequencePiece sequence, Token token) {
var comments = _collectComments(token);

// Edge case: if we require a blank line, but there exists one between
// some of the comments, or after the last one, then we don't need to
// enforce one before the first comment. Example:
//
// library foo;
// // comment
//
// class Bar {}
//
// Normally, a blank line is required after `library`, but since there is
// one after the comment, we don't need one before it. This is mainly so
// that commented out directives stick with their preceding group.
if (comments.containsBlank) {
sequence.removeBlank();
}

for (var i = 0; i < comments.length; i++) {
var comment = comments[i];
if (sequence.contents.isNotEmpty && comments.isHanging(i)) {
// Attach the comment to the previous token.
writer.space();

writer.writeComment(comment, following: true);
} else {
// Write the comment as its own sequence piece.
writer.writeComment(comment);
if (comments.linesBefore(i) > 1) sequence.addBlank();
sequence.add(writer.pop());
writer.split();
}
}

// Write a blank before the token if there should be one.
if (comments.linesBeforeNextToken > 1) sequence.addBlank();
/// The caller is required to write them because a later call to [token()]
/// for this token will not write the preceding comments.
CommentSequence takeCommentsBefore(Token token) {
if (_takenTokens.contains(token)) return CommentSequence.empty;
_takenTokens.add(token);
return _collectComments(token);
}

/// Writes comments before [token] when [token] is not the first element in
/// a sequence.
///
/// In that case, the comments are directly embedded in the [TextPiece]s for
/// the preceding token and/or [token].
void _writeNonSequenceComments(Token token) {
/// Writes comments that appear before [token].
void writeCommentsBefore(Token token) {
// In the common case where there are no comments before the token, early
// out. This avoids calculating the number of newlines between every pair
// of tokens which is slow and unnecessary.
if (token.precedingComments == null) return;

var comments = _collectComments(token);
// Don't write the comments if some other construct has already handled
// them.
if (_takenTokens.contains(token)) return;

var comments = _collectComments(token);
for (var i = 0; i < comments.length; i++) {
var comment = comments[i];

Expand Down Expand Up @@ -161,7 +105,7 @@ mixin CommentWriter {
// just override the script tag's line.
if (token.previous!.type == TokenType.SCRIPT_TAG) previousLine = tokenLine;

var comments = CommentSequence();
var comments = CommentSequence._([], []);
for (Token? comment = token.precedingComments;
comment != null;
comment = comment.next) {
Expand Down Expand Up @@ -287,13 +231,17 @@ class SourceComment {
/// * 3 newlines between `/* c3 */` and `b`
/// ```
class CommentSequence extends ListBase<SourceComment> {
static const CommentSequence empty = CommentSequence._([0], []);

/// The number of newlines between a pair of comments or the preceding or
/// following tokens.
///
/// This list is always one element longer than [_comments].
final List<int> _linesBetween = [];
final List<int> _linesBetween;

final List<SourceComment> _comments;

final List<SourceComment> _comments = [];
const CommentSequence._(this._linesBetween, this._comments);

/// The number of newlines between the comment at [commentIndex] and the
/// preceding comment or token.
Expand Down
31 changes: 10 additions & 21 deletions lib/src/front_end/piece_factory.dart
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,10 @@ import '../piece/import.dart';
import '../piece/infix.dart';
import '../piece/piece.dart';
import '../piece/postfix.dart';
import '../piece/sequence.dart';
import 'ast_node_visitor.dart';
import 'comment_writer.dart';
import 'piece_writer.dart';
import 'sequence_builder.dart';

/// Record type for a destructured binary operator-like syntactic construct.
typedef BinaryOperation = (AstNode left, Token operator, AstNode right);
Expand All @@ -36,23 +38,9 @@ typedef BinaryOperation = (AstNode left, Token operator, AstNode right);
/// word for "import or export directive" or "named thing with argument list".
/// To avoid that, we pick one concrete construct formatted by the function,
/// usually the most common, and name it after that, as in [createImport()].
mixin PieceFactory {
PieceWriter get writer;

void beforeSequenceNode(SequencePiece sequence);

void writeCommentsAndBlanksBefore(Token token);

mixin PieceFactory implements CommentWriter {
void visit(AstNode? node, {void Function()? before, void Function()? after});

/// Adds [node] to [sequence], handling blank lines around it.
void addToSequence(SequencePiece sequence, AstNode node) {
beforeSequenceNode(sequence);
visit(node);
sequence.add(writer.pop());
writer.split();
}

/// Creates metadata annotations for a directive.
///
/// Always forces the annotations to be on a previous line.
Expand All @@ -76,18 +64,19 @@ mixin PieceFactory {
var leftBracketPiece = writer.pop();
writer.split();

var sequence = SequencePiece();
var sequence = SequenceBuilder(this);
for (var node in nodes) {
addToSequence(sequence, node);
sequence.add(node);
}

// Place any comments before the "}" inside the block.
beforeSequenceNode(sequence);
sequence.addCommentsBefore(rightBracket);

token(rightBracket);
var rightBracketPiece = writer.pop();

writer.push(BlockPiece(leftBracketPiece, sequence, rightBracketPiece,
writer.push(BlockPiece(
leftBracketPiece, sequence.build(), rightBracketPiece,
alwaysSplit: nodes.isNotEmpty));
}

Expand Down Expand Up @@ -255,7 +244,7 @@ mixin PieceFactory {
void token(Token? token, {void Function()? before, void Function()? after}) {
if (token == null) return;

writeCommentsAndBlanksBefore(token);
writeCommentsBefore(token);

if (before != null) before();
writeLexeme(token.lexeme);
Expand Down
Loading

0 comments on commit 0d5f83c

Please sign in to comment.