Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOCS-127 - Add KSQL doc to CP #472

Merged
merged 3 commits into from
Nov 30, 2017
Merged

Conversation

joel-hamill
Copy link
Contributor

@joel-hamill joel-hamill commented Nov 20, 2017

This PR was originally filed/reviewed here, and then ported over to this repo (we decided against hosting directly on confluentinc/docs)

@joel-hamill joel-hamill changed the title [WIP] DOCS-127 - Add KSQL doc to CP DOCS-127 - Add KSQL doc to CP Nov 20, 2017
@joel-hamill joel-hamill requested review from a team and ewencp November 20, 2017 21:16
Copy link
Contributor

@ewencp ewencp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found a couple of issues.

for the old content, do we actually want to hold onto it this way or just rely on the git history?

docs/index.rst Outdated
ksql-clickstream-demo/index
examples
faq
contributing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't seem to exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yeah. it exists as contributing.md - and should be displayed only in the GH repo.

Example:

.. code:: sql
SELECT * FROM pageviews
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs a newline between the .. code command and its contents. currently it isn't rendering.

**Tip:** If you want to select older data, you can configure KSQL to query the stream from the beginning. You must
run this configuration before running the query:

.. code:: sql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue with newline here

Copy link
Contributor

@ewencp ewencp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't do a deep review of the content, but build structure and based on a quick scan of the rendered docs and build warnings, this LGTM. Should probably have one of the @confluentinc/ksql folks do the longer content/formatting review.

Copy link
Contributor

@miguno miguno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a quick first pass. Would appreciate reviews from other team members, though.


ARG DOCKER_REGISTRY

FROM ${DOCKER_REGISTRY}confluentinc/docker-demo-base:3.3.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version string 3.3.0 is outdated I think? cc @bluemonk3y

Copy link

@bluemonk3y bluemonk3y Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should probably be 3.3.1
4.0.x is not available

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to demo-base:3.3.1

ADD demo/*sh /usr/share/doc/ksql-clickstream-demo/
ADD demo/*sql /usr/share/doc/ksql-clickstream-demo/
ADD demo/*json /usr/share/doc/ksql-clickstream-demo/
ADD demo/connect-config/null-filter-4.0.0-SNAPSHOT.jar /share/java/kafka-connect-elasticsearch/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure about the hardcoded 4.0.0-SNAPSHOT version string? Shouldn't this be sth like ${KSQL_VERSION}? (I don't know, I just don't like hardcoding versions unless it's truly what we want here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should this jar file go to /share/java/kafka-connect-elasticsearch/ or to /usr/share/java/kafka-connect-elasticsearch/?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now - its stuck on version 4.0.0-snapshot
correct dest is: /share/java/kafka-connect-elasticsearch/

Prerequisites
-------------

- :ref:`Confluent 3.3.0 <installation>` locally
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still 3.3.0?

Copy link

@bluemonk3y bluemonk3y Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that can be bumped to 4.0.0

3. Copy the Kafka Connect Elasticsearch configuration file
(``ksql/ksql-clickstream-demo/demo/connect-config/null-filter-4.0.0-SNAPSHOT.jar``)
to your Confluent installation ``share`` directory
(``confluent-3.3.0/share/java/kafka-connect-elasticsearch/``).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.3.0?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes


.. code:: bash

cp ksql-clickstream-demo/demo/connect-config/null-filter-4.0.0-SNAPSHOT.jar <path-to-confluent-3.3.0>/share/java/kafka-connect-elasticsearch/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.3.0? 4.0.0-SNAPSHOT?

If you are running the Confluent Platform, you can stop it with this
command.

::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment on bash highlighting above.

:depth: 1

Prerequisites
- :ref:`Confluent Platform 3.3.0 <installation>` or later is installed. This installation includes a Kafka broker, ZooKeeper, Schema Registry, REST Proxy, and Kafka Connect.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.3.0?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep

====================

This topic describes how to setup a Kafka cluster and start KSQL in your local environment. After you complete these steps,
you can return to the :ref:`create-a-stream-and-table` and start querying the data in the Kafka cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a link to the KSQL Docker instructions? Think: "We also describe how to do this with Docker if you prefer that." Just in case people don't follow the top-level navigation down to here, but find this page via Google Search.

Same backlink for the Docker instructions to this page (think: "If you don't want to use Docker, we also have ...")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doc appears in a table of contents and will show docker as well, a few inches to the left. i think it's best not to add extraneous (repeat) information.

image

Start Kafka
-----------

Navigate to the ``confluent-3.3.0`` directory and start the Confluent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.3.0?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep


.. code:: bash

$ ./bin/confluent start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not consistent IMHO. Sometimes we say ./bin/confluent start, sometimes we say <path-to-confluent-3.3.0>/bin/confluent start. Can we pick one and stick with it? Note that this should be consistent across the docs of all CP components.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be <path-to-confluent>/bin/confluent start.

sent $100 to Bob, then Charlie sent $50 to Bob”. Facts in a stream are
immutable, which means new facts can be inserted to a stream, but
existing facts can never be updated or deleted. Streams can be created
from a Kafka topic or derived from existing streams and tables. In both
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Streams cannot be created from tables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjafarpour so should this portion be deleted or derived from existing streams and tables?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be derived from existing streams.

$ ksql-cli local --properties-file ./ksql.properties

# Start a KSQL server node (for client-server mode) with the custom properties above
$ ksql-server-start --properties-file ./ksql.properties
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No --properties-file flag for ksql-server.

interests array<VARCHAR>, \
contact_info map<VARCHAR, VARCHAR>) \
WITH (kafka_topic='users-topic', \
value_format='JSON');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add key = 'userid' to the WITH section similar to the create stream example.

Copy link

@bluemonk3y bluemonk3y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once version changes are made - lgtm

@joel-hamill
Copy link
Contributor Author

@bluemonk3y are we recommending 3.3.0 for quickstart (#472 (comment)) and 4.0.0 for clickstream (#472 (comment))? seems like we should pick one.

Copyright 2017 Confluent Inc.

CLI v0.1, Server v0.1 located at http://localhost:9098

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version for CLI and server is 0.2 now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed globally

Copyright 2017 Confluent Inc.

CLI v0.1, Server v0.1 located at http://localhost:9098

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, version is 0.2 now.

= Streaming SQL Engine for Kafka =
Copyright 2017 Confluent Inc.

CLI v0.1, Server v0.1 located at http://localhost:9098
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version is 0.2 now.

Copy link
Contributor

@miguno miguno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, thanks @joel-hamill.

docs/README.md Outdated
@@ -1,7 +1,6 @@
# KSQL Documentation

| Overview |[Quick Start](/docs/quickstart#quick-start) | [Concepts](/docs/concepts.md#concepts) | [Syntax Reference](/docs/syntax-reference.md#syntax-reference) |[Demo](/ksql-clickstream-demo#clickstream-analysis) | [Examples](/docs/examples.md#examples) | [FAQ](/docs/faq.md#frequently-asked-questions) | [Roadmap](/docs/roadmap.md#roadmap) |
|---|----|-----|----|----|----|----|----|
The full Confluent Platform documentation is available at [docs.confluent.io](https://docs.confluent.io/current/ksql/docs/index.html).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be clear that this is pointing to the KSQL docs in CP.

I'd suggest sth like:

See the KSQL documentation in Confluent Platform.

docs/README.md Outdated
@@ -26,7 +25,41 @@ You can use KSQL in standalone, client-server, application, and embedded modes.

## Getting Started

* Beginners: Try the [interactive quick start](/docs/quickstart#quick-start). The quick start configures a single instance in a lightweight Docker container or in a Kafka cluster. It demonstrates a simple workflow using KSQL to write streaming queries against data in Kafka.
* Advanced users: Try the [end-to-end Clickstream Analysis demo](/ksql-clickstream-demo#clickstream-analysis).
* Beginners: Try the [interactive quick start](quickstart/index.rst). The quick start configures a single instance in a lightweight Docker container or in a Kafka cluster. It demonstrates a simple workflow using KSQL to write streaming queries against data in Kafka.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I didn't have the time to double-check links etc.

@@ -1,7 +1,6 @@
# KSQL Documentation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta comment: There's a lot of duplicated content for README.md and docs/README.md. I'd suggest that we focus on the top-level README.md. The docs/README.md doesn't need (essentially) duplicate sections on "KSQL overview", "Use cases", "Modes of operaiton", "Getting started", etc.

My suggestion would be that docs/README.md contains only:

  1. Pointer to the (rendered) KSQL docs under docs.confluent.io = what we currently have at the beginning. This is for users of KSQL.
  2. An explanation how to contribute to the KSQL documentation ("hey, we use Sphinx" etc.). This is for contributors of KSQL.

To keep things simple, I'd suggest to keep (1) in this README, but move most of (2) into the new top-level contributing.md file, and only link from docs/README.md to contributing.md. Otherwise docs/README.md is very busy and comes across as if you'd have to build the docs yourself if you fail to see the "Click here to go to docs.confluent.io" section at the top. That is, this README's content should not be 5% of what 99% of users need (where are the docs? in case I ended up in the docs/ folder), and 95% of what 1% of users need (how can I contribute to docs?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, makes sense - i'll make the update.

@@ -70,7 +70,7 @@ Whether you need help, want to contribute, or are just looking for the latest ne
* Join the [Confluent Google group](https://groups.google.com/forum/#!forum/confluent-platform).

# Contributing
Contributions to the code, examples, documentation, etc, are very much appreciated. For more information, see the [contribution guidelines](/docs/contributing.md).
Contributions to the code, examples, documentation, etc, are very much appreciated. For more information, see the [contribution guidelines](contributing.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Typically, the Markdown files are uppercased, i.e. README.md and CONTRIBUTING.md rather than readme.md and contributing.md.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not aware of this convention. the filename is lowercase and the link should match the filename?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, typically the .md filenames are uppercased. For example, our GH README is called README.md, not Readme.md or readme.md. Same for CONTRIBUTING.md vs. contributing.md. Again, a nit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to ignore, Joel.

The KSQL server runs the engine that executes KSQL queries, which
includes the data processing as well as reading data from and writing
data to the target Kafka cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on user feedback, and while we're at it, I'd suggest to add a new paragraph here (as discussed with @bluemonk3y):

As described in the deployment documentation, servers can run in containers, VMs, or on bare-metal machines. You can add and remove multiple servers in the same resource pool to elastically scale in and out the processing of queries, and use different resource pools to achieve isolation of workloads.

There's some overlap here with what we want to cover in the section on deployment modes, but I think we should also mention the info above already here where most people are likely to read it.

Standalone mode
---------------

In stand-alone mode, both the KSQL client and server components are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit consistency: standalone vs. stand-alone

Client-server mode
------------------

In client-server mode, you can run a pool of KSQL servers on remote
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested:

In client-server mode, the KSQL servers are run separately from the KSQL CLI (the client). You can deploy servers on remote machines, VMs, or containers. The CLI then connects to these remote servers.

You can add and remove servers to/from the same resource pool during live operations to elastically scale in and out the processing of queries. You can use different resource pools to achieve isolation of workloads, for example by deploying separate pools for production and for testing and exploration.


$ ./bin/ksql-cli remote http://my-ksql-server:8090

All KSQL servers (and their engines) share the work of processing KSQL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bluemonk3y : We should update this section on how to actually pool the servers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A KSQL server pool is one where all servers share the same 'command' topic. The default command topic is named: 'ksql__commands'. The command topic name can be changed by applying the following system properties to the ksqlserver.properties file.
ksql.service.id = ksql_
ksql.command.topic.suffix=commands

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good draft! Here's my slightly modified one:

KSQL servers that share the same "command" topic belong to the same resource pool. By default, a KSQL server uses the command topic named ksql__commands. To assign a server to a different pool than the default, change the ksql.command.topic.suffix setting in its configuration:

# Default value: `commands`.
#
# Changing this to `production_commands` as shown below will result in
# the command topic named `ksql__production_commands` being used.
ksql.command.topic.suffix = production_commands

Copy link
Contributor

@miguno miguno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated comment on "pools".

WIP

WIP

WIP

WIP

WIP

WIP

WIP

Merge change from PR 440

Review comments from @miguno and @hjafarpour

comments addressed

Change CLI version

Change version 4.0

WIP

WIP

WIP

WIP

WIP

WIP

WIP

WIP

TOC
@joel-hamill joel-hamill changed the base branch from 0.1.x to 4.0.x November 30, 2017 23:19
@joel-hamill joel-hamill merged commit c488125 into 4.0.x Nov 30, 2017
@joel-hamill joel-hamill deleted the joel-hamill/merge-ksql-2-cp branch November 30, 2017 23:21
@miguno
Copy link
Contributor

miguno commented Dec 1, 2017

Why was this merged into 4.0.x? That was a mistake. It means our docs on GH are currently broken.

$ cd ksql/docs/quickstart/
$ docker-compose up -d

After you have successfully started the Kafka cluster and started
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph and the KSQL prompt doesn't belong here. You only see the KSQL ascii art prompt after you ran the docker-compose exec ksql-cli ksql-cli local --bootstrap-server kafka:29092 command in the next section on "Start KSQL".

Please also double-check that the non-docker Quickstart doesn't have the same mistake.

apurvam pushed a commit to apurvam/ksql that referenced this pull request Dec 4, 2017
apurvam pushed a commit to apurvam/ksql that referenced this pull request Dec 4, 2017
@joel-hamill
Copy link
Contributor Author

@miguno i confirmed with @hjafarpour that this should be merged to 4.0.x last week - and that is why it was merged. what exactly is broken?

apurvam added a commit that referenced this pull request Dec 5, 2017
* Revert "Make doc links point to GH repo (#496)"

This reverts commit 0371a08.

* Revert "DOCS-127 - Add KSQL doc to CP (#472)"

This reverts commit c488125.
@joel-hamill joel-hamill restored the joel-hamill/merge-ksql-2-cp branch December 5, 2017 23:31
@xvrl
Copy link
Member

xvrl commented May 29, 2018

looks like the PR broke links in our January blog post https://www.confluent.io/blog/ksql-january-release-streaming-sql-apache-kafka/

@miguno
Copy link
Contributor

miguno commented May 29, 2018

Good catch @xvrl -- I updated the blog post with new links.

@miguno
Copy link
Contributor

miguno commented May 29, 2018

PS: I also checked the Dec 2017, Feb 2018 blog posts - all good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants