.. _cos-quickstart:

|cos| Quick Start
=================

This quick start demonstrates how to get up and running with |cos| and its main components. It covers the core capabilities, including creating topics, adding and modifying data, and stream processing by using KSQL. In this quick start you will create Kafka topics and streaming queries on these topics by using KSQL.

This quick start leverages the |cp| CLI, the Kafka CLI, and the KSQL CLI. For a rich UI-based experience, try out the |cpe| :ref:`quick start `. You can also run an `automated version of this quick start `_ designed for |cp| local installs.

.. important::

   .. include:: ../includes/java-reqs.rst
      :start-after: java_snippet

Step 1: Download and Install
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. Go to the `downloads page `_ and choose **Confluent Open Source**.

#. Provide your name and email and select **Download**.

#. Decompress the file. You should have these directories:

   .. include:: ../includes/tarball-contents.rst

#. Start |cp| using the :ref:`Confluent CLI `. This command starts all of the |cp| components, including Kafka, |zk|, |sr|, HTTP REST Proxy for Apache Kafka, Kafka Connect, and KSQL.

   .. tip:: If not already in your PATH, add Confluent's ``bin`` directory by running: ``export PATH=<path-to-confluent>/bin:$PATH``

   .. sourcecode:: bash

      $ <path-to-confluent>/bin/confluent start

   Your output should resemble:

   .. include:: ../includes/cli.rst
      :start-after: COS_CP_CLI_startup_output
      :end-before: COS_CP_CLI_startup_output_end

Step 2: Create Kafka Topics
^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this step Kafka topics are created in |cp| by using the Kafka CLI.

#. Run this command to create a topic named ``users``.

   .. sourcecode:: bash

      $ <path-to-confluent>/bin/kafka-topics --create --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic users

   Your output should resemble:

   .. sourcecode:: bash

      Created topic "users".

#. Run this command to create a topic named ``pageviews``.
   .. sourcecode:: bash

      $ <path-to-confluent>/bin/kafka-topics --create --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic pageviews

   Your output should resemble:

   .. sourcecode:: bash

      Created topic "pageviews".

.. include:: includes/quickstart.rst
   :start-after: qs_step_3
   :end-before: qs_step_3_produce_1_start

#. .. include:: includes/quickstart.rst
      :start-after: qs_step_3_produce_1_start
      :end-before: qs_step_3_produce_1_end

   .. code:: bash

      $ <path-to-confluent>/bin/ksql-datagen quickstart=pageviews format=delimited topic=pageviews maxInterval=100

#. .. include:: includes/quickstart.rst
      :start-after: qs_step_3_produce_2_start
      :end-before: qs_step_3_produce_2_end

   .. code:: bash

      $ <path-to-confluent>/bin/ksql-datagen quickstart=users format=json topic=users maxInterval=1000

Step 4: Create and Write to a Stream and Table using KSQL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this step KSQL queries are run on the ``pageviews`` and ``users`` topics that were created in the previous step. The following KSQL commands are run from the :ref:`KSQL CLI `. Enter these commands in your terminal and press **Enter**.

.. important::

   - |cp| must be :ref:`installed and running `.
   - To try out the |ksql-ui|, see the |cpe| :ref:`quick start `.
   - All KSQL commands must end with a closing semicolon (``;``).

^^^^^^^^^^^^^^^^^^^^^^^^^
Create Streams and Tables
^^^^^^^^^^^^^^^^^^^^^^^^^

#. Start the KSQL CLI in your terminal with this command.

   .. code:: bash

      $ LOG_DIR=./ksql_logs <path-to-confluent>/bin/ksql

   .. include:: ../ksql/docs/includes/ksql-includes.rst
      :start-after: log_limitations_start
      :end-before: log_limitations_mid

#. .. include:: includes/quickstart.rst
      :start-after: create_streams_tables1_start
      :end-before: create_streams_tables1_end

   .. code:: bash

      ksql> CREATE STREAM pageviews (viewtime BIGINT, userid VARCHAR, pageid VARCHAR) \
            WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='DELIMITED');

   **Tip:** Enter the ``SHOW STREAMS;`` command to view your streams. For example:
   .. code:: bash

       Stream Name | Kafka Topic | Format
      -------------------------------------------------
       PAGEVIEWS   | pageviews   | DELIMITED
      -------------------------------------------------

#. .. include:: includes/quickstart.rst
      :start-after: create_streams_tables1_end
      :end-before: create_streams_tables2_end

   .. code:: bash

      ksql> CREATE TABLE users (registertime BIGINT, gender VARCHAR, regionid VARCHAR, \
            userid VARCHAR, interests array<VARCHAR>, contact_info map<VARCHAR, VARCHAR>) \
            WITH (KAFKA_TOPIC='users', VALUE_FORMAT='JSON', KEY = 'userid');

   **Tip:** Enter the ``SHOW TABLES;`` query to view your tables.

   .. code:: bash

       Table Name | Kafka Topic | Format | Windowed
      --------------------------------------------------------------
       USERS      | users       | JSON   | false
      --------------------------------------------------------------

^^^^^^^^^^^^^
Write Queries
^^^^^^^^^^^^^

These examples write queries using KSQL. The following KSQL commands are run from the :ref:`KSQL CLI `. Enter these commands in your terminal and press **Enter**.

#. .. include:: includes/quickstart.rst
      :start-after: custom_query_property_start
      :end-before: custom_query_property_end

   .. code:: bash

      ksql> SET 'auto.offset.reset'='earliest';

   Your output should resemble:

   .. code:: bash

      Successfully changed local property 'auto.offset.reset' from 'null' to 'earliest'

#. .. include:: includes/quickstart.rst
      :start-after: create_query_stream3_start
      :end-before: create_query_stream3_end

   .. code:: bash

      ksql> SELECT pageid FROM pageviews LIMIT 3;

   Your output should resemble:

   .. code:: bash

      Page_45
      Page_38
      Page_11
      LIMIT reached for the partition.
      Query terminated

#. .. include:: includes/quickstart.rst
      :start-after: create_query_stream3_end
      :end-before: create_persist_end

   .. code:: bash

      ksql> CREATE STREAM pageviews_female AS SELECT users.userid AS userid, pageid, \
            regionid, gender FROM pageviews LEFT JOIN users ON pageviews.userid = users.userid \
            WHERE gender = 'FEMALE';

   Your output should resemble:
   .. code:: bash

       Message
      ----------------------------
       Stream created and running
      ----------------------------

#. .. include:: includes/quickstart.rst
      :start-after: create_persist_end
      :end-before: create_persist_regionid_end

   .. code:: bash

      ksql> CREATE STREAM pageviews_female_like_89 WITH (kafka_topic='pageviews_enriched_r8_r9', \
            value_format='DELIMITED') AS SELECT * FROM pageviews_female WHERE regionid LIKE '%_8' \
            OR regionid LIKE '%_9';

   Your output should resemble:

   .. code:: bash

       Message
      ----------------------------
       Stream created and running
      ----------------------------

#. .. include:: includes/quickstart.rst
      :start-after: create_persist_regionid_end
      :end-before: custom_query_property_start

   .. code:: bash

      ksql> CREATE TABLE pageviews_regions AS SELECT gender, regionid, \
            COUNT(*) AS numusers FROM pageviews_female WINDOW TUMBLING (size 30 second) \
            GROUP BY gender, regionid HAVING COUNT(*) > 1;

   Your output should resemble:

   .. code:: bash

       Message
      ---------------------------
       Table created and running
      ---------------------------

^^^^^^^^^^^^^^^^^^^^^^
Monitor Streaming Data
^^^^^^^^^^^^^^^^^^^^^^

Now that your streams are running you can monitor them.

- View the details for your stream or table with the ``DESCRIBE EXTENDED`` command. For example, run this command to view the ``pageviews_female_like_89`` stream:

  .. code:: bash

     DESCRIBE EXTENDED pageviews_female_like_89;

  Your output should look like this:
  .. code:: bash

     Type                 : STREAM
     Key field            : PAGEVIEWS.USERID
     Timestamp field      : Not set - using <ROWTIME>
     Key format           : STRING
     Value format         : DELIMITED
     Kafka output topic   : pageviews_enriched_r8_r9 (partitions: 4, replication: 1)

      Field    | Type
     --------------------------------------
      ROWTIME  | BIGINT           (system)
      ROWKEY   | VARCHAR(STRING)  (system)
      USERID   | VARCHAR(STRING)  (key)
      PAGEID   | VARCHAR(STRING)
      REGIONID | VARCHAR(STRING)
      GENDER   | VARCHAR(STRING)
     --------------------------------------

     Queries that write into this STREAM
     -----------------------------------
     id:CSAS_PAGEVIEWS_FEMALE_LIKE_89 - CREATE STREAM pageviews_female_like_89 WITH (kafka_topic='pageviews_enriched_r8_r9', value_format='DELIMITED') AS SELECT * FROM pageviews_female WHERE regionid LIKE '%_8' OR regionid LIKE '%_9';

     For query topology and execution plan please run: EXPLAIN <Query ID>

     Local runtime statistics
     ------------------------
     messages-per-sec:   2.01     total-messages:   10515     last-message: 3/14/18 2:25:40 PM PDT
     failed-messages:       0     failed-messages-per-sec:  0     last-failed: n/a
     (Statistics of the local KSQL server interaction with the Kafka topic pageviews_enriched_r8_r9)

- Discover the query execution plan with the ``EXPLAIN`` command. For example, run this command to view the query execution plan for ``CTAS_PAGEVIEWS_REGIONS``:

  .. code:: bash

     EXPLAIN CTAS_PAGEVIEWS_REGIONS;

  Your output should look like this:

  .. code:: bash

     Type  : QUERY
     SQL   : CREATE TABLE pageviews_regions AS SELECT gender, regionid, COUNT(*) AS numusers FROM pageviews_female WINDOW TUMBLING (size 30 second) GROUP BY gender, regionid HAVING COUNT(*) > 1;

     Local runtime statistics
     ------------------------
     messages-per-sec:   1.42     total-messages:   13871     last-message: 3/14/18 2:50:02 PM PDT
     failed-messages:       0     failed-messages-per-sec:  0     last-failed: n/a
     (Statistics of the local KSQL server interaction with the Kafka topic PAGEVIEWS_REGIONS)

     Execution plan
     --------------
      > [ PROJECT ] Schema: [GENDER : STRING , REGIONID : STRING , NUMUSERS : INT64].
      > [ FILTER ] Schema: [PAGEVIEWS_FEMALE.GENDER : STRING , PAGEVIEWS_FEMALE.REGIONID : STRING , PAGEVIEWS_FEMALE.ROWTIME : INT64 , KSQL_AGG_VARIABLE_0 : INT64 , KSQL_AGG_VARIABLE_1 : INT64].
      > [ AGGREGATE ] Schema: [PAGEVIEWS_FEMALE.GENDER : STRING , PAGEVIEWS_FEMALE.REGIONID : STRING , PAGEVIEWS_FEMALE.ROWTIME : INT64 , KSQL_AGG_VARIABLE_0 : INT64 , KSQL_AGG_VARIABLE_1 : INT64].
      > [ PROJECT ] Schema: [PAGEVIEWS_FEMALE.GENDER : STRING , PAGEVIEWS_FEMALE.REGIONID : STRING , PAGEVIEWS_FEMALE.ROWTIME : INT64].
      > [ SOURCE ] Schema: [PAGEVIEWS_FEMALE.ROWTIME : INT64 , PAGEVIEWS_FEMALE.ROWKEY : STRING , PAGEVIEWS_FEMALE.USERID : STRING , PAGEVIEWS_FEMALE.PAGEID : STRING , PAGEVIEWS_FEMALE.REGIONID : STRING , PAGEVIEWS_FEMALE.GENDER : STRING].

     Processing topology
     -------------------
     Topologies:
        Sub-topology: 0
         Source: KSTREAM-SOURCE-0000000000 (topics: [PAGEVIEWS_FEMALE])
           --> KSTREAM-MAPVALUES-0000000001
         Processor: KSTREAM-MAPVALUES-0000000001 (stores: [])
           --> KSTREAM-TRANSFORMVALUES-0000000002
           <-- KSTREAM-SOURCE-0000000000
         ...
        Sub-topology: 1
         Source: KSTREAM-SOURCE-0000000008 (topics: [KSQL_Agg_Query_1521052072079-repartition])
           --> KSTREAM-AGGREGATE-0000000005
         Processor: KSTREAM-AGGREGATE-0000000005 (stores: [KSQL_Agg_Query_1521052072079])
           --> KTABLE-FILTER-0000000009
           <-- KSTREAM-SOURCE-0000000008
         ...

For more information about KSQL syntax, see :ref:`ksql_syntax_reference`.

Next Steps
^^^^^^^^^^

.. include:: includes/quickstart.rst
   :start-after: next_steps_start
   :end-before: next_steps_cos_end
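As an aside, the ``pageviews_regions`` table created in this quick start counts events in 30-second tumbling windows. To build intuition for how tumbling windows bucket records, here is a minimal, self-contained Python sketch of that ``GROUP BY ... HAVING COUNT(*) > 1`` logic. It is illustrative only and needs no Kafka or KSQL; the sample events, timestamps, and names are hypothetical, not data produced by the quick start.

```python
from collections import defaultdict

WINDOW_MS = 30_000  # 30 seconds, like WINDOW TUMBLING (size 30 second)

def window_start(ts_ms):
    # A timestamp falls in the tumbling window beginning at the
    # nearest lower multiple of the window size.
    return ts_ms - (ts_ms % WINDOW_MS)

# (window_start, gender, regionid) -> count,
# analogous to GROUP BY gender, regionid evaluated per window.
counts = defaultdict(int)

# Hypothetical enriched pageview events: (event time in ms, gender, regionid)
events = [
    (1_000, "FEMALE", "Region_8"),
    (29_000, "FEMALE", "Region_8"),
    (31_000, "FEMALE", "Region_9"),
    (45_000, "FEMALE", "Region_9"),
]

for ts, gender, regionid in events:
    counts[(window_start(ts), gender, regionid)] += 1

# Keep only groups with more than one event, like HAVING COUNT(*) > 1.
result = {k: v for k, v in counts.items() if v > 1}
print(result)  # {(0, 'FEMALE', 'Region_8'): 2, (30000, 'FEMALE', 'Region_9'): 2}
```

Note that the first two events land in the window starting at 0 ms and the last two in the window starting at 30,000 ms, so each group survives the count filter. KSQL additionally updates these counts continuously as new records arrive, which a one-shot batch loop like this does not capture.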