Forecast Data Trends with Confluent Platform for Apache Flink

Forecasting helps predict future values in time-series data by analyzing historical trends. In real-time streaming environments, continuous forecasting enables businesses to anticipate changes, optimize operations, and detect deviations as data flows in.

With Apache Flink®, you can perform continuous forecasting and anomaly detection directly within a familiar SQL-based environment.

Flink SQL provides the ML_FORECAST function, which enables you to perform real-time analysis and gain actionable insights from your streaming data without needing in-depth data science expertise. ML_FORECAST uses an ARIMA (Autoregressive Integrated Moving Average) model, optimized for real-time performance, to deliver accurate forecasts in a streaming context. A sketch of the general call shape follows the data requirements below.

Your data must include:

  • A timestamp column.

  • A target column representing some quantity of interest at each timestamp.
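For orientation, here is a minimal sketch of the call shape, modeled on the full example later on this page. The table and column names (my_table, target_col, ts_col) are placeholders:

    SELECT
        ML_FORECAST(target_col, ts_col, JSON_OBJECT('horizon' VALUE 1))
        OVER (
            ORDER BY ts_col
            RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS forecast
    FROM my_table;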

ARIMA model

ARIMA, which stands for Autoregressive Integrated Moving Average, is a powerful statistical technique used for time series analysis and forecasting. It combines three key components to model and predict future values based on past data:

  • Autoregressive (AR): This component uses past values of the time series to predict future values. It assumes that current observations are influenced by previous observations.

  • Integrated (I): This refers to the differencing of raw observations to make the time series stationary. Stationary data has consistent statistical properties over time, which is crucial for accurate forecasting.

  • Moving Average (MA): This component incorporates the relationship between an observation and a residual error from a moving average model applied to previous observations.

ARIMA models are widely used in fields such as finance, economics, and weather forecasting to analyze time-dependent data and make predictions. They are particularly effective for non-seasonal time series data that exhibits trends or patterns over time.

The model is typically represented as ARIMA(p,d,q), where:

  • p: The number of autoregressive terms

  • d: The degree of differencing

  • q: The order of the moving average

By adjusting these parameters, analysts can fine-tune the model to best fit their specific time series data and improve forecasting accuracy.
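For reference, the ARIMA(p,d,q) model can be written in the standard backshift-operator form. This is the textbook definition, not a Confluent-specific formula:

    \left(1 - \sum_{i=1}^{p} \phi_i L^i\right)(1 - L)^d \, y_t = \left(1 + \sum_{j=1}^{q} \theta_j L^j\right) \varepsilon_t

Here L is the lag (backshift) operator, the φ_i are the autoregressive coefficients, the θ_j are the moving-average coefficients, and ε_t is white noise.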

Be aware that forecasting accuracy can vary greatly depending on the data and the chosen parameters, and there is no guarantee of correctness in predictions made using ARIMA or machine learning.

Forecasting parameters

The following parameters are supported for forecasting.

Model configuration

Use the following parameters to configure the ML_FORECAST function.

enableStl

  • Default: FALSE

Enables STL (Seasonal-Trend decomposition using Loess) for the time series. If set to TRUE, the series is decomposed into trend, seasonal, and remainder components.

STL is a statistical method that separates a time series into three additive components: trend, seasonal, and remainder.

  • The trend component captures the long-term direction or movement in the data.

  • The seasonal component captures regular, predictable patterns that repeat over fixed periods.

  • The remainder component captures random fluctuations that aren’t explained by trend or seasonality.

This decomposition helps improve forecasting accuracy by enabling the ARIMA model to better analyze and predict the underlying patterns in your data.
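In the additive form that STL produces, each observation is simply the sum of the three components:

    y_t = T_t + S_t + R_t

where T_t is the trend, S_t is the seasonal component, and R_t is the remainder at time t.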

evalWindowSize

  • Default: 5

  • Validation: Must be ≥ 1

Keeps the last evalWindowSize squared errors for both RMSE and AIC calculations. RMSE uses the rolling sum of these errors, while AIC uses the count (n) and sum of those same errors. A smaller window makes both metrics more responsive to recent errors; a larger window smooths out short-term noise.

horizon

  • Default: 1

Number of future time periods to forecast. This value defines how many steps ahead the model predicts.

m

  • Default: If enableStl is TRUE, seasonality is estimated using ACF.

  • Validation: m >= 0 and m < (minTrainingSize / 2)

The seasonal period length. If enableStl is TRUE, the model uses an ACF (Autocorrelation Function) to estimate the seasonal period length. The ACF measures the correlation between a time series and its lagged versions at different time intervals. This helps to identify repeating patterns and seasonal cycles in the data by detecting how strongly current values relate to past values at various lags.

maxTrainingSize

  • Default: 512

  • Validation: maxTrainingSize must be greater than minTrainingSize and less than 10000

Maximum number of historical data points used for training. This value limits the training window to recent data.

minTrainingSize

  • Default: 128

  • Validation: minTrainingSize must be less than maxTrainingSize and greater than 1

Minimum number of data points required for training. The model won’t train if the quantity of available data is less than this value.

p

  • Default: NULL

  • Validation: If provided, must be >= 0

Order of the AR (Autoregressive) term in the ARIMA model, which defines the number of lagged values used in the model. This value represents the number of past values that directly influence the current prediction.

If p is not provided, the model uses auto-ARIMA to determine the best order.

q

  • Default: NULL

  • Validation: If provided, must be >= 0

Order of the MA (Moving Average) term in the ARIMA model, which defines the number of lagged error terms in the model. This value represents the number of past prediction errors that influence the current prediction.

If q is not provided, the model uses auto-ARIMA to determine the best order.

d

  • Default: NULL

  • Validation: If provided, must be >= 0

Order of differencing in the ARIMA model, which defines the number of times the data is differenced to achieve stationarity. A value of 0 means no differencing, 1 means first difference, and so on.

If d is not provided, the model uses auto-ARIMA to determine the best order.

updateInterval

  • Default: 5

Frequency of model updates. This value defines how often the model retrains with new data.
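All of these parameters are passed to ML_FORECAST as a single JSON object, using the JSON_OBJECT syntax shown in the examples later on this page. The following sketch combines the parameters described above; the values are illustrative, not recommendations:

    JSON_OBJECT(
        'p' VALUE 1,                  -- AR order (omit to use auto-ARIMA)
        'd' VALUE 1,                  -- differencing order (omit to use auto-ARIMA)
        'q' VALUE 1,                  -- MA order (omit to use auto-ARIMA)
        'minTrainingSize' VALUE 64,
        'maxTrainingSize' VALUE 512,
        'updateInterval' VALUE 5,
        'horizon' VALUE 3,
        'enableStl' VALUE false
    )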

Forecast results

The following fields are returned by the ML_FORECAST function.

aic

  • Type: DOUBLE

The AIC (Akaike Information Criterion) at the time the forecast was generated, calculated over the last evalWindowSize errors.

AIC is a statistical measure used to evaluate and compare the quality of different models. It balances model accuracy against complexity by penalizing models with more parameters. Lower AIC values indicate better model performance, which helps to identify the optimal model configuration that provides the best fit to the data without overfitting.
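The exact formula used internally is not documented here, but for models fit by least squares a common form of the criterion, consistent with the windowed error count (n) and error sum described for evalWindowSize, is:

    \mathrm{AIC} = n \ln\!\left(\frac{\mathrm{SSE}}{n}\right) + 2k

where SSE is the sum of squared errors in the window and k is the number of model parameters.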

forecast_value

  • Type: DOUBLE

The forecast value at the timestamp.

lower_bound

  • Type: DOUBLE

The lower bound of the confidence interval for the forecast value.

rmse

  • Type: DOUBLE

The RMSE (Root Mean Square Error) at the time the forecast was generated, calculated over the last evalWindowSize errors.

RMSE measures the difference between the predicted and actual values in a time series by calculating the square root of the average of the squared differences between the predicted and actual values. A lower RMSE indicates better model performance, because it means the model’s predictions are closer to the actual values.
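Concretely, over the last n = evalWindowSize errors e_i:

    \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_i^2}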

timestamp

  • Type: TIMESTAMP

The timestamp for this forecast, represented as a LocalDateTime.

upper_bound

  • Type: DOUBLE

The upper bound of the confidence interval for the forecast value.
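Because ML_FORECAST returns these fields together, you typically alias the call and then select individual fields from the result, as in the examples below. The following is a minimal sketch, assuming the result is a composite row type whose fields are accessed with dot notation (note the backticks around the reserved word timestamp); it references the orders table created in the example that follows:

    SELECT
        forecast.`timestamp`    AS ts,
        forecast.forecast_value AS predicted,
        forecast.lower_bound    AS lower_bound,
        forecast.upper_bound    AS upper_bound
    FROM (
        SELECT
            ML_FORECAST(total_orderunits, summed_ts, JSON_OBJECT('minTrainingSize' VALUE 10))
            OVER (
                ORDER BY summed_ts
                RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS forecast
        FROM orders
    );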

Example

  1. Create a test table that has a timestamp field for your time series data.

    CREATE TABLE orders (
        total_orderunits DOUBLE,              -- target column to forecast
        summed_ts TIMESTAMP(3) NOT NULL,      -- event-time column for the series
        WATERMARK FOR summed_ts AS summed_ts  -- watermark with no delay on the event-time column
    );
    
  2. Insert mock time series data into the table.

    INSERT INTO orders (total_orderunits, summed_ts) VALUES
    (102.12, TIMESTAMP '2024-11-19 10:00:00.000'),
    (103.45, TIMESTAMP '2024-11-19 10:01:00.000'),
    (101.89, TIMESTAMP '2024-11-19 10:02:00.000'),
    (104.23, TIMESTAMP '2024-11-19 10:03:00.000'),
    (102.78, TIMESTAMP '2024-11-19 10:04:00.000'),
    (102.12, TIMESTAMP '2024-11-19 10:05:00.000'),
    (103.45, TIMESTAMP '2024-11-19 10:06:00.000'),
    (101.89, TIMESTAMP '2024-11-19 10:07:00.000'),
    (104.23, TIMESTAMP '2024-11-19 10:08:00.000'),
    (102.78, TIMESTAMP '2024-11-19 10:09:00.000'),
    (102.12, TIMESTAMP '2024-11-19 10:10:00.000'),
    (103.45, TIMESTAMP '2024-11-19 10:11:00.000'),
    (101.89, TIMESTAMP '2024-11-19 10:12:00.000'),
    (104.23, TIMESTAMP '2024-11-19 10:13:00.000'),
    (102.78, TIMESTAMP '2024-11-19 10:14:00.000'),
    (102.12, TIMESTAMP '2024-11-19 10:15:00.000'),
    (103.45, TIMESTAMP '2024-11-19 10:16:00.000'),
    (101.89, TIMESTAMP '2024-11-19 10:17:00.000'),
    (104.23, TIMESTAMP '2024-11-19 10:18:00.000'),
    (102.78, TIMESTAMP '2024-11-19 10:19:00.000'),
    (102.12, TIMESTAMP '2024-11-19 10:20:00.000'),
    (103.45, TIMESTAMP '2024-11-19 10:21:00.000'),
    (101.89, TIMESTAMP '2024-11-19 10:22:00.000'),
    (104.23, TIMESTAMP '2024-11-19 10:23:00.000'),
    (102.78, TIMESTAMP '2024-11-19 10:24:00.000'),
    (102.12, TIMESTAMP '2024-11-19 10:25:00.000'),
    (103.45, TIMESTAMP '2024-11-19 10:26:00.000'),
    (101.89, TIMESTAMP '2024-11-19 10:27:00.000'),
    (104.23, TIMESTAMP '2024-11-19 10:28:00.000'),
    (102.78, TIMESTAMP '2024-11-19 10:29:00.000');
    
  3. Run the following statement to perform a forecast on the mock data in the orders table.

    SELECT
        ML_FORECAST(total_orderunits, summed_ts, JSON_OBJECT('p' VALUE 1, 'q' VALUE 1, 'd' VALUE 1, 'minTrainingSize' VALUE 10, 'enableStl' VALUE false, 'horizon' VALUE 5))
        OVER (
            ORDER BY summed_ts
            RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS forecast
    FROM orders;
    
  4. Run the following statement to perform a forecast with auto-ARIMA on the mock data in the orders table. Because p, d, and q are omitted, the model determines the best orders automatically.

    SELECT
        ML_FORECAST(total_orderunits, summed_ts, JSON_OBJECT('minTrainingSize' VALUE 10, 'enableStl' VALUE false))
        OVER (
            ORDER BY summed_ts
            RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS forecast
    FROM orders;
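
Note that because minTrainingSize is set to 10 in both statements, the model doesn't train until at least 10 rows are available, per the minTrainingSize parameter described above.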
    
