Enable Private Networking with Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink® supports private networking on AWS. This feature enables Flink to securely read and write data stored in Confluent Cloud clusters that use private networking, with no data flowing to the public internet. With private networking, you can use Flink and Apache Kafka® together for stream processing in Confluent Cloud, even in the most stringent regulatory environments.

Confluent Cloud for Apache Flink supports private networking only for AWS Dedicated clusters and Enterprise clusters. A small number of Dedicated clusters that were created with a previous version of the networking stack are not yet supported.

Note

You must enable the resource metadata access option to see tables in the tree-view part of your Flink workspaces when you use private networking. Flink statements, like SHOW TABLES, don’t need this option to be enabled.

How does it work?

Flink Private Networking requires a PrivateLink Attachment (PLATT) to access Kafka clusters with private networking. A PrivateLink Attachment is a resource that enables you to connect to Confluent serverless products, like Enterprise Clusters and Flink.

For Flink, the new PrivateLink Attachment is used only to establish a connection between your clients (like Cloud Console UI, Confluent CLI, Terraform, apps using the Confluent REST API) and Flink. Flink-to-Kafka traffic is routed internally within Confluent Cloud. As a result, this PLATT is used only for submitting statements and fetching results from the client.

  • For Dedicated clusters, regardless of the Kafka cluster connection type (Private Link, Peering, or Transit Gateway), Flink requires that you define a PLATT in the same region as the cluster, even if a private link already exists for the Dedicated cluster.
  • For Enterprise clusters, you can reuse the same PLATT used by your Enterprise clusters.

By creating a PrivateLink Attachment to a Confluent Cloud environment in a region, you are enabling Flink statements created in that environment to securely access data in any of the Flink clusters in the same region, regardless of their environment. Access to the Flink clusters is governed by RBAC.

Also, a PrivateLink Attachment enables your data-movement components in Confluent Cloud, including Flink statements and cluster links, to move data between all of the private networks in the organization, including the Confluent Cloud networks associated with any Dedicated Kafka clusters.

Figure: Private networking with Confluent Cloud for Apache Flink

Protected resources

A Flink PrivateLink Attachment protects Kafka clusters, statements, and workspaces by ensuring that secure, private connectivity is required to access these resources.

With a PrivateLink Attachment, Flink statements can access topics in Dedicated and Enterprise clusters, regardless of their network type: public, PrivateLink, VPC Peering, or Transit Gateway. However, this access is governed by strict rules that enforce access controls and prevent data exfiltration to the public internet.

Private networking protects Flink statements and workspaces, which can contain sensitive information. You can secure these resources under a private network, ensuring that they are not accessible over public networking. Creating and accessing private statements and workspaces is possible only from an authorized private network attached to the Confluent Cloud environment.

When private networking is enabled, you can read both public and private data. But to prevent data exfiltration, statements that use Flink private networking can write only to clusters that use private networking. If you want to write to a public cluster, use Flink from another environment that has no PrivateLink Attachment.
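The read and write rule described above can be summarized as a small decision function. The following Python sketch is only an illustrative model of that rule for an environment with private networking enabled. It is not a Confluent API, and the function name and the network-type strings are hypothetical.

```python
# Illustrative model of the rule described above; not a Confluent API.
# The function name and the network-type strings are hypothetical.

PRIVATE_NETWORK_TYPES = {"privatelink", "vpc-peering", "transit-gateway"}
ALL_NETWORK_TYPES = PRIVATE_NETWORK_TYPES | {"public"}


def allowed_operations(cluster_network: str) -> set:
    """Return the operations that a statement running in an environment
    with private networking enabled may perform on a cluster that has
    the given network type."""
    if cluster_network not in ALL_NETWORK_TYPES:
        raise ValueError(f"unknown network type: {cluster_network!r}")
    # Reading is allowed from both public and private clusters.
    operations = {"read"}
    # Writing is allowed only to clusters that use private networking.
    if cluster_network in PRIVATE_NETWORK_TYPES:
        operations.add("write")
    return operations
```

Under this model, a statement can read from a public cluster but cannot write to it, which matches the data-exfiltration rule stated above.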

Important

After a PrivateLink Attachment is created and private networking is enabled, you can’t disable it. Because a resource may contain sensitive information, this Confluent policy ensures that private resources stay private. If you delete a PrivateLink Attachment, the environment stays private, and you must create a new PrivateLink Attachment to access Flink statements and workspaces. Without a new PrivateLink Attachment, you will not be able to access your private resources.

The following table shows access to public and private resources with and without a PrivateLink Attachment created. “CRUD” stands for “Create, Read, Update, Delete”.

Public (no PrivateLink Attachment created)

  Connect from public internet:
  • Default connection [1]
  • ✅ CRUD on public statements
  • ✅ CRUD on public workspaces
  • 🚫 CRUD on private statements
  • 🚫 CRUD on private workspaces

  Connect from private connection to the PrivateLink Attachment of the current environment:
  • ✅ CRUD on public statements
  • ✅ CRUD on public workspaces
  • 🚫 CRUD on private statements
  • 🚫 CRUD on private workspaces

Private (PrivateLink Attachment created)

  Connect from public internet:
  • 🚫 CRUD on public statements
  • 🚫 CRUD on public workspaces
  • 🚫 CRUD on private statements
  • 🚫 CRUD on private workspaces

  Connect from private connection to the PrivateLink Attachment of the current environment:
  • Default connection [2]
  • 🟡 RUD on public statements; new statements are private only
  • 🟡 RD on public workspaces; public workspaces are read-only
  • ✅ CRUD on private statements
  • ✅ CRUD on private workspaces

[1] Default connection for Cloud Console and Confluent CLI with no PrivateLink Attachment.
[2] Default connection for Cloud Console and Confluent CLI with a PrivateLink Attachment created.
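For quick reference, the access table above can be encoded as a lookup. The following Python sketch is an illustrative transcription of the table only. The names `RESOURCES`, `ACCESS_TABLE`, and `is_allowed` are hypothetical and do not correspond to any Confluent API.

```python
# Illustrative encoding of the access table above; not a Confluent API.
# All names here are hypothetical.

RESOURCES = (
    "public statements",
    "public workspaces",
    "private statements",
    "private workspaces",
)

# Key: (environment is private, connection is private).
# Value: allowed operations for each resource, in RESOURCES order.
# An empty string means that no operation is allowed.
ACCESS_TABLE = {
    (False, False): ("CRUD", "CRUD", "", ""),
    (False, True): ("CRUD", "CRUD", "", ""),
    (True, False): ("", "", "", ""),
    (True, True): ("RUD", "RD", "CRUD", "CRUD"),
}


def is_allowed(env_private: bool, conn_private: bool,
               resource: str, operation: str) -> bool:
    """Return True if the table permits the operation.

    operation is one of "C", "R", "U", or "D".
    """
    if operation not in ("C", "R", "U", "D"):
        raise ValueError(f"unknown operation: {operation!r}")
    allowed = ACCESS_TABLE[(env_private, conn_private)][RESOURCES.index(resource)]
    return operation in allowed
```

For example, `is_allowed(True, True, "public statements", "C")` is `False`, because new statements in a private environment are private only.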

Cross-environment queries

If you’re using a single VPC for multiple environments, you may need to use the topology shown in the following diagram to query data across different environments, due to a known limitation of PLATT.

Figure: Topology for cross-environment queries with private networking and Confluent Cloud for Apache Flink

In this case, you can use a single environment and a single PLATT, in which you run all of your Flink workloads, and use three-part names to query data in other environments, for example:

SELECT * FROM `myEnvironment`.`myDatabase`.`myTable`;

As a result, a single routing rule is necessary on the VPC side, per region, to redirect all traffic to the Flink regional endpoint(s) using this PrivateLink Attachment Connection.

To isolate different workloads, you can create different compute pools, which enable you to control the budget and scale of these workloads independently.

Data access is protected by RBAC at the Kafka cluster (Flink database) or Kafka topic (Flink table) level. If the user account or service account that runs the query doesn’t have access, Flink can’t access the sources and destinations.
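When cross-environment queries are generated by a client application, the three-part name can be assembled programmatically. The following Python sketch shows one way to do this. The helper names are hypothetical, and the sketch assumes that an embedded backtick is escaped by doubling it, which you should verify against the Flink SQL reference for your version.

```python
# Hypothetical helpers for building a three-part table name.
# Assumption: an embedded backtick is escaped by doubling it.


def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def three_part_name(environment: str, database: str, table: str) -> str:
    """Build a qualified name of the form `environment`.`database`.`table`."""
    parts = (environment, database, table)
    return ".".join(quote_identifier(part) for part in parts)


query = f"SELECT * FROM {three_part_name('myEnvironment', 'myDatabase', 'myTable')};"
print(query)
# prints: SELECT * FROM `myEnvironment`.`myDatabase`.`myTable`;
```

Quoting every part keeps the query valid when environment, cluster, or topic names contain characters such as hyphens or dots.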

Prerequisites

Create a PrivateLink Attachment overview

In this walkthrough, you perform the following steps.

  1. Create a PrivateLink Attachment (PLATT) and a PrivateLink Attachment Connection (PLATTC).
  2. If your client is not in the VPC, enable the Confluent Cloud Console or Confluent CLI to connect to your private network as shown in Step 2.

After you complete these steps, you can use Flink from the Confluent Cloud Console or Confluent CLI, and the experience is the same as with public networking.

Step 2: Connect to the network with Confluent Cloud Console or Confluent CLI

If you don’t connect from a machine in the VPC, you see the following error.

Figure: Private networking error when not connecting from a machine in the VPC

To connect to Confluent Cloud with PrivateLink Attachment, see Use Confluent Cloud with Private Networking. One way to connect is to set up a reverse proxy.

  1. Create an EC2 instance

  2. Connect to the instance with SSH

  3. Install NGINX

  4. Configure Routing Table

  5. Set up DNS resolution: point to the Flink regional endpoints you use, as described in Step 6 of Configure a proxy.

    <Public IP Address of VM instance> <Flink-private-endpoint>
    

    <Flink-private-endpoint> will resemble flink.<region>.<cloud>.private.confluent.cloud, for example: flink.us-east-2.aws.private.confluent.cloud.

    Find the DNS part of the PrivateLink Attachment by navigating to your environment’s Network management page and finding the DNS domain setting.

    Figure: DNS domain on the Network Management page for Flink private networking

    You can find the full list of supported Flink regions by using the Regions endpoint API.
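The DNS entry in step 5 follows the endpoint pattern shown above, so it can be generated for each region you use. The following Python sketch builds the endpoint hostname and the corresponding hosts-file line. The function names are hypothetical, and the IP address is a documentation placeholder (203.0.113.10) that you would replace with the public IP address of your proxy VM.

```python
# Hypothetical helpers based on the endpoint pattern described above:
# flink.<region>.<cloud>.private.confluent.cloud


def flink_private_endpoint(region: str, cloud: str = "aws") -> str:
    """Return the Flink private regional endpoint hostname."""
    return f"flink.{region}.{cloud}.private.confluent.cloud"


def hosts_entry(proxy_public_ip: str, region: str, cloud: str = "aws") -> str:
    """Return a hosts-file line that maps the endpoint to the proxy VM."""
    return f"{proxy_public_ip} {flink_private_endpoint(region, cloud)}"


# 203.0.113.10 is a placeholder address reserved for documentation.
print(hosts_entry("203.0.113.10", "us-east-2"))
# prints: 203.0.113.10 flink.us-east-2.aws.private.confluent.cloud
```

If you use Flink in several regions, generate one line per region and add each one to the hosts file on the client machine.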

Once networking is set up, Cloud Console uses the correct endpoint automatically, either public or private, based on the presence of a PrivateLink Attachment. If the connection is private, access to the Flink private network works transparently.

Additional Confluent CLI options

Like Cloud Console, the Confluent CLI uses the correct endpoint automatically, either public or private, based on the presence of a PrivateLink Attachment. You can also override the endpoint by using the following commands:

# Override to private endpoint
confluent flink connectivity-type use private

# Override to public endpoint
confluent flink connectivity-type use public

For more information, see Create a connection to the network that hosts the private cluster endpoints.