Protect Sensitive Data Using Client-Side Field Level Encryption on Confluent Cloud¶
Confluent Cloud offers security features that address authorization, data confidentiality, and encryption. These features include role-based access control (RBAC), encryption at rest with support for self-managed encryption keys (also known as BYOK), encryption in transit using TLS, and private networking. Client-side field level encryption (CSFLE) complements these features by letting organizations protect their most sensitive data in motion, from producers through to consumers.
CSFLE allows you to safeguard sensitive data, such as personally identifiable information (PII), by enabling field-level encryption both at the producer and consumer levels. By encrypting and decrypting individual fields within your data, CSFLE ensures that access to sensitive information is tightly controlled, granting only authorized stakeholders access to the data they are permitted to see.
CSFLE enhances data privacy and security by providing an additional layer of protection for specific data elements. It ensures that even if an unauthorized user gains access to the data, they would only see encrypted values that are meaningless without the corresponding decryption keys.
With CSFLE, encryption and decryption run on your clients. Sensitive fields are encrypted on the client before the data is transmitted or stored, and are decrypted only when an authorized client needs to access them.
Client-side field level encryption offers the following advantages:
- Provides a targeted and flexible approach to data security, allowing your organization to encrypt only the most sensitive fields while leaving other data in plaintext.
- Ensures complete control over access to sensitive data, allowing only authorized users and applications to access the information. This includes preventing access by administrators and others who, without CSFLE, might otherwise be able to view the data in plaintext.
- Reduces infrastructure and operational costs by eliminating the need to create multiple copies of the data for access control.
- Helps meet regulatory compliance requirements, such as the protection of personally identifiable information (PII) or other sensitive data. It can be particularly useful when data must be shared across different systems or with third-party services, because the data remains protected throughout its lifecycle.
For requirements and supported clients, see Requirements.
CSFLE options for Confluent Cloud¶
Client-side field level encryption (CSFLE) is available in Confluent Cloud to help you protect sensitive data and perform stream processing on encrypted data. You can use CSFLE with Confluent Cloud in one of two ways:
- CSFLE with shared Confluent access to Key Encryption Keys (KEKs).
- CSFLE without sharing access to your Key Encryption Keys (KEKs).
For both CSFLE options:
- You must use a key management service (KMS) to manage access to your Key Encryption Keys (KEKs).
- Extensive security checks and balances provided by Confluent protect your sensitive data.
Details about the two CSFLE options are provided in the following sections.
CSFLE without Confluent access to KEK¶
If you do not share your Key Encryption Key (KEK) with Confluent:
- No user or application in Confluent Cloud can access your encrypted fields in plaintext.
- Stream processing in Confluent Cloud using Flink is not possible because the data is encrypted and cannot be decrypted to perform operations.
- Your organization manages running producers and consumers with the proper configurations to access the KEKs and encrypt or decrypt data.
- Support for HashiCorp Vault Transit Secrets Engine applies only when the KEK is not shared with Confluent.
- You own and manage your Key Encryption Keys (KEKs) and are responsible for overseeing the entire lifecycle of the KEKs.
- Confluent never directly accesses your Key Encryption Keys (KEKs). Each KEK remains securely stored in your key management service (KMS) that is owned and managed by you. Confluent interacts with two APIs that use a KEK identifier and a payload to either encrypt or decrypt the payload with the specified KEK. Confluent can only see the KEK identifier and the payloads (encrypted or decrypted DEKs).
- Use the logging and auditing capabilities provided by your KMS to monitor and trace all access to KEKs to address any compliance or regulatory requirements.
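The KEK-identifier-plus-payload interaction described in this list is a form of envelope encryption: a data encryption key (DEK) is wrapped and unwrapped by the KMS, and the KEK itself never leaves the KMS. The following Python sketch shows the idea under stated assumptions. `ToyKms`, its method names, and the KEK identifier are hypothetical, and the HMAC-based toy cipher stands in for whatever algorithm a real KMS uses. It is not secure and does not represent any KMS API.

```python
import hashlib
import hmac
import os


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT secure): XOR data with an HMAC-SHA256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256)
        stream += block.digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKms:
    """Hypothetical stand-in for a customer-managed KMS.

    Callers pass only a KEK identifier and a payload. The KEK bytes stay
    inside this object and are never returned.
    """

    def __init__(self) -> None:
        self._keks = {}        # KEK identifier -> key bytes (private to the KMS)
        self.audit_log = []    # (operation, KEK identifier) pairs

    def create_kek(self, kek_id: str) -> None:
        self._keks[kek_id] = os.urandom(32)

    def encrypt(self, kek_id: str, payload: bytes) -> bytes:
        """Wrap a payload (for example, a DEK) with the named KEK."""
        self.audit_log.append(("encrypt", kek_id))
        nonce = os.urandom(16)
        return nonce + _xor_stream(self._keks[kek_id], nonce, payload)

    def decrypt(self, kek_id: str, payload: bytes) -> bytes:
        """Unwrap a payload that was wrapped with the named KEK."""
        self.audit_log.append(("decrypt", kek_id))
        nonce, body = payload[:16], payload[16:]
        return _xor_stream(self._keks[kek_id], nonce, body)


kms = ToyKms()
kms.create_kek("example-kek")                        # hypothetical KEK identifier

dek = os.urandom(32)                                 # key that encrypts the fields
wrapped_dek = kms.encrypt("example-kek", dek)        # safe to store outside the KMS
unwrapped_dek = kms.decrypt("example-kek", wrapped_dek)
```

The `audit_log` list is a simplified picture of the KMS logging mentioned above: every wrap and unwrap request is recorded against the KEK identifier it used.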
The steps for CSFLE when you do not share KEK access with Confluent are summarized in the diagram below.
Dependency on Stream Governance¶
CSFLE operates within the Stream Governance framework on Confluent Cloud and requires a schema that defines tags and encryption rules. To use CSFLE with a specific topic, that topic must have a corresponding schema subject defined within your Schema Registry. CSFLE is available only with the Stream Governance Advanced package.
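To make the relationship between tags and rules more concrete, the following Python sketch builds a schema in which one field carries a tag, plus a rule that targets that tag. The attribute names (`confluent:tags`, `ruleSet`, `domainRules`, and the `encrypt.*` parameters) reflect my understanding of the Schema Registry data contract format and should be checked against the current reference. The record, field, tag, and rule names are hypothetical, and the parameter values are placeholders.

```python
import json

# Avro schema whose sensitive field carries a tag.
# Record, field, and tag names are hypothetical.
avro_schema = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "ssn", "type": "string", "confluent:tags": ["PII"]},
    ],
}

# Encryption rule that applies to every field tagged "PII".
# The KEK name, KMS type, and key ID are placeholders, not real values.
rule_set = {
    "domainRules": [
        {
            "name": "encryptPII",
            "kind": "TRANSFORM",
            "type": "ENCRYPT",
            "mode": "WRITEREAD",
            "tags": ["PII"],
            "params": {
                "encrypt.kek.name": "<kek-name>",
                "encrypt.kms.type": "<kms-type>",
                "encrypt.kms.key.id": "<kms-key-id>",
            },
        }
    ]
}

# Body that would accompany a schema registration for the topic's subject.
registration_payload = {
    "schemaType": "AVRO",
    "schema": json.dumps(avro_schema),
    "ruleSet": rule_set,
}


def fields_with_tag(schema: dict, tag: str) -> list:
    """Return the names of schema fields that carry the given tag."""
    return [
        field["name"]
        for field in schema["fields"]
        if tag in field.get("confluent:tags", [])
    ]


def fields_encrypted_by(schema: dict, rules: dict) -> list:
    """Return the names of fields matched by at least one ENCRYPT rule."""
    tags = {
        tag
        for rule in rules["domainRules"]
        if rule["type"] == "ENCRYPT"
        for tag in rule["tags"]
    }
    return sorted({name for tag in tags for name in fields_with_tag(schema, tag)})


encrypted = fields_encrypted_by(avro_schema, rule_set)
```

The helper functions are only a local check that the tag and the rule line up. In this sketch, a field without the `PII` tag is never selected for encryption.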
For more information about governance in Confluent Cloud, see:
FIPS 140-2 compliance¶
FIPS 140-2 is a U.S. government standard for validating cryptographic modules, and is often used to ensure that cryptographic modules are secure and compliant with regulatory requirements. CSFLE uses the Google Tink format for DEKs to ensure interoperability with Java and Google Cloud clients. You can use Google Tink for common cryptographic operations to meet FIPS 140-2 requirements. For more information, see Use Tink to meet FIPS 140-2 security requirements.
Performance considerations¶
CSFLE can impact performance. The extent of this impact depends primarily on the number of encrypted fields and the nature of your workload. In light usage scenarios, the throughput and latency overhead might be minimal. However, heavy encryption and decryption across multiple fields in every read and write operation can significantly degrade performance. Note the following:
- CSFLE can reduce the effectiveness of data compression. Encryption increases the randomness of data, making it harder for compression algorithms to find patterns.
- CSFLE can reduce client throughput (fewer messages per second). Because all encryption and decryption occur on the client, no direct performance burden is placed on the brokers.
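The compression effect can be observed with a small experiment. The Python sketch below compresses a repetitive plaintext payload and an equally sized block of random bytes with `zlib`. The random bytes are an assumption standing in for ciphertext, since the output of a sound cipher is designed to look random. The sample record is made up, and the example compresses whole payloads rather than messages with only some fields encrypted, so it shows the upper bound of the effect.

```python
import os
import zlib

# Repetitive plaintext, similar in spirit to a batch of structured records.
plaintext = b'{"name": "Jane Doe", "ssn": "123-45-6789"}' * 200

# Random bytes of the same length, standing in for encrypted data.
ciphertext_like = os.urandom(len(plaintext))

# Ratio of compressed size to original size (lower means better compression).
plain_ratio = len(zlib.compress(plaintext)) / len(plaintext)
cipher_ratio = len(zlib.compress(ciphertext_like)) / len(ciphertext_like)

print(f"plaintext compresses to {plain_ratio:.1%} of its original size")
print(f"ciphertext-like data compresses to {cipher_ratio:.1%} of its original size")
```

With only a few encrypted fields per message, the loss in compression is correspondingly smaller, because the plaintext fields still compress normally.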