2 changes: 2 additions & 0 deletions .gitignore
@@ -61,3 +61,5 @@ flake.lock
/src/testdrive/ci/protobuf-include
/known-docker-images.txt
/test/sqllogictest/sqlite

**/.idea/**
65 changes: 65 additions & 0 deletions doc/user/content/ingest-data/kafka/kafka-proxy.md
@@ -0,0 +1,65 @@
---
title: "Kafka Proxies"
description: "How to connect a Kafka proxy as a source"
menu:
main:
parent: "kafka"
name: "Kafka Proxies"
---

## Proxy Configurations

There are multiple ways to use a Kafka proxy with Materialize:

- **Kafka Authentication Proxy/Gateway**: A Kafka proxy can be set up to allow
unauthenticated local connections within a private network to an external
Kafka cluster that requires authentication. In this setup, the proxy handles
the authentication to the external cluster.

- **Kafka Reverse Proxy**: A proxy may also be used to handle authentication
into a private Kafka instance. In this setup, the proxy has trusted access to
the private Kafka cluster and handles authentication for outside parties.

{{< tabs tabID="1" >}}
{{< tab "kafka-proxy" >}}

Comment on lines +23 to +25
Contributor

Why the tab?

Author

I had initially planned on using them for covering different patterns, i.e. reverse-proxy as well, but will remove them for now.

### Working with [kafka-proxy](https://github.com/grepplabs/kafka-proxy)

When working with [kafka-proxy](https://github.com/grepplabs/kafka-proxy)
there are a few patterns to consider when it comes to listener configuration
and network security.
Comment on lines +26 to +30
Contributor

The general flow of this section is a little confusing to me. Each should start with what the thing is - dynamic listener or static listener - and then how to use it with Materialize. The lead in with ssh feels abrupt.

Author

I'll adjust the flow


#### Dynamic Listeners (random port)

By default, [kafka-proxy](https://github.com/grepplabs/kafka-proxy) handles
multiple brokers in the upstream Kafka cluster by dynamically creating
listeners on new ports on the proxy instance. Kafka clients are then expected
to connect to these ports. Using an
[SSH Bastion](/ingest-data/network-security/ssh-tunnel/) allows
Materialize to reach the dynamic ports as they are created.
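
As a rough sketch of this setup, the connections could be defined as follows.
The bastion host, user, proxy address, and connection names are placeholders
chosen for illustration, and the exact options should be checked against the
`CREATE CONNECTION` reference.

```sql
-- Hypothetical bastion details; replace with your own.
CREATE CONNECTION ssh_bastion TO SSH TUNNEL (
    HOST 'bastion.example.com',
    USER 'materialize',
    PORT 22
);

-- The broker address is the proxy's bootstrap listener as reachable from
-- the bastion (placeholder hostname and port). PLAINTEXT assumes the proxy
-- listener has no TLS configured.
CREATE CONNECTION kafka_proxy_conn TO KAFKA (
    BROKER 'kafka-proxy.internal:9092',
    SECURITY PROTOCOL = 'PLAINTEXT',
    SSH TUNNEL ssh_bastion
);
```

Because all broker traffic is routed through the tunnel, no per-port
configuration should be needed on the Materialize side as the proxy opens new
listeners.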
Comment on lines +34 to +39
Contributor

Does this mean you can only use dynamic listeners with an ssh bastion? I.e., you can't connect directly if the proxy is what exposes your cluster to the public internet?

Author

So you can connect if it's exposed to the public internet, I've added a section for this. In theory you could do privatelink if you had a script or program that dynamically registered listeners to target groups on an NLB. I've not tested this though, happy to document it or just leave it off for now.


#### Static Listeners

If an [SSH Bastion](/ingest-data/network-security/ssh-tunnel/) is not an
option, an alternative is to configure
[kafka-proxy](https://github.com/grepplabs/kafka-proxy) without dynamic port
allocation. With this setup, each Kafka broker must be assigned a static
port on the proxy. Materialize can then connect either over AWS PrivateLink
(a load balancer can be pointed at the fixed set of ports) or through an
[SSH Bastion](/ingest-data/network-security/ssh-tunnel/), as in the
[Dynamic Listeners](#dynamic-listeners-random-port) case.
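
For the PrivateLink variant, a minimal sketch might look like the following.
It assumes a proxy with two statically mapped brokers on ports 30001 and
30002; the service name, availability zones, hostnames, ports, and connection
names are all hypothetical placeholders.

```sql
-- Hypothetical VPC endpoint service fronting the proxy's static ports.
CREATE CONNECTION privatelink_svc TO AWS PRIVATELINK (
    SERVICE NAME 'com.amazonaws.vpce.us-east-1.vpce-svc-0e123abc123198abc',
    AVAILABILITY ZONES ('use1-az1', 'use1-az4')
);

-- One entry per statically mapped broker. The PORT option selects the
-- load balancer listener that forwards to that broker's proxy port.
CREATE CONNECTION kafka_proxy_static TO KAFKA (
    BROKERS (
        'kafka-proxy.internal:30001' USING AWS PRIVATELINK privatelink_svc (PORT 30001),
        'kafka-proxy.internal:30002' USING AWS PRIVATELINK privatelink_svc (PORT 30002)
    ),
    SECURITY PROTOCOL = 'PLAINTEXT'
);
```

The broker addresses listed here need to match the addresses the proxy
advertises to clients, so the static mapping on the proxy and the `BROKERS`
list should be kept in sync.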

#### SSL Configurations

By default, [kafka-proxy](https://github.com/grepplabs/kafka-proxy) does not
configure TLS on the exposed endpoint. When using it with a Kafka
source, specify `SECURITY PROTOCOL = 'PLAINTEXT'` as
documented in [kafka-options](/sql/create-connection/#kafka-options). If SSL is
required, [kafka-proxy](https://github.com/grepplabs/kafka-proxy) can be
configured with TLS support and the connection can be set up with
`SECURITY PROTOCOL = 'SSL'` along with the `SSL KEY`, `SSL CERTIFICATE`, and
`SSL CERTIFICATE AUTHORITY` options.
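
A sketch of the TLS variant is shown below, assuming the proxy listener has
been configured with a certificate and requires client certificates. The
secret names, PEM placeholders, and proxy address are illustrative only.

```sql
-- Placeholder PEM contents; store the real values as secrets.
CREATE SECRET kafka_proxy_ssl_key AS '<CLIENT_PRIVATE_KEY_PEM>';
CREATE SECRET kafka_proxy_ssl_crt AS '<CLIENT_CERTIFICATE_PEM>';
CREATE SECRET kafka_proxy_ca_crt AS '<CA_CERTIFICATE_PEM>';

-- Connect to the proxy's TLS listener (hypothetical host and port).
CREATE CONNECTION kafka_proxy_tls TO KAFKA (
    BROKER 'kafka-proxy.example.com:9093',
    SECURITY PROTOCOL = 'SSL',
    SSL KEY = SECRET kafka_proxy_ssl_key,
    SSL CERTIFICATE = SECRET kafka_proxy_ssl_crt,
    SSL CERTIFICATE AUTHORITY = SECRET kafka_proxy_ca_crt
);
```

If the proxy does not require client certificates, the `SSL KEY` and
`SSL CERTIFICATE` options can likely be omitted, leaving only the certificate
authority used to verify the proxy.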

{{< /tab >}}
{{< /tabs >}}