course: Apache Kafka® for Python Developers

Introduction to Python for Apache Kafka

6 min

Dave Klein

Developer Advocate

Overview

In this lecture, you will learn why Python has become such a popular language for developing real-time event streaming applications that take advantage of the Apache Kafka platform. You will also get an overview of the modules and hands-on exercises that follow.

Resources

Use the promo code PYTHONKAFKA101 to get $25 of free Confluent Cloud usage


Introduction to Python for Apache Kafka

It's becoming increasingly important for companies to be able to respond to the events that affect their business in real time. Apache Kafka has become the de facto standard for meeting this need for real-time data processing. If you're a Python developer, this course will show you how to harness the power of Apache Kafka in your applications. You'll learn how to build Kafka producer and consumer applications, work with event schemas, take advantage of the Confluent Schema Registry, and more. Python and Kafka: two great technologies that work great together. Let's get started.

In this course, we will learn how we can take advantage of data in motion by processing events in real time using the Python programming language and Apache Kafka. Kafka and real-time data are a natural fit, but why Python?

Python is a full-featured programming language that has been carefully designed for developer productivity and ease of use. Experienced programmers can pick it up quickly, and those new to programming find a much smoother learning curve. It's also a lot of fun to code in. Python has also become quite ubiquitous, ranking as the most popular language on the TIOBE Index for August 2022 and consistently placing in the top three. This high adoption means more resources are available, which makes it easier for teams to learn the language and to grow when needed.

Python is a versatile language. It's popular for automating system administration tasks and end-to-end tests, it has become the lingua franca of machine learning, data science, and data engineering, and it's widely used by application developers for building web applications and microservices.

Strong communities tend to grow around great technology, and Python is no exception. Whether you're talking about online support forums like Slack or Stack Overflow, community events such as conferences and meetups, or books available in print or online, you'll find all that and more in the Python community. This active and welcoming community provides enormous learning opportunities as well as a huge pool of potential teammates. All of these together help to explain why Python has become such a popular programming language for so many use cases.

Apache Kafka is an event streaming platform that has become the de facto standard for data in motion. If you're not familiar with Kafka, please take a look at our Kafka 101 course on Confluent Developer.

For developers building applications with Kafka, our main tool is the client library. While the client library that comes with Kafka is Java-based and can only be used with JVM languages, there are client libraries available for most popular languages, including multiple options for Python. In this course, we'll focus on the Python client library provided by Confluent. Confluent's Python Client for Apache Kafka is a fast, full-featured library of classes and functions that enables us to harness the power of Kafka in our Python applications. To get the full benefits of this package, we recommend using a recent version of Kafka, but the Python client is compatible with older versions as well. Confluent maintains and supports the Python client and is actively working to keep it on par with the Java client.
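To make that concrete before the exercises begin, here is a minimal, hedged sketch of producing a single event with the confluent-kafka package. The bootstrap server, credentials, and topic name are illustrative placeholders rather than values supplied by this course; the exercises walk you through the real configuration.

# A minimal sketch of producing one event with confluent-kafka
# (install with: pip install confluent-kafka).
# The broker address, API key/secret, and topic name below are placeholders.
from confluent_kafka import Producer

config = {
    "bootstrap.servers": "<your-bootstrap-server>",
    "security.protocol": "SASL_SSL",      # typical settings for Confluent Cloud
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<your-api-key>",
    "sasl.password": "<your-api-secret>",
}

producer = Producer(config)

def delivery_report(err, msg):
    # Called once per message to confirm delivery or report an error.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}] at offset {msg.offset()}")

producer.produce("hello_topic", key="greeting", value="hello, kafka", on_delivery=delivery_report)
producer.flush()  # Block until any outstanding messages have been delivered.

Note that produce() is asynchronous; the delivery callback and the final flush() are the usual ways to confirm that events actually reached the broker.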
In this course, we will look more deeply into the following classes in the Python Kafka client package: the Producer, which we will use to send events to Kafka topics; the Consumer, which we can use to read events from one or more Kafka topics; the Schema Registry client, which gives us access to the Confluent Schema Registry; and the serializers and deserializers that enable us to use JSON Schema, Protocol Buffers, or Avro schemas with our events. Finally, we'll take a look at the AdminClient, which we can use to create and modify resources such as topics, partitions, access control lists, and more. (A short sketch of these classes follows at the end of this section.)

As you learn about what these Python classes do and how to work with them, you'll put your new knowledge to work with hands-on exercises. As part of these exercises, you will develop simple Apache Kafka client applications with Python that produce data to and consume data from Confluent Cloud. So if you haven't already created a Confluent Cloud account, take some time to sign up now so that you're ready to go with the first exercise. Be sure to look for the promo code and use it when signing up to get the additional free usage that it provides.

Once you have created your Confluent Cloud account, continue with the first hands-on exercise that follows this module, during which you will install the Confluent Python package and set up the environment that you will use for the exercises that follow in the course. My colleague Danica Fine will now walk you through setting up your free Confluent Cloud account.
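As referenced above, here is a brief, hedged sketch of how the remaining classes are typically created with the confluent-kafka package. The connection settings, schema string, topic name, and consumer group are illustrative placeholders; the course exercises cover the real configuration.

# A hedged sketch of the other client classes covered in this course.
# Configuration values, the schema, and the topic name are placeholders.
from confluent_kafka import Consumer
from confluent_kafka.admin import AdminClient, NewTopic
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.json_schema import JSONSerializer

kafka_config = {"bootstrap.servers": "<your-bootstrap-server>"}

# AdminClient: create and modify resources such as topics.
admin = AdminClient(kafka_config)
admin.create_topics([NewTopic("temp_readings", num_partitions=3, replication_factor=3)])

# SchemaRegistryClient plus a serializer: use a JSON Schema with event values.
schema_registry = SchemaRegistryClient({"url": "<your-schema-registry-url>"})
temp_schema = """{"type": "object", "properties": {"temperature": {"type": "number"}}}"""
json_serializer = JSONSerializer(temp_schema, schema_registry)

# Consumer: read events from one or more topics.
consumer = Consumer({**kafka_config,
                     "group.id": "temp_readings_group",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["temp_readings"])
msg = consumer.poll(1.0)  # Returns None if no event arrives within the timeout.
if msg is not None and msg.error() is None:
    print(msg.value())
consumer.close()

Each of these classes is covered in more depth in the modules and hands-on exercises that follow.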

  • [Danica] First off, you'll want to follow the URL on the screen. On the sign-up page, enter your name, email, and password. Be sure to remember these sign-in details, as you'll need them to access your account later. Click the Start free button and wait to receive a confirmation email in your inbox. The link in the confirmation email will lead you to the next step, where you'll be prompted to set up your cluster. You can choose between a Basic, Standard, or Dedicated cluster. Basic and Standard clusters are serverless offerings, so your free Confluent Cloud usage is only consumed by what you actually use, which is perfect for what we need today. For the exercises in this course, we'll choose the Basic cluster. Usage costs will vary with any of these choices, but they are clearly shown at the bottom of the screen. That being said, once we wrap up these exercises, don't forget to stop and delete any resources you created to avoid exhausting your free usage. Click Review to get one last look at the choices you've made, give your cluster a name, and then launch it. It may take a few minutes for your cluster to be provisioned. And that's it. You'll receive an email once your cluster is fully provisioned, but in the meantime, let's go ahead and leverage that promo code. From Settings, choose Billing & payment. You'll see here that you have $400 of free Confluent Cloud usage, but if you select the Payment details & contacts tab, you can either add a credit card or enter a promo code. And with that done, you're ready to dive in. If you are not already on Confluent Developer, head there now using the link in the video description to access the rest of this course and its hands-on exercises.