Notes on schema registry/CloudEvents in Kafka

Apache Kafka has gained a lot of attention since it was first built at LinkedIn in 2010 as an in-house project. By definition, Apache Kafka is a distributed event streaming/processing platform, and over the last couple of years it has been used widely across different domains, from web activity tracking to communication between microservices.
Schema Registry
Kafka itself transfers data between producers and consumers as raw bytes, so Kafka knows nothing about the data being produced or consumed, whether it is, for example, a string, an integer, or some other type. If a producer starts sending bad data/events to Kafka, consumers reading that invalid data will start throwing exceptions or may go down entirely.
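To make this concrete, here is a minimal sketch of the problem, assuming a local broker at localhost:9092 and a hypothetical "orders" topic: the producer ships arbitrary bytes and the broker accepts them, because nothing at the Kafka level validates the payload against what consumers expect.

```java
// Minimal sketch: Kafka only moves bytes and performs no validation.
// The broker address, topic name, and payload are assumptions.
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class RawBytesProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            // The broker stores these bytes as-is; a consumer expecting,
            // say, JSON or Avro will fail only at deserialization time.
            byte[] malformed = new byte[] {0x00, 0x13, 0x37};
            producer.send(new ProducerRecord<>("orders", "order-1", malformed));
        }
    }
}
```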
This is the problem the schema registry solves. The schema registry is a separate application from the Kafka cluster that handles the distribution of schemas to producers and consumers, storing a copy of each schema in its cache.

The producer talks to the schema registry to check whether the schema is already available; if not, it registers the schema, and the registry caches it. Once the producer has the schema, it serializes the data with it and sends it to Kafka in binary format, with the schema ID embedded in the message. When the consumer processes this message, it reads the schema ID, fetches the corresponding schema from the schema registry, and deserializes the data using that same schema.
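As a sketch of that flow, here is a producer using Confluent's Avro serializer (one common schema registry implementation); the topic name, registry URL, and the PageView schema are assumptions for illustration. KafkaAvroSerializer registers the schema on first use, caches the registry-assigned schema ID, and embeds that ID in every serialized message.

```java
// Sketch of the register-then-serialize flow with Confluent's serializer.
// Requires kafka-avro-serializer and avro on the classpath.
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerSketch {
    private static final String PAGE_VIEW_SCHEMA =
        "{\"type\":\"record\",\"name\":\"PageView\",\"fields\":["
      + "{\"name\":\"userId\",\"type\":\"string\"},"
      + "{\"name\":\"page\",\"type\":\"string\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        // Registers the schema if it is unseen, caches it, and prefixes
        // each message with the registry-assigned schema ID.
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        Schema schema = new Schema.Parser().parse(PAGE_VIEW_SCHEMA);
        GenericRecord record = new GenericData.Record(schema);
        record.put("userId", "u-42");
        record.put("page", "/home");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("page-views", "u-42", record));
        }
    }
}
```

On the consumer side, setting value.deserializer to io.confluent.kafka.serializers.KafkaAvroDeserializer with the same schema.registry.url makes the deserializer read the embedded schema ID, fetch (and cache) the matching schema from the registry, and decode the record with it.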
If there is a schema mismatch, the schema registry will return an error, letting the producer know that it's breaking the schema agreement.
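As a sketch of that failure mode, assuming the subject already holds the PageView schema above and the registry's default BACKWARD compatibility level: adding a required field with no default breaks backward compatibility, so the registration attempt made by the serializer is rejected when the producer sends.

```java
// Sketch: producing with an evolved but incompatible schema. The new
// required field "sessionId" (no default) breaks BACKWARD compatibility,
// so the registry rejects the registration and the send fails.
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class IncompatibleSchemaSketch {
    private static final String EVOLVED_SCHEMA =
        "{\"type\":\"record\",\"name\":\"PageView\",\"fields\":["
      + "{\"name\":\"userId\",\"type\":\"string\"},"
      + "{\"name\":\"page\",\"type\":\"string\"},"
      + "{\"name\":\"sessionId\",\"type\":\"string\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        Schema schema = new Schema.Parser().parse(EVOLVED_SCHEMA);
        GenericRecord record = new GenericData.Record(schema);
        record.put("userId", "u-42");
        record.put("page", "/home");
        record.put("sessionId", "s-1");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            // The serializer tries to register the evolved schema before the
            // send; the registry's compatibility check rejects it and the
            // failure surfaces as a SerializationException (the exact
            // exception chain varies by client version).
            producer.send(new ProducerRecord<>("page-views", "u-42", record));
        } catch (Exception e) {
            System.err.println("Rejected by the schema registry: " + e.getMessage());
        }
    }
}
```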
To be continued regarding data formats/CloudEvents