#Kafka

Jan 26, 2021
SAK Meetup: Kafka Streams and the Suppress Operator

Backstory: A few months after I moved to Stockholm, I decided to support the Stockholm Apache Kafka Meetup (SAK). So, to kick off the 2021 season, I gave a talk about a tutorial that I had created a few months earlier and published on https://kafka-tutorials.confluent.io. Tutorial: Emit a final result from a time window

Jun 20, 2020
Repost 🔃 Kafka Streams, co-partitioning requirements illustrated

Back in 2020, I wrote an article about data exchange between Kafka Streams instances. I was not working with Kafka Streams at that specific time, but my head was full of ideas I wanted to put on paper after using the library for quite a while. The way joins happen was one of those ideas.

Jun 1, 2020
EDML: the Confluent Webinar edition

The abstract (translated from French 🇫🇷): Serving Machine Learning models for real-time prediction poses challenges in both Data Engineering and Data Science. How do you build a modern pipeline that makes predictions continuously? In a supervised setting, how do you combine tracing with performance tracking?

Nov 28, 2019
Event Driven Machine Learning (Xebicon'19)

The abstract (translated from French 🇫🇷): Serving Machine Learning models for real-time prediction poses challenges in both Data Engineering and Data Science. How do you build a modern pipeline that makes predictions continuously? In a supervised setting, how do you combine tracing with performance tracking?

Oct 1, 2019
Kafka Streams Poison Pills (Kafka Summit SF'19)

The abstract: Apache Kafka’s Streams API lets us process messages from different topics with very low latency. Messages may have different formats, schemas and may even be serialised in different ways. What happens when an undesirable message comes in the flow? When an error occurs, real-time applications can’t always wait for manual recovery and need to handle such failures.

Jun 4, 2019
Kafka Streams On k8s: The Difficulties

The abstract (translated from French 🇫🇷): Auto scaling is the flagship selling point of many Data Engineering technologies. Kafka Streams is among the tools of the moment. With its tight integration with the Apache Kafka message bus, it is designed to be a distributed framework that can scale. Yet in practice, using it on its own has its limits.

Apr 18, 2019
Kafka Streams Poison Pills (DEVOXX France'19)

The abstract (translated from French 🇫🇷): Kafka Streams, Apache Kafka’s real-time data processing library, can process a large volume of messages with very low latency. Messages may have different formats, different schemas, and may even be serialised in different ways. So what happens when an undesirable message ends up in a stream?

Apr 15, 2019
Repost 🔃 Kafka Streams: a road to Autoscaling via Kubernetes

There are many reasons for working on community contributions such as a blog post, a demo, or a talk. Sometimes, you produce those contributions to share something that you’ve learned at work. But sometimes, the contribution itself can be a way to learn and experiment with something new. That was my situation when I worked on the article Kafka Streams: a road to Autoscaling via Kubernetes.

Nov 20, 2018
Scale in / Scale out with Kafka Streams and Kubernetes

The abstract: Apache Kafka’s Streams API lets us process messages from different topics with very low latency. Messages may have different formats, schemas and may even be serialised in different ways. What happens when an undesirable message comes in the flow? When an error occurs, real-time applications can’t always wait for manual recovery and need to handle such failures.

Mar 12, 2018
Processor API: the dark side of Kafka Streams

The abstract (translated from French 🇫🇷): Complex and tedious, the Processor API is often set aside. That is a shame, especially once you learn that the greatest features of Kafka Streams are hidden there, notably stateful operations and interactive queries. Despite this, it is the Streams DSL, the high-level API, that has won developers over.

Loïc M. DIVAD