Kafka REST to Google BigQuery Data Pipeline Use Case

In this use case, we will create a data pipeline that starts from a REST producer, consumes data from the topic using Kafka Connect, and inserts it into a Google BigQuery table. In other words, the data is streamed into BigQuery. Note that Google charges for streaming data into BigQuery. This […]

Spark Streaming from Kafka to HBase Use Case

Let’s assume that our data is stored on a Kafka cluster and should be moved to another storage layer, which will be HBase in this case. A few transformations need to be made before the data is moved. These steps are depicted in the architecture below. Data could only be collected using the Spark […]

Kafka Kerberos Configuration on secured Cloudera Cluster

Introduction: Apache Kafka is an open-source distributed streaming platform developed by LinkedIn and donated to the Apache Software Foundation. It is robust, horizontally scalable, and has a flexible architecture. Kafka retains ingested data for a limited time, and this data can be used by multiple teams in a company. This can create a vulnerable situation in […]

Apache Kafka Streams DSL Stateless Transformations

State is not needed when processing data sequentially (for example, instant arithmetic calculations). I examine all stateless transformations with samples. The processing logic is written in Java 8. The dependency below is used to build the logic. 1. Branch (or split): The branch transformation is used when a source topic is split into different child downstream topics. KStream –> […]

Apache Kafka Streams DSL Stateful Transformations

Stateful transformations use a state store to process input records and produce output from them. Aggregations, joins, and windowing operations need the state stores of each previous stream processor (task) to accumulate the final status of the elements. In this topic, the Streams DSL stateful transformations are examined with samples. The sample logic is developed using Java […]