Making Kafka Queryable with Apache Pinot • Tim Berglund • GOTO 2023

This presentation was recorded at GOTO Copenhagen 2023. #GOTOcon #GOTOcph

Tim Berglund - VP DevRel StarTree & Author of “Gradle Beyond the Basics” @tlberglund @StarTree

ABSTRACT
Apache Kafka has become the standard infrastructure for event-driven and streaming data systems. The stunningly simple abstraction of the distributed log provides exactly what modern microservices and real-time systems need, but no choice is without its tradeoffs. Logs are an excellent way to keep track of events, but they are notoriously difficult to query. Given a constellation of services exchanging events with each other and reacting to inputs in real time, how can you find out, and gain insight into, what has just happened? How, in other words, do you query a log?

This is where Apache Pinot comes in. Developed at LinkedIn alongside Kafka, Pinot is a distributed, real-time analytics database designed to ingest data from Kafka (and other sources) and make it instantly queryable at low latency in the face of a huge number of concurrent requests. All that data tucked neatly away into topics, maintaining an immutable record of how the state of the system has evolved, can now be ingested into Pinot and made accessible through simple SQL queries.

This talk explores Pinot’s internal architecture, how its integration with Kafka is specially optimized, and how Pinot fits architecturally in the modern streaming stack. You’ll leave understanding how Pinot works, how it fits together with Kafka, where it has been used successfully in the real world, and what steps to take next in your own Pinot learning journey. [...]
TIMECODES
00:00 Intro
02:57 A brief history
12:53 Pinot architecture
24:04 Indexes
32:29 Ingest
41:51 Remember our history
44:57 Outro

RECOMMENDED BOOKS
Tim Berglund • Gradle Beyond the Basics
Tim Berglund & Matthew McCullough • Building and Testing with Gradle
Mark Needham • Building Real-Time Analytics Systems
Gwen Shapira, Todd Palino, Rajini Sivaram & Krit Petty • Kafka: The Definitive Guide
Adi Polak • Scaling Machine Learning with Spark

#ApachePinot #Analytics #RealTime #RealTimeAnalytics #TimBerglund #StarTree #StarTreeCloud #Cloud #ApachePinotTutorial #ApachePinotTraining #Snowflake #ApacheZooKeeper #ApacheHelix #Hadoop #ApacheSpark