LBRY Block Explorer

LBRY Claims • integrate-schema-registry-kafka-in

6bd8fa93f8f003ee12a939da8dfa3a89e3c1d723

Published By:
Created On: 11 Sep 2022 09:32:49 UTC
Transaction ID:
Cost: Free
Safe for Work: Yes
Integrate Schema Registry & Kafka in Python to Build Streaming Processing|蘇揮原 Mars Su
PyCon APAC 2022|一般演講 Talks|國泰金控 Cathay Financial Holdings / 美光科技 Micron 冠名贊助 (Title Sponsors)

✏️ 共筆 Note:https://hackmd.io/@pycontw/BkwCLT7Ji
Slido: https://app.sli.do/event/sE3aaBEUsYZ7HiLkrvcrAp
投影片 Slides: https://drive.google.com/file/d/1NdBaMqZjVTucV_Z2EY2iFub826hF92ga/view?usp=sharing
語言 Language: 中文演講/英文投影片 Chinese talk w. English slides
層級 Level: 中階 Intermediate
分類 Category: 應用 Application

摘要 Abstract
In today's data-driven world, we are often faced with how to process and analyze data effectively and in real time, and streaming processing is an important answer. In addition, data carries different schemas for different applications and needs. To ensure data correctness and availability in a streaming application, schema verification needs to be integrated into the streaming process. To that end, I will start by introducing the concept and use cases of streaming processing and two services, Apache Kafka and Schema Registry. Kafka is a message queue system that can handle a large amount of streaming data, and Schema Registry is a service built on top of Kafka that helps us perform schema verification while producing data to Kafka or consuming data from Kafka. Lastly, I will share how to use Python to integrate these two services and implement a reliable streaming process.
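
As a rough illustration of the kind of integration described above, here is a minimal producer sketch using the confluent-kafka Python client with an Avro schema managed by Schema Registry. The broker and registry addresses, topic name, and record schema are illustrative assumptions, not details from the talk.

    # Minimal sketch: produce Avro-encoded messages to Kafka with schema
    # verification via Schema Registry (addresses, topic, and schema are
    # illustrative assumptions).
    from confluent_kafka import Producer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroSerializer
    from confluent_kafka.serialization import SerializationContext, MessageField

    schema_str = """
    {
      "type": "record",
      "name": "UserEvent",
      "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "action",  "type": "string"}
      ]
    }
    """

    registry = SchemaRegistryClient({"url": "http://localhost:8081"})
    avro_serializer = AvroSerializer(registry, schema_str)
    producer = Producer({"bootstrap.servers": "localhost:9092"})

    def delivery_report(err, msg):
        # Called once per message to report delivery success or failure.
        if err is not None:
            print(f"Delivery failed: {err}")
        else:
            print(f"Delivered to {msg.topic()} [{msg.partition()}] @ offset {msg.offset()}")

    event = {"user_id": "u-123", "action": "login"}
    producer.produce(
        "user-events",
        value=avro_serializer(event, SerializationContext("user-events", MessageField.VALUE)),
        on_delivery=delivery_report,
    )
    producer.flush()  # block until all queued messages are delivered

The AvroSerializer registers the schema with Schema Registry when needed and refuses to serialize records that do not match it, which is the schema verification step mentioned above.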

說明 Description
Abstract
In this session, I will start by explaining the difference between batch and streaming processing to help participants establish the basic concepts, and I will introduce the importance and typical use cases of streaming processing. I will then highlight the architecture and purpose of Apache Kafka and Schema Registry. Next, I will show how to implement a streaming process in Python, including producing data, consuming data, and performing data schema verification, through example code and a demo. Lastly, I will share some important settings and show how to fine-tune the producer and consumer to achieve high throughput and low latency in a streaming process.
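
For the consuming side, a matching minimal sketch might look like the following; again the connection details, group id, and topic are assumptions made for illustration.

    # Minimal sketch: consume messages and decode them with the writer schema
    # fetched from Schema Registry (connection details are assumptions).
    from confluent_kafka import Consumer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroDeserializer
    from confluent_kafka.serialization import SerializationContext, MessageField

    registry = SchemaRegistryClient({"url": "http://localhost:8081"})
    avro_deserializer = AvroDeserializer(registry)

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "user-events-demo",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["user-events"])

    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to 1 s for a message
            if msg is None:
                continue
            if msg.error():
                print(f"Consumer error: {msg.error()}")
                continue
            event = avro_deserializer(
                msg.value(), SerializationContext(msg.topic(), MessageField.VALUE)
            )
            print(f"Received: {event}")
    finally:
        consumer.close()

The AvroDeserializer looks up the writer schema by the id embedded in each message, so the consumer always decodes data with the schema it was produced under.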

Session Outline
1. Batch vs. Streaming
- Introducing the difference between batch and streaming processing.
2. Why do we need streaming processing?
- Introducing the importance and use cases of streaming processing.
3. Talk about Apache Kafka
- Introducing Apache Kafka: its purpose, architecture, and components.
4. Talk about Schema Registry
- Introducing Schema Registry: its purpose, schema evolution strategies, and schema verification.
5. Python library - confluent-kafka client
- Introducing the key Python library that helps integrate Apache Kafka and Schema Registry.
6. How to produce a message?
- Demo of example producer code in Python.
7. How to consume a message?
- Demo of example consumer code in Python.
8. Fine-tune the producer and consumer.
- Introducing important settings and how to achieve high throughput on the producer and consumer side (a configuration sketch follows this outline).
9. Conclusion
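
As a rough hint of what the fine-tuning step in item 8 touches on, here is a small configuration sketch with a few librdkafka settings commonly adjusted for throughput; the specific values are illustrative assumptions, not recommendations from the talk.

    # Sketch of producer/consumer settings commonly tuned for throughput
    # (values are illustrative assumptions, not the talk's recommendations).
    from confluent_kafka import Producer, Consumer

    producer = Producer({
        "bootstrap.servers": "localhost:9092",
        "linger.ms": 50,                 # wait up to 50 ms to fill a batch
        "batch.num.messages": 10000,     # allow larger batches per request
        "compression.type": "lz4",       # compress batches on the wire
        "acks": "all",                   # durability vs. latency trade-off
    })

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "tuned-demo",
        "fetch.min.bytes": 1048576,      # wait for ~1 MB of data per fetch...
        "fetch.wait.max.ms": 500,        # ...but no longer than 500 ms
        "enable.auto.commit": False,     # commit after processing (at-least-once)
    })

Larger batches and compression generally raise throughput at the cost of a little extra latency, which is the trade-off the fine-tuning section discusses.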

Reference
1. Apache Kafka Introduction: https://kafka.apache.org/intro
2. Schema Registry Introduction on Confluent: https://docs.confluent.io/platform/current/schema-registry/index.html
3. Schema Registry Introduction on Confluent YouTube: https://www.youtube.com/watch?v=_x9RacHDQY0
4. Confluent's Kafka Python Client - GitHub: https://github.com/confluentinc/confluent-kafka-python
5. Streaming Data: How it Works, Benefits, and Use Cases: https://www.confluent.io/learn/data-streaming/
6. 6 Most Common Streaming Data Use Cases: https://www.upsolver.com/blog/6-most-common-streaming-data-use-cases

講者介紹 About Speaker - 蘇揮原 Mars Su
A Senior ML/Data Engineer at Gogolook. I am currently in charge of implementing streaming ETL infrastructure and NLP-related ML models and applications. I have 4+ years of experience in data science and data engineering, including NLP and streaming (micro-batch) ETL design. My research interests include NLP-related algorithms, models, and papers, streaming data pipelines, and cloud services. I hope I can contribute something to the data world.

#pycontw #pyconapac2022 #python #kafka #streaming

Follow “PyCon Taiwan”
⭐️ Official Website: https://tw.pycon.org
⭐️ Facebook: https://www.facebook.com/pycontw
⭐️ Instagram: https://www.instagram.com/pycontw
⭐️ Twitter: https://twitter.com/PyConTW
⭐️ LinkedIn: https://www.linkedin.com/company/pycontw
⭐️ Blogger: https://pycontw.blogspot.com
...
https://www.youtube.com/watch?v=X0HryRZ7BnQ
Author: Unspecified
Content Type: video/mp4
Language: English