Write Your Own Micro Data Processing Framework in Python|David Chen|PyCon TW 2016
PyCon Taiwan 2016 | Talks
Abstract: Data processing frameworks are the core element of Big Data. They provide good abstractions for computing resources and logic. In this talk, I will use Google MapReduce (written in Python) to introduce key components used in common data processing frameworks, such as message queues, pipelines, object collection, fault tolerance, and task flow. Then I will use a micro framework written in Django to demo how data processing works.
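As a minimal sketch of the map/shuffle/reduce flow such a framework is built around (illustrative only; the function names and the word-count example are not from the talk's framework):

from collections import defaultdict

def map_phase(documents):
    # Emit (word, 1) pairs from each document.
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    # Group intermediate values by key, the role a message queue / pipeline
    # stage plays in a real framework.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate the values collected for each key.
    return {key: sum(values) for key, values in groups.items()}

if __name__ == "__main__":
    docs = ["big data is big", "data processing in python"]
    print(reduce_phase(shuffle(map_phase(docs))))
    # {'big': 2, 'data': 2, 'is': 1, 'processing': 1, 'in': 1, 'python': 1}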
About Speaker - David Chen: GliaCloud founder and co-organizer of GCPUG. GliaCloud is a startup focused on AI and data analysis. I like coding, ramen, and skiing. Most of the time, I work on Python code on Google Cloud Platform for big data processing and cloud architecture design. It is my pleasure to join PyCon with so many passionate community members.
Speaker: Chia-Chi Chang
https://docs.google.com/document/d/1LwQG8pLLO2PEviExoU3xiqyclnaT9kfhwsCGVKVRB2I/edit?usp=sharing
There are several data mining tools in Python, and you can use them to deal with almost every kind of data (numeric, text, image, audio, ...) you meet. There are also lots of modeling tools in Python, which you can use to build the FIRST LIGHTNING MODEL to solve your problems.
However, if you want to solve problems deeply, most of the time you need to write your own customized models and solve them yourself. Instead of relying on fast modeling tools, you need to know more about the essential things in modeling:
- What is a model?
- How do models solve your problems?
- What is the connection between models and data?
- What is the important data? What is the important model?
I deeply believe that the more you know about the connection between models and data, the deeper the problems you can solve.
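As a hedged illustration of that connection (my own toy example, not material from the talk): given data and an evaluator (a loss function), finding a model amounts to searching for the parameters the evaluator scores best.

import numpy as np

# Data: noisy samples of y = 2x + 1 (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2 * x + 1 + rng.normal(scale=0.1, size=x.size)

# Evaluator: mean squared error of a candidate linear model (w, b)
def evaluate(params):
    w, b = params
    return np.mean((w * x + b - y) ** 2)

# Direct problem: Data + Evaluator -> Model, here by brute-force grid search
candidates = [(w, b) for w in np.linspace(0, 3, 61) for b in np.linspace(0, 2, 41)]
best = min(candidates, key=evaluate)
print("best (w, b):", best)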
Outline:
What is Modeling?
Data, Model, Evaluators
Direct Problem: Data + Evaluators → Model
Inverse Problem: Data + Models → Evaluator
Hacking Models with Metric Learning
Data as a Model & Model as a Data
Duality between Dimension Reduction and Clustering
...
https://www.youtube.com/watch?v=PznPp-BbwyU
Day 1, R1 15:55–16:10
Productization of machine learning (ML) solutions can be challenging. Therefore, the concept of operationalization of machine learning (MLOps) has emerged in the past few years for effective model lifecycle management. One of the core aspects of MLOps is "monitoring".
ML models are built by experimenting with a wide range of datasets. However, since real-world data continues to change, it is necessary to monitor and manage the usage, consumption, and results of models.
MLflow is an open-source framework designed to manage the end-to-end ML lifecycle with different components. In this talk, the basic concepts of MLflow will be introduced. Then, MLflow Tracking will be the main focus. You will learn how to track experiments, recording and comparing parameters and results, with MLflow Tracking.
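For example, a minimal MLflow Tracking sketch looks like the following (the experiment name, parameter values, and metric value are placeholders, not examples from the talk):

import mlflow

# Point the client at a tracking server if you have one; by default runs are
# recorded in a local ./mlruns directory.
# mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("demo-experiment")  # hypothetical experiment name

with mlflow.start_run(run_name="baseline"):
    # Record the parameters used for this run
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 100)

    # ... train and evaluate the model here ...

    # Record the resulting metrics so runs can be compared in the MLflow UI
    mlflow.log_metric("rmse", 0.78)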
Slides: https://speakerdeck.com/sucitw/track-machine-learning-applications-by-mlflow-tracking
Speaker: Shuhsi Lin
A data engineer and Python programmer, currently working on various data applications in a manufacturing company.
Research interests: IoT applications, data stream processing, data analysis, and data visualization.
...
https://www.youtube.com/watch?v=76QWG9di1Hs
PyCon APAC 2022 | Talks | Title sponsors: Cathay Financial Holdings / Micron
✏️ Note: https://hackmd.io/@pycontw/rJ5aL67yo
Slido: https://app.sli.do/event/ek2NTzkytzYkU1XVkpvdzk
Language: English
Level: Intermediate
Category: Application
Abstract
Finding a working shift schedule can be a tedious task: there are usually various requirements to juggle all at once, from time-offs, shift types, and conflicting schedules to personal preferences, and one can easily miss a requirement or two by mistake. Instead, we can use so-called "solvers" and leave the heavy lifting to the computer to find not just a valid schedule, but a good one; as it turns out, the same technique also works for a broad spectrum of problems such as conference scheduling, vehicle routing, bin packing, and more.
Description
This talk will likely interest those who like to explore using Python to solve logical problems, because in some ways, finding a shift schedule is a lot like playing Sudoku. There are a lot of holes that you need to fill, but only certain elements can fit in them. So if we can write programs that solve Sudoku, then surely we can write programs that produce a good shift schedule, right?
Indeed we can, but there is an even better approach. Instead of writing our own custom program that specifically schedules shifts, we can use off-the-shelf solvers that are very good at solving a more general kind of problem, and transform our original problem into a form they understand. Behind the scenes, these off-the-shelf solvers use techniques similar to those of the Sudoku solver mentioned above (e.g. backtracking and constraint propagation), albeit much more advanced, since tons of optimization and clever tuning have gone into making them work well.
By the end of the talk, you'll know the basics of Z3's Python API, how to formulate the shift scheduling problem in propositional logic, and have a rough idea of how the same approach can be applied to similar problems such as conference scheduling, vehicle routing, bin packing, and more.
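As a rough sketch of that formulation (the names, week length, and constraints below are my own illustration, not the exact model from the talk): one Boolean per (person, day) pair, with constraints encoding coverage, workload limits, and time-offs.

from z3 import Bool, Solver, Or, AtMost, Not, is_true, sat

people = ["amy", "bob", "cat"]
days = range(7)

# works[p, d] is True iff person p is on shift on day d
works = {(p, d): Bool(f"works_{p}_{d}") for p in people for d in days}

s = Solver()
for d in days:
    on_duty = [works[p, d] for p in people]
    s.add(Or(on_duty))          # every day is covered by someone...
    s.add(AtMost(*on_duty, 1))  # ...but by at most one person

for p in people:
    # nobody works more than 3 days in this (hypothetical) week
    s.add(AtMost(*[works[p, d] for d in days], 3))

s.add(Not(works["amy", 0]))     # a time-off request: amy is off on day 0

if s.check() == sat:
    m = s.model()
    for d in days:
        print(d, [p for p in people if is_true(m.evaluate(works[p, d]))])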
The audience is expected to have a good understanding of list comprehensions and itertools, since both will be used heavily throughout the talk. Ideally, some experience with predicate logic is preferred, though it is not strictly necessary. No domain knowledge is required.
- Side note: Z3 is an immensely powerful tool; this talk can only cover the tip of the iceberg in terms of what it can do. Other interesting uses of Z3 include software verification, program synthesis, exploit generation, code simplification, etc.
About Speaker - Shung-Hsi Yu
Kernel Engineer at SUSE working on BPF
#pycontw #pyconapac2022 #python #shiftschedule #algorithm #z3
Follow “PyCon Taiwan”
⭐️ Official Website: https://tw.pycon.org
⭐️ Facebook: https://www.facebook.com/pycontw
⭐️ Instagram: https://www.instagram.com/pycontw
⭐️ Twitter: https://twitter.com/PyConTW
⭐️ LinkedIn: https://www.linkedin.com/company/pycontw
⭐️ Blogger: https://pycontw.blogspot.com
...
https://www.youtube.com/watch?v=T6Q2fPnPgUU
Day 2, R2 11:45–12:15
The black-box problem has become a growing concern when applying machine learning in specific applications, like medical systems, where a user is supposed to understand the behavior of the system. Collecting tons of data to train a machine learning model is another headache, especially when you are building a system from scratch. In this talk, I introduce a data analysis approach called "Sparse Modeling" that can produce good results even when the amount of data is small; the Event Horizon Telescope project, which captured the black hole image, is one good example. Sparse Modeling is also referred to as explainable, since it can tell you which input features have a strong impact on the result generated by a machine learning model. After an overview of the method, I'll show concrete code examples for common use cases like image analysis, using a Python library named spm-image.
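The spm-image examples are shown in the talk itself; as a stand-in illustration of the sparse-modeling idea (using scikit-learn's Lasso rather than spm-image's API, and synthetic data), an L1 penalty drives most coefficients to zero, so the surviving nonzero features are the ones that matter:

import numpy as np
from sklearn.linear_model import Lasso

# Small synthetic dataset: only 3 of 50 features actually influence y
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 50))
true_coef = np.zeros(50)
true_coef[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = X @ true_coef + rng.normal(scale=0.1, size=40)

# The L1 penalty pushes irrelevant coefficients to exactly zero
model = Lasso(alpha=0.1).fit(X, y)
important = np.flatnonzero(model.coef_)
print("features with nonzero weight:", important)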
Slides: https://speakerdeck.com/hacarus/getting-started-with-sparse-modeling-with-spm-image
Speaker: Takashi Someda
After getting his master's degree in informatics at the Graduate School of Kyoto University, he started his career at Sun Microsystems as an engineer.
Over roughly 20 years in the software industry, he has taken on several roles, including software developer, technical evangelist, and data scientist.
Now, as CTO of Hacarus, he is responsible for technical direction, with a strong passion for building a creative, self-organized team like Pixar.
...
https://www.youtube.com/watch?v=KjPyDhbqzKE
Speaker: Lee Yang Peng
I developed and evaluated Analytics, a tool that analyses packet data to learn information about network protocol formats. Analytics attempts to discover constants and enumeration fields in packet data, while providing visualization to aid analysts. My experiments on fixed-length protocol headers show that the heuristics implemented in Analytics for detecting constants and enumeration fields are mostly accurate: it has an average accuracy of 76.8% in detecting constants and 88.6% in detecting enumeration fields. Because Analytics consists of heuristics that detect the targeted fields in network traces, it can also be applied to proprietary or unknown protocols.
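A hypothetical sketch of such a heuristic (the threshold, function name, and toy header are my own, not Analytics' actual implementation): look at each byte offset across many packets and classify it by how many distinct values appear there.

from collections import Counter

def classify_offsets(packets, enum_threshold=8):
    # packets: list of bytes objects of equal (header) length.
    # enum_threshold: max number of distinct values to still call a field an enumeration.
    length = min(len(p) for p in packets)
    results = {}
    for offset in range(length):
        values = Counter(p[offset] for p in packets)
        if len(values) == 1:
            results[offset] = "constant"
        elif len(values) <= enum_threshold:
            results[offset] = "enumeration"
        else:
            results[offset] = "variable"
    return results

# Example: a 4-byte toy header -- magic byte, version, type, sequence number
packets = [bytes([0xAB, 0x01, n % 3, n]) for n in range(200)]
print(classify_offsets(packets))
# offsets 0 and 1 -> constant, 2 -> enumeration, 3 -> variable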
From my talk, the audience can learn about network security and its significance. Poor network security can leave vulnerabilities in an organization, which may result in commercial espionage, the leakage of company secrets, or the takeover of computers connected to the network to perform illegal activities. The audience can also benefit from learning about Deep Packet Inspection, a common process used in large organizations to maintain network security and prevent the transfer of malicious data through a network. Experts in the field can appreciate the tool, Analytics, which demonstrates the use of Python in garnering information about unknown network protocol formats.
About the speaker
I'm a 16-year-old student from Dunman High School.
Organization/Company: Dunman High School
Title: Student
...
https://www.youtube.com/watch?v=7qsixKitI18