Ami is a data scientist who has worked for the past five years at Final, a financial algorithms company in Israel. Before that, as part of his Ph.D. studies, he lectured at Tel Aviv University. Between 2000 and 2005, he worked at IBM's Haifa Research Labs as a researcher in the field of large distributed storage systems.
In 2010 he received a Ph.D. in Electrical Engineering from Tel Aviv University, in the field of financial information theory. His bachelor's and master's degrees are also from Tel Aviv University.
Ami uses Python and C++ for data analysis. He has contributed to various open source projects, and is the author of a libstdc++ extension shipped with g++ (pb_ds: policy-based data structures).
...
https://www.youtube.com/watch?v=neGrScpSy6w
Speaker: Jimmy Lai
The knowledge graph is the new search engine technology. All the leading search engines, e.g. Bing, Google, and Yahoo, use knowledge graphs to give users more accurate results.
In this talk, the speaker will demonstrate how to build a searchable knowledge graph from scratch, applying many Python tools along the way. The process includes data wrangling, graph entity indexing, full-text search, and web visualization. The data comes from dbpedia.org. A large number of entities are collected and stored in a graph database for relationship querying and in a full-text search engine for searching. In the web visualization, a searchable interface and visualized results present the knowledge to the user.
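The indexing and search steps described above can be sketched in a few lines. This is only a minimal in-memory illustration, not the speaker's implementation: the `KnowledgeGraph` class, its method names, and the sample triples are assumptions made for the example, and a real system would use a graph database and a full-text search engine as the abstract says.

```python
from collections import defaultdict

# Hypothetical sample data in (subject, predicate, object) form,
# standing in for entities wrangled from a source such as dbpedia.org.
TRIPLES = [
    ("Python", "type", "Programming language"),
    ("Python", "designer", "Guido van Rossum"),
    ("Guido van Rossum", "type", "Person"),
    ("Guido van Rossum", "birthPlace", "Haarlem"),
]


class KnowledgeGraph:
    """Toy graph store plus an inverted index for full-text search."""

    def __init__(self, triples):
        self.edges = defaultdict(list)  # subject -> [(predicate, object)]
        self.index = defaultdict(set)   # lowercase token -> {entity names}
        for subject, predicate, obj in triples:
            self.edges[subject].append((predicate, obj))
            for entity in (subject, obj):
                for token in entity.lower().split():
                    self.index[token].add(entity)

    def search(self, query):
        """Return entities whose names contain every token of the query."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in tokens))
        return sorted(hits)

    def relations(self, entity):
        """Return the outgoing (predicate, object) pairs of an entity."""
        return list(self.edges.get(entity, []))


kg = KnowledgeGraph(TRIPLES)
for name in kg.search("guido"):
    print(name, kg.relations(name))
```

Searching first and then following relations mirrors the split described in the abstract: one store answers "which entity is this?", the other answers "what is it connected to?".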
About the speaker
Jimmy Lai is a Python fan whose interests are natural language processing and machine learning. He specializes in combining machine learning algorithms and cloud computing technology for big data analysis and for building application services.
...
https://www.youtube.com/watch?v=3HB8vTbPJcI
PyCon APAC 2022 | Talks | Title sponsors: Cathay Financial Holdings / Micron
✏️ Collaborative notes: https://hackmd.io/@pycontw/B17AUTXks
Slido: https://app.sli.do/event/aH6RxX7bYGg8WR3GvM3Q3X
Slides: https://docs.google.com/presentation/d/1OEAJF6cP_m62QMiprRzWEg6jOclR-kHdLl8g9hk9iZw/edit?usp=sharing
Language: Chinese talk with English slides
Level: Experienced
Category: Python Core
Abstract
Despite their power and flexibility, the low-level APIs that asyncio provides for library and framework developers are underutilized. One reason might be the high entry barrier: prerequisites such as knowledge of network protocols and asynchronous design patterns. I am sharing my humble experience with asyncio here. I will show you how to implement a network protocol, specifically WebSocket, with Protocol and Transport, two of the staple low-level APIs in the package. You will learn how to establish a mutually agreed connection and communicate between the server and the client. While the main focus is on low-level APIs (e.g., we will use Transport to send messages to another endpoint), we will also use some high-level APIs (e.g., StreamReader, Queue) for auxiliary purposes such as message handling. I hope that the techniques and familiarity with asyncio gained here will serve as the basis for your own customized network protocols and libraries, even for commercial purposes, in the future.
Description
Asynchronous web services and frameworks have become ubiquitous in recent years due to their efficiency. In Python development, this very commonly means using packages like aiohttp for sending async network requests, or Uvicorn for setting up web servers for backend frameworks like FastAPI, to name just a few. Though these packages cover most common scenarios and are production ready, their one-size-fits-all approach may fail Python web developers who have more flexible and specialized uses in mind. For more customizability and specialized feature development, we therefore delve into the lower-level APIs provided in asyncio, such as Protocol and Transport, which serve as the basis on which the aforementioned packages were built. Doing so allows us, for example, to develop a custom inter-process communication (IPC) library or to create a network protocol.
In this talk, I will implement an application-layer protocol, specifically WebSocket, as the motivating example, using the low-level Transport and Protocol APIs from the asyncio package to handle the connection and communication between the server and the client; this is the main focus of the presentation. However, due to time constraints, I will still use high-level APIs to speed up building part of the data reception from the client.
This presentation will familiarize you with:
- asyncio low-level APIs such as Protocol and Transport, so that you can build your own async framework, inter-process communication, network protocols, etc.
- The patterns of async development
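As a rough idea of what working with Protocol and Transport looks like, here is a minimal sketch of the server side of the WebSocket opening handshake. It is not the speaker's code: the `HandshakeProtocol` class and `accept_key` helper are names invented for this example, frame parsing after the upgrade is omitted, and error handling is reduced to a single 400 response. The GUID and the accept-key computation follow RFC 6455.

```python
import asyncio
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


class HandshakeProtocol(asyncio.Protocol):
    """Hypothetical protocol that only performs the opening handshake."""

    def connection_made(self, transport):
        # asyncio hands us the Transport used to write back to the peer.
        self.transport = transport
        self.buffer = b""
        self.upgraded = False

    def data_received(self, data):
        if self.upgraded:
            return  # WebSocket frame parsing is out of scope for this sketch
        self.buffer += data
        if b"\r\n\r\n" not in self.buffer:
            return  # wait until the full HTTP header block has arrived
        head = self.buffer.split(b"\r\n\r\n", 1)[0].decode("latin-1")
        headers = {}
        for line in head.split("\r\n")[1:]:  # skip the request line
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        key = headers.get("sec-websocket-key")
        if key is None:
            self.transport.write(b"HTTP/1.1 400 Bad Request\r\n\r\n")
            self.transport.close()
            return
        response = (
            "HTTP/1.1 101 Switching Protocols\r\n"
            "Upgrade: websocket\r\n"
            "Connection: Upgrade\r\n"
            f"Sec-WebSocket-Accept: {accept_key(key)}\r\n\r\n"
        )
        self.transport.write(response.encode("ascii"))
        self.upgraded = True


async def serve(host="127.0.0.1", port=8765):
    """Run the sketch as a server; host and port are arbitrary choices."""
    loop = asyncio.get_running_loop()
    server = await loop.create_server(HandshakeProtocol, host, port)
    async with server:
        await server.serve_forever()
```

Calling `asyncio.run(serve())` would start the server. The callback style (`connection_made`, `data_received`) is what distinguishes the low-level API from the stream-based high-level one: the event loop pushes bytes into the protocol instead of the code awaiting a read.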
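About the speaker: 姜韋辰 (daniel)
I am a backend engineer familiar with the commonly used Python web frameworks, and I enjoy taking on services for all kinds of business scenarios. I focus on system development and service optimization; when business needs call for it, I occasionally dive into writing some frontend code as well. Lately I have been fascinated by building and maintaining infrastructure.
#pycontw #pyconapac2022 #asyncio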
Follow “PyCon Taiwan”
⭐️ Official Website: https://tw.pycon.org
⭐️ Facebook: https://www.facebook.com/pycontw
⭐️ Instagram: https://www.instagram.com/pycontw
⭐️ Twitter: https://twitter.com/PyConTW
⭐️ LinkedIn: https://www.linkedin.com/company/pycontw
⭐️ Blogger: https://pycontw.blogspot.com
...
https://www.youtube.com/watch?v=fssvxxq7mLk
PyCon APAC 2022 | Tutorials | Title sponsors: Cathay Financial Holdings / Micron
✏️ Collaborative notes: https://hackmd.io/@pycontw/ryg0L67ys
Slido: https://app.sli.do/event/6DJ5vhaAUaAP3as8m7vcnw
Language: English
Level: Intermediate
Category: Project Tooling
Abstract
Acquiring massive amounts of public data from anywhere on the web is crucial in today's data age. Such an undertaking can be achieved with spiders, which have two components: (1) crawling, the means of finding the content of interest, and (2) extraction, the means of turning data into a structured format. However, the web changes so fast that scaling and maintaining these spiders becomes an issue. In this talk, we will create an end-to-end web crawling project that walks through each crucial step, the challenges at each stage, and the available tools and techniques to overcome them. We will be using Scrapy, one of the most popular Python web crawling frameworks, together with its ecosystem of tools.
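To make the two components concrete, here is a small standard-library sketch that separates crawling (finding links of interest) from extraction (turning markup into a structured record). It deliberately does not use Scrapy, so it stays self-contained; the sample HTML, the `/item/` URL convention, and the class and function names are all assumptions made for the illustration.

```python
from html.parser import HTMLParser

# Hypothetical page used in place of a real HTTP response.
SAMPLE_PAGE = """
<html><body>
  <a href="/item/1">Item 1</a>
  <a href="/about">About</a>
  <div class="product"><span class="name">Widget</span><span class="price">9.50</span></div>
</body></html>
"""


class LinkFinder(HTMLParser):
    """Crawling step: collect hrefs that look like content of interest."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith("/item/"):
                self.links.append(href)


class ProductExtractor(HTMLParser):
    """Extraction step: read span.name and span.price into a dict."""

    def __init__(self):
        super().__init__()
        self.item = {}
        self._field = None  # name of the field whose text we expect next

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.item[self._field] = data.strip()
            self._field = None


def crawl(html):
    """Return the links on the page that should be visited next."""
    parser = LinkFinder()
    parser.feed(html)
    parser.close()
    return parser.links


def extract(html):
    """Return a structured record with the price converted to a float."""
    parser = ProductExtractor()
    parser.feed(html)
    parser.close()
    item = dict(parser.item)
    if "price" in item:
        item["price"] = float(item["price"])
    return item


print(crawl(SAMPLE_PAGE))
print(extract(SAMPLE_PAGE))
```

Keeping the two steps in separate functions is what lets each be maintained on its own when a site changes: a new URL scheme touches only the crawling side, a new page layout only the extraction side.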
Description
Core libraries used:
- scrapy: one of the most popular libraries for web crawling in Python
- spidermon: Scrapy extension for monitoring spider execution
- web-poet: Page Object pattern for web data extraction
- scrapy-poet: Scrapy integration for web-poet
- jsonschema: lets you annotate and validate JSON documents
- scrapy-jsonschema: Scrapy integration for jsonschema
- playwright: automates browsers such as Chromium and Firefox
- scrapy-playwright: Scrapy integration for playwright
Helper libraries:
- SpiderKeeper
- number-parser
- dateparser
- maya
- price-parser
- extruct
- html-text
- scrapy-deltafetch
- scrapyrt
- autopager
Enterprise Tools:
- AutoExtract
- Zyte API
- Scrapy Cloud
About the speaker: Kevin Lloyd Bernal
Kevin is currently a Software Engineer at Zyte. He builds solutions to crawl the web at scale. He is part of the team that develops and maintains open source packages that enable developers to manage their parsing and crawling solutions effectively. He is also pursuing an MS in Computer Science at Georgia Tech, specializing in Machine Learning.
#pycontw #pyconapac2022 #python #webcrawler #scrapy
...
https://www.youtube.com/watch?v=pLucY2PoSts
Day 1, 14:05-14:35
Abstract
In this talk, we will share the complete process of how Quark-Engine replaced its core library to improve resilience and performance. We will also share the situations we came across and our strategies for continuing to grow in the open-source community. Quark-Engine is a well-known open-source Android malware analysis engine written in Python. Many of its essential features are based on Androguard, an open-source Python package for analyzing Android files. However, Androguard is no longer maintained by its author. To ensure the health of Quark-Engine, we decided to replace Androguard with Rizin, one of the most popular open-source reverse engineering frameworks. There were many challenges behind this work, and we will share how we overcame each of them.
Description
Introduction of Quark-Engine
We will briefly introduce Quark-Engine, covering its key features, the design of its scoring system, and how it is used. We will also take an Android malware sample to show how Quark can analyze malware in a simple but practical way, and how it makes malware analysis more efficient.
Why does Quark-Engine need to change the core library?
Androguard is an open-source Python package for analyzing Android files. Quark relies on Androguard to implement its essential features. However, the project is no longer maintained, which puts the health of Quark-Engine at risk. Therefore, we decided to replace Androguard with Rizin, one of the most popular open-source reverse engineering frameworks, which is supported by a strong community.
What is Rizin?
Rizin supports executable file formats on most platforms. It can analyze, reassemble, and debug files, among other things. Rizin also has a robust community supporting the entire project, and it has almost all the features that Androguard has, which makes it a very good fit for replacing Androguard. After the replacement, we found that not only did Quark's health improve, but its performance also improved significantly.
What are the challenges of replacing the core library?
The two libraries are used differently, so many functions in Quark need to be redesigned. While replacing the core library, we must ensure that everything keeps working smoothly, which brings us a lot of challenges but also a lot of fun. We will share all these interesting findings in this talk.
Comparison of the two Quark-Engine versions
Finally, we will compare Rizin and Androguard. We will dive deep into the details, including performance and accuracy. Then, we will talk about how to evaluate performance with common tools and the strategy we used to optimize the Rizin-based Quark.
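One common way to compare two interchangeable backends on both accuracy and performance is a small harness built on the standard `timeit` module. The sketch below is only an illustration of that idea: `backend_a` and `backend_b` are hypothetical stand-ins that deduplicate and sort a list, not Androguard, Rizin, or Quark-Engine code, and the talk does not say which tools were actually used.

```python
import timeit


def backend_a(data):
    """Hypothetical stand-in for one library: set-based deduplication."""
    return sorted(set(data))


def backend_b(data):
    """Hypothetical stand-in for the other library: list-based deduplication."""
    seen = []
    for value in data:
        if value not in seen:
            seen.append(value)
    return sorted(seen)


def compare(backends, data, repeat=3, number=10):
    """Check each backend against the first one and time it.

    Returns {name: {"matches_reference": bool, "seconds_per_call": float}}.
    """
    report = {}
    reference = None
    for name, func in backends.items():
        result = func(data)
        if reference is None:
            reference = result  # the first backend defines the expected output
        best = min(timeit.repeat(lambda: func(data), repeat=repeat, number=number))
        report[name] = {
            "matches_reference": result == reference,
            "seconds_per_call": best / number,
        }
    return report


sample = [i % 50 for i in range(2000)]
report = compare({"backend_a": backend_a, "backend_b": backend_b}, sample)
for name, stats in report.items():
    print(name, stats)
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes, and checking outputs before timing keeps a faster but wrong backend from looking like an improvement.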
Slides not uploaded by the speaker.
...
https://www.youtube.com/watch?v=yaAEoMSepqQ