PyCon APAC is a worldwide gathering of the Python community. Besides talks on diverse topics, the annual conference serves as a platform for Python professionals and enthusiasts to exchange ideas, experiences, and the latest developments.
PyCon APAC 2022 | Talks | Title sponsors: Cathay Financial Holdings / Micron
✏️ Note: https://hackmd.io/@pycontw/HknC8amkj
Slido: https://app.sli.do/event/9SZY8fXP6EyYUYKY5kCaTQ
Language: Japanese talk with English slides
Level: Intermediate
Category: Other
Abstract
How can we create a program that can speak (not write) with a human? I love anime and fell in love with the movie "Sing a Bit of Harmony" (讓我聽見愛的歌聲). The character Shion, an AI (robot), is very attractive from an engineer's point of view, and I wanted to implement at least some of its functions. I implemented shion.py, a script that takes text from a human by voice and responds by voice. In short, it is like a smart speaker that parrots: the program reads the spoken text back aloud. I started with an easy implementation (a Web API and an OS command) to check the idea, then reworked it with pre-trained machine learning models to get closer to Shion. I will share those implementations with you, and I would be happy if they provide a little inspiration for your Maker project. Keywords (as hashtags): #TTS, #ASR, #subprocess, #SpeechRecognition, #ttslearn, #ESPnet, #soundfile, #HuggingFace
Description
Background
Sing a Bit of Harmony
The movie: Sing a Bit of Harmony
- https://www.comingsoon.net/anime/news/1193910-sing-a-bit-of-harmony-trailer-teases-2022-release
- https://youtu.be/1UeIEUoHZ6E
In my opinion, this is an awesome film.
It has won awards at film festivals: https://en.wikipedia.org/wiki/Sing_a_Bit_of_Harmony#Reception
In October 2021, Sing a Bit of Harmony won the Audience Award at the Scotland Loves Animation film festival.
Motivation
In Japan, fans support their favorite animated films by drawing illustrations.
I cannot draw illustrations, but I wanted to support this movie somehow.
In this film, an AI (robot, android) named Shion plays a key role.
Since Shion is an AI, some parts of it can be reproduced by writing a program.
So, instead of illustrations, I decided to support this film by implementing some of Shion's features.
Technical Details
Define: Shion v0.0.1
I started with a small implementation of Shion.
I implement Shion as software. (The hardware is future work.)
I defined Shion v0.0.1 as a program that enables the following:
1. We input voice into Shion (the program)
2. Shion transcribes speech into text
3. Shion processes the text
4. Shion reads the text out loud as response
Text processing is also worth devising, but this time the focus is on handling speech.
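The four steps above can be sketched as one function that wires the stages together. This is only an illustrative outline, not the actual shion.py: the names `respond_once` and `parrot` are hypothetical, and the stand-in lambdas take the place of a real microphone, ASR model, and TTS engine.

```python
from typing import Callable


def respond_once(
    listen: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    process: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one listen -> transcribe -> process -> speak cycle."""
    audio = listen()          # step 1: voice goes into the program
    text = transcribe(audio)  # step 2: speech is transcribed into text (ASR)
    reply = process(text)     # step 3: the text is processed
    speak(reply)              # step 4: the reply is read out loud (TTS)
    return reply


def parrot(text: str) -> str:
    """Text processing for a parroting speaker: return the text unchanged."""
    return text


# Stand-ins so the sketch runs without a microphone or any model.
spoken = []
reply = respond_once(
    listen=lambda: b"fake-audio",
    transcribe=lambda audio: "hello shion",
    process=parrot,
    speak=spoken.append,
)
```

Keeping each stage behind a plain callable means an OS command or Web API can later be swapped for a pre-trained model without touching the loop.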
Techniques to implement Shion
- Text-To-Speech enables a program to read text out loud
- Automatic Speech Recognition enables us to input voice into a program
Text-To-Speech (a.k.a. TTS)
- call OS command [1] (easy to implement, but depends on the environment)
- use a pre-trained machine learning model
Automatic Speech Recognition (a.k.a. ASR)
- call API like Cloud Speech-to-Text API [2]
- use a pre-trained machine learning model (to provide this feature without Internet access. Shion is standalone)
[1] Call the say command (macOS), similar to https://docs.python.org/3/howto/logging-cookbook.html#speaking-logging-messages
[2] https://cloud.google.com/speech-to-text
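As a rough sketch of the OS-command approach in [1], the snippet below builds and runs a macOS `say` command through `subprocess`. The helper names `build_say_command` and `speak` are hypothetical, and the voice name used in the usage note is only an assumption about what is installed on a given Mac. On machines without `say`, `speak` returns False instead of failing, which illustrates why this approach depends on the environment.

```python
import shutil
import subprocess
from typing import List, Optional


def build_say_command(text: str, voice: Optional[str] = None) -> List[str]:
    """Build the argument list for the macOS `say` command."""
    command = ["say"]
    if voice is not None:
        command += ["-v", voice]  # -v selects the voice to speak with
    command.append(text)
    return command


def speak(text: str, voice: Optional[str] = None) -> bool:
    """Read text aloud with `say`; return False when `say` is unavailable."""
    if shutil.which("say") is None:
        return False  # not macOS, or `say` is not on PATH
    subprocess.run(build_say_command(text, voice), check=True)
    return True
```

For example, `speak("こんにちは", voice="Kyoko")` would read Japanese text aloud, assuming a voice with that name is installed.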
Caveats
⚠️ This talk mainly deals with ASR and TTS in Japanese; ASR and TTS in English are covered on a best-effort basis.
⚠️ I am a beginner in ASR and TTS, so the focus will be on what implementations are possible. (I will not deal with the theory.)
About Speaker - nikkie
Nikkie began his career as a software engineer in 2016. He started Python as a hobby in 2017 and fell in love with it. Since 2019 he has worked on Natural Language Processing as a data scientist at Uzabase, Inc. in Tokyo, Japan.
He contributes to the Python community in Japan as a staff member of the following event:
- [PyCon Japan](https://www.pycon.jp/organizer/index.html): the largest PyCon in Japan
  - staff in 2019 and 2020 (Program committee; lead in 2020)
  - [chair](https://pyconjp.blogspot.com/2020/10/pyconjp-2021-chair.html) in 2021
He has given talks (and lightning talks) at many PyCons in Japan and abroad.
- EuroPython 2020, [PyCon APAC 2020](https://youtu.be/JiXnEA7pM7U) (English)
He loves anime (Japanese animation) as much as Python, and implements ideas related to anime in Python. In 2022, he has been writing code related to "Sing a Bit of Harmony" (e.g. a Twitter bot, a prototype AI character, etc.).
#pycontw #pyconapac2022 #python #tts #asr #subprocess #speechrecognition #ttslearn #espnet #soundfile #huggingface
Follow “PyCon Taiwan”
⭐️ Official Website: https://tw.pycon.org
⭐️ Facebook: https://www.facebook.com/pycontw
⭐️ Instagram: https://www.instagram.com/pycontw
⭐️ Twitter: https://twitter.com/PyConTW
⭐️ LinkedIn: https://www.linkedin.com/company/pycontw
⭐️ Blogger: https://pycontw.blogspot.com
...
https://www.youtube.com/watch?v=qPEGGlnTmA8
Day 1, R2 15:45–16:15
Before the EOL of Python 2, let's reflect on one of the most well-known nightmares: str. This talk will analyze the nightmare and provide a treatment for it. This is the right talk for you if you have encountered UnicodeEncodeError but do not yet know the root cause. After the talk, you will know the difference between str and bytes in both Python 2 and Python 3, and learn a cleaner way to write strings.
* This talk draws together many previous PyCon talks with real-world experience of supporting Python 3.
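To make the str/bytes distinction concrete, here is a small Python 3 sketch (not taken from the talk; the sample string and variable names are illustrative). In Python 3, str holds text and bytes holds encoded data, and converting between them is always explicit. In Python 2, str was a byte string and mixing it with unicode triggered implicit ASCII conversions, a common source of UnicodeEncodeError.

```python
text = "Café ☕"              # str: a sequence of Unicode code points
data = text.encode("utf-8")  # bytes: the UTF-8 encoding of that text

# The two types have different lengths: 6 characters, 9 bytes.
char_count = len(text)
byte_count = len(data)

# Decoding with the same codec restores the original text.
round_trip = data.decode("utf-8")

# Encoding to a codec that cannot represent the text fails loudly.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = type(exc).__name__

# Python 3 refuses to mix str and bytes implicitly.
try:
    text + data
    mix_error = None
except TypeError as exc:
    mix_error = type(exc).__name__
```

A common way to stay out of trouble is to decode bytes at the input boundary, work with str internally, and encode again only on output.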
Slides: https://www.slideshare.net/ssuser2cbb78/the-strbytes-nightmare-before-python2-eol
Speaker: Kir Chou
A code monkey who builds search services in the Amazon jungle. This will be his 3rd year at PyCon TW.
...
https://www.youtube.com/watch?v=M5CGocevX9Q
Day 2, R0 16:15–16:45
Neural networks work in "mysterious ways", but we can now peer into some of them to see how they work. This talk focuses on a tool called Lucid, from the TensorFlow team, and aims to show some interesting examples of different ways of visualizing neural networks with the goal of improving explainability. You will come away with a bit more understanding of how neural networks "learn", and what aspects they are "looking" at.
Slides not uploaded by the speaker.
Speaker: Yufeng Guo
Yufeng is a Developer Advocate focusing on Cloud AI, where he is working to make machine learning more understandable and usable for all.
He is the creator of the YouTube series AI Adventures, at yt.be/AIAdventures, exploring the art, science, and tools of machine learning.
He enjoys hearing about new and interesting applications of machine learning; share your use case with him on Twitter @YufengG.
...
https://www.youtube.com/watch?v=n6XRoQVlGpY
Deep Learning is popular and in demand these days, and it is becoming the state-of-the-art solution for many problems, such as image recognition, game playing, and autonomous driving.
We are also lucky that many Deep Learning frameworks and pre-trained models are free and easy to use, so we can stand on the shoulders of giants and get great results.
In this talk, we are going to introduce how we use TensorFlow and Python to solve image classification problems for more than 180,000 photos a day on PhotoGrid and to keep inappropriate content from appearing online.
We are going to have a brief introduction of Deep Learning and Transfer Learning, then share some experiences about the model training and performance tuning, and how we benefited from Deep Learning in practical use.
Slide Link:
https://goo.gl/BEzf10
PyCon Taiwan 2017 official: https://tw.pycon.org/2017/
PyCon Taiwan 2017 Facebook Fan Page: https://www.facebook.com/pycontw/
...
https://www.youtube.com/watch?v=AosR-pElbOQ
Speaker: Adrian Liaw
PyCuber is a Python package for working with Rubik's Cubes, and it includes a built-in solver. Behind both lies its implementation of Rubik's Cube algorithms (formulae), which provides several useful functions such as reverse, mirror, and optimise. This talk will cover all of these.
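To give a feel for what reverse, mirror, and optimise mean for a cube algorithm, here is a minimal standalone sketch. It does not use PyCuber's API; the function names and the list-of-strings representation are assumptions made for illustration, and only the basic face moves (U, D, L, R, F, B with an optional ' or 2 suffix) are handled.

```python
from typing import List


def invert_move(move: str) -> str:
    """Return the move that undoes `move` (R -> R', R' -> R, R2 -> R2)."""
    if move.endswith("'"):
        return move[:-1]
    if move.endswith("2"):
        return move
    return move + "'"


def reverse(moves: List[str]) -> List[str]:
    """Undo an algorithm: invert every move and apply them in reverse order."""
    return [invert_move(move) for move in reversed(moves)]


def mirror(moves: List[str]) -> List[str]:
    """Left-right mirror: swap the L and R faces and flip every turn direction."""
    swap = {"L": "R", "R": "L"}
    return [invert_move(swap.get(move[0], move[0]) + move[1:]) for move in moves]


def _quarter_turns(move: str) -> int:
    """Number of clockwise quarter turns a move represents (1, 2, or 3)."""
    if move.endswith("'"):
        return 3
    if move.endswith("2"):
        return 2
    return 1


_SUFFIX = {1: "", 2: "2", 3: "'"}


def optimise(moves: List[str]) -> List[str]:
    """Merge consecutive turns of the same face and drop turns that cancel."""
    result: List[str] = []
    for move in moves:
        face, turns = move[0], _quarter_turns(move)
        if result and result[-1][0] == face:
            turns = (_quarter_turns(result.pop()) + turns) % 4
            if turns == 0:
                continue  # the two moves cancelled each other
        result.append(face + _SUFFIX[turns])
    return result


algorithm = "R U R' U'".split()
```

With this representation, an algorithm followed by its reverse optimises to an empty list, which is a quick sanity check that the two functions agree with each other.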
About the speaker
A secondary school Python coder from Taiwan
Organization/Company: 木刻思股份有限公司
Title: Learning Reinforcer
https://tw.pycon.org/2015apac/zh/program/99
...
https://www.youtube.com/watch?v=1_M6ZKJmGyc