Reinforcement Learning with Augmented Data (Paper Explained)
This ONE SIMPLE TRICK can take a vanilla RL algorithm to state-of-the-art performance. What is it? Simply augment your training data before feeding it to the learner! The trick can be dropped into any RL pipeline and promises big improvements across the board.
Abstract: Learning from visual observations is a fundamental yet challenging problem in reinforcement learning (RL). Although algorithmic advancements combined with convolutional neural networks have proved to be a recipe for success, current methods are still lacking on two fronts: (a) sample efficiency of learning and (b) generalization to new environments. To this end, we present RAD: Reinforcement Learning with Augmented Data, a simple plug-and-play module that can enhance any RL algorithm. We show that data augmentations such as random crop, color jitter, patch cutout, and random convolutions can enable simple RL algorithms to match and even outperform complex state-of-the-art methods across common benchmarks in terms of data-efficiency, generalization, and wall-clock speed. We find that data diversity alone can make agents focus on meaningful information from high-dimensional observations without any changes to the reinforcement learning method. On the DeepMind Control Suite, we show that RAD is state-of-the-art in terms of data-efficiency and performance across 15 environments. We further demonstrate that RAD can significantly improve the test-time generalization on several OpenAI ProcGen benchmarks. Finally, our customized data augmentation modules enable faster wall-clock speed compared to competing RL techniques. Our RAD module and training code are available at this https URL.
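The augmentations the abstract lists are ordinary image operations applied to observations before the learner sees them. Here is a minimal sketch of two of them, random crop and patch cutout, on a plain nested-list grayscale image; RAD's released code works on batched tensors, so the function names and list-based representation here are illustrative assumptions, not the paper's implementation:

```python
import random

def random_crop(img, out_h, out_w):
    """Randomly crop an H x W image (list of rows) to out_h x out_w.
    Sketch of RAD's 'random crop' augmentation."""
    h, w = len(img), len(img[0])
    top = random.randint(0, h - out_h)
    left = random.randint(0, w - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]

def random_cutout(img, patch):
    """Zero out a random patch x patch square: the 'patch cutout'
    augmentation. Returns a copy; the input image is untouched."""
    h, w = len(img), len(img[0])
    top = random.randint(0, h - patch)
    left = random.randint(0, w - patch)
    out = [row[:] for row in img]
    for r in range(top, top + patch):
        for c in range(left, left + patch):
            out[r][c] = 0
    return out
```

The point of the paper is that applying such transforms to each observation, with everything else unchanged, is already enough to boost sample efficiency and generalization.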
Authors: Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas
My thoughts on the let-the-young-get-infected argument.
https://medium.com/amnon-shashua/can-we-contain-covid-19-without-locking-down-the-economy-2a134a71873f
Abstract:
In this article, we present an analysis of a risk-based selective quarantine model where the population is divided into low and high-risk groups. The high-risk group is quarantined until the low-risk group achieves herd-immunity. We tackle the question of whether this model is safe, in the sense that the health system can contain the number of low-risk people that require severe ICU care (such as life support systems).
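The key quantity behind any herd-immunity argument is the classic SIR threshold, 1 - 1/R0: the fraction of the freely mixing group that must become immune before the epidemic recedes. The sketch below shows the kind of back-of-envelope capacity check the article performs; every numeric parameter in the usage example is an illustrative assumption, not the authors' calibrated value:

```python
def herd_immunity_threshold(r0):
    """Fraction of the freely mixing population that must be immune
    before the epidemic recedes: the standard SIR result 1 - 1/R0."""
    return 1.0 - 1.0 / r0

def avg_icu_load(n_low_risk, infected_frac, severe_rate, icu_days, epidemic_days):
    """Rough average number of simultaneously occupied ICU beds in the
    low-risk group, assuming infections spread evenly over the epidemic."""
    total_severe = n_low_risk * infected_frac * severe_rate
    return total_severe * icu_days / epidemic_days
```

For example, with a hypothetical R0 of 2, about half the low-risk group must be infected before herd immunity kicks in, and the question becomes whether the resulting severe cases fit within ICU capacity at any one time.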
Authors: Shai Shalev-Shwartz, Amnon Shashua
Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
...
https://www.youtube.com/watch?v=XdpF9ZixIbI
#deeplearning #neuralinterpreter #ai
OUTLINE:
0:00 - Intro & Overview
3:00 - Model Overview
7:00 - Interpreter weights and function code
9:40 - Routing data to functions via neural type inference
14:55 - ModLin layers
18:25 - Experiments
21:35 - Interview Start
24:50 - General Model Structure
30:10 - Function code and signature
40:30 - Explaining Modulated Layers
49:50 - A closer look at weight sharing
58:30 - Experimental Results
Paper: https://arxiv.org/abs/2110.06399
Guests:
Nasim Rahaman: https://twitter.com/nasim_rahaman
Francesco Locatello: https://twitter.com/FrancescoLocat8
Waleed Gondal: https://twitter.com/Wallii_gondal
Abstract:
Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution. However, they are less capable of systematic generalization to data drawn from unseen but related distributions, a feat that is hypothesized to require compositional reasoning and reuse of knowledge. In this work, we present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules, which we call \emph{functions}. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. The proposed architecture can flexibly compose computation along width and depth, and lends itself well to capacity extension after training. To demonstrate the versatility of Neural Interpreters, we evaluate it in two distinct settings: image classification and visual abstract reasoning on Raven Progressive Matrices. In the former, we show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner. In the latter, we find that Neural Interpreters are competitive with respect to the state-of-the-art in terms of systematic generalization.
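The "functions" in Neural Interpreters share interpreter weights but are specialized by per-function code vectors via modulated linear (ModLin) layers. A minimal sketch of one plausible form, y = W(x ⊙ (Wc·c)) + b: the code is projected to an input-sized scale vector that modulates x before a shared linear map. The exact conditioning in the paper may differ, so treat the formula and parameter names as assumptions:

```python
def modlin(x, code, W, Wc, b):
    """Modulated linear layer sketch. W and b are shared interpreter
    weights; `code` is a function-specific vector. Varying the code
    while sharing W/b is what lets many functions reuse one set of
    interpreter weights."""
    # Project the code vector to an input-sized elementwise scale.
    scale = [sum(w * c for w, c in zip(row, code)) for row in Wc]
    # Modulate the input, then apply the shared linear map.
    xm = [xi * si for xi, si in zip(x, scale)]
    return [sum(w * xi for w, xi in zip(row, xm)) + bi for row, bi in zip(W, b)]
```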
Authors: Nasim Rahaman, Muhammad Waleed Gondal, Shruti Joshi, Peter Gehler, Yoshua Bengio, Francesco Locatello, Bernhard Schölkopf
Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litec
...
https://www.youtube.com/watch?v=w3knicSHx5s
#deeplearning #objectdetection #outliers
An interview with the authors of "Virtual Outlier Synthesis".
Watch the paper review video here: https://youtu.be/i-J4T3uLC9M
Outliers are data points that are highly unlikely under the training distribution, and deep neural networks therefore have trouble dealing with them. Many approaches to detecting outliers at inference time have been proposed, but most of them show limited success. This paper presents Virtual Outlier Synthesis, a method that pairs synthetic outliers, forged in the latent space, with an energy-based regularization of the network at training time. The result is a deep network that can reliably detect outlier datapoints during inference with minimal overhead.
OUTLINE:
0:00 - Intro
2:20 - What was the motivation behind this paper?
5:30 - Why object detection?
11:05 - What's the connection to energy-based models?
12:15 - Is a Gaussian mixture model appropriate for high-dimensional data?
16:15 - What are the most important components of the method?
18:30 - What are the downstream effects of the regularizer?
22:00 - Are there severe trade-offs to outlier detection?
23:55 - Main experimental takeaways?
26:10 - Why do outlier detection in the last layer?
30:20 - What does it take to finish a research project successfully?
Paper: https://arxiv.org/abs/2202.01197
Code: https://github.com/deeplearning-wisc/vos
Abstract:
Out-of-distribution (OOD) detection has received much attention lately due to its importance in the safe deployment of neural networks. One of the key challenges is that models lack supervision signals from unknown data, and as a result, can produce overconfident predictions on OOD data. Previous approaches rely on real outlier datasets for model regularization, which can be costly and sometimes infeasible to obtain in practice. In this paper, we present VOS, a novel framework for OOD detection by adaptively synthesizing virtual outliers that can meaningfully regularize the model's decision boundary during training. Specifically, VOS samples virtual outliers from the low-likelihood region of the class-conditional distribution estimated in the feature space. Alongside, we introduce a novel unknown-aware training objective, which contrastively shapes the uncertainty space between the ID data and synthesized outlier data. VOS achieves state-of-the-art performance on both object detection and image classification models, reducing the FPR95 by up to 7.87% compared to the previous best method. Code is available at this https URL.
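The sampling step the abstract describes is easy to sketch: fit a class-conditional Gaussian in feature space, draw candidates from it, and keep the lowest-likelihood ones as virtual outliers. The sketch below assumes an identity covariance for brevity (the paper estimates a shared covariance from the features), so the function names and simplifications are illustrative assumptions:

```python
import math
import random

def fit_gaussian_mean(feats):
    """Class-conditional Gaussian with identity covariance assumed:
    fitting reduces to computing the feature mean."""
    d = len(feats[0])
    return [sum(f[i] for f in feats) / len(feats) for i in range(d)]

def log_density(x, mean):
    """Log-density of x under N(mean, I)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * sq - 0.5 * len(x) * math.log(2 * math.pi)

def sample_virtual_outliers(mean, n_candidates, n_keep):
    """Draw candidates from the class Gaussian and keep the n_keep
    lowest-likelihood ones: the epsilon-tail selection of 'virtual
    outliers' near the class boundary that VOS uses for training."""
    cands = [[random.gauss(m, 1.0) for m in mean] for _ in range(n_candidates)]
    cands.sort(key=lambda c: log_density(c, mean))
    return cands[:n_keep]
```

These synthetic points then feed the unknown-aware training objective, so no real outlier dataset is ever needed.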
Authors: Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li
Links:
Merch: http://store.ykilcher.com
...
https://www.youtube.com/watch?v=MgJ3JsE3Tqo
#imle #backpropagation #discrete
Backpropagation is the workhorse of deep learning, but unfortunately, it only works for continuous functions that are amenable to the chain rule of differentiation. Since discrete algorithms have no continuous derivative, deep networks that include such algorithms as components cannot be trained effectively using backpropagation. This paper presents a method to incorporate a large class of algorithms, formulated as discrete exponential family distributions, into deep networks and derives gradient estimates that can easily be used in end-to-end backpropagation. This enables things like combinatorial optimizers to be part of a network's forward propagation natively.
OUTLINE:
0:00 - Intro & Overview
4:25 - Sponsor: Weights & Biases
6:15 - Problem Setup & Contributions
8:50 - Recap: Straight-Through Estimator
13:25 - Encoding the discrete problem as an inner product
19:45 - From algorithm to distribution
23:15 - Substituting the gradient
26:50 - Defining a target distribution
38:30 - Approximating marginals via perturb-and-MAP
45:10 - Entire algorithm recap
56:45 - Github Page & Example
Paper: https://arxiv.org/abs/2106.01798
Code (TF): https://github.com/nec-research/tf-imle
Code (Torch): https://github.com/uclnlp/torch-imle
Our Discord: https://discord.gg/4H8xxDF
Sponsor: Weights & Biases
https://wandb.com
Abstract:
Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components. I-MLE is widely applicable as it only requires the ability to compute the most probable states and does not rely on smooth relaxations. The framework encompasses several approaches such as perturbation-based implicit differentiation and recent methods to differentiate through black-box combinatorial solvers. We introduce a novel class of noise distributions for approximating marginals via perturb-and-MAP. Moreover, we show that I-MLE simplifies to maximum likelihood estimation when used in some recently studied learning settings that involve combinatorial solvers. Experiments on several datasets suggest that I-MLE is competitive with and often outperforms existing approaches which rely on problem-specific relaxations.
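For a simple k-subset distribution ("select the k highest-scoring items"), the I-MLE recipe is short: perturb the scores with Gumbel noise and take the MAP state (perturb-and-MAP), form a target distribution by shifting the scores against the downstream gradient, and use the difference of the two MAP states as the gradient estimate. The sketch below follows that recipe under stated simplifications; the paper's noise distributions and the role of λ are more refined than shown here:

```python
import math
import random

def map_topk(theta, k):
    """MAP state of a k-subset exponential family distribution: the
    indicator vector of the k highest scores. This is the 'most
    probable state' oracle that I-MLE requires."""
    top = sorted(range(len(theta)), key=lambda i: -theta[i])[:k]
    return [1.0 if i in top else 0.0 for i in range(len(theta))]

def imle_gradient(theta, dloss_dz, k, lam=1.0, noise=0.1):
    """I-MLE gradient sketch: share one Gumbel perturbation between the
    current distribution and a target distribution whose scores are
    shifted against the downstream gradient dL/dz; the difference of
    the two MAP states estimates dL/dtheta."""
    eps = [noise * -math.log(-math.log(random.random())) for _ in theta]
    z = map_topk([t + e for t, e in zip(theta, eps)], k)
    z_tgt = map_topk([t - lam * g + e for t, g, e in zip(theta, dloss_dz, eps)], k)
    return [a - b for a, b in zip(z, z_tgt)]
```

With the noise scale set to zero this reduces to a finite difference of MAP states, which makes the connection to differentiating through black-box solvers visible.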
Authors: Mathias Niepert, Pasquale Minervini, Luca Franceschi
...
https://www.youtube.com/watch?v=W2UT8NjUqrk
Facebook AI's new Text-To-Speech system is able to create 1 second of speech in as little as 500 ms, making it real-time. What's even more impressive is that this does not require a rack of GPUs, but runs on merely 4 CPUs.
OUTLINE:
0:00 - Intro
1:00 - Problem Formulation
3:20 - System Explanation
15:00 - Speeding up the computation
https://ai.facebook.com/blog/a-highly-efficient-real-time-text-to-speech-system-deployed-on-cpus/
...
https://www.youtube.com/watch?v=XvDzZwoQFcU
Authors: David Ha, Jürgen Schmidhuber
Abstract:
We explore building generative neural network models of popular reinforcement learning environments. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the environment. By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own hallucinated dream generated by its world model, and transfer this policy back into the actual environment.
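The division of labor in the abstract, a big world model paired with a very compact policy, is the key design choice. The controller C is just a single linear layer over the concatenated VAE latent z and RNN hidden state h; the list-based sketch below is an illustrative assumption about shapes, not the paper's code:

```python
def controller(z, h, W, b):
    """World Models' controller C: one linear layer mapping the VAE
    latent z and the RNN hidden state h to an action, a = W [z; h] + b.
    Keeping C this small is what makes it cheap to optimize."""
    x = z + h  # list concatenation of the two feature vectors
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]
```

Because C has so few parameters, the paper can train it with evolution strategies, including entirely inside the world model's own "dream" rollouts.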
https://arxiv.org/abs/1803.10122
...
https://www.youtube.com/watch?v=dPsXxLyqpfs
#aiart #deeplearning #clip
Since the release of CLIP, the world of AI art has seen an unprecedented level of acceleration in what's possible. Whereas image generation had previously been mostly the domain of scientists, a community of professional artists, researchers, and amateurs is now sharing Colab notebooks and creations via social media. How did this happen? What is going on? And where do we go from here? Jack Morris and I attempt to answer some of these questions, following his blog post "The Weird and Wonderful World of AI Art" (linked below).
OUTLINE:
0:00 - Intro
2:30 - How does one get into AI art?
5:00 - Deep Dream & Style Transfer: the early days of art in deep learning
10:50 - The advent of GANs, ArtBreeder and TikTok
19:50 - Lacking control: Pre-CLIP art
22:40 - CLIP & DALL-E
30:20 - The shift to shared colabs
34:20 - Guided diffusion models
37:20 - Prompt engineering for art models
43:30 - GLIDE
47:00 - Video production & Disco Diffusion
48:40 - Economics, money, and NFTs
54:15 - What does the future hold for AI art?
Blog post: https://jxmo.notion.site/The-Weird-and-Wonderful-World-of-AI-Art-b9615a2e7278435b98380ff81ae1cf09
Jack's Blog: https://jxmo.io/
...
https://www.youtube.com/watch?v=DdkenV-ZdJU
Abstract:
Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker. This progress has drawn the attention of cognitive scientists interested in understanding human learning. However, the concern has been raised that deep RL may be too sample-inefficient – that is, it may simply be too slow – to provide a plausible model of how humans learn. In the present review, we counter this critique by describing recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods. Although these techniques were developed in an AI context, we propose that they may have rich implications for psychology and neuroscience. A key insight, arising from these AI methods, concerns the fundamental connection between fast RL and slower, more incremental forms of learning.
Authors: Matthew Botvinick, Sam Ritter, Jane X. Wang, Zeb Kurth-Nelson, Charles Blundell, Demis Hassabis
https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(19)30061-0
...
https://www.youtube.com/watch?v=_N_nFzMtWkA