My thoughts on the changes to the paper submission process for NeurIPS 2020.
The main changes are:
1. ACs can desk-reject papers
2. All authors must be available to review if asked
3. Resubmissions from other conferences must be marked, and a summary of changes since the last submission must be provided
4. Broader societal / ethical impact must be discussed
5. Upon acceptance, all papers must link to an explanatory video and the PDFs for slides and poster
https://neurips.cc/Conferences/2020/CallForPapers
https://youtu.be/361h6lHZGDg
Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
...
https://www.youtube.com/watch?v=JPX_jSZtszY
#ai #privacy #tech
This paper demonstrates a method to extract verbatim pieces of the training data from a trained language model. Moreover, some of the extracted pieces only appear a handful of times in the dataset. This points to serious security and privacy implications for models like GPT-3. The authors discuss the risks and propose mitigation strategies.
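The core scoring trick behind the attack (perplexity ratios between the target model and a reference) can be sketched in a few lines. This is a toy illustration, not the paper's exact pipeline: `membership_score` and the hard-coded token log-probabilities are invented for demonstration.

```python
import math

def perplexity(logprobs):
    # Perplexity = exp(-mean log-probability) of a token sequence.
    return math.exp(-sum(logprobs) / len(logprobs))

def membership_score(lm_logprobs, ref_logprobs):
    # Ratio of a reference model's perplexity to the target model's:
    # a high ratio means the target model is "surprisingly confident"
    # about the sequence relative to a baseline -- a memorization signal,
    # which filters out text that is merely easy for all models.
    return perplexity(ref_logprobs) / perplexity(lm_logprobs)

# Toy example: the target model assigns unusually high probability
# (log-probs near 0) to a sequence the reference model finds unlikely.
memorized = membership_score([-0.1, -0.2, -0.1], [-3.0, -2.5, -2.8])
ordinary = membership_score([-2.0, -2.2, -1.9], [-2.1, -2.0, -2.2])
print(memorized > ordinary)  # True: the memorized sequence scores higher
```

In the paper's setting, the reference can be a smaller model, a zlib compressor, or the same model on lowercased text; the sketch above only shows the ratio idea itself.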
OUTLINE:
0:00 - Intro & Overview
9:15 - Personal Data Example
12:30 - Eidetic Memorization & Language Models
19:50 - Adversary's Objective & Outlier Data
24:45 - Ethical Hedging
26:55 - Two-Step Method Overview
28:20 - Perplexity Baseline
30:30 - Improvement via Perplexity Ratios
37:25 - Weights for Patterns & Weights for Memorization
43:40 - Analysis of Main Results
1:00:30 - Mitigation Strategies
1:01:40 - Conclusion & Comments
Paper: https://arxiv.org/abs/2012.07805
Abstract:
It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.
We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences is included in just one document in the training data.
We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. For example, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models.
Authors: Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy
...
https://www.youtube.com/watch?v=plK2WVdLTOY
#mlnews #wudao #academicfraud
OUTLINE:
0:00 - Intro
0:25 - EU seeks to regulate AI
2:45 - AI COVID detection systems are all flawed
5:05 - Chinese lab trains model 10x GPT-3 size
6:55 - Google error identifies "ugliest" language
9:45 - McDonald's learns about AI buzzwords
11:25 - AI predicts cryptocurrency prices
12:00 - Unreal Engine hack for CLIP
12:35 - Please commit more academic fraud
References:
https://www.lawfareblog.com/artificial-intelligence-act-what-european-approach-ai
https://blogs.sciencemag.org/pipeline/archives/2021/06/02/machine-learning-deserves-better-than-this
https://www.nature.com/articles/s42256-021-00307-0
https://en.pingwest.com/a/8693
https://arxiv.org/pdf/2104.12369.pdf
https://www.bbc.com/news/world-asia-india-57355011
https://www.zdnet.com/article/mcdonalds-wants-to-democratise-machine-learning-for-all-users-across-its-operations/
https://www.analyticsinsight.net/ai-is-helping-you-make-profits-by-predicting-cryptocurrency-prices/
https://twitter.com/arankomatsuzaki/status/1399471244760649729
https://jacobbuckman.com/2021-05-29-please-commit-more-blatant-academic-fraud/
...
https://www.youtube.com/watch?v=bw1kiLMQFKU
#apple #icloud #neuralhash
Send your Apple fanboy friends to prison with this one simple trick ;) We break Apple's NeuralHash algorithm used to detect CSAM in iCloud photos. I show how it's possible to craft arbitrary hash collisions from any source / target image pair using an adversarial example attack. This can be used for many purposes, such as evading detection or forging false positives that trigger manual reviews.
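The attack idea can be sketched without the real model: replace the hard sign() in a hash with a differentiable surrogate and gradient-descend on the input until its bits match a target. Everything below is a toy stand-in, not Apple's NeuralHash: the random projection `W`, the hinge surrogate, and the step sizes are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a perceptual hash (NOT the real NeuralHash network):
# project a flattened 256-dim "image" with a fixed random matrix and take
# the sign of each of 96 projections, mimicking NeuralHash's 96-bit output.
W = rng.standard_normal((96, 256))

def hash_bits(x):
    return (W @ x > 0).astype(int)

def collide(source, target_bits, steps=500, lr=0.01):
    # Adversarial-example attack: gradient-descend on the input so every
    # soft hash bit matches target_bits with a margin, using a hinge loss
    # as a differentiable surrogate for the hard sign().  A real attack
    # would also penalize visible distortion of the source image.
    x = source.copy()
    t = 2.0 * target_bits - 1.0           # bits -> {-1, +1}
    for _ in range(steps):
        z = W @ x
        violated = t * z < 1.0             # bits not yet matched with margin
        x += lr * (W.T @ (t * violated))   # push violated bits past the margin
    return x

source = rng.standard_normal(256)             # "source image"
target = hash_bits(rng.standard_normal(256))  # hash of an unrelated image
adv = collide(source, target)
print((hash_bits(adv) == target).mean())      # fraction of matching bits
```

The same loop works against the extracted real model because ONNX/PyTorch give gradients through the network; only `W @ x` changes to a forward pass.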
OUTLINE:
0:00 - Intro
1:30 - Forced Hash Collisions via Adversarial Attacks
2:30 - My Successful Attack
5:40 - Results
7:15 - Discussion
DISCLAIMER: This is for demonstration and educational purposes only. This is not an endorsement of illegal activity or circumvention of law.
Code: https://github.com/yk/neural_hash_collision
Extract Model: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX
My Video on NeuralHash: https://youtu.be/z15JLtAuwVI
...
https://www.youtube.com/watch?v=6MUpWGeGMxs
#cm3 #languagemodel #transformer
This video contains a paper explanation and an incredibly informative interview with first author Armen Aghajanyan.
Autoregressive Transformers have come to dominate many fields in Machine Learning, from text generation to image creation and many more. However, there are two problems. First, the collected data is usually scraped from the web, is uni- or bi-modal, and throws away much of the structure of the original websites; second, language modelling losses are uni-directional. CM3 addresses both problems: it operates directly on HTML and includes text, hyperlinks, and even images (via VQGAN tokenization), and can therefore be used in many ways: text generation, captioning, image creation, entity linking, and much more. It also introduces a new training strategy called Causally Masked Language Modelling, which brings a level of bi-directionality into autoregressive language modelling. In the interview after the paper explanation, Armen and I go deep into the how and why of these giant models, go over the stunning results, and make sense of what they mean for the future of universal models.
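The causally masked objective itself is a simple data transformation. Here is a minimal sketch of the single-mask case: cut one contiguous span out of the token stream, leave a sentinel behind, and append the span at the end so the decoder generates it last, with full context from both sides of the original gap. The sentinel name `<mask:0>` and the helper are illustrative, not the paper's exact tokenizer vocabulary.

```python
import random

def causally_mask(tokens, rng, mask_token="<mask:0>"):
    # Causally Masked LM (sketch): pick one contiguous span, replace it
    # with a mask sentinel, and move the span to the end of the sequence.
    # A left-to-right decoder then sees everything around the span before
    # generating it -- bidirectional context inside an autoregressive model.
    i = rng.randrange(len(tokens))
    j = rng.randrange(i + 1, len(tokens) + 1)
    span = tokens[i:j]
    return tokens[:i] + [mask_token] + tokens[j:] + [mask_token] + span

rng = random.Random(0)
toks = ["<html>", "<body>", "hello", "world", "</body>", "</html>"]
print(causally_mask(toks, rng))
```

At inference time you get infilling for free: place the sentinel where you want content (e.g. an `img src` attribute inside HTML) and let the model generate past the second sentinel.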
OUTLINE:
0:00 - Intro & Overview
6:30 - Directly learning the structure of HTML
12:30 - Causally Masked Language Modelling
18:50 - A short look at how to use this model
23:20 - Start of interview
25:30 - Feeding language models with HTML
29:45 - How to get bi-directionality into decoder-only Transformers?
37:00 - Images are just tokens
41:15 - How does one train such giant models?
45:40 - CM3 results are amazing
58:20 - Large-scale dataset collection and content filtering
1:04:40 - More experimental results
1:12:15 - Why don't we use raw HTML?
1:18:20 - Does this paper contain too many things?
Paper: https://arxiv.org/abs/2201.07520
Abstract:
We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens. Our new causally masked approach generates tokens left to right while also masking out a small number of long token spans that are generated at the end of the string, instead of their original positions. The causal masking objective provides a type of hybrid of the more common causal and masked language models, by enabling full generative modeling while also providing bidirectional context when generating the masked spans. We train causally masked language-image models on large-scale web and Wikipedia articles, where each document contains all of the text, hypertext markup, hyperlinks, and image tokens (from a VQVAE-GAN), provided in the order they appear in the original HTML source (before masking). The resulting CM3 models can generate rich structured, multi-modal outputs while conditioning on arbitrary masked
...
https://www.youtube.com/watch?v=qNfCVGbvnJc
OUTLINE:
0:30 - Activity Grammars for Temporal Action Segmentation
8:50 - Diffusion-TTA: Test-time Adaptation of Discriminative Models via Generative Feedback
17:05 - On the Role of Noise in the Sample Complexity of Learning Recurrent Neural Networks: Exponential Gaps for Long Sequences
21:20 - Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming
27:10 - Equivariant Adaptation of Large Pretrained Models
33:10 - Multi-Head Adapter Routing for Cross-Task Generalization
39:25 - Geometry-Aware Adaptation for Pretrained Models
46:10 - Adversarial Learning for Feature Shift Detection and Correction
Papers:
Title: Activity Grammars for Temporal Action Segmentation
Link: https://arxiv.org/abs/2312.04266
Author:
Dayoung Gong, Joonseok Lee,
Deunsol Jung, Suha Kwak, Minsu Cho
--------
Title: Diffusion-TTA: Test-time Adaptation of Discriminative Models via Generative Feedback
Link: https://arxiv.org/abs/2311.16102
Author:
Mihir Prabhudesai, Tsung-Wei Ke,
Alexander C. Li, Deepak Pathak, Katerina Fragkiadaki
--------
Title: On the Role of Noise in the Sample Complexity of Learning Recurrent Neural Networks: Exponential Gaps for Long Sequences
Link: https://arxiv.org/abs/2305.18423
Author:
Alireza Fathollah Pour, Hassan Ashtiani
--------
Title: Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming
Link: https://arxiv.org/abs/2310.19068
Author:
Gregory Dexter, Petros Drineas,
David P. Woodruff, Taisuke Yasuda
--------
Title: Equivariant Adaptation of Large Pretrained Models
Link: https://arxiv.org/pdf/2310.01647.pdf
Author:
Arnab Kumar Mondal, Siba Smarak Panigrahi,
Sékou-Oumar Kaba, Sai Rajeswar, Siamak Ravanbakhsh
--------
Title: Multi-Head Adapter Routing for Cross-Task Generalization
Link: https://arxiv.org/abs/2211.03831
Author:
Lucas Caccia, Edoardo Ponti, Zhan Su,
Matheus Pereira, Nicolas Le Roux, Alessandro Sordoni
--------
Title: Geometry-Aware Adaptation for Pretrained Models
Link: https://arxiv.org/abs/2307.12226
Author:
Nicholas Roberts, Xintong Li, Dyah Adila,
Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala
--------
Title: Adversarial Learning for Feature Shift Detection and Correction
Link: https://arxiv.org/abs/2312.04546
Author: Miriam Barrabes, Daniel Mas Montserrat,
Margarita Geleta, Xavier Giro-i-Nieto, Alexander G. Ioannidis
...
https://www.youtube.com/watch?v=cx3bbMf9LRA
#ddpm #diffusionmodels #openai
GANs have dominated the image generation space for the majority of the last decade. This paper shows, for the first time, how a non-GAN model, a DDPM, can be improved to overtake GANs on standard evaluation metrics for image generation. The produced samples look amazing and, unlike GANs, the new model has a formal probabilistic foundation. Is there a future for GANs, or are diffusion models going to overtake them for good?
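The classifier-guidance step covered in the video has a compact form: shift each reverse-step mean by the scaled classifier gradient. Below is a minimal numerical sketch where a Gaussian "classifier" with an analytic gradient stands in for a real noisy-image classifier; the 2-D state, `scale=5.0`, and `class_mean` are illustrative choices, not the paper's settings.

```python
import numpy as np

def classifier_grad(x, class_mean):
    # grad_x log N(x; class_mean, I) = class_mean - x.
    # In the paper this is the input gradient of a classifier
    # trained on noisy images; here it is analytic for clarity.
    return class_mean - x

def guided_mean(mu, sigma, class_mean, scale=5.0):
    # DDPM classifier guidance: shift the reverse-step mean by
    # scale * sigma^2 * grad_x log p(y | x_t), steering each denoising
    # step toward samples the classifier assigns to class y.
    return mu + scale * sigma**2 * classifier_grad(mu, class_mean)

rng = np.random.default_rng(0)
mu = np.zeros(2)                    # unguided reverse-step mean
class_mean = np.array([3.0, 3.0])   # region the classifier labels as y
shifted = guided_mean(mu, 0.5, class_mean)
sample = shifted + 0.5 * rng.standard_normal(2)
print(shifted)  # [3.75 3.75] -- mean pulled toward the target class
```

Larger `scale` trades diversity for fidelity, which is exactly the knob the paper tunes to match and beat BigGAN-deep on FID.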
OUTLINE:
0:00 - Intro & Overview
4:10 - Denoising Diffusion Probabilistic Models
11:30 - Formal derivation of the training loss
23:00 - Training in practice
27:55 - Learning the covariance
31:25 - Improving the noise schedule
33:35 - Reducing the loss gradient noise
40:35 - Classifier guidance
52:50 - Experimental Results
Paper (this): https://arxiv.org/abs/2105.05233
Paper (previous): https://arxiv.org/abs/2102.09672
Code: https://github.com/openai/guided-diffusion
Abstract:
We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for sample quality using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128×128, 4.59 on ImageNet 256×256, and 7.72 on ImageNet 512×512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.85 on ImageNet 512×512. We release our code at this https URL
Authors: Alex Nichol, Prafulla Dhariwal
...
https://www.youtube.com/watch?v=W-O7AZNzbzQ
Object detection often does not occur in a vacuum. Static cameras, such as wildlife camera traps, collect large amounts of irregularly sampled data over long time frames and often capture repeating or similar events. This model learns to dynamically incorporate other frames taken by the same camera into its object detection pipeline.
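The long-term-memory mechanism is standard attention over a per-camera bank of stored features. Here is a minimal sketch with hand-made 4-dim features; the feature values, the additive fusion, and `attend_memory` itself are illustrative, not the actual Context R-CNN heads.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attend_memory(query, memory):
    # Context R-CNN-style aggregation (sketch): a detection feature from
    # the current frame attends over a per-camera long-term memory bank of
    # features from other frames; the attended context is added back in.
    d = query.shape[0]
    weights = softmax(memory @ query / np.sqrt(d))  # one weight per stored frame
    context = weights @ memory                      # weighted sum of memory rows
    return query + context, weights

# Memory bank with three past-frame features; the first resembles the query
# (the same animal reappearing), so it should dominate the attention weights.
query = np.array([1.0, 0.0, 1.0, 0.0])
memory = np.array([
    [0.9, 0.1, 0.9, 0.0],    # similar past detection
    [-1.0, 0.5, -1.0, 0.5],  # dissimilar frame
    [0.0, 0.0, 0.0, 0.0],    # empty frame
])
fused, w = attend_memory(query, memory)
print(w.argmax())  # 0 -- the similar frame gets the most weight
```

Because attention is order-free over the bank, the mechanism is naturally robust to the irregular sampling rates the paper emphasizes.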
OUTLINE:
0:00 - Intro & Overview
1:10 - Problem Formulation
2:10 - Static Camera Data
6:45 - Architecture Overview
10:00 - Short-Term Memory
15:40 - Long-Term Memory
20:10 - Quantitative Results
22:30 - Qualitative Results
30:10 - False Positives
32:50 - Appendix & Conclusion
Paper: https://arxiv.org/abs/1912.03538
My Video On Attention Is All You Need: https://youtu.be/iDulhoQ2pro
Abstract:
In static monitoring cameras, useful contextual information can stretch far beyond the few seconds typical video understanding models might see: subjects may exhibit similar behavior over multiple days, and background objects remain static. Due to power and storage constraints, sampling frequencies are low, often no faster than one frame per second, and sometimes are irregular due to the use of a motion trigger. In order to perform well in this setting, models must be robust to irregular sampling rates. In this paper we propose a method that leverages temporal context from the unlabeled frames of a novel camera to improve performance at that camera. Specifically, we propose an attention-based approach that allows our model, Context R-CNN, to index into a long term memory bank constructed on a per-camera basis and aggregate contextual features from other frames to boost object detection performance on the current frame.
We apply Context R-CNN to two settings: (1) species detection using camera traps, and (2) vehicle detection in traffic cameras, showing in both settings that Context R-CNN leads to performance gains over strong baselines. Moreover, we show that increasing the contextual time horizon leads to improved results. When applied to camera trap data from the Snapshot Serengeti dataset, Context R-CNN with context from up to a month of images outperforms a single-frame baseline by 17.9% mAP, and outperforms S3D (a 3d convolution based baseline) by 11.2% mAP.
Authors: Sara Beery, Guanhang Wu, Vivek Rathod, Ronny Votel, Jonathan Huang
...
https://www.youtube.com/watch?v=eI8xTdcZ6VY
#openai #math #imo
Formal mathematics is a challenging area for both humans and machines. For humans, formal proofs require very tedious and meticulous specification of every last detail and result in very long, cumbersome, and verbose outputs. For machines, the discreteness and sparse-reward nature of the problem pose a significant challenge, classically tackled by brute-force search guided by a handful of heuristics. Language models have previously been employed to guide these proof searches and delivered significant improvements, but automated systems are still far from usable. This paper introduces another concept: an expert iteration procedure iteratively produces ever more challenging, yet solvable, problems for the machine to train on, resulting in an automated curriculum and a final algorithm that performs well above previous models. OpenAI used this method to solve two problems from the International Mathematical Olympiad, which was previously infeasible for AI systems.
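The expert-iteration loop can be caricatured in a few lines. In this toy sketch each statement is just an integer difficulty, "search" solves anything up to the current skill level, and "training" on each batch of new proofs raises skill; all of that is an illustrative stand-in for the real proof search and fine-tuning.

```python
def expert_iteration(statements, skill=1, rounds=10):
    # Expert iteration (sketch): alternate proof search with learning.
    # Each pass, search solves the statements within reach of the current
    # prover; retraining on those proofs extends its reach -- yielding an
    # automatic curriculum of harder problems, with no ground-truth proofs.
    solved = set()
    for _ in range(rounds):
        new = {s for s in statements if s <= skill and s not in solved}
        if not new:
            break               # search found nothing new; curriculum stalls
        solved |= new           # keep the proofs found by search
        skill += 1              # "training" on them improves the prover
    return solved, skill

solved, skill = expert_iteration({1, 2, 3, 5, 9})
print(sorted(solved), skill)  # [1, 2, 3] 4 -- difficulty 5 stays out of reach
```

Note the failure mode the toy makes visible: if the statement pool has a difficulty gap (nothing between 3 and 5 here), the curriculum stalls, which is why the paper stresses a pool of sufficiently varied difficulty.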
OUTLINE:
0:00 - Intro
2:35 - Paper Overview
5:50 - How do formal proofs work?
9:35 - How expert iteration creates a curriculum
16:50 - Model, data, and training procedure
25:30 - Predicting proof lengths for guiding search
29:10 - Bootstrapping expert iteration
34:10 - Experimental evaluation & scaling properties
40:10 - Results on synthetic data
44:15 - Solving real math problems
47:15 - Discussion & comments
Paper: https://arxiv.org/abs/2202.01344
miniF2F benchmark: https://github.com/openai/miniF2F
Abstract:
We explore the use of expert iteration in the context of language modeling applied to formal mathematics. We show that at the same compute budget, expert iteration, by which we mean proof search interleaved with learning, dramatically outperforms proof search only. We also observe that when applied to a collection of formal statements of sufficiently varied difficulty, expert iteration is capable of finding and solving a curriculum of increasingly difficult problems, without the need for associated ground-truth proofs. Finally, by applying this expert iteration to a manually curated set of problem statements, we achieve state-of-the-art on the miniF2F benchmark, automatically solving multiple challenging problems drawn from high school olympiads.
Authors: Stanislas Polu, Jesse Michael Han, Kunhao Zheng, Mantas Baksys, Igor Babuschkin, Ilya Sutskever
...
https://www.youtube.com/watch?v=lvYVuOmUVs8