Generative Modeling with Sparse Transformers
We’ve developed the Sparse Transformer, a deep neural network that sets new records at predicting what comes next in a sequence, whether text, images, or sound. It uses an algorithmic improvement of the attention mechanism to extract patterns from sequences 30x longer than previously possible.
One existing challenge in AI research is modeling long-range, subtle interdependencies in complex data like images, videos, or sounds. The Sparse Transformer incorporates an O(N \sqrt{N}) reformulation of the O(N^2) Transformer self-attention mechanism, along with several other improvements, to apply it directly to these rich data types. Previously, models used on these data were specifically crafted for one domain or were difficult to scale to sequences more than a few thousand elements long. In contrast, our network can model sequences with tens of thousands of elements using hundreds of layers, achieving state-of-the-art performance across multiple domains. At OpenAI, we’re using it to help us build AI systems that possess a greater ability to understand the world.
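To make the complexity reduction concrete, here is a minimal NumPy sketch rather than the released implementation: it builds a strided sparsity mask (a local window plus every stride-th earlier position, with stride ≈ √N) and applies it to ordinary masked attention by blocking disallowed scores. The function names and the combination of both patterns into a single dense mask are illustrative assumptions; the actual Sparse Transformer splits such patterns across attention heads and computes them with efficient GPU kernels instead of materializing an N×N mask.

```python
import numpy as np

def strided_sparse_mask(n, stride):
    """Boolean causal mask combining a local window (previous `stride`
    positions) with a strided pattern (every `stride`-th earlier position).
    With stride ~ sqrt(N), each row allows O(sqrt(N)) positions, so the
    total number of attention connections is O(N * sqrt(N))."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        # local window: positions i - stride + 1 ... i
        mask[i, max(0, i - stride + 1):i + 1] = True
        # strided positions: columns j <= i with j % stride == stride - 1
        cols = np.arange(i + 1)
        mask[i, cols[cols % stride == stride - 1]] = True
    return mask

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to positions allowed by `mask`.
    q, k, v: (n, d) arrays; mask: (n, n) boolean."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -1e9)      # block disallowed positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

if __name__ == "__main__":
    n, d = 64, 16
    stride = int(np.sqrt(n))                   # stride ≈ sqrt(N)
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
    mask = strided_sparse_mask(n, stride)
    out = masked_attention(q, k, v, mask)
    print(out.shape, f"{mask.sum()} of {n * n} connections used")
```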