The Engineer’s Guide To Deep Learning

We are in the third golden age of AI.

In the previous two golden ages (1950s-1960s and the 1980s), our expectations outpaced the capabilities of the technology at the time, leading to disappointment. In contrast, the AI technology of the current golden age, which began in the mid-2010s, has consistently exceeded our expectations.

Among AI technologies, the Transformer, introduced in 2017, stands out as a major breakthrough. Initially developed as a machine translation model, its influence has since spread to nearly every field. Today, the Transformer is considered essential knowledge for modern engineers.

The first goal of this document is to provide the shortest path for engineers to understand the Transformer.

What is this document
  • A concise guidebook:
    This document provides just enough information to learn the Transformer.
What this document provides
  • Working Python code examples for hands-on learning:
    To enhance comprehension, this document provides working Python code examples that readers can run themselves.

  • References for further exploration:
    This document introduces readers to a variety of documentation options, recognizing that different individuals find different resources more accessible.

Contents
Date         Description
23.Jul.2024  Added PyTorch version in Parts 1 and 2.
21.May.2024  The first version released.
Next goal

Many Transformer-based technologies are currently being developed. There will definitely be another major breakthrough in the near future. I might write about them if I have time.

© Copyright, All Rights Reserved, Hironobu SUZUKI.

For any inquiries regarding the use of this document or any of its figures, please contact me after reading the following FAQ:

Since publishing my content, I’ve been fortunate to receive a lot of positive feedback, which is truly gratifying. However, I have also encountered a few cases in which people tried to misuse my content for self-promotion.

These experiences have shaped the approach I’ve outlined below:

  1. Who can use this document freely?
    If you are a teacher or a student belonging to an educational organization, you can freely use this document and its figures in your studies. Anyone can use this document and its figures in noncommercial meetings and lectures, provided that you cite the link to this site and the copyright notice; otherwise, contact me.
  2. Is it available for commercial use?
    This content can be used under two options:
    • Revenue Share: You can use this content after a revenue share agreement is signed. Under this agreement, you will share 20% of the sales generated from using this content, including the GitHub repository.
    • Full Buyout: In very rare cases, I consider requests for full commercial use of all content on this site (and the GitHub repository). For a complete buyout of all content rights, the cost is €10,000,000.
    Note: There was a comment on Hacker News that took this seriously, but of course, it’s a joke. To be clear, I have no intention of having any commercial ties to this.
  3. Why doesn’t the author waive the copyright of this document or use the creative commons license?
    I would like to ask you what problems it causes you that I retain the copyright of my document.

When you send me an email, please provide at least two SNS addresses (e.g. LinkedIn, Twitter) for verification purposes. Due to the XZ backdoor incident, I no longer accept contact from anonymous individuals.

Exception: Educational institutions can use this document freely.

Author

Hironobu SUZUKI

I am a software programmer/engineer, the author of:

I hold an M.S. in Information Engineering, have worked for several companies as a software developer and technical manager/director, and have published seven books in Japanese (four on PostgreSQL and three on MySQL) and one book in Chinese.

As a director of the Japan PostgreSQL Users Group (2010-2016), I organized the largest (non-commercial) technical seminar/lecture on PostgreSQL in Japan for more than six years, and also served as the program committee chair of the Japan PostgreSQL Conference in 2013 and as a member in 2008 and 2009. In June 2022, my interview article was published.

When I was young, I lived in South America for a few years. Recently, I sometimes go back there.

I am looking for a new job, applying ML and AI technologies to DBMS.

I’m interested in History, Animal Rights, Cosmology, Social Issues, and Environmental Issues. I play the piano and guitar. Vegetarian. I love animals, music, and science.