GPT-4 architecture: what we can deduce from research literature

This text is my personal opinion, developed by researching publicly available sources such as research publications and rumors. I have never worked at any of the companies whose current or future products this text speculates about. Intended audience: people with engineering experience or some basic ML knowledge who are interested in the language modeling techniques that the authors of “GPT-4” at OpenAI may have chosen to implement. We need such speculation because the authors have elected to keep the technical details private, citing safety concerns and the competitive landscape....

March 14, 2023 · 8 min · 1655 words · Kirill Gadjello