The following are for discovery for now:
1. Matrix multiplications (Q·Kᵀ, attention scores, FFN layers)
2. Softmax
3. LayerNorm
4. Embedding + Positional Encoding
5. Batch operations
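
As a starting point for discovery, the five items above can be sketched in plain Python. This is only an illustrative sketch, not a reference implementation. All function names are hypothetical, and the choices of scaled dot-product attention, a bias-free ReLU FFN, LayerNorm without learned scale or shift, and sinusoidal positional encoding are assumptions the list does not specify. A real implementation would most likely use a tensor library rather than nested lists.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Softmax over one row; subtracting the max avoids overflow in exp."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(row, eps=1e-5):
    """Normalize one row to zero mean and unit variance.

    Assumption: no learned scale or shift parameters.
    """
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def positional_encoding(pos, d_model):
    """Sinusoidal encoding for one position (assumed variant)."""
    return [
        math.sin(pos / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]


def embed(token_ids, table):
    """Look up each token's embedding row and add its positional encoding."""
    d_model = len(table[0])
    return [
        [e + p for e, p in zip(table[tok], positional_encoding(pos, d_model))]
        for pos, tok in enumerate(token_ids)
    ]


def attention(q, k, v):
    """Scaled dot-product attention for a single sequence (assumed form)."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))  # Q·Kᵀ
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


def ffn(x, w1, w2):
    """Two matrix multiplications with a ReLU in between (biases omitted)."""
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def batched(fn, batch):
    """Batch operation: apply the same per-sequence function to every sequence."""
    return [fn(seq) for seq in batch]
```

For example, `batched(lambda seq: [layer_norm(row) for row in seq], batch)` applies LayerNorm to every token vector of every sequence in `batch`. Keeping each operation as its own small function makes it easy to examine or replace them one at a time.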