The user will implement a Transformer with the following components:
1. Positional Encoding
2. Scaled Dot-Product Attention
3. Multi-Head Attention
4. Feed-Forward Network (FFN)
5. Layer Normalization
6. Residual Connections
Let's discuss anything that still needs discovery.
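As a starting point for that discussion, here is a minimal sketch of the first two components in the list. It makes several assumptions that are not stated above: the positional encoding is the fixed sinusoidal variant (a learned encoding is an equally valid choice), there is no masking or batching, and plain Python lists stand in for a tensor library. The function names `positional_encoding`, `softmax`, and `scaled_dot_product_attention` are hypothetical, chosen only for illustration.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding as a seq_len x d_model list of lists.

    Assumption: even columns hold sin, odd columns hold cos, and each
    sin/cos pair shares the frequency 1 / 10000^(i / d_model), where i is
    the even column index.
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D lists, with no mask.

    q: n_queries x d_k, k: n_keys x d_k, v: n_keys x d_v.
    Returns (output, weights), where weights is n_queries x n_keys.
    """
    d_k = len(k[0])
    scale = math.sqrt(d_k)
    d_v = len(v[0])
    output, weights = [], []
    for q_row in q:
        # Scaled dot product of this query with every key.
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / scale for k_row in k
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value rows.
        output.append(
            [sum(w_j * v_row[c] for w_j, v_row in zip(w, v)) for c in range(d_v)]
        )
    return output, weights
```

A quick check might call `positional_encoding(4, 6)` to get a 4 x 6 table, or pass small hand-written `q`, `k`, `v` lists to `scaled_dot_product_attention` and confirm that each row of the returned weights sums to 1. Multi-head attention, the FFN, layer normalization, and residual connections would build on these pieces; whether to use a tensor library for them is one of the open questions.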