Neural Networks: Attention: Vision Transformer (ViT) | CLIP, Swin, CoAtNet, MLP Mixer | Transformer