Package io.github.kirstenali.deepj.layers.transformer
Classes:
- LayerNorm over the feature dimension (cols) with trainable gamma/beta exposed as Parameters.
- Multi-head causal self-attention for a single sequence (no batch dimension).
- Pre-LN Transformer block:
    x = x + Attn(LN(x))
    x = x + MLP(LN(x))
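The Pre-LN residual pattern above can be sketched in plain Java. This is a minimal illustration, not the package's actual implementation: `layerNorm` normalizes one feature vector, and the attention and MLP sublayers are passed in as placeholder functions (the real classes' APIs are not shown here, so `PreLnSketch`, `block`, and the method signatures are all hypothetical names for illustration).

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the Pre-LN block: x = x + Attn(LN(x)); x = x + MLP(LN(x)).
public class PreLnSketch {

    // LayerNorm over the feature dimension with trainable gamma/beta.
    static double[] layerNorm(double[] x, double[] gamma, double[] beta) {
        double mean = 0.0, var = 0.0;
        for (double v : x) mean += v;
        mean /= x.length;
        for (double v : x) var += (v - mean) * (v - mean);
        var /= x.length;
        double inv = 1.0 / Math.sqrt(var + 1e-5); // epsilon for numerical stability
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = gamma[i] * (x[i] - mean) * inv + beta[i];
        }
        return y;
    }

    // Elementwise residual add.
    static double[] add(double[] a, double[] b) {
        double[] c = new double[a.length];
        for (int i = 0; i < a.length; i++) c[i] = a[i] + b[i];
        return c;
    }

    // Pre-LN order: normalize first, apply the sublayer, then add the residual.
    static double[] block(double[] x, double[] gamma, double[] beta,
                          UnaryOperator<double[]> attn, UnaryOperator<double[]> mlp) {
        double[] h = add(x, attn.apply(layerNorm(x, gamma, beta)));
        return add(h, mlp.apply(layerNorm(h, gamma, beta)));
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] gamma = {1.0, 1.0, 1.0};
        double[] beta = {0.0, 0.0, 0.0};
        // Identity sublayers stand in for attention and MLP,
        // isolating the LN + residual wiring.
        double[] out = block(x, gamma, beta, v -> v, v -> v);
        System.out.println(Arrays.toString(out));
    }
}
```

The key property of the Pre-LN arrangement (versus Post-LN) is that the residual path `x + ...` is never normalized, which tends to stabilize training of deep stacks.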