Class MultiHeadSelfAttention

java.lang.Object
io.github.kirstenali.deepj.layers.transformer.MultiHeadSelfAttention
All Implemented Interfaces:
Layer, Trainable

public final class MultiHeadSelfAttention extends Object implements Layer
Multi-head causal self-attention over a single sequence (no batch dimension): each position attends only to itself and to earlier positions. Input and output shape: [seqLen x dModel].
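
To illustrate the computation this description refers to, the following is a minimal standalone sketch of multi-head causal self-attention on plain double[][] arrays. It is not this class's implementation and does not use its API: the class name CausalAttentionSketch, the method names, and the weight parameters wq, wk, wv, wo are all hypothetical, and the sketch assumes the standard scaled dot-product formulation with dModel divisible by the number of heads.

```java
// Hypothetical illustration only; not the deepj implementation.
final class CausalAttentionSketch {

    // Plain matrix product: [n x k] * [k x m] -> [n x m].
    static double[][] matmul(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] out = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int p = 0; p < k; p++)
                for (int j = 0; j < m; j++)
                    out[i][j] += a[i][p] * b[p][j];
        return out;
    }

    // Identity matrix, handy for trying the sketch without trained weights.
    static double[][] identity(int n) {
        double[][] id = new double[n][n];
        for (int i = 0; i < n; i++) id[i][i] = 1.0;
        return id;
    }

    // x: [seqLen x dModel]; wq, wk, wv, wo: [dModel x dModel] (assumed shapes).
    // Returns a [seqLen x dModel] matrix.
    static double[][] attend(double[][] x, double[][] wq, double[][] wk,
                             double[][] wv, double[][] wo, int nHeads) {
        int seqLen = x.length, dModel = x[0].length;
        if (dModel % nHeads != 0)
            throw new IllegalArgumentException("dModel must be divisible by nHeads");
        int dHead = dModel / nHeads;
        double scale = 1.0 / Math.sqrt(dHead);

        double[][] q = matmul(x, wq), k = matmul(x, wk), v = matmul(x, wv);
        double[][] concat = new double[seqLen][dModel];

        for (int h = 0; h < nHeads; h++) {
            int off = h * dHead; // this head's column slice of q, k, v
            for (int i = 0; i < seqLen; i++) {
                // Causal mask: position i scores only positions j <= i.
                double[] scores = new double[i + 1];
                double max = Double.NEGATIVE_INFINITY;
                for (int j = 0; j <= i; j++) {
                    double s = 0.0;
                    for (int c = 0; c < dHead; c++) s += q[i][off + c] * k[j][off + c];
                    scores[j] = s * scale;
                    max = Math.max(max, scores[j]);
                }
                // Softmax over the visible positions (max subtracted for stability).
                double sum = 0.0;
                for (int j = 0; j <= i; j++) {
                    scores[j] = Math.exp(scores[j] - max);
                    sum += scores[j];
                }
                // Weighted sum of value rows, written into this head's slice.
                for (int j = 0; j <= i; j++) {
                    double w = scores[j] / sum;
                    for (int c = 0; c < dHead; c++) concat[i][off + c] += w * v[j][off + c];
                }
            }
        }
        // Output projection mixes the concatenated heads back to dModel columns.
        return matmul(concat, wo);
    }
}
```

In this sketch, calling CausalAttentionSketch.attend(x, w, w, w, w, 2) with a 3 x 4 input returns a 3 x 4 result, and changing a later row of x leaves the earlier output rows unchanged, which is the property the word "causal" refers to.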