ML4CCN VL10 auditory cortex and semantics

Where does attention in transformers come from?

in a semantic space it's best to push the representations of different objects as far apart from each other as possible

25 verbs:
“see,” “hear,” “listen,” “taste,” “smell,” “eat,” “touch,” “rub,” “lift,” “manipulate,” “run,” “push,” “fill,” “move,” “ride,” “say,” “fear,” “open,” “approach,” “near,” “enter,” “drive,” “wear,” “break,” and “clean.”

Mitchell 2008
Huth 2016

celery, airplane and apple

  1. record brain data while people read/hear
  2. train linguistic models on large scale text corpora
  3. use transcripts of the speech presented during fMRI to derive predictions from the language model
  4. evaluate prediction using multivariate variance explained
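
Steps 3 and 4 can be sketched as a linear encoding model. This is only a sketch under assumptions: ridge regression from language-model features to voxel responses, and per-voxel R² on held-out data as the variance-explained measure. All data are synthetic, and `fit_ridge`, `variance_explained`, `alpha` and the array shapes are illustrative names and values, not from the lecture.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def variance_explained(Y_true, Y_pred):
    """R^2 per voxel (column): 1 - residual sum of squares / total sum of squares."""
    ss_res = ((Y_true - Y_pred) ** 2).sum(axis=0)
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 10, 5

# Synthetic stand-ins (assumption): X plays the role of language-model features
# per time point, Y the role of voxel responses, generated as a noisy linear map.
X = rng.standard_normal((n_train + n_test, n_features))
W_true = rng.standard_normal((n_features, n_voxels))
Y = X @ W_true + 0.1 * rng.standard_normal((n_train + n_test, n_voxels))

X_train, X_test = X[:n_train], X[n_train:]
Y_train, Y_test = Y[:n_train], Y[n_train:]

# Fit on training data, evaluate on held-out data only.
W = fit_ridge(X_train, Y_train, alpha=1.0)
r2 = variance_explained(Y_test, X_test @ W)
print(r2)  # one R^2 value per voxel
```

With real data, X would come from the language model run on the stimulus transcript, and Y from the recorded responses aligned to the same time points.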

“the animal didn’t cross the street because it was too tired.”

GPT: predict next word (unidirectional attention)
BERT: predict missing word from surrounding context (bidirectional)
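
The unidirectional vs. bidirectional difference comes down to an attention mask. A minimal sketch, assuming single-head scaled dot-product self-attention with no learned projections (queries, keys and values are all the raw token vectors, which is a simplification). It illustrates only the masking, not the training objectives; `self_attention` and the toy sizes are illustrative names and values.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V, causal=False):
    """Scaled dot-product attention; causal=True hides future positions."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    if causal:
        # Strictly upper triangle = positions to the right of the current token.
        future = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))  # 4 tokens, 8-dim vectors (toy values)

# GPT-style: each token attends only to itself and earlier tokens.
_, w_causal = self_attention(tokens, tokens, tokens, causal=True)
# BERT-style: each token attends to all tokens, left and right.
_, w_bidirectional = self_attention(tokens, tokens, tokens, causal=False)

print(np.round(w_causal, 2))
print(np.round(w_bidirectional, 2))
```

In the causal case the weight matrix is lower triangular, so row i only mixes information from tokens 0..i.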

you can do this with images as well.

at every time step, attention computes a context vector (a weighted sum of the inputs) and feeds it into the first calculation of that step.
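
A minimal sketch of a per-step context vector over image regions. Assumptions: the scoring function is a plain dot product between region features and the current hidden state (swapped in for brevity; Xu et al. use a small learned network instead), and the state update is a toy `tanh` stand-in rather than an LSTM. `context_vector` and all sizes are illustrative.

```python
import numpy as np

def context_vector(region_features, hidden_state):
    """Soft attention: softmax-weighted sum of region features."""
    scores = region_features @ hidden_state   # one score per region
    scores = scores - scores.max()            # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ region_features       # weighted sum, same dim as one region
    return context, weights

rng = np.random.default_rng(0)
regions = rng.standard_normal((6, 4))  # 6 image regions, 4-dim features (toy values)
h = rng.standard_normal(4)             # initial hidden state (toy values)

for t in range(3):
    # A new context vector is computed at every time step from the current state.
    context, weights = context_vector(regions, h)
    # Toy update (assumption): a real decoder would feed `context` into an LSTM cell.
    h = np.tanh(h + context)
    print(t, np.round(weights, 2))
```

Because the weights depend on the hidden state, the model can attend to different regions at different steps.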


Xu et al. (2015), “Show, Attend and Tell”, ICML


Schrimpf et al. 2021

only models trained to predict the next word were good at predicting brain responses

see also

Type:
Tags:
Status:
Location:
Created: 27-01-25 15:16

Source