Questions ML4CCN VL5

What is Nomansland?

Pros and Cons of BrainScore. Are there other scores?

What is a 2-stage RSA?

What is a hIT-fitted RDM?

Are adversarial attacks a bug or a feature?

  • They show the limitations of deep nets: an adversarial attack changes an input image in a way that is imperceptible to a human. The human still classifies the image as before, but the deep net misclassifies it, so a panda turns into a gibbon for the network while it remains a panda for the human.
  • They can strengthen neural responses: by changing the input noise, adversarial attacks can specifically target the neurons that are active for a particular region. This could lead to a super-stimulus.

If normal CNNs already learn Gabor-like filters in their first layer, why did Dapello et al. (2020) introduce a V1-like layer that also ends up with Gabor filters? What is the difference, why did they introduce this V1-like layer, and what is so special about it?

In the early layers of convolutional neural networks (CNNs), Gabor-like filters indeed emerge when they are trained with natural images. These filters respond to edges and orientations, similar to the neurons in the primary visual cortex (V1). However, the work by Dapello et al. (2020) introduced an explicit V1-like layer into CNNs, which is a significant difference from traditional CNN architectures. The motivation and specifics of this V1-like layer are explained below:

  • Motivation for Introducing the V1-like Layer:

    • Enhancing Robustness to Image Disturbances: The main goal of Dapello et al. was to improve the robustness of CNNs against adversarial attacks and other image disturbances. They observed that models with a higher match to V1 data were also more robust to such disturbances.
    • Emulating Early Visual Processing: The researchers wanted to more accurately model the early visual processing of the brain, particularly how V1 neurons react to images. They argued that the way V1 processes visual stimuli contributes to the robustness of the human visual system.
    • Adapting to Biological Constraints: The V1-like layer was intended to offer a way to adapt CNNs to biological constraints in order to develop more robust and reliable models.
  • Specifics of the V1-like Layer:

    • Local Connectivity and Spatial Structure: As in an ordinary convolutional layer, each unit in the V1-like layer responds only to a small region of the input image, similar to the receptive fields in V1. The difference from a standard CNN therefore lies in how the filters are defined, not in the locality of the connections.
    • Gabor Filters: The V1-like layer employs Gabor-like filters. In a standard CNN, these filters are learned, but the explicit V1-like layer directly uses Gabor filters at the start of the CNN architecture. These filters are designed to respond to different orientations and spatial frequencies.
    • Nonlinearity and Normalization: Besides Gabor filters, the V1-like layer also uses nonlinear activation functions and normalization operations that are akin to mechanisms in V1.
    • Additional Constraints: This V1-like layer also introduces constraints, such as ensuring that receptive fields are not excessively large and do not have too many units. Additionally, there was a constraint on the sparsity of the connections, meaning each unit was only connected to a subset of spatial locations in the previous layer.
  • The Difference from Regular CNNs:

    • Explicit vs. Implicit Gabor Filters: While regular CNNs learn Gabor-like filters in their initial layers through training, the V1-like layer explicitly implements Gabor filters. This means these filters are not just a result of training but are integrated into the architecture from the outset.
    • Focus on Biological Plausibility: Dapello et al.’s approach places a strong emphasis on biological plausibility. They aimed to create an architecture that not only excels at image recognition but also mirrors the functioning of the visual cortex.
    • Improved Robustness: By implementing this V1-like layer, they were able to develop CNNs that are more robust against image disturbances.
  • Why is this V1-like Layer Special?

    • Enhancement of Robustness: The V1-like layer significantly improves the robustness of CNNs, enhancing performance in image recognition under challenging conditions.
    • Biological Relevance: It offers a better alignment with the actual processes of early visual processing in the brain, making the models more scientifically relevant.
    • Network Adjustment: It demonstrates that architectural constraints on networks can lead to more robust models that better mimic brain function.

In summary, the V1-like layer introduced by Dapello et al. (2020) is more than just another convolutional layer. It is an explicit and biologically-inspired method aimed at enhancing the robustness of CNNs by emulating early visual processing in the brain, including the use of Gabor filters. The difference is not just in the use of Gabor filters, but in explicitly integrating them into the architecture from the beginning rather than having them learned implicitly through training, and in implementing additional constraints inspired by the V1 region.
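
To make the "explicit vs. learned" distinction concrete, here is a minimal NumPy sketch of a fixed Gabor filter bank followed by a rectifying, simple-cell-like nonlinearity. It is not the implementation of Dapello et al. (2020): the function names (`gabor_kernel`, `fixed_gabor_bank`, `simple_cell_response`) and all parameter values (kernel size, orientations, spatial frequencies, sigma) are assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gabor_kernel(size, theta, sf, sigma, phase=0.0):
    """2D Gabor: isotropic Gaussian envelope times a cosine grating at orientation theta."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = xs * np.cos(theta) + ys * np.sin(theta)   # axis along which the grating varies
    envelope = np.exp(-(xs**2 + ys**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * sf * x_rot + phase)

def fixed_gabor_bank(size=15, n_orient=4, sfs=(0.1, 0.2), sigma=3.0):
    """Filters are set by hand here and never trained (illustrative values)."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    return np.stack([gabor_kernel(size, t, f, sigma) for t in thetas for f in sfs])

def simple_cell_response(image, kernel):
    """Valid cross-correlation with one kernel, followed by a ReLU."""
    windows = sliding_window_view(image, kernel.shape)
    return np.maximum((windows * kernel).sum(axis=(-2, -1)), 0.0)

bank = fixed_gabor_bank()                                      # shape (8, 15, 15)
cols = np.arange(32)
grating = np.tile(np.cos(2.0 * np.pi * 0.1 * cols), (32, 1))   # varies along x only
resp_matched = simple_cell_response(grating, bank[0])          # theta = 0, sf = 0.1
resp_orthogonal = simple_cell_response(grating, bank[4])       # theta = pi/2, sf = 0.1
```

The filter whose orientation and spatial frequency match the grating responds far more strongly than the orthogonal one. In a standard CNN the same selectivity would have to be discovered by training instead of being built in.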

How does style transfer work?

  • Shape (content) and style features are extracted separately; the goal is to generate a picture that keeps the shape of one image while taking on the style of another.
  • A content loss and a style loss are computed, and the picture is improved by gradient descent so that both losses are minimized.
  • These steps are repeated over many iterations.

What kind of noise attacks are there?

What are the pros and cons of FNNs?

Why does a V1-like layer enhance noise robustness, given that Gabor-like filters emerge anyway? Why is the V1-like layer needed?

Answers

Nomansland

  • “Nomansland” refers to a study indicating that face recognition networks might not be as specialized as previously thought.
  • It suggests that brain areas responsible for face recognition, known as “face patches,” may also process other object types, contrary to the traditional belief that these areas are exclusively for faces.

Brain-Score

  • Brain-Score is a benchmark used to evaluate different Deep Neural Networks (DNNs) on their ability to predict neural data from the brain.
  • The goal of Brain-Score is to identify models that best explain brain activity, thus serving as better models for understanding the brain.
  • Some pros and cons of Brain-Score include:
    • Pros:
      • Provides a quantitative way to compare model performance.
      • Allows researchers to develop models that better match brain data.
      • Encourages competition among labs to develop the best brain models.
    • Cons:
      • Unclear if the top models based on Brain-Score lead to more biologically plausible models.
      • No significant correlation between task performance and Brain-Score for newer models.
  • Other benchmarks include Algonauts (for human data) and Sensorium (for mouse data).
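
The idea of "predicting neural data" can be sketched as a regression from model features to recorded responses, scored on held-out stimuli. The example below is only a rough illustration on synthetic data and is not Brain-Score's actual pipeline: the ridge mapping, the median Pearson correlation, the function names and all array sizes are assumptions.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression weights mapping features X to responses Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def neural_predictivity(feat_train, resp_train, feat_test, resp_test, lam=1.0):
    """Median held-out Pearson correlation across recording sites."""
    W = ridge_fit(feat_train, resp_train, lam)
    pred = feat_test @ W
    pc = pred - pred.mean(axis=0)
    rc = resp_test - resp_test.mean(axis=0)
    r = (pc * rc).sum(axis=0) / np.sqrt((pc**2).sum(axis=0) * (rc**2).sum(axis=0))
    return float(np.median(r))

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 20))            # synthetic model activations
true_map = rng.normal(size=(20, 5))              # hypothetical feature-to-site mapping
responses = features @ true_map + 0.1 * rng.normal(size=(300, 5))
unrelated = rng.normal(size=(300, 20))           # control "model" with no relation to the data

score = neural_predictivity(features[:200], responses[:200],
                            features[200:], responses[200:])
score_control = neural_predictivity(unrelated[:200], responses[:200],
                                    unrelated[200:], responses[200:])
```

A model whose features carry the information in the responses scores close to 1, while the unrelated control stays near 0. A high score only says that the mapping predicts well, which is one reason the biological plausibility of top-ranked models remains an open question.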

2-Stage RSA

  • The sources do not specifically mention a 2-stage RSA. However, RSA or Representational Similarity Analysis is described as:
    • A method for comparing representations at the population level.
    • Focuses on geometric relationships between stimulus representations rather than comparing individual units.
    • Compares similarity patterns across different representations, avoiding the problem of aligning ANN units to brain measurement channels.
    • Uses dissimilarity matrices to represent similarity between different stimuli.

hIT-Fitted RDM

  • An hIT-fitted RDM is a Representational Dissimilarity Matrix (RDM) fitted to neuronal responses in the inferior temporal cortex (IT) of the human brain.
  • It represents the dissimilarity between patterns of neuronal activity for different stimuli.
  • An RDM can be visualized as a matrix with color values, where blue indicates low dissimilarity and red indicates high dissimilarity.
  • The similarity of the RDM, measured with Kendall’s τA, is used to compare the alignment between layers of an HCNN model and the human IT.
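
As a rough sketch of how an RDM is built and how two RDMs can be compared with Kendall's τA, consider the NumPy example below. The data are synthetic, the correlation distance (1 minus Pearson r) is one common but assumed choice of dissimilarity, and the function names are illustrative. τA is implemented by hand because it differs from the tie-corrected variants in how tied pairs are treated.

```python
import numpy as np

def rdm(responses):
    """responses: (n_stimuli, n_channels). Returns 1 - Pearson r between stimulus patterns."""
    return 1.0 - np.corrcoef(responses)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries; the RDM is symmetric with a zero diagonal."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def kendall_tau_a(a, b):
    """(concordant - discordant) / (n * (n - 1) / 2), with no correction for ties."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = a.size
    sign_a = np.sign(a[:, None] - a[None, :])
    sign_b = np.sign(b[:, None] - b[None, :])
    # The double sum visits every unordered pair twice, hence n * (n - 1) below.
    return float((sign_a * sign_b).sum() / (n * (n - 1)))

rng = np.random.default_rng(0)
brain = rng.normal(size=(12, 50))             # 12 stimuli x 50 measurement channels (synthetic)
model = brain @ rng.normal(size=(50, 30))     # 12 stimuli x 30 units of a hypothetical layer
brain_rdm = rdm(brain)
model_rdm = rdm(model)
tau = kendall_tau_a(upper_triangle(brain_rdm), upper_triangle(model_rdm))
```

Because only the rank order of dissimilarities enters τA, the comparison needs no one-to-one mapping between model units and measurement channels.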

Adversarial Attacks: Bug or Feature?

  • Adversarial attacks are slight, often human-imperceptible changes to an image that cause a neural network to misclassify it.
  • Views differ on whether they are bugs or features:
    • Some argue they highlight bugs in current models and the lack of robustness in DNNs.
    • Others consider them a natural consequence of learned features, showing network specialization.
  • Adversarial examples reveal differences in visual strategies between biological and artificial neural networks.
  • Adversarial training can improve a network’s robustness.
  • Artificial neurons trained adversarially are shown to be more robust than biological neurons.
  • Adversarial images can even alter human neural responses to make non-preferred images more stimulating than preferred images, creating a “super-stimulus”.
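
A minimal sketch of how such a perturbation can be computed is the fast gradient sign method (FGSM), shown here on a toy logistic-regression "network" instead of a deep net. The choice of FGSM, the model, the epsilon value and all names are assumptions for illustration; the notes do not say which attack the lecture used.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(x, y, w, b):
    """Binary cross-entropy of the toy model on a single input."""
    p = sigmoid(w @ x + b)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def fgsm(x, y, w, b, eps):
    """One FGSM step: move every pixel by eps in the direction that increases the loss."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w                      # gradient of the cross-entropy w.r.t. the input
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

rng = np.random.default_rng(0)
w = rng.normal(size=100)                      # fixed weights of the hypothetical model
b = 0.0
x = rng.uniform(0.2, 0.8, size=100)           # "image" with pixel values in [0, 1]
y = 1.0 if sigmoid(w @ x + b) > 0.5 else 0.0  # use the model's own prediction as the label

eps = 0.05
x_adv = fgsm(x, y, w, b, eps)
loss_clean = cross_entropy(x, y, w, b)
loss_adv = cross_entropy(x_adv, y, w, b)
```

No pixel changes by more than `eps`, yet the loss on the original label goes up. Whether the predicted class actually flips depends on `eps` and on how confident the model was to begin with.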

How Style Transfer Works

  • Style Transfer is a technique where the style of one image (e.g., texture) is applied to the content of another image.
  • Achieved by applying gradient descent to the image itself while keeping the network weights fixed.
  • Aims to satisfy two objectives:
    • Style-Loss: The generated image should reflect the style of the style image.
    • Content-Loss: The generated image should retain the content of the content image.
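
The two objectives can be sketched with a toy example in which the "network" is a single fixed linear layer, so the gradient with respect to the image can be written by hand. Real style transfer uses features from several layers of a pretrained CNN and automatic differentiation. The Gram-matrix style statistic, the loss weights, the learning rate and all names below are assumptions for illustration.

```python
import numpy as np

def gram(F):
    """Channel-by-channel feature correlations, averaged over spatial positions."""
    return F @ F.T / F.shape[1]

def loss_and_grad(X, W, F_content, G_style, style_weight=1.0):
    """Total loss and its gradient w.r.t. the image X; the weights W stay fixed."""
    F = W @ X
    G = gram(F)
    content_loss = 0.5 * np.sum((F - F_content) ** 2)
    style_loss = 0.25 * np.sum((G - G_style) ** 2)
    # d(content)/dF = F - F_content ; d(style)/dF = (G - G_style) @ F / N
    dF = (F - F_content) + style_weight * (G - G_style) @ F / F.shape[1]
    return content_loss + style_weight * style_loss, W.T @ dF

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3)) / np.sqrt(3.0)    # fixed toy "network": 3 input channels to 8 features
content_img = rng.uniform(size=(3, 64))       # 3 channels x 64 pixel positions
style_img = rng.uniform(size=(3, 64))
F_content = W @ content_img
G_style = gram(W @ style_img)

X = rng.uniform(size=(3, 64))                 # start from a noise image
history = []
for _ in range(200):
    loss, grad = loss_and_grad(X, W, F_content, G_style)
    history.append(loss)
    X -= 0.02 * grad                          # gradient descent on the image, not on W
```

Only `X` changes during the loop. The Gram matrix discards spatial arrangement, which is why it captures texture-like style while the content term preserves layout.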

Types of Noise Attacks

  • The sources mention various types of noise attacks, including:
    • Universal adversarial attacks: Apply a small, image-independent noise mask to images.
    • Adversarial physical attacks: Use stickers or physical modifications on objects to fool a neural network.
    • Additive Gaussian noise attacks: Add Gaussian noise to images.
    • Salt-and-pepper noise attacks: Add salt-and-pepper noise to images.
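
The two image-noise perturbations listed last can be sketched in a few lines of NumPy. The function names, the noise levels and the assumption that pixel values lie in [0, 1] are illustrative choices.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel and clip back to [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

def add_salt_and_pepper(image, fraction, rng):
    """Set roughly `fraction` of the pixels to pure white (salt) or pure black (pepper)."""
    noisy = image.copy()
    corrupt = rng.random(image.shape) < fraction
    salt = rng.random(image.shape) < 0.5
    noisy[corrupt & salt] = 1.0
    noisy[corrupt & ~salt] = 0.0
    return noisy

rng = np.random.default_rng(0)
image = np.full((64, 64), 0.5)                # flat grey test image
gaussian_img = add_gaussian_noise(image, sigma=0.1, rng=rng)
sp_img = add_salt_and_pepper(image, fraction=0.1, rng=rng)
changed = sp_img != image                     # pixels hit by salt or pepper
```

Unlike adversarial perturbations, neither kind of noise uses any knowledge of the network, so both mainly probe general robustness to image corruption.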

Pros and Cons of FNNs

  • FNNs (Feedforward Neural Networks) have historically been the dominant framework for modeling vision.
  • Some of their pros and cons are:
    • Pros:
      • Effective for image recognition and classification tasks.
      • Capable of learning complex non-linear relationships between input and output.
      • Can be trained via various learning rules.
    • Cons:
      • Susceptible to adversarial attacks and often generalize poorly to unknown data.
      • Tend to focus more on texture than form.
      • Struggle to adapt to temporal patterns due to lack of feedback loops.
      • Not ideal for modeling the dorsal areas of the visual system.

Questions

Nomansland?

GAN stimulus optimization?

How can we predict fixations?
Why are humans the best at predicting other humans?

what is the BH score (page 24):
Achievement of human-level image recognition by deep neural networks (DNNs) has spurred interest in whether and how DNNs are brain-like. Both DNNs and the visual cortex perform hierarchical processing, and correspondence has been shown between hierarchical visual areas and DNN layers in representing visual features. Here, we propose the brain hierarchy (BH) score as a metric to quantify the degree of hierarchical correspondence based on neural decoding and encoding analyses where DNN unit activations and human brain activity are predicted from each other. We find that BH scores for 29 pre-trained DNNs with various architectures are negatively correlated with image recognition performance, thus indicating that recently developed high-performance DNNs are not necessarily brain-like. Experimental manipulations of DNN models suggest that single-path sequential feedforward architecture with broad spatial integration is critical to brain-like hierarchy. Our method may provide new ways to design DNNs in light of their representational homology to the brain. (Nonaka, 2021)

How do adversarial attacks work?

Are there adversarial attacks in V1?

see also

Type:
Tags:
Status:
Location:
Created: 31-01-25 09:42
Machine Learning for Cognitive Computational Neuroscience

Source