Idea
Append two types of attention modules on top of a dilated FCN, which model the semantic interdependencies in the spatial and channel dimensions respectively.
Architecture
Dual Attention Module
Positional Attention Module
Just like the attention mechanism in the Transformer, we may consider $B$ as the key, $C$ as the query and $D$ as the value. The difference is that convolution layers, rather than linear layers, are applied to convert the input feature $A$ into $B$, $C$ and $D$. The final self-attention score is

$$s_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i=1}^{N} \exp(B_i \cdot C_j)},$$

and the module output is $E_j = \alpha \sum_{i=1}^{N} s_{ji} D_i + A_j$, where $N = H \times W$ is the number of spatial positions and $\alpha$ is a learnable scale initialized to zero.
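A minimal PyTorch sketch of the position attention module, assuming 1x1 convolutions for the query/key/value projections and the channel-reduction factor of 8 used in the authors' released code; the class and variable names here are my own, not the paper's:

```python
import torch
import torch.nn as nn

class PositionAttention(nn.Module):
    """Position attention module (sketch following the DANet formulation)."""
    def __init__(self, in_channels):
        super().__init__()
        self.query_conv = nn.Conv2d(in_channels, in_channels // 8, kernel_size=1)
        self.key_conv = nn.Conv2d(in_channels, in_channels // 8, kernel_size=1)
        self.value_conv = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        self.alpha = nn.Parameter(torch.zeros(1))  # learnable scale, starts at 0
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):                                              # x: (B, C, H, W)
        b, c, h, w = x.size()
        q = self.query_conv(x).view(b, -1, h * w).permute(0, 2, 1)     # (B, N, C')
        k = self.key_conv(x).view(b, -1, h * w)                        # (B, C', N)
        attn = self.softmax(torch.bmm(q, k))                           # (B, N, N) spatial attention map
        v = self.value_conv(x).view(b, -1, h * w)                      # (B, C, N)
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)     # weighted sum over positions
        return self.alpha * out + x                                    # scaled residual connection


# Example: the output keeps the input shape.
feat = torch.randn(2, 512, 32, 32)
print(PositionAttention(512)(feat).shape)  # torch.Size([2, 512, 32, 32])
```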
Channel Attention Module
Apply the same self-attention mechanism channel-wise. Here the attention map is computed directly from the original feature $A$ (no convolution layers are used to embed it): $x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i=1}^{C} \exp(A_i \cdot A_j)}$, and the output is $E_j = \beta \sum_{i=1}^{C} x_{ji} A_i + A_j$ with a learnable scale $\beta$.
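A corresponding sketch of the channel attention module under the same assumptions (names are illustrative; the softmax is taken over the channel affinity as in the paper's equation):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention module (sketch): attention over channels of the raw feature."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))  # learnable scale, starts at 0
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):                              # x: (B, C, H, W)
        b, c, h, w = x.size()
        q = x.view(b, c, -1)                           # (B, C, N)
        k = x.view(b, c, -1).permute(0, 2, 1)          # (B, N, C)
        attn = self.softmax(torch.bmm(q, k))           # (B, C, C) channel attention map
        v = x.view(b, c, -1)                           # (B, C, N)
        out = torch.bmm(attn, v).view(b, c, h, w)      # re-weight channels
        return self.beta * out + x                     # scaled residual connection
```

In the full architecture, the outputs of the two modules are each passed through a convolution layer, summed element-wise, and followed by a final convolution to produce the prediction map.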