Deep learning - understanding deconv layer math
I need some help. I'm trying to understand how the math of a deconvolution layer works. Let's talk about this layer:
layer {
  name: "decon"
  type: "Deconvolution"
  bottom: "conv2"
  top: "decon"
  convolution_param {
    num_output: 1
    kernel_size: 4
    stride: 2
    pad: 1
  }
}
So this layer is supposed to "upscale" an image by a factor of 2. If I look at the learned weights, I see e.g. this:
-0.0629104823  -0.1560362280  -0.1512266700  -0.0636162385
-0.0635886043  +0.2607241870  +0.2634004350  -0.0603787377
-0.0718072355  +0.3858278100  +0.3168329000  -0.0817491412
-0.0811873227  -0.0312164668  -0.0321144797  -0.0388795212
So far, so good. Now I'm trying to understand how to apply these weights to achieve the upscaling effect. I need to do this in my own code because I want to use simple pixel shaders.
Looking at the Caffe source code, "DeconvolutionLayer::Forward_cpu" internally calls "backward_cpu_gemm" (a gemm), followed by "col2im". My understanding of how this works is: the gemm takes the input image and multiplies each pixel with each of the 16 weights listed above, so the gemm produces 16 output "images". Then col2im sums these 16 "images" up to produce the final output image. Due to the stride of 2, col2im stretches the 16 gemm images over the output image in such a way that each output pixel is comprised of only 4 of the gemm pixels. Does that sound correct so far?
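To make sure I'm describing the same thing, here is a minimal single-channel NumPy sketch of that "scatter" view (my own code and naming, not Caffe's actual implementation; deconv2d is a made-up name):

import numpy as np

def deconv2d(x, w, bias=0.0, stride=2, pad=1):
    # Transposed convolution ("deconvolution") for one input and one output
    # channel, written in the scatter view: every input pixel is multiplied
    # by the whole 4x4 kernel and accumulated into the output at stride 2.
    # This mirrors the gemm (per-pixel products) + col2im (overlap-add) steps.
    kh, kw = w.shape
    ih, iw = x.shape
    oh = (ih - 1) * stride + kh          # full output size before cropping
    ow = (iw - 1) * stride + kw
    out = np.zeros((oh, ow))
    for y in range(ih):
        for x_ in range(iw):
            out[y*stride:y*stride+kh, x_*stride:x_*stride+kw] += x[y, x_] * w
    return out[pad:oh-pad, pad:ow-pad] + bias   # crop by pad, add the bias

# With kernel_size=4, stride=2, pad=1 this doubles the resolution:
lowres = np.random.rand(8, 8)
kernel = np.random.rand(4, 4)
print(deconv2d(lowres, kernel).shape)   # (16, 16)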
My understanding is that each output pixel is calculated from the nearest 4 low-res pixels, using 4 weights from the 4x4 deconv weight matrix. If you look at the following image:
https://i.stack.imgur.com/x6ixe.png
Each output pixel uses either the yellow, the pink, the grey or the white weights, but not the other weights. Do I understand that correctly? If so, I have a huge understanding problem, because for this whole concept to work correctly, the yellow weights should e.g. add up to the same sum as the pink weights, etc. But they do not! As a result, my pixel shader produces images where 1 out of 4 pixels is darker than the others, or where every other line is darker, or similar things (depending on the trained model I'm using). Obviously, when running the model through Caffe, no such artifacts occur. So I must have a misunderstanding somewhere. But I can't find it... :-(
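For reference, here is the "gather" view I mean, again as a single-channel NumPy sketch with my own naming (this is what a pixel shader would do): every output pixel reads its 4 nearest low-res pixels and picks 4 of the 16 weights depending only on the output pixel's parity, i.e. the yellow/pink/grey/white groups from the image:

import numpy as np

def deconv2d_gather(x, w, bias=0.0):
    # Same transposed convolution (kernel 4, stride 2, pad 1), but gathered
    # from the output side: each output pixel sums 4 taps, one from each of
    # the 4 nearest input pixels, weighted with 4 of the 16 kernel values.
    ih, iw = x.shape
    out = np.full((ih * 2, iw * 2), bias)
    for oy in range(ih * 2):
        for ox in range(iw * 2):
            acc = 0.0
            for iy in range((oy + 1) // 2 - 1, (oy + 1) // 2 + 1):
                for ix in range((ox + 1) // 2 - 1, (ox + 1) // 2 + 1):
                    if 0 <= iy < ih and 0 <= ix < iw:   # border taps are skipped
                        ky = oy - 2 * iy + 1            # the +1 comes from pad: 1
                        kx = ox - 2 * ix + 1
                        acc += x[iy, ix] * w[ky, kx]
            out[oy, ox] += acc
    return out

This gives the same result as the scatter sketch above: even output rows use kernel rows 1 and 3, odd rows use rows 0 and 2 (and likewise for the columns), so each of the 4 output phases uses a fixed group of 4 weights, which is exactly why I expected those groups to sum to the same value.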
P.S.: For complete information: there's a conv layer in front of the deconv layer with a "num_output" of e.g. 64. So the deconv layer has e.g. 64 4x4 weights, plus 1 bias, of course.
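For that multi-channel case, each input channel is deconvolved with its own 4x4 kernel, the per-channel results are summed, and the single bias is added once. A sketch, reusing the deconv2d function from above (the array shapes here are my assumption):

import numpy as np

def deconv2d_multichannel(x, w, bias=0.0):
    # x: (C, H, W) input feature maps, w: (C, 4, 4) -- one 4x4 kernel per
    # input channel, num_output: 1.
    c, ih, iw = x.shape
    out = np.zeros((ih * 2, iw * 2))
    for ch in range(c):
        out += deconv2d(x[ch], w[ch], bias=0.0)
    return out + bias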
After a lot of debugging I found that my understanding of the deconv layer was perfectly alright. I fixed the artifacts by simply dividing the bias floats by 255.0. That's necessary because my pixel shaders run in the 0-1 range, while the Caffe bias constants seem to be targeted at 0-255 pixel values.
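The reason dividing only the bias works is that the deconv layer is linear in its input, so scaling the input scales the weighted sum but not the bias. A tiny numeric check (toy shapes, not the real layer):

import numpy as np

# W @ (x / 255) + b / 255  ==  (W @ x + b) / 255
W = np.random.rand(4, 4)
x = np.random.rand(4)
b = 3.0
assert np.allclose(W @ (x / 255.0) + b / 255.0, (W @ x + b) / 255.0)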
Everything is working great now.
I still don't understand why the 4 weight groups don't sum to the same value and how this can possibly work anyway. But I know that it does work, after all. I suppose some things will just remain a mystery to me.