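Computing the Receptive Field

Posted on 2018-06-20, updated on 2026-01-02, category: caffe

The feature map size is outsize = (insize - fsize + 2*pad) / stride + 1, using integer division
The cumulative stride is totstride = totstride * stride; both of these formulas are evaluated front to back, from the input towards the output
The receptive field is RF = ((RF -1)* stride) + fsize, which has to be evaluated back to front, starting with RF = 1 at the layer of interest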

#!/usr/bin/env python

net_struct = {'alexnet': {'net':[[11,4,0],[3,2,0],[5,1,2],[3,2,0],[3,1,1],[3,1,1],[3,1,1],[3,2,0]],
                   'name':['conv1','pool1','conv2','pool2','conv3','conv4','conv5','pool5']},
       'vgg16': {'net':[[3,1,1],[3,1,1],[2,2,0],[3,1,1],[3,1,1],[2,2,0],[3,1,1],[3,1,1],[3,1,1],
                        [2,2,0],[3,1,1],[3,1,1],[3,1,1],[2,2,0],[3,1,1],[3,1,1],[3,1,1],[2,2,0]],
                 'name':['conv1_1','conv1_2','pool1','conv2_1','conv2_2','pool2','conv3_1','conv3_2',
                         'conv3_3', 'pool3','conv4_1','conv4_2','conv4_3','pool4','conv5_1','conv5_2','conv5_3','pool5']},
       'zf-5':{'net': [[7,2,3],[3,2,1],[5,2,2],[3,2,1],[3,1,1],[3,1,1],[3,1,1]],
               'name': ['conv1','pool1','conv2','pool2','conv3','conv4','conv5']}}

#[3,1,1] stands for filter_size, stride, padding

imsize = 224

def outFromIn(isz, net, layernum):
    # Walk forward through the first `layernum` layers, tracking the
    # output size and the cumulative stride.
    totstride = 1
    insize = isz
    for layer in range(layernum):
        fsize, stride, pad = net[layer]
        # Floor division keeps the result an integer on both Python 2 and 3.
        outsize = (insize - fsize + 2*pad) // stride + 1
        insize = outsize
        totstride = totstride * stride
    return outsize, totstride

def inFromOut(net, layernum):
    # Walk backward from a single output unit to the input image.
    RF = 1
    for layer in reversed(range(layernum)):
        fsize, stride, pad = net[layer]
        RF = ((RF -1)* stride) + fsize
    return RF

if __name__ == '__main__':
    print("layer output sizes given image = %dx%d" % (imsize, imsize))
    for net in net_struct.keys():
        print('************net structrue name is %s**************' % net)
        for i in range(len(net_struct[net]['net'])):
            p = outFromIn(imsize,net_struct[net]['net'], i+1)
            rf = inFromOut(net_struct[net]['net'], i+1)
            print("Layer Name = %s, Output size = %3d, Stride = % 3d, RF size = %3d" % (net_struct[net]['name'][i], p[0], p[1], rf))

layer output sizes given image = 224x224
************net structrue name is vgg16**************
Layer Name = conv1_1, Output size = 224, Stride =   1, RF size =   3
Layer Name = conv1_2, Output size = 224, Stride =   1, RF size =   5
Layer Name = pool1, Output size = 112, Stride =   2, RF size =   6
Layer Name = conv2_1, Output size = 112, Stride =   2, RF size =  10
Layer Name = conv2_2, Output size = 112, Stride =   2, RF size =  14
Layer Name = pool2, Output size =  56, Stride =   4, RF size =  16
Layer Name = conv3_1, Output size =  56, Stride =   4, RF size =  24
Layer Name = conv3_2, Output size =  56, Stride =   4, RF size =  32
Layer Name = conv3_3, Output size =  56, Stride =   4, RF size =  40
Layer Name = pool3, Output size =  28, Stride =   8, RF size =  44
Layer Name = conv4_1, Output size =  28, Stride =   8, RF size =  60
Layer Name = conv4_2, Output size =  28, Stride =   8, RF size =  76
Layer Name = conv4_3, Output size =  28, Stride =   8, RF size =  92
Layer Name = pool4, Output size =  14, Stride =  16, RF size = 100
Layer Name = conv5_1, Output size =  14, Stride =  16, RF size = 132
Layer Name = conv5_2, Output size =  14, Stride =  16, RF size = 164
Layer Name = conv5_3, Output size =  14, Stride =  16, RF size = 196
Layer Name = pool5, Output size =   7, Stride =  32, RF size = 212
************net structrue name is zf-5**************
Layer Name = conv1, Output size = 112, Stride =   2, RF size =   7
Layer Name = pool1, Output size =  56, Stride =   4, RF size =  11
Layer Name = conv2, Output size =  28, Stride =   8, RF size =  27
Layer Name = pool2, Output size =  14, Stride =  16, RF size =  43
Layer Name = conv3, Output size =  14, Stride =  16, RF size =  75
Layer Name = conv4, Output size =  14, Stride =  16, RF size = 107
Layer Name = conv5, Output size =  14, Stride =  16, RF size = 139
************net structrue name is alexnet**************
Layer Name = conv1, Output size =  54, Stride =   4, RF size =  11
Layer Name = pool1, Output size =  26, Stride =   8, RF size =  19
Layer Name = conv2, Output size =  26, Stride =   8, RF size =  51
Layer Name = pool2, Output size =  12, Stride =  16, RF size =  67
Layer Name = conv3, Output size =  12, Stride =  16, RF size =  99
Layer Name = conv4, Output size =  12, Stride =  16, RF size = 131
Layer Name = conv5, Output size =  12, Stride =  16, RF size = 163
Layer Name = pool5, Output size =   5, Stride =  32, RF size = 195

v1

Atrous algorithm

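Like FCN, deeplab is fine-tuned from VGG.
The first problem to solve is how to enlarge the final score map. deeplab changes the stride of VGG's last two pooling layers, pool4 and pool5, from 2 to 1, so the overall stride of VGG drops from 32 to 8.
But once the stride is changed, the receptive fields of the following conv layers no longer match those of the pretrained network, so the weights cannot simply be fine-tuned. This is where the very elegant atrous algorithm comes in.
- The receptive field of a feature map is computed as $RF_{i}=(RF_{i+1}-1)*stride+kernel$
- So when the stride shrinks, the kernel has to grow to compensate: holes are inserted into the kernel, which makes it larger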
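
The compensation can be checked with the same back-to-front formula. The sketch below is only an illustration under stated assumptions, not deeplab's actual layer definition: it assumes pool4 keeps its 2x2 kernel with stride 1, that the three conv5 layers use a hole rate of 2, and that a kernel of size k with rate r spans k + (k-1)(r-1) input positions. The names `effective_kernel` and `receptive_field` are made up for this example.

```python
def effective_kernel(fsize, rate):
    # Span of a kernel with (rate - 1) holes between neighbouring taps.
    return fsize + (fsize - 1) * (rate - 1)

def receptive_field(layers):
    # layers: list of (filter_size, stride), evaluated back to front.
    rf = 1
    for fsize, stride in reversed(layers):
        rf = (rf - 1) * stride + fsize
    return rf

# VGG16 from conv1_1 up to conv4_3, as (filter_size, stride).
trunk = [(3, 1), (3, 1), (2, 2),
         (3, 1), (3, 1), (2, 2),
         (3, 1), (3, 1), (3, 1), (2, 2),
         (3, 1), (3, 1), (3, 1)]

original = trunk + [(2, 2)] + [(3, 1)] * 3              # pool4 stride 2
stride1 = trunk + [(2, 1)] + [(3, 1)] * 3               # pool4 stride 1, no holes
atrous = trunk + [(2, 1)] + [(effective_kernel(3, 2), 1)] * 3  # stride 1, rate 2

rf_original = receptive_field(original)
rf_stride1 = receptive_field(stride1)
rf_atrous = receptive_field(atrous)
print(rf_original, rf_stride1, rf_atrous)  # 196 148 196
```

Under these assumptions, dropping the stride alone shrinks the conv5_3 receptive field from 196 to 148, and the holes bring it back to 196.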

deeplab

Fully connected CRF

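Put simply, when a CRF decides the value of a pixel (in this paper, its label), it also takes the values (labels) of the neighboring pixels into account, which wipes out some noise. The feature maps produced by a CNN are already fairly smooth, though, so a short-range CRF adds little. The authors therefore use a fully connected CRF, which takes global information into account.
The random variable $X_i$ is the label of pixel $i$; the variables $X_1, X_2, \ldots, X_N$ together form the random vector $X$, where $N$ is the number of pixels in the image.
In a fully connected CRF, the energy of a labeling $x$ is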

$$
E(x)=\sum_i\theta_i(x_i)+\sum_{ij}\theta_{ij}(x_i,x_j)
$$

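$\theta_i(x_i)$ is the unary energy, the energy of assigning label $x_i$ to pixel $i$; the pairwise energy $\theta_{ij}(x_i,x_j)$ is the energy of assigning labels $x_i$ and $x_j$ to pixels $i$ and $j$ at the same time.
The unary energy is taken from the FCN output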

$$
\theta_i(x_i) = -\log P(x_i)
$$

The pairwise energy is

$$
\theta_{ij}(x_i, x_j)=\mu(x_i, x_j)\left[\omega_1\exp\left(-\frac{\left\|p_i-p_j\right\|^2}{2\sigma_\alpha^2}-\frac{\left\|I_i-I_j\right\|^2}{2\sigma_\beta^2}\right)+\omega_2\exp\left(-\frac{\left\|p_i-p_j\right\|^2}{2\sigma_\gamma^2}\right)\right]
$$
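
To make the pairwise term concrete, here is a minimal sketch that evaluates it for a single pair of pixels. Several things are assumptions of this example rather than something stated above: the compatibility $\mu$ is taken to be the Potts model (1 when the two labels differ, 0 otherwise), $p$ is a pixel position and $I$ a color vector, and the default weights and bandwidths are arbitrary placeholder values. The function name `pairwise_energy` is hypothetical.

```python
import math

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def pairwise_energy(label_i, label_j, pos_i, pos_j, color_i, color_j,
                    w1=1.0, w2=1.0,
                    sigma_alpha=10.0, sigma_beta=10.0, sigma_gamma=3.0):
    # Assumed Potts compatibility: identical labels cost nothing.
    if label_i == label_j:
        return 0.0
    d_pos = squared_distance(pos_i, pos_j)
    d_col = squared_distance(color_i, color_j)
    # Appearance kernel: nearby pixels with similar color.
    appearance = w1 * math.exp(-d_pos / (2 * sigma_alpha ** 2)
                               - d_col / (2 * sigma_beta ** 2))
    # Smoothness kernel: nearby pixels regardless of color.
    smoothness = w2 * math.exp(-d_pos / (2 * sigma_gamma ** 2))
    return appearance + smoothness

# Two adjacent pixels with the same color but different labels are
# penalized more than two distant pixels with different colors.
near = pairwise_energy(0, 1, (5, 5), (5, 6), (200, 10, 10), (200, 10, 10))
far = pairwise_energy(0, 1, (5, 5), (80, 90), (200, 10, 10), (10, 200, 10))
print(near > far)
```

The energy is largest when two pixels that look alike and sit close together receive different labels, which is exactly the situation the CRF is meant to discourage.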

Main reference: here

v2

v2 adds multiple receptive fields on top of v1

deeplabv2

Reference: here

mt19937

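Posted on 2018-06-12, updated on 2026-01-02

Background

You need to port existing MATLAB code to C, Python and so on
Your code uses random numbers
And your boss insists that your implementation behave exactly the same
Reference: here

mt19937

The Mersenne Twister is one of the most commonly used random number generators today
Its period is extremely long, $2^{19937}-1$, and it is very fast

Python code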

import numpy as np
np.random.seed(1337)
A = np.random.random((5,3))
A.T
array([[ 0.26202468,  0.45931689,  0.26194293,  0.11527423,  0.12505793],
       [ 0.15868397,  0.32100054,  0.97608528,  0.38627507,  0.98354861],
       [ 0.27812652,  0.51839282,  0.73281455,  0.62850118,  0.44322487]])

MATLAB code

rand('twister', 1337);
A = rand(3,5)
A = 
 Columns 1 through 2
   0.262024675015582   0.459316887214567
   0.158683972154466   0.321000540520167
   0.278126519494360   0.518392820597537
  Columns 3 through 4
   0.261942925565145   0.115274226683149
   0.976085284877434   0.386275068634359
   0.732814552690482   0.628501179539712
  Column 5
   0.125057926335599
   0.983548605143641
   0.443224868645128

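C++ code

#include <iostream>
#include <random>

int main()
{
  unsigned seed1 = 1337;
  std::mt19937 g1(seed1);
  std::cout.precision(15);
  for(int i=0; i<100; i++) {
    // Dividing one 32-bit output by g1.max() gives only 32 bits of resolution.
    // To compare with the doubles printed above, combine two outputs into a
    // 53-bit value in [0, 1), the way the reference genrand_res53 does.
    unsigned long long a = g1() >> 5;
    unsigned long long b = g1() >> 6;
    std::cout << (a * 67108864.0 + b) / 9007199254740992.0 << std::endl;
  }
}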
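
All three snippets above seed the same generator, so it can help to see what that generator does. The following is a rough pure-Python sketch of MT19937, written for illustration only. The class name `MT19937` and its methods are made up here. The `next_double` step, which packs two 32-bit outputs into one 53-bit double, follows the reference implementation's `genrand_res53`; that NumPy and MATLAB build their doubles the same way is an assumption you should verify against the values printed above.

```python
class MT19937:
    """Illustrative 32-bit Mersenne Twister (not tuned for speed)."""

    def __init__(self, seed):
        # Standard single-integer initialisation of the 624-word state.
        self.mt = [0] * 624
        self.mt[0] = seed & 0xFFFFFFFF
        for i in range(1, 624):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = 624  # force a twist before the first output

    def _twist(self):
        mt = self.mt
        for i in range(624):
            y = (mt[i] & 0x80000000) | (mt[(i + 1) % 624] & 0x7FFFFFFF)
            mt[i] = mt[(i + 397) % 624] ^ (y >> 1)
            if y & 1:
                mt[i] ^= 0x9908B0DF
        self.index = 0

    def next_u32(self):
        if self.index >= 624:
            self._twist()
        y = self.mt[self.index]
        self.index += 1
        # Tempering.
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF

    def next_double(self):
        # 27 + 26 = 53 random bits, scaled into [0, 1).
        a = self.next_u32() >> 5
        b = self.next_u32() >> 6
        return (a * 67108864.0 + b) / 9007199254740992.0

gen = MT19937(1337)
first_row = [gen.next_double() for _ in range(3)]
print(first_row)
```

If the assumption holds, the three values printed for seed 1337 should line up with the first three entries of the NumPy array above. Any port that has to be exactly the same needs to agree on both the seeding routine and this double construction.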

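A few more things

The simplest generation algorithm is the mixed congruential method (a linear congruential generator); see here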
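
For comparison, a mixed congruential generator fits in a few lines. This is a minimal sketch; the multiplier, increment and modulus are the widely quoted Numerical Recipes constants, chosen here only as an example and not taken from the linked article. The function name `lcg` is hypothetical.

```python
def lcg(seed, a=1664525, c=1013904223, m=2 ** 32):
    # Mixed congruential method: x_{n+1} = (a * x_n + c) mod m.
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state

gen = lcg(0)
first = next(gen)
uniform = [next(gen) / 2 ** 32 for _ in range(3)]  # scaled into [0, 1)
print(first, uniform)
```

Because the whole state is a single integer, reproducing such a generator across languages is trivial, but its statistical quality is far below that of the Mersenne Twister.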

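For the C++11 random library, see here

Big-number multiplication

Posted on 2018-06-12, updated on 2026-01-02, category: cpp

A problem from Peking University's C++ course on Coursera. It works locally and is accepted on poj (PKU's OJ), but the lousy Coursera grader reports a compile error.
Posting it here for now; I still think it is written quite elegantly.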

#include <iostream>  
#include <algorithm>
#include <string>
using namespace std;

class BigInt {
public:
	BigInt(){
		values = "0";
		flag = true;
	}
	BigInt(const string stringvalue){
		values = stringvalue;
		flag = true;
	}
	BigInt(const int intvalue){
		if (intvalue >= 0){
			values = to_string(intvalue);
			flag = true;
		}
		else{
			values = to_string(-intvalue);
			flag = false;
		}
	}
	//~BigInt();
	friend ostream& operator << (ostream& out, const BigInt& b);
	friend istream& operator >> (istream& in, BigInt& b); // must match the non-const definition below
	BigInt& operator + (BigInt b);
	BigInt& operator - (BigInt b);
	BigInt& operator * (BigInt b);
	BigInt& operator / (BigInt b);
	bool operator >= (BigInt b);

//private:
	string values;
	bool flag;
};

bool BigInt::operator >= (BigInt b){
	int agb = 0;
	if (values.length() > b.values.length())
	{
		agb = 1;
	}
	else if (values.length() < b.values.length())
	{
		agb = -1;
	}
	else
	{
		agb = values.compare(b.values);
	}
	return agb>=0;
}

 BigInt& BigInt :: operator + (BigInt b) {

	if (flag == b.flag){
		string res = "";
		reverse(values.begin(), values.end());
		reverse(b.values.begin(), b.values.end());
		int i = 0, carry = 0;
		for (; i < values.length() && i < b.values.length(); ++i){
			int tmp = values[i] - '0' + b.values[i] - '0' + carry;
			carry = tmp / 10;
			tmp = tmp % 10;
			res = (char)(tmp + '0') + res;
		}
		if (i < values.length()){
			for (; i < values.length(); ++i){
				int tmp = values[i] - '0' + carry;
				carry = tmp / 10;
				tmp = tmp % 10;
				res = (char)(tmp + '0') + res;
			}
		}
		else if (i < b.values.length()){
			for (; i < b.values.length(); ++i){
				int tmp = b.values[i] - '0' + carry;
				carry = tmp / 10;
				tmp = tmp % 10;
				res = (char)(tmp + '0') + res;
			}
		}
		if (carry == 1)
			res = '1' + res;
		values = res;
	}
	else{
		int agb = 0;
		if (values.length() > b.values.length())
		{
			agb = 1;
		}
		else if (values.length() < b.values.length())
		{
			agb = -1;
		}
		else
		{
			agb = values.compare(b.values);
		}
		if (0 == agb)
		{
			values = "0";
			return *this;
		}
		
		else if (agb < 0){
			flag = !flag;
			string tmp = values;
			values = b.values;
			b.values = tmp;
		}
		string res = "";
		reverse(values.begin(), values.end());
		reverse(b.values.begin(), b.values.end());
		int i = 0;
		for (; i < values.length() && i < b.values.length(); ++i)
			res.push_back(values.at(i) - b.values.at(i) + '0');

		if (i < values.length()) 
			for (; i < values.length(); ++i)
				res.push_back(values.at(i));

		int carry = 0;
		for (i = 0; i < values.length(); ++i)
		{
			int newValue = res.at(i) - carry - '0';
			if (newValue < 0) carry = 1;
			else carry = 0;
			res.at(i) = newValue + carry * 10 + '0';
		}
		while (res[res.length() - 1] == '0')
			res.pop_back();
		reverse(res.begin(), res.end());
		values = res;
	}

	return *this;
}

 BigInt &BigInt::operator - (BigInt b)
 {	
	 BigInt tmp(b);
	 tmp.flag = !tmp.flag;
	 return *this+tmp;
 }

 BigInt &BigInt::operator * (BigInt b)
 {
	 BigInt res;
	 if (values == "0" || b.values == "0"){
		 values = "0";
		 return *this;
	 }
	 if (flag == b.flag)
		 flag = true;
	 else
		 flag = false;

	 BigInt thisbk(*this);
	 for (int i = 0; i < b.values.length(); ++i){
		 for (int j = 0; j < b.values[i] - '0'; ++j){
			 res = res + thisbk;
		 }
		res.values.push_back('0');
	 }
	 res.values.pop_back();
	 values = res.values;
	 return *this;
 }

 BigInt &BigInt::operator / (BigInt b)
 {
	 if (b.values == "0")
		 throw "Division by zero condition!";
	 if (flag == b.flag)
		 flag = true;
	 else
		 flag = false;

	 BigInt one(1);
	 while (*this >= b){
		 b.values.push_back('0');
		 one.values.push_back('0');
	 }
	 b.values.pop_back();
	 one.values.pop_back();

	 BigInt thisbk(*this), res(0);
	 while (one.values.compare("0") > 0){
		 while (*this >= b){
			 *this - b;
			 res + one;
		 }
		 b.values.pop_back();
		 one.values.pop_back();
	 }
	 values = res.values;
	 return *this;
 }

ostream& operator << (ostream& ou, const BigInt& b)
{
	if (!b.flag)
		ou << '-';
	ou << b.values;
	return ou;
}

istream& operator >> (istream& in, BigInt& b)
{
	string str;
	in >> str;
	b.values = str;
	return in;
}

int main()
{
	BigInt b1, b2;
	string str;
	cin >> b1 >> str >> b2;

	if (str == "+")
	{
		cout << b1 + b2 << endl;
	}
	else if (str == "-")
	{
		cout << b1 - b2 << endl;
	}
	else if (str == "*")
	{
		cout << b1 * b2 << endl;
	}
	else
	{
		cout << b1 / b2 << endl;
	}
	return 0;
}
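
The core idea of the class above is schoolbook arithmetic on decimal strings. As a compact illustration, and not a translation of the full class, the sketch below mirrors the same-sign branch of operator+ and the shift-and-add loop of operator* for non-negative numbers only. The function names are made up for this example.

```python
def add_digit_strings(a, b):
    # Add two non-negative decimal strings digit by digit, least
    # significant digit first, carrying as in operator+.
    ra, rb = a[::-1], b[::-1]
    digits = []
    carry = 0
    for i in range(max(len(ra), len(rb))):
        da = int(ra[i]) if i < len(ra) else 0
        db = int(rb[i]) if i < len(rb) else 0
        carry, digit = divmod(da + db + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append("1")
    return "".join(reversed(digits))

def mul_digit_strings(a, b):
    # For each digit of b: shift the running result one place left,
    # then add a that many times, as in operator*.
    if a == "0" or b == "0":
        return "0"
    res = "0"
    for ch in b:
        if res != "0":
            res += "0"
        for _ in range(int(ch)):
            res = add_digit_strings(res, a)
    return res

print(add_digit_strings("999", "1"))                # 1000
print(mul_digit_strings("123456789", "987654321"))
```

Repeated addition costs up to nine additions per digit of b, which is fine for an exercise; a digit-by-digit product with carry would do the same work in one pass per digit.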

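mobileNet and shuffleNet

Posted on 2018-05-13, updated on 2026-01-02, category: caffe

mobileNet first

It replaces the standard convolution with a depthwise convolution followed by a pointwise (1*1) convolution
The depthwise convolution, shown as (b) in the figure, is similar to a convolution with group = M: the m-th filter is applied to the m-th channel
The computation is $\frac{1}{N}+\frac{1}{D_k^2}$ of the original, where N is the number of output channels and $D_k$ the kernel size. The kernel is usually 3, so the cost drops to roughly 1/8 to 1/9
The paper also introduces two hyperparameters that control the amount of computation
width multiplier, $\alpha$ (at most 1), multiplies the channel counts; the computation shrinks to roughly $\alpha^2$ of the original
resolution multiplier, $\beta$ (at most 1), multiplies the input size; the computation shrinks to $\beta^2$ of the original
The formula in the paper looks wrong to me: the cost of a convolution should be multiplied by the output size rather than the input size. The two only coincide when the layer uses stride 1 with padding.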
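
The ratio is easy to check by counting multiply-adds. The sketch below assumes a square kernel of size `dk`, `m` input channels, `n` output channels and a square feature map of size `df` whose size does not change across the layer. The function names and the example layer shape are made up for illustration.

```python
def standard_conv_cost(dk, m, n, df):
    # Every output channel looks at every input channel.
    return dk * dk * m * n * df * df

def separable_conv_cost(dk, m, n, df):
    depthwise = dk * dk * m * df * df   # one filter per input channel
    pointwise = m * n * df * df         # 1*1 convolution mixing channels
    return depthwise + pointwise

dk, m, n, df = 3, 512, 512, 14
ratio = separable_conv_cost(dk, m, n, df) / standard_conv_cost(dk, m, n, df)
print(ratio, 1 / n + 1 / dk ** 2)   # both about 0.113

# Width multiplier alpha = 0.5 applied to both channel counts.
alpha = 0.5
thin = separable_conv_cost(dk, int(alpha * m), int(alpha * n), df)
width_ratio = thin / separable_conv_cost(dk, m, n, df)
print(width_ratio)                  # close to alpha ** 2
```

The width multiplier lands slightly above $\alpha^2$ because the depthwise part scales only linearly with the channel count, but the pointwise part dominates the total.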

shuffleNet

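Building on resnet, it replaces the original 1*1 convolutions with grouped 1*1 convolutions
Grouping causes a boundary effect: each group only ever sees its own channels, so the learned features become limited. That is why the channel shuffle layer exists
The caffe implementation of the shuffle layer is a reshape, then a transpose, then a flatten. It is a fixed permutation rather than a truly random one, so backward can be implemented
The 3*3 depthwise convolution is the one used in mobileNet
ShuffleNet units are then stacked to form the shuffleNet network
One important conclusion is that a linear increase in the number of groups does not bring a linear increase in classification accuracy. ShuffleNet does turn out to help more on small networks: small networks usually have few channels, and under a fixed compute budget ShuffleNet can afford more feature maps.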
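
The reshape, transpose, flatten sequence can be illustrated on a plain list of channel indices. This is a sketch of the permutation only, not the caffe layer; the function name `channel_shuffle` is made up for this example.

```python
def channel_shuffle(channels, groups):
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    n = len(channels) // groups
    # reshape: (groups * n,) -> (groups, n)
    grouped = [channels[g * n:(g + 1) * n] for g in range(groups)]
    # transpose: (groups, n) -> (n, groups)
    transposed = [[grouped[g][k] for g in range(groups)] for k in range(n)]
    # flatten back to a single channel axis
    return [c for row in transposed for c in row]

print(channel_shuffle([0, 1, 2, 3, 4, 5], 2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, every group of the next grouped convolution receives channels from all of the previous groups. Since the mapping is a fixed permutation, the backward pass just applies the inverse permutation to the gradients.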

CTC

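Posted on 2018-03-20, updated on 2026-01-02, category: caffe

Taking OCR as the example: after the CNN convolutions, the height of the original image is reduced to 1
The width direction of the image then serves as the time-sequence direction
An inner product is applied along the channel direction, followed by a softmax, which gives the probability of every character at every time step, similar to the figure below

Given this probability map, the ctc loss and its gradient can be computed with a dynamic-programming-like approach
I don't feel like writing more; read the article below instead
CTC explained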

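C/C++ review notes

Posted on 2018-03-11, updated on 2026-01-02, category: cpp

To read a string containing spaces with cin, use cin.getline(s, 80), where s is a char array
Alternatively use getline(cin, str), whose prototype is istream& getline (istream& is, string& str);. C++ defines a getline function for every kind of stream
glibc also extends the C standard library with a getline function of its own. It calls malloc and realloc automatically, so if you use it you have to free the buffer yourself. Hardly anyone seems to use it; see here
To control cout's output precision: cout << fixed << setprecision(2) << f, with #include <iomanip>
To control cout's output format: cout << setfill('0') << setw(4) << a[i][j]
More

Redirecting cin and cout

freopen("foo.txt","w",stdout);
freopen("bar.txt","r",stdin);

Lambda expressions

Sorting a vector with a lambda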

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cstdio>

using namespace std;

int main()
{
   int n;
   double th;
   cin >> n >> th;
   vector<pair<string, double>> res;  
   while(n--){
        string name;
        double score;
        cin >> name >> score;
        if(score > th){
            res.push_back(pair<string, double>(name, score));
        }
    }
    // Sort by score, highest first; the comparator only reads its arguments.
    sort(res.begin(), res.end(), [](const pair<string, double>& a, const pair<string, double>& b) {return a.second > b.second;});
    for(auto i: res){
        printf("%s %.1f\n", i.first.c_str(), i.second);
    }
   return 0;
}

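Removing vector elements with erase

for(it=iVec.begin();it!=iVec.end();){
    if(*it==4 || *it==5)
        it=iVec.erase(it); // erase returns the iterator following the removed element
    else
        it++;
}

Sample a(0) and Sample a = 0 both call the constructor
Sample a(9); a = 8 calls the constructor twice,
Sample b = a and Sample b(a) call the copy constructor
With a converting constructor, the compiler creates a temporary object
The C++ compiler follows this order of priority:

First it looks for an ordinary function (one not instantiated from a template) whose parameters match exactly, then for a template function whose parameters match exactly, then for an ordinary function that matches after implicit conversion of the arguments. If none of these is found, it reports an error.

An elegant way to round a size up for memory alignment (align must be a power of two)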

unsigned int calc_align(unsigned int n, unsigned align)
{
    return ((n + align - 1) & (~(align - 1)));
}
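
The trick works because, for a power of two, `align - 1` is a mask of the low bits: adding it pushes `n` past the next boundary unless it is already on one, and the AND clears the remainder. A small Python sketch of the same expression, with an explicit check for the power-of-two requirement (the check is an addition of this example, not part of the C function):

```python
def calc_align(n, align):
    # Only valid when align is a power of two.
    if align <= 0 or align & (align - 1):
        raise ValueError("align must be a power of two")
    return (n + align - 1) & ~(align - 1)

for n in (0, 1, 13, 16, 17):
    print(n, calc_align(n, 8))  # 0->0, 1->8, 13->16, 16->16, 17->24
```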

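__declspec(dllexport) is an export declaration: it says the function is exported from the DLL for others to use.
__declspec(dllimport) says the function is imported from elsewhere; the code still compiles without it.
ANSI C is the standard for the C language published by the American National Standards Institute (ANSI), meant to keep C consistent across vendors
A file pulled in with #include does not have to be listed on the gcc command line

#include "max.c"
gcc main.c

gcc -c max.c -o max.o
gcc max.o main.c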

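Another blog post from the train

Posted on 2018-02-26, updated on 2026-01-02

Still the Old Testament

Jacob heads home and prepares to meet Esau, but he is still wary: servants and wives go in front, he stays at the very back, and everyone is split into two groups in case something goes wrong.
That night, in the half-dark, someone comes to wrestle with Jacob. They wrestle until dawn with no winner. The stranger is God, and God renames Jacob Israel, "the one who wrestled with God".
Esau and Jacob agree to travel to Edom together (no idea where that is). Jacob is still on his guard, lets Esau go first, then turns around and heads for the city of Shechem in Canaan.
Jacob's daughter Dinah is raped by the prince of Shechem. The prince comes to ask for her hand, and Jacob and his sons demand that every man in the city be circumcised. The prince agrees, and while the men are still sore, Jacob's sons slaughter the city.
Then Jacob takes everyone home to see Isaac. On the way Rachel dies in childbirth, giving birth to a son.
The story of Jacob's son Judah and his daughter-in-law is too much of a soap opera to write up.
The rest will come slowly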

jQuery notes

Posted on 2018-02-25, updated on 2026-01-02, category: flask website summary

Select by attribute: var email = $('[name=email]');

jQuery objects and DOM objects can be converted into each other:

var div = $('#abc'); // jQuery object
var divDom = div.get(0); // assuming the div exists, get the first DOM element
var another = $(divDom); // wrap the DOM element as a jQuery object again

Filters

$('ul.lang li'); // selects the three nodes JavaScript, Python and Lua

$('ul.lang li:first-child'); // selects only JavaScript
$('ul.lang li:last-child'); // selects only Lua
$('ul.lang li:nth-child(2)'); // selects the N-th element, N starts at 1
$('ul.lang li:nth-child(even)'); // selects the even-numbered elements
$('ul.lang li:nth-child(odd)'); // selects the odd-numbered elements

CSS selectors: element element matches any descendant, element>element matches direct children only

ajax jsonp

$.ajax({
  type: 'get',
  url: "http://api.money.126.net/data/feed/0000001,1399001",
  dataType: 'jsonp',
  success: function(data) {
        var str = 'Current price: ' +
            data['0000001'].name + ': ' +
            data['0000001'].price + '; ' +
            data['1399001'].name + ': ' +
            data['1399001'].price;
        alert(str);
    },
  error: function() {
        alert('Something went wrong');
    }
});

jQuery's jqXHR object is similar to a Promise object, so the various callbacks can be handled in a chained style

$.ajax({
      type: 'get',
      url: "http://api.money.126.net/data/feed/0000001,1399001",
      dataType: 'jsonp'
}).done(function (data) {
    ajaxLog('Success, data received: ' + JSON.stringify(data));
}).fail(function (xhr, status) {
    ajaxLog('Failed: ' + xhr.status + ', reason: ' + status);
}).always(function () {
    ajaxLog('Request finished: called on both success and failure');
});

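scrapy crawler

Posted on 2018-02-12, updated on 2026-01-02, category: scrapy

Preface

This is a course I bought on DC Academy (dc学院) for 299, which was frankly insane. I finished the whole thing with a feeling of regret.

Main screenshots

Architecture diagram

CSS selectors

Chrome debugging tips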