深海游弋的鱼 – Page 2

Why GEMM is at the heart of deep learning

I spend most of my time worrying about how to make deep learning with neural networks faster and more power efficient. In practice that means focusing on a function called GEMM. It’s part of the BLAS (Basic Linear Algebra Subprograms) library that was first created in 1979, and until I started trying to optimize neural networks I’d never heard of it.

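CNN Basics: Convolution and Its Matrix-Based Acceleration

Convolution is one of the most basic operations in a CNN, but writing about it means drawing quite a few diagrams, so I kept putting it off. I recently came across a good diagram that shows how a convolution can be converted into a matrix multiplication.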
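As a minimal sketch of the idea (my own NumPy illustration, not the diagram from the post): every receptive field of the input is unrolled into one row of a matrix (often called im2col), the kernel is flattened into a vector, and the whole convolution becomes a single matrix product. The names `im2col` and `conv2d_as_matmul` are hypothetical, and the sketch assumes a single-channel input, stride 1, and no padding. As in most CNN frameworks, "convolution" here means cross-correlation (the kernel is not flipped).

```python
import numpy as np


def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one row of a matrix."""
    h, w = image.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((out_h * out_w, kh * kw), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = image[i:i + kh, j:j + kw].ravel()
    return cols


def conv2d_as_matmul(image, kernel):
    """Valid, stride-1 cross-correlation computed as one matrix product."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    patches = im2col(image, kh, kw)            # (out_h * out_w, kh * kw)
    flat = np.dot(patches, kernel.ravel())     # the matrix multiplication step
    return flat.reshape(out_h, out_w)


image = np.arange(16, dtype=np.float64).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])
result = conv2d_as_matmul(image, kernel)
```

With several kernels, each flattened kernel becomes one column of a second matrix, and the product is exactly the kind of matrix-matrix multiply that GEMM routines are optimized for.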

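ROCm: AMD's Open-Source Platform for HPC, Hyperscale GPU Computing, and Deep Learning

ROCm stands for Radeon Open Compute platform. It is an open-source GPU computing platform that AMD launched in December of last year, and it has now reached version 1.3. MIOpen is the software library AMD developed for it; it connects programming languages to the ROCm platform so that the GCN architecture can be used to the full.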

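This release includes the following:

Deep convolution solvers optimized for both forward and backward propagation
Optimized convolutions, including Winograd and FFT transformations
GEMM optimized for deep learning
Pooling, Softmax, activations, gradient algorithms, batch normalization, and LR normalization
MIOpen describes data as 4-D tensors (Tensors 4D, NCHW format)
Framework APIs with OpenCL and HIP support
The MIOpen driver can test the forward/backward network of any particular layer in MIOpen
Binary packages add support for Ubuntu 16.04 and Fedora 24
Source code is at https://github.com/ROCmSoftwarePlatform/MIOpen
Reference documentation
- MIOpen
- MIOpenGemm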
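
To make the NCHW layout mentioned in the list concrete, here is a small NumPy sketch. It illustrates the general layout convention only and does not use MIOpen's API: a batch of N images with C channels, height H, and width W is stored contiguously so that W varies fastest. The helper name `nchw_offset` is hypothetical.

```python
import numpy as np


def nchw_offset(n, c, h, w, shape):
    """Flat offset of element (n, c, h, w) in a contiguous NCHW tensor."""
    _, C, H, W = shape
    return ((n * C + c) * H + h) * W + w


shape = (2, 3, 4, 5)  # N=2 images, C=3 channels, H=4 rows, W=5 columns

# A row-major array whose values equal their own flat offsets.
tensor = np.arange(np.prod(shape)).reshape(shape)

offset = nchw_offset(1, 2, 3, 4, shape)  # last element of the tensor
```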

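A Summary of Convolutional Neural Networks (CNNs)

A convolutional neural network is still a layered network; only the function and form of its layers have changed, so it can be seen as an improvement on the traditional neural network. In essence, a convolutional network is a mapping from inputs to outputs. It can learn a large number of input-to-output mappings without needing any precise mathematical expression that relates the two: once it has been trained on known patterns, the network is able to map between input-output pairs.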

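Using the Open-Source Face Recognition Library OpenFace on Ubuntu 16.04 LTS

OpenFace is an open-source face recognition system built on deep neural networks. It is based on Google's paper FaceNet: A Unified Embedding for Face Recognition and Clustering. Its development is led by Brandon Amos of Carnegie Mellon University.

1. Prepare the system environment

On a server edition of Ubuntu, Python may not even be installed by default.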

# If Python is not installed, a minimal install is enough; for now we need Python 2
$ sudo apt-get install python-minimal
$ sudo apt-get install python-pip
$ sudo pip install --upgrade pip

# If git is not installed, install it manually here
$ sudo apt-get install git

# Needed when building dlib
$ sudo apt-get install cmake
$ sudo apt-get install libboost-dev
$ sudo apt-get install libboost-python-dev

2. Download the code

$ git clone https://github.com/cmusatyalab/openface.git

3. Install `opencv`

$ sudo apt-get install libopencv-dev
$ sudo apt-get install python-opencv

4. Install the required `python` libraries

$ cd openface
$ pip install -r requirements.txt
$ sudo pip install dlib
$ sudo pip install matplotlib

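5. Install `Torch7`

See: Installing Torch7 on Ubuntu 16.04 LTS

Once the build finishes, the path variables are added to .bashrc, so we need to reload the shell's environment variables.

# Apply the environment variables that torch7 just set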
source ~/.bashrc

6. Install the required `lua` libraries

$ luarocks install dpnn

# The following are optional; some functions or methods may need them
$ luarocks install image
$ luarocks install nn
$ luarocks install graphicsmagick
$ luarocks install torchx
$ luarocks install csvigo

7. Build the code

$ python setup.py build
$ sudo python setup.py install

8. Download the pretrained models

$ sh models/get-models.sh
# See https://cmusatyalab.github.io/openface/models-and-accuracies/ to download the corresponding model
$ wget https://storage.cmusatyalab.org/openface-models/nn4.v1.t7 -O models/openface/nn4.v1.t7

9. Run the test demo

The script to run, face_detect.py, is as follows:

#!/usr/bin/env python2

import argparse
import cv2
import os
import dlib

import numpy as np
np.set_printoptions(precision=2)
import openface

from matplotlib import cm

fileDir = os.path.dirname(os.path.realpath(__file__))
modelDir = os.path.join(fileDir, '..', 'models')
dlibModelDir = os.path.join(modelDir, 'dlib')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--dlibFacePredictor',
        type=str,
        help="Path to dlib's face predictor.",
        default=os.path.join(
            dlibModelDir,
            "shape_predictor_68_face_landmarks.dat"))
    parser.add_argument(
        '--networkModel',
        type=str,
        help="Path to Torch network model.",
        default='models/openface/nn4.v1.t7')
    # Download model from:
    # https://storage.cmusatyalab.org/openface-models/nn4.v1.t7
    parser.add_argument('--imgDim', type=int,
                        help="Default image dimension.", default=96)
    # parser.add_argument('--width', type=int, default=640)
    # parser.add_argument('--height', type=int, default=480)
    parser.add_argument('--width', type=int, default=1280)
    parser.add_argument('--height', type=int, default=800)
    # --scale is a floating-point factor, so parse it as float rather than int.
    parser.add_argument('--scale', type=float, default=1.0)
    parser.add_argument('--cuda', action='store_true')
    parser.add_argument('--image', type=str, help='Path of the image to recognize')

    args = parser.parse_args()
    if args.image is None or not os.path.exists(args.image):
        # parser.error prints the message and exits with a non-zero status.
        parser.error('--image not set or image file does not exist')

    align = openface.AlignDlib(args.dlibFacePredictor)
    net = openface.TorchNeuralNet(
        args.networkModel,
        imgDim=args.imgDim,
        cuda=args.cuda)

    cv2.namedWindow('video', cv2.WINDOW_NORMAL)

    frame = cv2.imread(args.image)  
    bbs = align.getAllFaceBoundingBoxes(frame)
    for i, bb in enumerate(bbs):
        # For the landmarkIndices setting, see
        # https://cmusatyalab.github.io/openface/models-and-accuracies/
        alignedFace = align.align(args.imgDim, frame, bb,
                                  landmarkIndices=openface.AlignDlib.OUTER_EYES_AND_NOSE)
        rep = net.forward(alignedFace)

        center = bb.center()
        centerI = 0.7 * center.x * center.y / \
                (args.scale * args.scale * args.width * args.height)
        color_np = cm.Set1(centerI)
        color_cv = list(np.multiply(color_np[:3], 255))

        bl = (int(bb.left() / args.scale), int(bb.bottom() / args.scale))
        tr = (int(bb.right() / args.scale), int(bb.top() / args.scale))
        cv2.rectangle(frame, bl, tr, color=color_cv, thickness=3)

    cv2.imshow('video', frame)

    cv2.waitKey(0)

    cv2.destroyAllWindows()

Run the script from the shell as follows:

$ python demos/face_detect.py --image=xx.jpeg

When recognition finishes, a window pops up showing the image with the detected faces.
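
The script computes a representation `rep` for each face but then only draws bounding boxes. As a hedged sketch of a possible next step: OpenFace's own comparison demo scores two faces by the squared L2 distance between their representations, with smaller values meaning more similar faces. The helper name `squared_l2` and the vectors below are made up for illustration; real representations would come from `net.forward`.

```python
import numpy as np


def squared_l2(rep_a, rep_b):
    """Squared Euclidean distance between two face representations."""
    d = np.asarray(rep_a, dtype=np.float64) - np.asarray(rep_b, dtype=np.float64)
    return float(np.dot(d, d))


# Made-up 4-dimensional vectors standing in for real network outputs.
face_a = [0.5, 0.5, 0.5, 0.5]
face_b = [0.5, 0.5, 0.5, 0.5]
face_c = [0.5, -0.5, 0.5, -0.5]

same = squared_l2(face_a, face_b)       # identical vectors
different = squared_l2(face_a, face_c)  # two components differ by 1.0 each
```

Any threshold for deciding that two faces belong to the same person has to be chosen empirically for the model in use.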

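Translation | The NIPS 2016 Talk by the Father of GANs: Two Competing Networks as Adversaries

Author: Ian Goodfellow
Translation: 七月在线 DL translation group
Translators: 范诗剑, 汪识瀚, 李亚楠
Reviewers: 管博士, 寒小阳, 加号
Editors: 翟惠良, July
Notice: This translation is for study and discussion only. If anything is translated inaccurately, please leave a comment to point it out. Please credit the source when reposting.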

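At NIPS 2016, Ian Goodfellow gave a talk titled "Generative Adversarial Networks". It covered the following topics:
- Why generative models are worth studying
- How generative models work, and how GANs compare with other generative models
- The details of how generative adversarial networks work
- Research frontiers related to GANs
- Current mainstream image models that combine GANs with other methods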

The original English slides are available here:
PDF version: www.iangoodfellow.com/slides/2016-12-04-NIPS.pdf
Keynote version: www.iangoodfellow.com/slides/2016-12-04-NIPS.key

PDF copy on this site: Generative Adversarial Networks (GANs)

Keynote copy on this site: Generative Adversarial Networks (GANs)

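GAN in one sentence:
A GAN is adversarial because its two internal parts compete. One is the generator, whose main job is to produce images that look as though they came from the training samples. The other is the discriminator, whose goal is to decide whether an input image belongs to the real training samples.
Put more plainly, think of the generator as a counterfeiter and the discriminator as the police. The generator tries to make counterfeit money as close to the real thing as possible so that it can fool the discriminator; that is, it generates samples and makes them look as though they came from the training set.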
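
The counterfeiter-and-police game can be written down as two opposing losses. The sketch below uses the standard minimax formulation from the GAN literature rather than anything spelled out in this excerpt. `d_real` and `d_fake` are hypothetical arrays holding the discriminator's estimated probability that real and generated samples, respectively, are real.

```python
import numpy as np


def discriminator_loss(d_real, d_fake):
    """The police: low when real samples score high and fakes score low."""
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))


def generator_loss(d_fake):
    """The counterfeiter (minimax form): lower when fakes are scored as real."""
    return np.mean(np.log(1.0 - d_fake))


d_real = np.array([0.9, 0.8])

# Fakes that the discriminator easily rejects vs. fakes that fool it.
weak_fakes = np.array([0.1, 0.2])
strong_fakes = np.array([0.7, 0.8])
```

Better fakes lower the generator's loss and raise the discriminator's, which is the competition the analogy describes.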

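Building Caffe on macOS Sierra (10.12.3)

These notes use the latest code as of 2017-03-05; later versions may build differently. Because this is only an experiment, GPU support is not enabled and only the CPU is used.

Google has open-sourced TensorFlow, but because of network restrictions in China it is practically impossible to build it successfully without a VPN. So we try the long-established Caffe for our experiments instead and see how it performs.

This assumes you have already installed Homebrew; if not, see the post on giving the Mac an apt-get-like tool, Brew. It also assumes you have installed the latest version of Xcode and its command-line build tools.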

Install Git

$ brew install git

Download the source code

$ git clone https://github.com/BVLC/caffe.git

Install the libraries needed for the build

$ brew install protobuf

$ brew install boost

$ brew install gflags

$ brew install glog

$ brew install homebrew/science/opencv

$ brew install hdf5

$ brew install leveldb

$ brew install lmdb

Build Caffe

$ cd caffe

$ cp Makefile.config.example Makefile.config

Edit the build options:

$ vim Makefile.config

In the file, find the following lines:

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

Uncomment the option so that it reads:

# CPU-only switch (uncomment to build without GPU support).
CPU_ONLY := 1
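
If you would rather script this edit than open vim, something like the following could work. It is only a sketch: the function name `enable_cpu_only` is hypothetical, and it assumes the option appears as a commented-out `# CPU_ONLY := 1` line exactly as shown above.

```python
import re


def enable_cpu_only(config_text):
    """Uncomment the '# CPU_ONLY := 1' line in Makefile.config text."""
    pattern = r"(?m)^#[ \t]*(CPU_ONLY[ \t]*:=[ \t]*1)[ \t]*$"
    return re.sub(pattern, r"\1", config_text)


sample = (
    "# CPU-only switch (uncomment to build without GPU support).\n"
    "# CPU_ONLY := 1\n"
    "# USE_CUDNN := 1\n"
)
patched = enable_cpu_only(sample)
```

To apply it to the real file, read Makefile.config into a string, pass it through `enable_cpu_only`, and write the result back. Other commented-out options are left untouched.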

Once the configuration is done, start the build:

$ make all -j4

$ make test

$ make runtest

The compiled executables are in the build/tools/ directory.

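Building the CPU Version of Caffe on Ubuntu 14.04/14.10/16.04

I have recently been studying deep learning and looking at the classic Caffe, so here is a record of the build process.

Install build-essential

Install the basic packages needed for development

$ sudo apt-get install build-essential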

$ sudo apt-get install build-essential

Install OpenCV

Image processing depends on OpenCV. Version 2.4 or later is required; the default version on both 14.04 and 14.10 is 2.4.

$ sudo apt-get install libopencv-dev

Install the ATLAS math library

ATLAS provides optimized linear algebra routines (it is a BLAS implementation)

$ sudo apt-get install libatlas-base-dev

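Install the Boost libraries

Boost provides a collection of C++ libraries and algorithms. Version 1.55 or later is required; the default version on both 14.04 and 14.10 is 1.55.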

$ sudo apt-get install libboost-all-dev

Then come the remaining dependencies

protobuf, leveldb, snappy, hdf5, gflags-devel, glog-devel, lmdb-devel

$ sudo apt-get install libprotobuf-dev protobuf-compiler libleveldb-dev libsnappy-dev libhdf5-serial-dev libgoogle-glog-dev libgflags-dev liblmdb-dev

Install Git

$ sudo apt-get install git

Download the code

$ git clone https://github.com/BVLC/caffe.git

Build Caffe

$ cd caffe

$ cp Makefile.config.example Makefile.config

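Then edit Makefile.config. The main parameters to change are:
CPU_ONLY: whether to build in CPU-only mode; enable this if you have no GPU or have not installed CUDA
BLAS: which BLAS library to use (ATLAS by default, or Intel MKL or OpenBLAS)
Once the configuration is done, start the build: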

$ make all -j4

$ make test

$ make runtest

Handling build errors

When building on Ubuntu 16.04, the following error appears:

CXX src/caffe/solvers/sgd_solver.cpp
In file included from src/caffe/solvers/sgd_solver.cpp:5:0:
./include/caffe/util/hdf5.hpp:6:18: fatal error: hdf5.h: No such file or directory
#include "hdf5.h"

Solution:
1. Edit Makefile.config and, at the end of the file, add /usr/include/hdf5/serial to INCLUDE_DIRS:

INCLUDE_DIRS += /usr/include/hdf5/serial

2. Edit the Makefile and change hdf5_hl and hdf5 to hdf5_serial_hl and hdf5_serial; that is, change the first line below into the second.

Original:

LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5

After the change:

LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial
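
The same rename can be scripted. The following is a sketch under the assumption that the Makefile contains a `LIBRARIES +=` line like the one shown above; the function name `use_serial_hdf5` is hypothetical. It replaces whole words only, so running it twice does not rename anything a second time.

```python
# Library names to rename on the LIBRARIES line.
HDF5_RENAMES = {"hdf5_hl": "hdf5_serial_hl", "hdf5": "hdf5_serial"}


def use_serial_hdf5(makefile_text):
    """Rename hdf5_hl/hdf5 to their *_serial variants on LIBRARIES lines."""
    patched_lines = []
    for line in makefile_text.splitlines():
        if line.startswith("LIBRARIES +="):
            words = [HDF5_RENAMES.get(word, word) for word in line.split(" ")]
            line = " ".join(words)
        patched_lines.append(line)
    return "\n".join(patched_lines) + "\n"


original = (
    "LIBRARIES += glog gflags protobuf boost_system "
    "boost_filesystem m hdf5_hl hdf5\n"
)
patched = use_serial_hdf5(original)
```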

Build the Python interface

$ sudo apt install python-pip

$ pip install --upgrade pip

$ sudo pip install -r python/requirements.txt

$ sudo apt-get install python-numpy

$ make pycaffe

$ make distribute
