Category: Vulkan

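Vulkan is a cross-platform 2D and 3D graphics application programming interface (API), first announced by the Khronos Group at the 2015 Game Developers Conference (GDC).
Khronos initially referred to the Vulkan API as the "next generation OpenGL initiative", or "glNext", but dropped those names once Vulkan was officially announced. Like OpenGL, Vulkan targets real-time 3D applications such as video games, and it aims to deliver high performance with low CPU overhead, which is also the goal of Direct3D 12 and AMD's Mantle. Vulkan is derived from Mantle and builds on some of its components.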

Installing the SPIR-V Toolchain (glslangValidator) Required by Vulkan on Ubuntu 18.04

Vulkan development requires glslangValidator to compile shader code (GLSL) into SPIR-V.
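
As a rough sketch of what that compile step looks like, the Python helper below assembles a glslangValidator command line for a single shader. The `-V` option (SPIR-V output under Vulkan semantics) and `-o` (output file) are existing glslangValidator options; the helper names and the file names are hypothetical, and the tool is only invoked if it is found on PATH.

```python
import shutil
import subprocess


def glslang_command(source, output):
    """Build the argument list for compiling one GLSL file to SPIR-V."""
    # -V selects Vulkan semantics with SPIR-V output; -o names the output file.
    return ["glslangValidator", "-V", source, "-o", output]


def compile_shader(source, output):
    """Run glslangValidator if it is installed; return True on success."""
    if shutil.which("glslangValidator") is None:
        return False  # tool is not installed (yet)
    result = subprocess.run(glslang_command(source, output))
    return result.returncode == 0


# Hypothetical file names, for illustration only.
print(" ".join(glslang_command("shader.comp", "shader.comp.spv")))
```

glslangValidator infers the shader stage from the file extension (for example .vert, .frag or .comp), so the source file should be named accordingly.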

On Ubuntu 19.10, it can be installed directly:

$ sudo apt-get install vulkan-tools

# On Ubuntu 20.04, currently the latest release, use the following command instead: sudo apt-get install glslang-tools

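On older releases, it has to be built and installed from source, as follows:

# Build and install glslang
$ git clone https://github.com/KhronosGroup/glslang.git

# A copy can also be downloaded from this site: wget https://www.mobibrw.com/wp-content/uploads/2018/12/glslang.zip

$ cd glslang

# The current (2018-12-17) official release and the most stable one; the latest revision was tried, but part of its code failed to compile
$ git checkout 7.10.2984

# Download the spirv-tools sources
$ python update_glslang_sources.py

$ mkdir build

$ cd build

$ cmake ..

$ make

$ sudo make install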

Compatibility Between SPIR-V Image Formats And Vulkan Formats

SPIR-V Image Format	Compatible Vulkan Format
Rgba32f	VK_FORMAT_R32G32B32A32_SFLOAT
Rgba16f	VK_FORMAT_R16G16B16A16_SFLOAT
R32f	VK_FORMAT_R32_SFLOAT
Rgba8	VK_FORMAT_R8G8B8A8_UNORM
Rgba8Snorm	VK_FORMAT_R8G8B8A8_SNORM
Rg32f	VK_FORMAT_R32G32_SFLOAT
Rg16f	VK_FORMAT_R16G16_SFLOAT
R11fG11fB10f	VK_FORMAT_B10G11R11_UFLOAT_PACK32
R16f	VK_FORMAT_R16_SFLOAT
Rgba16	VK_FORMAT_R16G16B16A16_UNORM
Rgb10A2	VK_FORMAT_A2B10G10R10_UNORM_PACK32
Rg16	VK_FORMAT_R16G16_UNORM
Rg8	VK_FORMAT_R8G8_UNORM
R16	VK_FORMAT_R16_UNORM
R8	VK_FORMAT_R8_UNORM
Rgba16Snorm	VK_FORMAT_R16G16B16A16_SNORM
Rg16Snorm	VK_FORMAT_R16G16_SNORM
Rg8Snorm	VK_FORMAT_R8G8_SNORM
R16Snorm	VK_FORMAT_R16_SNORM
R8Snorm	VK_FORMAT_R8_SNORM
Rgba32i	VK_FORMAT_R32G32B32A32_SINT
Rgba16i	VK_FORMAT_R16G16B16A16_SINT
Rgba8i	VK_FORMAT_R8G8B8A8_SINT
R32i	VK_FORMAT_R32_SINT
Rg32i	VK_FORMAT_R32G32_SINT
Rg16i	VK_FORMAT_R16G16_SINT
Rg8i	VK_FORMAT_R8G8_SINT
R16i	VK_FORMAT_R16_SINT
R8i	VK_FORMAT_R8_SINT
Rgba32ui	VK_FORMAT_R32G32B32A32_UINT
Rgba16ui	VK_FORMAT_R16G16B16A16_UINT
Rgba8ui	VK_FORMAT_R8G8B8A8_UINT
R32ui	VK_FORMAT_R32_UINT
Rgb10a2ui	VK_FORMAT_A2B10G10R10_UINT_PACK32
Rg32ui	VK_FORMAT_R32G32_UINT
Rg16ui	VK_FORMAT_R16G16_UINT
Rg8ui	VK_FORMAT_R8G8_UINT
R16ui	VK_FORMAT_R16_UINT
R8ui	VK_FORMAT_R8_UINT
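
To use the table from a script (for instance when checking shader reflection output against an image's format), a plain dictionary lookup is enough. The sketch below copies a handful of rows from the table above; the names `SPIRV_TO_VK_FORMAT` and `compatible_vk_format` are made up for this illustration.

```python
# A few rows copied from the table above:
# SPIR-V image format -> compatible Vulkan format.
SPIRV_TO_VK_FORMAT = {
    "Rgba32f": "VK_FORMAT_R32G32B32A32_SFLOAT",
    "Rgba8": "VK_FORMAT_R8G8B8A8_UNORM",
    "R11fG11fB10f": "VK_FORMAT_B10G11R11_UFLOAT_PACK32",
    "Rgb10A2": "VK_FORMAT_A2B10G10R10_UNORM_PACK32",
    "R32i": "VK_FORMAT_R32_SINT",
    "R32ui": "VK_FORMAT_R32_UINT",
}


def compatible_vk_format(spirv_format):
    """Return the Vulkan format name listed for a SPIR-V image format."""
    try:
        return SPIRV_TO_VK_FORMAT[spirv_format]
    except KeyError:
        raise ValueError("format not in the copied subset: " + spirv_format)


print(compatible_vk_format("Rgba8"))
```

Note that the packed formats reverse the component order in their names (R11fG11fB10f maps to B10G11R11), which is easy to get wrong when typing the mapping by hand.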

References

Compatibility Between SPIR-V Image Formats And Vulkan Formats

Roughly Estimating the Cost of Each Shader Instruction

GPU IS a processor (graphics processing unit). Anywho, I remember seeing somewhere that on GeForce 6 series cards it's a single cycle (maybe I was just dreaming :-p), but I have that memory.

The Radeon X800 has it anyway.
EDIT:

Quote:

ORIGINALLY AT: http://gear.ibuypower.com/GVE/Store/ProductDetails.aspx?sku=VC-POWERC-147
Smartshader HD
• Support for Microsoft® DirectX® 9.0 programmable vertex and pixel shaders in hardware
• DirectX 9.0 Vertex Shaders
- Vertex programs up to 65,280 instructions with flow control
- Single cycle trigonometric operations (SIN & COS)
• DirectX 9.0 Extended Pixel Shaders
- Up to 1,536 instructions and 16 textures per rendering pass
- 32 temporary and constant registers
- Facing register for two-sided lighting
- 128-bit, 64-bit & 32-bit per pixel floating point color formats
- Multiple Render Target (MRT) support
• Complete feature set also supported in OpenGL® via extensions

Building the Vulkan Example Project on macOS Mojave (10.14.3)

$ git clone --recursive https://github.com/SaschaWillems/Vulkan.git

$ cd Vulkan

$ export PATH="/usr/local/bin:$PATH"

$ python3 download_assets.py 

$ cd xcode


# Build MoltenVK
$ git clone https://github.com/KhronosGroup/MoltenVK.git MoltenVKSrc

$ cd MoltenVKSrc

$ brew install cmake

$ brew install python

$ brew install ninja

$ bash fetchDependencies

$ make

$ make macos

# Copy the libraries up one level; otherwise linking fails with a file-not-found error
$ cp ./Package/Release/MoltenVK/macOS/dynamic/* ./Package/Release/MoltenVK/macOS/

$ cd ..

$ rm -rf MoltenVK

$ ln -s -f ./MoltenVKSrc/Package/Release/MoltenVK MoltenVK


# Build assimp
$ cd assimp

$ git clone https://github.com/assimp/assimp.git -b v3.3.1 assimp-mac

$ cd assimp-mac

# Linking the unit tests fails, so simply turn them off.
$ cmake CMakeLists.txt -G 'Unix Makefiles' -DASSIMP_BUILD_TESTS=OFF

$ make

$ cd ..

$ ln -s -f assimp-mac assimp-macos

$ cd ..

$ xcodebuild -project "examples.xcodeproj" -scheme "examples-macos" build

The following error occurs while compiling assimp:

~/Vulkan/xcode/assimp/assimp-mac/code/D3MFImporter.cpp:230:29: error: invalid operands to
      binary expression ('float (*)(const char *, const char *)' and 'nullptr_t')
        vertex.z = ai_strtof>(xmlReader->getAttributeValue(D3MF::XmlTag::z.c_str()), nullptr);

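This is caused by a bug in the assimp source code; editing line 230 of ~/Vulkan/xcode/assimp/assimp-mac/code/D3MFImporter.cpp resolves it. Judging from the error message, the stray > after ai_strtof on that line appears to be the culprit.

Other build errors can be ignored, as long as libassimp.3.3.1.dylib gets built.

The fix is illustrated by a figure in the full post.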

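Building MoltenVK and Using It to Run Vulkan Applications on macOS Mojave (10.14.3)

MoltenVK is a software library that lets Vulkan applications run on top of Metal on Apple's macOS and iOS operating systems. It is the first software component released by the Vulkan Portability Initiative, a project to run a subset of Vulkan on platforms that lack native Vulkan drivers.

Download and build the MoltenVK source: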

$ cd ~

$ brew install cmake

$ brew install python

$ brew install ninja

$ git clone https://github.com/KhronosGroup/MoltenVK.git

$ cd MoltenVK

$ bash fetchDependencies

$ make

$ make macos

$ export VK_ICD_FILENAMES=~/MoltenVK/Package/Release/MoltenVK/macOS/dynamic/MoltenVK_icd.json

Download and build the vuh source:

$ git clone https://github.com/Glavnokoman/vuh.git

$ cd vuh

$ export DEPENDENCIES_INSTALL_DIR=.

$ export VUH_SOURCE_DIR=.

$ export PATH="/usr/local/bin:$PATH"

$ brew install python

$ brew install python2

$ brew install glslang

$ brew install spdlog

$ sudo python -m pip install --upgrade pip

$ python -m pip install cget

$ export BINPATH=`python -c 'import imp; import os; mod=imp.find_module("cget")[1]; root=os.path.abspath(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(mod))))); print os.path.join(root,"bin")'`

$ export PATH="$BINPATH:$PATH"

$ export CGET_PREFIX=${DEPENDENCIES_INSTALL_DIR}

$ bash ${VUH_SOURCE_DIR}/config/install_dependencies.sh

$ export VULKAN_SDK=$(cd "$(dirname ${DEPENDENCIES_INSTALL_DIR})";pwd)

$ export Catch2_DIR=$(cd "$(dirname ${DEPENDENCIES_INSTALL_DIR})";pwd)

$ cmake -DCMAKE_PREFIX_PATH=${DEPENDENCIES_INSTALL_DIR} ${VUH_SOURCE_DIR}

$ cmake --build . --target install

Run the tests:

$ ${VUH_SOURCE_DIR}/test/correctness/test_vuh

Vulkan Memory Management

Vulkan offers another key difference to OpenGL with respect to memory allocation. When it comes to managing memory allocations as well as assigning them to individual resources, the OpenGL driver does most of the work for the developer. This allows applications to be developed, tested and deployed very quickly. In Vulkan, however, the programmer takes responsibility, meaning that many operations that OpenGL orchestrates heuristically can be orchestrated based on an absolute knowledge of the resource lifecycle.

Using CPU Memory Pointers Directly in Vulkan

Depending on the target platform, some recently published EXT extensions allow sharing memory between different physical devices.

VK_EXT_external_memory_host enables importing host allocations or host-mapped foreign device memory using a host pointer as the handle.

VK_EXT_external_memory_dma_buf enables importing dma_buf handles on Linux which can possibly come from another physical device.

The spec now also has a table where it's listed which external memory handle types require a matching physical device and which don't.

Additionally, I'd also like to draw your attention to additional features which enable execution control across multiple physical devices. At least on Linux (and possibly other POSIX based systems) semaphores and fences can be shared across physical devices if the FENCE_FD and SYNC_FD handle types are used. These are part of the KHR external semaphore/fence extensions.

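The VK_EXT_external_memory_host extension was merged into the Android master branch on April 4, 2018, so later releases may be able to use it. It lets the GPU device directly use memory pointers created on the CPU side, which reduces memory copies.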

Vulkan Device Memory

This post serves as a guide on how to best use the various Memory Heaps and Memory Types exposed in Vulkan on AMD drivers, starting with some high-level tips.

GPU Bulk Data
Place GPU-side allocations in DEVICE_LOCAL without HOST_VISIBLE. Make sure to allocate the highest priority resources first like Render Targets and resources which get accessed more often. Once DEVICE_LOCAL fills up and allocations fail, have the lower priority allocations fall back to CPU-side memory if required via HOST_VISIBLE with HOST_COHERENT but without HOST_CACHED. When doing in-game reallocations (say for display resolution changes), make sure to fully free all allocations involved before attempting to make any new allocations. This can minimize the possibility that an allocation can fail to fit in the GPU-side heap.
CPU-to-GPU Data Flow
For relatively small total allocation size (under 256 MB) the DEVICE_LOCAL with HOST_VISIBLE is the perfect Memory Type for CPU upload to GPU cases: the CPU can directly write into GPU memory which the GPU can then access without reading across the PCIe bus. This is great for upload of constant data, etc.
GPU-to-CPU Data Flow
Use HOST_VISIBLE with HOST_COHERENT and HOST_CACHED. This is the only Memory Type which supports cached reads by the CPU. Great for cases like recording screen-captures, feeding back Hierarchical Z-Buffer occlusion tests, etc.

Pooling Allocations

EDIT: Great reminder from Axel Gneiting (leading Vulkan implementation in DOOM® at id Software), make sure to pool a group of resources, like textures and buffers, into a single memory allocation. On Windows® 7 for example, Vulkan memory allocations map to WDDM Allocations (the same lists seen in GPUView), and there is a relatively high cost associated for a WDDM Allocation as command buffers flow through the WDDM based driver stack. Having 256 MB per DEVICE_LOCAL allocation can be a good target, takes only 16 allocations to fill 4 GB.
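
The pooling idea can be sketched without any Vulkan calls: one large block stands in for a single vkAllocateMemory() result, and resources are placed at aligned offsets inside it. This is only an illustrative bump allocator with made-up names (`MemoryBlock`, `suballocate`); in a real application the size and alignment would come from VkMemoryRequirements and the returned offset would be passed to vkBindBufferMemory() or vkBindImageMemory().

```python
BLOCK_SIZE = 256 * 1024 * 1024  # 256 MB per allocation, the target suggested above


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


class MemoryBlock:
    """Stand-in for one large device allocation carved into sub-ranges."""

    def __init__(self, size=BLOCK_SIZE):
        self.size = size
        self.used = 0

    def suballocate(self, size, alignment):
        """Return the offset of a new sub-range, or None if it does not fit."""
        offset = align_up(self.used, alignment)
        if offset + size > self.size:
            return None
        self.used = offset + size
        return offset


block = MemoryBlock()
texture_offset = block.suballocate(5 * 1024 * 1024, 4096)  # hypothetical texture
buffer_offset = block.suballocate(1000, 256)               # hypothetical buffer
blocks_for_4gb = (4 * 1024 ** 3) // BLOCK_SIZE             # 16, matching the text
print(texture_offset, buffer_offset, blocks_for_4gb)
```

A bump allocator never reuses freed ranges, so it suits resources that share a lifetime; anything that is freed individually needs a free list or a separate block.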

Hidden Paging

When an application starts over-subscribing GPU-side memory, DEVICE_LOCAL memory allocations will fail. It is also possible that later during application execution, another application in the system increases its usage of GPU-side memory, resulting in dynamic over-subscribing of GPU-side memory. This can cause the OS (for instance Windows® 7) to silently migrate or page GPU-side allocations to/from CPU-side memory as it time-slices execution of each application on the GPU. This can result in visible “hitching”. There is currently no method to directly query if the OS is migrating allocations in Vulkan. One possible workaround is for the app to detect hitching by looking at time-stamps, and then actively attempt to reduce DEVICE_LOCAL memory consumption when hitching is detected. For example, the application could manually move around resources to fully empty DEVICE_LOCAL allocations which can then be freed.

EDIT: Targeting Low-Memory GPUs

When targeting a memory surplus, using DEVICE_LOCAL+HOST_VISIBLE for CPU-write cases can bypass the need to schedule an extra copy. However in memory constrained situations it is much better to use DEVICE_LOCAL+HOST_VISIBLE as an extension to the DEVICE_LOCAL heap and use it for GPU Resources like Textures and Buffers. CPU-write cases can switch to HOST_VISIBLE+COHERENT. The number one priority for performance is keeping the high bandwidth access resources in GPU-side memory.

Memory Heap and Memory Type – Technical Details

Driver Device Memory Heaps and Memory Types can be inspected using the Vulkan Hardware Database. For Windows AMD drivers, below is a breakdown of the characteristics and best usage models for all the Memory Types. Heap and Memory Type numbering is not guaranteed by the Vulkan Spec, so make sure to work from the Property Flags directly. Also note memory sizes reported in Vulkan represent the maximum amount which is shared across applications and driver.

Heap 0
- VK_MEMORY_HEAP_DEVICE_LOCAL_BIT
- Represents memory on the GPU device which can not be mapped into Host system memory
- Using 256 MB per vkAllocateMemory() allocation is a good starting point for collections of buffers and images
- Suggest using separate allocations for large allocations which might need to be resized (freed and reallocated) at run-time
- Memory Type 0
  - VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
  - Full speed read/write/atomic by GPU
  - No ability to use vkMapMemory() to map into Host system address space
  - Use for standard GPU-side data
Heap 1
- VK_MEMORY_HEAP_DEVICE_LOCAL_BIT
- Represents memory on the GPU device which can be mapped into Host system memory
- Limited on Windows to 256 MB
  - Best to allocate at most 64 MB per vkAllocateMemory() allocation
  - Fall back to smaller allocations if necessary
- Memory Type 1
  - VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
  - VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
  - VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
  - Full speed read/write/atomic by GPU
  - Ability to use vkMapMemory() to map into Host system address space
  - CPU writes are write-combined and write directly into GPU memory
    - Best to write full aligned cacheline sized chunks
  - CPU reads are uncached
    - Best to use Memory Type 3 instead for GPU write and CPU read cases
  - Use for dynamic buffer data to avoid an extra Host to Device copy
  - Use for a fall-back when Heap 0 runs out of space before resorting to Heap 2
Heap 2
- Represents memory on the Host system which can be accessed by the GPU
- Suggest using similar allocation size strategy as Heap 0
- Ability to use vkMapMemory()
- GPU reads for textures and buffers are cached in GPU L2
  - GPU L2 misses read across the PCIe bus to Host system memory
  - Higher latency and lower throughput on an L2 miss
- GPU reads for index buffers are cached in GPU L2 in Tonga and later GPUs like FuryX
- Memory Type 2
  - VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
  - VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
  - CPU writes are write-combined
  - CPU reads are uncached
  - Use for staging for upload to GPU device
  - Can use as a fall-back when GPU device runs out of memory in Heap 0 and Heap 1
- Memory Type 3
  - VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
  - VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
  - VK_MEMORY_PROPERTY_HOST_CACHED_BIT
  - CPU reads and writes go through CPU cache hierarchy
  - GPU reads snoop CPU cache
  - Use for staging for download from GPU device
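
Since the numbering is not guaranteed, the selection has to work from the property flags. Below is a minimal sketch of that lookup in Python, with an ordered list of preferences to model the fall-backs described above. The flag values are the ones defined for VkMemoryPropertyFlagBits in the Vulkan headers; the function names, the preference lists, and the `AMD_TYPES` table (which just mirrors the four memory types listed above) are assumptions made for illustration. In a real application, the flags come from vkGetPhysicalDeviceMemoryProperties() and the type bitmask from the memoryTypeBits field of VkMemoryRequirements.

```python
# Flag values as defined for VkMemoryPropertyFlagBits in the Vulkan headers.
DEVICE_LOCAL = 0x1
HOST_VISIBLE = 0x2
HOST_COHERENT = 0x4
HOST_CACHED = 0x8


def find_memory_type(type_flags, type_bits, required, excluded=0):
    """Return the first allowed memory type index matching the flags, or None.

    type_flags: property flags of each memory type, in index order.
    type_bits:  bitmask of the type indices the resource may use.
    """
    for index, flags in enumerate(type_flags):
        allowed = (type_bits >> index) & 1
        if allowed and (flags & required) == required and not (flags & excluded):
            return index
    return None


def pick_with_fallback(type_flags, type_bits, preferences):
    """Try each (required, excluded) pair in order; return the first match."""
    for required, excluded in preferences:
        index = find_memory_type(type_flags, type_bits, required, excluded)
        if index is not None:
            return index
    return None


# The four memory types from the breakdown above, in the order listed there.
AMD_TYPES = [
    DEVICE_LOCAL,
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT,
    HOST_VISIBLE | HOST_COHERENT,
    HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,
]

# Illustrative preference lists for the three data-flow cases.
GPU_BULK = [(DEVICE_LOCAL, HOST_VISIBLE), (HOST_VISIBLE | HOST_COHERENT, HOST_CACHED)]
CPU_UPLOAD = [(DEVICE_LOCAL | HOST_VISIBLE, 0), (HOST_VISIBLE | HOST_COHERENT, HOST_CACHED)]
GPU_READBACK = [(HOST_VISIBLE | HOST_COHERENT | HOST_CACHED, 0)]

print(pick_with_fallback(AMD_TYPES, 0b1111, GPU_BULK))
print(pick_with_fallback(AMD_TYPES, 0b1111, CPU_UPLOAD))
print(pick_with_fallback(AMD_TYPES, 0b1111, GPU_READBACK))
```

The sketch only picks a type index. It does not model an allocation failing at run time, which is when the application would move on to the next entry in its preference list.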

Choosing the correct Memory Heap and Memory Type is a critical task in optimization. A GPU like Radeon™ Fury X for instance has 512 GB/s of DEVICE_LOCAL bandwidth (sum of any ratio of read and write) but the PCIe bus supports at most 16 GB/s read and at most 16 GB/s write for a sum of 32 GB/s in both directions.
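
To get a feel for what those numbers mean, here is a back-of-the-envelope calculation in Python. It assumes the peak figures quoted above are actually reached and treats 1 GB as 1024 MB; the 64 MB resource size is an arbitrary example, and real transfers will be slower.

```python
def transfer_ms(size_mb, bandwidth_gb_per_s):
    """Idealized time in milliseconds to move size_mb at the given bandwidth."""
    return size_mb / 1024 / bandwidth_gb_per_s * 1000


DEVICE_LOCAL_GBPS = 512  # Fury X DEVICE_LOCAL bandwidth quoted above
PCIE_READ_GBPS = 16      # PCIe limit in one direction quoted above

size_mb = 64  # arbitrary example resource size
local_ms = transfer_ms(size_mb, DEVICE_LOCAL_GBPS)
pcie_ms = transfer_ms(size_mb, PCIE_READ_GBPS)
print("local: %.3f ms, over PCIe: %.3f ms, ratio: %.0fx"
      % (local_ms, pcie_ms, pcie_ms / local_ms))
```

Under these assumptions, reading the same data across the bus takes 32 times as long as reading it from DEVICE_LOCAL memory, and a single 64 MB read over PCIe already eats a noticeable slice of a 16.7 ms frame.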

Timothy Lottes is a member of the Developer Technology Group at AMD. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

References

Vulkan Device Memory

Floating-Point Performance of Common GPUs

Game Console GPUs

Console Name	GPU Name	Fab	Clock	GFlops
NDS	ARM946E-S (CPU)	180/130nm	67 MHz	0.6
N3DS	PICA 200	45nm	200 MHz	4.8
PSP	R4000 x 2	90nm	333 MHz	2.6
PS VITA	SGX543 MP4+	45nm	222 MHz	28.4
Dreamcast	PowerVR2 CLX2	250nm	100 MHz	2.1
XBOX	XGPU (NV2A)	150nm	233 MHz	20
XBOX 360	ATI R500 Xenos	90/65/45nm	500 MHz	240
XBOX ONE XBOX ONE S	AMD Radeon GCN (12CU 768 Cores)	28/16nm	853 MHz 914 MHz	1311.5 1405.2
XBOX ONE X	AMD Radeon GCN (40CU 2560 Cores)	16nm	1172 MHz	6000
PlayStation 2	GS	180/150/90nm	147 MHz	6.2 (EE+GS)
PlayStation 3	RSX (NVIDIA G70)	90/65/45nm	550 MHz	228.8
PlayStation 4 PlayStation 4 Slim	AMD Radeon GCN (18CU 1152 Cores)	28/16nm	800 MHz	1840
PlayStation 4 Pro	AMD Radeon GCN (36CU 2304 Cores)	16nm	911 MHz	4200
N64	SGI RCP	350nm	62.5 MHz	0.1~0.2
GameCube	Flipper	180nm	162 MHz	9.4
Wii	ATI HollyWood	90nm	243 MHz	12
Wii U	ATI RV770	40nm	550 MHz	176
Switch	Tegra X1 (Undocked)	20nm	307.2 MHz	157.2
Switch	Tegra X1 (Docked)	20nm	768 MHz	393.2
Ouya	Tegra 3 (Geforce ULP x 12)	40nm	520 MHz	12.5
SHIELD portable	Tegra 4 (Geforce ULP x 72)	28nm	672 MHz	96.8
SHIELD TV	Tegra X1 (Maxwell Cores x 256 (2xSMM))	20nm	1000 MHz	512

Imagination PowerVR

GPU Name	Chip	Clock	GFlops
SGX530	OMAP 3530	110 MHz	0.88
	DM3730	200 MHz	1.6
	---	300 MHz	2.4
SGX531	MT6513 MT6573 MT6575M	281 MHz	2.25
SGX531	R-Car E1	400 MHz	3.2
SGX531 Ultra	MT6515 MT6575 MT6517 MT6517T MT6577 MT6577T MT8317 MT8317T MT8377	522 MHz	4.2
SGX535	S5PC100 Apple A4	200 MHz	1.6
	Apple A4 (iPad)	250 MHz	2.0
	---	300 MHz	2.4
SGX540	Jz4780	??? MHz	???
	Exynos 3110	200 MHz	3.2
	OMAP 4430	307 MHz	4.9
	OMAP 4460	384 MHz	6.1
	Atom Z2420 R-Car E2 R-Car M1A, M1S	400 MHz	6.4
	ATM7021 ATM7021A ATM7029B	500 MHz	8.0
	RK3168	600 MHz	9.6
SGX543	---	200 MHz	6.4
SGX543 MP2	Apple A5	200 MHz	12.8
	Apple A5 (iPad2)	250 MHz	16.0
	MT5327	400 MHz	25.6
	R-Car H1	520 MHz	33.28
SGX543 MP3	Apple A6	266 MHz	25.5
SGX543 MP4	Apple A5X	250 MHz	32.0
SGX544	MT6589M MT8117 MT8121	156 MHz	5
	MT6589 MT8389	286 MHz	9.2
	MT8125	300 MHz	9.6
	MT6589T MT8389T	357 MHz	11.4
	OMAP 4470	384 MHz	12.3
	Broadcom M320 Broadcom M340	???	???
	ATM7039	450 MHz	14.4
SGX544 MP2	Atom Z2520	300 MHz	19.2
	Allwinner A31 Allwinner A31s	350 MHz	22.4
	Atom Z2560	400 MHz	25.6
	R-Car M2	520 MHz	33.28
	Atom Z2580	533 MHz	34.1
	Allwinner A83T Allwinner H8	700 MHz	44.8
SGX544 MP3	Exynos 5410	533 MHz	51.1
SGX545	---	300 MHz	4.8
SGX545	Atom Z2460 Atom Z2760	533 MHz	8.5
SGX554	---	300 MHz	19.2
SGX554 MP2	---	300 MHz	38.4
SGX554 MP4	Apple A6X	266 MHz	68.1
G6020 (0.25 Clusters)	---	300 MHz	4.8
G6050 G6060 (0.5 Clusters)	---	300 MHz	9.6
G6100 G6110 (1 Clusters)	RK3368	600 MHz	38.4
G6200 (2 Clusters)	MT6595M MT8135	450 MHz	57.6
	MT6795M	550 MHz	70.4
	MT6595 MT6595T	600 MHz	76.8
	MT6793 Helio X10 (MT6795, MT6795T)	700 MHz	89.6
G6230 (2 Clusters)	Allwinner A80 Allwinner A80T	533 MHz	68.0
G6230 (2 Clusters)	ATM9009	600 MHz	76.8
GX6240 (2 Clusters)	---	650 MHz	83.2
GX6250 (2 Clusters)	MT8173 MT8176	600 MHz	76.8
	MT8693	700 MHz	89.6
	---	750 MHz	96
G6400 (4 Clusters)	---	300 MHz	76.8
	Atom Z3460 Atom Z3480	533 MHz	136.4
	R-Car H2	600 MHz	153.6
G6430 (4 Clusters)	---	300 MHz	76.8
	Apple A7 Apple A7 (iPad Air)	450 MHz	115.2
	Atom Z3530	457 MHz	117
	Atom Z3560 Atom Z3580	533 MHz	136.4
	Atom Z3570 Atom Z3590	640 MHz	163.8
GX6450 (4 Clusters)	Apple A8	450 MHz	115.2
GX6450 (4 Clusters)	---	600 MHz	153.6
G6630 (6 Clusters)	---	450 MHz	172.8
G6630 (6 Clusters)	---	600 MHz	230.4
GX6650 (6 Clusters)	R-Car H3	600 MHz	230.4
GX6850 (8 Clusters)	Apple A8X	450 MHz	230.4
GX6850 (8 Clusters)	---	600 MHz	307.2
GE7400 (0.5 Clusters)	---	600 MHz	19.2
GE7800 (1 Clusters)	---	600 MHz	38.4
GT7200 (2 Clusters)	---	650 MHz	83.2
GT7200 (2 Clusters)	SC9861G-IA	??? MHz	???
GT7400 (4 Clusters)	---	650 MHz	166.4
GT7400 Plus (4 Clusters)	Helio X30	800 MHz	204.8
GT7600 (6 Clusters)	Apple A9	450 MHz	172.8
GT7600 Plus (6 Clusters)	Apple A10 Fusion	650 MHz?	249.6?
GT7800 (8 Clusters)	---	650 MHz	332.8
GT7800+ (12 Clusters)	Apple A9X	450 MHz	345.6
GT7800? (12 Clusters)	Apple A10X Fusion	650 MHz?	499.2?
GT7900 (16 Clusters)	---	650 MHz	665.6
GT7900 (16 Clusters)	---	800 MHz	819.2
GT8525 (2 Clusters)	---	1000 MHz	192

Qualcomm Adreno

GPU Name	Chip	Fab	Clock	GFlops
Adreno 130	MSM7x00 MSM7x00A MSM7x01 MSM7x01A	??nm	133 MHz	1.2
Adreno 200	Snapdragon S1 MSM7225 MSM7625 MSM7227 MSM7627 QSD8250 QSD8650	65nm	133 MHz	2.1
	Snapdragon S1 MSM7225A MSM7625A	45nm	200 MHz	3.2
	Snapdragon S1 MSM7227A MSM7627A	45nm	245 MHz	3.92
Adreno 203	Snapdragon S4 Play MSM8225 MSM8625	45nm	245 MHz	7.84
Adreno 203	Snapdragon 200 MSM8225Q MSM8625Q	45nm	294 MHz	9.4
Adreno 205	Snapdragon S2 MSM7230 MSM7630 MSM8255 MSM8655 APQ8055	45nm	266 MHz	8.5
Adreno 220	Snapdragon S3 MSM8260 MSM8660 APQ8060	45nm	266 MHz	17
Adreno 225	Snapdragon S4 Plus APQ8060A MSM8260A	28nm	200 MHz	12.8
	Snapdragon S4 Plus (MSM8660A)	28nm	300 MHz	19.2
	Snapdragon S4 Plus (MSM8960)	28nm	400 MHz	25.6
Adreno 302	Snapdragon 200 MSM8210 MSM8610 MSM8212 MSM8612	28nm	400 MHz	19.2
Adreno 304	Snapdragon 208 Snapdragon 210 Snapdragon 212 Snapdragon Wear 2100	28nm	400 MHz	19.2
Adreno 305	Snapdragon S4 Plus MSM8227 MSM8627 Snapdragon 400 MSM8226 MSM8626 MSM8230 MSM8630 MSM8930 MSM8030AB MSM8230AB MSM8630AB MSM8930AB MSM8228 MSM8628 MSM8928 APQ8026 APQ8030	28nm	400~450 MHz	19.2~21.6
Adreno 306	Snapdragon 410 (MSM8916) Snapdragon 412 (MSM8916v2)	28nm	400 MHz	21.6
Adreno 308	Snapdragon 425 (MSM8917) Snapdragon 427	28nm	500 MHz	27
Adreno 320 (64 ALU)	Snapdragon S4 Pro MSM8960T APQ8064 APQ8064 1AA Snapdragon S4 Prime (MPQ8064)	28nm	400 MHz	57.6
Adreno 320 (96 ALU)	Snapdragon 600 (APQ8064T)	28nm	400 MHz	86.4
Adreno 320 (96 ALU)	Snapdragon 600 (APQ8064AB)	28nm	450 MHz	97.2
Adreno 330	Snapdragon 800 APQ8074 MSM8974AA	28nm	450 MHz	129.8
	Snapdragon 801 MSM8274AB MSM8974AB	28nm	550 MHz	158.4
	Snapdragon 801 (MSM8974AC)	28nm	578 MHz	166.5
Adreno 405	Snapdragon 415 (MSM8929) Snapdragon 615 (MSM8939) Snapdragon 616 (MSM8939v2) Snapdragon 617 (MSM8952)	28nm	550 MHz	59.4
Adreno 418	Snapdragon 808 (MSM8992)	20nm	600 MHz	172.8
Adreno 420	Snapdragon 805 (APQ8084)	28nm	500~600 MHz	144~172.8
Adreno 430	Snapdragon 810 APQ8094 MSM8994	20nm	500~650 MHz	324~420
Adreno 505	Snapdragon 430 (MSM8937) Snapdragon 435	28nm	450 MHz	48.6
Adreno 506	Snapdragon 450	14nm	600 MHz	120
Adreno 506	Snapdragon 625 Snapdragon 626	14nm	650 MHz	130
Adreno 508	Snapdragon 630	14nm	800 MHz?	160?
Adreno 510	Snapdragon 650 (MSM8956) Snapdragon 652 (MSM8976) Snapdragon 653 (MSM8976PRO)	28nm	600 MHz	180
Adreno 512	Snapdragon 660 (MSM8976 Plus)	14nm	800 MHz?	240?
Adreno 530	Snapdragon 820 (MSM8996)	14nm	510~624 MHz	407.4~498.5
Adreno 530	Snapdragon 821 (MSM8996PRO)	14nm	650 MHz	519.2
Adreno 540	Snapdragon 835 (MSM8998)	10nm	710 MHz	567
Adreno 608	---	10nm	??? MHz	???
Adreno 615	---	10nm	??? MHz	???
Adreno 630	Snapdragon 845	10nm	??? MHz	???

More Qualcomm Adreno Information in wiki

Nvidia Tegra

GPU Name	Chip	Fab	Clock	GFlops
Geforce ULP x 8	Tegra 2 (AP20H)	40nm	300 MHz	4.8
	Tegra 2 (T20)	40nm	333 MHz	5.6
	Tegra 2 (AP25, T25)	40nm	400 MHz	6.7
Geforce ULP x 12	Tegra 3 (T30L, AP33)	40nm	416 MHz	10
	Tegra 3	40nm	450 MHz	10.8
	Tegra 3 (T30, T33, AP37)	40nm	520 MHz	12.5
Geforce ULP x 60	Tegra 4i	28nm	660 MHz	79.2
Geforce ULP x 72	Tegra 4	28nm	672 MHz	96.8
Kepler Cores x 192 (1xSMX)	Tegra K1 Tegra K1 (Denver)	28nm	850 MHz	326.4
Maxwell Cores x 256 (2xSMM)	Tegra X1	20nm	850 MHz 1000 MHz	435.2 512
Pascal Cores x 256 (2xSMM)	Tegra Parker	16nm	1465 MHz	750
Volta Cores x 512	Tegra Xavier	12nm	???? MHz	????
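
The GFlops column for the Kepler and Maxwell rows can be reproduced with simple arithmetic. The sketch below assumes the usual convention of 2 floating-point operations per core per clock (one fused multiply-add); the table itself does not state this, but the listed values are consistent with it. The function name is made up for the example.

```python
def peak_gflops(shader_cores, clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak in GFLOPS: cores x FLOPs per clock x clock in GHz."""
    return shader_cores * flops_per_core_per_clock * clock_mhz / 1000


# (cores, clock in MHz) taken from the table rows above.
tegra_k1 = peak_gflops(192, 850)
tegra_x1_low = peak_gflops(256, 850)
tegra_x1_high = peak_gflops(256, 1000)
print(tegra_k1, tegra_x1_low, tegra_x1_high)
```

These are theoretical peaks. They say nothing about memory bandwidth or how well a given shader keeps the cores busy, so they are best used for rough comparisons only.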

Arm Mali

GPU Name	Chip	Clock	GFlops
Mali-400	---	200 MHz	1.8
	AML8726-M3	250 MHz	2.25
	ST-E U8500	275 MHz	2.48
	WM8850 WM8950 SC6815A SC7710 SC8810 SC9820 Allwinner A10 Allwinner A10s Allwinner A13	300 MHz	2.7
	RK292X	330 MHz	2.97
	SC7715 SC7727S ST-E U8520 Telechips TCC892x-i Rk2926 RK2928 MT6290 MT8638T MT6572M	400 MHz	3.6
	MT6570 MT6572 MT8312 MT8321 XMM6321 S5P4418	500 MHz	4.5
	---	533 MHz	4.8
Mali-400 MP2	LC1810 LC1811	300 MHz	5.4
	WM8880 WM8980 SC6825 SC8825 Allwinner A20 Allwinner A23 Allwinner A33	350 MHz	6.3
	SC5735A SC7730A SC7730S SC7731G SC8830 SC8830A SC8831G SC9830A SC9830I SC9836 MT6582M AML7366-M6C AML8726-MX AML8726-MXS AML8726-MXL NS115 LC1813 LC1913 RTD1195 Exynos 3250	400 MHz	7.2
	SC8831G	480 MHz	8.64
	MT6580 MT6582 MT8382 RK3026 RK3036	500 MHz	9.0
	RK3126 RK3128 RK3228 RK3229 Allwinner H3 Atom x3-C3130	600 MHz	10.8
Mali-400 MP4	RK3066 Exynos 4210	266 MHz	9.6
	Exynos 4212 SC7735S SC8735S SC8835S Hi3716 Hi3718 Hi3719 Rockchip PX2 AML7366-M6L	400 MHz	14.4
	Exynos 4412	440 MHz	15.84
	Exynos 3470	450 MHz	16.2
	Exynos 4412 v2 RK3188 S5P6818	533 MHz	19.2
Mali-450	WM8860	300 MHz	4.5
Mali-450 MP2	AML7366-M6D	400 MHz	12
Mali-450 MP2	Amlogic M803 Amlogic M805 Amlogic M805T Amlogic M806 Amlogic S805	500 MHz	15
Mali-450 MP3	Amlogic S905 Amlogic S905X	750 MHz	33.75
Mali-450 MP4	MT8685	416 MHz	24.8
	Kirin 620 Mstar 6A908 Mstar 6A918	500 MHz	29.8
	Kirin 910	533 MHz	32
	MT6588 MT6592M MT8127 MT6591 MT6591H Atom x3-C3230RK Hi3796M V100 Hi3798M V100	600 MHz	35.8
	MT6592 MT8392 Kirin 910T	700 MHz	41.8
Mali-450 MP6	Amlogic M801 Amlogic M802 Amlogic S801 Amlogic S802 Amlogic S802H Amlogic S812 Amlogic T866 Hi3796 Hi3798	600 MHz	53.8
Mali-450 MP8	---	600 MHz	71.7
Mali-T604	---	533 MHz	17
Mali-T604 MP2	---	533 MHz	34
Mali-T604 MP4	Exynos 5250	533 MHz	68.2
Mali-T622	---	533 MHz	8.5
Mali-T624	---	533 MHz	17
Mali-T624 MP4	Kirin 920(K3V3) Kirin 925 Kirin 928 Exynos 5260	600 MHz	76.8
Mali-T628	---	533 MHz	17
Mali-T628 MP2	LC1860 LC1860C LC1960	600 MHz	38.4
Mali-T628 MP3	---	533 MHz	51.2
Mali-T628 MP4	Kirin 930 Kirin 935	680 MHz	87
Mali-T628 MP6	Exynos 5420 Exynos 5422	533 MHz	102.4
Mali-T628 MP6	Exynos 5430	600 MHz	115.2
Mali-T720	---	450 MHz	7.65
Mali-T720	Exynos 7270 Exynos 7570	??? MHz	???
Mali-T720 MP2	MT6735P MT8735P	400 MHz	13.6
	MT6735M MT8735M	500 MHz	17
	MT8163V/B	520 MHz	17.68
	MT6737 MT8735D MT8735B	550 MHz	18.7
	Atom x3-C3440 Exynos 3475 MT6735 MT6737T MT8163V/A	600 MHz	20.4
	Exynos 7580	668 MHz	22.7
Mali-T720 MP3	MT6753 MT6753T MT8783	700 MHz	35.7
Mali-T720 MP6	LC1980	???	???
Mali-T720 MP8	---	600 MHz	81.6
Mali-T720 MP?	Hi3798C V200	???	103
Mali-T760	---	600 MHz	20.4
Mali-T760 MP2	MT6732 MT6732M MT8732	500 MHz	34
Mali-T760 MP2	MT6752 MT6752M MT8752	700 MHz	47.6
Mali-T760 MP4	Mstar 6A928	552 MHz	75
Mali-T760 MP4	RK3288 RK3288-C	600 MHz	81.6
Mali-T760 MP6	Exynos 5433 (Exynos 7410)	700 MHz	142.8
Mali-T760 MP8	Exynos 7420	772 MHz	210
Mali-T820	---	600 MHz	10.2
Mali-T820	SC9850	??? MHz	???
Mali-T820 MP3	Amlogic S912 Mstar 6A938	600 MHz	30.6
Mali-T830	---	600 MHz	20.4
Mali-T830 MP2	Amlogic S966 Amlogic T966 Amlogic T968	650 MHz	44.2
	Kirin 650 Kirin 655 Kirin 658	900 MHz	61.2
	Exynos 7870	700 MHz	47.6
Mali-T830 MP3	Exynos 7880	950 MHz	71.4
Mali-T860	---	700 MHz	23.8
Mali-T860 MP2	MT6738	350 MHz	23.8
	MT6750 MT6738T	520 MHz	35.3
	Helio P10 (MT6755M)	550 MHz	37.4
	MT6750T	650 MHz	44.2
	Helio P10 (MT6755) MT8785	700 MHz	47.6
	MT6739 Helio P15 (MT6755T)	800 MHz	54.4
Mali-T860 MP3	Exynos 7650	700 MHz	71.4
Mali-T860 MP4	RK3399	600 MHz	81.6
Mali-T860 MP4	Pinecone S1 (V670)	922 MHz	125.4
Mali-T880	---	850 MHz	28.9
Mali-T880 MP??	LG Nuclun 2	??? MHz	???
Mali-T880 MP2	Helio P20 (MT6757)	900 MHz	61.2
Mali-T880 MP2	Helio P25 (MT6757CD)	1000 MHz	68
Mali-T880 MP4	SC9860GV	??? MHz	???
	Helio X20 (MT6797) Helio X23 (MT6797D)	780 MHz	106
	Helio X25 (MT6797T)	850 MHz	115.6
	Helio X27 (MT6797X)	875 MHz	119
	Kirin 950 (Boost) Kirin 955 (Boost)	900 MHz	122.4
Mali-T880 MP10	Exynos 8890 (Lite)	650 MHz	221
Mali-T880 MP12	Exynos 8890	650 MHz	265.2
Mali-G51	---	??? MHz	???
Mali-G71	---	850 MHz	28.9
Mali-G71 MP2	MT6763 Helio P23 (MT6763T)	770 MHz	52.36
Mali-G71 MP2	Helio P30 (MT6758)	950 MHz	64.6
Mali-G71 MP8	Kirin 960	1037 MHz	282
Mali-G71 MP12	Pinecone S2? (V970)	900 MHz?	367.2?
Mali-G71 MP18	Exynos 8895 (Lite)	546 MHz	334
Mali-G71 MP20	Exynos 8895	546 MHz	371.2
Mali-G72	---	850 MHz	28.9
Mali-G72 MP3	Exynos 9610	??? MHz	???
Mali-G72 MP12	Kirin 970	850 MHz	346.8

Vivante Graphics & Broadcom VideoCore

GPU Name	Chip	Clock	GFlops
GC200	Jz4760	??? MHz	???
GC400	i.MX6 SoloX	??? MHz	???
GC500	PXA920	315 MHz	0.96
GC800	RK2918 ATM7013 ATM7019	575 MHz	4.6
GC860	Jz4770	??? MHz	???
GC880	i.MX6S i.MX6DL	??? MHz	???
GC1000	PXA986 PXA988 PXA1088	600 MHz	9.6
GC1000 Plus	ATM7029	630 MHz	10.1
GC2000	i.MX6D i.MX6Q	600 MHz	19.2
GC4000	K3V2	480 MHz	30.7
GC3000	S32V234	800 MHz	32
GC5000	PXA1928	800 MHz	64
GC6000 GC6400	---	800 MHz	128
GC7000UL	PXA1908	800 MHz	16
GC7000L	PXA1936	800 MHz	32
GC7000	---	800 MHz	64
GC7200	---	800 MHz	128
GC7400	---	800 MHz	256
GC7600	---	800 MHz	512
GC8000	---	---	---
VideoCore1	VC01	---	---
VideoCore2	BCM2702 BCM2705 BCM2722 BCM2724	---	---
VideoCore3	BCM2727 BCM11181	---	---
VideoCore4	BCM2763 BCM2820 BCM2835 BCM2836 BCM11182 BCM11311 BCM21533 BCM21654 BCM21663 BCM21664 BCM21664T BCM28145 BCM28150 BCM28155 BCM23550	250 MHz	24
VideoCore4	BCM2837	300 MHz	28.8

Intel HD Graphics

Name	type	EUs	Chip	Fab	Clock(MHz)	GFlops
GMA 4500 Series	Gen 4	10	G41, G43, G45...	65nm	533~800	21~32
HD Graphics	Gen 5	12	Clarkdale Arrandale	45nm	533~900	25.6~43.2
HD Graphics HD Graphics 2000	Gen 6	6	SandyBridge GT1	32nm	950~1350	45.6~64.8
HD Graphics 3000	Gen 6	12	SandyBridge GT2	32nm	1000~1350	96~129.6
HD Graphics	Gen 7	4	Bay Trail-T Atom Z37xx Atom E38xx Bay Trail-M Pentium N35xx Celeron N2xxx Bay Trail-D Pentium J2xxx Celeron J1xxx	22nm	400~896	25.6~57.3
HD Graphics HD Graphics 2500	Gen 7	6	IvyBridge GT1	22nm	800~1150	76.8~110.4
HD Graphics 4000 HD Graphics P4000	Gen 7	16	IvyBridge GT2	22nm	850~1300	217.6~332.8
HD Graphics	Gen 7.5	10	Haswell GT1	22nm	850~1150	136~184
HD Graphics 4400	Gen 7.5	12	Haswell GT1.5	22nm	1150~1300	220.8~249.6
HD Graphics 4200 HD Graphics 4400 (Mobile) HD Graphics 4600 HD Graphics P4600 HD Graphics P4700	Gen 7.5	20	Haswell GT2	22nm	850~1350	272~432
HD Graphics 5000 Iris Graphics 5100	Gen 7.5	40	Haswell GT3	22nm	1000~1100	640~704
Iris Pro 5200 (with 128MB eDRAM)	Gen 7.5	40	Haswell GT3e	22nm	1200~1300	768~832
HD Graphics HD Graphics 400	Gen 8	12	Cherry Trail Atom x5-Z83xx Atom x5-Z85xx Braswell Celeron N30xx Celeron N31xx Celeron J30xx Celeron J31xx	14nm	500~700	96~134.4
HD Graphics HD Graphics 405	Gen 8	16	Cherry Trail Atom x7-Z87xx Braswell Pentium N37xx	14nm	600~700	153.6~179.2
HD Graphics 405	Gen 8	18	Braswell Pentium J3710	14nm	740	213.12
HD Graphics (Broadwell)	Gen 8	12	Broadwell-U GT1	14nm	800~850	153.6~163.2
HD Graphics 5300	Gen 8	24	Broadwell-Y GT2 Core M-5Yxx	14nm	800~850	307.2~326.4
HD Graphics 5500	Gen 8	23	Broadwell-U GT2	14nm	850~900	312.8~331.2
HD Graphics 5500	Gen 8	24	Broadwell-U GT2	14nm	900~950	345.6~364.8
HD Graphics 5600 HD Graphics P5700	Gen 8	24	Broadwell-U GT2	14nm	1000~1050	384~403.2
HD Graphics 6000	Gen 8	48	Broadwell-U GT3	14nm	950~1000	729.6~768
Iris Graphics 6100	Gen 8	48	Broadwell-U GT3	14nm	1050~1100	806.4~844.8
Iris Pro Graphics 6200 Iris Pro Graphics P6300 (with 128MB eDRAM)	Gen 8	48	Broadwell GT3e	14nm	1000~1150	768~883.2
HD Graphics 500	Gen 9	12	Apollo Lake Celeron N3350 Celeron N3450 Celeron J3355 Celeron J3455	14nm	650~750	124.8~144
HD Graphics 505	Gen 9	18	Apollo Lake Pentium N4200 Pentium J4205	14nm	750~800	216~230.4
HD Graphics 510	Gen 9	12	Skylake GT1	14nm	900~1000	172.8~192
HD Graphics 515	Gen 9	24	Skylake-Y GT2 Core M3 Core M5 Core M7	14nm	800~1000	307.2~384
HD Graphics 520	Gen 9	24	Skylake-U GT2	14nm	1000~1050	384~403.2
HD Graphics 530 HD Graphics P530	Gen 9	24	Skylake GT2	14nm	900~1150	345.6~441.6
Iris Graphics 540 Iris Graphics 550 (with 64MB eDRAM)	Gen 9	48	Skylake GT3e	14nm	950~1100	729.6~844.8
Iris Pro Graphics 580 Iris Pro Graphics P580 (with 128MB eDRAM)	Gen 9	72	Skylake GT4e	14nm	1000	1152
HD Graphics 610	Gen 9+	12	Kaby Lake GT1	14nm	900~1050	172.8~201.6
HD Graphics 615	Gen 9+	24	Kaby Lake-Y GT2 Pentium 4410Y Core M3-7Yxx Core i5-7Yxx Core i7-7Yxx	14nm	850~1050	326.4~403.2
HD Graphics 620	Gen 9+	24	Kaby Lake-U GT2	14nm	1000~1150	384~441.6
HD Graphics 630 HD Graphics P630	Gen 9+	24	Kaby Lake GT2	14nm	950~1150	364.8~441.6
Iris Plus Graphics 640 Iris Plus Graphics 650 (with 64MB eDRAM)	Gen 9+	48	Kaby Lake GT3e	14nm	950~1150	729.6~883.2
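
The GFlops column above appears to follow peak = EUs × FLOPs per EU per clock × clock. The sketch below reproduces that arithmetic. The per-EU factors (4 for Gen 4/5, 8 for Gen 6, 16 from Gen 7 onward) are inferred from the table rows themselves, not quoted from Intel documentation, and `intel_peak_gflops` is a hypothetical helper name.

```python
# FLOPs per EU per clock, inferred from the table rows (an assumption,
# not a figure taken from Intel documentation).
FLOPS_PER_EU_PER_CLOCK = {
    "Gen 4": 4, "Gen 5": 4, "Gen 6": 8,
    "Gen 7": 16, "Gen 7.5": 16, "Gen 8": 16, "Gen 9": 16, "Gen 9+": 16,
}

def intel_peak_gflops(gen, eus, clock_mhz):
    """Peak single-precision GFlops = EUs * FLOPs/EU/clock * clock in GHz."""
    return eus * FLOPS_PER_EU_PER_CLOCK[gen] * clock_mhz / 1000.0

# HD Graphics 4000 (Gen 7, 16 EUs) at its lowest and highest listed clocks.
print(intel_peak_gflops("Gen 7", 16, 850))
print(intel_peak_gflops("Gen 7", 16, 1300))
```

For HD Graphics 4000 this gives 217.6 and 332.8, matching the range in its row. Real workloads rarely reach these theoretical peaks.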

NVIDIA GeForce GTX 600 Series

GPU Name	Card	Cores	Clock (MHz)	Memory	Bus (bit)	GFlops
GK110	GTX Titan	2688	837~876	GDDR5	384	4500
GK104	GTX 680	1536	1006~1110	GDDR5	256	3250
	GTX 670	1344	915~1084	GDDR5	256	2760
	GTX 660Ti	1344	915~1058	GDDR5	192	2460
GK106	GTX 660	960	980~1032	GDDR5	192	1881.6
	GTX 650Ti Boost	768	980~1032	GDDR5	192	1505.2
	GTX 650Ti	768	928	GDDR5	128	1425.4
GK107	GTX 650	384	1058	GDDR5	128	812.5
GK107	GT 640	384	900	DDR3	128	691.2

More NVIDIA GeForce information is available in the wiki

AMD Radeon HD 7000 Series

GPU Name	Card	Cores	Clock (MHz)	Memory	Bus (bit)	GFlops
Tahiti XT2	HD 7970 GHZ	2048	1000~1050	GDDR5	384	4096~4300
Tahiti XT	HD 7970	2048	925	GDDR5	384	3788.8
Tahiti Pro	HD 7950 Boost	1792	850~925	GDDR5	384	3046.4~3315.2
Tahiti Pro	HD 7950	1792	800	GDDR5	384	2867.2
Tahiti LE	HD 7870 XT	1536	925~975	GDDR5	256	2841.6~2995.2
Pitcairn XT	HD 7870 GHZ	1280	1000	GDDR5	256	2560
Pitcairn Pro	HD 7850	1024	860	GDDR5	256	1761.28
Bonaire XT	HD 7790	896	1000	GDDR5	128	1792
Cape Verde XT	HD 7770 GHZ ver.2	640	1100	GDDR5	128	1408
Cape Verde XT	HD 7770 GHZ	640	1000	GDDR5	128	1280
Cape Verde Pro	HD 7750 ver.2	512	900	GDDR5	128	921.6
Cape Verde Pro	HD 7750	512	800	GDDR5	128	819.2

More AMD Radeon information is available in the wiki
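
For the discrete cards, most GFlops entries match cores × 2 × clock, counting one fused multiply-add as two FLOPs per core per clock. The sketch below checks that against a few rows. The factor of 2 is an assumption inferred from the numbers, and `shader_peak_gflops` is a hypothetical helper name.

```python
def shader_peak_gflops(cores, clock_mhz, flops_per_core_per_clock=2):
    """Peak FP32 GFlops, assuming one fused multiply-add (2 FLOPs) per core per clock."""
    return cores * flops_per_core_per_clock * clock_mhz / 1000.0

# Rows taken from the tables above: (card, cores, clock in MHz).
for card, cores, mhz in [("HD 7970", 2048, 925), ("GTX 660", 960, 980), ("GT 640", 384, 900)]:
    print(f"{card}: {shader_peak_gflops(cores, mhz):.1f} GFlops")
```

These three rows come out at 3788.8, 1881.6 and 691.2 GFlops, as listed. Not every row reproduces from its base clock: the GTX 680 figure of 3250, for example, falls between the base-clock and boost-clock results, so treat the formula as an approximation of how the column was filled in.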

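Snapdragon 820
Samsung 14nm FinFET process
Quad-core Kryo architecture at 2.2 GHz (custom in-house design)
GPU: Adreno 530 @ 510~624 MHz, GPU floating-point performance 407.4~498.5 GFlops
Memory bandwidth: 28.8 GB/s (dual-channel LPDDR4)

Exynos 8890
Samsung 14nm FinFET process
Quad-core ARM Cortex-A53 at 1.6 GHz plus quad-core Exynos M1 at 2.3~2.6 GHz (custom in-house design)
GPU: Mali-T880 MP12 (12 cores) @ 650 MHz, GPU floating-point performance 265.2 GFlops
Memory bandwidth: 28.7 GB/s (dual-channel LPDDR4)

Helio X20
TSMC 20nm process
Ten cores: 2x Cortex-A72 @ 2.5 GHz plus 4x Cortex-A53 @ 2.0 GHz plus 4x Cortex-A53 @ 1.4 GHz
GPU: Mali-T880 MP4 @ 780 MHz, GPU floating-point performance 106 GFlops
Memory bandwidth: 14.9 GB/s (dual-channel LPDDR3)

Kirin 950
TSMC 16nm process
Eight cores: 4x ARM Cortex-A72 @ 2.3 GHz plus 4x ARM Cortex-A53 @ 1.8 GHz
GPU: Mali-T880 MP4 @ 900 MHz, GPU floating-point performance 122.4 GFlops
Memory bandwidth: 25.6 GB/s (dual-channel LPDDR4)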
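
The three Mali-T880 figures above can be cross-checked by inverting peak = cores × k × clock to recover k, the FLOPs per core per clock that each quoted number implies. This is only a consistency check on the listed values. `flops_per_core_per_clock` is a hypothetical helper name, and the formula is an assumption about how the figures were derived.

```python
def flops_per_core_per_clock(gflops, cores, clock_mhz):
    """Invert peak = cores * k * clock to recover k (FLOPs per core per clock)."""
    return gflops * 1000.0 / (cores * clock_mhz)

# (SoC, quoted GFlops, Mali-T880 cores, clock in MHz), as listed above.
for soc, gflops, cores, mhz in [
    ("Exynos 8890", 265.2, 12, 650),
    ("Helio X20", 106.0, 4, 780),
    ("Kirin 950", 122.4, 4, 900),
]:
    print(soc, round(flops_per_core_per_clock(gflops, cores, mhz), 2))
```

All three come out at roughly 34 FLOPs per core per clock, so the quoted figures are consistent with one another and differ only in core count and clock.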

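macOS Mojave (10.14.3), Android Studio 3.3.1, NDK 19.1.5304403: Importing and Building the vuh Project

An earlier post, "Android Studio 3.2.1上vuh库使用的例子" (an example of using the vuh library on Android Studio 3.2.1), built a sample app on top of vuh. That sample linked against a libvuh.so we had compiled beforehand. This time we integrate vuh by compiling its source code directly as part of the project.

I tried pulling in the vuh library with both ExternalProject_Add and include, but neither worked well.

A project imported with ExternalProject_Add is built only once, and specifying BUILD_ALWAYS 1 does not help. This is probably caused by Ninja. As a result, with multiple ABIs or after the vuh source changes, the library is not rebuilt and various build errors appear.

A project pulled in with include ends up with incorrect path information, so its source files cannot be found.

In the end, add_subdirectory did the job.

The key files after the changes are as follows:

Note: the VUH_ROOT_DIR variable specifies the location of the vuh source code.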

# For more information about using CMake with Android Studio, read the
# documentation: https://d.android.com/studio/projects/add-native-code.html

# Sets the minimum version of CMake required to build the native library.

cmake_minimum_required(VERSION 3.8)

# for Vulkan
SET(Vulkan_INCLUDE_DIR ${ANDROID_NDK}/sources/third_party/vulkan/src/include/)
SET(Vulkan_LIBRARIES ${ANDROID_NDK}/platforms/${ANDROID_PLATFORM}/arch-${ANDROID_ARCH_NAME}/usr/lib)
if(X86_64)
    SET(Vulkan_LIBRARIES ${Vulkan_LIBRARIES}64)
endif()
SET(Vulkan_LIBRARIES ${Vulkan_LIBRARIES}/libvulkan.so)
add_library(vulkan SHARED IMPORTED)
set_target_properties(vulkan PROPERTIES IMPORTED_LOCATION ${Vulkan_LIBRARIES})

# for vuh
add_definitions(-DVK_USE_PLATFORM_ANDROID_KHR=1 -DVULKAN_HPP_TYPESAFE_CONVERSION=1)
SET(VUH_ROOT_DIR ${CMAKE_CURRENT_SOURCE_DIR}/../../vuh/)
SET(VUH_BUILD_TESTS OFF)
SET(VUH_BUILD_DOCS OFF)
SET(VUH_BUILD_EXAMPLES OFF)

add_subdirectory(${VUH_ROOT_DIR}src/ ${CMAKE_CURRENT_SOURCE_DIR}/build/)

# for example
SET(Vuh_INCLUDE_PATH ${VUH_ROOT_DIR}src/include)
include_directories(${Vulkan_INCLUDE_DIR})
include_directories(${Vuh_INCLUDE_PATH})

# Creates and names a library, sets it as either STATIC
# or SHARED, and provides the relative paths to its source code.
# You can define multiple libraries, and CMake builds them for you.
# Gradle automatically packages shared libraries with your APK.

add_library( # Sets the name of the library.
        native-lib

        # Sets the library as a shared library.
        SHARED

        # Provides a relative path to your source file(s).
        src/main/cpp/native-lib.cpp)


add_dependencies(native-lib vuh)

# Searches for a specified prebuilt library and stores the path as a
# variable. Because CMake includes system libraries in the search path by
# default, you only need to specify the name of the public NDK library
# you want to add. CMake verifies that the library exists before
# completing its build.

find_library( # Sets the name of the path variable.
        log-lib

        # Specifies the name of the NDK library that
        # you want CMake to locate.
        log

        android)

# Specifies libraries CMake should link to your target library. You
# can link multiple libraries, such as libraries you define in this
# build script, prebuilt third-party libraries, or system libraries.

target_link_libraries( # Specifies the target library.
        native-lib

        vulkan

        vuh

        ${log-lib})

Note: the vuh library requires CMake 3.8 or newer, so we need to set the CMake version manually, here to 3.10.2.

As follows:

apply plugin: 'com.android.application'

android {
    compileSdkVersion 28
    defaultConfig {
        applicationId "com.mobibrw.vuhandroid"
        minSdkVersion 24
        targetSdkVersion 28
        versionCode 1
        versionName "1.0"
        testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
        externalNativeBuild {
            cmake {
                version "3.10.2"
                cppFlags "-std=c++14 -v -g"
            }
        }
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
        }
    }
    externalNativeBuild {
        cmake {
            version "3.10.2"
            path "CMakeLists.txt"
        }
    }
}

dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    implementation 'com.android.support:appcompat-v7:28.0.0'
    implementation 'com.android.support.constraint:constraint-layout:1.1.3'
    testImplementation 'junit:junit:4.12'
    androidTestImplementation 'com.android.support.test:runner:1.0.2'
    androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2'
}

If the following error appears:

CMake Error: CMake was unable to find a build program corresponding to "Ninja".  CMAKE_MAKE_PROGRAM is not set.  You probably need to select a different build tool.
-- Configuring incomplete, errors occurred!

then install Ninja:

$ brew install ninja

If the following error appears:

* What went wrong:
Execution failed for task ':app:transformNativeLibsWithMergeJniLibsForDebug'.
> More than one file was found with OS independent path 'lib/armeabi-v7a/libvuh.so'

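then delete jniLibs/armeabi-v7a/libvuh.so from the project to resolve it. Presumably this is the prebuilt copy left over from the earlier sample, which collides with the libvuh.so now built from source.

Click here to download the complete example: vuhAndroid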
