TX1 Getting Started, Software: Installing TensorFlow (1.0.1)
Overview:
This tutorial shows how to install TensorFlow 1.0.1 on the TX1; versions 1.0 and later support more functionality.
Prerequisites:
Flash the board with JetPack, which installs:
L4T 24.2.1, an Ubuntu 16.04 64-bit variant (aarch64)
CUDA 8.0
cuDNN 5.1.5
Building TensorFlow requires both CUDA and cuDNN.
TensorFlow takes up a lot of disk space and the TX1's internal storage is usually too small, so it is best to boot from a 64 GB+ USB drive as the root partition and to enlarge the swap space to 8 GB+.
Installation:
Install Java:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
Install the build dependencies (Python 2.7 is used):
sudo apt-get install zip unzip autoconf automake libtool curl zlib1g-dev maven -y
sudo apt-get install python-numpy swig python-dev python-pip python-wheel -y
Install Bazel (version 0.4.5):
wget https://github.com/bazelbuild/bazel/releases/download/0.4.5/bazel-0.4.5-dist.zip
Unzip and enter the directory:
unzip bazel-0.4.5-dist.zip -d bazel-0.4.5-dist
cd bazel-0.4.5-dist
Edit:
vim src/main/java/com/google/devtools/build/lib/util/CPU.java
At line 28, change:
ARM("arm", ImmutableSet.of("arm","armv7l"))
to:
ARM("arm", ImmutableSet.of("aarch64", "arm","armv7l"))
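The edit above can also be applied from the shell. A minimal sketch, assuming the ARM entry in CPU.java is written exactly as quoted above; if the pattern does not match your checkout, edit the file by hand instead:

```shell
# Add "aarch64" to the ARM entry in Bazel's CPU.java.
# No-op if the quoted pattern is not found in the file.
patch_bazel_cpu() {
    sed -i 's/ImmutableSet.of("arm","armv7l")/ImmutableSet.of("aarch64", "arm","armv7l")/' "$1"
}

# usage (from inside bazel-0.4.5-dist):
#   patch_bazel_cpu src/main/java/com/google/devtools/build/lib/util/CPU.java
```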
Compile:
./compile.sh
Copy the resulting binary into the system bin directory:
sudo cp output/bazel /usr/local/bin
Create a swap file
Create an 8 GB swap file:
fallocate -l 8G swapfile
Restrict its permissions:
chmod 600 swapfile
Format it as swap space:
mkswap swapfile
Activate it:
sudo swapon swapfile
Confirm:
swapon -s
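Besides swapon -s, you can sanity-check the total the kernel now reports:

```shell
# Print the swap total the kernel sees; SwapTotal in /proc/meminfo is in kB.
awk '/^SwapTotal:/ {printf "swap: %.1f GiB\n", $2 / 1024 / 1024}' /proc/meminfo
```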
Install TensorFlow
Clone the repository:
git clone https://github.com/tensorflow/tensorflow.git
Check out the release tag:
cd tensorflow
git checkout v1.0.1
Edit:
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
Find:
static int TryToReadNumaNode(const string &pci_bus_id, int device_ordinal)
Inside the function, change the "#if defined(__APPLE__)" guard to:
#ifdef __aarch64__
  LOG(INFO) << "ARM64 does not support NUMA - returning NUMA node zero";
  return 0;
#elif defined(__APPLE__)
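This edit can likewise be scripted. A sketch, assuming the "#if defined(__APPLE__)" guard appears only inside TryToReadNumaNode in the v1.0.1 sources; verify the result before building:

```shell
# Insert the aarch64 NUMA workaround ahead of the __APPLE__ branch in
# cuda_gpu_executor.cc. Rewrites every occurrence of the guard, so check
# the file afterwards if your checkout differs from v1.0.1.
patch_numa_node() {
    sed -i 's/#if defined(__APPLE__)/#ifdef __aarch64__\n  LOG(INFO) << "ARM64 does not support NUMA - returning NUMA node zero";\n  return 0;\n#elif defined(__APPLE__)/' "$1"
}

# usage (from the tensorflow source root):
#   patch_numa_node tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
```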
Copy the cuDNN header:
sudo cp /usr/include/cudnn.h /usr/lib/aarch64-linux-gnu/include/cudnn.h
Configure:
./configure
A sample session:
ubuntu@tegra-ubuntu:~/tensorflow$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
Please specify optimization flags to use during compilation [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? (Linux only) [Y/n] y
jemalloc enabled on Linux
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] y
XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with. You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
Extracting Bazel installation...
.......................
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
.......................
INFO: All external dependencies fetched successfully.
Configuration finished
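If you need to rebuild, the same answers can be supplied non-interactively. A sketch, assuming the environment variable names read by the 1.0-era configure script (PYTHON_BIN_PATH, TF_NEED_CUDA, and so on); if a prompt still appears, check your copy of the configure script for the exact names:

```shell
# Pre-answer the configure prompts from the session above.
export PYTHON_BIN_PATH=/usr/bin/python2.7
export CC_OPT_FLAGS="-march=native"
export TF_NEED_JEMALLOC=1
export TF_NEED_GCP=0
export TF_NEED_HDFS=0
export TF_ENABLE_XLA=1
export TF_NEED_OPENCL=0
export TF_NEED_CUDA=1
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export CUDNN_INSTALL_PATH=/usr/local/cuda
# 5.3 is the TX1's compute capability; the session log above shows a 6.2
# (GP10B) device, so use the value that matches your board.
export TF_CUDA_COMPUTE_CAPABILITIES=5.3
# then run: ./configure
```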
Build with bazel (--local_resources caps the build at 3072 MB of RAM, 4.0 CPU cores, and 1.0 units of I/O so it does not exhaust the board's memory):
bazel build -c opt --local_resources 3072,4.0,1.0 --verbose_failures --config=cuda //tensorflow/tools/pip_package:build_pip_package
Have bazel generate the whl file:
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Save the whl file:
mv /tmp/tensorflow_pkg/tensorflow-1.0.1-cp27-cp27mu-linux_aarch64.whl $HOME/
Install:
sudo pip install $HOME/tensorflow-1.0.1-cp27-cp27mu-linux_aarch64.whl
Reboot:
sudo reboot
Test:
ubuntu@tegra-ubuntu:~$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
>>> x = tf.constant(1.0)
>>> y = tf.constant(2.0)
>>> z = x + y
>>> with tf.Session() as sess:
...     print z.eval()
...
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:874] ARM has no NUMA node, hardcoding to return zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GP10B
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 6.79GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GP10B, pci bus id: 0000:00:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 6.45G (6929413888 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.81G (6236472320 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.23G (5612825088 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 4.70G (5051542528 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 4.23G (4546387968 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.81G (4091749120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): GP10B, Compute Capability 6.2
3.0
The CUDA_ERROR_OUT_OF_MEMORY lines are not fatal here: TensorFlow first tries to reserve nearly all free GPU memory and retries with smaller allocations, and the computation still completes and prints 3.0.
Troubleshooting:
Error:
tensorflow/stream_executor/BUILD:39:1: C++ compilation of rule '//tensorflow/stream_executor:cuda_platform' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command
Solution:
https://github.com/tensorflow/tensorflow/issues/2559
https://github.com/tensorflow/tensorflow/issues/2556
Edit tensorflow/stream_executor/cuda/cuda_blas.cc; after
#if CUDA_VERSION >= 7050
#define EIGEN_HAS_CUDA_FP16
#endif
add the definition:
#if CUDA_VERSION >= 8000
#define CUBLAS_DATA_HALF CUDA_R_16F
#endif
- Permanent link to this article: http://www.rosrobot.cn/?id=99
- When reposting, please credit: znjrobot, 北京智能佳科技有限公司