Deep learning was never really meant to run on a mobile device like your personal notebook. Usually, local GPU clusters or cloud-based server systems are used to train models in an acceptable amount of time. However, everyone who is into AI likes to try things out on their own. Depending on your field of interest, it quickly becomes obvious that without any GPU at hand you should not even consider starting. If you have an Nvidia GPU in your machine (AMD with PlaidML is possible but not well established yet), you might be tempted to give it a try. Unfortunately, you will soon realize that this is not sustainable. It is nice to be able to train a VGG-16 with a batch size of 2 on your mid- or low-end RTX 20 series card, but it simply takes too long, and your notebook will sound and feel like a plane before take-off. These machines can easily handle daily work, but they were certainly not designed to train deep learning models for 24 hours straight. Even if you have a good gaming notebook with a decent GPU, you may not want to run machine learning projects on it for the following reasons:
- The notebook itself is not built to run at maximum speed for more than six hours without a break. The heat generated inside the chassis will, without a doubt, damage your system components over time.
- The GPU inside your notebook is power constrained compared to its desktop variant and therefore much slower. For example, my notebook has a Quadro RTX 5000 Max-Q built in, which sounds awesome at first, but the spec sheet shows it runs at only 80 W compared to 230 W for the desktop card, so it is a lot weaker by design.
When I realized that, I started to research what options I had to keep my current gear but become more “competitive” at machine learning. I ended up with the following two options:
- Use a cloud-based service like Google Colab with GPU acceleration to run my projects without being held back by my local machine's capabilities.
- Purchase an external GPU (eGPU) that I can connect to my notebook via Thunderbolt 3 to offload the heavy workloads and save my system from future damage.
Option 1 was tempting at first, but I have read many articles on Medium covering the problem of randomly assigned GPUs, which vary a lot in speed. I still think it is the best option for beginners, because many tutorials target Google Colab and it is free. But I personally like to have my projects running on my local machine without having to rely on an internet connection or anything else. Hence, I decided to buy a used eGPU to test what benefit it would bring to my system.
Testing eGPU with Tensorflow 2.x
I got my hands on a used RTX 2080 Ti eGPU all-in-one solution from Gigabyte. If you dig deeper into eGPU-related topics, you will quickly realize that the Thunderbolt 3 connection limits the performance of a desktop GPU because it can only use four PCIe lanes. I was afraid this could result in very poor machine learning performance, but as you will see, I was luckily wrong. As a side note, the performance dip for gaming is indeed very big, and I would not recommend buying an eGPU for gaming if you already have a dedicated GPU inside your notebook.
My system has an i7-9750H six-core processor with 32 GB of RAM running Pop!_OS 20.04. The integrated Nvidia Quadro RTX 5000 Max-Q is one of the most capable mobile GPU chips out there (at least before the new RTX 30 series). So if you run a mid- or low-end Nvidia RTX 20 series card, you will gain even more performance than shown in this comparison. I used the AI-Benchmark tool so that you can compare your own machine to the results of my system using the eGPU.
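Before benchmarking, it is worth confirming that TensorFlow actually sees the external card. A minimal sketch, assuming TensorFlow 2.x and the `ai-benchmark` pip package are installed (`pip install tensorflow ai-benchmark`); your driver and CUDA setup are your own:

```python
import tensorflow as tf

# The eGPU should show up here, e.g. as /physical_device:GPU:0.
gpus = tf.config.list_physical_devices('GPU')
print("Visible GPUs:", gpus)

if gpus:
    # Run the same benchmark suite used for the scores below
    # (training + inference across the whole test palette).
    from ai_benchmark import AIBenchmark
    results = AIBenchmark().run()
```

If the list comes back empty, fix the driver/CUDA installation first; otherwise the benchmark silently falls back to the CPU.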
The eGPU setup scored 27104 points in AI-Benchmark, which is 17.6% less than the best official result for any RTX 2080 Ti system (a full desktop tower). The dedicated notebook GPU scored 17046 points, which is actually higher than the listed score in the ranking, but still a performance loss of 37.2% compared to the eGPU setup.
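The relative numbers follow directly from the two scores; a quick sanity check:

```python
# Scores from my AI-Benchmark runs (see above).
egpu_score = 27104    # RTX 2080 Ti via Thunderbolt 3
mobile_score = 17046  # Quadro RTX 5000 Max-Q (internal)

# Relative loss of the mobile chip versus the eGPU setup.
loss_vs_egpu = (egpu_score - mobile_score) / egpu_score * 100
print(f"Mobile GPU is {loss_vs_egpu:.0f}% slower than the eGPU")  # roughly 37%
```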
The performance gain for training across all AI-Benchmark tests was 34.8%. The test palette ranges from simple image classification tasks over image super-resolution networks to natural language and time-series problems. As you can see, the eGPU shows the biggest performance gains for image-related tasks, whereas it offers no great benefit for time-series tasks. Overall, I am really pleased with this performance boost.
I also did a real-world test using one of my own projects: I trained EfficientDet D0 512x512 for a two-class object detection task using the face mask dataset from Roboflow. I trained the model on both systems for 10k steps with a batch size of 12. The eGPU setup took 93 minutes to complete this task, while the dedicated laptop GPU needed 124 minutes, i.e. 25% less time spent on training.
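To time both cards fairly on one machine, each run should only see one physical GPU. A minimal sketch of how this can be done in TensorFlow 2.x; that index 0 is the eGPU is an assumption, so check the device list on your own system first:

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print("Available GPUs:", gpus)

if len(gpus) > 1:
    # e.g. gpus[0] = RTX 2080 Ti (eGPU), gpus[1] = Quadro RTX 5000 (internal);
    # must be called before any tensors are placed on a device.
    tf.config.set_visible_devices(gpus[0], 'GPU')
    print("Training restricted to:", gpus[0].name)
```

Whatever training entry point you use afterwards (a Keras `fit` loop or the Object Detection API's training script) will then only see the selected card.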
The performance gain during inference is 29.8% on average, ranging from a performance loss of 9.6% all the way up to a top boost of 50.2%. Interestingly, the eGPU struggles the most with the super-resolution task. Again, the time-series models show no big improvement compared to the image tasks.
If you want to stick with your portable notebook, getting your hands on an external GPU can definitely speed up your AI game. Computer vision tasks in particular will benefit a lot. It will not only save you precious time but also keep your machine healthy for much longer. Keep in mind that I tested the eGPU setup against one of the fastest mobile chips out there, so you might benefit even more!