The Multi-GPU-SET Idea

Hello, I am a Ph.D. student at the Silesian University of Technology in Poland, and I would like to start a general discussion about an idea for improving classification on a video stream. I call it “Multi-GPU & Multi-SET” or, in short, the “Multi-GPU-SET” idea. People already use Multi-GPU setups with “Syncing” for Convolutional Neural Networks. But, as far as I know, nobody has tried Multi-GPU for classification.

So what is it about? Well, it is about Multi-GPU detection and “Combining.” Imagine several You Only Look Once (YOLO2 or YOLO3) models. For example, one is trained only to detect cars and people, another on the COCO set, another on the VOC set, and so on. Once the detection “specialization” is separated across these AI/ML models, why not run each of them on its own GPU and combine the results? Many researchers work on making a single CNN handle as many classes as possible, but has someone tried this Multi-GPU approach? I think the answer is no. So, can you discuss this with me here?
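To make the “Combining” step more concrete, here is a minimal Python sketch of one way it could work. Everything in it is an assumption for illustration: the two stub detectors stand in for specialized YOLO models, threads stand in for GPUs, and the merge rule (keep the higher-scoring box when two models report the same label on overlapping boxes) is just one possible choice, not something taken from darknet.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple   # (x1, y1, x2, y2) in pixels
    source: str  # name of the specialized model that produced it


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def combine(per_model, iou_threshold=0.5):
    """Merge detections from all models, dropping same-label overlaps."""
    candidates = sorted(
        (d for dets in per_model for d in dets),
        key=lambda d: d.score,
        reverse=True,
    )
    kept = []
    for det in candidates:
        duplicate = any(
            k.label == det.label and iou(k.box, det.box) >= iou_threshold
            for k in kept
        )
        if not duplicate:
            kept.append(det)
    return kept


def detect_on_all(frame, detectors):
    """Run every detector on the same frame in parallel, then combine.

    Hypothetical: in a real setup each detector would be bound to its own GPU.
    """
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = [pool.submit(detector, frame) for detector in detectors]
        return combine([f.result() for f in futures])


# Stub detectors with made-up outputs, standing in for trained models.
def cars_and_people_model(frame):
    return [
        Detection("car", 0.90, (10, 10, 110, 60), "cars+people"),
        Detection("person", 0.80, (200, 40, 240, 160), "cars+people"),
    ]


def coco_model(frame):
    return [
        Detection("car", 0.70, (12, 12, 112, 62), "coco"),
        Detection("dog", 0.60, (300, 100, 360, 150), "coco"),
    ]
```

Calling `detect_on_all(frame, [cars_and_people_model, coco_model])` returns one merged list in which the car seen by both models appears only once. Whether to keep the higher score, average the boxes, or trust the more specialized model is exactly the kind of design question I would like to discuss.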

Here is a diagram of the idea:
multi-gpu-set-idea

I really wanted to test it on an Apple Mac Pro, but I do not have the budget for one, so I did the build described in “Ph.D. Hanna (Hackintosh) is Ready” (iblog.isowa.io). It runs Multi-GPU with my darknet fork on GitHub, sowson/darknet (Convolutional Neural Networks on OpenCL on Intel & NVidia & AMD & Mali GPUs for macOS & GNU/Linux), as shown here.

Hanna-PC-GPU-Monitor

I have a post with an embedded video (sorry, it is very long, 55 minutes) here: “Ph.D. Progress from May 27th, 2020 Update Keynote” (iblog.isowa.io). Btw, I improved the clBLAS library to calculate GEMMs in Multi-GPU mode here: sowson/clBLAS on GitHub (a software library containing BLAS functions written in OpenCL), for https://github.com/sowson/darknet. The Pull Request is waiting for approval: “Multi-GPU for GEMM for macOS and GNU/Linux” by sowson, Pull Request #350, clMathLibraries/clBLAS on GitHub.
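For readers who have not seen a GEMM split across devices, here is a small pure-Python sketch of the general idea. It is an illustration under my own assumptions and not necessarily how the pull request does it: the rows of A are divided into one block per device, each block is multiplied by the full B independently, and the partial results are stacked back together. The function names are hypothetical, and threads stand in for GPUs.

```python
from concurrent.futures import ThreadPoolExecutor


def gemm(a, b):
    """Plain C = A * B on nested lists; stands in for one device's GEMM call."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def split_rows(n_rows, n_devices):
    """Return (start, stop) row ranges, one per device, as even as possible."""
    base, extra = divmod(n_rows, n_devices)
    ranges, start = [], 0
    for i in range(n_devices):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def multi_device_gemm(a, b, n_devices=2):
    """Compute A * B by giving each (simulated) device one row block of A."""
    ranges = split_rows(len(a), n_devices)
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        # pool.map preserves input order, so the blocks come back in row order.
        blocks = list(pool.map(lambda r: gemm(a[r[0]:r[1]], b), ranges))
    return [row for block in blocks for row in block]
```

Because each row block of the result depends only on the matching rows of A and on all of B, the devices do not need to exchange data while computing; only B has to be available on every device.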

Thanks to that, I was able to run it on 2 x Radeon VII under macOS Catalina, for research purposes only. There is one more thing… the GPUs should not be connected with AMD Infinity Fabric Link, because it combines the GPUs’ memory, and then the OpenCL model of Context of Multi-GPUs => Multi-Queues => Multi-Kernels simply does not work.

So let’s discuss ;-). Btw, it is coming soon. Keep in mind I have only 2 GPUs, not 4 ;-).

Thanks!
