
Configuring a vGPU (Virtual GPU) Virtual Machine on Kolla-ansible OpenStack

by Miners1205 2023. 12. 20.


This post sets up the OpenStack VM environment needed to use a vGPU.

  • OpenStack vGPU VM environment setup

 

1. Virtual Machine Environment

### Base OS environment ###

o Rocky Linux 8.6, kernel 4.18.0-372.13.1.el8_6.x86_64


### Check the kernel version ###
[root@vgputestvm01]# rpm -qa |grep kernel
kernel-core-4.18.0-372.13.1.el8_6.x86_64
kernel-headers-4.18.0-372.13.1.el8_6.x86_64
kernel-modules-4.18.0-372.13.1.el8_6.x86_64
kernel-tools-4.18.0-372.13.1.el8_6.x86_64
kernel-tools-libs-4.18.0-372.13.1.el8_6.x86_64
kernel-devel-4.18.0-372.13.1.el8_6.x86_64
kernel-4.18.0-372.13.1.el8_6.x86_64

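-> If the package versions do not match the running kernel (in particular kernel-devel and kernel-headers), bring them to the same version before continuing.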
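
The version comparison above can be scripted. The following is a minimal sketch, not part of the original procedure: `check_kernel_pkgs` is a hypothetical helper name, and it assumes package names arrive on stdin in the form printed by `rpm -qa`. Unrelated packages whose names merely contain "kernel" would also be flagged, so read the output with that in mind.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flag kernel packages whose version differs from a
# given kernel release. Package names are read from stdin, one per line.
check_kernel_pkgs() {
    local release="$1" pkg status=0
    while IFS= read -r pkg; do
        [ -z "$pkg" ] && continue
        case "$pkg" in
            # The quoted release string is matched literally as a suffix.
            *-"$release") echo "OK       $pkg" ;;
            *)            echo "MISMATCH $pkg"; status=1 ;;
        esac
    done
    return "$status"
}

# Usage on the VM (assumes rpm is available):
#   rpm -qa | grep '^kernel' | check_kernel_pkgs "$(uname -r)"
```

The function returns non-zero if any line is a mismatch, so it can gate the driver installation step in a script.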

### GRUB settings (blacklist the nouveau driver) ###

[root@vgputestvm01]# cat /etc/default/grub
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/rl-swap rd.lvm.lv=rl/root rd.lvm.lv=rl/swap  rd.driver.blacklist=nouveau nouveau.modeset=0"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
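
Before regenerating the GRUB configuration, it can help to confirm that both nouveau options really are on the `GRUB_CMDLINE_LINUX` line. This is a small sketch under assumptions: `has_nouveau_blacklist` is a hypothetical name, and it only inspects the defaults file, not the kernel command line that is actually booted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if GRUB_CMDLINE_LINUX in the given file
# contains both nouveau-related options used in this post.
has_nouveau_blacklist() {
    local file="${1:-/etc/default/grub}" line
    line=$(grep '^GRUB_CMDLINE_LINUX=' "$file") || return 1
    case "$line" in
        *rd.driver.blacklist=nouveau*) ;;
        *) return 1 ;;
    esac
    case "$line" in
        *nouveau.modeset=0*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   has_nouveau_blacklist /etc/default/grub && echo "nouveau options present"
```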


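### Apply the GRUB changes ###

[root@vgputestvm01]# grub2-mkconfig -o /boot/efi/EFI/rocky/grub.cfg
[root@vgputestvm01]# grub2-mkconfig -o /boot/grub2/grub.cfg
[root@vgputestvm01]# dracut -v -f /boot/initramfs-$(uname -r).img $(uname -r)
→ -f is needed because dracut will not overwrite an existing initramfs image without it.
→ Reboot the VM afterwards so the new kernel command line and initramfs take effect.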
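
Only one of the two grub.cfg locations is used at boot, depending on whether the VM boots via UEFI or legacy BIOS. The sketch below picks the path from the presence of `/sys/firmware/efi`. The function name `grub_cfg_path` is hypothetical, and the two paths are the usual Rocky Linux 8 locations, which is an assumption worth checking on your image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the grub.cfg path that matches the firmware type.
# The EFI marker directory is a parameter so the function can be exercised
# without real firmware; it defaults to /sys/firmware/efi.
grub_cfg_path() {
    local efi_marker="${1:-/sys/firmware/efi}"
    if [ -d "$efi_marker" ]; then
        echo /boot/efi/EFI/rocky/grub.cfg
    else
        echo /boot/grub2/grub.cfg
    fi
}

# Usage (as root):
#   grub2-mkconfig -o "$(grub_cfg_path)"
```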

2. NVIDIA Driver Installation

### Pre-install the packages needed as dependencies ###

[root@vgputestvm01 ]# dnf install gcc* make pciutils -y

[root@vgputestvm01 ]# lspci -nn |grep NVIDIA
00:05.0 VGA compatible controller [0300]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)


### The rpm package versions must match the kernel version ###
[root@vgputestvm01 ]# rpm -ivh kernel-devel-4.18.0-372.13.1.el8_6.x86_64.rpm

[root@vgputestvm01 ]# rpm -ivh kernel-headers-4.18.0-372.13.1.el8_6.x86_64.rpm
→ Alternatively, use rpm -Uvh kernel-headers-4.18.0-372.13.1.el8_6.x86_64.rpm to replace a version that is already installed.


[root@vgputestvm01 ]# rpm -ivh dkms-3.0.12-1.el8.noarch.rpm
→ If dkms reports missing dependencies, install the following as well:
→ rpm -ivh elfutils-libelf-devel-0.189-3.el8.x86_64.rpm
→ rpm -ivh elfutils-libelf-0.189-3.el8.x86_64.rpm

[root@vgputestvm01]# chmod +x NVIDIA-Linux-x86_64-470.161.03-grid.run

[root@vgputestvm01]# ./NVIDIA-Linux-x86_64-470.161.03-grid.run
→ If the installer reports missing packages, install them and run it again.


### Verify the installation ###

[root@vgputestvm01 ]# lsmod |grep nvidia
nvidia_drm 65536 0
nvidia_modeset 1200128 1 nvidia_drm
nvidia 35454976 1 nvidia_modeset
drm_kms_helper 266240 4 cirrus,nvidia_drm
drm 585728 5 drm_kms_helper,nvidia,cirrus,nvidia_drm

[root@vgputestvm01 ]# nvidia-smi
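
The `lsmod` check above can be turned into a pass/fail test. This is a minimal sketch: `check_nvidia_modules` is a hypothetical name, and the list of expected modules is simply the three shown in the output above, so adjust it if your driver version loads a different set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsmod` output on stdin and confirm that the three
# NVIDIA modules shown in this post are loaded and that nouveau is absent.
check_nvidia_modules() {
    local loaded mod
    # Keep only the module-name column.
    loaded=$(awk '{print $1}')
    for mod in nvidia nvidia_modeset nvidia_drm; do
        if ! printf '%s\n' "$loaded" | grep -qx "$mod"; then
            echo "missing module: $mod"
            return 1
        fi
    done
    if printf '%s\n' "$loaded" | grep -qx nouveau; then
        echo "nouveau is still loaded"
        return 1
    fi
    echo "NVIDIA modules loaded"
}

# Usage:
#   lsmod | check_nvidia_modules
```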

3. NVIDIA Driver License Recognition

[root@vgputestvm01]# ll /etc/nvidia/
ClientConfigToken
gridd.conf
license
nvidia-topologyd.conf.template


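### Licensing has recently changed: the license is now recognized through a token file rather than configuration values ###
→ If the file is present as gridd.conf.template, renaming it to gridd.conf is optional.

### Download the NVIDIA license token and place it in the folder below ###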
[root@vgputestvm01 nvidia]# cd ClientConfigToken/
[root@vgputestvm01 nvidia]# ls
client_configuration_token_12-04-2023-18-04-25.to
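
Copying the token into place can be scripted as below. This is a sketch under assumptions: `install_token` is a hypothetical helper, and the default file mode 744 is my assumption about what the licensing service expects, so confirm it against NVIDIA's licensing documentation for your release before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a client configuration token into the token
# directory. The default mode (744) is an assumption, not taken from this post.
install_token() {
    local src="$1"
    local dest_dir="${2:-/etc/nvidia/ClientConfigToken}"
    local mode="${3:-744}"
    [ -f "$src" ] || { echo "token not found: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    # install(1) copies the file and sets its mode in one step.
    install -m "$mode" "$src" "$dest_dir/" || return 1
    ls -l "$dest_dir"
}

# Usage (as root, with the token file downloaded from your license server):
#   install_token ./<your-token-file>
```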


### Restart the vGPU licensing service and confirm the license is recognized ###
[root@vgputestvm01 ]# systemctl restart nvidia-gridd.service

[root@vgputestvm01 nvidia]# nvidia-smi -q
==============NVSMI LOG==============
Timestamp : Tue Dec 12 02:16:37 2023
Driver Version : 470.161.03
CUDA Version : 11.4
Attached GPUs : 1
GPU 00000000:00:05.0
Product Name : GRID T4-1B
Product Brand : NVIDIA Virtual PC
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
..................................................................................................................
GPU UUID : GPU-UUID
..................................................................................................................
GPU Virtualization Mode
Virtualization Mode : VGPU
Host VGPU Mode : N/A
vGPU Software Licensed Product
Product Name : NVIDIA Virtual PC
License Status : Licensed (Expiry: 2023-12-13 0:28:35 GMT)    -> Confirm that this field now reads "Licensed".
..................................................................................................................
The license status is confirmed as normal.
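
Rather than reading the full `nvidia-smi -q` output by eye, the license line can be extracted and tested. A minimal sketch follows: `license_ok` is a hypothetical name, and it assumes the field is labelled `License Status` as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `nvidia-smi -q` output on stdin and succeed only
# if the License Status field begins with "Licensed".
license_ok() {
    local status
    # Print the text after "License Status :" for the first matching line.
    status=$(sed -n 's/^.*License Status[[:space:]]*:[[:space:]]*//p' | head -n 1)
    echo "License Status: ${status:-not found}"
    case "$status" in
        Licensed*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Usage:
#   nvidia-smi -q | license_ok
```

Because the function returns non-zero for anything other than a value starting with "Licensed", it can be used in a retry loop after restarting nvidia-gridd.service, since the license may take a short while to be acquired.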

 

o vGPU references
1) NVIDIA vGPU user guide : https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html
2) OpenStack Nova virtual GPU guide : https://docs.openstack.org/nova/queens/admin/virtual-gpu.html
3) NVIDIA graphics cards for virtualization : https://www.nvidia.com/en-us/data-center/graphics-cards-for-virtualization/
4) Additional reference (nvidia-smi) : https://cloud-atlas.readthedocs.io/zh-cn/latest/machine_learning/hardware/nvidia_gpu/nvidia-smi.html
