    Configuring a vGPU (Virtual GPU) Virtual Machine on Kolla-Ansible OpenStack


    This post walks through preparing an OpenStack VM so that it can use a vGPU.

    • OpenStack vGPU VM environment setup

     

    1. Virtual Machine environment

    ### Basic OS environment ###

    o Rocky Linux 8.6, kernel 4.18.0-372.13.1.el8_6.x86_64


    ### Check the kernel version ###
    [root@vgputestvm01]# rpm -qa |grep kernel
    kernel-core-4.18.0-372.13.1.el8_6.x86_64
    kernel-headers-4.18.0-372.13.1.el8_6.x86_64
    kernel-modules-4.18.0-372.13.1.el8_6.x86_64
    kernel-tools-4.18.0-372.13.1.el8_6.x86_64
    kernel-tools-libs-4.18.0-372.13.1.el8_6.x86_64
    kernel-devel-4.18.0-372.13.1.el8_6.x86_64
    kernel-4.18.0-372.13.1.el8_6.x86_64

    -> If the kernel package versions differ from one another, install matching versions before continuing.
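
The version check above can be scripted. The sketch below is a hypothetical helper (missing_kernel_pkgs is not part of any package); it assumes the five package names listed in it are the ones that matter, and it reads an `rpm -qa` listing on stdin so it can be tried without touching the system.

```shell
# Hypothetical helper: print every kernel package that is NOT installed at
# the given release. Empty output means everything matches.
# Assumption: these five package names are the ones the driver build needs.
missing_kernel_pkgs() {
    release="$1"
    listing=$(cat)    # the `rpm -qa` output, one package per line
    for name in kernel kernel-core kernel-modules kernel-devel kernel-headers; do
        # -x: whole-line match, -F: treat the name as a fixed string
        if ! printf '%s\n' "$listing" | grep -q -x -F -- "$name-$release"; then
            printf '%s\n' "$name-$release"
        fi
    done
}

# Typical use on the VM:
#   rpm -qa | missing_kernel_pkgs "$(uname -r)"
```

Any line printed names a package that still has to be installed at the running kernel's version.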

    ### GRUB settings ###

    [root@vgputestvm01]# cat /etc/default/grub
    GRUB_TIMEOUT=1
    GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
    GRUB_DEFAULT=saved
    GRUB_DISABLE_SUBMENU=true
    GRUB_TERMINAL_OUTPUT="console"
    GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/rl-swap rd.lvm.lv=rl/root rd.lvm.lv=rl/swap  rd.driver.blacklist=nouveau nouveau.modeset=0"
    GRUB_DISABLE_RECOVERY="true"
    GRUB_ENABLE_BLSCFG=true
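
The two parameters that matter in the file above are rd.driver.blacklist=nouveau and nouveau.modeset=0, which keep the nouveau driver from claiming the GPU. A minimal sketch of a check for them follows; the function name nouveau_disabled_in_grub is hypothetical, and it assumes GRUB_CMDLINE_LINUX sits on a single line as it does here.

```shell
# Hypothetical helper: succeed only if GRUB_CMDLINE_LINUX in the given file
# carries both nouveau-disabling parameters used in this post.
nouveau_disabled_in_grub() {
    file="${1:-/etc/default/grub}"
    # Assumption: the variable is defined on one line.
    cmdline=$(grep '^GRUB_CMDLINE_LINUX=' "$file") || return 1
    case "$cmdline" in
        *rd.driver.blacklist=nouveau*) ;;
        *) return 1 ;;
    esac
    case "$cmdline" in
        *nouveau.modeset=0*) ;;
        *) return 1 ;;
    esac
    return 0
}

# Typical use on the VM:
#   nouveau_disabled_in_grub /etc/default/grub && echo "nouveau disabled"
```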


    ### Apply the GRUB changes ###

    [root@vgputestvm01]# grub2-mkconfig -o /boot/efi/EFI/rocky/grub.cfg
    [root@vgputestvm01]# grub2-mkconfig -o /boot/grub2/grub.cfg
    [root@vgputestvm01]# dracut -f -v /boot/initramfs-$(uname -r).img $(uname -r)
    → Reboot the VM so that the new kernel parameters and initramfs take effect.
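
Only one of the two GRUB config files is actually read at boot, depending on whether the VM boots with UEFI or BIOS. A small sketch of choosing between them is shown below. The function name grub_cfg_path and its optional root-prefix argument are hypothetical (the prefix exists only so the function can be tried against a scratch directory), and it assumes the usual Rocky 8 locations for the two files.

```shell
# Hypothetical helper: print the grub.cfg path that matches the boot mode.
# /sys/firmware/efi exists only when the system was booted via UEFI.
# Assumption: standard Rocky 8 file locations.
grub_cfg_path() {
    sysroot="${1:-}"    # optional prefix, for trying this outside a real VM
    if [ -d "$sysroot/sys/firmware/efi" ]; then
        printf '%s\n' /boot/efi/EFI/rocky/grub.cfg
    else
        printf '%s\n' /boot/grub2/grub.cfg
    fi
}

# Typical use on the VM:
#   grub2-mkconfig -o "$(grub_cfg_path)"
```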

    2. NVIDIA Driver installation

    ### Install the packages the driver build depends on ###

    [root@vgputestvm01 ]# dnf install gcc* make pciutils -y

    [root@vgputestvm01 ]# lspci -nn |grep NVIDIA
    00:05.0 VGA compatible controller [0300]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
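
The bracketed pair at the end of the lspci line is the PCI vendor:device ID; 10de is NVIDIA's vendor ID. As a rough sketch, the ID can be pulled out of the `lspci -nn` output like this (nvidia_pci_ids is a hypothetical name, and one ID per line is assumed):

```shell
# Hypothetical helper: read `lspci -nn` output on stdin and print the
# vendor:device ID of every NVIDIA device (vendor ID 10de).
nvidia_pci_ids() {
    sed -n 's/.*\[\(10de:[0-9a-f]\{4\}\)\].*/\1/p'
}

# Typical use on the VM:
#   lspci -nn | nvidia_pci_ids
```

No output means the guest does not see an NVIDIA device at all, in which case the problem is on the host or flavor side rather than in the guest driver.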


    ### The rpm package versions must match the kernel version.
    [root@vgputestvm01 ]# rpm -ivh kernel-devel-4.18.0-372.13.1.el8_6.x86_64.rpm

    [root@vgputestvm01 ]# rpm -ivh kernel-headers-4.18.0-372.13.1.el8_6.x86_64.rpm
    → If a different kernel-headers version is already installed, upgrade instead: rpm -Uvh kernel-headers-4.18.0-372.13.1.el8_6.x86_64.rpm


    [root@vgputestvm01 ]# rpm -ivh dkms-3.0.12-1.el8.noarch.rpm
    → If dkms reports missing dependencies, install the following as well:
    → rpm -ivh elfutils-libelf-devel-0.189-3.el8.x86_64.rpm
    → rpm -ivh elfutils-libelf-0.189-3.el8.x86_64.rpm

    [root@vgputestvm01]# chmod +x NVIDIA-Linux-x86_64-470.161.03-grid.run

    [root@vgputestvm01]# ./NVIDIA-Linux-x86_64-470.161.03-grid.run
    → If the installer asks for additional packages, install them and run it again.


    ### Verify the installation

    [root@vgputestvm01 ]# lsmod |grep nvidia
    nvidia_drm 65536 0
    nvidia_modeset 1200128 1 nvidia_drm
    nvidia 35454976 1 nvidia_modeset
    drm_kms_helper 266240 4 cirrus,nvidia_drm
    drm 585728 5 drm_kms_helper,nvidia,cirrus,nvidia_drm

    [root@vgputestvm01 ]# nvidia-smi
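
The lsmod check can be made explicit. The sketch below assumes that the three modules shown in the output above (nvidia, nvidia_modeset, nvidia_drm) are the ones expected on this guest; missing_nvidia_modules is a hypothetical name.

```shell
# Hypothetical helper: read `lsmod` output on stdin and print each expected
# NVIDIA module that is not loaded. Empty output means all are present.
# Assumption: these three modules are the expected set for this driver.
missing_nvidia_modules() {
    loaded=$(awk '{print $1}')    # first column is the module name
    for mod in nvidia nvidia_modeset nvidia_drm; do
        printf '%s\n' "$loaded" | grep -q -x -F -- "$mod" || printf '%s\n' "$mod"
    done
}

# Typical use on the VM:
#   lsmod | missing_nvidia_modules
```

Unlike `lsmod | grep nvidia`, this matches only the module-name column, so a line such as drm_kms_helper (which merely lists nvidia_drm as a user) does not count.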

    3. NVIDIA Driver recognition

    [root@vgputestvm01]# ll /etc/nvidia/
    ClientConfigToken
    gridd.conf
    license
    nvidia-topologyd.conf.template


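    ### Licensing has recently changed from configuration values to a token
    → If the file is present as gridd.conf.template, renaming it to gridd.conf is optional.

    ### Download the NVIDIA license token and place it in the folder below.
    [root@vgputestvm01 nvidia]# cd ClientConfigToken/
    [root@vgputestvm01 ClientConfigToken]# ls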
    client_configuration_token_12-04-2023-18-04-25.to
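
Before restarting the service it can help to confirm that a token file really is in place. This is a minimal sketch; token_count is a hypothetical name, and it assumes token files keep the client_configuration_token_ prefix seen in the listing above.

```shell
# Hypothetical helper: count the client configuration token files in a
# directory (default: /etc/nvidia/ClientConfigToken).
# Assumption: token files keep the client_configuration_token_ prefix.
token_count() {
    dir="${1:-/etc/nvidia/ClientConfigToken}"
    count=0
    for f in "$dir"/client_configuration_token_*; do
        # When nothing matches, the pattern itself is left in $f and fails -f.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}

# Typical use on the VM:
#   token_count /etc/nvidia/ClientConfigToken
```

A result of 0 means the token was not copied to the expected location.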


    ### Restart the vGPU licensing service so the token is picked up, then verify
    [root@vgputestvm01 ]# systemctl restart nvidia-gridd.service

    [root@vgputestvm01 nvidia]# nvidia-smi -q
    ==============NVSMI LOG==============
    Timestamp : Tue Dec 12 02:16:37 2023
    Driver Version : 470.161.03
    CUDA Version : 11.4
    Attached GPUs : 1
    GPU 00000000:00:05.0
    Product Name : GRID T4-1B
    Product Brand : NVIDIA Virtual PC
    Display Mode : Enabled
    Display Active : Disabled
    Persistence Mode : Enabled
    ..................................................................................................................
    GPU UUID : GPU-UUID
    ..................................................................................................................
    GPU Virtualization Mode
    Virtualization Mode : VGPU
    Host VGPU Mode : N/A
    vGPU Software Licensed Product
    Product Name : NVIDIA Virtual PC
    License Status : Licensed (Expiry: 2023-12-13 0:28:35 GMT)    -> Confirm that this field now reads Licensed.
    ..................................................................................................................
    The license status is confirmed as normal.
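
The License Status line is the one field worth checking automatically. The sketch below reads `nvidia-smi -q` output on stdin; the function names license_status and is_licensed are hypothetical, and the field label is assumed to be exactly "License Status" as in the log above.

```shell
# Hypothetical helper: print the value of the first "License Status" field
# found in `nvidia-smi -q` output on stdin.
license_status() {
    sed -n 's/^[[:space:]]*License Status[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: succeed only if that value starts with "Licensed".
# A value such as "Unlicensed ..." does not match.
is_licensed() {
    case "$(license_status)" in
        Licensed*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use on the VM:
#   nvidia-smi -q | is_licensed && echo "vGPU license OK"
```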

     

    o vGPU reference sites
    1) NVIDIA vGPU user guide : https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html
    2) OpenStack Nova vGPU : https://docs.openstack.org/nova/queens/admin/virtual-gpu.html
    3) https://www.nvidia.com/en-us/data-center/graphics-cards-for-virtualization/
    4) Additional reference (nvidia-smi) : https://cloud-atlas.readthedocs.io/zh-cn/latest/machine_learning/hardware/nvidia_gpu/nvidia-smi.html
