怎么和别人和谐共处的使用服务器上的GPU

师兄协调了一台服务器，给了我个Ubuntu账户，我应该怎么用而不打扰别人？

怎么只有一个＄符号？

那是因为你正在用.sh模式，只要在终端输入bash，你所热爱的正是你的生活。

环境变量

请只修改你目录下的东西，因为你没有sudo权限不太可能会动到/etc/..底下的东西，但是千万记住，不要直接修改 /etc/profile、/etc/bashrc 或其他全局的环境变量配置文件，因为这会影响所有用户。你只能碰/home/username/..里面的profile、bashrc。请注意，安装Anaconda没事，CUDA小心一点，有点麻烦。

CUDA

这里同样你只能用师兄们给你装好的圣遗物请创建文件touch switch-cuda.sh 通过source switch-cuda.sh看可用的cuda，通过source switch-cuda.sh [VERSION]切换，例如：

(base) user@root:~$ source switch-cuda.sh 
The following CUDA installations have been found (in '/usr/local'):
* cuda-11.0
* cuda-12.4

(base) user@root:~$ source switch-cuda.sh 12.4
Switched to CUDA 12.4

文件内容（用Vim编辑）：

#!/usr/bin/env bash

set -e
# ensure that the script has been sourced rather than just executed
if [[ "${BASH_SOURCE[0]}" = "${0}" ]]; then
    echo "Please use 'source' to execute switch-cuda.sh!"
    exit 1
fi

INSTALL_FOLDER="/usr/local"  # the location to look for CUDA installations at
TARGET_VERSION=${1}          # the target CUDA version to switch to (if provided)

# if no version to switch to has been provided, then just print all available CUDA installations
if [[ -z ${TARGET_VERSION} ]]; then
    echo "The following CUDA installations have been found (in '${INSTALL_FOLDER}'):"
    ls -l "${INSTALL_FOLDER}" | egrep -o "cuda-[0-9]+\\.[0-9]+$" | while read -r line; do
        echo "* ${line}"
    done
    set +e
    return
# otherwise, check whether there is an installation of the requested CUDA version
elif [[ ! -d "${INSTALL_FOLDER}/cuda-${TARGET_VERSION}" ]]; then
    echo "No installation of CUDA ${TARGET_VERSION} has been found!"
    set +e
    return
fi

# the path of the installation to use
cuda_path="${INSTALL_FOLDER}/cuda-${TARGET_VERSION}"

# filter out those CUDA entries from the PATH that are not needed anymore
path_elements=(${PATH//:/ })
new_path="${cuda_path}/bin"
for p in "${path_elements[@]}"; do
    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
        new_path="${new_path}:${p}"
    fi
done

# filter out those CUDA entries from the LD_LIBRARY_PATH that are not needed anymore
ld_path_elements=(${LD_LIBRARY_PATH//:/ })
new_ld_path="${cuda_path}/lib64:${cuda_path}/extras/CUPTI/lib64"
for p in "${ld_path_elements[@]}"; do
    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
        new_ld_path="${new_ld_path}:${p}"
    fi
done

# update environment variables
export CUDA_HOME="${cuda_path}"
export CUDA_ROOT="${cuda_path}"
export LD_LIBRARY_PATH="${new_ld_path}"
export PATH="${new_path}"

echo "Switched to CUDA ${TARGET_VERSION}."

set +e
return

Conda

Anaconda 和 Miniconda 正常安装即可，包括换源也是没问题的。目前Miniconda挺爽的，小小的也很可爱！ Miniconda

# 创建文件夹
mkdir miniconda3
cd miniconda3/
# 下载miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# 运行安装
bash Miniconda3-latest-Linux-x86_64.sh
# 让你的bash前面出现当前环境，例如：(base)
eval "$(/home/lvzhiwei/miniconda3/bin/conda shell.bash hook)"
# 让conda在你的环境变量里面打上烙印，不用担心，不会修改全局变量
conda init

APT？

这个时候问题来了，APT没换源啊？求你的师兄去给你换源，没有sudo只能临时指定。 holyshit，原来大多数情况下你没有sudo你根本没法apt install，因为要写你没权限的文件

喂？谁在用显卡

这里介绍一些监看命令，帮你搞清楚谁在用显卡，并检验你能不能得罪得起kill掉他们的进程 nvidia-smi太基础了，这里介绍一种持续刷新的办法watch ~~视监你的师兄们~~

nvidia-smi

# 每隔2秒刷新一次，每次只在固定位置刷新
watch -n 2 -d nvidia-smi

根据上面`nvidia-smi`查到的PID，查是谁的进程（PS命令）

# 查询是谁的进程以及开始时间
ps -p [PID] -o user=,lstart=

例如：

(base) you@your_lab:~$ ps -p 2662726 -o user=,lstart=
DaShiXiong+ Sun Jan  5 15:25:48 2025

我写了个程序，可以一键查询哪些人在用哪张显卡： touch check.sh、./check.sh

#!/bin/bash

# 获取 nvidia-smi 输出中的 PID 列（排除表头）
pids=$(nvidia-smi --query-compute-apps=pid --format=csv,noheader)

# 检查是否有进程在 GPU 上运行
if [ -z "$pids" ]; then
    echo "No processes are running on the GPU."
    exit 1
fi

# 输出表头
printf "%-10s %-20s %-25s %-5s\n" "PID" "USER" "CREATED" "GPU"

# 遍历每个 PID，查找进程的用户、GPU 和显存使用情况
while read pid; do
    # 获取进程所属的用户
    user=$(ps -p $pid -o user=)
    
    # 获取进程创建时间
    created=$(ps -p $pid -o lstart=)
    
    # 获取该进程正在使用的 GPU
    gpu=$(nvidia-smi pmon -c 1 | grep $pid | awk '{print $1}')
    
    # 使用 printf 格式化输出，确保对齐
    printf "%-10s %-20s %-25s %-5s\n" "$pid" "$user" "$created" "$gpu"
done <<< "$pids"

Vim

修改文件必备

vim ~/.vnc/xstartup

i编辑、 esc退出、 :wq保存