# GPU Server User Guide

## Getting Server Information

{% embed url="<https://nvidia-smi.github.io/>" %}

![Example of server information](/files/-LimiPijcRMZ1HjfF-cb)

**Servers that use virtual containers:**

* **Dione**
* **Mimas**
* **Tethys**

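**All other machines share the host operating system directly. On those machines you only need the "Creating a User" step below and can skip the rest of this guide.**

## Creating a User

Log in to the host with the `addu` account, then enter a username and password when prompted.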

```bash
ssh addu@172.26.xxx.xxx
# Password for the addu account
addu@172.26.xxx.xxx's password:
=====Welcome!
We need to get sudo permission first. Enter the password for `addu`.
# Enter the addu password again to obtain sudo permission
[sudo] password for addu:
=====Let's setup a new account and create a container now.
# Enter a username; the user and its container are then created automatically
Enter your username: test
Creating user...
Allocating container for test...
Creating test
Allocating ssh port... 10020
Device sshproxy added to test
# Set the password for the new user
set password for test now (host only).
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Login this host via `ssh <username>@<host-ip>` to manage your container.
Done!
```

{% hint style="info" %}

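* Use the **full pinyin spelling of your name** as the username. If you need more than one account, use the form `<full pinyin><number>`, for example `zhangsan2`.
* **Keep a record of your new username, its password, and the server the account was created on.**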
  {% endhint %}
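
The naming rule above can be checked before you type a name at the `Enter your username:` prompt. The following is a minimal sketch and not part of the server tooling: `is_valid_username` is a hypothetical helper, and the pattern (lowercase ASCII letters followed by optional digits) is an assumption about what "full pinyin plus number" means.

```bash
# Hypothetical helper: succeed for names such as "zhangsan" or "zhangsan2".
# Assumption: full pinyin is lowercase ASCII letters, optionally followed by digits.
is_valid_username() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]+[0-9]*$'
}

# Report whether each candidate follows the convention.
for name in zhangsan zhangsan2 ZhangSan 2zhangsan; do
    if is_valid_username "$name"; then
        echo "$name: ok"
    else
        echo "$name: rejected"
    fi
done
```

Here `LC_ALL=C` keeps the `a-z` range limited to lowercase ASCII regardless of the locale on your workstation.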

## Managing Your Container

Log in to the host with your new account and follow the menu prompts to manage your container.

```bash
# Log in as the new user to manage the container
ssh test@172.26.xxx.xxx
test@172.26.xxx.xxx's password:
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-54-generic x86_64)
……
 Hi, test
 You're using the GPU Server in Vision Group.

==========About your container:
Your container is not running.
Transfer data to your container using scp or sftp;
File sharing is encouraged, access datasets at shared/datasets, access download files at shared/downloads, etc

See GPU load: nvidia-smi.
    memory usage: free -h.
    disk usage: df -h.

===== main menu  =====
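[1] start your container  # Boot the container
[2] enter your container  # Switch to a shell inside the container
[3] stop your container   # Shut down (or run `shutdown now` inside the container)
[4] change your password  # Change your password (to change the container password, enter the container and run `passwd`)
[5] allocate ports        # Set up port mappings
[6] release ports         # Release ports you allocated
[0] show info             # Show the container's running status
[x] exit                  # Leave the management menu
# Start the container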
Enter your choice: 1
========== Starting your container...

Press any key to continue...
```
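
The login banner recommends scp or sftp for moving data into the container, and account creation reports an allocated SSH port (10020 in the example). The sketch below assumes that this host port forwards to the SSH service inside your container and that you log in there as `root`. Neither point is confirmed by this guide, so ask the administrator if the connection is refused. `GPU_HOST`, `CONTAINER_PORT`, `container_ssh`, and `container_put` are hypothetical names; substitute your server's address and your own port.

```bash
# Assumptions (not stated in this guide): the port printed by
# "Allocating ssh port..." forwards to sshd inside the container,
# and the container login is root.
GPU_HOST="${GPU_HOST:-172.26.xxx.xxx}"      # placeholder; use your server's IP
CONTAINER_PORT="${CONTAINER_PORT:-10020}"   # the port printed for your account

# Run a command in the container (or open a shell if no command is given).
container_ssh() {
    ssh -p "$CONTAINER_PORT" "root@$GPU_HOST" "$@"
}

# Copy a local file or directory into the container.
# usage: container_put <local-path> <remote-path>
container_put() {
    scp -P "$CONTAINER_PORT" -r "$1" "root@$GPU_HOST:$2"
}
```

For example, `container_put ./my_dataset /root/data` would copy a local directory into the container. Note that `ssh` takes the port as lowercase `-p` while `scp` uses uppercase `-P`.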

## Using Your Container

Using the username and password from the previous step, log in to your own container.

```bash
# Check the GPU driver and the current GPU status
(base) root@test:~# nvidia-smi
Mon Jul  1 14:07:26 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:19:00.0 Off |                  N/A |
| 30%   41C    P0    67W / 250W |      0MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:1A:00.0 Off |                  N/A |
| 30%   51C    P0    61W / 250W |      0MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce RTX 208...  Off  | 00000000:67:00.0 Off |                  N/A |
| 31%   51C    P0    64W / 250W |      0MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce RTX 208...  Off  | 00000000:68:00.0 Off |                  N/A |
| 30%   51C    P0     1W / 250W |      0MiB / 10986MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
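
Because the GPUs are shared, it can help to pick an idle one before starting a job. The sketch below is one possible approach, not an official tool: `pick_idle_gpu` is a hypothetical helper that reads the CSV output of `nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits` and prints the index of the GPU with the least memory in use. The sample values are made up for illustration.

```bash
# Hypothetical helper: read "index, memory.used" CSV lines on stdin and
# print the index of the GPU with the least memory in use.
pick_idle_gpu() {
    sort -t, -k2,2n | head -n 1 | cut -d, -f1
}

# Sample input in the format produced by
#   nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits
# (illustrative values only). This prints 1.
printf '0, 10240\n1, 12\n2, 300\n3, 8000\n' | pick_idle_gpu
```

On the server you could then run `export CUDA_VISIBLE_DEVICES="$(nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits | pick_idle_gpu)"` before launching a job. Setting `CUDA_DEVICE_ORDER=PCI_BUS_ID` as well keeps CUDA's device numbering aligned with the indices that `nvidia-smi` reports.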

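## Summary

#### **User permissions:**

1. Every user may use all of the machine's compute resources, including all CPUs, all GPUs, and all memory.
2. Every user has full access to their own container and works as `root` by default.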

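#### **File sharing:**

1. To encourage file sharing, **only files under the shared directory `shared` are stored on the SSD**.
2. Put shared files in the appropriate location, for example datasets under `datasets`.
3. Do not delete files that other people have shared.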

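#### **Environment setup:**

1. **Do not install a GPU driver inside the container. If you need to reinstall CUDA, disable the driver installation during setup.**
2. conda and the common deep learning environments are already configured.
3. **If you need to install CUDA, prefer installing it through conda.**
4. **Contact the administrator if you need to migrate a container.**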
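
The two CUDA rules above can be sketched as follows. This is a hedged example, not the administrators' procedure. `make_cuda_env` and `install_toolkit_only` are hypothetical helpers. The first relies on conda's `cudatoolkit` package, which provides user-space CUDA libraries and leaves the host driver alone. The second assumes you are using NVIDIA's runfile installer, whose `--silent --toolkit` options install only the toolkit.

```bash
# Hypothetical helper: create a conda environment with its own CUDA runtime.
# The cudatoolkit package contains only user-space libraries, so the GPU
# driver is not modified.
# usage: make_cuda_env <env-name> <cuda-version>
make_cuda_env() {
    conda create -y -n "$1" "cudatoolkit=$2"
}

# Hypothetical helper for the CUDA runfile installer: --toolkit selects only
# the toolkit and --silent skips the prompts, so the bundled driver is
# not installed.
# usage: install_toolkit_only <path-to-cuda-runfile>
install_toolkit_only() {
    sh "$1" --silent --toolkit
}
```

For example, `make_cuda_env demo 10.0` would create an environment named `demo`; the version 10.0 is an assumption chosen to match the 410.48 driver shown in the `nvidia-smi` output, so adjust it to the driver on your server.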

Let's look after our shared "alchemy furnace" together. Happy training!

