nvidia-smi

NAME

nvidia-smi - NVIDIA System Management Interface program
SYNOPSIS

nvidia-smi [OPTION1 [ARG1]] [OPTION2 [ARG2]] ...

    -h, --help                 Print usage information and exit

  SUMMARY OPTIONS

    -L, --list-gpus            Display a list of available GPUs

  QUERY OPTIONS

    -q, --query                Display GPU or Unit info

    [plus any of]

    -u, --unit                 Show unit, rather than GPU, attributes
    -i, --id                   Target a specific GPU or Unit
    -f, --filename             Log to a specified file
    -x, --xml-format           Produce XML output
    -d, --display              Display only selected information: MEMORY,
                               UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK,
                               COMPUTE. Flags can be combined with comma
                               e.g. "MEMORY,ECC". Doesn't work with
                               -u/--unit or -x/--xml-format flags.
    -l, --loop                 Probe until Ctrl+C at specified interval,
                               in seconds

  DEVICE MODIFICATION OPTIONS

    [any one of]

    -pm, --persistence-mode    Set persistence mode: 0|DISABLED, 1|ENABLED
    -e, --ecc-config           Toggle ECC support: 0|DISABLED, 1|ENABLED
    -p, --reset-ecc-errors     Reset ECC error counts: 0|VOLATILE, 1|AGGREGATE
    -c, --compute-mode         Set MODE for compute applications: 0|DEFAULT,
                               1|EXCLUSIVE_THREAD, 2|PROHIBITED,
                               3|EXCLUSIVE_PROCESS

    [plus optional]

    -i, --id                   Target a specific GPU

  UNIT MODIFICATION OPTIONS

    -t, --toggle-led           Set Unit LED state: 0|GREEN, 1|AMBER

    [plus optional]

    -i, --id                   Target a specific Unit

DESCRIPTION
NVSMI provides diagnostic information for each of NVIDIA's Tesla devices and each of its Fermi-based Quadro devices. It provides very limited information for other types of NVIDIA devices. The data is presented in either plain text or XML format, via stdout or a file. NVSMI also provides several management operations for changing device state.

OPTIONS

-h, --help
    Print usage information and exit.

  SUMMARY OPTIONS

-L, --list-gpus
    List each of the NVIDIA GPUs in the system, along with their serial numbers or UUIDs. Tesla and Quadro GPUs from the Fermi family report serial numbers, which match the ids physically printed on each board. Non-Fermi Tesla products only support UUIDs, which are also unique but do not correspond to any identifier on the board. All other products report N/A.

  QUERY OPTIONS

-q, --query
    Display GPU or Unit info. Displayed info includes all data listed in the (GPU ATTRIBUTES) or (UNIT ATTRIBUTES) sections of this document. Some devices and/or environments don't support all possible information. Any unsupported data is indicated by "N/A" in the output. By default, information for all available GPUs or Units is displayed. Use the -i option to restrict the output to a single GPU or Unit.

-u, --unit
    Modify the -q option. Display Unit data instead of GPU data. Unit data is only available for NVIDIA S-class Tesla enclosures.

-i, --id=ID
    Modify the -q option. Display data for a single specified GPU or Unit. The specified id may be the GPU/Unit's 0-based index in the natural enumeration returned by the driver, the GPU's serial number, or the GPU's PCI bus ID (as domain:bus:device in hex). Users who need consistency should use either of the latter two options, since device enumeration ordering is not guaranteed to be consistent between reboots.

-f FILE, --filename=FILE
    Modify the -q option. Redirect query output to the specified file in place of the default stdout. The specified file will be overwritten.

-x, --xml-format
    Modify the -q option. Produce XML output in place of the default human-readable format. Both GPU and Unit query outputs conform to corresponding DTDs, which are available in the online documentation.

-d, --display
    Display only selected information: MEMORY, UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK, COMPUTE. Flags can be combined with a comma, e.g. "MEMORY,ECC". Doesn't work with the -u/--unit or -x/--xml-format flags.

-l SEC, --loop=SEC
    Modify the -q option. Continuously report query data at the specified interval, rather than the default of just once. The application sleeps between queries. Pressing Ctrl+C at any time aborts the loop, which otherwise runs indefinitely. If no argument is specified for the -l form, a default interval of 5 seconds is used.

  DEVICE MODIFICATION OPTIONS

-pm, --persistence-mode=MODE
    Set the persistence mode for the target GPUs. See the (GPU ATTRIBUTES) section for a description of persistence mode. Requires root. Will impact all GPUs unless a single GPU is specified using the -i argument. The effect of this operation is immediate. However, it does not persist across reboots. After each reboot persistence mode will default to "Disabled".

-e, --ecc-config=CONFIG
    Set the ECC mode for the target GPUs. See the (GPU ATTRIBUTES) section for a description of ECC mode. Requires root. Will impact all GPUs unless a single GPU is specified using the -i argument. This setting takes effect after the next reboot and is persistent.

-p, --reset-ecc-errors=TYPE
    Reset the ECC error counters for the target GPUs. See the (GPU ATTRIBUTES) section for a description of ECC error counter types. Available arguments are 0|VOLATILE or 1|AGGREGATE. Requires root. Will impact all GPUs unless a single GPU is specified using the -i argument. The effect of this operation is immediate.

-c, --compute-mode=MODE
    Set the compute mode for the target GPUs. See the (GPU ATTRIBUTES) section for a description of compute mode. Requires root. Will impact all GPUs unless a single GPU is specified using the -i argument. The effect of this operation is immediate. However, it does not persist across reboots. After each reboot compute mode will reset to "DEFAULT".

-i, --id=ID
    Modify the -pm, -e, -p or -c options. Modify a single specified GPU. The specified id may be the GPU's 0-based index in the natural enumeration returned by the driver, the GPU's serial number, or the GPU's PCI bus ID (as domain:bus:device). Users who need consistency should use either of the latter two options, since device enumeration ordering is not guaranteed to be consistent between reboots.

  UNIT MODIFICATION OPTIONS

-t, --toggle-led=STATE
    Set the LED indicator state on the front and back of the unit to the specified color. See the (UNIT ATTRIBUTES) section for a description of the LED states. Allowed colors are 0|GREEN and 1|AMBER. Requires root.

-i, --id=ID
    Modify the -t option. Modify a single specified Unit. The specified id is the Unit's 0-based index in the natural enumeration returned by the driver.
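
The way the query options combine can be sketched in a few lines of Python. This is only an illustration that assembles an argument list; it does not run nvidia-smi. The helper name build_query_argv and its parameters are hypothetical, and the rule that -d cannot be combined with -u or -x is taken from the option text above.

```python
from typing import List, Optional, Sequence

# Display flags listed for -d in the OPTIONS section.
DISPLAY_FLAGS = (
    "MEMORY", "UTILIZATION", "ECC", "TEMPERATURE", "POWER", "CLOCK", "COMPUTE",
)


def build_query_argv(
    gpu_id: Optional[str] = None,
    display: Sequence[str] = (),
    unit: bool = False,
    xml: bool = False,
    filename: Optional[str] = None,
    loop_seconds: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for an nvidia-smi -q query (hypothetical helper)."""
    argv = ["nvidia-smi", "-q"]
    if unit:
        argv.append("-u")
    if gpu_id is not None:
        argv += ["-i", str(gpu_id)]
    if display:
        flags = [flag.upper() for flag in display]
        unknown = [flag for flag in flags if flag not in DISPLAY_FLAGS]
        if unknown:
            raise ValueError("unknown display flag(s): " + ", ".join(unknown))
        # Per the option text, -d does not work with -u or -x.
        if unit or xml:
            raise ValueError("-d cannot be combined with -u or -x")
        argv += ["-d", ",".join(flags)]
    if xml:
        argv.append("-x")
    if loop_seconds is not None:
        if loop_seconds <= 0:
            raise ValueError("loop interval must be a positive number of seconds")
        argv += ["-l", str(loop_seconds)]
    if filename is not None:
        argv += ["-f", filename]
    return argv


if __name__ == "__main__":
    print(build_query_argv(gpu_id="0", display=["memory", "ecc"], loop_seconds=10))
```

The resulting list can be handed to a process launcher such as subprocess.run on a machine where nvidia-smi is installed.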

GPU ATTRIBUTES

The following list describes all possible data returned by the -q device query option. Unless otherwise noted all numerical results are base 10 and unitless.

Timestamp
    The current system timestamp at the time nvidia-smi was invoked. Format is "Day-of-week Month Day HH:MM:SS Year".

Driver Version
    The version of the installed NVIDIA display driver. This is an alphanumeric string.

Attached GPUs
    The number of accessible NVIDIA GPUs. Under Linux all NVIDIA GPUs are expected to be accessible.

Product Name
    The official product name of the GPU. This is an alphanumeric string. For all products.

Display Mode
    A flag that indicates whether a display is attached to the GPU. "Enabled" indicates an attached display. "Disabled" indicates otherwise. For Tesla products, and Quadro products from the Fermi family.

Persistence Mode
    A flag that indicates whether persistence mode is enabled for the GPU. Value is either "Enabled" or "Disabled". When persistence mode is enabled the NVIDIA driver remains loaded even when no active clients, such as X11 or nvidia-smi, exist. This minimizes the driver load latency associated with running dependent apps, such as CUDA programs. For all CUDA-capable products. Linux only.

Driver Model
    This feature is not supported under Linux and will always have the value of "N/A". For Tesla products, and Quadro products from the Fermi family. Windows only.

    Current
        Always "N/A".

    Pending
        Always "N/A".

Serial Number
    This number matches the serial number physically printed on each board. It is a globally unique immutable alphanumeric value. For Tesla and Quadro products from the Fermi family.

GPU UUID
    This value is another globally unique immutable alphanumeric identifier. It does not correspond to any physical label on the board. This id is provided for non-Fermi Tesla products that don't support serial numbers, and should only be used in cases where serial numbers are not available. For Tesla products, and Quadro products from the Fermi family.

Inforom Version
    Version numbers for each object in the GPU board's inforom storage. The inforom is a small, persistent store of configuration and state data for the GPU. All inforom version fields are numerical. It can be useful to know these version numbers because some GPU features are only available with inforoms of a certain version or higher. For Tesla and Quadro products from the Fermi family.

    OEM Object
        Version for the OEM configuration data.

    ECC Object
        Version for the ECC recording data.

    Power Object
        Version for the power management data.

PCI
    Basic PCI info for the device. Some of this information may change whenever cards are added/removed/moved in a system. For all products.

    Bus
        PCI bus number, in hex.

    Device
        PCI device number, in hex.

    Domain
        PCI domain number, in hex.

    Device Id
        PCI vendor device id, in hex.

    Bus Id
        PCI bus id as "domain:bus:device", in hex.

Fan Speed
    The fan speed value is the percent of maximum speed that the device's fan is currently running at. It ranges from 0 to 100%. Many parts do not report fan speeds because they rely on cooling via fans in the surrounding enclosure. For all discrete products with dedicated fans.

Memory Usage
    On-board memory information. Reported total memory is affected by ECC state. If ECC is enabled the total available memory is decreased by several percent, due to the requisite parity bits. The driver may also reserve a small amount of memory for internal use, even without active work on the GPU. For all products.

    Total
        Total installed GPU memory.

    Used
        Total memory allocated by active contexts.

    Free
        Total free memory.

Compute Mode
    The compute mode flag indicates whether individual or multiple compute applications may run on the GPU.

    "DEFAULT" means multiple contexts are allowed per device.
    "EXCLUSIVE_THREAD" means only one context is allowed per device, usable from one thread at a time.
    "EXCLUSIVE_PROCESS" means only one context is allowed per device, usable from multiple threads at a time.
    "PROHIBITED" means no contexts are allowed per device (no compute apps).

    "EXCLUSIVE_PROCESS" was added in CUDA 4.0. Prior CUDA releases supported only one exclusive mode, which is equivalent to "EXCLUSIVE_THREAD" in CUDA 4.0 and beyond. For all CUDA-capable products.

Utilization
    Utilization rates report how busy each GPU is over time, and can be used to determine how much an application is using the GPUs in the system. For Tesla products, and Quadro products from the Fermi family.

    GPU
        Percent of time over the past second during which one or more kernels was executing on the GPU.

    Memory
        Percent of time over the past second during which global (device) memory was being read or written.

Ecc Mode
    A flag that indicates whether ECC support is enabled. May be either "Enabled" or "Disabled". Changes to ECC mode require a reboot. For Tesla and Quadro products from the Fermi family. Requires Inforom ECC object version 1.0 or higher.

    Current
        The ECC mode that the GPU is currently operating under.

    Pending
        The ECC mode that the GPU will operate under after the next reboot.

ECC Errors
    NVIDIA GPUs can provide error counts for two types of ECC errors (single bit and double bit) across two timescales (volatile and aggregate). Single bit ECC errors are automatically corrected by the hardware and do not result in data corruption. Double bit errors are detected but not corrected. Please see the ECC documents on the web for information on compute application behavior when double bit errors occur.

    Volatile error counters track the number of errors detected since the last reboot. They are reset during each power cycle. Aggregate errors persist beyond reboots and thus act as a lifetime counter.

    Tesla and Quadro products from the Fermi family can display total ECC error counts, as well as a breakdown of errors based on location on the chip. The locations are described below. Location-based data for aggregate error counts requires Inforom ECC object version 2.0. All other ECC counts require ECC object version 1.0.

    Device Memory
        Errors detected in global device memory.

    Register File
        Errors detected in register file memory.

    L1 Cache
        Errors detected in the L1 cache.

    L2 Cache
        Errors detected in the L2 cache.

    Total
        Total errors detected across entire chip. Sum of Device Memory, Register File, L1 Cache and L2 Cache.

Temperature
    Readings from temperature sensors on the board. All readings are in degrees C. Not all products support all reading types. In particular, products in module form factors that rely on case fans or passive cooling do not usually provide temperature readings. See below for restrictions.

    GPU
        Core GPU temperature. For all discrete and S-class products.

Power Readings
    Power readings help to shed light on the current power usage of the GPU, and the factors that affect that usage. When power management is enabled the GPU limits power draw under load to fit within a predefined power envelope by manipulating the current power state. See below for limits of availability.

    Power State
        The current power state for the GPU. States range from P0 (maximum performance) to P12 (minimum power). For Tesla products, and Quadro products from the Fermi family.

    Power Management
        A flag that indicates whether power management is enabled. Either "Supported" or "N/A". For "GF11x" Tesla and Quadro products from the Fermi family. Requires Inforom PWR object version 3.0 or higher.

    Power Draw
        The last measured power draw for the entire board, in watts. Only available if power management is supported. This reading is accurate to within +/- 5 watts. For "GF11x" Tesla and Quadro products from the Fermi family. Requires Inforom PWR object version 3.0 or higher.

    Power Limit
        The power management algorithm's power ceiling, in watts. Total board power draw is manipulated by the power management algorithm such that it stays under this value. Only available if power management is supported. For "GF11x" Tesla and Quadro products from the Fermi family. Requires Inforom PWR object version 3.0 or higher.
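
Because the PCI fields are reported in hex, a small parsing sketch may help when comparing a Bus Id against the separate Domain, Bus and Device values. The Python below assumes the three fields appear in the domain:bus:device order given for the -i option, and that no fixed zero-padding width applies (none is stated here). The names PciBusId, parse_pci_bus_id and format_pci_bus_id are hypothetical.

```python
from typing import NamedTuple


class PciBusId(NamedTuple):
    """Numeric form of a PCI bus id (hypothetical helper type)."""
    domain: int
    bus: int
    device: int


def parse_pci_bus_id(text: str) -> PciBusId:
    """Parse a "domain:bus:device" string whose fields are hexadecimal."""
    parts = text.strip().split(":")
    if len(parts) != 3:
        raise ValueError("expected domain:bus:device, got %r" % text)
    # int(..., 16) raises ValueError for empty or non-hex fields.
    domain, bus, device = (int(part, 16) for part in parts)
    return PciBusId(domain, bus, device)


def format_pci_bus_id(bus_id: PciBusId) -> str:
    """Render the id back as lowercase hex without padding (an assumption)."""
    return "%x:%x:%x" % (bus_id.domain, bus_id.bus, bus_id.device)


if __name__ == "__main__":
    parsed = parse_pci_bus_id("0:1a:0")
    print(parsed, format_pci_bus_id(parsed))
```

Keeping the parsed value as integers avoids mismatches between strings that differ only in case or leading zeros.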

UNIT ATTRIBUTES

The following list describes all possible data returned by the -q -u unit query option. Unless otherwise noted all numerical results are base 10 and unitless.

Timestamp
    The current system timestamp at the time nvidia-smi was invoked. Format is "Day-of-week Month Day HH:MM:SS Year".

Driver Version
    The version of the installed NVIDIA display driver. Format is "Major-Number.Minor-Number".

Attached Units
    The number of attached Units in the system.

Product Name
    The official product name of the unit. This is an alphanumeric value. For all S-class products.

Product Id
    The product identifier for the unit. This is an alphanumeric value of the form "part1-part2-part3". For all S-class products.

Product Serial
    The immutable globally unique identifier for the unit. This is an alphanumeric value. For all S-class products.

Firmware Version
    The version of the firmware running on the unit. Format is "Major-Number.Minor-Number". For all S-class products.

LED State
    The LED indicator is used to flag systems with potential problems. An LED color of AMBER indicates an issue. For all S-class products.

    Color
        The color of the LED indicator. Either "GREEN" or "AMBER".

    Cause
        The reason for the current LED color. The cause may be listed as any combination of "Unknown", "Set to AMBER by host system", "Thermal sensor failure", "Fan failure" and "Temperature exceeds critical limit".

Temperature
    Temperature readings for important components of the Unit. All readings are in degrees C. Not all readings may be available. For all S-class products.

    Intake
        Air temperature at the unit intake.

    Exhaust
        Air temperature at the unit exhaust point.

    Board
        Air temperature across the unit board.

PSU
    Readings for the unit power supply. For all S-class products.

    State
        Operating state of the PSU. The power supply state can be any of the following: "Normal", "Abnormal", "High voltage", "Fan failure", "Heatsink temperature", "Current limit", "Voltage below UV alarm threshold", "Low-voltage", "I2C remote off command", "MOD_DISABLE input" or "Short pin transition".

    Voltage
        PSU voltage setting, in volts.

    Current
        PSU current draw, in amps.

Fan Info
    Fan readings for the unit. A reading is provided for each fan, of which there can be many. For all S-class products.

    State
        The state of the fan, either "NORMAL" or "FAILED".

    Speed
        For a healthy fan, the fan's speed in RPM.

Attached GPUs
    A list of PCI bus ids that correspond to each of the GPUs attached to the unit. The bus ids have the form "domain:bus:device", in hex. For all S-class products.
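
As a rough illustration of how the LED, PSU and fan attributes might be read together, the Python sketch below collects the values that indicate a problem. It assumes the fields have already been extracted from a unit query by some other means; the UnitStatus container and the unit_problems function are hypothetical and not part of nvidia-smi.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnitStatus:
    """Hypothetical container for a few fields of one Unit's query output."""
    led_color: str                                        # "GREEN" or "AMBER"
    psu_state: str                                        # "Normal" or a fault state
    fan_states: List[str] = field(default_factory=list)   # "NORMAL" or "FAILED" per fan


def unit_problems(status: UnitStatus) -> List[str]:
    """List the readings that point at an issue, in LED, PSU, fan order."""
    problems = []
    # An AMBER LED flags a potential problem with the unit.
    if status.led_color.upper() == "AMBER":
        problems.append("LED is AMBER")
    # Any PSU state other than "Normal" is treated as a fault here.
    if status.psu_state != "Normal":
        problems.append("PSU state is " + status.psu_state)
    for index, state in enumerate(status.fan_states):
        if state.upper() == "FAILED":
            problems.append("fan %d FAILED" % index)
    return problems


if __name__ == "__main__":
    print(unit_problems(UnitStatus("AMBER", "Fan failure", ["NORMAL", "FAILED"])))
```

An empty list means none of the three checks found anything; it does not cover temperature readings, which have no thresholds stated here.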

NOTES

NVIDIA device files may be modified by nvidia-smi if run as root. Please see the relevant section of the driver README file.

The -a, -s and -g arguments are now deprecated in favor of -q and -i, respectively. However, the old arguments still work for this release.

EXAMPLES

nvidia-smi -q
    Query attributes for all GPUs once, and display in plain text to stdout.

nvidia-smi -q -d ECC,POWER -i 0 -x -l 10 -f out.log
    Query ECC errors and power consumption for GPU 0 every 10 seconds, indefinitely, and record in XML format to the file out.log.

nvidia-smi -c 1 -i 1010302775687
    Set the compute mode to "EXCLUSIVE_THREAD" for GPU with serial "1010302775687".

nvidia-smi -q -u -x
    Query attributes for all Units once, and display in XML format to stdout.

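
The device modification options accept either a number or a name (for example 1 or EXCLUSIVE_THREAD for -c). The following Python sketch shows one way to normalize such values before assembling a command line. It only builds the argument list and does not run anything. The table is copied from the SYNOPSIS; the helper name build_modify_argv is hypothetical.

```python
from typing import List, Optional

# Numeric argument for each named value, as listed in the SYNOPSIS.
OPTION_VALUES = {
    "-pm": {"DISABLED": 0, "ENABLED": 1},
    "-e": {"DISABLED": 0, "ENABLED": 1},
    "-p": {"VOLATILE": 0, "AGGREGATE": 1},
    "-c": {"DEFAULT": 0, "EXCLUSIVE_THREAD": 1, "PROHIBITED": 2, "EXCLUSIVE_PROCESS": 3},
}


def build_modify_argv(option: str, value: str, gpu_id: Optional[str] = None) -> List[str]:
    """Assemble an argv list for one device modification (hypothetical helper)."""
    if option not in OPTION_VALUES:
        raise ValueError("unsupported option: " + option)
    names = OPTION_VALUES[option]
    key = value.upper()
    if key in names:
        code = names[key]                    # name given, e.g. "EXCLUSIVE_THREAD"
    elif value.isdigit() and int(value) in names.values():
        code = int(value)                    # number given, e.g. "1"
    else:
        raise ValueError("invalid value %r for %s" % (value, option))
    argv = ["nvidia-smi", option, str(code)]
    # Without -i the operation applies to all GPUs.
    if gpu_id is not None:
        argv += ["-i", gpu_id]
    return argv


if __name__ == "__main__":
    print(build_modify_argv("-c", "EXCLUSIVE_THREAD", "1010302775687"))
```

These operations require root, so any real invocation of the resulting command would need the corresponding privileges.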
SEE ALSO
/usr/share/doc/NVIDIA_GLX-1.0/README.txt

AUTHOR

NVIDIA Corporation

COPYRIGHT

Copyright 2011 NVIDIA Corporation.

NVIDIA-SMI(1)
 
 
 
