1 ------------------------------------------------------------------------------
2 T H E /proc F I L E S Y S T E M
3 ------------------------------------------------------------------------------
/proc/sys         Terrehon Bowden <terrehon@pacbell.net>        October 7 1999
                  Bodo Bauer <bb@ricochet.net>
2.4.x update      Jorge Nerin <comandante@zaralinux.com>      November 14 2000
move /proc/sys    Shen Feng <shen@cn.fujitsu.com>                 April 1 2009
9 ------------------------------------------------------------------------------
10 Version 1.3 Kernel version 2.2.12
11 Kernel version 2.4.0-test11-pre4
12 ------------------------------------------------------------------------------
18 0.1 Introduction/Credits
21 1 Collecting System Information
22 1.1 Process-Specific Subdirectories
24 1.3 IDE devices in /proc/ide
25 1.4 Networking info in /proc/net
27 1.6 Parallel port info in /proc/parport
28 1.7 TTY info in /proc/tty
29 1.8 Miscellaneous kernel statistics in /proc/stat
30 1.9 Ext4 file system parameters
32 2 Modifying System Parameters
34 3 Per-Process Parameters
35 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
36 3.2 /proc/<pid>/oom_score - Display current oom-killer score
37 3.3 /proc/<pid>/io - Display the IO accounting fields
38 3.4 /proc/<pid>/coredump_filter - Core dump filtering settings
39 3.5 /proc/<pid>/mountinfo - Information about mounts
42 ------------------------------------------------------------------------------
44 ------------------------------------------------------------------------------
46 0.1 Introduction/Credits
47 ------------------------
49 This documentation is part of a soon (or so we hope) to be released book on
50 the SuSE Linux distribution. As there is no complete documentation for the
51 /proc file system and we've used many freely available sources to write these
52 chapters, it seems only fair to give the work back to the Linux community.
53 This work is based on the 2.2.* kernel version and the upcoming 2.4.*. I'm
54 afraid it's still far from complete, but we hope it will be useful. As far as
55 we know, it is the first 'all-in-one' document about the /proc file system. It
56 is focused on the Intel x86 hardware, so if you are looking for PPC, ARM,
57 SPARC, AXP, etc., features, you probably won't find what you are looking for.
58 It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But
59 additions and patches are welcome and will be added to this document if you
62 We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of
63 other people for help compiling this documentation. We'd also like to extend a
64 special thank you to Andi Kleen for documentation, which we relied on heavily
65 to create this document, as well as the additional information he provided.
66 Thanks to everybody else who contributed source or docs to the Linux kernel
67 and helped create a great piece of software... :)
69 If you have any comments, corrections or additions, please don't hesitate to
70 contact Bodo Bauer at bb@ricochet.net. We'll be happy to add them to this
73 The latest version of this document is available online at
74 http://skaro.nightcrawler.com/~bb/Docs/Proc as HTML version.
If the above address does not work for you, you could try the kernel
mailing list at linux-kernel@vger.kernel.org and/or try to reach me at
comandante@zaralinux.com.
83 We don't guarantee the correctness of this document, and if you come to us
84 complaining about how you screwed up your system because of incorrect
85 documentation, we won't feel responsible...
87 ------------------------------------------------------------------------------
88 CHAPTER 1: COLLECTING SYSTEM INFORMATION
89 ------------------------------------------------------------------------------
91 ------------------------------------------------------------------------------
93 ------------------------------------------------------------------------------
94 * Investigating the properties of the pseudo file system /proc and its
95 ability to provide information on the running Linux system
96 * Examining /proc's structure
97 * Uncovering various information about the kernel and the processes running
99 ------------------------------------------------------------------------------
102 The proc file system acts as an interface to internal data structures in the
103 kernel. It can be used to obtain information about the system and to change
104 certain kernel parameters at runtime (sysctl).
106 First, we'll take a look at the read-only parts of /proc. In Chapter 2, we
107 show you how you can use /proc/sys to change settings.
109 1.1 Process-Specific Subdirectories
110 -----------------------------------
112 The directory /proc contains (among other things) one subdirectory for each
113 process running on the system, which is named after the process ID (PID).
115 The link self points to the process reading the file system. Each process
116 subdirectory has the entries listed in Table 1-1.
119 Table 1-1: Process specific entries in /proc
120 ..............................................................................
122 clear_refs Clears page referenced bits shown in smaps output
123 cmdline Command line arguments
124 cpu Current and last cpu in which it was executed (2.4)(smp)
125 cwd Link to the current working directory
126 environ Values of environment variables
127 exe Link to the executable of this process
128 fd Directory, which contains all file descriptors
129 maps Memory maps to executables and library files (2.4)
130 mem Memory held by this process
131 root Link to the root directory of this process
133 statm Process memory status information
134 status Process status in human readable form
135 wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
136 stack Report full stack trace, enable via CONFIG_STACKTRACE
137 smaps Extension based on maps, the rss size for each mapped file
138 ..............................................................................
140 For example, to get the status information of a process, all you have to do is
141 read the file /proc/PID/status:
143 >cat /proc/self/status
159 SigPnd: 0000000000000000
160 SigBlk: 0000000000000000
161 SigIgn: 0000000000000000
162 SigCgt: 0000000000000000
163 CapInh: 00000000fffffeff
164 CapPrm: 0000000000000000
165 CapEff: 0000000000000000
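
Because every line of the status file has the form "Key: value", it is easy
to read from a program. The following Python sketch is only an illustration;
parse_status and SAMPLE are names we made up, and the sample lines are copied
from the listing above. On a running Linux system you would pass in the
contents of /proc/self/status instead.

```python
def parse_status(text):
    """Turn 'Key: value' lines from /proc/PID/status into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore anything that is not a 'Key: value' line
            fields[key.strip()] = value.strip()
    return fields


# A few lines copied from the example output above.
SAMPLE = """\
SigPnd: 0000000000000000
SigBlk: 0000000000000000
CapInh: 00000000fffffeff
"""

status = parse_status(SAMPLE)
# The signal and capability masks are hexadecimal strings.
print(hex(int(status["CapInh"], 16)))
```

The values are kept as strings on purpose, since the file mixes names,
decimal numbers and hexadecimal masks.
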
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
information. The statm file contains more detailed information about the
process memory usage. Its seven fields are explained in Table 1-2. The stat
file contains detailed information about the process itself. Its fields are
explained in Table 1-3.
Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
 size       total program size (pages)        (same as VmSize in status)
 resident   size of memory portions (pages)   (same as VmRSS in status)
 shared     number of pages that are shared   (i.e. backed by a file)
 trs        number of pages that are 'code'   (not including libs; broken,
                                               includes data segment)
 lrs        number of pages of library        (always 0 on 2.6)
 drs        number of pages of data/stack     (including libs; broken,
                                               includes library text)
 dt         number of dirty pages             (always 0 on 2.6)
..............................................................................
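
As a rough sketch of how the seven statm numbers can be used, the Python
fragment below names the fields after Table 1-2 and converts pages to bytes.
The helper name parse_statm, the 4096-byte default page size and the sample
numbers are assumptions made for the example; a real program should ask the
system for its page size.

```python
# Field names in the order given in Table 1-2.
STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")


def parse_statm(text, page_size=4096):
    """Map the seven statm values (counted in pages) to sizes in bytes."""
    values = [int(v) for v in text.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError("expected 7 fields, got %d" % len(values))
    return {name: pages * page_size
            for name, pages in zip(STATM_FIELDS, values)}


# Made-up sample line; real data comes from /proc/PID/statm.
usage = parse_statm("1024 256 128 10 0 300 0")
print(usage["size"], usage["resident"])
```

On Linux, os.sysconf("SC_PAGE_SIZE") can be passed as page_size instead of
relying on the default.
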
191 Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
192 ..............................................................................
195 tcomm filename of the executable
196 state state (R is running, S is sleeping, D is sleeping in an
197 uninterruptible wait, Z is zombie, T is traced or stopped)
198 ppid process id of the parent process
199 pgrp pgrp of the process
201 tty_nr tty the process uses
202 tty_pgrp pgrp of the tty
204 min_flt number of minor faults
205 cmin_flt number of minor faults with child's
206 maj_flt number of major faults
207 cmaj_flt number of major faults with child's
208 utime user mode jiffies
209 stime kernel mode jiffies
210 cutime user mode jiffies with child's
211 cstime kernel mode jiffies with child's
212 priority priority level
214 num_threads number of threads
215 it_real_value (obsolete, always 0)
216 start_time time the process started after system boot
217 vsize virtual memory size
218 rss resident set memory size
219 rsslim current limit in bytes on the rss
220 start_code address above which program text can run
221 end_code address below which program text can run
222 start_stack address of the start of the stack
223 esp current value of ESP
224 eip current value of EIP
225 pending bitmap of pending signals (obsolete)
226 blocked bitmap of blocked signals (obsolete)
227 sigign bitmap of ignored signals (obsolete)
sigcatch bitmap of caught signals (obsolete)
229 wchan address where process went to sleep
232 exit_signal signal to send to parent thread on exit
233 task_cpu which CPU the task is scheduled on
234 rt_priority realtime priority
235 policy scheduling policy (man sched_setscheduler)
236 blkio_ticks time spent waiting for block IO
237 ..............................................................................
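
Reading the stat file from a program needs a little care, because the
executable name (tcomm) is written in parentheses and may itself contain
spaces or parentheses. The sketch below, with the hypothetical helper
parse_stat_head and made-up sample lines, assumes the usual layout in which
the process ID precedes tcomm, and extracts only the first few fields.

```python
def parse_stat_head(text):
    """Extract pid, tcomm, state, ppid and pgrp from a /proc/PID/stat line."""
    # Split on the LAST ')' so that a tcomm such as "a) b" is handled.
    left, _, right = text.rpartition(")")
    pid_str, _, tcomm = left.partition("(")
    rest = right.split()
    return {
        "pid": int(pid_str),
        "tcomm": tcomm,
        "state": rest[0],
        "ppid": int(rest[1]),
        "pgrp": int(rest[2]),
    }


# Made-up sample lines, truncated after the first few fields.
print(parse_stat_head("1234 (my prog) S 1 1234 1234 0 -1"))
print(parse_stat_head("42 (a) b) R 7 42 42 0 -1"))
```

Splitting the whole line on white space would break as soon as the
executable name contains a space, which is why the parentheses are handled
first.
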
Similar to the process entries, the kernel data files give information about
the running kernel. The files used to obtain this information are contained in
/proc and are listed in Table 1-4. Not all of these will be present in your
system. Which files are present depends on the kernel configuration and the
loaded modules.
249 Table 1-4: Kernel info in /proc
250 ..............................................................................
252 apm Advanced power management info
253 buddyinfo Kernel memory allocator information (see text) (2.5)
254 bus Directory containing bus specific information
255 cmdline Kernel command line
256 cpuinfo Info about the CPU
257 devices Available devices (block and character)
dma Used DMA channels
259 filesystems Supported filesystems
260 driver Various drivers grouped here, currently rtc (2.4)
261 execdomains Execdomains, related to security (2.4)
262 fb Frame Buffer devices (2.4)
263 fs File system parameters, currently nfs/exports (2.4)
264 ide Directory containing info about the IDE subsystem
265 interrupts Interrupt usage
266 iomem Memory map (2.4)
267 ioports I/O port usage
268 irq Masks for irq to cpu affinity (2.4)(smp?)
269 isapnp ISA PnP (Plug&Play) Info (2.4)
270 kcore Kernel core image (can be ELF or A.OUT(deprecated in 2.4))
272 ksyms Kernel symbol table
273 loadavg Load average of last 1, 5 & 15 minutes
277 modules List of loaded modules
278 mounts Mounted filesystems
279 net Networking info (see text)
280 partitions Table of partitions known to the system
pci Deprecated info of PCI bus (new way -> /proc/bus/pci/,
decoupled by lspci) (2.4)
284 scsi SCSI info (see text)
285 slabinfo Slab pool info
286 stat Overall statistics
287 swaps Swap space utilization
289 sysvipc Info of SysVIPC Resources (msg, sem, shm) (2.4)
290 tty Info of tty drivers
292 version Kernel version
293 video bttv info of video resources (2.4)
294 vmallocinfo Show vmalloced areas
295 ..............................................................................
297 You can, for example, check which interrupts are currently in use and what
298 they are used for by looking in the file /proc/interrupts:
300 > cat /proc/interrupts
302 0: 8728810 XT-PIC timer
303 1: 895 XT-PIC keyboard
305 3: 531695 XT-PIC aha152x
306 4: 2014133 XT-PIC serial
307 5: 44401 XT-PIC pcnet_cs
310 12: 182918 XT-PIC PS/2 Mouse
312 14: 1232265 XT-PIC ide0
In 2.4.* a couple of lines were added to this file, LOC and ERR (this time the
output is from an SMP machine):
319 > cat /proc/interrupts
322 0: 1243498 1214548 IO-APIC-edge timer
323 1: 8949 8958 IO-APIC-edge keyboard
324 2: 0 0 XT-PIC cascade
325 5: 11286 10161 IO-APIC-edge soundblaster
326 8: 1 0 IO-APIC-edge rtc
327 9: 27422 27407 IO-APIC-edge 3c503
328 12: 113645 113873 IO-APIC-edge PS/2 Mouse
330 14: 22491 24012 IO-APIC-edge ide0
331 15: 2183 2415 IO-APIC-edge ide1
332 17: 30564 30414 IO-APIC-level eth0
333 18: 177 164 IO-APIC-level bttv
NMI is incremented in this case because every timer interrupt generates an NMI
(Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups.

LOC is the local interrupt counter of the internal APIC of every CPU.

ERR is incremented in the case of errors in the IO-APIC bus (the bus that
connects the CPUs in an SMP system). This means that an error has been
detected; the IO-APIC automatically retries the transmission, so it should not
be a big problem, but you should read the SMP-FAQ.
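
To total the per-CPU counters of such a listing, a small script is enough.
The Python sketch below is an illustration only: parse_interrupts is a name
we invented, the sample lines are taken from the SMP output above, and the
rule "leading all-digit columns are per-CPU counts" is a heuristic rather
than a guaranteed property of the file.

```python
def parse_interrupts(text):
    """Return {label: (per_cpu_counts, description)} for /proc/interrupts."""
    table = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # a header line naming the CPUs has no colon
        tokens = rest.split()
        counts = []
        # Heuristic: the leading all-digit columns are the per-CPU counters.
        while tokens and tokens[0].isdigit():
            counts.append(int(tokens.pop(0)))
        table[label.strip()] = (counts, " ".join(tokens))
    return table


# Three lines from the SMP example above.
SAMPLE = """\
  0:    1243498    1214548    IO-APIC-edge  timer
  2:          0          0          XT-PIC  cascade
 17:      30564      30414   IO-APIC-level  eth0
"""

for irq, (counts, what) in parse_interrupts(SAMPLE).items():
    print(irq, sum(counts), what)
```
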
348 In 2.6.2* /proc/interrupts was expanded again. This time the goal was for
349 /proc/interrupts to display every IRQ vector in use by the system, not
350 just those considered 'most important'. The new vectors are:
352 THR -- interrupt raised when a machine check threshold counter
353 (typically counting ECC corrected errors of memory or cache) exceeds
354 a configurable threshold. Only available on some systems.
356 TRM -- a thermal event interrupt occurs when a temperature threshold
357 has been exceeded for the CPU. This interrupt may also be generated
358 when the temperature drops back to normal.
SPU -- a spurious interrupt is an interrupt that was raised then lowered
by some IO device before it could be fully processed by the APIC. Hence
the APIC sees the interrupt but does not know what device it came from.
In this case the APIC will generate the interrupt with an IRQ vector
of 0xff. This might also be generated by chipset bugs.
366 RES, CAL, TLB -- rescheduling, call and TLB flush interrupts are
367 sent from one CPU to another per the needs of the OS. Typically,
368 their statistics are used by kernel developers and interested users to
369 determine the occurrence of interrupts of the given type.
The above IRQ vectors are displayed only when relevant. For example,
372 the threshold vector does not exist on x86_64 platforms. Others are
373 suppressed when the system is a uniprocessor. As of this writing, only
374 i386 and x86_64 platforms support the new IRQ vector displays.
Of some interest is the introduction of the /proc/irq directory to 2.4.
It can be used to set IRQ-to-CPU affinity. This means that you can "hook" an
IRQ to only one CPU, or exclude a CPU from handling IRQs. The irq subdirectory
contains one subdirectory for each IRQ, and two files: default_smp_affinity
and prof_cpu_mask.
384 0 10 12 14 16 18 2 4 6 8 prof_cpu_mask
385 1 11 13 15 17 19 3 5 7 9 default_smp_affinity
smp_affinity is a bitmask in which you can specify which CPUs can handle the
IRQ. You can set it by doing:
392 > echo 1 > /proc/irq/10/smp_affinity
This means that only the first CPU will handle the IRQ, but you can also echo
5, which means that only the first and third CPU can handle the IRQ (5 is
binary 101, so bits 0 and 2 are set).
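
The bitmask arithmetic can be sketched in a few lines of Python. This is only
an illustration: cpus_to_mask and mask_to_cpus are hypothetical helper names
of ours, not kernel interfaces. Bit N of the mask stands for CPU N, counting
from zero.

```python
def cpus_to_mask(cpus):
    """Build the hexadecimal mask string for a list of CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit N selects CPU N
    return format(mask, "x")


def mask_to_cpus(mask_hex):
    """List the CPU numbers selected by a hexadecimal mask string."""
    # Assumption: if the mask is printed in comma-separated groups,
    # dropping the commas yields a single hexadecimal number.
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


print(cpus_to_mask([0]))     # first CPU only
print(cpus_to_mask([0, 2]))  # first and third CPU
print(mask_to_cpus("5"))
```

The string returned by cpus_to_mask is what one would echo into an
smp_affinity file.
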
The contents of each smp_affinity file are the same by default:
399 > cat /proc/irq/0/smp_affinity
402 The default_smp_affinity mask applies to all non-active IRQs, which are the
403 IRQs which have not yet been allocated/activated, and hence which lack a
404 /proc/irq/[0-9]* directory.
406 prof_cpu_mask specifies which CPUs are to be profiled by the system wide
407 profiler. Default value is ffffffff (all cpus).
409 The way IRQs are routed is handled by the IO-APIC, and it's Round Robin
410 between all the CPUs which are allowed to handle it. As usual the kernel has
411 more info than you and does a better job than you, so the defaults are the
412 best choice for almost everyone.
There are three more important subdirectories in /proc: net, scsi, and sys.
The general rule is that the contents, or even the existence of these
directories, depend on your kernel configuration. If SCSI is not enabled, the
directory scsi may not exist. The same is true of the net directory, which is
there only when networking support is present in the running kernel.
420 The slabinfo file gives information about memory usage at the slab level.
421 Linux uses slab pools for memory management above page level in version 2.2.
422 Commonly used objects have their own slab pool (such as network buffers,
423 directory cache, and so on).
425 ..............................................................................
427 > cat /proc/buddyinfo
429 Node 0, zone DMA 0 4 5 4 4 3 ...
430 Node 0, zone Normal 1 0 0 1 101 8 ...
431 Node 0, zone HighMem 2 0 0 1 1 0 ...
433 Memory fragmentation is a problem under some workloads, and buddyinfo is a
434 useful tool for helping diagnose these problems. Buddyinfo will give you a
435 clue as to how big an area you can safely allocate, or why a previous
438 Each column represents the number of pages of a certain order which are
439 available. In this case, there are 0 chunks of 2^0*PAGE_SIZE available in
440 ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, 101 chunks of 2^4*PAGE_SIZE
441 available in ZONE_NORMAL, etc...
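
The column arithmetic is easy to automate. In the Python sketch below the
helper names parse_buddyinfo_line and free_pages are ours, and the sample
line is the ZONE_DMA line from above cut off after the six columns shown. A
count in column N stands for chunks of 2^N pages.

```python
def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (node, zone, counts per order)."""
    head, _, tail = line.partition("zone")
    node = int(head.replace("Node", "").replace(",", "").strip())
    parts = tail.split()
    zone = parts[0]
    counts = [int(p) for p in parts[1:]]
    return node, zone, counts


def free_pages(counts):
    """Total free pages: each chunk of order N holds 2^N pages."""
    return sum(n << order for order, n in enumerate(counts))


node, zone, counts = parse_buddyinfo_line("Node 0, zone DMA 0 4 5 4 4 3")
print(node, zone, counts, free_pages(counts))
```

The highest column with a non-zero count gives the largest chunk that is
available in that zone at the moment the file was read.
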
443 ..............................................................................
447 Provides information about distribution and utilization of memory. This
448 varies by architecture and compile options. The following is from a
449 16GB PIII, which has highmem enabled. You may not have all of these fields.
454 MemTotal: 16344972 kB
461 HighTotal: 15597528 kB
462 HighFree: 13629632 kB
472 SReclaimable: 159856 kB
473 SUnreclaim: 124508 kB
478 CommitLimit: 7669796 kB
479 Committed_AS: 100056 kB
480 VmallocTotal: 112216 kB
482 VmallocChunk: 111088 kB
484 MemTotal: Total usable ram (i.e. physical ram minus a few reserved
485 bits and the kernel binary code)
486 MemFree: The sum of LowFree+HighFree
Buffers: Relatively temporary storage for raw disk blocks;
shouldn't get tremendously large (20MB or so)
489 Cached: in-memory cache for files read from the disk (the
490 pagecache). Doesn't include SwapCached
491 SwapCached: Memory that once was swapped out, is swapped back in but
492 still also is in the swapfile (if memory is needed it
493 doesn't need to be swapped out AGAIN because it is already
494 in the swapfile. This saves I/O)
495 Active: Memory that has been used more recently and usually not
496 reclaimed unless absolutely necessary.
497 Inactive: Memory which has been less recently used. It is more
498 eligible to be reclaimed for other purposes
500 HighFree: Highmem is all memory above ~860MB of physical memory
501 Highmem areas are for use by userspace programs, or
502 for the pagecache. The kernel must use tricks to access
503 this memory, making it slower to access than lowmem.
505 LowFree: Lowmem is memory which can be used for everything that
506 highmem can be used for, but it is also available for the
507 kernel's use for its own data structures. Among many
508 other things, it is where everything from the Slab is
509 allocated. Bad things happen when you're out of lowmem.
510 SwapTotal: total amount of swap space available
511 SwapFree: Memory which has been evicted from RAM, and is temporarily
513 Dirty: Memory which is waiting to get written back to the disk
514 Writeback: Memory which is actively being written back to the disk
515 AnonPages: Non-file backed pages mapped into userspace page tables
516 Mapped: files which have been mmaped, such as libraries
517 Slab: in-kernel data structures cache
518 SReclaimable: Part of Slab, that might be reclaimed, such as caches
519 SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure
520 PageTables: amount of memory dedicated to the lowest level of page
522 NFS_Unstable: NFS pages sent to the server, but not yet committed to stable
524 Bounce: Memory used for block device "bounce buffers"
525 WritebackTmp: Memory used by FUSE for temporary writeback buffers
526 CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'),
527 this is the total amount of memory currently available to
528 be allocated on the system. This limit is only adhered to
529 if strict overcommit accounting is enabled (mode 2 in
530 'vm.overcommit_memory').
              The CommitLimit is calculated with the following formula,
              where 'vm.overcommit_ratio' is a percentage:
              CommitLimit = ('vm.overcommit_ratio' / 100 * Physical RAM) + Swap
              For example, on a system with 1G of physical RAM and 7G
              of swap with a `vm.overcommit_ratio` of 30 it would
              yield a CommitLimit of 7.3G.
536 For more details, see the memory overcommit documentation
537 in vm/overcommit-accounting.
538 Committed_AS: The amount of memory presently allocated on the system.
539 The committed memory is a sum of all of the memory which
540 has been allocated by processes, even if it has not been
541 "used" by them as of yet. A process which malloc()'s 1G
542 of memory, but only touches 300M of it will only show up
543 as using 300M of memory even if it has the address space
544 allocated for the entire 1G. This 1G is memory which has
545 been "committed" to by the VM and can be used at any time
546 by the allocating application. With strict overcommit
547 enabled on the system (mode 2 in 'vm.overcommit_memory'),
548 allocations which would exceed the CommitLimit (detailed
549 above) will not be permitted. This is useful if one needs
550 to guarantee that processes will not fail due to lack of
551 memory once that memory has been successfully allocated.
552 VmallocTotal: total size of vmalloc memory area
553 VmallocUsed: amount of vmalloc area which is used
554 VmallocChunk: largest contiguous block of vmalloc area which is free
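
The CommitLimit example can be checked with a short calculation. The Python
sketch below is only an illustration of the formula as described in the
text, with vm.overcommit_ratio read as a percentage; commit_limit_kb and
parse_meminfo are hypothetical helper names, and a real kernel may take
further factors into account.

```python
def parse_meminfo(text):
    """Turn 'Name: value kB' lines into a dict of integers."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        parts = value.split()
        if sep and parts:
            info[key.strip()] = int(parts[0])
    return info


def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """CommitLimit = (overcommit_ratio percent of RAM) + swap, in kB."""
    return ram_kb * overcommit_ratio // 100 + swap_kb


GB = 1024 * 1024  # one gigabyte expressed in kB

# The example from the text: 1G of RAM, 7G of swap, ratio 30.
limit = commit_limit_kb(1 * GB, 7 * GB, 30)
print(round(limit / GB, 1))

# Two lines from the sample output above.
info = parse_meminfo("MemTotal: 16344972 kB\nCommitLimit: 7669796 kB\n")
print(info["MemTotal"], info["CommitLimit"])
```
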
556 ..............................................................................
Provides information about vmalloc'ed/vmapped areas. One line per area,
containing the virtual address range of the area, size in bytes,
caller information of the creator, and optional information depending
on the kind of area:
565 pages=nr number of pages
566 phys=addr if a physical address was specified
567 ioremap I/O mapping (ioremap() and friends)
568 vmalloc vmalloc() area
571 vpages buffer for pages pointers was vmalloced (huge area)
572 N<node>=nr (Only on NUMA kernels)
573 Number of pages allocated on memory node <node>
575 > cat /proc/vmallocinfo
576 0xffffc20000000000-0xffffc20000201000 2101248 alloc_large_system_hash+0x204 ...
577 /0x2c0 pages=512 vmalloc N0=128 N1=128 N2=128 N3=128
578 0xffffc20000201000-0xffffc20000302000 1052672 alloc_large_system_hash+0x204 ...
579 /0x2c0 pages=256 vmalloc N0=64 N1=64 N2=64 N3=64
580 0xffffc20000302000-0xffffc20000304000 8192 acpi_tb_verify_table+0x21/0x4f...
581 phys=7fee8000 ioremap
582 0xffffc20000304000-0xffffc20000307000 12288 acpi_tb_verify_table+0x21/0x4f...
583 phys=7fee7000 ioremap
584 0xffffc2000031d000-0xffffc2000031f000 8192 init_vdso_vars+0x112/0x210
585 0xffffc2000031f000-0xffffc2000032b000 49152 cramfs_uncompress_init+0x2e ...
586 /0x80 pages=11 vmalloc N0=3 N1=3 N2=2 N3=3
587 0xffffc2000033a000-0xffffc2000033d000 12288 sys_swapon+0x640/0xac0 ...
589 0xffffc20000347000-0xffffc2000034c000 20480 xt_alloc_table_info+0xfe ...
590 /0x130 [x_tables] pages=4 vmalloc N0=4
591 0xffffffffa0000000-0xffffffffa000f000 61440 sys_init_module+0xc27/0x1d00 ...
592 pages=14 vmalloc N2=14
593 0xffffffffa000f000-0xffffffffa0014000 20480 sys_init_module+0xc27/0x1d00 ...
595 0xffffffffa0014000-0xffffffffa0017000 12288 sys_init_module+0xc27/0x1d00 ...
597 0xffffffffa0017000-0xffffffffa0022000 45056 sys_init_module+0xc27/0x1d00 ...
598 pages=10 vmalloc N0=10
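
A record of this file can be taken apart as sketched below in Python. The
helper name parse_vmallocinfo_line is ours, and the sample is the first
record of the listing above with its two wrapped lines joined back into one,
which is an assumption about how the record looks in the real file. Tokens
of the form key=value are collected as attributes, everything else as
flags.

```python
def parse_vmallocinfo_line(line):
    """Split one /proc/vmallocinfo record into its parts."""
    tokens = line.split()
    start_s, _, end_s = tokens[0].partition("-")
    record = {
        "start": int(start_s, 16),
        "end": int(end_s, 16),
        "size": int(tokens[1]),
        "caller": tokens[2] if len(tokens) > 2 else "",
        "attrs": {},   # key=value items such as pages=512 or N0=128
        "flags": [],   # bare words such as vmalloc or ioremap
    }
    for tok in tokens[3:]:
        key, sep, val = tok.partition("=")
        if sep:
            record["attrs"][key] = val
        else:
            record["flags"].append(tok)
    return record


SAMPLE = ("0xffffc20000000000-0xffffc20000201000 2101248 "
          "alloc_large_system_hash+0x204/0x2c0 pages=512 vmalloc "
          "N0=128 N1=128 N2=128 N3=128")

rec = parse_vmallocinfo_line(SAMPLE)
# The size column equals the length of the address range.
print(rec["end"] - rec["start"] == rec["size"], rec["attrs"]["pages"])
```
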
600 1.3 IDE devices in /proc/ide
601 ----------------------------
603 The subdirectory /proc/ide contains information about all IDE devices of which
604 the kernel is aware. There is one subdirectory for each IDE controller, the
605 file drivers and a link for each IDE device, pointing to the device directory
606 in the controller specific subtree.
608 The file drivers contains general information about the drivers used for the
611 > cat /proc/ide/drivers
612 ide-cdrom version 4.53
613 ide-disk version 1.08
More detailed information can be found in the controller specific
subdirectories. These are named ide0, ide1 and so on. Each of these
directories contains the files shown in Table 1-5.
Table 1-5: IDE controller info in /proc/ide/ide?
..............................................................................
 channel   IDE channel (0 or 1)
 config    Configuration (only for PCI/IDE bridge)
 model     Type/Chipset of IDE controller
..............................................................................
Each device connected to a controller has a separate subdirectory in the
controller's directory. The files listed in Table 1-6 are contained in these
directories.
634 Table 1-6: IDE device information
635 ..............................................................................
638 capacity Capacity of the medium (in 512Byte blocks)
639 driver driver and version
640 geometry physical and logical geometry
641 identify device identify block
643 model device identifier
644 settings device setup
645 smart_thresholds IDE disk management thresholds
646 smart_values IDE disk management values
647 ..............................................................................
649 The most interesting file is settings. This file contains a nice overview of
650 the drive parameters:
652 # cat /proc/ide/ide0/hda/settings
653 name value min max mode
654 ---- ----- --- --- ----
655 bios_cyl 526 0 65535 rw
656 bios_head 255 0 255 rw
658 breada_readahead 4 0 127 rw
660 file_readahead 72 0 2097151 rw
662 keepsettings 0 0 1 rw
663 max_kb_per_request 122 1 127 rw
667 pio_mode write-only 0 255 w
673 1.4 Networking info in /proc/net
674 --------------------------------
676 The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the
677 additional values you get for IP version 6 if you configure the kernel to
678 support this. Table 1-7 lists the files and their meaning.
681 Table 1-6: IPv6 info in /proc/net
682 ..............................................................................
684 udp6 UDP sockets (IPv6)
685 tcp6 TCP sockets (IPv6)
686 raw6 Raw device statistics (IPv6)
687 igmp6 IP multicast addresses, which this host joined (IPv6)
688 if_inet6 List of IPv6 interface addresses
689 ipv6_route Kernel routing table for IPv6
690 rt6_stats Global IPv6 routing tables statistics
691 sockstat6 Socket statistics (IPv6)
692 snmp6 Snmp data (IPv6)
693 ..............................................................................
696 Table 1-7: Network info in /proc/net
697 ..............................................................................
700 dev network devices with statistics
dev_mcast the Layer2 multicast groups a device is listening to
702 (interface index, label, number of references, number of bound
704 dev_stat network device status
705 ip_fwchains Firewall chain linkage
706 ip_fwnames Firewall chain names
707 ip_masq Directory containing the masquerading tables
708 ip_masquerade Major masquerading table
709 netstat Network statistics
710 raw raw device statistics
711 route Kernel routing table
712 rpc Directory containing rpc info
713 rt_cache Routing cache
715 sockstat Socket statistics
717 tr_rif Token ring RIF routing table
719 unix UNIX domain sockets
720 wireless Wireless interface data (Wavelan etc)
721 igmp IP multicast addresses, which this host joined
722 psched Global packet scheduler parameters.
723 netlink List of PF_NETLINK sockets
724 ip_mr_vifs List of multicast virtual interfaces
725 ip_mr_cache List of multicast routing cache
726 ..............................................................................
728 You can use this information to see which network devices are available in
729 your system and how much traffic was routed over those devices:
733 face |bytes packets errs drop fifo frame compressed multicast|[...
734 lo: 908188 5596 0 0 0 0 0 0 [...
735 ppp0:15475140 20721 410 0 0 410 0 0 [...
736 eth0: 614530 7085 0 0 0 0 0 1 [...
739 ...] bytes packets errs drop fifo colls carrier compressed
740 ...] 908188 5596 0 0 0 0 0 0
741 ...] 1375103 17405 0 0 0 0 0 0
742 ...] 1703981 5535 0 0 0 3 0 0
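
The same numbers can be collected by a program. In the Python sketch below,
parse_net_dev is a name we chose, and the sample rejoins the two halves of
the listing above into full lines, which is how we assume the file looks
when it is not cut for printing. Note that the interface name may be
followed directly by the first number, as in the ppp0 line.

```python
# Column names as shown in the two header halves above.
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} for /proc/net/dev."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or len(values) != 16 or not all(v.isdigit() for v in values):
            continue  # header lines and anything unexpected are skipped
        nums = [int(v) for v in values]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, nums[:8])),
            "tx": dict(zip(TX_FIELDS, nums[8:])),
        }
    return stats


SAMPLE = """\
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets
    lo:  908188   5596   0 0 0   0 0 0  908188  5596 0 0 0 0 0 0
  ppp0:15475140  20721 410 0 0 410 0 0 1375103 17405 0 0 0 0 0 0
  eth0:  614530   7085   0 0 0   0 0 1 1703981  5535 0 0 0 3 0 0
"""

for name, s in parse_net_dev(SAMPLE).items():
    print(name, s["rx"]["bytes"], s["tx"]["bytes"])
```
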
In addition, each Channel Bond interface has its own directory. For
example, the bond0 device will have a directory called /proc/net/bond0/.
It will contain information that is specific to that bond, such as the
current slaves of the bond, the link status of the slaves, and how
many times each slave's link has failed.
753 If you have a SCSI host adapter in your system, you'll find a subdirectory
754 named after the driver for this adapter in /proc/scsi. You'll also see a list
755 of all recognized SCSI devices in /proc/scsi:
759 Host: scsi0 Channel: 00 Id: 00 Lun: 00
760 Vendor: IBM Model: DGHS09U Rev: 03E0
761 Type: Direct-Access ANSI SCSI revision: 03
762 Host: scsi0 Channel: 00 Id: 06 Lun: 00
763 Vendor: PIONEER Model: CD-ROM DR-U06S Rev: 1.04
764 Type: CD-ROM ANSI SCSI revision: 02
767 The directory named after the driver has one file for each adapter found in
768 the system. These files contain information about the controller, including
769 the used IRQ and the IO address range. The amount of information shown is
770 dependent on the adapter you use. The example shows the output for an Adaptec
771 AHA-2940 SCSI adapter:
773 > cat /proc/scsi/aic7xxx/0
775 Adaptec AIC7xxx driver version: 5.1.19/3.2.4
777 TCQ Enabled By Default : Disabled
778 AIC7XXX_PROC_STATS : Disabled
779 AIC7XXX_RESET_DELAY : 5
780 Adapter Configuration:
781 SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter
782 Ultra Wide Controller
783 PCI MMAPed I/O Base: 0xeb001000
784 Adapter SEEPROM Config: SEEPROM found and used.
785 Adaptec SCSI BIOS: Enabled
787 SCBs: Active 0, Max Active 2,
788 Allocated 15, HW 16, Page 255
790 BIOS Control Word: 0x18b6
791 Adapter Control Word: 0x005b
792 Extended Translation: Enabled
793 Disconnect Enable Flags: 0xffff
794 Ultra Enable Flags: 0x0001
795 Tag Queue Enable Flags: 0x0000
796 Ordered Queue Tag Flags: 0x0000
797 Default Tag Queue Depth: 8
798 Tagged Queue By Device array for aic7xxx host instance 0:
799 {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
800 Actual queue depth per device for aic7xxx host instance 0:
801 {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}
804 Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8
805 Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0)
806 Total transfers 160151 (74577 reads and 85574 writes)
808 Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15
809 Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0)
810 Total transfers 0 (0 reads and 0 writes)
813 1.6 Parallel port info in /proc/parport
814 ---------------------------------------
816 The directory /proc/parport contains information about the parallel ports of
817 your system. It has one subdirectory for each port, named after the port
820 These directories contain the four files shown in Table 1-8.
823 Table 1-8: Files in /proc/parport
824 ..............................................................................
826 autoprobe Any IEEE-1284 device ID information that has been acquired.
827 devices list of the device drivers using that port. A + will appear by the
828 name of the device currently using the port (it might not appear
830 hardware Parallel port's base address, IRQ line and DMA channel.
831 irq IRQ that parport is using for that port. This is in a separate
832 file to allow you to alter it by writing a new value in (IRQ
834 ..............................................................................
836 1.7 TTY info in /proc/tty
837 -------------------------
Information about the available and actually used ttys can be found in the
directory /proc/tty. You'll find entries for drivers and line disciplines in
this directory, as shown in Table 1-9.
844 Table 1-9: Files in /proc/tty
845 ..............................................................................
847 drivers       list of drivers and their usage
848 ldiscs        registered line disciplines
849 driver/serial usage statistic and status of single tty lines
850 ..............................................................................
852 To see which tty's are currently in use, you can simply look into the file /proc/tty/drivers:
855 > cat /proc/tty/drivers
856 pty_slave /dev/pts 136 0-255 pty:slave
857 pty_master /dev/ptm 128 0-255 pty:master
858 pty_slave /dev/ttyp 3 0-255 pty:slave
859 pty_master /dev/pty 2 0-255 pty:master
860 serial /dev/cua 5 64-67 serial:callout
861 serial /dev/ttyS 4 64-67 serial
862 /dev/tty0 /dev/tty0 4 0 system:vtmaster
863 /dev/ptmx /dev/ptmx 5 2 system
864 /dev/console /dev/console 5 1 system:console
865 /dev/tty /dev/tty 5 0 system:/dev/tty
866 unknown /dev/tty 4 1-63 console
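
The listing above can also be split into fields by a small script. The
following is only a sketch: the field names (name, node, major, minors, kind)
are assumptions made for illustration and are not defined by this document,
and the sample text is copied from the output shown above.

```python
# Sketch: split /proc/tty/drivers lines into fields.
# The field names are illustrative assumptions, not kernel-defined names.
from typing import List, NamedTuple


class TtyDriver(NamedTuple):
    name: str    # driver name
    node: str    # device node (or node prefix)
    major: int   # major device number
    minors: str  # minor number or range, e.g. "64-67"
    kind: str    # driver type, e.g. "serial" or "pty:slave"


def parse_tty_drivers(text: str) -> List[TtyDriver]:
    drivers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        name, node, major, minors, kind = fields
        drivers.append(TtyDriver(name, node, int(major), minors, kind))
    return drivers


SAMPLE = """\
serial /dev/cua 5 64-67 serial:callout
serial /dev/ttyS 4 64-67 serial
/dev/console /dev/console 5 1 system:console
"""

if __name__ == "__main__":
    # On a live system, read the text from /proc/tty/drivers instead.
    for drv in parse_tty_drivers(SAMPLE):
        print(drv.node, drv.major, drv.minors, drv.kind)
```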
869 1.8 Miscellaneous kernel statistics in /proc/stat
870 -------------------------------------------------
872 Various pieces of information about kernel activity are available in the
873 /proc/stat file. All of the numbers reported in this file are aggregates
874 since the system first booted. For a quick look, simply cat the file:
877 cpu 2255 34 2290 22625563 6290 127 456 0
878 cpu0 1132 34 1441 11311718 3675 127 438 0
879 cpu1 1123 0 849 11313845 2614 0 18 0
880 intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...]
887 The very first "cpu" line aggregates the numbers in all of the other "cpuN"
888 lines. These numbers identify the amount of time the CPU has spent performing
889 different kinds of work. Time units are in USER_HZ (typically hundredths of a
890 second). The meanings of the columns are as follows, from left to right:
892 - user: normal processes executing in user mode
893 - nice: niced processes executing in user mode
894 - system: processes executing in kernel mode
895 - idle: twiddling thumbs
896 - iowait: waiting for I/O to complete
897 - irq: servicing interrupts
898 - softirq: servicing softirqs
899 - steal: involuntary wait
901 The "intr" line gives counts of interrupts serviced since boot time, for each
902 of the possible system interrupts. The first column is the total of all
903 interrupts serviced; each subsequent column is the total for that particular interrupt.
906 The "ctxt" line gives the total number of context switches across all CPUs.
908 The "btime" line gives the time at which the system booted, in seconds since the Unix epoch.
911 The "processes" line gives the number of processes and threads created, which
912 includes (but is not limited to) those created by calls to the fork() and
913 clone() system calls.
915 The "procs_running" line gives the number of processes currently running on CPUs.
918 The "procs_blocked" line gives the number of processes currently blocked,
919 waiting for I/O to complete.
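
The "cpu" columns described above can be given names with a short script. This
is a minimal sketch: the helper names parse_cpu_lines() and idle_fraction()
are made up for illustration, and the sample text is the output shown earlier
in this section. Only the eight columns listed above are read; any extra
columns on a line are ignored.

```python
# Sketch: name the columns of the "cpu" lines in /proc/stat.
CPU_FIELDS = ("user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal")


def parse_cpu_lines(text):
    """Return {"cpu": {...}, "cpu0": {...}, ...} with values in USER_HZ."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue  # "intr", "ctxt", "btime", ... are not handled here
        values = [int(v) for v in parts[1:1 + len(CPU_FIELDS)]]
        result[parts[0]] = dict(zip(CPU_FIELDS, values))
    return result


def idle_fraction(fields):
    """Fraction of the accounted time spent idle since boot."""
    total = sum(fields.values())
    return fields["idle"] / total if total else 0.0


SAMPLE = """\
cpu 2255 34 2290 22625563 6290 127 456 0
cpu0 1132 34 1441 11311718 3675 127 438 0
cpu1 1123 0 849 11313845 2614 0 18 0
"""

if __name__ == "__main__":
    # On a live system, read the text from /proc/stat instead.
    stats = parse_cpu_lines(SAMPLE)
    print(stats["cpu"]["idle"], round(idle_fraction(stats["cpu"]), 4))
```

Because the counters are aggregates since boot, a utilisation figure for a
given interval would be computed from the difference between two readings.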
922 1.9 Ext4 file system parameters
923 ------------------------------
925 Information about mounted ext4 file systems can be found in
926 /proc/fs/ext4. Each mounted filesystem will have a directory in
927 /proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or
928 /proc/fs/ext4/dm-0). The files in each per-device directory are shown
929 in Table 1-10, below.
931 Table 1-10: Files in /proc/fs/ext4/<devname>
932 ..............................................................................
934 mb_groups  details of multiblock allocator buddy cache of free blocks
935 mb_history multiblock allocation history
936 ..............................................................................
939 ------------------------------------------------------------------------------
941 ------------------------------------------------------------------------------
942 The /proc file system serves information about the running system. It not only
943 allows access to process data but also allows you to request the kernel status
944 by reading files in the hierarchy.
946 The directory structure of /proc reflects the types of information and makes
947 it easy, if not obvious, where to look for specific data.
948 ------------------------------------------------------------------------------
950 ------------------------------------------------------------------------------
951 CHAPTER 2: MODIFYING SYSTEM PARAMETERS
952 ------------------------------------------------------------------------------
954 ------------------------------------------------------------------------------
956 ------------------------------------------------------------------------------
957 * Modifying kernel parameters by writing into files found in /proc/sys
958 * Exploring the files which modify certain parameters
959 * Review of the /proc/sys file tree
960 ------------------------------------------------------------------------------
963 A very interesting part of /proc is the directory /proc/sys. This is not only
964 a source of information, it also allows you to change parameters within the
965 kernel. Be very careful when attempting this. You can optimize your system,
966 but you can also cause it to crash. Never alter kernel parameters on a
967 production system. Set up a development machine and test to make sure that
968 everything works the way you want it to. You may have no alternative but to
969 reboot the machine once an error has been made.
971 To change a value, simply echo the new value into the file. An example is
972 given below in the section on the file system data. You need to be root to do
973 this. You can create your own boot script to perform this every time your system boots.
976 The files in /proc/sys can be used to fine tune and monitor miscellaneous and
977 general things in the operation of the Linux kernel. Since some of the files
978 can inadvertently disrupt your system, it is advisable to read both
979 documentation and source before actually making adjustments. In any case, be
980 very careful when writing to any of these files. The entries in /proc may
981 change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt,
982 review the kernel documentation in the directory /usr/src/linux/Documentation.
983 This chapter is heavily based on the documentation included in the pre 2.2
984 kernels, and became part of it in version 2.2.1 of the Linux kernel.
986 Please see: Documentation/sysctls/ directory for descriptions of these
989 ------------------------------------------------------------------------------
991 ------------------------------------------------------------------------------
992 Certain aspects of kernel behavior can be modified at runtime, without the
993 need to recompile the kernel, or even to reboot the system. The files in the
994 /proc/sys tree can not only be read, but also modified. You can use the echo
995 command to write values into these files, thereby changing the default settings of the kernel.
997 ------------------------------------------------------------------------------
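
Writing a value from a script works the same way as using echo. The sketch
below uses the hypothetical helpers write_param() and read_param() and
exercises them on a scratch file, so nothing under /proc/sys is touched; the
file name "example_param" is made up. Pointing the helpers at a real
/proc/sys file requires root, and the warnings in this chapter apply.

```python
# Sketch: the script equivalent of `echo value > file` and `cat file`.
import os
import tempfile


def write_param(path, value):
    """Write a new value to a parameter file, followed by a newline."""
    with open(path, "w") as f:
        f.write(f"{value}\n")


def read_param(path):
    """Read the current value of a parameter file as a string."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Demonstrate on a temporary file instead of a real kernel parameter.
    with tempfile.TemporaryDirectory() as scratch:
        demo = os.path.join(scratch, "example_param")  # hypothetical name
        write_param(demo, 1)
        print(read_param(demo))
```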
999 ------------------------------------------------------------------------------
1000 CHAPTER 3: PER-PROCESS PARAMETERS
1001 ------------------------------------------------------------------------------
1003 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
1004 ------------------------------------------------------
1006 This file can be used to adjust the score used to select which processes should
1007 be killed in an out-of-memory situation. The oom_adj value is a characteristic
1008 of the task's mm, so all threads that share an mm with pid will have the same
1009 oom_adj value. A high value will increase the likelihood of this process being
1010 killed by the oom-killer. Valid values are in the range -16 to +15 as
1011 explained below and a special value of -17, which disables oom-killing
1012 altogether for threads sharing pid's mm.
1014 The process to be killed in an out-of-memory situation is selected among all others
1015 based on its badness score. This value equals the original memory size of the process
1016 and is then updated according to its CPU time (utime + stime) and the
1017 run time (uptime - start time). The longer it runs, the smaller the score.
1018 The badness score is divided by the square root of the CPU time and then by
1019 the double square root of the run time.
1021 Swapped out tasks are killed first. Half of each child's memory size is added to
1022 the parent's score if they do not share the same memory. Thus forking servers
1023 are the prime candidates to be killed. Having only one 'hungry' child will make
1024 the parent less preferable than the child.
1026 /proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from
1027 oom-killing already.
1029 /proc/<pid>/oom_score shows the process's current badness score.
1031 The following heuristics are then applied:
1032 * if the task was reniced, its score doubles
1033 * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE
1034 or CAP_SYS_RAWIO) have their score divided by 4
1035 * if oom condition happened in one cpuset and checked task does not belong
1036 to it, its score is divided by 8
1037 * the resulting score is multiplied by two to the power of oom_adj, i.e.
1038 points <<= oom_adj when it is positive and
1039 points >>= -(oom_adj) otherwise
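
The calculation described above can be summarised in a few lines of code. This
is a simplified sketch that follows the prose only, not the kernel source: the
function name and its parameters are made up, the units of mem_size, cpu_time
and run_time are left abstract, the extra points for children are omitted, and
returning 0 for the special value -17 is an assumption used here to represent
"never selected".

```python
# Sketch of the badness heuristics as described in the text (not kernel code).
import math

OOM_DISABLE = -17  # special oom_adj value that disables oom-killing


def badness(mem_size, cpu_time, run_time, oom_adj=0,
            reniced=False, privileged=False, other_cpuset=False):
    if oom_adj == OOM_DISABLE:
        return 0  # assumption: 0 stands for "never selected"

    points = mem_size

    # Divide by the square root of the CPU time ...
    s = math.isqrt(cpu_time)
    if s:
        points //= s
    # ... and then by the double square root of the run time.
    s = math.isqrt(math.isqrt(run_time))
    if s:
        points //= s

    if reniced:
        points *= 2    # reniced tasks: score doubles
    if privileged:
        points //= 4   # superuser or direct hardware access
    if other_cpuset:
        points //= 8   # task is outside the cpuset that hit the oom

    # Multiply by two to the power of oom_adj.
    if oom_adj > 0:
        points <<= oom_adj
    else:
        points >>= -oom_adj
    return points


if __name__ == "__main__":
    for adj in (-4, 0, 4):
        print(adj, badness(mem_size=100000, cpu_time=16, run_time=256,
                           oom_adj=adj))
```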
1041 The task with the highest badness score is then selected and its children
1042 are killed. The process itself is killed in an OOM situation when it has no
1043 children or when some of them have disabled oom-killing as described above.
1045 3.2 /proc/<pid>/oom_score - Display current oom-killer score
1046 -------------------------------------------------------------
1048 This file can be used to check the current score used by the oom-killer for
1049 any given <pid>. Use it together with /proc/<pid>/oom_adj to tune which
1050 process should be killed in an out-of-memory situation.
1053 3.3 /proc/<pid>/io - Display the IO accounting fields
1054 -------------------------------------------------------
1056 This file contains IO statistics for each running process.
1061 test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
1064 test:/tmp # cat /proc/3828/io
1070 write_bytes: 323932160
1071 cancelled_write_bytes: 0
1080 I/O counter: chars read
1081 The number of bytes which this task has caused to be read from storage. This
1082 is simply the sum of bytes which this process passed to read() and pread().
1083 It includes things like tty IO and it is unaffected by whether or not actual
1084 physical disk IO was required (the read might have been satisfied from pagecache).
1091 I/O counter: chars written
1092 The number of bytes which this task has caused, or shall cause to be written
1093 to disk. Similar caveats apply here as with rchar.
1099 I/O counter: read syscalls
1100 Attempt to count the number of read I/O operations, i.e. syscalls like read() and pread().
1107 I/O counter: write syscalls
1108 Attempt to count the number of write I/O operations, i.e. syscalls like
1109 write() and pwrite().
1115 I/O counter: bytes read
1116 Attempt to count the number of bytes which this process really did cause to
1117 be fetched from the storage layer. Done at the submit_bio() level, so it is
1118 accurate for block-backed filesystems. <please add status regarding NFS and
1119 CIFS at a later time>
1125 I/O counter: bytes written
1126 Attempt to count the number of bytes which this process caused to be sent to
1127 the storage layer. This is done at page-dirtying time.
1130 cancelled_write_bytes
1131 ---------------------
1133 The big inaccuracy here is truncate. If a process writes 1MB to a file and
1134 then deletes the file, it will in fact perform no writeout. But it will have
1135 been accounted as having caused 1MB of write.
1136 In other words: The number of bytes which this process caused to not happen,
1137 by truncating pagecache. A task can cause "negative" IO too. If this task
1138 truncates some dirty pagecache, some IO which another task has been accounted
1139 for (in its write_bytes) will not be happening. We _could_ just subtract that
1140 from the truncating task's write_bytes, but there is information loss in doing
1147 In its current implementation state, this is a bit racy on 32-bit machines: if
1148 process A reads process B's /proc/pid/io while process B is updating one of
1149 those 64-bit counters, process A could see an intermediate result.
1152 More information about this can be found within the taskstats documentation in
1153 Documentation/accounting.
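
Since every line of /proc/<pid>/io has the form "name: value", the file is
easy to read from a script. The following is a minimal sketch; the helper
name parse_proc_io() is made up, and the sample text contains only the two
lines of the example output shown above.

```python
# Sketch: turn the "name: value" lines of /proc/<pid>/io into a dictionary.
def parse_proc_io(text):
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon
            counters[name.strip()] = int(value)
    return counters


SAMPLE = """\
write_bytes: 323932160
cancelled_write_bytes: 0
"""

if __name__ == "__main__":
    # On a live system, read the text from /proc/<pid>/io instead.
    for name, value in parse_proc_io(SAMPLE).items():
        print(name, value)
```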
1155 3.4 /proc/<pid>/coredump_filter - Core dump filtering settings
1156 ---------------------------------------------------------------
1157 When a process is dumped, all anonymous memory is written to a core file as
1158 long as the size of the core file isn't limited. But sometimes we don't want
1159 to dump some memory segments, for example, huge shared memory. Conversely,
1160 sometimes we want to save file-backed memory segments into a core file, not
1161 only the individual files.
1163 /proc/<pid>/coredump_filter allows you to customize which memory segments
1164 will be dumped when the <pid> process is dumped. coredump_filter is a bitmask
1165 of memory types. If a bit of the bitmask is set, memory segments of the
1166 corresponding memory type are dumped, otherwise they are not dumped.
1168 The following 7 memory types are supported:
1169 - (bit 0) anonymous private memory
1170 - (bit 1) anonymous shared memory
1171 - (bit 2) file-backed private memory
1172 - (bit 3) file-backed shared memory
1173 - (bit 4) ELF header pages in file-backed private memory areas (it is
1174 effective only if the bit 2 is cleared)
1175 - (bit 5) hugetlb private memory
1176 - (bit 6) hugetlb shared memory
1178 Note that MMIO pages such as frame buffer are never dumped and vDSO pages
1179 are always dumped regardless of the bitmask status.
1181 Note that bits 0-4 don't affect any hugetlb memory. hugetlb memory is only
1182 affected by bits 5-6.
1184 Default value of coredump_filter is 0x23; this means all anonymous memory
1185 segments and hugetlb private memory are dumped.
1187 If you don't want to dump all shared memory segments attached to pid 1234,
1188 write 0x21 to the process's proc file.
1190 $ echo 0x21 > /proc/1234/coredump_filter
1192 When a new process is created, the process inherits the bitmask status from its
1193 parent. It is useful to set up coredump_filter before the program runs.
1196 $ echo 0x7 > /proc/self/coredump_filter
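
To check what a given mask selects, the bits can be decoded with a few lines
of code. This is a sketch only: the helper name decode_coredump_filter() is
made up, and the descriptions are copied from the list of memory types above.

```python
# Sketch: list the memory types selected by a coredump_filter bitmask.
MEMORY_TYPES = [
    "anonymous private memory",                                     # bit 0
    "anonymous shared memory",                                      # bit 1
    "file-backed private memory",                                   # bit 2
    "file-backed shared memory",                                    # bit 3
    "ELF header pages in file-backed private memory areas",         # bit 4
    "hugetlb private memory",                                       # bit 5
    "hugetlb shared memory",                                        # bit 6
]


def decode_coredump_filter(mask):
    """Return the descriptions of all memory types whose bit is set."""
    return [name for bit, name in enumerate(MEMORY_TYPES)
            if mask & (1 << bit)]


if __name__ == "__main__":
    for mask in (0x23, 0x21, 0x7):
        print(hex(mask), decode_coredump_filter(mask))
```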
1199 3.5 /proc/<pid>/mountinfo - Information about mounts
1200 --------------------------------------------------------
1202 This file contains lines of the form:
1204 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
1205 (1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)
1207 (1) mount ID: unique identifier of the mount (may be reused after umount)
1208 (2) parent ID: ID of parent (or of self for the top of the mount tree)
1209 (3) major:minor: value of st_dev for files on filesystem
1210 (4) root: root of the mount within the filesystem
1211 (5) mount point: mount point relative to the process's root
1212 (6) mount options: per mount options
1213 (7) optional fields: zero or more fields of the form "tag[:value]"
1214 (8) separator: marks the end of the optional fields
1215 (9) filesystem type: name of filesystem of the form "type[.subtype]"
1216 (10) mount source: filesystem specific information or "none"
1217 (11) super options: per super block options
1219 Parsers should ignore all unrecognised optional fields. Currently the
1220 possible optional fields are:
1222 shared:X mount is shared in peer group X
1223 master:X mount is slave to peer group X
1224 propagate_from:X mount is slave and receives propagation from peer group X (*)
1225 unbindable mount is unbindable
1227 (*) X is the closest dominant peer group under the process's root. If
1228 X is the immediate master of the mount, or if there's no dominant peer
1229 group under the same root, then only the "master:X" field is present
1230 and not the "propagate_from:X" field.
1232 For more information on mount propagation see:
1234 Documentation/filesystems/sharedsubtree.txt
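
A parser for the line format above has to locate the separator (8) first,
because the number of optional fields varies. The following is a minimal
sketch of that approach; the function name and the dictionary keys are made
up for illustration, and the sample line is the one shown at the top of this
section.

```python
# Sketch: split one /proc/<pid>/mountinfo line into its eleven fields.
def parse_mountinfo_line(line):
    fields = line.split()
    # Fields (1)-(6) are fixed; optional fields (7) run up to the "-" (8).
    sep = fields.index("-", 6)
    optional = {}
    for item in fields[6:sep]:
        tag, _, value = item.partition(":")  # "tag[:value]"
        optional[tag] = value or None
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "major_minor": fields[2],
        "root": fields[3],
        "mount_point": fields[4],
        "mount_options": fields[5].split(","),
        "optional": optional,  # unrecognised tags are kept, not interpreted
        "fs_type": fields[sep + 1],
        "source": fields[sep + 2],
        "super_options": fields[sep + 3].split(","),
    }


SAMPLE = ("36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 "
          "- ext3 /dev/root rw,errors=continue")

if __name__ == "__main__":
    info = parse_mountinfo_line(SAMPLE)
    print(info["mount_point"], info["fs_type"], info["optional"])
```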