Igor Olemskoi: practical notes on Linux CentOS system administration

Tag archive: ‘openvz’

devel@ mailing list mess is no more (reprint)

OK, I must admit it was a very bad idea on my part to subscribe our devel at openvz dot org mailing list to the containers at linux-foundation dot com mailing list.

This is to announce that from now on devel@ is a separate list, not mirroring containers@ or anything else. From now on, if the topic is OpenVZ-specific, like a patch to OpenVZ, please use devel@. If the topic is about containers (as appearing in mainline), use containers@.

Let me explain. Initially, when we started moving the OpenVZ project forward, we wanted to discuss everything about containers on a mailing list, and therefore I created devel@. Later, when other parties joined, it was decided to create the containers at osdl.org mailing list (remember, OSDL later became the Linux Foundation). At that time I was worried that the discussions would split, so I decided to just subscribe our devel@ to containers@, making devel@ a superset of containers@ (i.e. every message posted to containers@ would appear on devel@, but not vice versa).

Of course it ended up being a big mess. Better late than never: the mess is no more!

back to 2006, or openvz bug #60 (reprint)

Some software bugs, while being simple and stupid, have an interesting and long-lasting life. Here is the story of one such very simple bug with a lifespan of about 5 years (or more? I don't know when it was introduced). The bug isn't worth looking at otherwise, so I'll try to be brief; more info is available from the links.

OK, back in 2006 I whined about a bug we found in sysvinit. Until today I thought it was never fixed upstream.

Tonight I found out that it was actually fixed in sysvinit 2.87dsf, released in July 2009, according to its changelog:

 * Adjust init to terminate argv0 with one 0 rather than two so that
    process name can be one character longer.  Patch by Kir Kolyshkin.

Unfortunately, it wrongly credits me as the patch author. The actual author is Dmitry Mishin, as seen in OpenVZ bug #60; I just submitted it.
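
To make the changelog entry concrete, here is a minimal sketch of the off-by-one. It is not the sysvinit code: `set_title`, its parameters, and the 5-byte area used below are made up for illustration. The only idea taken from the changelog is that reserving two trailing zero bytes instead of one costs one character of the process name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from sysvinit): write `name` into a fixed
 * title area of `area_len` bytes, keeping `reserved` trailing zero bytes.
 * The name is silently truncated if it does not fit. */
void set_title(char *area, size_t area_len, const char *name, size_t reserved)
{
    size_t room, n;

    assert(area != NULL && name != NULL);
    if (area_len <= reserved)
        return;                     /* no room for even one character */

    room = area_len - reserved;     /* bytes available for the name */
    n = strlen(name);
    if (n > room)
        n = room;

    memset(area, 0, area_len);      /* everything after the name stays 0 */
    memcpy(area, name, n);
}
```

With a 5-byte area, `set_title(area, 5, "init", 2)` leaves `ini`, while reserving a single terminator (`reserved` = 1) leaves the full `init`.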

20.02.2011

on static function declaration (reprint)

If you are a seasoned C programmer, skip this post entirely (or try to find bugs in it). If you know C but don't consider yourself an expert, please read on — it might be helpful.

I was working a bit on vzctl today (my target was bug #1757, which is still a work in progress) and ... I am not sure how, but I ended up declaring most functions in src/vzlist.c as static. I thought it would not have any practical value. I was wrong!

In C, if you declare the function as static, it means its visibility is limited to the translation unit (i.e. a file) in which it is defined. In other words, you can only call/use a static function from another function in the same file.
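
As a small illustration of that rule (the function names here are invented for the example, they are not from vzlist.c), a file might keep its helper private and export only one function:

```c
#include <assert.h>
#include <stddef.h>

/* Internal helper: `static` gives it internal linkage, so no other
 * translation unit can call it or clash with its name. */
static int is_sep(char c)
{
    return c == ',';
}

/* Public function (external linkage): counts comma-separated fields.
 * An empty string has zero fields. */
size_t count_fields(const char *s)
{
    size_t n;

    assert(s != NULL);
    if (*s == '\0')
        return 0;
    for (n = 1; *s != '\0'; s++)
        if (is_sep(*s))
            n++;
    return n;
}
```

Another file can declare and call `count_fields`, but a reference to `is_sep` from outside this file will not link.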

Now, in the vzctl sources vzlist.c is linked into only one binary (vzlist), and therefore I thought it did not make much sense to declare its functions as static. Nevertheless I did it (see git commit).

The next thing I got was a set of compiler warnings! OK, all right, let's take a look...

The first set of warnings is self-explanatory. See:
vzlist.c:825:14: warning: ‘parse_var’ defined but not used
vzlist.c:1075:14: warning: ‘remove_sp’ defined but not used
vzlist.c:1357:12: warning: ‘get_stop_quota_stats’ defined but not used

Easy! Once upon a time these functions were used; the code has since changed and nobody needs these three any more, but they were never removed (probably just forgotten). Solution: remove the dead code (see git commit).

The second set of warnings looks similar:
vzlist.c:400:1: warning: ‘dcachesize_m_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_l_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_b_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_f_sort_fn’ defined but not used
vzlist.c:411:1: warning: ‘diskinodes_s_sort_fn’ defined but not used
vzlist.c:411:1: warning: ‘diskinodes_h_sort_fn’ defined but not used

Hmm... all these *_sort_fn are sort functions generated by means of a few #define statements, and they are used when vzlist needs to sort its output by some column or parameter (vzlist -s). It is very strange that these are not used, because they should be. Let's take a closer look... zOMG! it's a bug!

Apparently, someone used the copy-paste technique* and forgot to change the names of the functions. The bug is that when you ask vzlist to sort its output by, say, dcachesize failcounter values, it sorts by dcachesize held values instead, because the wrong sort function is used. Such bugs are hard to notice manually, and there are no autotests for vzlist.
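
Here is a rough sketch of how such a setup can look and why `static` exposes the mistake. It is not the vzlist code: `struct ct_stat`, `SORT_FN`, `sort_keys` and `sort_by` are hypothetical names, and only the idea of macro-generated `*_sort_fn` comparators comes from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-container record; field names are illustrative. */
struct ct_stat {
    int id;
    unsigned long held;
    unsigned long failcnt;
};

/* Generate one static comparator per field. */
#define SORT_FN(field)                                          \
static int field##_sort_fn(const void *a, const void *b)        \
{                                                               \
    const struct ct_stat *x = a, *y = b;                        \
    return (x->field > y->field) - (x->field < y->field);       \
}

SORT_FN(held)
SORT_FN(failcnt)

struct sort_key {
    const char *name;
    int (*cmp)(const void *, const void *);
};

/* If the second entry were copy-pasted as held_sort_fn, failcnt_sort_fn
 * would be a static function nobody references, and gcc -Wall would
 * report it as "defined but not used". */
static const struct sort_key sort_keys[] = {
    { "held",    held_sort_fn },
    { "failcnt", failcnt_sort_fn },
};

/* Sort n records in ascending order by the named key.
 * Returns 0 on success, -1 if the key is unknown. */
int sort_by(struct ct_stat *v, size_t n, const char *name)
{
    size_t i;

    assert(v != NULL && name != NULL);
    for (i = 0; i < sizeof(sort_keys) / sizeof(sort_keys[0]); i++) {
        if (strcmp(sort_keys[i].name, name) == 0) {
            qsort(v, n, sizeof(*v), sort_keys[i].cmp);
            return 0;
        }
    }
    return -1;
}
```

Without `static`, the comparators have external linkage, the compiler has to assume some other file might use them, and the wrong table entry goes unreported.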

* Yes, some parts of vzlist are a copy-pasted mess; I am slowly working on untangling it. For example, see my previous cleanup patches (committed back in June 2010):
src/vzlist.c: streamline a few macros
vzlist: put similar print_ functions in a macro
vzlist.c: simplify last_field logic

Moral: sometimes declaring functions as static actually helps!

PS: If you see mistakes in this blog post, patches to it are welcome. It's 1 a.m. here and I am a bit sleepy.

08.02.2011

Kernel 2.6.27 repin aka "Unexpected return" (reprint)

You probably thought we had abandoned the 2.6.27 kernel branch. Well, we thought so ourselves (although it was never officially announced). Then, all of a sudden, kernel 2.6.27-repin.1 is released, rebased onto the latest upstream kernel (2.6.27.57) and fixing OpenVZ bug #1593.

The thing is, this kernel is named after Ilya Repin, a leading Russian painter and sculptor of the Peredvizhniki artistic school. One of his best paintings is called «Unexpected Return», and I happened to enjoy the original in the Tretyakov Gallery here in Moscow a couple of weeks ago. So here it is: the unexpected return of the 2.6.27 kernel. It took Ilya 4 years to finish the painting; it took Pavel 6 months to release the fix. Better late than never, that is.

Please enjoy: Ilya Repin. Unexpected Return. 1884–1888.

26.01.2011

news from the VSwap front (reprint)

I have added vswap configuration samples to vzctl git. Basically, you set physpages and swappages and leave every other beancounter at unlimited. For example, this is what ve-vswap-256m-conf.sample looks like:

# UBC parameters (in form of barrier:limit)
PHYSPAGES="0:256M"
SWAPPAGES="0:512M"
KMEMSIZE="unlimited"
LOCKEDPAGES="unlimited"
PRIVVMPAGES="unlimited"
SHMPAGES="unlimited"
NUMPROC="unlimited"
VMGUARPAGES="unlimited"
OOMGUARPAGES="unlimited"
NUMTCPSOCK="unlimited"
NUMFLOCK="unlimited"
NUMPTY="unlimited"
NUMSIGINFO="unlimited"
TCPSNDBUF="unlimited"
TCPRCVBUF="unlimited"
OTHERSOCKBUF="unlimited"
DGRAMRCVBUF="unlimited"
NUMOTHERSOCK="unlimited"
DCACHESIZE="unlimited"
NUMFILE="unlimited"
NUMIPTENT="unlimited"

# Disk quota parameters (in form of softlimit:hardlimit)
DISKSPACE="1G"
DISKINODES="200000"
QUOTATIME="0"

# CPU fair scheduler parameter
CPUUNITS="1000"

As you can see, physpages (i.e. RAM size) is set to 256 megabytes and swappages (i.e. swap size) is set to 512 megabytes, while all the other beancounters are unlimited. Wow, it's never been easier to configure your containers!
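
For the curious, here is how those sizes relate to the numbers in the output in this post. This is just worked arithmetic, assuming 4 KiB pages (the usual x86 page size) and sizes in binary megabytes; the helper names are made up for the example.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 4 KiB pages, as on typical x86 systems. */
#define PAGE_SIZE_BYTES 4096UL

/* Convert a size in MiB to a number of pages. */
unsigned long mib_to_pages(unsigned long mib)
{
    assert(mib <= ULONG_MAX / (1024UL * 1024UL));   /* avoid overflow */
    return mib * 1024UL * 1024UL / PAGE_SIZE_BYTES;
}

/* Convert a number of pages to KiB, the unit used by free and meminfo. */
unsigned long pages_to_kib(unsigned long pages)
{
    assert(pages <= ULONG_MAX / (PAGE_SIZE_BYTES / 1024UL));
    return pages * (PAGE_SIZE_BYTES / 1024UL);
}
```

So `mib_to_pages(256)` gives 65536, which is the physpages limit shown in /proc/user_beancounters, and `pages_to_kib(65536)` gives 262144, the total memory reported by free. The same arithmetic turns 512M of swap into 524288 kB.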

Now we can put this to use with a RHEL6-based kernel. This is what we see from inside the container:

[root@localhost ~]# vzctl enter 103
entered into CT 103
[root@localhost /]# free
             total       used       free     shared    buffers     cached
Mem:        262144      23936     238208          0          0      10968
-/+ buffers/cache:      12968     249176
Swap:       524288          0     524288
[root@localhost /]# cat /proc/user_beancounters
Version: 2.5
       uid  resource                     held              maxheld              barrier                limit              failcnt
      103:  kmemsize                  4722976              4853726  9223372036854775807  9223372036854775807                    0
            lockedpages                     0                    0  9223372036854775807  9223372036854775807                    0
            privvmpages                  4296                13875  9223372036854775807  9223372036854775807                    0
            shmpages                       31                   31  9223372036854775807  9223372036854775807                    0
            dummy                           0                    0                    0                    0                    0
            numproc                        33                   33  9223372036854775807  9223372036854775807                    0
            physpages                    5984                 5985                    0                65536                    0
            vmguarpages                     0                    0  9223372036854775807  9223372036854775807                    0
            oomguarpages                 2696                 2696  9223372036854775807  9223372036854775807                    0
            numtcpsock                      4                    4  9223372036854775807  9223372036854775807                    0
            numflock                        5                    6  9223372036854775807  9223372036854775807                    0
            numpty                          1                    1  9223372036854775807  9223372036854775807                    0
            numsiginfo                     12                   18  9223372036854775807  9223372036854775807                    0
            tcpsndbuf                   69760                    0  9223372036854775807  9223372036854775807                    0
            tcprcvbuf                   65536                    0  9223372036854775807  9223372036854775807                    0
            othersockbuf                 2312                10768  9223372036854775807  9223372036854775807                    0
            dgramrcvbuf                     0                    0  9223372036854775807  9223372036854775807                    0
            numothersock                   51                   53  9223372036854775807  9223372036854775807                    0
            dcachesize                1172451              1172451  9223372036854775807  9223372036854775807                    0
            numfile                       370                  390  9223372036854775807  9223372036854775807                    0
            dummy                           0                    0                    0                    0                    0
            dummy                           0                    0                    0                    0                    0
            dummy                           0                    0                    0                    0                    0
            numiptent                      14                   14  9223372036854775807  9223372036854775807                    0

[root@localhost /]# cat /proc/meminfo
MemTotal:         262144 kB
MemFree:          238208 kB
Cached:            10968 kB
Active:            16956 kB
Inactive:           1384 kB
Active(anon):       6352 kB
Inactive(anon):     1020 kB
Active(file):      10604 kB
Inactive(file):      364 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        524288 kB
SwapFree:         524288 kB
Dirty:                 0 kB
AnonPages:          7364 kB
Mapped:             3416 kB
Shmem:               124 kB
Slab:               4012 kB
SReclaimable:       1088 kB
SUnreclaim:         2924 kB

25.01.2011