msgid ""
msgstr ""
"Project-Id-Version: 0\n"
"POT-Creation-Date: 2015-10-06 16:10+0200\n"
"PO-Revision-Date: 2015-10-06 16:10+0200\n"
"Last-Translator: Automatically generated\n"
"Language-Team: None\n"
"Language: en-US \n"
"MIME-Version: 1.0\n"
"Content-Type: application/x-publican; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Publican v4.3.2\n"
msgid "RAID"
msgstr ""
msgid "LVM"
msgstr ""
msgid "FAI"
msgstr ""
msgid "Preseeding"
msgstr ""
msgid "Monitoring"
msgstr ""
msgid "Virtualization"
msgstr ""
msgid "Xen"
msgstr ""
msgid "LXC"
msgstr ""
msgid "Advanced Administration"
msgstr ""
msgid "This chapter revisits some aspects we already described, with a different perspective: instead of installing one single computer, we will study mass-deployment systems; instead of creating RAID or LVM volumes at install time, we'll learn to do it by hand so we can later revise our initial choices. Finally, we will discuss monitoring tools and virtualization techniques. As a consequence, this chapter is more particularly targeting professional administrators, and focuses somewhat less on individuals responsible for their home network."
msgstr ""
msgid "RAID and LVM"
msgstr ""
msgid "<xref linkend=\"installation\" /> presented these technologies from the point of view of the installer, and how it integrated them to make their deployment easy from the start. After the initial installation, an administrator must be able to handle evolving storage space needs without having to resort to an expensive reinstallation. They must therefore understand the required tools for manipulating RAID and LVM volumes."
msgstr ""
msgid "RAID and LVM are both techniques to abstract the mounted volumes from their physical counterparts (actual hard-disk drives or partitions thereof); the former secures the data against hardware failure by introducing redundancy, the latter makes volume management more flexible and independent of the actual size of the underlying disks. In both cases, the system ends up with new block devices, which can be used to create filesystems or swap space, without necessarily having them mapped to one physical disk. RAID and LVM come from quite different backgrounds, but their functionality can overlap somewhat, which is why they are often mentioned together."
msgstr ""
msgid "<emphasis>PERSPECTIVE</emphasis> Btrfs combines LVM and RAID"
msgstr ""
msgid "While LVM and RAID are two distinct kernel subsystems that come between the disk block devices and their filesystems, <emphasis>btrfs</emphasis> is a new filesystem, initially developed at Oracle, that purports to combine the featuresets of LVM and RAID and much more. It is mostly functional, and although it is still tagged “experimental” because its development is incomplete (some features aren't implemented yet), it has already seen some use in production environments. <ulink type=\"block\" url=\"http://btrfs.wiki.kernel.org/\" />"
msgstr ""
msgid "Among the noteworthy features are the ability to take a snapshot of a filesystem tree at any point in time. This snapshot copy doesn't initially use any disk space, the data only being duplicated when one of the copies is modified. The filesystem also handles transparent compression of files, and checksums ensure the integrity of all stored data."
msgstr ""
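msgid "As an illustrative sketch (the paths are hypothetical), taking such a snapshot of a btrfs subvolume boils down to a single command:"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>btrfs subvolume snapshot /mnt/data /mnt/data/snap-2015-10-06</userinput>"
msgstr ""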
msgid "In both the RAID and LVM cases, the kernel provides a block device file, similar to the ones corresponding to a hard disk drive or a partition. When an application, or another part of the kernel, requires access to a block of such a device, the appropriate subsystem routes the block to the relevant physical layer. Depending on the configuration, this block can be stored on one or several physical disks, and its physical location may not be directly correlated to the location of the block in the logical device."
msgstr ""
msgid "Software RAID"
msgstr ""
msgid "<primary>RAID</primary>"
msgstr ""
msgid "RAID means <emphasis>Redundant Array of Independent Disks</emphasis>. The goal of this system is to prevent data loss in case of hard disk failure. The general principle is quite simple: data are stored on several physical disks instead of only one, with a configurable level of redundancy. Depending on this amount of redundancy, and even in the event of an unexpected disk failure, data can be losslessly reconstructed from the remaining disks."
msgstr ""
msgid "<emphasis>CULTURE</emphasis> <foreignphrase>Independent</foreignphrase> or <foreignphrase>inexpensive</foreignphrase>?"
msgstr ""
msgid "The I in RAID initially stood for <emphasis>inexpensive</emphasis>, because RAID allowed a drastic increase in data safety without requiring an investment in expensive high-end disks. Probably due to image concerns, however, it is now more customarily considered to stand for <emphasis>independent</emphasis>, which doesn't have the unsavory flavor of cheapness."
msgstr ""
msgid "RAID can be implemented either by dedicated hardware (RAID modules integrated into SCSI or SATA controller cards) or by software abstraction (the kernel). Whether hardware or software, a RAID system with enough redundancy can transparently stay operational when a disk fails; the upper layers of the stack (applications) can even keep accessing the data in spite of the failure. Of course, this “degraded mode” can have an impact on performance, and redundancy is reduced, so a further disk failure can lead to data loss. In practice, therefore, one will strive to only stay in this degraded mode for as long as it takes to replace the failed disk. Once the new disk is in place, the RAID system can reconstruct the required data so as to return to a safe mode. The applications won't notice anything, apart from potentially reduced access speed, while the array is in degraded mode or during the reconstruction phase."
msgstr ""
msgid "When RAID is implemented by hardware, its configuration generally happens within the BIOS setup tool, and the kernel will consider a RAID array as a single disk, which will work as a standard physical disk, although the device name may be different (depending on the driver)."
msgstr ""
msgid "We only focus on software RAID in this book."
msgstr ""
msgid "Different RAID Levels"
msgstr ""
msgid "RAID is actually not a single system, but a range of systems identified by their levels; the levels differ by their layout and the amount of redundancy they provide. The more redundant, the more failure-proof, since the system will be able to keep working with more failed disks. The counterpart is that the usable space shrinks for a given set of disks; seen the other way, more disks will be needed to store a given amount of data."
msgstr ""
msgid "Linear RAID"
msgstr ""
msgid "Even though the kernel's RAID subsystem allows creating “linear RAID”, this is not proper RAID, since this setup doesn't involve any redundancy. The kernel merely aggregates several disks end-to-end and provides the resulting aggregated volume as one virtual disk (one block device). That's about its only function. This setup is rarely used by itself (see later for the exceptions), especially since the lack of redundancy means that one disk failing makes the whole aggregate, and therefore all the data, unavailable."
msgstr ""
msgid "RAID-0"
msgstr ""
msgid "This level doesn't provide any redundancy either, but disks aren't simply concatenated end to end: they are divided into <emphasis>stripes</emphasis>, and the blocks on the virtual device are stored on stripes on alternating physical disks. In a two-disk RAID-0 setup, for instance, even-numbered blocks of the virtual device will be stored on the first physical disk, while odd-numbered blocks will end up on the second physical disk."
msgstr ""
msgid "This system doesn't aim at increasing reliability, since (as in the linear case) the availability of all the data is jeopardized as soon as one disk fails, but at increasing performance: during sequential access to large amounts of contiguous data, the kernel will be able to read from both disks (or write to them) in parallel, which increases the data transfer rate. However, RAID-0 use is shrinking, its niche being filled by LVM (see later)."
msgstr ""
msgid "RAID-1"
msgstr ""
msgid "This level, also known as “RAID mirroring”, is both the simplest and the most widely used setup. In its standard form, it uses two physical disks of the same size, and provides a logical volume of the same size again. Data are stored identically on both disks, hence the “mirror” nickname. When one disk fails, the data is still available on the other. For really critical data, RAID-1 can of course be set up on more than two disks, with a direct impact on the ratio of hardware cost versus available payload space."
msgstr ""
msgid "<emphasis>NOTE</emphasis> Disks and cluster sizes"
msgstr ""
msgid "If two disks of different sizes are set up in a mirror, the bigger one will not be fully used, since it will contain the same data as the smaller one and nothing more. The useful available space provided by a RAID-1 volume therefore matches the size of the smallest disk in the array. This still holds for RAID volumes with a higher RAID level, even though redundancy is stored differently."
msgstr ""
msgid "It is therefore important, when setting up RAID arrays (except for RAID-0 and “linear RAID”), to only assemble disks of identical, or very close, sizes, to avoid wasting resources."
msgstr ""
msgid "<emphasis>NOTE</emphasis> Spare disks"
msgstr ""
msgid "RAID levels that include redundancy allow assigning more disks than required to an array. The extra disks are used as spares when one of the main disks fails. For instance, in a mirror of two disks plus one spare, if one of the first two disks fails, the kernel will automatically (and immediately) reconstruct the mirror using the spare disk, so that redundancy stays assured after the reconstruction time. This can be used as another kind of safeguard for critical data."
msgstr ""
msgid "One would be forgiven for wondering how this is better than simply mirroring on three disks to start with. The advantage of the “spare disk” configuration is that the spare disk can be shared across several RAID volumes. For instance, one can have three mirrored volumes, with redundancy assured even in the event of one disk failure, with only seven disks (three pairs, plus one shared spare), instead of the nine disks that would be required by three triplets."
msgstr ""
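msgid "As a sketch (with hypothetical device names), a two-disk mirror with one spare can be created in a single <command>mdadm</command> invocation thanks to the <literal>--spare-devices</literal> option:"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>mdadm --create /dev/md2 --level=1 --raid-devices=2 --spare-devices=1 /dev/sdg /dev/sdh /dev/sdi</userinput>"
msgstr ""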
msgid "This RAID level, although expensive (since only half of the physical storage space, at best, is useful), is widely used in practice. It is simple to understand, and it allows very simple backups: since both disks have identical contents, one of them can be temporarily extracted with no impact on the working system. Read performance is often increased since the kernel can read half of the data on each disk in parallel, while write performance isn't too severely degraded. In case of a RAID-1 array of N disks, the data stays available even with N-1 disk failures."
msgstr ""
msgid "RAID-4"
msgstr ""
msgid "This RAID level, not widely used, uses N disks to store useful data, and an extra disk to store redundancy information. If that disk fails, the system can reconstruct its contents from the other N. If one of the N data disks fails, the remaining N-1 combined with the “parity” disk contain enough information to reconstruct the required data."
msgstr ""
msgid "RAID-4 isn't too expensive since it only involves a one-in-N increase in costs and has no noticeable impact on read performance, but writes are slowed down. Furthermore, since a write to any of the N disks also involves a write to the parity disk, the latter sees many more writes than the former, and its lifespan can shorten dramatically as a consequence. Data on a RAID-4 array is safe only up to one failed disk (of the N+1)."
msgstr ""
msgid "RAID-5"
msgstr ""
msgid "RAID-5 addresses the asymmetry issue of RAID-4: parity blocks are spread over all of the N+1 disks, with no single disk having a particular role."
msgstr ""
msgid "Read and write performance are identical to RAID-4. Here again, the system stays functional with up to one failed disk (of the N+1), but no more."
msgstr ""
msgid "RAID-6"
msgstr ""
msgid "RAID-6 can be considered an extension of RAID-5, where each series of N blocks involves two redundancy blocks, and each such series of N+2 blocks is spread over N+2 disks."
msgstr ""
msgid "This RAID level is slightly more expensive than the previous two, but it brings some extra safety since up to two drives (of the N+2) can fail without compromising data availability. The counterpart is that write operations now involve writing one data block and two redundancy blocks, which makes them even slower."
msgstr ""
msgid "RAID-1+0"
msgstr ""
msgid "This isn't, strictly speaking, a RAID level, but a stacking of two RAID groupings. Starting from 2×N disks, one first sets them up by pairs into N RAID-1 volumes; these N volumes are then aggregated into one, either by “linear RAID” or (increasingly) by LVM. This last case goes farther than pure RAID, but there's no problem with that."
msgstr ""
msgid "RAID-1+0 can survive multiple disk failures: up to N in the 2×N array described above, provided that at least one disk keeps working in each of the RAID-1 pairs."
msgstr ""
msgid "<emphasis>GOING FURTHER</emphasis> RAID-10"
msgstr ""
msgid "RAID-10 is generally considered a synonym of RAID-1+0, but the Linux implementation actually makes it a generalization. This setup allows a system where each block is stored on two different disks, even with an odd number of disks, the copies being spread out according to a configurable model."
msgstr ""
msgid "Performance will vary depending on the chosen layout model and redundancy level, and on the workload of the logical volume."
msgstr ""
msgid "Obviously, the RAID level will be chosen according to the constraints and requirements of each application. Note that a single computer can have several distinct RAID arrays with different configurations."
msgstr ""
msgid "Setting up RAID"
msgstr ""
msgid "<primary><emphasis role=\"pkg\">mdadm</emphasis></primary>"
msgstr ""
msgid "Setting up RAID volumes requires the <emphasis role=\"pkg\">mdadm</emphasis> package; it provides the <command>mdadm</command> command, which allows creating and manipulating RAID arrays, as well as scripts and tools integrating it into the rest of the system, including the monitoring system."
msgstr ""
msgid "Our example will be a server with a number of disks, some of which are already used, the rest being available to set up RAID. We initially have the following disks and partitions:"
msgstr ""
msgid "the <filename>sdb</filename> disk, 4 GB, is entirely available;"
msgstr ""
msgid "the <filename>sdc</filename> disk, 4 GB, is also entirely available;"
msgstr ""
msgid "on the <filename>sdd</filename> disk, only partition <filename>sdd2</filename> (about 4 GB) is available;"
msgstr ""
msgid "finally, a <filename>sde</filename> disk, also 4 GB, entirely available."
msgstr ""
msgid "<emphasis>NOTE</emphasis> Identifying existing RAID volumes"
msgstr ""
msgid "The <filename>/proc/mdstat</filename> file lists existing volumes and their states. When creating a new RAID volume, care should be taken not to name it the same as an existing volume."
msgstr ""
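msgid "For instance, a quick look at the existing arrays before picking a name (the output obviously depends on the machine):"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>cat /proc/mdstat</userinput>\n"
"<computeroutput># </computeroutput><userinput>mdadm --detail --scan</userinput>"
msgstr ""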
msgid "We're going to use these physical elements to build two volumes, one RAID-0 and one mirror (RAID-1). Let's start with the RAID-0 volume:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc</userinput>\n"
"<computeroutput>mdadm: Defaulting to version 1.2 metadata\n"
"mdadm: array /dev/md0 started.\n"
"# </computeroutput><userinput>mdadm --query /dev/md0</userinput>\n"
"<computeroutput>/dev/md0: 8.00GiB raid0 2 devices, 0 spares. Use mdadm --detail for more detail.\n"
"# </computeroutput><userinput>mdadm --detail /dev/md0</userinput>\n"
"<computeroutput>/dev/md0:\n"
" Version : 1.2\n"
" Creation Time : Wed May 6 09:24:34 2015\n"
" Raid Level : raid0\n"
" Array Size : 8387584 (8.00 GiB 8.59 GB)\n"
" Raid Devices : 2\n"
" Total Devices : 2\n"
" Persistence : Superblock is persistent\n"
"\n"
" Update Time : Wed May 6 09:24:34 2015\n"
" State : clean \n"
" Active Devices : 2\n"
"Working Devices : 2\n"
" Failed Devices : 0\n"
" Spare Devices : 0\n"
"\n"
" Chunk Size : 512K\n"
"\n"
" Name : mirwiz:0 (local to host mirwiz)\n"
" UUID : bb085b35:28e821bd:20d697c9:650152bb\n"
" Events : 0\n"
"\n"
" Number Major Minor RaidDevice State\n"
" 0 8 16 0 active sync /dev/sdb\n"
" 1 8 32 1 active sync /dev/sdc\n"
"# </computeroutput><userinput>mkfs.ext4 /dev/md0</userinput>\n"
"<computeroutput>mke2fs 1.42.12 (29-Aug-2014)\n"
"Creating filesystem with 2095104 4k blocks and 524288 inodes\n"
"Filesystem UUID: fff08295-bede-41a9-9c6a-8c7580e520a6\n"
"Superblock backups stored on blocks: \n"
" 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632\n"
"\n"
"Allocating group tables: done \n"
"Writing inode tables: done \n"
"Creating journal (32768 blocks): done\n"
"Writing superblocks and filesystem accounting information: done \n"
"# </computeroutput><userinput>mkdir /srv/raid-0</userinput>\n"
"<computeroutput># </computeroutput><userinput>mount /dev/md0 /srv/raid-0</userinput>\n"
"<computeroutput># </computeroutput><userinput>df -h /srv/raid-0</userinput>\n"
"<computeroutput>Filesystem Size Used Avail Use% Mounted on\n"
"/dev/md0 7.9G 18M 7.4G 1% /srv/raid-0\n"
"</computeroutput>"
msgstr ""
msgid "The <command>mdadm --create</command> command requires several parameters: the name of the volume to create (<filename>/dev/md*</filename>, with MD standing for <foreignphrase>Multiple Device</foreignphrase>), the RAID level, the number of disks (which is compulsory despite being mostly meaningful only with RAID-1 and above), and the physical drives to use. Once the device is created, we can use it like we'd use a normal partition, create a filesystem on it, mount that filesystem, and so on. Note that our creation of a RAID-0 volume on <filename>md0</filename> is nothing but coincidence, and the numbering of the array doesn't need to be correlated to the chosen amount of redundancy. It's also possible to create named RAID arrays, by giving <command>mdadm</command> parameters such as <filename>/dev/md/linear</filename> instead of <filename>/dev/md0</filename>."
msgstr ""
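msgid "As a sketch (the array name and devices are hypothetical), creating such a named array only changes the first parameter:"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>mdadm --create /dev/md/backup --level=1 --raid-devices=2 /dev/sdg /dev/sdh</userinput>"
msgstr ""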
msgid "Creating a RAID-1 volume proceeds in a similar fashion, the differences only becoming noticeable after the creation:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdd2 /dev/sde</userinput>\n"
"<computeroutput>mdadm: Note: this array has metadata at the start and\n"
" may not be suitable as a boot device. If you plan to\n"
" store '/boot' on this device please ensure that\n"
" your boot-loader understands md/v1.x metadata, or use\n"
" --metadata=0.90\n"
"mdadm: largest drive (/dev/sdd2) exceeds size (4192192K) by more than 1%\n"
"Continue creating array? </computeroutput><userinput>y</userinput>\n"
"<computeroutput>mdadm: Defaulting to version 1.2 metadata\n"
"mdadm: array /dev/md1 started.\n"
"# </computeroutput><userinput>mdadm --query /dev/md1</userinput>\n"
"<computeroutput>/dev/md1: 4.00GiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.\n"
"# </computeroutput><userinput>mdadm --detail /dev/md1</userinput>\n"
"<computeroutput>/dev/md1:\n"
" Version : 1.2\n"
" Creation Time : Wed May 6 09:30:19 2015\n"
" Raid Level : raid1\n"
" Array Size : 4192192 (4.00 GiB 4.29 GB)\n"
" Used Dev Size : 4192192 (4.00 GiB 4.29 GB)\n"
" Raid Devices : 2\n"
" Total Devices : 2\n"
" Persistence : Superblock is persistent\n"
"\n"
" Update Time : Wed May 6 09:30:40 2015\n"
" State : clean, resyncing (PENDING) \n"
" Active Devices : 2\n"
"Working Devices : 2\n"
" Failed Devices : 0\n"
" Spare Devices : 0\n"
"\n"
" Name : mirwiz:1 (local to host mirwiz)\n"
" UUID : 6ec558ca:0c2c04a0:19bca283:95f67464\n"
" Events : 0\n"
"\n"
" Number Major Minor RaidDevice State\n"
" 0 8 50 0 active sync /dev/sdd2\n"
" 1 8 64 1 active sync /dev/sde\n"
"# </computeroutput><userinput>mdadm --detail /dev/md1</userinput>\n"
"<computeroutput>/dev/md1:\n"
"[...]\n"
" State : clean\n"
"[...]\n"
"</computeroutput>"
msgstr ""
msgid "<emphasis>TIP</emphasis> RAID, disks and partitions"
msgstr ""
msgid "As illustrated by our example, RAID devices can be constructed out of disk partitions, and do not require full disks."
msgstr ""
msgid "A few remarks are in order. First, <command>mdadm</command> notices that the physical elements have different sizes; since this implies that some space will be lost on the bigger element, a confirmation is required."
msgstr ""
msgid "More importantly, note the state of the mirror. The normal state of a RAID mirror is that both disks have exactly the same contents. However, nothing guarantees this is the case when the volume is first created. The RAID subsystem will therefore provide that guarantee itself, and there will be a synchronization phase as soon as the RAID device is created. After some time (the exact amount will depend on the actual size of the disks…), the RAID array switches to the “active” or “clean” state. Note that during this reconstruction phase, the mirror is in a degraded mode, and redundancy isn't assured. A disk failing during that risk window could lead to losing all the data. Large amounts of critical data, however, are rarely stored on a freshly created RAID array before its initial synchronization. Note that even in degraded mode, the <filename>/dev/md1</filename> device is usable: a filesystem can be created on it, and data can be copied onto it."
msgstr ""
msgid "<emphasis>TIP</emphasis> Starting a mirror in degraded mode"
msgstr ""
msgid "Sometimes two disks are not immediately available when one wants to start a RAID-1 mirror, for instance because one of the disks one plans to include is already used to store the data one wants to move to the array. In such circumstances, it is possible to deliberately create a degraded RAID-1 array by passing <filename>missing</filename> instead of a device file as one of the arguments to <command>mdadm</command>. Once the data have been copied to the “mirror”, the old disk can be added to the array. A synchronization will then take place, giving us the redundancy that was wanted in the first place."
msgstr ""
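msgid "A minimal sketch of this scenario, with hypothetical device names (the bracketed line stands for whatever method is used to copy the data):"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdg missing</userinput>\n"
"<computeroutput># </computeroutput><userinput>[copy the data onto /dev/md2]</userinput>\n"
"<computeroutput># </computeroutput><userinput>mdadm /dev/md2 --add /dev/sdh</userinput>"
msgstr ""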
msgid "<emphasis>TIP</emphasis> Setting up a mirror without synchronization"
msgstr ""
msgid "RAID-1 volumes are often created to be used as a new disk, which can be considered blank. The actual initial contents of the disk are therefore not very relevant, since one only needs to know that the data written after the creation of the volume, in particular the filesystem, can be accessed later."
msgstr ""
msgid "One might therefore wonder about the point of synchronizing both disks at creation time. Why care whether the contents are identical on zones of the volume that we know will only be read after we have written to them?"
msgstr ""
msgid "Fortunately, this synchronization phase can be avoided by passing the <literal>--assume-clean</literal> option to <command>mdadm</command>. However, this option can lead to surprises in cases where the initial data will be read (for instance if a filesystem is already present on the physical disks), which is why it isn't enabled by default."
msgstr ""
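msgid "For reference, a sketch with hypothetical device names; this should only be used on drives known not to hold any data worth reading:"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>mdadm --create /dev/md2 --level=1 --raid-devices=2 --assume-clean /dev/sdg /dev/sdh</userinput>"
msgstr ""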
msgid "Now let's see what happens when one of the elements of the RAID-1 array fails. <command>mdadm</command>, in particular its <literal>--fail</literal> option, allows simulating such a disk failure:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm /dev/md1 --fail /dev/sde</userinput>\n"
"<computeroutput>mdadm: set /dev/sde faulty in /dev/md1\n"
"# </computeroutput><userinput>mdadm --detail /dev/md1</userinput>\n"
"<computeroutput>/dev/md1:\n"
"[...]\n"
" Update Time : Wed May 6 09:39:39 2015\n"
" State : clean, degraded \n"
" Active Devices : 1\n"
"Working Devices : 1\n"
" Failed Devices : 1\n"
" Spare Devices : 0\n"
"\n"
" Name : mirwiz:1 (local to host mirwiz)\n"
" UUID : 6ec558ca:0c2c04a0:19bca283:95f67464\n"
" Events : 19\n"
"\n"
" Number Major Minor RaidDevice State\n"
" 0 8 50 0 active sync /dev/sdd2\n"
" 2 0 0 2 removed\n"
"\n"
" 1 8 64 - faulty /dev/sde</computeroutput>"
msgstr ""
msgid "The contents of the volume are still accessible (and, if it is mounted, the applications don't notice a thing), but the data safety isn't assured anymore: should the <filename>sdd</filename> disk fail in turn, the data would be lost. We want to avoid that risk, so we'll replace the failed disk with a new one, <filename>sdf</filename>:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm /dev/md1 --add /dev/sdf</userinput>\n"
"<computeroutput>mdadm: added /dev/sdf\n"
"# </computeroutput><userinput>mdadm --detail /dev/md1</userinput>\n"
"<computeroutput>/dev/md1:\n"
"[...]\n"
" Raid Devices : 2\n"
" Total Devices : 3\n"
" Persistence : Superblock is persistent\n"
"\n"
" Update Time : Wed May 6 09:48:49 2015\n"
" State : clean, degraded, recovering \n"
" Active Devices : 1\n"
"Working Devices : 2\n"
" Failed Devices : 1\n"
" Spare Devices : 1\n"
"\n"
" Rebuild Status : 28% complete\n"
"\n"
" Name : mirwiz:1 (local to host mirwiz)\n"
" UUID : 6ec558ca:0c2c04a0:19bca283:95f67464\n"
" Events : 26\n"
"\n"
" Number Major Minor RaidDevice State\n"
" 0 8 50 0 active sync /dev/sdd2\n"
" 2 8 80 1 spare rebuilding /dev/sdf\n"
"\n"
" 1 8 64 - faulty /dev/sde\n"
"# </computeroutput><userinput>[...]</userinput>\n"
"<computeroutput>[...]\n"
"# </computeroutput><userinput>mdadm --detail /dev/md1</userinput>\n"
"<computeroutput>/dev/md1:\n"
"[...]\n"
" Update Time : Wed May 6 09:49:08 2015\n"
" State : clean \n"
" Active Devices : 2\n"
"Working Devices : 2\n"
" Failed Devices : 1\n"
" Spare Devices : 0\n"
"\n"
" Name : mirwiz:1 (local to host mirwiz)\n"
" UUID : 6ec558ca:0c2c04a0:19bca283:95f67464\n"
" Events : 41\n"
"\n"
" Number Major Minor RaidDevice State\n"
" 0 8 50 0 active sync /dev/sdd2\n"
" 2 8 80 1 active sync /dev/sdf\n"
"\n"
" 1 8 64 - faulty /dev/sde</computeroutput>"
msgstr ""
msgid "Here again, the kernel automatically triggers a reconstruction phase during which the volume, although still accessible, is in a degraded mode. Once the reconstruction is over, the RAID array is back to a normal state. One can then tell the system that the <filename>sde</filename> disk is about to be removed from the array, so as to end up with a classical RAID mirror on two disks:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm /dev/md1 --remove /dev/sde</userinput>\n"
"<computeroutput>mdadm: hot removed /dev/sde from /dev/md1\n"
"# </computeroutput><userinput>mdadm --detail /dev/md1</userinput>\n"
"<computeroutput>/dev/md1:\n"
"[...]\n"
" Number Major Minor RaidDevice State\n"
" 0 8 50 0 active sync /dev/sdd2\n"
" 2 8 80 1 active sync /dev/sdf</computeroutput>"
msgstr ""
msgid "From then on, the drive can be physically removed when the server is next switched off, or even hot-removed when the hardware configuration allows hot-swap. Such configurations include some SCSI controllers, most SATA disks, and external drives operating on USB or Firewire."
msgstr ""
msgid "Backing up the Configuration"
msgstr ""
msgid "Most of the meta-data concerning RAID volumes are saved directly on the disks that make up these arrays, so that the kernel can detect the arrays and their components and assemble them automatically when the system starts up. However, backing up this configuration is encouraged, because this detection isn't fail-proof, and it is to be expected that it will fail precisely in sensitive circumstances. In our example, if the <filename>sde</filename> disk failure had been real (instead of simulated) and the system had been restarted without removing this <filename>sde</filename> disk, this disk could start working again after having been probed during the reboot. The kernel would then have three physical elements, each claiming to contain half of the same RAID volume. Another source of confusion can come when RAID volumes from two servers are consolidated onto one server only. If these arrays were running normally before the disks were moved, the kernel would be able to detect and reassemble the pairs properly; but if the moved disks had been aggregated into an <filename>md1</filename> on the old server, and the new server already has an <filename>md1</filename>, one of the mirrors would be renamed."
msgstr ""
msgid "Backing up the configuration is therefore important, if only for reference. The standard way to do it is by editing the <filename>/etc/mdadm/mdadm.conf</filename> file, an example of which is listed here:"
msgstr ""
msgid "<command>mdadm</command> configuration file"
msgstr ""
msgid ""
"# mdadm.conf\n"
"#\n"
"# Please refer to mdadm.conf(5) for information about this file.\n"
"#\n"
"\n"
"# by default (built-in), scan all partitions (/proc/partitions) and all\n"
"# containers for MD superblocks. alternatively, specify devices to scan, using\n"
"# wildcards if desired.\n"
"DEVICE /dev/sd*\n"
"\n"
"# auto-create devices with Debian standard permissions\n"
"CREATE owner=root group=disk mode=0660 auto=yes\n"
"\n"
"# automatically tag new arrays as belonging to the local system\n"
"HOMEHOST <system>\n"
"\n"
"# instruct the monitoring daemon where to send mail alerts\n"
"MAILADDR root\n"
"\n"
"# definitions of existing MD arrays\n"
"ARRAY /dev/md0 metadata=1.2 name=mirwiz:0 UUID=bb085b35:28e821bd:20d697c9:650152bb\n"
"ARRAY /dev/md1 metadata=1.2 name=mirwiz:1 UUID=6ec558ca:0c2c04a0:19bca283:95f67464\n"
"\n"
"# This configuration was auto-generated on Thu, 17 Jan 2013 16:21:01 +0100\n"
"# by mkconf 3.2.5-3"
msgstr ""
msgid "One of the most useful details is the <literal>DEVICE</literal> option, which lists the devices where the system will automatically look for components of RAID volumes at start-up time. In our example, we replaced the default value, <literal>partitions containers</literal>, with an explicit list of device files, since, for some volumes, we chose to use entire disks and not only partitions."
msgstr ""
msgid "The last two lines in our example are those allowing the kernel to safely pick which volume number to assign to which array. The metadata stored on the disks themselves are enough to re-assemble the volumes, but not to determine the volume number (and the matching <filename>/dev/md*</filename> device name)."
msgstr ""
msgid "Fortunately, these lines can be generated automatically:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mdadm --misc --detail --brief /dev/md?</userinput>\n"
"<computeroutput>ARRAY /dev/md0 metadata=1.2 name=mirwiz:0 UUID=bb085b35:28e821bd:20d697c9:650152bb\n"
"ARRAY /dev/md1 metadata=1.2 name=mirwiz:1 UUID=6ec558ca:0c2c04a0:19bca283:95f67464</computeroutput>"
msgstr ""
msgid "The contents of these last two lines don't depend on the list of disks included in the volume. It is therefore not necessary to regenerate these lines when replacing a failed disk with a new one. On the other hand, care must be taken to update the file when creating or deleting a RAID array."
msgstr ""
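msgid "One possible way to do so (a sketch, not the only method) is to remove the stale <literal>ARRAY</literal> lines from the file, then append freshly generated ones:"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>mdadm --misc --detail --brief /dev/md? >> /etc/mdadm/mdadm.conf</userinput>"
msgstr ""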
msgid "<primary>LVM</primary>"
msgstr ""
msgid "<primary>Logical Volume Manager</primary>"
msgstr ""
msgid "LVM, the <emphasis>Logical Volume Manager</emphasis>, is another approach to abstracting logical volumes from their physical supports, which focuses on increasing flexibility rather than increasing reliability. LVM allows changing a logical volume transparently as far as the applications are concerned; for instance, it is possible to add new disks, migrate the data to them, and remove the old disks, without unmounting the volume."
msgstr ""
msgid "LVM Concepts"
msgstr ""
msgid "This flexibility is attained by a level of abstraction involving three concepts."
msgstr ""
msgid "First, the PV (<emphasis>Physical Volume</emphasis>) is the entity closest to the hardware: it can be partitions on a disk, or a full disk, or even any other block device (including, for instance, a RAID array). Note that when a physical element is set up to be a PV for LVM, it should only be accessed via LVM, otherwise the system will get confused."
msgstr ""
msgid "A number of PVs can be clustered in a VG (<emphasis>Volume Group</emphasis>), which can be compared to virtual, extensible disks. VGs are abstract, and don't appear as a device file in the <filename>/dev</filename> hierarchy, so there's no risk of using them directly."
msgstr ""
msgid "The third kind of object is the LV (<emphasis>Logical Volume</emphasis>), which is a chunk of a VG; if we keep the VG-as-disk analogy, the LV compares to a partition. The LV appears as a block device with an entry in <filename>/dev</filename>, and it can be used as any other physical partition can be (most commonly, to host a filesystem or swap space)."
msgstr ""
msgid "The important thing is that the splitting of a VG into LVs is entirely independent of its physical components (the PVs). A VG with only a single physical component (a disk for instance) can be split into a dozen logical volumes; similarly, a VG can use several physical disks and appear as a single large logical volume. The only constraint, obviously, is that the total size allocated to LVs can't be bigger than the total capacity of the PVs in the volume group."
msgstr ""
msgid "It often makes sense, however, to have some kind of homogeneity among the physical components of a VG, and to split the VG into logical volumes that will have similar usage patterns. For instance, if the available hardware includes fast disks and slower disks, the fast ones could be clustered into one VG and the slower ones into another; chunks of the first one can then be assigned to applications requiring fast data access, while the second one will be kept for less demanding tasks."
msgstr ""
msgid "In any case, keep in mind that an LV isn't particularly attached to any one PV. It is possible to influence where the data from an LV are physically stored, but this possibility isn't required for day-to-day use. On the contrary: when the set of physical components of a VG evolves, the physical storage locations corresponding to a particular LV can be migrated across disks (while staying within the PVs assigned to the VG, of course)."
msgstr ""
msgid "Setting up LVM"
msgstr ""
msgid "Let us now follow, step by step, the process of setting up LVM for a typical use case: we want to simplify a complex storage situation. Such a situation usually happens after some long and convoluted history of accumulated temporary measures. For the purposes of illustration, we'll consider a server where the storage needs have changed over time, ending up in a maze of available partitions split over several partially used disks. In more concrete terms, the following partitions are available:"
msgstr ""
msgid "on the <filename>sdb</filename> disk, a <filename>sdb2</filename> partition, 4 GB;"
msgstr ""
msgid "on the <filename>sdc</filename> disk, a <filename>sdc3</filename> partition, 3 GB;"
msgstr ""
msgid "the <filename>sdd</filename> disk, 4 GB, is fully available;"
msgstr ""
msgid "on the <filename>sdf</filename> disk, a <filename>sdf1</filename> partition, 4 GB; and a <filename>sdf2</filename> partition, 5 GB."
msgstr ""
msgid "In addition, let's assume that disks <filename>sdb</filename> and <filename>sdf</filename> are faster than the other two."
msgstr ""
msgid "Our goal is to set up three logical volumes for three different applications: a file server requiring 5 GB of storage space, a database (1 GB) and some space for back-ups (12 GB). The first two need good performance, but back-ups are less critical in terms of access speed. All these constraints prevent the use of partitions on their own; using LVM can abstract the physical size of the devices, so the only limit is the total available space."
msgstr ""
msgid "The required tools are in the <emphasis role=\"pkg\">lvm2</emphasis> package and its dependencies. When they're installed, setting up LVM takes three steps, matching the three levels of concepts."
msgstr ""
msgid "First, we prepare the physical volumes using <command>pvcreate</command>:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>pvdisplay</userinput>\n"
"<computeroutput># </computeroutput><userinput>pvcreate /dev/sdb2</userinput>\n"
"<computeroutput> Physical volume \"/dev/sdb2\" successfully created\n"
"# </computeroutput><userinput>pvdisplay</userinput>\n"
"<computeroutput> \"/dev/sdb2\" is a new physical volume of \"4.00 GiB\"\n"
" --- NEW Physical volume ---\n"
" PV Name /dev/sdb2\n"
" VG Name \n"
" PV Size 4.00 GiB\n"
" Allocatable NO\n"
" PE Size 0 \n"
" Total PE 0\n"
" Free PE 0\n"
" Allocated PE 0\n"
" PV UUID 0zuiQQ-j1Oe-P593-4tsN-9FGy-TY0d-Quz31I\n"
"\n"
"# </computeroutput><userinput>for i in sdc3 sdd sdf1 sdf2 ; do pvcreate /dev/$i ; done</userinput>\n"
"<computeroutput> Physical volume \"/dev/sdc3\" successfully created\n"
" Physical volume \"/dev/sdd\" successfully created\n"
" Physical volume \"/dev/sdf1\" successfully created\n"
" Physical volume \"/dev/sdf2\" successfully created\n"
"# </computeroutput><userinput>pvdisplay -C</userinput>\n"
"<computeroutput> PV VG Fmt Attr PSize PFree\n"
" /dev/sdb2 lvm2 --- 4.00g 4.00g\n"
" /dev/sdc3 lvm2 --- 3.09g 3.09g\n"
" /dev/sdd lvm2 --- 4.00g 4.00g\n"
" /dev/sdf1 lvm2 --- 4.10g 4.10g\n"
" /dev/sdf2 lvm2 --- 5.22g 5.22g\n"
"</computeroutput>"
msgstr ""
msgid "So far, so good; note that a PV can be set up on a full disk as well as on individual partitions of it. As shown above, the <command>pvdisplay</command> command lists the existing PVs, with two possible output formats."
msgstr ""
msgid "Now let's assemble these physical elements into VGs using <command>vgcreate</command>. We'll gather only PVs from the fast disks into a <filename>vg_critical</filename> VG; the other VG, <filename>vg_normal</filename>, will also include slower elements."
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>vgdisplay</userinput>\n"
"<computeroutput> No volume groups found\n"
"# </computeroutput><userinput>vgcreate vg_critical /dev/sdb2 /dev/sdf1</userinput>\n"
"<computeroutput> Volume group \"vg_critical\" successfully created\n"
"# </computeroutput><userinput>vgdisplay</userinput>\n"
"<computeroutput> --- Volume group ---\n"
" VG Name vg_critical\n"
" System ID \n"
" Format lvm2\n"
" Metadata Areas 2\n"
" Metadata Sequence No 1\n"
" VG Access read/write\n"
" VG Status resizable\n"
" MAX LV 0\n"
" Cur LV 0\n"
" Open LV 0\n"
" Max PV 0\n"
" Cur PV 2\n"
" Act PV 2\n"
" VG Size 8.09 GiB\n"
" PE Size 4.00 MiB\n"
" Total PE 2071\n"
" Alloc PE / Size 0 / 0 \n"
" Free PE / Size 2071 / 8.09 GiB\n"
" VG UUID bpq7zO-PzPD-R7HW-V8eN-c10c-S32h-f6rKqp\n"
"\n"
"# </computeroutput><userinput>vgcreate vg_normal /dev/sdc3 /dev/sdd /dev/sdf2</userinput>\n"
"<computeroutput> Volume group \"vg_normal\" successfully created\n"
"# </computeroutput><userinput>vgdisplay -C</userinput>\n"
"<computeroutput> VG #PV #LV #SN Attr VSize VFree \n"
" vg_critical 2 0 0 wz--n- 8.09g 8.09g\n"
" vg_normal 3 0 0 wz--n- 12.30g 12.30g\n"
"</computeroutput>"
msgstr ""
msgid "Here again, commands are rather straightforward (and <command>vgdisplay</command> proposes two output formats). Note that it is quite possible to put two partitions of the same physical disk into two different VGs. Note also that we used a <filename>vg_</filename> prefix to name our VGs, but it is nothing more than a convention."
msgstr ""
msgid "We now have two “virtual disks”, sized about 8 GB and 12 GB, respectively. Let's now carve them up into “virtual partitions” (LVs). This involves the <command>lvcreate</command> command, and a slightly more complex syntax:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>lvdisplay</userinput>\n"
"<computeroutput># </computeroutput><userinput>lvcreate -n lv_files -L 5G vg_critical</userinput>\n"
"<computeroutput> Logical volume \"lv_files\" created\n"
"# </computeroutput><userinput>lvdisplay</userinput>\n"
"<computeroutput> --- Logical volume ---\n"
" LV Path /dev/vg_critical/lv_files\n"
" LV Name lv_files\n"
" VG Name vg_critical\n"
" LV UUID J3V0oE-cBYO-KyDe-5e0m-3f70-nv0S-kCWbpT\n"
" LV Write Access read/write\n"
" LV Creation host, time mirwiz, 2015-06-10 06:10:50 -0400\n"
" LV Status available\n"
" # open 0\n"
" LV Size 5.00 GiB\n"
" Current LE 1280\n"
" Segments 2\n"
" Allocation inherit\n"
" Read ahead sectors auto\n"
" - currently set to 256\n"
" Block device 253:0\n"
"\n"
"# </computeroutput><userinput>lvcreate -n lv_base -L 1G vg_critical</userinput>\n"
"<computeroutput> Logical volume \"lv_base\" created\n"
"# </computeroutput><userinput>lvcreate -n lv_backups -L 12G vg_normal</userinput>\n"
"<computeroutput> Logical volume \"lv_backups\" created\n"
"# </computeroutput><userinput>lvdisplay -C</userinput>\n"
"<computeroutput> LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert\n"
" lv_base vg_critical -wi-a--- 1.00g \n"
" lv_files vg_critical -wi-a--- 5.00g \n"
" lv_backups vg_normal -wi-a--- 12.00g</computeroutput>"
msgstr ""
msgid "Two parameters are required when creating logical volumes; they must be passed to the <command>lvcreate</command> command as options. The name of the LV to be created is specified with the <literal>-n</literal> option, and its size is generally given using the <literal>-L</literal> option. We also need to tell the command what VG to operate on, of course, hence the last parameter on the command line."
msgstr ""
msgid "<emphasis>GOING FURTHER</emphasis> <command>lvcreate</command> options"
msgstr ""
msgid "The <command>lvcreate</command> command has several options to allow tweaking how the LV is created."
msgstr ""
msgid "Let's first describe the <literal>-l</literal> option, with which the LV's size can be given as a number of blocks (as opposed to the “human” units we used above). These blocks (called PEs, <emphasis>physical extents</emphasis>, in LVM terms) are contiguous units of storage space in PVs, and they can't be split across LVs. When one wants to define storage space for an LV with some precision, for instance to use the full available space, the <literal>-l</literal> option will probably be preferred over <literal>-L</literal>."
msgstr ""
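msgid "As a sketch (with a hypothetical volume name), <literal>-l</literal> also accepts percentages, which makes it easy to allocate all the remaining space of a VG:"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>lvcreate -n lv_scratch -l 100%FREE vg_normal</userinput>"
msgstr ""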
msgid "It's also possible to hint at the physical location of an LV, so that its extents are stored on a particular PV (while staying within the ones assigned to the VG, of course). Since we know that <filename>sdb</filename> is faster than <filename>sdf</filename>, we may want to store the <filename>lv_base</filename> there if we want to give an advantage to the database server compared to the file server. The command line becomes: <command>lvcreate -n lv_base -L 1G vg_critical /dev/sdb2</command>. Note that this command can fail if the PV doesn't have enough free extents. In our example, we would probably have to create <filename>lv_base</filename> before <filename>lv_files</filename> to avoid this situation – or free up some space on <filename>sdb2</filename> with the <command>pvmove</command> command."
msgstr ""
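msgid "As for the latter escape hatch, here is a minimal sketch: <command>pvmove</command> relocates the extents already allocated on one PV onto another PV of the same VG (here, off <filename>sdb2</filename> and onto <filename>sdf1</filename>):"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>pvmove /dev/sdb2 /dev/sdf1</userinput>"
msgstr ""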
msgid "Logical volumes, once created, end up as block device files in <filename>/dev/mapper/</filename>:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>ls -l /dev/mapper</userinput>\n"
"<computeroutput>total 0\n"
"crw------- 1 root root 10, 236 Jun 10 16:52 control\n"
"lrwxrwxrwx 1 root root 7 Jun 10 17:05 vg_critical-lv_base -> ../dm-1\n"
"lrwxrwxrwx 1 root root 7 Jun 10 17:05 vg_critical-lv_files -> ../dm-0\n"
"lrwxrwxrwx 1 root root 7 Jun 10 17:05 vg_normal-lv_backups -> ../dm-2\n"
"# </computeroutput><userinput>ls -l /dev/dm-*</userinput>\n"
"<computeroutput>brw-rw---T 1 root disk 253, 0 Jun 10 17:05 /dev/dm-0\n"
"brw-rw---- 1 root disk 253, 1 Jun 10 17:05 /dev/dm-1\n"
"brw-rw---- 1 root disk 253, 2 Jun 10 17:05 /dev/dm-2\n"
"</computeroutput>"
msgstr ""
msgid "<emphasis>NOTE</emphasis> Autodetecting LVM volumes"
msgstr ""
msgid "When the computer boots, the <filename>lvm2-activation</filename> systemd service unit executes <command>vgchange -aay</command> to “activate” the volume groups: it scans the available devices; those that have been initialized as physical volumes for LVM are registered into the LVM subsystem, those that belong to volume groups are assembled, and the relevant logical volumes are started and made available. There is therefore no need to edit configuration files when creating or modifying LVM volumes."
msgstr ""
msgid "Note, however, that the layout of the LVM elements (physical and logical volumes, and volume groups) is backed up in <filename>/etc/lvm/backup</filename>, which can be useful in case of a problem (or just to sneak a peek under the hood)."
msgstr ""
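msgid "As an aside, the <command>vgcfgbackup</command> and <command>vgcfgrestore</command> commands can create and restore such a metadata backup on demand; for instance (a sketch based on our example VG):"
msgstr ""
msgid "<computeroutput># </computeroutput><userinput>vgcfgbackup vg_critical</userinput>"
msgstr ""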
msgid "To make things easier, convenience symbolic links are also created in directories matching the VGs:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>ls -l /dev/vg_critical</userinput>\n"
"<computeroutput>total 0\n"
"lrwxrwxrwx 1 root root 7 Jun 10 17:05 lv_base -> ../dm-1\n"
"lrwxrwxrwx 1 root root 7 Jun 10 17:05 lv_files -> ../dm-0\n"
"# </computeroutput><userinput>ls -l /dev/vg_normal</userinput>\n"
"<computeroutput>total 0\n"
"lrwxrwxrwx 1 root root 7 Jun 10 17:05 lv_backups -> ../dm-2</computeroutput>"
msgstr ""
msgid "The LVs can then be used exactly like standard partitions:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>mkfs.ext4 /dev/vg_normal/lv_backups</userinput>\n"
"<computeroutput>mke2fs 1.42.12 (29-Aug-2014)\n"
"Creating filesystem with 3145728 4k blocks and 786432 inodes\n"
"Filesystem UUID: b5236976-e0e2-462e-81f5-0ae835ddab1d\n"
"[...]\n"
"Creating journal (32768 blocks): done\n"
"Writing superblocks and filesystem accounting information: done \n"
"# </computeroutput><userinput>mkdir /srv/backups</userinput>\n"
"<computeroutput># </computeroutput><userinput>mount /dev/vg_normal/lv_backups /srv/backups</userinput>\n"
"<computeroutput># </computeroutput><userinput>df -h /srv/backups</userinput>\n"
"<computeroutput>Filesystem Size Used Avail Use% Mounted on\n"
"/dev/mapper/vg_normal-lv_backups 12G 30M 12G 1% /srv/backups\n"
"# </computeroutput><userinput>[...]</userinput>\n"
"<computeroutput>[...]\n"
"# </computeroutput><userinput>cat /etc/fstab</userinput>\n"
"<computeroutput>[...]\n"
"/dev/vg_critical/lv_base /srv/base ext4 defaults 0 2\n"
"/dev/vg_critical/lv_files /srv/files ext4 defaults 0 2\n"
"/dev/vg_normal/lv_backups /srv/backups ext4 defaults 0 2</computeroutput>"
msgstr ""
msgid "From the applications' point of view, the myriad small partitions have now been abstracted into one large 12 GB volume, with a friendlier name."
msgstr ""
msgid "LVM Over Time"
msgstr ""
msgid "Even though the ability to aggregate partitions or physical disks is convenient, this is not the main advantage brought by LVM. The flexibility it brings is especially noticed as time passes, when needs evolve. In our example, let's assume that new large files must be stored, and that the LV dedicated to the file server is too small to contain them. Since we haven't used the whole space available in <filename>vg_critical</filename>, we can grow <filename>lv_files</filename>. For that purpose, we'll use the <command>lvresize</command> command, then <command>resize2fs</command> to adapt the filesystem accordingly:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>df -h /srv/files/</userinput>\n"
"<computeroutput>Filesystem Size Used Avail Use% Mounted on\n"
"/dev/mapper/vg_critical-lv_files 5.0G 4.6G 146M 97% /srv/files\n"
"# </computeroutput><userinput>lvdisplay -C vg_critical/lv_files</userinput>\n"
"<computeroutput> LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert\n"
" lv_files vg_critical -wi-ao-- 5.00g\n"
"# </computeroutput><userinput>vgdisplay -C vg_critical</userinput>\n"
"<computeroutput> VG #PV #LV #SN Attr VSize VFree\n"
" vg_critical 2 2 0 wz--n- 8.09g 2.09g\n"
"# </computeroutput><userinput>lvresize -L 7G vg_critical/lv_files</userinput>\n"
"<computeroutput> Size of logical volume vg_critical/lv_files changed from 5.00 GiB (1280 extents) to 7.00 GiB (1792 extents).\n"
" Logical volume lv_files successfully resized\n"
"# </computeroutput><userinput>lvdisplay -C vg_critical/lv_files</userinput>\n"
"<computeroutput> LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert\n"
" lv_files vg_critical -wi-ao-- 7.00g\n"
"# </computeroutput><userinput>resize2fs /dev/vg_critical/lv_files</userinput>\n"
"<computeroutput>resize2fs 1.42.12 (29-Aug-2014)\n"
"Filesystem at /dev/vg_critical/lv_files is mounted on /srv/files; on-line resizing required\n"
"old_desc_blocks = 1, new_desc_blocks = 1\n"
"The filesystem on /dev/vg_critical/lv_files is now 1835008 (4k) blocks long.\n"
"\n"
"# </computeroutput><userinput>df -h /srv/files/</userinput>\n"
"<computeroutput>Filesystem Size Used Avail Use% Mounted on\n"
"/dev/mapper/vg_critical-lv_files 6.9G 4.6G 2.1G 70% /srv/files</computeroutput>"
msgstr ""
msgid "<emphasis>CAUTION</emphasis> Resizing filesystems"
msgstr ""
msgid "Not all filesystems can be resized online; resizing a volume can therefore require unmounting the filesystem first and remounting it afterwards. Of course, if one wants to shrink the space allocated to an LV, the filesystem must be shrunk first; the order is reversed when the resizing goes in the other direction: the logical volume must be grown before the filesystem on it. It's rather straightforward, since at no time must the filesystem size be larger than the block device where it resides (whether that device is a physical partition or a logical volume)."
msgstr ""
msgid "The ext3, ext4 and xfs filesystems can be grown online, without unmounting; shrinking requires an unmount. The reiserfs filesystem allows online resizing in both directions. The venerable ext2 allows neither, and always requires unmounting."
msgstr ""
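msgid "As a sketch of the shrinking case for an ext4 volume (sizes and mount point are illustrative; note that the filesystem is reduced before the logical volume):"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>umount /srv/backups</userinput>\n"
"<computeroutput># </computeroutput><userinput>e2fsck -f /dev/vg_normal/lv_backups</userinput>\n"
"<computeroutput># </computeroutput><userinput>resize2fs /dev/vg_normal/lv_backups 8G</userinput>\n"
"<computeroutput># </computeroutput><userinput>lvreduce -L 8G vg_normal/lv_backups</userinput>\n"
"<computeroutput># </computeroutput><userinput>resize2fs /dev/vg_normal/lv_backups</userinput>\n"
"<computeroutput># </computeroutput><userinput>mount /dev/vg_normal/lv_backups /srv/backups</userinput>"
msgstr ""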
msgid "We could proceed in a similar fashion to extend the volume hosting the database, only we've reached the VG's available space limit:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>df -h /srv/base/</userinput>\n"
"<computeroutput>Filesystem Size Used Avail Use% Mounted on\n"
"/dev/mapper/vg_critical-lv_base 1008M 854M 104M 90% /srv/base\n"
"# </computeroutput><userinput>vgdisplay -C vg_critical</userinput>\n"
"<computeroutput> VG #PV #LV #SN Attr VSize VFree \n"
" vg_critical 2 2 0 wz--n- 8.09g 92.00m</computeroutput>"
msgstr ""
msgid "No matter, since LVM allows adding physical volumes to existing volume groups. For instance, maybe we've noticed that the <filename>sdb1</filename> partition, which was so far used outside of LVM, only contained archives that could be moved to <filename>lv_backups</filename>. We can now recycle it and integrate it to the volume group, and thereby reclaim some available space. This is the purpose of the <command>vgextend</command> command. Of course, the partition must be prepared as a physical volume beforehand. Once the VG has been extended, we can use similar commands as previously to grow the logical volume then the filesystem:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>pvcreate /dev/sdb1</userinput>\n"
"<computeroutput> Physical volume \"/dev/sdb1\" successfully created\n"
"# </computeroutput><userinput>vgextend vg_critical /dev/sdb1</userinput>\n"
"<computeroutput> Volume group \"vg_critical\" successfully extended\n"
"# </computeroutput><userinput>vgdisplay -C vg_critical</userinput>\n"
"<computeroutput> VG #PV #LV #SN Attr VSize VFree\n"
" vg_critical 3 2 0 wz--n- 9.09g 1.09g\n"
"# </computeroutput><userinput>[...]</userinput>\n"
"<computeroutput>[...]\n"
"# </computeroutput><userinput>df -h /srv/base/</userinput>\n"
"<computeroutput>Filesystem Size Used Avail Use% Mounted on\n"
"/dev/mapper/vg_critical-lv_base 2.0G 854M 1.1G 45% /srv/base</computeroutput>"
msgstr ""
msgid "<emphasis>GOING FURTHER</emphasis> Advanced LVM"
msgstr ""
msgid "LVM also caters for more advanced uses, where many details can be specified by hand. For instance, an administrator can tweak the size of the blocks that make up physical and logical volumes, as well as their physical layout. It is also possible to move blocks across PVs, for instance to fine-tune performance or, in a more mundane way, to free a PV when one needs to extract the corresponding physical disk from the VG (whether to assign it to another VG or to remove it from LVM altogether). The manual pages describing the commands are generally clear and detailed. A good entry point is the <citerefentry><refentrytitle>lvm</refentrytitle> <manvolnum>8</manvolnum></citerefentry> manual page."
msgstr ""
msgid "RAID or LVM?"
msgstr ""
msgid "RAID and LVM both bring indisputable advantages as soon as one leaves the simple case of a desktop computer with a single hard disk where the usage pattern doesn't change over time. However, RAID and LVM go in two different directions, with diverging goals, and it is legitimate to wonder which one should be adopted. The most appropriate answer will of course depend on current and foreseeable requirements."
msgstr ""
msgid "There are a few simple cases where the question doesn't really arise. If the requirement is to safeguard data against hardware failures, then obviously RAID will be set up on a redundant array of disks, since LVM doesn't really address this problem. If, on the other hand, the need is for a flexible storage scheme where the volumes are made independent of the physical layout of the disks, RAID doesn't help much and LVM will be the natural choice."
msgstr ""
msgid "<emphasis>NOTE</emphasis> If performance matters…"
msgstr ""
msgid "If input/output speed is of the essence, especially in terms of access times, using LVM and/or RAID in one of the many combinations may have some impact on performances, and this may influence decisions as to which to pick. However, these differences in performance are really minor, and will only be measurable in a few use cases. If performance matters, the best gain to be obtained would be to use non-rotating storage media (<indexterm><primary>SSD</primary></indexterm><emphasis>solid-state drives</emphasis> or SSDs); their cost per megabyte is higher than that of standard hard disk drives, and their capacity is usually smaller, but they provide excellent performance for random accesses. If the usage pattern includes many input/output operations scattered all around the filesystem, for instance for databases where complex queries are routinely being run, then the advantage of running them on an SSD far outweigh whatever could be gained by picking LVM over RAID or the reverse. In these situations, the choice should be determined by other considerations than pure speed, since the performance aspect is most easily handled by using SSDs."
msgstr ""
msgid "The third notable use case is when one just wants to aggregate two disks into one volume, either for performance reasons or to have a single filesystem that is larger than any of the available disks. This case can be addressed both by a RAID-0 (or even linear-RAID) and by an LVM volume. When in this situation, and barring extra constraints (for instance, keeping in line with the rest of the computers if they only use RAID), the configuration of choice will often be LVM. The initial set up is barely more complex, and that slight increase in complexity more than makes up for the extra flexibility that LVM brings if the requirements change or if new disks need to be added."
msgstr ""
msgid "Then of course, there is the really interesting use case, where the storage system needs to be made both resistant to hardware failure and flexible when it comes to volume allocation. Neither RAID nor LVM can address both requirements on their own; no matter, this is where we use both at the same time — or rather, one on top of the other. The scheme that has all but become a standard since RAID and LVM have reached maturity is to ensure data redundancy first by grouping disks in a small number of large RAID arrays, and to use these RAID arrays as LVM physical volumes; logical partitions will then be carved from these LVs for filesystems. The selling point of this setup is that when a disk fails, only a small number of RAID arrays will need to be reconstructed, thereby limiting the time spent by the administrator for recovery."
msgstr ""
msgid "Let's take a concrete example: the public relations department at Falcot Corp needs a workstation for video editing, but the department's budget doesn't allow investing in high-end hardware from the bottom up. A decision is made to favor the hardware that is specific to the graphic nature of the work (monitor and video card), and to stay with generic hardware for storage. However, as is widely known, digital video does have some particular requirements for its storage: the amount of data to store is large, and the throughput rate for reading and writing this data is important for the overall system performance (more than typical access time, for instance). These constraints need to be fulfilled with generic hardware, in this case two 300 GB SATA hard disk drives; the system data must also be made resistant to hardware failure, as well as some of the user data. Edited videoclips must indeed be safe, but video rushes pending editing are less critical, since they're still on the videotapes."
msgstr ""
msgid "RAID-1 and LVM are combined to satisfy these constraints. The disks are attached to two different SATA controllers to optimize parallel access and reduce the risk of a simultaneous failure, and they therefore appear as <filename>sda</filename> and <filename>sdc</filename>. They are partitioned identically along the following scheme:"
msgstr ""
msgid ""
"<computeroutput># </computeroutput><userinput>fdisk -l /dev/sda</userinput>\n"
"<computeroutput>\n"
"Disk /dev/sda: 300 GB, 300090728448 bytes, 586114704 sectors\n"
"Units: sectors of 1 * 512 = 512 bytes\n"
"Sector size (logical/physical): 512 bytes / 512 bytes\n"
"I/O size (minimum/optimal): 512 bytes / 512 bytes\n"
"Disklabel type: dos\n"
"Disk identifier: 0x00039a9f\n"
"\n"
"Device Boot Start End Sectors Size Id Type\n"
"/dev/sda1 * 2048 1992060 1990012 1.0G fd Linux raid autodetect\n"
"/dev/sda2 1992061 3984120 1992059 1.0G 82 Linux swap / Solaris\n"
"/dev/sda3 4000185 586099395 582099210 298G 5 Extended\n"
"/dev/sda5 4000185 203977305 199977120 102G fd Linux raid autodetect\n"
"/dev/sda6 203977306 403970490 199993184 102G fd Linux raid autodetect\n"
"/dev/sda7 403970491 586099395 182128904 93G 8e Linux LVM</computeroutput>"
msgstr ""
msgid "The first partitions of both disks (about 1 GB) are assembled into a RAID-1 volume, <filename>md0</filename>. This mirror is directly used to store the root filesystem."
msgstr ""
msgid "The <filename>sda2</filename> and <filename>sdc2</filename> partitions are used as swap partitions, providing a total 2 GB of swap space. With 1 GB of RAM, the workstation has a comfortable amount of available memory."
msgstr ""
msgid "The <filename>sda5</filename> and <filename>sdc5</filename> partitions, as well as <filename>sda6</filename> and <filename>sdc6</filename>, are assembled into two new RAID-1 volumes of about 100 GB each, <filename>md1</filename> and <filename>md2</filename>. Both these mirrors are initialized as physical volumes for LVM, and assigned to the <filename>vg_raid</filename> volume group. This VG thus contains about 200 GB of safe space."
msgstr ""
msgid "The remaining partitions, <filename>sda7</filename> and <filename>sdc7</filename>, are directly used as physical volumes, and assigned to another VG called <filename>vg_bulk</filename>, which therefore ends up with roughly 200 GB of space."
msgstr ""
msgid "Once the VGs are created, they can be partitioned in a very flexible way. One must keep in mind that LVs created in <filename>vg_raid</filename> will be preserved even if one of the disks fails, which will not be the case for LVs created in <filename>vg_bulk</filename>; on the other hand, the latter will be allocated in parallel on both disks, which allows higher read or write speeds for large files."
msgstr ""
msgid "We will therefore create the <filename>lv_usr</filename>, <filename>lv_var</filename> and <filename>lv_home</filename> LVs on <filename>vg_raid</filename>, to host the matching filesystems; another large LV, <filename>lv_movies</filename>, will be used to host the definitive versions of movies after editing. The other VG will be split into a large <filename>lv_rushes</filename>, for data straight out of the digital video cameras, and a <filename>lv_tmp</filename> for temporary files. The location of the work area is a less straightforward choice to make: while good performance is needed for that volume, is it worth risking losing work if a disk fails during an editing session? Depending on the answer to that question, the relevant LV will be created on one VG or the other."
msgstr ""
msgid "We now have both some redundancy for important data and much flexibility in how the available space is split across the applications. Should new software be installed later on (for editing audio clips, for instance), the LV hosting <filename>/usr/</filename> can be grown painlessly."
msgstr ""
msgid "<emphasis>NOTE</emphasis> Why three RAID-1 volumes?"
msgstr ""
msgid "We could have set up one RAID-1 volume only, to serve as a physical volume for <filename>vg_raid</filename>. Why create three of them, then?"
msgstr ""
msgid "The rationale for the first split (<filename>md0</filename> vs. the others) is about data safety: data written to both elements of a RAID-1 mirror are exactly the same, and it is therefore possible to bypass the RAID layer and mount one of the disks directly. In case of a kernel bug, for instance, or if the LVM metadata become corrupted, it is still possible to boot a minimal system to access critical data such as the layout of disks in the RAID and LVM volumes; the metadata can then be reconstructed and the files can be accessed again, so that the system can be brought back to its nominal state."
msgstr ""
msgid "The rationale for the second split (<filename>md1</filename> vs. <filename>md2</filename>) is less clear-cut, and more related to acknowledging that the future is uncertain. When the workstation is first assembled, the exact storage requirements are not necessarily known with perfect precision; they can also evolve over time. In our case, we can't know in advance the actual storage space requirements for video rushes and complete video clips. If one particular clip needs a very large amount of rushes, and the VG dedicated to redundant data is less than halfway full, we can re-use some of its unneeded space. We can remove one of the physical volumes, say <filename>md2</filename>, from <filename>vg_raid</filename> and either assign it to <filename>vg_bulk</filename> directly (if the expected duration of the operation is short enough that we can live with the temporary drop in performance), or undo the RAID setup on <filename>md2</filename> and integrate its components <filename>sda6</filename> and <filename>sdc6</filename> into the bulk VG (which grows by 200 GB instead of 100 GB); the <filename>lv_rushes</filename> logical volume can then be grown according to requirements."
msgstr ""
msgid "<primary>virtualization</primary>"
msgstr ""
msgid "Virtualization is one of the most major advances in the recent years of computing. The term covers various abstractions and techniques simulating virtual computers with a variable degree of independence on the actual hardware. One physical server can then host several systems working at the same time and in isolation. Applications are many, and often derive from this isolation: test environments with varying configurations for instance, or separation of hosted services across different virtual machines for security."
msgstr ""
msgid "There are multiple virtualization solutions, each with its own pros and cons. This book will focus on Xen, LXC, and KVM, but other noteworthy implementations include the following:"
msgstr ""
msgid "<primary><emphasis>VMWare</emphasis></primary>"
msgstr ""
msgid "<primary><emphasis>Bochs</emphasis></primary>"
msgstr ""
msgid "<primary><emphasis>QEMU</emphasis></primary>"
msgstr ""
msgid "<primary><emphasis>VirtualBox</emphasis></primary>"
msgstr ""
msgid "<primary><emphasis>KVM</emphasis></primary>"
msgstr ""
msgid "<primary><emphasis>LXC</emphasis></primary>"
msgstr ""
msgid "QEMU is a software emulator for a full computer; performances are far from the speed one could achieve running natively, but this allows running unmodified or experimental operating systems on the emulated hardware. It also allows emulating a different hardware architecture: for instance, an <emphasis>amd64</emphasis> system can emulate an <emphasis>arm</emphasis> computer. QEMU is free software. <ulink type=\"block\" url=\"http://www.qemu.org/\" />"
msgstr ""
msgid "Bochs is another free virtual machine, but it only emulates the x86 architectures (i386 and amd64)."
msgstr ""
msgid "VMWare is a proprietary virtual machine; being one of the oldest out there, it is also one of the most widely-known. It works on principles similar to QEMU. VMWare proposes advanced features such as snapshotting a running virtual machine. <ulink type=\"block\" url=\"http://www.vmware.com/\" />"
msgstr ""
msgid "VirtualBox is a virtual machine that is mostly free software (some extra components are available under a proprietary license). Unfortunately it is in Debian's “contrib” section because it includes some precompiled files that cannot be rebuilt without a proprietary compiler. While younger than VMWare and restricted to the i386 and amd64 architectures, it still includes some snapshotting and other interesting features. <ulink type=\"block\" url=\"http://www.virtualbox.org/\" />"
msgstr ""
msgid "Xen <indexterm><primary>Xen</primary></indexterm> is a “paravirtualization” solution. It introduces a thin abstraction layer, called a “hypervisor”, between the hardware and the upper systems; this acts as a referee that controls access to hardware from the virtual machines. However, it only handles a few of the instructions, the rest is directly executed by the hardware on behalf of the systems. The main advantage is that performances are not degraded, and systems run close to native speed; the drawback is that the kernels of the operating systems one wishes to use on a Xen hypervisor need to be adapted to run on Xen."
msgstr ""
msgid "Let's spend some time on terms. The hypervisor is the lowest layer, that runs directly on the hardware, even below the kernel. This hypervisor can split the rest of the software across several <emphasis>domains</emphasis>, which can be seen as so many virtual machines. One of these domains (the first one that gets started) is known as <emphasis>dom0</emphasis>, and has a special role, since only this domain can control the hypervisor and the execution of other domains. These other domains are known as <emphasis>domU</emphasis>. In other words, and from a user point of view, the <emphasis>dom0</emphasis> matches the “host” of other virtualization systems, while a <emphasis>domU</emphasis> can be seen as a “guest”."
msgstr ""
msgid "<emphasis>CULTURE</emphasis> Xen and the various versions of Linux"
msgstr ""
msgid "Xen was initially developed as a set of patches that lived out of the official tree, and not integrated to the Linux kernel. At the same time, several upcoming virtualization systems (including KVM) required some generic virtualization-related functions to facilitate their integration, and the Linux kernel gained this set of functions (known as the <emphasis>paravirt_ops</emphasis> or <emphasis>pv_ops</emphasis> interface). Since the Xen patches were duplicating some of the functionality of this interface, they couldn't be accepted officially."
msgstr ""
msgid "Xensource, the company behind Xen, therefore had to port Xen to this new framework, so that the Xen patches could be merged into the official Linux kernel. That meant a lot of code rewrite, and although Xensource soon had a working version based on the paravirt_ops interface, the patches were only progressively merged into the official kernel. The merge was completed in Linux 3.0. <ulink type=\"block\" url=\"http://wiki.xenproject.org/wiki/XenParavirtOps\" />"
msgstr ""
msgid "Since <emphasis role=\"distribution\">Jessie</emphasis> is based on version 3.16 of the Linux kernel, the standard <emphasis role=\"pkg\">linux-image-686-pae</emphasis> and <emphasis role=\"pkg\">linux-image-amd64</emphasis> packages include the necessary code, and the distribution-specific patching that was required for <emphasis role=\"distribution\">Squeeze</emphasis> and earlier versions of Debian is no more. <ulink type=\"block\" url=\"http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix\" />"
msgstr ""
msgid "<emphasis>NOTE</emphasis> Architectures compatible with Xen"
msgstr ""
msgid "Xen is currently only available for the i386, amd64, arm64 and armhf architectures."
msgstr ""
msgid "<emphasis>CULTURE</emphasis> Xen and non-Linux kernels"
msgstr ""
msgid "Xen requires modifications to all the operating systems one wants to run on it; not all kernels have the same level of maturity in this regard. Many are fully-functional, both as dom0 and domU: Linux 3.0 and later, NetBSD 4.0 and later, and OpenSolaris. Others only work as a domU. You can check the status of each operating system in the Xen wiki: <ulink type=\"block\" url=\"http://wiki.xenproject.org/wiki/Dom0_Kernels_for_Xen\" /> <ulink type=\"block\" url=\"http://wiki.xenproject.org/wiki/DomU_Support_for_Xen\" />"
msgstr ""
msgid "However, if Xen can rely on the hardware functions dedicated to virtualization (which are only present in more recent processors), even non-modified operating systems can run as domU (including Windows)."
msgstr ""
msgid "Using Xen under Debian requires three components:"
msgstr ""
msgid "The hypervisor itself. According to the available hardware, the appropriate package will be either <emphasis role=\"pkg\">xen-hypervisor-4.4-amd64</emphasis>, <emphasis role=\"pkg\">xen-hypervisor-4.4-armhf</emphasis>, or <emphasis role=\"pkg\">xen-hypervisor-4.4-arm64</emphasis>."
msgstr ""
msgid "A kernel that runs on that hypervisor. Any kernel more recent than 3.0 will do, including the 3.16 version present in <emphasis role=\"distribution\">Jessie</emphasis>."