GlusterFS 3.6.1 on CentOS 6.5: geo-replication and sparse files problem
Initial config:
[root@master-node alexm]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 config
special_sync_mode: partial
state_socket_unencoded: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.socket
gluster_log_file: /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
ignore_deletes: true
change_detector: changelog
gluster_command_dir: /usr/sbin/
georep_session_working_dir: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/
state_file: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.status
remote_gsyncd: /nonexistent/gsyncd
session_owner: 0008af01-9878-4eb1-832b-1da92875cde6
changelog_log_file: /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1-changes.log
socketdir: /var/run
working_dir: /var/lib/misc/glusterfsd/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1
state_detail_file: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1-detail.status
ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
pid_file: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.pid
log_file: /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.log
gluster_params: aux-gfid-mount
volume_id: 0008af01-9878-4eb1-832b-1da92875cde6
A sparse-files issue was reported in June this year but received no reply. We are hitting the same problem:
[root@client alexm]# du -sh /data/factory/test/*
1.8G    /data/factory/test/c65_base.dsk
284M    /data/factory/test/c65_base.qcow2
13K     /data/factory/test/c65_base.xml
2.0G    /data/factory/test/c66_base.dsk
283M    /data/factory/test/c66_base.qcow2
14K     /data/factory/test/c66_base.xml
0       /data/factory/test/image_base.dsk
0       /data/factory/test/image_base.qcow2
[root@client alexm]# du -sh --apparent-size /data/factory/test/*
10G     /data/factory/test/c65_base.dsk
284M    /data/factory/test/c65_base.qcow2
13K     /data/factory/test/c65_base.xml
10G     /data/factory/test/c66_base.dsk
283M    /data/factory/test/c66_base.qcow2
14K     /data/factory/test/c66_base.xml
12      /data/factory/test/image_base.dsk
14      /data/factory/test/image_base.qcow2
[root@client images]# du -sh /data/factory/test/
4.0G    /data/factory/test/
[root@remote-node2 alexm]# du -sh /data/bricks/sdb/data/test/
21G     /data/bricks/sdb/data/test/
Tried adding rsync_options to gsyncd.conf, but gsyncd didn't pick the option up:
[root@master-node alexm]# vi /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/gsyncd.conf
[root@master-node alexm]# tail -1 /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/gsyncd.conf
rsync_options = --sparse
[root@master-node alexm]# ps -ef | grep rsync
root     32456 31109 51 14:13 ?        00:00:01 rsync -avR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-JCnAJv/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/122101/cwd
root     32457 32456  0 14:13 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-JCnAJv/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRze.Ls --super --numeric-ids --inplace --no-implied-dirs . /proc/122101/cwd
root     32459 15064  0 14:14 pts/0    00:00:00 grep rsync
[root@master-node alexm]# ps -ef | grep rsync | grep --color sparse
[root@master-node alexm]#
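For reference, geo-replication config keys can normally also be set through the gluster CLI rather than by editing gsyncd.conf by hand (the same mechanism is used further below for use-tarssh). Something along these lines should set the same key; I did not re-test this path, and even if the option were accepted it would be appended after the hard-coded --inplace shown later, which the rsync shipped with CentOS 6 refuses to combine with --sparse:

# Unverified alternative to editing gsyncd.conf directly;
# the key name is assumed from the rsync_options entry above.
gluster vol geo-replication factory ruser@remote-node1::factory-az1 config rsync-options "--sparse"
gluster vol geo-replication factory ruser@remote-node1::factory-az1 config | grep -i rsync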
Tried a few things. According to this post and this one, XFS mount options (speculative preallocation) can cause sparse files to take up more space than they should.
Tested with allocsize=64k, but it didn't help:
[root@remote-node1 test]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type xfs (rw,allocsize=64k)
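For anyone reproducing the test: the option can be made persistent via /etc/fstab. The entry below is illustrative only; the device and mount point are the ones shown above, the remaining fields are assumptions, and allocsize=64k is the only part that matters here:

# Illustrative /etc/fstab entry for the brick, then remount it:
#   /dev/sdb1  /data/bricks/sdb  xfs  defaults,allocsize=64k  0 0
umount /data/bricks/sdb && mount /data/bricks/sdb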
Created a sparse file directly on the brick filesystem to make sure sparse files work as expected outside of Gluster, and they do:
[root@remote-node1 test]# dd if=/dev/zero of=test.sparse bs=1 count=0 seek=8G
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000111255 s, 0.0 kB/s
[root@remote-node1 test]# ls -lh test.sparse
-rw-r--r-- 1 root wheel 8.0G Oct 21 22:01 test.sparse
[root@remote-node1 test]# du -sh test.sparse
0       test.sparse
[root@remote-node1 test]# du -sh --apparent-size test.sparse
8.0G    test.sparse
Gluster volume mounted on the client:
[root@client alexm]# mount | grep arti
master-node:/factory on /data/factory type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
[root@client alexm]# du -sh /data/factory/test/
3.8G    /data/factory/test/
[root@client alexm]# du -sh /data/factory/test/*
1.5G    /data/factory/test/c65_base.dsk
284M    /data/factory/test/c65_base.qcow2
13K     /data/factory/test/c65_base.xml
1.6G    /data/factory/test/c66_base.dsk
283M    /data/factory/test/c66_base.qcow2
14K     /data/factory/test/c66_base.xml
0       /data/factory/test/image_base.dsk
0       /data/factory/test/image_base.qcow2
[root@client alexm]# du -sh --apparent-size /data/factory/test/*
10G     /data/factory/test/c65_base.dsk
284M    /data/factory/test/c65_base.qcow2
13K     /data/factory/test/c65_base.xml
10G     /data/factory/test/c66_base.dsk
283M    /data/factory/test/c66_base.qcow2
14K     /data/factory/test/c66_base.xml
12      /data/factory/test/image_base.dsk
14      /data/factory/test/image_base.qcow2
Remote site Gluster node after receiving the files - sparseness is lost:
[root@remote-node1 /]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type xfs (rw)
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/
22G     /data/bricks/sdb/data/test/
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
11G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
20K     /data/bricks/sdb/data/test/c65_base.xml
11G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
20K     /data/bricks/sdb/data/test/c66_base.xml
0       /data/bricks/sdb/data/test/image_base.dsk
0       /data/bricks/sdb/data/test/image_base.qcow2
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
13K     /data/bricks/sdb/data/test/c65_base.xml
10G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
14K     /data/bricks/sdb/data/test/c66_base.xml
12      /data/bricks/sdb/data/test/image_base.dsk
14      /data/bricks/sdb/data/test/image_base.qcow2
Tried the same with allocsize=64k on remote site nodes:
[root@remote-node1 /]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type xfs (rw,allocsize=64k)
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
11G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
20K     /data/bricks/sdb/data/test/c65_base.xml
11G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
20K     /data/bricks/sdb/data/test/c66_base.xml
0       /data/bricks/sdb/data/test/image_base.dsk
0       /data/bricks/sdb/data/test/image_base.qcow2
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
13K     /data/bricks/sdb/data/test/c65_base.xml
10G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
14K     /data/bricks/sdb/data/test/c66_base.xml
12      /data/bricks/sdb/data/test/image_base.dsk
14      /data/bricks/sdb/data/test/image_base.qcow2
Same story. I'm pretty confident the cause of this phenomenon is not the filesystem, which has no problem hosting sparse files, but the rsync command that Gluster uses.
To make sure it's not the filesystem, I tried the same bricks with ext4, and the sparse issue was still present:
[root@remote-node1 /]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type ext4 (rw)
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
11G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
20K     /data/bricks/sdb/data/test/c65_base.xml
11G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
20K     /data/bricks/sdb/data/test/c66_base.xml
0       /data/bricks/sdb/data/test/image_base.dsk
0       /data/bricks/sdb/data/test/image_base.qcow2
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
13K     /data/bricks/sdb/data/test/c65_base.xml
10G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
14K     /data/bricks/sdb/data/test/c66_base.xml
12      /data/bricks/sdb/data/test/image_base.dsk
14      /data/bricks/sdb/data/test/image_base.qcow2
According to the rsync manual, '--inplace' is incompatible with '--sparse'/'-S':
-S, --sparse
        Try to handle sparse files efficiently so they take up less space on the
        destination. Conflicts with --inplace because it's not possible to overwrite
        data in a sparse fashion.

--inplace
        This option changes how rsync transfers a file when the file's data needs to
        be updated: instead of the default method of creating a new copy of the file
        and moving it into place when it is complete, rsync instead writes the updated
        data directly to the destination file.

        This has several effects: (1) in-use binaries cannot be updated (either the OS
        will prevent this from happening, or binaries that attempt to swap-in their
        data will misbehave or crash), (2) the file's data will be in an inconsistent
        state during the transfer, (3) a file's data may be left in an inconsistent
        state after the transfer if the transfer is interrupted or if an update fails,
        (4) a file that does not have write permissions can not be updated, and (5) the
        efficiency of rsync's delta-transfer algorithm may be reduced if some data in
        the destination file is overwritten before it can be copied to a position later
        in the file (one exception to this is if you combine this option with --backup,
        since rsync is smart enough to use the backup file as the basis file for the
        transfer).

        WARNING: you should not use this option to update files that are being accessed
        by others, so be careful when choosing to use this for a copy.

        This option is useful for transfer of large files with block-based changes or
        appended data, and also on systems that are disk bound, not network bound.

        The option implies --partial (since an interrupted transfer does not delete the
        file), but conflicts with --partial-dir and --delay-updates. Prior to rsync
        2.6.4 --inplace was also incompatible with --compare-dest and --link-dest.
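The difference is easy to confirm outside of Gluster with a plain local rsync copy. A minimal sketch (file names are just for illustration, and the exact du figures will depend on the filesystem):

# Fully sparse 1 GiB source file, same trick as the dd test above.
dd if=/dev/zero of=/tmp/src.img bs=1 count=0 seek=1G

# Without --sparse every zero byte is written out, so the destination
# ends up fully allocated - this mirrors the gsyncd default behaviour.
rsync -a --inplace /tmp/src.img /tmp/dst_inplace.img

# With --sparse rsync seeks over runs of zeros and the holes are preserved.
rsync -a --sparse /tmp/src.img /tmp/dst_sparse.img

# Compare allocated vs. apparent sizes of the three files.
du -h /tmp/src.img /tmp/dst_inplace.img /tmp/dst_sparse.img
du -h --apparent-size /tmp/src.img /tmp/dst_inplace.img /tmp/dst_sparse.img

At least with the rsync shipped on CentOS 6, asking for both --inplace and --sparse at once is rejected outright, which is why --inplace has to go away before --sparse can help.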
You can see '-S' on the rsync command line below, but it is part of the SSH command (the control socket path), not the short form of '--sparse':
[root@master-node ssl]# ps -ef | grep rsync
root      7038 34055 34 09:58 ?        00:03:49 rsync -avR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/131830/cwd
root      7042  7038  0 09:58 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRze.Ls --super --numeric-ids --inplace --no-implied-dirs . /proc/131830/cwd
root      7043 34055 41 09:58 ?        00:04:38 rsync -avR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/131830/cwd
root      7044  7043  0 09:58 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRze.Ls --super --numeric-ids --inplace --no-implied-dirs . /proc/131830/cwd
The problem is that the "--inplace" option is hard-coded in the geo-replication code (see line 826 below), so I cannot see a way to remove it via gsyncd.conf (/var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/gsyncd.conf):
[root@master-node ssl]# grep -C10 -n inplace /usr/libexec/glusterfs/python/syncdaemon/resource.py
816-            raise GsyncdError(
817-                "RePCe major version mismatch: local %s, remote %s" %
818-                (exrv, rv))
819-
820-    def rsync(self, files, *args):
821-        """invoke rsync"""
822-        if not files:
823-            raise GsyncdError("no files to sync")
824-        logging.debug("files: " + ", ".join(files))
825-        argv = gconf.rsync_command.split() + \
826:            ['-avR0', '--inplace', '--files-from=-', '--super',
827-             '--stats', '--numeric-ids', '--no-implied-dirs'] + \
828-            gconf.rsync_options.split() + \
829-            (boolify(gconf.use_rsync_xattrs) and ['--xattrs'] or []) + \
830-            ['.'] + list(args)
831-        po = Popen(argv, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
832-        for f in files:
833-            po.stdin.write(f)
834-            po.stdin.write('\0')
835-
836-        po.stdin.close()
For the sake of the test I made a local change:
[root@master-node ssl]# rpm -qf /usr/libexec/glusterfs/python/syncdaemon/resource.py
glusterfs-geo-replication-3.6.1-1.el6.x86_64
[root@master-node ssl]# ls -l /usr/libexec/glusterfs/python/syncdaemon/resource.py
-rw-r--r-- 1 root root 51629 Nov  7  2014 /usr/libexec/glusterfs/python/syncdaemon/resource.py
[root@master-node ssl]# cp /usr/libexec/glusterfs/python/syncdaemon/resource.py /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 stop
Stopping geo-replication session between factory & ruser@remote-node1::factory-az1 has been successful
[root@master-node ssl]# vi /usr/libexec/glusterfs/python/syncdaemon/resource.py
[root@master-node ssl]# git diff /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak /usr/libexec/glusterfs/python/syncdaemon/resource.py
diff --git a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak b/usr/libexec/glusterfs/python/syncdaemon/resource.py
index 3a3bd00..e9721d4 100644
--- a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
+++ b/usr/libexec/glusterfs/python/syncdaemon/resource.py
@@ -823,7 +823,7 @@ class SlaveRemote(object):
             raise GsyncdError("no files to sync")
         logging.debug("files: " + ", ".join(files))
         argv = gconf.rsync_command.split() + \
-            ['-avR0', '--inplace', '--files-from=-', '--super',
+            ['-avR0', '--sparse', '--files-from=-', '--super',
              '--stats', '--numeric-ids', '--no-implied-dirs'] + \
             gconf.rsync_options.split() + \
             (boolify(gconf.use_rsync_xattrs) and ['--xattrs'] or []) + \
Restarted geo-replication so it uses "--sparse":
[root@master-node ssl]# ps -ef | grep rsync
root     21066 20974 33 10:50 ?        00:00:00 rsync -avR0 --sparse --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-9WViaB/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/142806/cwd
root     21067 21066  0 10:50 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-9WViaB/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRSze.Ls --super --numeric-ids --no-implied-dirs . /proc/142806/cwd
root     21069 40474  0 10:50 pts/0    00:00:00 grep rsync
But after creating the files on the slave, it did not replicate the data, reporting rsync error code 23 (partial transfer due to error):
[2015-10-22 10:51:18.218955] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/b2e1ec26-245e-459c-9554-e81fdf7f9241 [errcode: 23]
[2015-10-22 10:51:18.220887] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/47c6256e-c483-4541-bf05-e814357dee5a [errcode: 23]
[2015-10-22 10:51:18.222296] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/1e8ed7d0-d067-4a1f-99cf-a9b9b0be356e [errcode: 23]
[2015-10-22 10:51:18.222604] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/bfb7d195-5063-4b14-8366-77f93a56402d [errcode: 23]
[2015-10-22 10:51:18.223124] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/7a99864f-4a3c-4775-8c91-c4005b1866ba [errcode: 23]
[2015-10-22 10:51:18.224288] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/85e66945-c239-426b-ac0f-15567f1a2b0e [errcode: 23]
[2015-10-22 10:51:18.224412] W [master(/data/bricks/sdd/data):1005:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1445511017
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1445511032 CHANGELOG.1445511047 CHANGELOG.1445511064 CHANGELOG.1445511079 CHANGELOG.1445511096 CHANGELOG.1445511114 CHANGELOG.1445511129
The above is from the /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.log file.
[root@remote-node1 /]# ls -l /data/bricks/sdb/data/test/
total 0
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c65_base.dsk
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c65_base.qcow2
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c65_base.xml
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c66_base.dsk
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c66_base.qcow2
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c66_base.xml
lrwxrwxrwx 2 root ruser 12 Oct 22 10:50 image_base.dsk -> c66_base.dsk
lrwxrwxrwx 2 root ruser 14 Oct 22 10:50 image_base.qcow2 -> c66_base.qcow2
Based on this post and this post, there is no way of solving the sparse issue with rsync, so I decided to try tar+ssh instead:
[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 stop
Stopping geo-replication session between factory & ruser@remote-node1::factory-az1 has been successful
[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 config use-tarssh true
geo-replication config updated successfully
[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 start
Starting geo-replication session between factory & ruser@remote-node1::factory-az1 has been successful
[root@master-node ssl]# ps -ef | grep geo-replication | grep tar
root     33747 32262  4 12:37 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem ruser@remote-node2 tar --overwrite -xf - -C /proc/10812/cwd
[root@master-node ssl]# ps -ef | grep rsync
root     33759 40474  0 12:37 pts/0    00:00:00 grep rsync
It had a strange side effect though: the image files were constantly re-downloaded, dropped to zero and downloaded again, and sparseness was lost again. After reading the tar manual I decided to make the change below and try again:
[root@master-node ssl]# git diff /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak /usr/libexec/glusterfs/python/syncdaemon/resource.py
diff --git a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak b/usr/libexec/glusterfs/python/syncdaemon/resource.py
index 3a3bd00..b1dabd3 100644
--- a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
+++ b/usr/libexec/glusterfs/python/syncdaemon/resource.py
@@ -848,7 +848,7 @@ class SlaveRemote(object):
             raise GsyncdError("no files to sync")
         logging.debug("files: " + ", ".join(files))
         (host, rdir) = slaveurl.split(':')
-        tar_cmd = ["tar", "-cf", "-", "--files-from", "-"]
+        tar_cmd = ["tar", "--sparse", "-cf", "-", "--files-from", "-"]
         ssh_cmd = gconf.ssh_command_tar.split() + \
             [host, "tar", "--overwrite", "-xf", "-", "-C", rdir]
         p0 = Popen(tar_cmd, stdout=subprocess.PIPE,
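What --sparse changes on the tar side can be checked standalone (a sketch; file names are illustrative): GNU tar only scans for holes when creating an archive if --sparse is given, and members stored as sparse then get their holes restored on extraction.

# Fully sparse 1 GiB input file.
dd if=/dev/zero of=/tmp/src.img bs=1 count=0 seek=1G

# Without --sparse the zero bytes are stored literally - the archive is ~1 GiB.
tar -cf /tmp/plain.tar -C /tmp src.img

# With --sparse the holes are recorded instead, so the archive stays tiny
# and the extracted file comes back sparse.
tar --sparse -cf /tmp/sparse.tar -C /tmp src.img

ls -lh /tmp/plain.tar /tmp/sparse.tar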
Restarted geo-replication:
[root@master-node ssl]# ps -ef | grep tar
root     38859 38782 31 13:13 ?        00:00:02 tar --sparse -cf - --files-from -
root     38860 38782  0 13:13 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem ruser@remote-node2 tar --overwrite -xf - -C /proc/17380/cwd
root     38865 40474  0 13:13 pts/0    00:00:00 grep tar
It did solve the sparse issue:
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/c66_base.dsk
1.2G    /data/bricks/sdb/data/test/c66_base.dsk
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/c66_base.dsk
10G     /data/bricks/sdb/data/test/c66_base.dsk
but for some reason it kept overwriting the files constantly.
Interesting... At some point it stopped constantly overwriting the files and settled on the correct sizes and sparseness:
Source:
[root@client alexm]# du -sh /data/factory/test/*
1.8G    /data/factory/test/c65_base.dsk
284M    /data/factory/test/c65_base.qcow2
13K     /data/factory/test/c65_base.xml
2.0G    /data/factory/test/c66_base.dsk
283M    /data/factory/test/c66_base.qcow2
14K     /data/factory/test/c66_base.xml
0       /data/factory/test/image_base.dsk
0       /data/factory/test/image_base.qcow2
[root@client alexm]# du -sh --apparent-size /data/factory/test/*
10G     /data/factory/test/c65_base.dsk
284M    /data/factory/test/c65_base.qcow2
13K     /data/factory/test/c65_base.xml
10G     /data/factory/test/c66_base.dsk
283M    /data/factory/test/c66_base.qcow2
14K     /data/factory/test/c66_base.xml
12      /data/factory/test/image_base.dsk
14      /data/factory/test/image_base.qcow2
Geo-replication destination:
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
1.2G    /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
20K     /data/bricks/sdb/data/test/c65_base.xml
1.2G    /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
20K     /data/bricks/sdb/data/test/c66_base.xml
0       /data/bricks/sdb/data/test/image_base.dsk
0       /data/bricks/sdb/data/test/image_base.qcow2
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
13K     /data/bricks/sdb/data/test/c65_base.xml
10G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
14K     /data/bricks/sdb/data/test/c66_base.xml
12      /data/bricks/sdb/data/test/image_base.dsk
14      /data/bricks/sdb/data/test/image_base.qcow2
Whatever was triggering the constant re-syncing of the files appears to be a bug. The below is from /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.log
[2015-10-22 13:13:24.22871] I [master(/data/bricks/sdd/data):1216:crawl] _GMaster: starting history crawl... turns: 1, stime: (1445517445, 0)
[2015-10-22 13:13:24.23425] E [repce(agent):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 51, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 94, in cl_history_changelog
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27, in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 61] No data available
[2015-10-22 13:13:24.24013] E [repce(/data/bricks/sdd/data):207:__call__] RepceClient: call 38782:139685616219904:1445519604.02 (history) failed on peer with ChangelogException
[2015-10-22 13:13:24.24122] I [resource(/data/bricks/sdd/data):1332:service_loop] GLUSTER: Changelog history crawl failed, fallback to xsync: 61 - No data available
[2015-10-22 13:13:24.25684] I [master(/data/bricks/sdd/data):480:crawlwrap] _GMaster: primary master with volume id 0008af01-9878-4eb1-832b-1da92875cde6 ...
[2015-10-22 13:13:24.47414] I [master(/data/bricks/sdd/data):491:crawlwrap] _GMaster: crawl interval: 60 seconds
[2015-10-22 13:13:24.75035] I [master(/data/bricks/sdd/data):1323:crawl] _GMaster: starting hybrid crawl..., stime: (1445517445, 0)
[2015-10-22 13:13:24.97860] I [master(/data/bricks/sdd/data):1333:crawl] _GMaster: processing xsync changelog /var/lib/misc/glusterfsd/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1/23fb04b9a75223da01a99dc595fa4a25/xsync/XSYNC-CHANGELOG.1445519604
Then it went away and everything settled:
[2015-10-22 13:14:20.878886] I [monitor(monitor):109:set_state] Monitor: new state: Stable
[2015-10-22 13:18:33.567658] I [master(/data/bricks/sdd/data):1330:crawl] _GMaster: finished hybrid crawl syncing, stime: (1445517445, 0)
[2015-10-22 13:18:33.569169] I [master(/data/bricks/sdd/data):480:crawlwrap] _GMaster: primary master with volume id 0008af01-9878-4eb1-832b-1da92875cde6 ...
[2015-10-22 13:18:33.592328] I [master(/data/bricks/sdd/data):491:crawlwrap] _GMaster: crawl interval: 3 seconds
[2015-10-22 13:18:33.689018] I [master(/data/bricks/sdd/data):1182:crawl] _GMaster: slave's time: (1445517445, 0)
[2015-10-22 13:40:16.734766] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 1 crawls, 6 turns
[2015-10-22 13:41:18.216042] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:42:19.419151] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:43:20.631899] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:44:21.754441] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:45:22.942805] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:46:24.45471] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:47:25.211256] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:48:26.366693] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:49:27.540987] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:50:28.710438] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:51:29.865518] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:52:31.40570] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:53:32.198990] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:54:33.500740] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:55:34.680412] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:56:36.298779] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
Turned the remote-site volume's filesystem back to XFS. With the above-mentioned code change, sparse files were created properly there as well:
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
1.4G    /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
20K     /data/bricks/sdb/data/test/c65_base.xml
1.9G    /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
20K     /data/bricks/sdb/data/test/c66_base.xml
0       /data/bricks/sdb/data/test/image_base.dsk
0       /data/bricks/sdb/data/test/image_base.qcow2
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G     /data/bricks/sdb/data/test/c65_base.dsk
284M    /data/bricks/sdb/data/test/c65_base.qcow2
13K     /data/bricks/sdb/data/test/c65_base.xml
10G     /data/bricks/sdb/data/test/c66_base.dsk
283M    /data/bricks/sdb/data/test/c66_base.qcow2
14K     /data/bricks/sdb/data/test/c66_base.xml
12      /data/bricks/sdb/data/test/image_base.dsk
14      /data/bricks/sdb/data/test/image_base.qcow2
I created a pull request to submit the code change to the GlusterFS code base. Here are the Gerrit review for the same and the Bugzilla bug report.