GlusterFS 3.6.1 on CentOS 6.5: geo-replication and sparse files problem

Initial config:

[root@master-node alexm]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 config
special_sync_mode: partial
state_socket_unencoded: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.socket
gluster_log_file: /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
ignore_deletes: true
change_detector: changelog
gluster_command_dir: /usr/sbin/
georep_session_working_dir: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/
state_file: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.status
remote_gsyncd: /nonexistent/gsyncd
session_owner: 0008af01-9878-4eb1-832b-1da92875cde6
changelog_log_file: /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1-changes.log
socketdir: /var/run
working_dir: /var/lib/misc/glusterfsd/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1
state_detail_file: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1-detail.status
ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
pid_file: /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.pid
log_file: /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.log
gluster_params: aux-gfid-mount
volume_id: 0008af01-9878-4eb1-832b-1da92875cde6

A sparse-file issue was reported back in June this year but received no reply. We are hitting the same issue:

[root@client alexm]# du -sh /data/factory/test/*
1.8G	/data/factory/test/c65_base.dsk
284M	/data/factory/test/c65_base.qcow2
13K	/data/factory/test/c65_base.xml
2.0G	/data/factory/test/c66_base.dsk
283M	/data/factory/test/c66_base.qcow2
14K	/data/factory/test/c66_base.xml
0	/data/factory/test/image_base.dsk
0	/data/factory/test/image_base.qcow2

[root@client alexm]# du -sh --apparent-size /data/factory/test/*
10G	/data/factory/test/c65_base.dsk
284M	/data/factory/test/c65_base.qcow2
13K	/data/factory/test/c65_base.xml
10G	/data/factory/test/c66_base.dsk
283M	/data/factory/test/c66_base.qcow2
14K	/data/factory/test/c66_base.xml
12	/data/factory/test/image_base.dsk
14	/data/factory/test/image_base.qcow2

[root@client images]# du -sh /data/factory/test/
4.0G	/data/factory/test/
 
[root@remote-node2 alexm]# du -sh /data/bricks/sdb/data/test/
21G	/data/bricks/sdb/data/test/

Tried adding the rsync option directly to gsyncd.conf, but geo-replication did not pick it up:

[root@master-node alexm]# vi /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/gsyncd.conf
 
[root@master-node alexm]# tail -1 /var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/gsyncd.conf
rsync_options = --sparse
 
[root@master-node alexm]# ps -ef | grep rsync
root     32456 31109 51 14:13 ?        00:00:01 rsync -avR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-JCnAJv/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/122101/cwd
root     32457 32456  0 14:13 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-JCnAJv/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRze.Ls --super --numeric-ids --inplace --no-implied-dirs . /proc/122101/cwd
root     32459 15064  0 14:14 pts/0    00:00:00 grep rsync
[root@master-node alexm]# ps -ef | grep rsync | grep --color sparse
[root@master-node alexm]#

Tried a few things... According to this post and this one, XFS mount options might cause sparse files to take up more space than they should.

Tested with allocsize=64k, but it didn't help:

[root@remote-node1 test]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type xfs (rw,allocsize=64k)

Created a sparse file directly on the filesystem to make sure sparseness works as expected outside of Gluster, and it does:

[root@remote-node1 test]# dd if=/dev/zero of=test.sparse bs=1 count=0 seek=8G
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000111255 s, 0.0 kB/s

[root@remote-node1 test]# ls -lh test.sparse
-rw-r--r-- 1 root wheel 8.0G Oct 21 22:01 test.sparse

[root@remote-node1 test]# du -sh test.sparse
0	test.sparse

[root@remote-node1 test]# du -sh --apparent-size test.sparse
8.0G	test.sparse
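
For reference, the same check can be done programmatically. Below is a minimal standalone Python sketch (not part of Gluster; the file name is arbitrary) that creates a hole the same way dd's seek does and then compares the apparent size with the blocks actually allocated, which is exactly the difference between du --apparent-size and du:

import os

path = "test.sparse"                # arbitrary test file name
with open(path, "wb") as f:
    f.seek(8 * 1024 ** 3)           # seek 8 GiB past the start without writing
    f.write(b"\0")                  # one byte at the end leaves an 8 GiB hole

st = os.stat(path)
print("apparent size: %d bytes" % st.st_size)             # ~8 GiB
print("allocated:     %d bytes" % (st.st_blocks * 512))   # a few KiB on a sparse-capable filesystem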

Gluster volume mounted on the client:

[root@client alexm]# mount | grep arti
master-node:/factory on /data/factory type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
 
[root@client alexm]# du -sh /data/factory/test/
3.8G	/data/factory/test/
 
[root@client alexm]# du -sh /data/factory/test/*
1.5G	/data/factory/test/c65_base.dsk
284M	/data/factory/test/c65_base.qcow2
13K	/data/factory/test/c65_base.xml
1.6G	/data/factory/test/c66_base.dsk
283M	/data/factory/test/c66_base.qcow2
14K	/data/factory/test/c66_base.xml
0	/data/factory/test/image_base.dsk
0	/data/factory/test/image_base.qcow2
 
[root@client alexm]# du -sh --apparent-size /data/factory/test/*
10G	/data/factory/test/c65_base.dsk
284M	/data/factory/test/c65_base.qcow2
13K	/data/factory/test/c65_base.xml
10G	/data/factory/test/c66_base.dsk
283M	/data/factory/test/c66_base.qcow2
14K	/data/factory/test/c66_base.xml
12	/data/factory/test/image_base.dsk
14	/data/factory/test/image_base.qcow2

On the remote site Gluster node, after receiving the files, sparseness is lost:

[root@remote-node1 /]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type xfs (rw)
 
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/
22G	/data/bricks/sdb/data/test/
 
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
11G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
20K	/data/bricks/sdb/data/test/c65_base.xml
11G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
20K	/data/bricks/sdb/data/test/c66_base.xml
0	/data/bricks/sdb/data/test/image_base.dsk
0	/data/bricks/sdb/data/test/image_base.qcow2
 
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
13K	/data/bricks/sdb/data/test/c65_base.xml
10G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
14K	/data/bricks/sdb/data/test/c66_base.xml
12	/data/bricks/sdb/data/test/image_base.dsk
14	/data/bricks/sdb/data/test/image_base.qcow2

Tried the same with allocsize=64k on the remote site nodes:

[root@remote-node1 /]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type xfs (rw,allocsize=64k)
 
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
11G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
20K	/data/bricks/sdb/data/test/c65_base.xml
11G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
20K	/data/bricks/sdb/data/test/c66_base.xml
0	/data/bricks/sdb/data/test/image_base.dsk
0	/data/bricks/sdb/data/test/image_base.qcow2

[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
13K	/data/bricks/sdb/data/test/c65_base.xml
10G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
14K	/data/bricks/sdb/data/test/c66_base.xml
12	/data/bricks/sdb/data/test/image_base.dsk
14	/data/bricks/sdb/data/test/image_base.qcow2

Same story. I'm pretty confident that the cause of this phenomenon is not the filesystem, which has no problem hosting sparse files, but the rsync command that Gluster uses.

To make sure it's not the filesystem, I tried the same bricks with ext4, and the sparse issue is still present:

[root@remote-node1 /]# mount | grep sdb
/dev/sdb1 on /data/bricks/sdb type ext4 (rw)
 
[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
11G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
20K	/data/bricks/sdb/data/test/c65_base.xml
11G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
20K	/data/bricks/sdb/data/test/c66_base.xml
0	/data/bricks/sdb/data/test/image_base.dsk
0	/data/bricks/sdb/data/test/image_base.qcow2
 
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
13K	/data/bricks/sdb/data/test/c65_base.xml
10G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
14K	/data/bricks/sdb/data/test/c66_base.xml
12	/data/bricks/sdb/data/test/image_base.dsk
14	/data/bricks/sdb/data/test/image_base.qcow2

According to the rsync manual, '--inplace' is incompatible with '--sparse'/'-S':

       -S, --sparse
              Try to handle sparse files efficiently so they take up less space on the destination. Conflicts with --inplace because it’s not possible to overwrite data in a sparse fashion.

       --inplace
              This option changes how rsync transfers a file when the file’s data needs to be updated: instead of the default method of creating a new copy of the file and moving it into place when it is complete, rsync instead writes the updated data directly to the destination file.

              This has several effects: (1) in-use binaries cannot be updated (either the OS will prevent this from happening, or binaries that attempt to swap-in their data will misbehave or crash), (2) the file’s data will be in an inconsistent state during the transfer, (3) a file’s data may be left in an inconsistent state after the transfer if the transfer is interrupted or if an update fails, (4) a file that does not have write permissions can not be updated, and (5) the efficiency of rsync’s delta-transfer algorithm may be reduced if some data in the destination file is overwritten before it can be copied to a position later in the file (one exception to this is if you combine this option with --backup, since rsync is smart enough to use the backup file as the basis file for the transfer).

              WARNING: you should not use this option to update files that are being accessed by others, so be careful when choosing to use this for a copy.

              This option is useful for transfer of large files with block-based changes or appended data, and also on systems that are disk bound, not network bound.

              The option implies --partial (since an interrupted transfer does not delete the file), but conflicts with --partial-dir and --delay-updates. Prior to rsync 2.6.4 --inplace was also incompatible with --compare-dest and --link-dest.

You can see '-S' in the rsync command line, but it is part of the SSH command (ssh's control socket option), not the '--sparse' synonym:

[root@master-node ssl]# ps -ef | grep rsync
root      7038 34055 34 09:58 ?        00:03:49 rsync -avR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/131830/cwd
root      7042  7038  0 09:58 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRze.Ls --super --numeric-ids --inplace --no-implied-dirs . /proc/131830/cwd
root      7043 34055 41 09:58 ?        00:04:38 rsync -avR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/131830/cwd
root      7044  7043  0 09:58 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-W2RGpR/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRze.Ls --super --numeric-ids --inplace --no-implied-dirs . /proc/131830/cwd

The problem is that the "--inplace" option is hardcoded in the geo-replication code (see line 826 below), so I cannot see a way to remove it via gsyncd.conf (/var/lib/glusterd/geo-replication/factory_remote-node1_factory-az1/gsyncd.conf):

[root@master-node ssl]# grep -C10 -n inplace /usr/libexec/glusterfs/python/syncdaemon/resource.py
816-            raise GsyncdError(
817-                "RePCe major version mismatch: local %s, remote %s" %
818-                (exrv, rv))
819-
820-    def rsync(self, files, *args):
821-        """invoke rsync"""
822-        if not files:
823-            raise GsyncdError("no files to sync")
824-        logging.debug("files: " + ", ".join(files))
825-        argv = gconf.rsync_command.split() + \
826:            ['-avR0', '--inplace', '--files-from=-', '--super',
827-             '--stats', '--numeric-ids', '--no-implied-dirs'] + \
828-            gconf.rsync_options.split() + \
829-            (boolify(gconf.use_rsync_xattrs) and ['--xattrs'] or []) + \
830-            ['.'] + list(args)
831-        po = Popen(argv, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
832-        for f in files:
833-            po.stdin.write(f)
834-            po.stdin.write('\0')
835-
836-        po.stdin.close()
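
If the sparse behaviour were ever made configurable rather than hardcoded, one approach would be to switch between '--inplace' and '--sparse' based on a config value. The following is only a sketch of that idea, using a hypothetical rsync_sparse option that does not exist in 3.6.1:

    def rsync(self, files, *args):
        """invoke rsync"""
        if not files:
            raise GsyncdError("no files to sync")
        logging.debug("files: " + ", ".join(files))
        # hypothetical option: keep the stock '--inplace' behaviour unless
        # sparse syncing was explicitly requested
        use_sparse = getattr(gconf, 'rsync_sparse', False)
        argv = gconf.rsync_command.split() + \
            ['-avR0'] + (['--sparse'] if use_sparse else ['--inplace']) + \
            ['--files-from=-', '--super',
             '--stats', '--numeric-ids', '--no-implied-dirs'] + \
            gconf.rsync_options.split() + \
            (boolify(gconf.use_rsync_xattrs) and ['--xattrs'] or []) + \
            ['.'] + list(args)
        po = Popen(argv, stdin=subprocess.PIPE, stderr=subprocess.PIPE)

Note that even if rsync_options = --sparse were picked up from gsyncd.conf, it would only be appended next to the hardcoded --inplace, which the manual quoted above says it conflicts with.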

For the sake of the test I made a local change:

[root@master-node ssl]# rpm -qf /usr/libexec/glusterfs/python/syncdaemon/resource.py
glusterfs-geo-replication-3.6.1-1.el6.x86_64
 
[root@master-node ssl]# ls -l /usr/libexec/glusterfs/python/syncdaemon/resource.py
-rw-r--r-- 1 root root 51629 Nov  7  2014 /usr/libexec/glusterfs/python/syncdaemon/resource.py
 
[root@master-node ssl]# cp /usr/libexec/glusterfs/python/syncdaemon/resource.py /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
 
[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 stop
Stopping geo-replication session between factory & ruser@remote-node1::factory-az1 has been successful

[root@master-node ssl]# vi /usr/libexec/glusterfs/python/syncdaemon/resource.py
 
[root@master-node ssl]# git diff /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak /usr/libexec/glusterfs/python/syncdaemon/resource.py
diff --git a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak b/usr/libexec/glusterfs/python/syncdaemon/resource.py
index 3a3bd00..e9721d4 100644
--- a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
+++ b/usr/libexec/glusterfs/python/syncdaemon/resource.py
@@ -823,7 +823,7 @@ class SlaveRemote(object):
             raise GsyncdError("no files to sync")
         logging.debug("files: " + ", ".join(files))
         argv = gconf.rsync_command.split() + \
-            ['-avR0', '--inplace', '--files-from=-', '--super',
+            ['-avR0', '--sparse', '--files-from=-', '--super',
              '--stats', '--numeric-ids', '--no-implied-dirs'] + \
             gconf.rsync_options.split() + \
             (boolify(gconf.use_rsync_xattrs) and ['--xattrs'] or []) + \

Restarted geo-replication so it uses "--sparse":

[root@master-node ssl]# ps -ef | grep rsync
root     21066 20974 33 10:50 ?        00:00:00 rsync -avR0 --sparse --files-from=- --super --stats --numeric-ids --no-implied-dirs . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-9WViaB/d517d24dbad39b9d01e3dc1662a34aae.sock --compress ruser@remote-node2:/proc/142806/cwd
root     21067 21066  0 10:50 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-9WViaB/d517d24dbad39b9d01e3dc1662a34aae.sock -l ruser remote-node2 rsync --server -vlogDtpRSze.Ls --super --numeric-ids --no-implied-dirs . /proc/142806/cwd
root     21069 40474  0 10:50 pts/0    00:00:00 grep rsync

But while it creates the files on the slaves, it does not replicate the data, reporting rsync error code 23 (partial transfer due to error):

[2015-10-22 10:51:18.218955] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/b2e1ec26-245e-459c-9554-e81fdf7f9241 [errcode: 23]
[2015-10-22 10:51:18.220887] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/47c6256e-c483-4541-bf05-e814357dee5a [errcode: 23]
[2015-10-22 10:51:18.222296] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/1e8ed7d0-d067-4a1f-99cf-a9b9b0be356e [errcode: 23]
[2015-10-22 10:51:18.222604] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/bfb7d195-5063-4b14-8366-77f93a56402d [errcode: 23]
[2015-10-22 10:51:18.223124] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/7a99864f-4a3c-4775-8c91-c4005b1866ba [errcode: 23]
[2015-10-22 10:51:18.224288] W [master(/data/bricks/sdd/data):294:regjob] _GMaster: Rsync: .gfid/85e66945-c239-426b-ac0f-15567f1a2b0e [errcode: 23]
[2015-10-22 10:51:18.224412] W [master(/data/bricks/sdd/data):1005:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1445511017
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1445511032 CHANGELOG.1445511047 CHANGELOG.1445511064 CHANGELOG.1445511079 CHANGELOG.1445511096 CHANGELOG.1445511114 CHANGELOG.1445511129

The above is from the /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.log file.

[root@remote-node1 /]# ls -l /data/bricks/sdb/data/test/
total 0
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c65_base.dsk
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c65_base.qcow2
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c65_base.xml
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c66_base.dsk
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c66_base.qcow2
-rw-r--r-- 2 root ruser  0 Oct 22 10:50 c66_base.xml
lrwxrwxrwx 2 root ruser 12 Oct 22 10:50 image_base.dsk -> c66_base.dsk
lrwxrwxrwx 2 root ruser 14 Oct 22 10:50 image_base.qcow2 -> c66_base.qcow2

Based on this post and this one, there is no way to solve the sparse issue with rsync, so I decided to try tar+ssh instead:

[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 stop
Stopping geo-replication session between factory & ruser@remote-node1::factory-az1 has been successful

[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 config use-tarssh true
geo-replication config updated successfully

[root@master-node ssl]# gluster vol geo-replication factory ruser@remote-node1::factory-az1 start
Starting geo-replication session between factory & ruser@remote-node1::factory-az1 has been successful

[root@master-node ssl]# ps -ef | grep geo-replication | grep tar
root     33747 32262  4 12:37 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem ruser@remote-node2 tar --overwrite -xf - -C /proc/10812/cwd

[root@master-node ssl]# ps -ef | grep rsync
root     33759 40474  0 12:37 pts/0    00:00:00 grep rsync

It had a strange side effect though: the image files were constantly downloaded, dropped to zero and downloaded again, and sparseness was again lost. After reading the tar manual I decided to make the change below and try again:

[root@master-node ssl]# git diff /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak /usr/libexec/glusterfs/python/syncdaemon/resource.py
diff --git a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak b/usr/libexec/glusterfs/python/syncdaemon/resource.py
index 3a3bd00..b1dabd3 100644
--- a/usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
+++ b/usr/libexec/glusterfs/python/syncdaemon/resource.py
@@ -848,7 +848,7 @@ class SlaveRemote(object):
             raise GsyncdError("no files to sync")
         logging.debug("files: " + ", ".join(files))
         (host, rdir) = slaveurl.split(':')
-        tar_cmd = ["tar", "-cf", "-", "--files-from", "-"]
+        tar_cmd = ["tar", "--sparse", "-cf", "-", "--files-from", "-"]
         ssh_cmd = gconf.ssh_command_tar.split() + \
             [host, "tar", "--overwrite", "-xf", "-", "-C", rdir]
         p0 = Popen(tar_cmd, stdout=subprocess.PIPE,
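
As a side note, the effect of that single --sparse flag can be reproduced outside of gsyncd. The sketch below uses hypothetical source/destination paths and host taken from the transcripts above (the real gsyncd passes --files-from and feeds the file list on stdin) and pipes a sparse-aware tar into tar over ssh, roughly the pipeline resource.py builds:

import subprocess

# hypothetical directories, host and key based on the setup described above
tar_cmd = ["tar", "--sparse", "-cf", "-", "-C", "/data/factory/test", "."]
ssh_cmd = ["ssh", "-i", "/var/lib/glusterd/geo-replication/tar_ssh.pem",
           "ruser@remote-node2",
           "tar", "--overwrite", "-xf", "-", "-C", "/data/bricks/sdb/data/test"]

p0 = subprocess.Popen(tar_cmd, stdout=subprocess.PIPE)
p1 = subprocess.Popen(ssh_cmd, stdin=p0.stdout)
p0.stdout.close()   # let tar receive SIGPIPE if the ssh side exits early
p1.wait()
p0.wait()

Only the creating side needs --sparse: GNU tar records the holes in the archive and restores them on extraction, which is why the change touches tar_cmd and not the remote extract command.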

Restarted geo-replication:

[root@master-node ssl]# ps -ef | grep tar
root     38859 38782 31 13:13 ?        00:00:02 tar --sparse -cf - --files-from -
root     38860 38782  0 13:13 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem ruser@remote-node2 tar --overwrite -xf - -C /proc/17380/cwd
root     38865 40474  0 13:13 pts/0    00:00:00 grep tar

It did solve the sparse issue:

[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/c66_base.dsk
1.2G	/data/bricks/sdb/data/test/c66_base.dsk

[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/c66_base.dsk
10G	/data/bricks/sdb/data/test/c66_base.dsk

but for some reason it kept constantly overwriting the files.

Interesting... Whatever was happening, it stopped constantly overwriting the files and settled on the correct sizes and sparseness:

Source:

[root@client alexm]# du -sh /data/factory/test/*
1.8G	/data/factory/test/c65_base.dsk
284M	/data/factory/test/c65_base.qcow2
13K	/data/factory/test/c65_base.xml
2.0G	/data/factory/test/c66_base.dsk
283M	/data/factory/test/c66_base.qcow2
14K	/data/factory/test/c66_base.xml
0	/data/factory/test/image_base.dsk
0	/data/factory/test/image_base.qcow2

[root@client alexm]# du -sh --apparent-size /data/factory/test/*
10G	/data/factory/test/c65_base.dsk
284M	/data/factory/test/c65_base.qcow2
13K	/data/factory/test/c65_base.xml
10G	/data/factory/test/c66_base.dsk
283M	/data/factory/test/c66_base.qcow2
14K	/data/factory/test/c66_base.xml
12	/data/factory/test/image_base.dsk
14	/data/factory/test/image_base.qcow2

Geo-replication destination:

[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
1.2G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
20K	/data/bricks/sdb/data/test/c65_base.xml
1.2G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
20K	/data/bricks/sdb/data/test/c66_base.xml
0	/data/bricks/sdb/data/test/image_base.dsk
0	/data/bricks/sdb/data/test/image_base.qcow2

[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
13K	/data/bricks/sdb/data/test/c65_base.xml
10G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
14K	/data/bricks/sdb/data/test/c66_base.xml
12	/data/bricks/sdb/data/test/image_base.dsk
14	/data/bricks/sdb/data/test/image_base.qcow2

Whatever was triggering the constant re-syncing of files seems to be a bug. The excerpt below is from /var/log/glusterfs/geo-replication/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1.log:

[2015-10-22 13:13:24.22871] I [master(/data/bricks/sdd/data):1216:crawl] _GMaster: starting history crawl... turns: 1, stime: (1445517445, 0)
[2015-10-22 13:13:24.23425] E [repce(agent):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 51, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 94, in cl_history_changelog
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27, in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 61] No data available
[2015-10-22 13:13:24.24013] E [repce(/data/bricks/sdd/data):207:__call__] RepceClient: call 38782:139685616219904:1445519604.02 (history) failed on peer with ChangelogException
[2015-10-22 13:13:24.24122] I [resource(/data/bricks/sdd/data):1332:service_loop] GLUSTER: Changelog history crawl failed, fallback to xsync: 61 - No data available
[2015-10-22 13:13:24.25684] I [master(/data/bricks/sdd/data):480:crawlwrap] _GMaster: primary master with volume id 0008af01-9878-4eb1-832b-1da92875cde6 ...
[2015-10-22 13:13:24.47414] I [master(/data/bricks/sdd/data):491:crawlwrap] _GMaster: crawl interval: 60 seconds
[2015-10-22 13:13:24.75035] I [master(/data/bricks/sdd/data):1323:crawl] _GMaster: starting hybrid crawl..., stime: (1445517445, 0)
[2015-10-22 13:13:24.97860] I [master(/data/bricks/sdd/data):1333:crawl] _GMaster: processing xsync changelog /var/lib/misc/glusterfsd/factory/ssh%3A%2F%2Fruser%40192.168.10.9%3Agluster%3A%2F%2F127.0.0.1%3Afactory-az1/23fb04b9a75223da01a99dc595fa4a25/xsync/XSYNC-CHANGELOG.1445519604

Then it went away and everything settled down:

[2015-10-22 13:14:20.878886] I [monitor(monitor):109:set_state] Monitor: new state: Stable
[2015-10-22 13:18:33.567658] I [master(/data/bricks/sdd/data):1330:crawl] _GMaster: finished hybrid crawl syncing, stime: (1445517445, 0)
[2015-10-22 13:18:33.569169] I [master(/data/bricks/sdd/data):480:crawlwrap] _GMaster: primary master with volume id 0008af01-9878-4eb1-832b-1da92875cde6 ...
[2015-10-22 13:18:33.592328] I [master(/data/bricks/sdd/data):491:crawlwrap] _GMaster: crawl interval: 3 seconds
[2015-10-22 13:18:33.689018] I [master(/data/bricks/sdd/data):1182:crawl] _GMaster: slave's time: (1445517445, 0)
[2015-10-22 13:40:16.734766] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 1 crawls, 6 turns
[2015-10-22 13:41:18.216042] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:42:19.419151] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:43:20.631899] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:44:21.754441] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:45:22.942805] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:46:24.45471] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:47:25.211256] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:48:26.366693] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:49:27.540987] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:50:28.710438] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:51:29.865518] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:52:31.40570] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:53:32.198990] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:54:33.500740] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:55:34.680412] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns
[2015-10-22 13:56:36.298779] I [master(/data/bricks/sdd/data):504:crawlwrap] _GMaster: 20 crawls, 0 turns

Turned the remote site volume's filesystem back to XFS. With the above-mentioned code change, sparse files were created properly there as well:

[root@remote-node1 /]# du -sh /data/bricks/sdb/data/test/*
1.4G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
20K	/data/bricks/sdb/data/test/c65_base.xml
1.9G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
20K	/data/bricks/sdb/data/test/c66_base.xml
0	/data/bricks/sdb/data/test/image_base.dsk
0	/data/bricks/sdb/data/test/image_base.qcow2
 
[root@remote-node1 /]# du -sh --apparent-size /data/bricks/sdb/data/test/*
10G	/data/bricks/sdb/data/test/c65_base.dsk
284M	/data/bricks/sdb/data/test/c65_base.qcow2
13K	/data/bricks/sdb/data/test/c65_base.xml
10G	/data/bricks/sdb/data/test/c66_base.dsk
283M	/data/bricks/sdb/data/test/c66_base.qcow2
14K	/data/bricks/sdb/data/test/c66_base.xml
12	/data/bricks/sdb/data/test/image_base.dsk
14	/data/bricks/sdb/data/test/image_base.qcow2

I created a pull request to submit the code change to the GlusterFS code base. Here are the Gerrit review for the same change and the Bugzilla bug report.
