Fixing snapmerge...

classic Classic list List threaded Threaded
34 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

mavrothal
In the mean time I tried s19 from XOpup-2.2 on a 200kb folder

# time /mnt/+mnt+mmcblk0p1+xopup-205.sfs/usr/sbin/snapmergepuppy #s19
3194 (process ID) old priority 0, new priority -20
Freezing all processes
All process now frozen
deleting newly deleted files
deleting old whiteouts
rsync-ing
/usr/bin/lsof
All processes now thawed

real 0m10.035s
user 0m2.860s
sys 0m3.930s

# time snapmergepuppy
Merging /initrd/pup_rw onto /initrd/pup_ro1...

real 0m36.170s
user 0m0.940s
sys 0m2.500s

# time /initrd/pup_ro2/usr/sbin/snapmergepuppy
Merging /initrd/pup_rw onto /initrd/pup_ro1...

real 0m48.153s
user 0m1.170s
sys 0m2.840s

Is 3-4 times faster!
No wonder I was using it :-)
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
In reply to this post by JakeSFR
JakeSFR wrote
Hmm, something's wrong with attachements here..? - can't unpack downloaded file (archive error).
Ahh, it seems to be an issue of server/browser pair, because nabble.com uses:
Server: Jetty(7.6.0.v20120127)
and:
http://jira.codehaus.org/browse/JETTY-1409

Anyways, downloading via wget is ok...

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
Yet another accelerant I just spotted in James' script (I can't believe I haven't thought about this before!) - parameter expansion instead of dirname/basename externals.

BN="`basename "$N"`"
DN="`dirname "$N"`"

changed to:

BN="${N##*/}"
DN="${N%/*}"
[ "$DN" = "$N" ] && DN="."

(the 3rd line is necessary to emulate dirname properly - "$N" comes with leading / stripped off and if no other / in the string, dirname returns '.', whereas N%/* returns the entire string)

Needs more testing, of course, but got ~47% in comparison to the pristine snapmerge from pup_ro2.
That's better...

snapmergepuppy.tbz2

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

mavrothal
47% faster = ~half the time?

BTW I know is a puppy "tradition" but having now git and commit messages there is no reason for "historical" coding style. Would be better to delete rather than comment out code lines. Makes patches easier to understand and the code more readable overall.
Also "git blame" can tell exactly who changed every single line of code ;)
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
47% faster = ~half the time?
Yes sir! :)
Nice surprise - I wasn't expecting such a significant acceleration.

BTW I know is a puppy "tradition" but having now git and commit messages there is no reason for "historical" coding style. Would be better to delete rather than comment out code lines. Makes patches easier to understand and the code more readable overall.
Also "git blame" can tell exactly who changed every single line of code ;)
Yeah, will clean up the code a bit before commiting.
It's just my local, "dirty" version, still under testing, so in case if something goes wrong, it's easier to revert the changes.

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
This post was updated on .
Ok, I think it's been tested well enough:
https://github.com/puppylinux-woof-CE/woof-CE/commit/746ffa3edf8c73b307d87a900f82e166955be0a6

In the meantime I've been tinkering with 'cp -a -u -f' section, which I don't like for one reason: 'cp -u' copies a file depending on its mtime and this is very bad, for example:
1. Let's have a file in /root, eg. testfile, which is also already saved in pup_ro1
2. The file's mtime is, let's say,  2014-04-29 12:00:00
3. Now, we have a testfile.tar.gz archive, which also contains (other) testfile, with older mtime (2010-01-01 10:00:00).
4. Extract it to /root - it will overwrite the previous testfile, BUT tar by default preserves mtime.
5. Now, snapmerge won't copy that file to pup_ro1 due to the older mtime, but it should since the file is "technically" newer than the one from pup_ro1.

Same can happen while overwriting an existing file using 'cp -a' and in a few other occasions...

So, I tried another approach, using ctime.
The logic is simple: IF destination_file doesn't exist OR source_file's ctime is greater than destination_file's ctime THEN copy source -> destination

[ ! -e "$BASE/$N" ] || [ ${NCTIME%%.*} -gt `stat -c %Z "$BASE/$N"` ] && cp -a -f "$N" "$BASE/$N"

I was afraid it would have too negative impact on speed, so ran my standard test (10.000 short files, already saved in pup_ro1) and WHAT THE..!?!?
That's quite unbelievable, but got ~73% performance boost in comparison to the original snapmerge from pup_ro2!
It seems that  'stat -C %Z' is simply faster than 'cp -a -f' and now the latter gets executed only if there's a reason for it.

Oh well, when the actual copying takes place, the boost is bit lower - "only" ~65% ;).

I think now the woof-CE snapmerge can compete with Jamesbond's version. :D

Testing in progress...

snapmergepuppy.tbz2

EDIT: hmm, the actual efficiency seems to be somehow proportional to the amount of files in pup_rw: more files = higher speed ratio.
If there's not much to be processed, the speed boost is about 55-60% (still better than 47%). But with 10.000 files it's 70-75%.
Anyway, the most important is that my changes didn't decrease the performance. :)

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
All is working fine.
Merged: https://github.com/puppylinux-woof-CE/woof-CE/commit/4fde20348d7b791b9f5cdbc3af58d63a9b81bba1
___________

Another issue. Older versions of 'cp' have problem with '-f' option (at least if it comes for special files), see here for details:
http://www.murga-linux.com/puppy/viewtopic.php?p=775885#775885

Now, such behaviour would result in unnecessary, "false-positive" error messages from snapmerge.

Possible fixes (from worst to best, IMO) I can think of, are:

1. Screen out /dev/ completely from being merged onto pup_ro1.
Never tried that though, so don't know what possible consequences could be.
I assume BK had a reason for screening out only some of /dev/* entries.
Besides, /var/ also can contain some special files (like /var/run/acpid.socket), so it won't fix the problem anyways.

2. Remove handling of cp errors, but I don't think it's a good idea, as there still might be real fs errors that users should be informed about.

3. Add '--remove-destination' option.
This works fine with old and new cp, however not with busybox cp, in which this option isn't implemented at all.

4. Add '--remove-destination' option conditionally - only if detected cp is full version, not busybox one.
Busybox (v1.21.0) cp handles '-f' properly, btw.

5. Leave everything as is, hoping that no one will be building Puppy using some ancient coreutils version.

Thoughts?

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

jamesbond3142
On Mon, 12 May 2014 07:47:53 -0700 (PDT)
"JakeSFR [via woof-CE]" <[hidden email]> wrote:

>
> Possible fixes (from worst to best, IMO) I can think of, are:
>
> 1. Screen out /dev/ completely from being merged onto pup_ro1.
> Never tried that though, so don't know what possible consequences could be.
> I assume BK had a reason for screening out only some of /dev/* entries.
> Besides, /var/ also can contain some special files (like
> /var/run/acpid.socket), so it won't fix the problem anyways.
>

I'd go for this one. /dev has no business to stay in a savefile.
Whatever the reason was, it's no longer valid. Modern systems avoid the problem completely by mounting devtmpfs on top of /dev, so looking from pup_rw, /dev will always be blank.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
Thank you for the opinion, James.
Although excluding /dev won't fix that specific issue fully, but could be implemented regardless.
However, here in Slacko, /dev is not a mountpoint in neither PUPMODE 13 nor 12.

I just gave it a test run and everything seemed to be ok, with one exception - /dev is being created in pup_ro1 anyway (apparently while saving the session at shutdown time) and contains the following whiteouts:
# ls -AR /initrd/pup_ro1/dev/
/initrd/pup_ro1/dev/:
input  usb  .wh.snd

/initrd/pup_ro1/dev/input:
.wh.event1  .wh.event2  .wh.mouse0

/initrd/pup_ro1/dev/usb:
.wh.hiddev0
# 
Deleting all that stuff from pup_ro1 resulted in...inability to open urxvt.
I was able to open LXTerminal, but with no prompt and non-responisive keyboard.
I haven't tested much more than that.

So, it seems that /dev must be in pup_ro1 one way or another...

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

mavrothal
In reply to this post by JakeSFR
I'll take #5
As long as puppy will keep not using the full udev (now systemd) package, I would keep the "change as needed" approach in /dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
@Mavrothal: I've never been playing with pup_[a,y,z], so I want to be sure - taking into account their existence, I guess they also should be included in snapmerge, right?

So:
SFSPoints=$( grep -o -E "/initrd/pup_ro[0-9]+" /proc/mounts )
should be:
SFSPoints=$( grep -o -E "/initrd/pup_ro[0-9]+|/initrd/pup_[a,y,z]" /proc/mounts )

what would allow "deletion" of files also from these layers.

 #if file exists on a lower layer, have to save the whiteout file...
 #110206 Dougal: speedup and refine the search...
 for P in $SFSPoints
 do
   if [ -e "$P/$N" ] || [ -L "$P/$N" ] ; then	# SFR: broken symlinks also deserve to be processed ( '-e' won't detect them, needs to be also '-L')
     [ -d "${BASE}/${DN}" ] || mkdir -p "${BASE}/${DN}"
     touch "${BASE}/${DN}/.wh.${BN}"
     break
   fi
 done #110206 End Dougal.

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

Karl Godt
Administrator
This post was updated on .
In reply to this post by JakeSFR
Now am running from USB in PUPMODE 13 .

What I detected in the new code is that sed starts on file before tested for -s:

if [ -s /tmp/snapmergepuppy-error ]; then   #140102 SFR
sed -i '/No such file or directory/d' /tmp/snapmergepuppy-error # discard errors caused by bad timing



I additionally added | sort filter to find , for if filling fails because of low space, at least the USER (me) gets an ordered result :

#user options:
SORT_OPTS="-k3 -t\ "  #"-k3 -t ' '" does not work in tharpuppy both ash+bash
                      #-V means real number sorting.
#SORT_OPTS=-V         #-V option not supported by busybox sort currrently and older sort binary.
find . -xdev \
       -not -path . \
       -not -type d \
       -regextype posix-extended \
       -not \( -regex '.*/\.wh\.[^\]*' -type f \) \
       -not \( -regex '^./mnt.*|^./initrd.*|^./proc.*|^./sys.*|^./tmp.*|^./pup_.*|^./zdrv_.*|^./root/tmp.*|.*_zdrv_.*|^./dev/\..*|^./dev/fd.*|^./dev/pts.*|^./dev/snd.*|^./dev/shm.*|^./\.wh\..*|^./var/run.*|^./root/ftpd.*|^./var/tmp.*|.*\.XLOADED$' \) \
       -not \( -regex '.*\.thumbnails.*|.*\.part$|.*\.crdownload$' \) \
       -printf "%s %C@ %P\n" | \
       busybox sort $SORT_OPTS |


Then it is possible to check if pup_rw contains more content than savefile beforehand like (uncleaned) :
mountpoint -q "$BASE" || { gettext -e "ALERT:$BASE is not mounted"'!'"\n"; exit 1; }

 DU_SNAPB7=`du -s -B7 "$SNAP" |awk '{print $1}'`
 echo B7=$DU_SNAPB7
 DU_SNAPB8=`du -s -B8 "$SNAP" |awk '{print $1}'`
 echo B8=$DU_SNAPB8
 DU_SNAPK00=`du -s -B1000 "$SNAP" |awk '{print $1}'`
 echo K00=$DU_SNAPK00
 DU_SNAPK24=`du -s -B1024 "$SNAP" |awk '{print $1}'`
 echo K24=$DU_SNAPK24
 DU_SNAPM00=`du -s -B $((1000*1000)) "$SNAP" |awk '{print $1}'`
 echo M00=$DU_SNAPM00
 DU_SNAPM0024=`du -s -B $((1000*1024)) "$SNAP" |awk '{print $1}'`
 echo M0024=$DU_SNAPM0024
 DU_SNAPM24=`du -s -B $((1024*1024)) "$SNAP" |awk '{print $1}'`
 echo M24=$DU_SNAPM24

 if mountpoint "$SNAP"; then
  TOTAL_SNAPB="`df -B8 | grep -m1 " $SNAP$" | awk '{print $2}'`"
     DU_SNAPB=`df -B8 | grep -m1 " $SNAP$" | awk '{print $3}'`
               df -B8 | grep -m1 " $SNAP$" | awk '{print $2" "$3}'
  TOTAL_SNAPK="`df -B1024 | grep -m1 " $SNAP$" | awk '{print $2}'`"
     DU_SNAPK=`df -B1024 | grep -m1 " $SNAP$" | awk '{print $3}'`
               df -B1024 | grep -m1 " $SNAP$" | awk '{print $2" "$3}'
  TOTAL_SNAPM="`df -B $((1024*1024)) | grep -m1 " $SNAP$" | awk '{print $2}'`"
     DU_SNAPM=`df -B $((1024*1024)) | grep -m1 " $SNAP$" | awk '{print $3}'`
               df -B $((1024*1024)) | grep -m1 " $SNAP$" | awk '{print $2" "$3}'
  TOTAL_SNAPM=$(((TOTAL_SNAPB / 128) / 1024 * 1024 / 1000))
  echo TOTAL_SNAPM=$TOTAL_SNAPM
 fi

 DU_BASEB7=`du -s -B7 "$BASE" |awk '{print $1}'`
 echo B7=$DU_BASEB7
 DU_BASEB8=`du -s -B8 "$BASE" |awk '{print $1}'`
 echo B8=$DU_BASEB8
 DU_BASEK00=`du -s -B1000 "$BASE" |awk '{print $1}'`
 echo K00=$DU_BASEK00
 DU_BASEK24=`du -s -B1024 "$BASE" |awk '{print $1}'`
 echo K24=$DU_BASEK24
 DU_BASEM00=`du -s -B $((1000*1000)) "$BASE" |awk '{print $1}'`
 echo M00=$DU_BASEM00
 DU_BASEM0024=`du -s -B $((1000*1024)) "$BASE" |awk '{print $1}'`
 echo M0024=$DU_BASEM0024
 DU_BASEM24=`du -s -B $((1024*1024)) "$BASE" |awk '{print $1}'`
 echo M24=$DU_BASEM24

 if mountpoint "$BASE"; then
  TOTAL_BASEB="`df -B8 | grep -m1 " $BASE$" | awk '{print $2}'`"
  df -B8 | grep -m1 " $BASE$" | awk '{print $2" "$3}'
  TOTAL_BASEK="`df -B1000 | grep -m1 " $BASE$" | awk '{print $2}'`"
  df -B1000 | grep -m1 " $BASE$" | awk '{print $2" "$3}'
  TOTAL_BASEM="`df -B $((1000*1000)) | grep -m1 " $BASE$" | awk '{print $2}'`"
  df -B $((1000*1000)) | grep -m1 " $BASE$" | awk '{print $2" "$3}'
  TOTAL_BASEM=$(((TOTAL_BASEB / 128) / 1024 * 1024 / 1000))
 fi
 echo "DU_SNAPB=$DU_SNAPB TOTAL_BASEB=$TOTAL_BASEB"
 echo "DU_SNAPK=$DU_SNAPK TOTAL_BASEK=$TOTAL_BASEK"
 #if [ "$DU_SNAPB" -ge "$TOTAL_BASEB" ]; then
  if [ "$DU_SNAPK" -ge "$TOTAL_BASEK" ]; then
  export LANG="$OLDLANG"
  ( sleep 0.2; error_msg "Not enough free space in $BASE: $DU_SNAPB Bytes ( $DU_SNAPM MB ) needed, but only $TOTAL_BASEM MB available" ) &
  if [ "`pidof X`" ]; then xmessage -bg green -title "$0" -buttons "Proceed:100,Abort:101" " Space in target $BASE may not be sufficient.
   It is recommended to abort now and move some files to an external USB, rather than
   copying them from current temporary RAM into the savefile .
  Filling the savefile up 100% may lead to incomplete or corrupt files
   and very likely a corrupt filesystem too.
 Only proceed now if in an emergency situation !
"
  case $? in 100):;; *) exit 0;;esac
  fi
 fi
###


gettext -e "INFO:Merging $SNAP onto $BASE...""\n"


I also re-introduced parameter parsing old-style because creating a new savefile manually in commandline
and
mount -t aufs -o remount,add=1:/path/to/mountpoint/of/puppysave-NEW.2fs unionfs /
adding the new save-file while running system I want to fill the new savefile too :

error_msg () {
  if [ "$SHUTDOWN" = "no" -a "$XRUNNING" = "yes" ];then
    export DISPLAY=':0'
    /usr/lib/gtkdialog/box_splash -timeout 30 -close box -icon gtk-dialog-warning -bg red -text "$1"
  else
    echo "$1"
  fi
}

_say_help(){
 test "$*" && gettext -e "\n""ERROR:$*""\n"
 gettext -e "\n"" If parameters, it must be 'into directory1' 'from directory2'""\n"
 gettext -e " Usually run on flash modes 3,7,13 to copy into mountpoint pup_ro1 data from pup_rw""\n"
 gettext -e " The traditional syntax is as above - opposite to the cp syntax 'from direcctory1' 'into directory2'""\n"
 #gettext -e " Currently running PUPMODE '$PUPMODE' .""\n""\n"
 exit 2
}

echo -e "\n""INFO:$$:$0:$*"
QUESTPIDS=`pidof -o $$ -o %PPID "${0##*/}"`
[ "$QUESTPIDS" ] && { gettext -e "NOTICE:Already running with pid(s) $QUESTPIDS.""\n"; exit 1; }

if [ $# = 2 -a "$1" -a "$2" ]; then
 SNAP="$2"
 BASE="$1"
elif [ "$*" ]; then
 _say_help
elif [ $PUPMODE -eq 3 -o $PUPMODE -eq 7 -o $PUPMODE -eq 13 ]; then
 SNAP="/initrd/pup_rw" # topmost layer
 if [ "$SAVE_LAYER" ]; then #defined in PUPSTATE
 BASE="/initrd${SAVE_LAYER}"
 else
 BASE="/initrd/pup_ro1" # savefile
 fi
 #umount -l could unmount the savefile in rc.shutdown and if poweroff fails
 #and user returns to X and forgets to remount savefile, either funny messages will
 #fill xerrs.log or a dissapointed user may think everything is okay in /initrd/pup_ro1
 #when looking into it, but missing changed files next boot ...
 #So at least give a warning message when exiting to xerrs.log.
 mountpoint -q "$BASE" || { gettext -e "ALERT:$BASE is not mounted"'!'"\n"; exit 1; }

 DU_SNAPB=`du -s -B7 "$SNAP" |awk '{print $1}'`

 if mountpoint "$SNAP"; then
  TOTAL_SNAPB=`df -B8 | grep -m1 " $SNAP$" | awk '{print $2}'`
  DU_SNAPB=`df -B8 | grep -m1 " $SNAP$" | awk '{print $3}'`
  TOTAL_SNAPM=$(((TOTAL_SNAPB / 128) /1024))
  DU_SNAPM=$(((DU_SNAPB / 128) / 1024 * 1024 / 1000))
 fi

 DU_BASEB=`du -s -B7 "$BASE" |awk '{print $1}'`

 if mountpoint "$BASE"; then
  TOTAL_BASEB=`df -B8 | grep -m1 " $BASE$" | awk '{print $2}'`
  TOTAL_BASEM=$(((TOTAL_BASEB / 128) / 1024 * 1024 / 1000))
 fi

echo DU_SNAPM=$DU_SNAPM TOTAL_BASEM=$TOTAL_BASEM
echo DU_SNAPB=$DU_SNAPB TOTAL_BASEB=$TOTAL_BASEB
 if [ "$DU_SNAPB" -ge "$TOTAL_BASEB" ]; then
  export LANG="$OLDLANG"
  error_msg "Not enough free space in $BASE: $DU_SNAPB Bytes ( $DU_SNAPM MB ) needed. Please move some bigger unneeded files onto an external USB drive. Also note, that there is the 'Resize personal storage file' application in the Utilities Menu, but that works at next boot and would not help securing your current data from the top layer in memory."
  exit 5
 fi

else
 _say_help
fi

# Precautions
SNAP=`echo "$SNAP" | tr -s '/'`
BASE=`echo "$BASE" | tr -s '/'`
test -d "$SNAP" || _say_help "ERROR:$SNAP not a directory."
test -d "$BASE" || _say_help "ERROR:$BASE not a directory."
[ "$BASE" = "$SNAP" ] && _say_help "ERROR:$BASE and $SNAP must be different directories"'!'!"\n"
cd $SNAP || { gettext -e "ERROR:Unable to change into directory $SNAP"'!'"\n"; exit 3; }


What do you think ?
Otherwise snapmergepuppy was nice to work on - well done SFR !
 
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

JakeSFR
Hey Karl

I haven't been here for a while.
I've been using Fatdog exclusively since December, so I'm bit detached from Woof-CE and regular Pups.

What I detected in the new code is that sed starts on file before tested for -s:

if [ -s /tmp/snapmergepuppy-error ]; then   #140102 SFR
sed -i '/No such file or directory/d' /tmp/snapmergepuppy-error # discard errors caused by bad timing
This was intended like this.
If this sed would be placed after if [ -s ... ]; then, then if there were only those errors caused by bad timing, they will be discarded by 'sed', but the error message will be displayed anyway (shouldn't be in such a case), despite the fact that /tmp/snapmergepuppy-error log file is empty.

As for the other changes, they look quite interesting! Glad to see you have interest in working on this.

Greetings!
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fixing snapmerge...

Karl Godt
Administrator
Thanks for pointing that out .
It is a bit frustrating to see unexpected error messages about no such file or directory from unknown source .
And it is a bit wondering about an empty error message too .

test -s file is a bit unusable for it has size if only a newline like by echo "" >file . echo -n "" >file would return false on test -s file . But I avoid echo -n and replace it everytime i find it in puppy scripts : two bytes less code in script and one byte more in file better in full installs .

grep -q '[[:alnum:][:punct:]]' file would be better than test -s IMO .


12
Loading...