annotate Script_Explanation @ 201:30d2fb656029

scrapping grep -vf
author edhoprima@gmail.com <edhoprima@gmail.com>
date Mon, 29 Jun 2009 15:40:28 +0000
parents ac6533a8fb51
children
rev   line source
193
ac6533a8fb51 - Documentation
edhoprima@gmail.com <edhoprima@gmail.com>
parents:
diff changeset
1 Variables:
2
3 Outer:
4 - ADDITIONAL_PATH
5 - MD5
6 - DEFAULT_SITE
7 - BASE_DIR
8 - MOEFETCHVERSION
9
10 Functions:
11
12 - Msg_Welcome
13 - Welcome message (MOEFETCHVERSION)
14 - Err_Help
15 - Err_Fatal
16 - Generate_Link
17 - chdir to ${BASE_DIR}/temp
18 - fetch xml with wget
19 - xsltproc the xml
20 - Check_Tools
21 - Check if MD5 is empty - if empty: check OS
22 - *BSD: md5 -r
23 - Linux/SunOS: md5sum
24 - Anything else: Err_Fatal
25 - Get md5 command (MD5_COMMAND)
26 - Check availability of needed tools
27 - cut
28 - sed
29 - wc
30 - wget
31 - xsltproc
32 - xargs
33 - rm
34 - mkdir
35 - chown
36 - comm
37 - grep
38 - date
39 - MD5_COMMAND
40 - Check for grep usability
41 - TODO: replace grep -f with a POSIX-compatible alternative
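The Check_Tools steps above could be sketched roughly as follows. This is only an illustration, not the script's actual code: the function names pick_md5_command and check_tools are hypothetical, and the BSD command is assumed to be `md5 -r`.

```shell
#!/bin/sh
# Sketch only: pick_md5_command and check_tools are hypothetical names.

# Print the MD5 command for the given OS name (as reported by uname).
# Fails for any OS that is not listed, where the script would call Err_Fatal.
pick_md5_command() {
    case "$1" in
        *BSD) echo "md5 -r" ;;          # assumption: BSD md5 with reversed output
        Linux|SunOS) echo "md5sum" ;;
        *) return 1 ;;
    esac
}

# Print every tool from the argument list that is not found on PATH
# and return non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

A caller might run `MD5=$(pick_md5_command "$(uname)")` when MD5 is empty, then `check_tools cut sed wc wget xsltproc xargs rm mkdir chown comm grep date` and abort on a non-zero status.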
42 - Check_Folders
43 - Check BASE_DIR ownership
44 - Check existence (create if missing) and ownership (apply globally writable permission) of BASE_DIR/:
45 - temp
46 - trash
47 - deleted
48 - SITE_DIR/TARGET_DIR
49 - Check if "BASE_DIR/SITE_DIR/TARGET_DIR" is empty: if empty, ISNEW=1
50 - Create temporary files: BASE_DIR/temp/${SITE_DIR}-${TARGET_DIR}-:
51 - error
52 - ok
53 - list
54 - newlist
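A minimal sketch of the Check_Folders steps above, under some assumptions: prepare_folders is a hypothetical name, the three directories are passed as arguments, and the ownership and permission handling is left out.

```shell
#!/bin/sh
# Sketch only: prepare_folders is a hypothetical name.
# Arguments: BASE_DIR, SITE_DIR, TARGET_DIR.
# Ownership checks and permission changes are deliberately omitted.
prepare_folders() {
    base=$1 site=$2 target=$3

    # Create the working folders if they do not exist yet.
    for dir in temp trash deleted "$site/$target"; do
        mkdir -p "$base/$dir" || return 1
    done

    # ISNEW=1 when the target folder holds no entries at all.
    ISNEW=
    if [ -z "$(ls -A "$base/$site/$target")" ]; then
        ISNEW=1
    fi

    # Create the four temporary files without touching existing content.
    for suffix in error ok list newlist; do
        touch "$base/temp/$site-$target-$suffix" || return 1
    done
}
```

Because ISNEW is set in the caller's shell, the function has to be called directly rather than inside a command substitution.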
55 - Cleanup_Repository
56 - TRASH_DIR: ${BASE_DIR}/trash/${SITE_DIR}-${TARGET_DIR}-%Y%m%d-%H.%M
57 - create trash folder
58 - check whether each file in "${BASE_DIR}/${SITE_DIR}/${TARGET_DIR}/" matches [a-f0-9]{32}\..* or is a folder
59 - move to trash if it is trash
60 - check if file is contained in ${BASE_DIR}/temp/${SITE_DIR}-${TARGET_DIR}-list
61 - if it is not, move to trash
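One possible reading of the Cleanup_Repository steps above, as a sketch. The names is_md5_name and cleanup_repository are hypothetical. It assumes that "trash" means folders and files whose names do not match the pattern, and that the list file mentions each wanted file name somewhere on a line (for example inside a URL).

```shell
#!/bin/sh
# Sketch only: is_md5_name and cleanup_repository are hypothetical names.

# Succeed if the name looks like <32 lowercase hex digits>.<extension>.
is_md5_name() {
    printf '%s\n' "$1" | grep -Eq '^[a-f0-9]{32}\..*$'
}

# Arguments: target folder, list file, trash folder.
cleanup_repository() {
    target=$1 list=$2 trash=$3
    mkdir -p "$trash" || return 1
    for path in "$target"/*; do
        [ -e "$path" ] || continue        # empty folder: the glob did not expand
        name=${path##*/}
        if [ -d "$path" ] || ! is_md5_name "$name"; then
            mv "$path" "$trash/"          # assumption: folders and odd names are trash
        elif ! grep -Fq "$name" "$list"; then
            mv "$path" "$trash/"          # well named, but not mentioned in the list
        fi
    done
}
```

The trash folder name from the outline could be built with `"$BASE_DIR/trash/$SITE_DIR-$TARGET_DIR-$(date +%Y%m%d-%H.%M)"`.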
62 - Check_Files
63 - if not ISNEW (target folder is not empty):
64 - if not NOCLEAN (not skipping cleanup):
65 - Call Cleanup_Repository
66 - chdir to target folder
67 - TODO: chdir-free operation
68 - empty ${BASE_DIR}/temp/${SITE_DIR}-${TARGET_DIR}-error
69 - check files in current directory (${BASE_DIR}/${SITE_DIR}/${TARGET_DIR})
70 - skip if not correct file ([a-f0-9]{32}\..*)
71 - add every erroneous file to ${BASE_DIR}/temp/${SITE_DIR}-${TARGET_DIR}-error
72 - remove the files
73 - chdir to temp folder
74 - list ${BASE_DIR}/${SITE_DIR}/${TARGET_DIR}, compare with error, exclude the errors, put into ${SITE_DIR}-${TARGET_DIR}-ok
75 - get list of new files - compare with ${SITE_DIR}-${TARGET_DIR}-list
76 - TODO: remove ls, grep -f dependencies
77 - if ISQUICK: skip check
78 - if not ISQUICK: print 'empty repository'
79 - copy ${SITE_DIR}-${TARGET_DIR}-list to ${SITE_DIR}-${TARGET_DIR}-newlist
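The Check_Files steps above might look something like the sketch below. Several things are assumptions: check_files and new_entries are hypothetical names, the check is taken to compare each file's MD5 digest with the hash in its name, md5sum is assumed to be the available tool, and list entries are assumed to end in the file name. Unlike the outline, the sketch writes the ok list directly and avoids chdir, ls and grep -f.

```shell
#!/bin/sh
# Sketch only: check_files and new_entries are hypothetical names.

# Arguments: target folder, error-list file, ok-list file.
check_files() {
    target=$1 errors=$2 ok=$3
    : > "$errors"                         # start with empty result files
    : > "$ok"
    for path in "$target"/*; do
        [ -f "$path" ] || continue
        name=${path##*/}
        # Skip anything that is not named <md5>.<extension>.
        printf '%s\n' "$name" | grep -Eq '^[a-f0-9]{32}\..*$' || continue
        # Assumption: a file is correct when its digest equals the name's hash.
        sum=$(md5sum < "$path" | cut -d ' ' -f 1)
        if [ "$sum" = "${name%%.*}" ]; then
            printf '%s\n' "$name" >> "$ok"
        else
            printf '%s\n' "$name" >> "$errors"
            rm -f "$path"
        fi
    done
}

# Print the entries of the list file whose file name is not in the ok file.
# Arguments: list file, ok-list file.
new_entries() {
    list=$1 ok=$2
    while IFS= read -r entry; do
        grep -Fqx "${entry##*/}" "$ok" || printf '%s\n' "$entry"
    done < "$list"
}
```

The output of `new_entries` would then be redirected into the newlist file; for an empty repository the list is simply copied to newlist, as the outline says.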
80 - Fetch_Images
81 - chdir to ${BASE_DIR}/temp
82 - check if ${SITE_DIR}-${TARGET_DIR}-newlist is empty -> stop
83 - chdir to ${BASE_DIR}/${SITE_DIR}/${TARGET_DIR}
84 - start wget: wget -e continue=on -bi "${BASE_DIR}/temp/${SITE_DIR}-${TARGET_DIR}-newlist" -o "${BASE_DIR}/temp/${SITE_DIR}-${TARGET_DIR}.log"
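The Fetch_Images steps above, wrapped into a function as a sketch. fetch_images is a hypothetical name and the message text is made up; the wget invocation is the one quoted in the outline. The sketch uses absolute paths, so the first chdir to the temp folder is not needed.

```shell
#!/bin/sh
# Sketch only: fetch_images is a hypothetical name.
# Arguments: BASE_DIR, SITE_DIR, TARGET_DIR.
fetch_images() {
    newlist="$1/temp/$2-$3-newlist"

    # Stop early when the list of new files is missing or empty.
    if [ ! -s "$newlist" ]; then
        echo "nothing to fetch"
        return 0
    fi

    # wget saves into the current directory, so move to the target folder.
    cd "$1/$2/$3" || return 1
    wget -e continue=on -bi "$newlist" -o "$1/temp/$2-$3.log"
}
```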
85 - Init
86 - Add path (PATH)
87 - Check command (fetch/check/quickfetch/* - JOB)
88 - Check site (-s <site> or default - SITE)
89 - Check whether to skip folder cleanup (-nc/no clean - NOCLEAN)
90 - Get tags (TAGS)
91 - Check site - if SITE empty then set default (SITE=DEFAULT_SITE)
92 - TODO: Validate SITE
93 - Check tag - if TAGS empty then Err_Fatal
94 - Get BASE_DIR: default to PWD - fallback to HOME
95 - Validate BASE_DIR: must be an absolute path
96 - Get TARGET_DIR: escape TAGS (replace / with _)
97 - Get SITE_DIR: escape SITE (remove trailing /, replace / with _)
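The escaping and validation steps at the end of Init could be sketched like this. The function names escape_tags, escape_site and is_absolute are hypothetical, and the sketch assumes that any number of trailing slashes should be dropped from SITE.

```shell
#!/bin/sh
# Sketch only: escape_tags, escape_site and is_absolute are hypothetical names.

# TARGET_DIR: TAGS with every / replaced by _.
escape_tags() {
    printf '%s\n' "$1" | sed 's|/|_|g'
}

# SITE_DIR: SITE without trailing slashes, then every / replaced by _.
escape_site() {
    printf '%s\n' "$1" | sed -e 's|/*$||' -e 's|/|_|g'
}

# BASE_DIR is only accepted when it is an absolute path.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `TARGET_DIR=$(escape_tags "$TAGS")` and `SITE_DIR=$(escape_site "$SITE")` yield names that are safe to use as single path components.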