GuerrillaBrowser Beginner's Guide, Part 2



Here we will have a look at Batch Mode.  If you haven't already read Part 1 of the Beginner's Guide, you may want to do that first, since Batch Mode is really just the machine-gun version of what you were doing in Single-Index Mode.  Ironically, Batch Mode may help Single-Index Mode make more sense to you, since GuerrillaBrowser was designed with Batch Mode in mind.

A typical way to download something like a picture from a website would be to use a "Save As" type dialog, in which you'd choose a format (maybe only 1 choice available), filename, and a location.  But if you're okay with the name and format used on the server, a regularized way to assign downloaded items to folders would allow a more robotic approach to downloading, saving you a lot of unnecessary flailing around with dialog boxes.

That's what the Cache Index does.  In Single-Index Mode, you set the spinner to choose a target subfolder in the current GB Cache.  In Batch Mode, the first 2 characters of each line in $Batch.txt (an ordinary ASCII text file) name the target subfolder for that line, so one batch can address up to 100 subfolders ("00" through "99").
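To make the file format concrete, here is a minimal Python sketch of how a $Batch.txt line could be split into its Cache Index and URL.  It is not GuerrillaBrowser's own code: the function names are made up for illustration, and the layout (two digits, a space, then the URL) is taken from the $Batch.txt listing shown later in this guide.

```python
def parse_batch_line(line):
    """Split one $Batch.txt line into (cache_index, url).

    Assumption: the first 2 characters are the Cache Index and the
    rest of the line (minus surrounding whitespace) is the URL.
    """
    line = line.rstrip("\r\n")
    index, url = line[:2], line[2:].strip()
    if not (len(index) == 2 and index.isascii() and index.isdigit()):
        raise ValueError(f"no two-digit Cache Index at start of line: {line!r}")
    return index, url


def parse_batch(text):
    """Parse a whole $Batch.txt file, skipping blank lines."""
    return [parse_batch_line(ln) for ln in text.splitlines() if ln.strip()]


sample = (
    "01 http://www.ibiblio.org/wm/paint/auth/holbein/\n"
    "02 http://www.ibiblio.org/wm/paint/auth/bellini/\n"
)
for index, url in parse_batch(sample):
    print(index, "->", url)
```

Each pair tells us which subfolder of the Cache a given URL's downloads would land in.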

Automation, as always, requires predictability, so while Batch Mode will give you considerable leverage on cooperative websites, you may not find it much help on uncooperative ones.  Experience will soon teach you which is which among your favorite sites.  It may take you a while to get the hang of switching between Single-Index and Batch Mode depending on what the situation requires.

Let's begin by using the "Open Cache..." command to locate the GB Cache we created before (C:\GB\$Cache.txt).  Then we can use the spinner to choose Cache Index "00" and the "Open Map" command to load the C:\GB\00\$4.map file:


GB menu "Open Map"


The "Open Map" command automatically switches us to Document View (if we weren't there already).  We never bothered to download thumbs for this Map so the thumbs margin (on the left) is blank.  We could have downloaded thumbs just now prior to opening the Map, but let's just pick out a couple of URLs without them:


GB Document View selections


We're in Add Mode (indicated by the blinking red caret), which is best for multiple selection.  We could also have used Replace Mode (black caret) in this instance, because a single extended selection would work for a set of adjacent URLs like these.  Now all we have to do is copy our selections to the Windows Clipboard:


GB menu "Copy"


Then we can paste them into our text editor and save them as $URLs.txt in the current GB Cache folder:


Notepad paste clipboard contents


"Save As" dialog specify C:\GB\$URLs.txt


We can use the "Autonumber" command to assign unused Cache Indexes to a list of URLs.  We set the spinner to "01" to tell it where to start and choose the "Autonumber" command from the menu (the pop-up message gives you a last chance to change your mind):


GB menu "Autonumber"


"Autonumber" message box


The Results View shows the assignments that were made (the contents of the newly-created $Batch.txt file):


GB results black (done)


We can see that $Batch.txt has been added to the GB Cache, using the Command Prompt:


C:\GB>dir
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB

02/26/2008  11:31 PM    <DIR>          00
02/26/2008  11:52 PM               543 $Cache.txt
06/02/2008  09:47 PM    <DIR>          ..
06/02/2008  09:47 PM    <DIR>          .
06/02/2008  09:49 PM                94 $URLs.txt
06/02/2008  09:49 PM               100 $Batch.txt
               3 File(s)            737 bytes
               3 Dir(s)  43,902,386,176 bytes free

C:\GB>type $URLs.txt
http://www.ibiblio.org/wm/paint/auth/holbein/
http://www.ibiblio.org/wm/paint/auth/bellini/

C:\GB>type $Batch.txt
01 http://www.ibiblio.org/wm/paint/auth/holbein/
02 http://www.ibiblio.org/wm/paint/auth/bellini/

C:\GB>

Using Windows Explorer:


C:\GB\ contains $Batch.txt


Okay, for only 2 URLs we could have just typed the Indexes in ourselves to create $Batch.txt.  But this Cache has 99 unused Indexes, and we could have put that many URLs in the $URLs.txt file, so Autonumbering is much faster for large batches.
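For the curious, the numbering step can be sketched in a few lines of Python.  This is only an illustration under assumptions: the guide doesn't say how GuerrillaBrowser decides that an Index is "unused", so the sketch simply takes a set of Indexes already in use, and the function name is hypothetical.

```python
def autonumber(urls, start, used):
    """Pair each URL with the next unused two-digit Index at or after `start`.

    `used` is a set of integer Indexes to skip (an assumption about how
    "unused" is decided).  Returns lines in the $Batch.txt layout.
    """
    free = (i for i in range(start, 100) if i not in used)
    lines = []
    for url in urls:
        try:
            index = next(free)
        except StopIteration:
            raise ValueError("ran out of unused Cache Indexes") from None
        lines.append(f"{index:02d} {url}")
    return lines


urls = [
    "http://www.ibiblio.org/wm/paint/auth/holbein/",
    "http://www.ibiblio.org/wm/paint/auth/bellini/",
]
# Index 00 is already taken in our Cache, and the spinner is set to "01".
for line in autonumber(urls, start=1, used={0}):
    print(line)
```

With the spinner at "01" and only Index 00 in use, the two URLs get Indexes 01 and 02, matching the $Batch.txt contents listed above.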

Now we need to click the 'B' button to switch to Batch Mode, so that the Grab/Scrub commands will know to use the $Batch.txt file instead of the spinner like they did in Single-Index Mode.  (The "Autonumber" command doesn't care about the button since it works the same in either Mode.)  Then we can choose the "Grab HTML" command:


GB menu "Grab HTML"


"Batch Mode" message box


Notice that the pop-up message reminds us we're in Batch Mode.  The Results View shows each Index in our batch in gray to begin with, changing to black when the operation on that Index is complete:


GB results gray (in progress)


GB results black (done)


We can see brand-new "01" and "02" subfolders in the Cache containing some work files (thanks to the "Auto Scrub" menu item):


C:\GB>dir
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB

02/26/2008  11:31 PM    <DIR>          00
02/26/2008  11:52 PM               543 $Cache.txt
06/02/2008  09:49 PM                94 $URLs.txt
06/02/2008  09:49 PM               100 $Batch.txt
06/02/2008  09:59 PM    <DIR>          01
06/02/2008  09:59 PM    <DIR>          ..
06/02/2008  09:59 PM    <DIR>          .
06/02/2008  09:59 PM    <DIR>          02
               3 File(s)            737 bytes
               5 Dir(s)  43,898,052,608 bytes free

C:\GB>dir 01
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\01

06/02/2008  09:59 PM    <DIR>          ..
06/02/2008  09:59 PM             5,463 $0.raw
06/02/2008  09:59 PM             1,078 $4.map
06/02/2008  09:59 PM               407 $5T.txt
06/02/2008  09:59 PM               233 $6P.txt
06/02/2008  09:59 PM               366 $8O.txt
06/02/2008  09:59 PM    <DIR>          .
               5 File(s)          7,547 bytes
               2 Dir(s)  43,898,052,608 bytes free

C:\GB>type 01\$5T.txt
0001http://www.ibiblio.org/wm/home.css
0002http://www.ibiblio.org/wm/i/webmuseum.png
0003http://www.ibiblio.org/wm/i/painttool.png
0004http://www.ibiblio.org/wm/paint/auth/holbein/tuke.small.jpg
0005http://www.ibiblio.org/wm/paint/auth/holbein/burgomaster.small.jpg
0006http://www.ibiblio.org/wm/paint/auth/holbein/gisze.small.jpg
0007http://www.ibiblio.org/wm/paint/auth/holbein/southwell.small.jpg

C:\GB>dir 02
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\02

06/02/2008  09:59 PM             6,791 $0.raw
06/02/2008  09:59 PM    <DIR>          ..
06/02/2008  09:59 PM             2,032 $4.map
06/02/2008  09:59 PM               769 $5T.txt
06/02/2008  09:59 PM               545 $6P.txt
06/02/2008  09:59 PM               522 $8O.txt
06/02/2008  09:59 PM    <DIR>          .
               5 File(s)         10,659 bytes
               2 Dir(s)  43,898,052,608 bytes free

C:\GB>type 02\$5T.txt
0001http://www.ibiblio.org/wm/home.css
0002http://www.ibiblio.org/wm/i/webmuseum.png
0003http://www.ibiblio.org/wm/i/painttool.png
0004http://www.ibiblio.org/wm/paint/auth/bellini/blessing.small.jpg
0005http://www.ibiblio.org/wm/paint/auth/bellini/emo.small.jpg
0006http://www.ibiblio.org/wm/paint/auth/bellini/virgin-child.small.jpg
0007http://www.ibiblio.org/wm/paint/auth/bellini/barbarigo.small.jpg
0008http://www.ibiblio.org/wm/paint/auth/bellini/lamentation.small.jpg
0009http://www.ibiblio.org/wm/paint/auth/bellini/virgin-sts-left.small.jpg
0010http://www.ibiblio.org/wm/paint/auth/bellini/virgin-sts-right.small.jpg
0011http://www.ibiblio.org/wm/paint/auth/bellini/madonna.small.jpg
0012http://www.ibiblio.org/wm/paint/auth/bellini/feast.small.jpg

C:\GB>


C:\GB\01\ contains workfiles


This time we are going to download thumbs, because we want to pick and choose items by eye before using the "Grab List" command.  We're still in Batch Mode, so the "Grab Thumbs" command will apply to every Index in the current $Batch.txt file:


GB menu "Grab Thumbs"


"Batch Mode" message box


"F9F0" message box


Oops!  The "ibiblio.org" server bounced us off.  We could decide to abort the batch since it contains more requests to the same server.  But we just accessed it for the "Grab HTML" command earlier without any problems, so maybe it's just one of those Internet "hiccups" (server's a little busy or whatever).  Let's try continuing:


GB results red (problem)


The Results View shows Cache Index 01 in red to indicate a problem.  We can snoop around in the C:\GB\01\ subfolder to see what happened.  When we do, we find that the "home.css" cascading style sheet from the 01\$5T.txt thumbs list didn't get downloaded (it should have appeared under the alias "0001.css" in the 01\01_files\ folder).
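Rather than eyeballing the folder, we could compare the thumbs list against what actually arrived.  The Python sketch below is not part of GuerrillaBrowser, and it rests on an assumption inferred from the listings in this guide: each $5T.txt line is a 4-digit serial number followed by the URL, and the alias is that serial plus the URL's final extension (so "webmuseum.png" becomes "0002.png").  The function names are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def thumb_alias(line):
    """Turn a $5T.txt line such as '0001http://host/a.css' into '0001.css'."""
    serial, url = line[:4], line[4:].strip()
    ext = posixpath.splitext(urlparse(url).path)[1]
    return serial + ext


def missing_thumbs(thumb_lines, present):
    """Return the aliases from the thumbs list that are not in `present`."""
    return [a for a in map(thumb_alias, thumb_lines) if a not in present]


thumbs_01 = [
    "0001http://www.ibiblio.org/wm/home.css",
    "0002http://www.ibiblio.org/wm/i/webmuseum.png",
    "0003http://www.ibiblio.org/wm/i/painttool.png",
    "0004http://www.ibiblio.org/wm/paint/auth/holbein/tuke.small.jpg",
    "0005http://www.ibiblio.org/wm/paint/auth/holbein/burgomaster.small.jpg",
    "0006http://www.ibiblio.org/wm/paint/auth/holbein/gisze.small.jpg",
    "0007http://www.ibiblio.org/wm/paint/auth/holbein/southwell.small.jpg",
]
# File names found in C:\GB\01\01_files\ after the interrupted "Grab Thumbs".
present_01 = {"0002.png", "0003.png", "0004.jpg", "0005.jpg", "0006.jpg", "0007.jpg"}

print(missing_thumbs(thumbs_01, present_01))
```

For our batch, the only alias left over is the style sheet.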

We have several options.  We could switch back to Single-Index Mode and repeat the "Grab Thumbs" for Index 01.  We could save the missing URL to the 01\$9L.txt file and use the "Grab List" command to fetch it, then rename the downloaded file and move it to the 01\01_files\ folder.  In this instance, we did get the exact same resource in the 02\02_files\ folder, so we could just copy it from there.  Or we could decide we aren't that interested in this particular file and forget about it.


C:\GB>dir 01
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\01

06/02/2008  09:59 PM               233 $6P.txt
06/02/2008  09:59 PM             5,463 $0.raw
06/02/2008  09:59 PM             1,078 $4.map
06/02/2008  09:59 PM               407 $5T.txt
06/02/2008  09:59 PM               366 $8O.txt
06/02/2008  10:17 PM    <DIR>          ..
06/02/2008  10:17 PM    <DIR>          .
06/02/2008  10:17 PM    <DIR>          01_files
               5 File(s)          7,547 bytes
               3 Dir(s)  43,890,679,808 bytes free

C:\GB>dir 01\01_files
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\01\01_files

06/02/2008  10:17 PM             3,298 0002.png
06/02/2008  10:17 PM               509 0003.png
06/02/2008  10:17 PM             1,036 0004.jpg
06/02/2008  10:17 PM             1,383 0005.jpg
06/02/2008  10:17 PM             1,408 0006.jpg
06/02/2008  10:17 PM    <DIR>          ..
06/02/2008  10:17 PM               925 0007.jpg
06/02/2008  10:17 PM    <DIR>          .
               6 File(s)          8,559 bytes
               2 Dir(s)  43,890,679,808 bytes free

C:\GB>


C:\GB\01\01_files\ missing 0001.css


Now that we have some thumbs, we're ready to open some Map files.  We could set the spinner and use the "Open Map" command to pick out the ones we want, but the "Open Batch" command (Shift+F2) will open the $4.map file for every Index in the current $Batch.txt file.  (The session-history list will only hold 100 documents, so we might need to close some to free up some space for a large batch.)
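As a rough picture of what "Open Batch" has to work out, the Python sketch below lists the Map file for each line of a batch.  It assumes, based on the paths used in this guide, that each Map lives at <Cache folder>\<Index>\$4.map; the function name is made up and this is not GuerrillaBrowser's actual code.

```python
from pathlib import PureWindowsPath


def batch_map_paths(cache_root, batch_lines):
    """Return the $4.map path for every non-blank $Batch.txt line."""
    root = PureWindowsPath(cache_root)
    paths = []
    for line in batch_lines:
        if line.strip():
            # The first 2 characters of the line are the Cache Index.
            paths.append(root / line[:2] / "$4.map")
    return paths


batch = [
    "01 http://www.ibiblio.org/wm/paint/auth/holbein/",
    "02 http://www.ibiblio.org/wm/paint/auth/bellini/",
]
for path in batch_map_paths(r"C:\GB", batch):
    print(path)
```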


GB Document View w/ thumbs


Clicking on one of the images in the thumbs margin will cause the GuerrillaViewer companion program to load the same $4.map file as its Image List.  By navigating through the list in GV and double-clicking images, we can remotely update the selections over in the GB window (depending on whether it's using Add Mode or Replace Mode):


GV thumb blowup and remote selection


We can record the selections we made to the 01\$9L.txt list by choosing the "Save List" command.


GB menu "Save List"


Since we used the "Open Batch" command, the "Forward" button ('>') will let us navigate to the 02\$4.map document, where we can make some selections as well.  We can see that the $9L.txt lists have been added to the subfolders for the Map files we used "Save List" on:


C:\GB>dir 01
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\01

06/02/2008  09:59 PM               366 $8O.txt
06/02/2008  09:59 PM             5,463 $0.raw
06/02/2008  09:59 PM             1,078 $4.map
06/02/2008  09:59 PM               233 $6P.txt
06/02/2008  09:59 PM               407 $5T.txt
06/02/2008  10:17 PM    <DIR>          01_files
06/02/2008  10:34 PM               115 $9L.txt
06/02/2008  10:34 PM    <DIR>          .
06/02/2008  10:34 PM    <DIR>          ..
               6 File(s)          7,662 bytes
               3 Dir(s)  43,884,158,976 bytes free

C:\GB>type 01\$9L.txt
http://www.ibiblio.org/wm/paint/auth/holbein/tuke.jpg
http://www.ibiblio.org/wm/paint/auth/holbein/southwell.jpg

C:\GB>dir 02
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\02

06/02/2008  09:59 PM             6,791 $0.raw
06/02/2008  09:59 PM               522 $8O.txt
06/02/2008  09:59 PM             2,032 $4.map
06/02/2008  09:59 PM               545 $6P.txt
06/02/2008  09:59 PM               769 $5T.txt
06/02/2008  10:19 PM    <DIR>          02_files
06/02/2008  10:38 PM               175 $9L.txt
06/02/2008  10:38 PM    <DIR>          .
06/02/2008  10:38 PM    <DIR>          ..
               6 File(s)         10,834 bytes
               3 Dir(s)  43,882,803,200 bytes free

C:\GB>type 02\$9L.txt
http://www.ibiblio.org/wm/paint/auth/bellini/blessing.jpg
http://www.ibiblio.org/wm/paint/auth/bellini/barbarigo.jpg
http://www.ibiblio.org/wm/paint/auth/bellini/feast.jpg

C:\GB>


C:\GB\01\ contains $9L.txt


Now that we've created some $9L.txt files for it to use, we can choose the "Grab List" command:


GB menu "Grab List"


"Batch Mode" message box


The images we picked out have been saved to their assigned subfolders:


C:\GB>dir 01\*.jpg
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\01

06/02/2008  10:48 PM           138,895 tuke.jpg
06/02/2008  10:48 PM           141,071 southwell.jpg
               2 File(s)        279,966 bytes
               0 Dir(s)  43,875,172,352 bytes free

C:\GB>dir 02\*.jpg
 Volume in drive C has no label.
 Volume Serial Number is 386B-9C6D

 Directory of C:\GB\02

06/02/2008  10:48 PM            58,052 blessing.jpg
06/02/2008  10:48 PM            33,136 barbarigo.jpg
06/02/2008  10:48 PM           163,373 feast.jpg
               3 File(s)        254,561 bytes
               0 Dir(s)  43,875,172,352 bytes free

C:\GB>


C:\GB\02\ contains downloaded files
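The listings above suggest a simple rule for where "Grab List" puts things: the file keeps the name it had on the server and goes into the subfolder for its Cache Index.  Here is a minimal Python sketch of that rule.  It is an inference from the directory listings rather than documented behavior, and the function name is hypothetical.

```python
import posixpath
from pathlib import PureWindowsPath
from urllib.parse import urlparse


def list_target(cache_root, index, url):
    """Guess where a URL from <index>\\$9L.txt is saved in the Cache."""
    # Keep the server's own file name (the last segment of the URL path).
    name = posixpath.basename(urlparse(url).path)
    return PureWindowsPath(cache_root) / index / name


picks = {
    "01": ["http://www.ibiblio.org/wm/paint/auth/holbein/tuke.jpg"],
    "02": ["http://www.ibiblio.org/wm/paint/auth/bellini/feast.jpg"],
}
for index, urls in picks.items():
    for url in urls:
        print(list_target(r"C:\GB", index, url))
```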


That should give you enough to experiment further on your own.  Peace out.