GuerrillaBrowser isn't hard to use, but it is very different from your big, conventional web browser. It doesn't look like it, work like it, or even have the same goal.
Your big web browser is designed to help the websites you visit put on a show. When you click on a link in your regular browser (and sometimes even when you don't), a lot of things can happen. What are they? Are they all safe? Who knows? The whole thing is sort of a black box.
GuerrillaBrowser is the opposite of a black box. It's the totally transparent way to connect to web servers. You get to see (and initiate) everything it does. You tell it what resources (URLs) to get, and you can see exactly what appears on your hard drive in response.
GuerrillaBrowser is just one component of a larger toolkit, some of which comes with the GuerrillaBrowser package and some of which came with your computer. You may also have other programs you found elsewhere. Let's see how that works.
GuerrillaBrowser needs a place to store downloaded files (the GB Cache).
If you used the SETUP.EXE program, it created a GB Cache for you at
First we'll create a new
C:\Program Files\GuerrillaBrowser>md \GB
C:\Program Files\GuerrillaBrowser>cd \GB
C:\GB>dir
Volume in drive C has no label.
Volume Serial Number is 386B-9C6D
Directory of C:\GB
02/26/2008 09:07 PM <DIR> ..
02/26/2008 09:07 PM <DIR> .
0 File(s) 0 bytes
2 Dir(s) 44,066,013,184 bytes free
C:\GB>
Right-click anywhere on GuerrillaBrowser's client area (white background) to see the pop-up menu (or use the menu key on the keyboard), and choose the "Open Cache..." command:
Use the "Open Cache" dialog to locate the
We're going to start with Single-Index Mode ('B' button on the toolbar is up). Single-Index Mode uses the Cache Index spinner and URL edit box on the toolbar, while Batch Mode gets its information from the $Batch.txt file. Batch Mode allows you to operate on up to 100 Cache Indexes at one go.
The "Grab HTML" command downloads whatever you've entered into the URL edit box on the toolbar and saves the server's response as $0.raw. If that response is a web page (HTML), the "Scrub HTML" command can scan it for possible links, and it's those links we want to discover. We need to choose a Cache Index in the spinner ("00" in this example) and paste a complete URL in the edit box.
While GuerrillaBrowser is executing the "Grab HTML" command, it switches to Results View and shows the Index and URL you gave it in gray, changing to black when the download completes:
Because we had the "Auto Scrub" menu item checked, the "Scrub HTML" command was automatically performed on the $0.raw file. We can see a brand new "00" subfolder in the Cache, and some work files in it.
Using the Command Prompt:
C:\GB>dir
Volume in drive C has no label.
Volume Serial Number is 386B-9C6D
Directory of C:\GB
02/26/2008 10:04 PM <DIR> ..
02/26/2008 10:04 PM <DIR> .
02/26/2008 10:04 PM <DIR> 00
0 File(s) 0 bytes
3 Dir(s) 44,043,231,232 bytes free
C:\GB>dir 00
Volume in drive C has no label.
Volume Serial Number is 386B-9C6D
Directory of C:\GB\00
02/26/2008 10:04 PM 13,213 $0.raw
02/26/2008 10:04 PM <DIR> ..
02/26/2008 10:04 PM 2,767 $4.map
02/26/2008 10:04 PM 1,018 $5T.txt
02/26/2008 10:04 PM 754 $6P.txt
02/26/2008 10:04 PM 682 $8O.txt
02/26/2008 10:04 PM <DIR> .
5 File(s) 18,434 bytes
2 Dir(s) 44,043,231,232 bytes free
C:\GB>
Using Windows Explorer:
The $4.map file associates links with their thumbnail images (if any) and description (if any). You can view it in your text editor, but it's normally accessed by choosing its Cache Index in the spinner, followed by the "Open Map" command. (See the "Quick-Start Tutorial" for an example.)
The Scrub also sorts links by type into the remaining work files: thumbs in $5T.txt, pictures in $6P.txt, movies in $7M.txt, and everything else in $8O.txt (the "other" links list). The "Grab Thumbs", "Grab Pics", "Grab Movies", and "Grab List" commands ('T', 'P', 'M' and 'L' buttons on the toolbar) are hooked directly to the $5T.txt, $6P.txt, $7M.txt and $9L.txt lists. The Scrub found no movie links on this particular web page (so there's no $7M.txt file for the "Grab Movies" command to use), and the $9L.txt file is one you create yourself (using the "Save List" command to save selections you make in the Map file in GuerrillaBrowser's Document View).
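To make the idea of sorting links by type concrete, here is a minimal Python sketch. It is not GuerrillaBrowser's code: the extension sets and the function names `classify_link` and `sort_links` are assumptions made up for illustration, and it doesn't try to recognize thumbs at all, since telling a thumb from a full-size picture takes more than a file extension.

```python
from urllib.parse import urlparse
import posixpath

# Assumed extension sets for illustration only; the real Scrub rules may differ.
PICTURE_EXTS = {".jpg", ".jpeg", ".gif", ".png", ".bmp"}
MOVIE_EXTS = {".mpg", ".mpeg", ".avi", ".wmv", ".mov"}


def classify_link(url):
    """Return 'P' (picture), 'M' (movie), or 'O' (other) for one URL."""
    path = urlparse(url).path                    # drop scheme, host, and query
    ext = posixpath.splitext(path)[1].lower()    # e.g. '.jpg', or '' if none
    if ext in PICTURE_EXTS:
        return "P"
    if ext in MOVIE_EXTS:
        return "M"
    return "O"


def sort_links(urls):
    """Group URLs into lists keyed by 'P', 'M', and 'O', keeping their order."""
    groups = {"P": [], "M": [], "O": []}
    for url in urls:
        groups[classify_link(url)].append(url)
    return groups
```

The letters returned by `classify_link` are borrowed from the $6P.txt, $7M.txt and $8O.txt file names, so each group lines up with one of the work files described above.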
For now, we're going to blow off downloading the thumbs, and just get all the pictures on the page. We can see what their links are by looking at the $6P.txt list, using the Command Prompt:
C:\GB>type 00\$6P.txt
http://www.ibiblio.org/wm/paint/auth/durer/7sorrows.jpg
http://www.ibiblio.org/wm/paint/auth/durer/st-michel.jpg
http://www.ibiblio.org/wm/paint/auth/durer/paumgartner.jpg
http://www.ibiblio.org/wm/paint/auth/durer/hare.jpg
http://www.ibiblio.org/wm/paint/auth/durer/large-turf.jpg
http://www.ibiblio.org/wm/paint/auth/durer/magi.jpg
http://www.ibiblio.org/wm/paint/auth/durer/doctors.jpg
http://www.ibiblio.org/wm/paint/auth/durer/st-anne.jpg
http://www.ibiblio.org/wm/paint/auth/durer/self/self-26.jpg
http://www.ibiblio.org/wm/paint/auth/durer/adam-eve.jpg
http://www.ibiblio.org/wm/paint/auth/durer/adam-eve-1507.jpg
http://www.ibiblio.org/wm/paint/auth/durer/4holymen.jpg
http://www.ibiblio.org/wm/paint/auth/durer/portraits/father.jpg
C:\GB>
Using Notepad:
The procedure for the "Grab Pics" command is similar to what we did for "Grab HTML". We're still using the Cache Index spinner, but instead of the URL edit box we've got the $6P.txt list (which can contain dozens of links instead of just one, get it?).
GuerrillaBrowser has no built-in network monitor, but you probably already have some other program that'll do the job. The ZoneAlarm firewall has two: one on its main window and one in the system tray. Windows' "Connection Status" dialog shows bytes sent and received (right-click the connection you're using in Control Panel's Network properties and choose "Status").
Windows' "Task Manager" (Ctrl+Alt+Del) network activity graph shows these files coming down from the server:
You can see that the downloaded pictures have been added to the folder for Cache Index 00, using the Command Prompt:
C:\GB>dir 00
Volume in drive C has no label.
Volume Serial Number is 386B-9C6D
Directory of C:\GB\00
02/26/2008 10:04 PM 13,213 $0.raw
02/26/2008 10:04 PM 2,767 $4.map
02/26/2008 10:04 PM 1,018 $5T.txt
02/26/2008 10:04 PM 754 $6P.txt
02/26/2008 10:04 PM 682 $8O.txt
02/26/2008 10:30 PM 202,158 7sorrows.jpg
02/26/2008 10:30 PM 334,648 st-michel.jpg
02/26/2008 10:31 PM 155,064 paumgartner.jpg
02/26/2008 10:31 PM 147,133 hare.jpg
02/26/2008 10:31 PM 185,503 large-turf.jpg
02/26/2008 10:31 PM 66,986 magi.jpg
02/26/2008 10:31 PM 151,542 doctors.jpg
02/26/2008 10:31 PM 167,331 st-anne.jpg
02/26/2008 10:31 PM 144,540 self-26.jpg
02/26/2008 10:31 PM 253,795 adam-eve.jpg
02/26/2008 10:31 PM 116,940 adam-eve-1507.jpg
02/26/2008 10:31 PM 121,609 4holymen.jpg
02/26/2008 10:31 PM <DIR> ..
02/26/2008 10:31 PM <DIR> .
02/26/2008 10:31 PM 137,980 father.jpg
18 File(s) 2,203,663 bytes
2 Dir(s) 44,024,991,744 bytes free
C:\GB>
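If you'd rather not compare the list and the folder by eye, a short script can do it. The Python sketch below rests on two assumptions drawn from the listings in this walkthrough rather than from any GuerrillaBrowser documentation: $6P.txt holds one URL per line, and each picture is saved under the last segment of its URL (so ".../self/self-26.jpg" lands as self-26.jpg). The function names are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse
import posixpath


def read_link_list(list_path):
    """Return the non-blank lines of a link list file such as $6P.txt."""
    text = Path(list_path).read_text(encoding="ascii", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def local_name(url):
    """Last path segment of a URL; assumed to be the saved file's name."""
    return posixpath.basename(urlparse(url).path)


def missing_files(index_folder, list_name="$6P.txt"):
    """Return the listed URLs whose expected file is not in index_folder."""
    folder = Path(index_folder)
    return [url for url in read_link_list(folder / list_name)
            if not (folder / local_name(url)).is_file()]


# Example call (assumes the Cache used in this walkthrough):
#     for url in missing_files(r"C:\GB\00"):
#         print("not downloaded:", url)
```

An empty result means every picture on the list has a matching file in the Cache Index folder.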
Using Windows Explorer:
You can use your file manager to move the pictures somewhere else or delete them. If you reuse Cache Index 00 with the "Grab HTML" command, they'll be automatically deleted, so be sure and change the spinner next time if that's not what you want. You could also store them indefinitely where they're at. If you do, the GuerrillaViewer companion program can use the same $6P.txt list to find them:
Use the spacebar or little navigation buttons to walk through the list:
GuerrillaBrowser keeps its current GB Cache in memory until you open a different one or quit the program. To write it out to the $Cache.txt file immediately, use the "Write Cache" command:
Notice that the Cache Index and URL you gave to the "Grab HTML" command are saved in the $Cache.txt file:
C:\GB>dir
Volume in drive C has no label.
Volume Serial Number is 386B-9C6D
Directory of C:\GB
02/26/2008 10:31 PM <DIR> 00
02/26/2008 10:52 PM 543 $Cache.txt
02/26/2008 10:52 PM <DIR> ..
02/26/2008 10:52 PM <DIR> .
1 File(s) 543 bytes
3 Dir(s) 44,019,077,120 bytes free
C:\GB>type $Cache.txt
00 http://www.ibiblio.org/wm/paint/auth/durer/
01
02
03
04
05
06
07
08
If you create a number of GB Caches in one subtree of folders on your hard drive, you can use Windows Explorer's "Search" button to scan all of the $Cache.txt files for some link you used before, or all of the $8O.txt files for any available links to your favorite website, or whatever. Storing your data in ASCII text files means you don't have to be limited to the GB suite programs when using it.
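The same kind of search is easy to script. Here is a minimal Python sketch that walks a folder tree, reads every $Cache.txt it finds, and reports the entries whose URL contains some text. It assumes the layout seen in the `type $Cache.txt` output above (a Cache Index, whitespace, then an optional URL on each line); `parse_cache_index` and `find_link` are hypothetical names, not part of the GB suite.

```python
import os


def parse_cache_index(path):
    """Yield (index, url) pairs from a $Cache.txt file, skipping empty slots."""
    with open(path, encoding="ascii", errors="replace") as handle:
        for line in handle:
            parts = line.split(None, 1)      # index, then the rest of the line
            if len(parts) == 2:              # lines with no URL have one part
                yield parts[0], parts[1].strip()


def find_link(root, needle, file_name="$Cache.txt"):
    """Return (file path, index, url) for each URL under root containing needle."""
    hits = []
    for folder, _subdirs, files in os.walk(root):
        if file_name in files:
            path = os.path.join(folder, file_name)
            for index, url in parse_cache_index(path):
                if needle.lower() in url.lower():   # case-insensitive match
                    hits.append((path, index, url))
    return hits


# Example call (assumes your GB Caches live somewhere under C:\GB):
#     for path, index, url in find_link("C:\\GB", "ibiblio"):
#         print(path, index, url)
```

Pass a different `file_name` (say, "$8O.txt") and you'd need a different parser, since the link lists shown earlier hold bare URLs with no index in front.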
You don't have to download files sight unseen like we did in this exercise. If you follow "Grab HTML" with the "Grab Thumbs" and "Open Map" commands, you can use GuerrillaBrowser's Document View to see all of the information it could find for each of the links on that web page, along with their URLs where you can plainly see them. GuerrillaBrowser also has a "Thumb Scout" command that can find extra thumbs your big browser doesn't know about.
And notice that if we had 100 URLs to web pages like the one in this demonstration (13 pictures each), Batch Mode would allow us to download more than a thousand pictures at one go. Try getting your big web browser to do that!
No computer program can read your mind; it only knows what commands you give it. You should pay particular attention to the state of the 'B' button, the Cache Index spinner (in Single-Index Mode), and the contents of the $Batch.txt file (in Batch Mode). See the "Quick-Start Tutorial" and "User Guide" for additional examples, a command summary, troubleshooting tips, and more.