Documentation
SITE SEARCH PRO


System Requirements

  • Perl 5
  • MySQL
  • Telnet access to your web server
  • Ability to run CGI scripts outside of cgi-bin

Preliminaries

  • Determine the path to Perl 5 on your web server host.  Some web hosting companies run both Perl 4 and Perl 5, so make ABSOLUTELY sure you are not setting this up under Perl 4.  Ask your administrator if you are not sure.
  • Unpack the tar archive on your desktop using a program that can unpack UNIX tar archives.  If you don't have such a program, download WinZip free from SHAREWARE.COM.
  • After you have unpacked the tar archive you will have a collection of folders and files on your desktop.  You now have to do some basic editing of some of these files.  Use a text editor such as WordPad, Notepad, BBEdit, SimpleText, or TeachText.  These are NOT WORD PROCESSOR DOCUMENTS; they are simple TEXT files, so do not save them as word processor documents or let the editor add an extension such as .txt to the file names, or they will NOT WORK.  Note that some files inside the folders may be "blank".  This is normal.

Preparing the CGI scripts

Define Path To PERL 5

The first step is to open every file that has a .cgi extension and check line one of each script.  Each of the CGI scripts is written in Perl 5.  For the scripts to run, they must know where Perl 5 is installed on your web server.  The path to Perl 5 is given in the first line of each CGI script, which looks something like this:

#!/usr/bin/perl

If the path to perl 5 on your web server is different from /usr/bin/perl you must edit the first line of each cgi script to reflect the correct path. If the path to perl 5 is the same no changes are necessary. If you do not know the path to perl 5 ask the webmaster or system administrator at your server site.  
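
If you have many .cgi files, editing each first line by hand is tedious. The following is a minimal shell sketch of one way to do it in bulk, not part of the Site Search Pro distribution. The function name set_shebang and the directory and path used in the usage lines are assumptions chosen for illustration.

```shell
#!/bin/sh
# set_shebang DIR PERL_PATH   (hypothetical helper, not shipped with the product)
# Rewrites line 1 of every .cgi file in DIR to "#!PERL_PATH".
# Files whose first line does not start with "#!" are left untouched.
set_shebang() {
    dir=$1
    perl_path=$2
    for f in "$dir"/*.cgi; do
        [ -f "$f" ] || continue              # skip when no .cgi files match
        first=$(head -n 1 "$f")
        case $first in
            '#!'*) ;;                        # has a #! line, so rewrite it
            *) continue ;;                   # no #! line, leave the file alone
        esac
        # New first line followed by the rest of the original file.
        { printf '#!%s\n' "$perl_path"; tail -n +2 "$f"; } > "$f.tmp" &&
            cat "$f.tmp" > "$f" &&           # overwrite in place, keeping permissions
            rm -f "$f.tmp"
    done
}

# Example usage (the directory and path below are assumptions):
#   command -v perl        # prints the path to perl, if it is on your PATH
#   perl -v                # shows the version, to confirm it is Perl 5
#   set_shebang ./sitesearchpro /usr/local/bin/perl
```

Run the two commented commands on the web server itself, since the path on your desktop machine may differ from the path on the server.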

Configure the .cgi files

configure.cgi

Set the variables inside configure.cgi like so:

  • $rooturl="/full/path/to/root/spidering/directory/";
  • $subdir = "/full/path/to/public_html";
  • $headerhtml = "/full/path/to/cgi-bin/sitesearchpro/header.html";
  • $footerhtml = "/full/path/to/cgi-bin/sitesearchpro/footer.html";
  • $webpagelist="/full/path/to/sites.txt";
  • $spiderfile = "sites.txt";
  • $mysqldatabase = "mysql database name";
  • $mysqlusername = "mysql user name";
  • $mysqlpassword = "mysql password";
  • $searchresultsperpage = 5;
  • $outputdescriptionlength = 350;
  • $pointsfortitlematch = 10;
  • $pointsformetadescriptionmatch = 10;
  • $pointsformetakeywordsmatch = 10;

The variables mean the following:

  • $rooturl is the full path to the directory where your spidering starts (usually this will be your root HTML directory)
  • $subdir is the same as $rooturl without the trailing slash
  • $headerhtml is the full path to header.html
  • $footerhtml is the full path to footer.html
  • $webpagelist is the full path to sites.txt
  • $spiderfile is just called sites.txt (do not change)
  • $mysqldatabase, $mysqlusername, and $mysqlpassword are the name of your MySQL database and the user name and password used to connect to it
  • $searchresultsperpage is the number of results you want returned per query
  • $outputdescriptionlength is the length, in characters, of each search result; 350 is a good value
  • $pointsfortitlematch affects search engine ranking: the more points you assign, the higher a page ranks when the keyword is in its title
  • $pointsformetadescriptionmatch affects search engine ranking: the more points you assign, the higher a page ranks when the keyword is in its meta description
  • $pointsformetakeywordsmatch affects search engine ranking: the more points you assign, the higher a page ranks when the keyword is in its meta keywords

Upload Your Edited CGI and Database Files

  • Create a directory called sitesearchpro inside cgi-bin and upload all files to it.  chmod every file that ends in .cgi to 755, and everything else to 666 or 777.
  • Create your MySQL database and import the .sql file into it
  • Upload a copy of sites.txt into the root directory where you want your search engine to begin spidering (usually your root HTML directory)
  • Upload a copy of configure.cgi and spider.cgi to this same directory
  • Log in by telnet and run spider.cgi, letting it run until it finishes
  • Open the sites.txt file and remove the entries for any files you do NOT want to appear in the search engine
  • Do a search and replace in sites.txt: replace "//" with "/"
  • Upload this edited sites.txt file into /cgi-bin/sitesearchpro and run upload.cgi to create your database
  • Make sure index.html points to /cgi-bin/sitesearchpro/search.cgi
  • Edit the footer.html and header.html files to customize your header and footer output if desired
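
The permission and search-and-replace steps above can be sketched as two small shell functions to run from your telnet session. This is only an illustration under stated assumptions: the names set_perms and collapse_slashes are hypothetical, the sketch picks 666 rather than 777 for the non-.cgi files, and it assumes sites.txt holds one file path per line with no "http://" style URLs (which the replacement would damage).

```shell
#!/bin/sh
# set_perms DIR   (hypothetical helper)
# chmod 755 for every *.cgi file in DIR, 666 for every other regular file.
set_perms() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and empty globs
        case $f in
            *.cgi) chmod 755 "$f" ;;     # scripts must be executable
            *)     chmod 666 "$f" ;;     # everything else: read/write, no execute
        esac
    done
}

# collapse_slashes FILE   (hypothetical helper)
# Prints FILE with every run of two or more slashes reduced to one "/".
# Assumption: the file contains plain paths only, not URLs.
collapse_slashes() {
    sed 's#//*#/#g' "$1"
}

# Example usage (paths are assumptions):
#   set_perms /full/path/to/cgi-bin/sitesearchpro
#   collapse_slashes sites.txt > sites.clean && mv sites.clean sites.txt
```

Note that collapse_slashes goes slightly further than a single "//" to "/" replacement, because it also reduces "///" to "/" in one pass.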

Editing the search engine

  • If files you do NOT want end up in your search engine, you have two options: either delete the files from sites.txt, delete and restore the .sql file, and rerun upload.cgi, or log in to MySQL and delete the entries manually.
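
For the first option, removing entries from sites.txt can be done from the shell rather than in a text editor. The sketch below is a hedged example: the function name remove_entries and the "/private/" pattern are assumptions, and it treats the pattern as a fixed string matched anywhere in a line.

```shell
#!/bin/sh
# remove_entries FILE PATTERN   (hypothetical helper)
# Prints FILE without the lines that contain PATTERN.
# PATTERN is matched as a fixed string (grep -F), not as a regular expression.
remove_entries() {
    # grep exits with 1 when no lines are left to print; treat that as success
    # and only report failure for real errors (exit status 2 or higher).
    grep -v -F -e "$2" "$1" || [ $? -eq 1 ]
}

# Example usage (the pattern is an assumption):
#   remove_entries sites.txt /private/ > sites.new && mv sites.new sites.txt
```

After rewriting sites.txt this way, continue with the rest of the first option: restore the .sql file and rerun upload.cgi.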

Updating the search engine

  • To update the search engine data, simply run upload.cgi again; it replaces the stored data with the current data.