WWW-Tech Mtg 10/4/95, Rough Notes

Attendees: Les Cottrell, Bebo White, Tony Johnson, Kathryn Henniss, Karl Young, Joan Winters, George Crane, Pat Kreitz.

Agenda:

Harvest System - Karl Young

Karl has been investigating a search engine for SLAC use. His criteria were: He looked at WAIS, SWISH, Glimpse, GILS, and Harvest. Harvest appears to satisfy most of the criteria; CERN and ESnet are using it, it has good press on netnews, it seems to be a superset of some of the others, and the authors (from the University of Colorado) seem to be very responsive. He chose to start with Harvest.

The recommended way to run it is with the "gatherer" running on the server itself. The gatherer runs periodically to collect information such as keywords, author names, and titles from the file system. This will not be possible for non-Unix servers; for these the gatherer will have to run on a separate Unix machine. The information from multiple gatherers is retrieved by a broker, which suppresses duplicate information. The broker can be replicated. Harvest has many options, for example how deep to search. It can also handle multiple file types, including PostScript, HTML, TeX, RTF, and mail, gathering different information from each based on template files. It allows a separate indexer; Karl thinks we will start out with Glimpse.
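
To make the gatherer/broker split concrete, here is a minimal sketch in Python. It is not Harvest's actual record format or API: the Summary class, its fields, the broker_merge function, and the choice of the URL as the duplicate key are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Summary:
    """Hypothetical summary record that one gatherer might emit for one document."""
    url: str
    title: str
    author: str
    keywords: Tuple[str, ...] = ()


def broker_merge(*gatherer_outputs: Iterable[Summary]) -> List[Summary]:
    """Combine records from several gatherers, keeping one record per URL.

    Assumption: two records describe the same document when their URLs match.
    The first record seen for a URL wins; later duplicates are dropped.
    """
    merged = {}
    for records in gatherer_outputs:
        for rec in records:
            merged.setdefault(rec.url, rec)
    return list(merged.values())


if __name__ == "__main__":
    # Hypothetical hosts: one gatherer on the Unix server, one on a separate
    # Unix machine covering a non-Unix server.
    on_server = [Summary("http://unix.example/a.html", "Page A", "Author 1")]
    remote = [
        Summary("http://unix.example/a.html", "Page A", "Author 1"),
        Summary("http://other.example/b.html", "Page B", "Author 2"),
    ]
    for rec in broker_merge(on_server, remote):
        print(rec.url, "|", rec.title)
```

Keying on the URL is the simplest possible choice; a real broker may well use other criteria to decide that two records are duplicates.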

Several questions came up, for example, can it gather authors of pages, can it verify links for all of SLAC and one level out, can it set up sub-indexes (e.g. how would SLD be able to only search SLD pages), can it be set up to search all the WWW servers in the SLAC domain? Karl will put it up and investigate how it meets our needs, or could be extended to meet them. Karl estimates he will get a trial version up by the end of next week.

There is a demonstration service at the University of Colorado.

Status of Log File Analysis - Tony Johnson

The analysis of the server log files (on www1 and SLACVX) is now complete, using wwwstat and gwstat, and runs automatically. Bebo and Tony will confer to check whether Tony's analysis is a superset of Bebo's. As part of the cron job, Tony can also get rid of the log files. The question then is how long to keep the files and how much disk space is available. Tony will confer with John about moving the log files to another, larger disk partition. Pat wants to be sure that she has the right information extracted for her reports before anything is deleted. One could consider tarring the log files into a single file on a monthly basis and then backing that up to the silo. John raised the question of the privacy of the information (e.g. who checked out what books). Pat Kreitz said this is not an issue, but that Joan should record the fact in the privacy document. SPIface also keeps a log of accesses. Maybe this would be a better source of data to analyze for Pat Kreitz.
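
The monthly tarring idea could look roughly like the following Python sketch. The function name archive_month, the directory arguments, and the assumption that each log file name contains its month as a string such as "1995-09" are all hypothetical; the real log names on www1 and SLACVX are not given in these notes. The sketch deliberately does not delete the original files, so that nothing is lost before the reports have been extracted.

```python
import tarfile
from pathlib import Path


def archive_month(log_dir, month, out_dir):
    """Bundle every file in log_dir whose name contains `month` into one tar file.

    `month` is a string such as "1995-09" (an assumed naming convention).
    Returns the path of the archive, or None if no file matched.
    The original log files are left in place.
    """
    log_dir = Path(log_dir)
    out_dir = Path(out_dir)
    members = sorted(
        p for p in log_dir.iterdir() if p.is_file() and month in p.name
    )
    if not members:
        return None
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"logs-{month}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for p in members:
            # Store only the file name so the archive unpacks into one flat directory.
            tar.add(p, arcname=p.name)
    return archive
```

A cron job could call this once a month and then hand the resulting single file to whatever backs up to the silo; removing the originals would be a separate, later step.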

Obsolete Links in InfoSeek & Lycos - Les Cottrell

Agostini reported that InfoSeek and Lycos were pointing to obsolete SLAC home pages. Joan wrote to Lycos. The problem appears to have been fixed.

SLAC Visitors Center Web - Tony Johnson

Some concerns were raised about Web browsers in the visitors center being able to reach pages that would be inappropriate to leave on the screen when a congressman walks in. There may also be an issue of escaping to the shell, and of being able to see SLAC pages which are supposed to be invisible to the outside world. Tony will set up a special proxy server and see whether Netscape has a kiosk mode (if not, maybe we will need to use Mosaic, which does have a kiosk mode).

SLAC Icon Library - Joan Winters

Joan handed out a proposal. She asked that we read it and discuss at the next meeting.

Report from CHEP

There was not an enormous amount of Web activity at CHEP95. Carl Malamud made an interesting presentation on the Internet World's Fair. Bebo has a CHEP95 trip report in the SLAC netnews group slac.scs.trips.

Status of Steering Committee - Pat Kreitz

The ADCOC has a revised charter. David Leith hopes it will be established this Friday.

Status of Home Page Survey - Kathryn Henniss

Kathryn handed out a summary of the questionnaire. Joan and Kathryn are working on this.

Next Meeting

Action Items

AOB

Karen Heidenreich went to an Oracle fair and got a lot of information that she will present to the WWW-Tech.

Bebo went to the Seybold conference last week. There was a heavy Web presence from many vendors.

Netscape 2.0 is almost ready for release. It has Java support (not clear on what platforms) and Macromedia support. They also have a lot of tools for publishers (e.g. to update links).

Les Cottrell