WWW-Tech Mtg 10/4/95, Rough Notes
Attendees: Les Cottrell, Bebo White, Tony Johnson, Kathryn Henniss,
Karl Young, Joan Winters, George Crane, Pat Kreitz.
Agenda:
- Harvest System - KarlY
- Status of Log file analysis - tonyj
- Obsolete links in Yahoo
- SLAC visitors center web
- SLAC Icon library
- Report from CHEP?
- Status of institutional page
- Status of policy committee
- Status of home page survey
- Next Meeting
- Action Items
- AOB
Harvest System - Karl Young
Karl has been investigating a search engine for SLAC use.
His criteria were:
- Standards conformance
- Unix based
- Efficient
- Good performance for user
- Easy to use & maintain
- Functionality (e.g. Boolean operators)
He looked at WAIS, SWISH, Glimpse, GILS, and Harvest.
Harvest appears to satisfy most of the criteria. CERN and ESnet are
using it, it has good press on netnews, it seems like a superset of some of
the others, and the authors (from the University of Colorado) seem to be very
responsive. He chose to start with Harvest.
The recommended way to run it is with the "gatherer" running on the server
(the gatherer runs once in a while to collect information such as
keywords, author names, and titles out of the file system). This will
not be possible for non-Unix servers. For these the gatherer will
have to run on a separate Unix machine. The information from
multiple gatherers is retrieved by a broker, which suppresses duplicate
information. The broker can be replicated. Harvest has lots of options for
how deep to search etc. It can also handle multiple
file types, including PostScript, HTML, TeX, RTF, and mail, gathering
different information from each based on template files.
It allows a separate indexer; Karl thinks we will
start out with Glimpse.
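To illustrate the gatherer/broker split described above, here is a minimal Python sketch of a broker merging the output of two gatherers and suppressing duplicates. It is not Harvest code: the `Record` fields, the `merge_gatherers` function, the example URLs, and the rule "one record per URL, first one wins" are all assumptions made for illustration. Harvest's actual record format and duplicate-detection rule may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One summary record produced by a gatherer (hypothetical fields)."""
    url: str
    title: str
    author: str
    keywords: tuple


def merge_gatherers(*gatherer_outputs):
    """Combine records from several gatherers, keeping one record per URL.

    Assumption: two records are duplicates when their URLs match.
    The first record seen for a URL is kept; later ones are dropped.
    """
    merged = {}
    for records in gatherer_outputs:
        for record in records:
            merged.setdefault(record.url, record)
    return list(merged.values())


# Made-up output from a gatherer running on a Unix server ...
unix_server = [
    Record("http://www.example.org/a.html", "Page A", "Author One", ("physics",)),
    Record("http://www.example.org/b.html", "Page B", "Author Two", ("detector",)),
]
# ... and from a gatherer run on a separate machine for a non-Unix server.
other_server = [
    Record("http://www.example.org/b.html", "Page B", "Author Two", ("detector",)),
    Record("http://www.example.org/c.html", "Page C", "Author Three", ("library",)),
]

index = merge_gatherers(unix_server, other_server)
print([record.url for record in index])
```

The merged list is what an indexer such as Glimpse would then index. The sketch leaves replication of the broker out.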
Several questions came up, for example: can it gather authors of pages,
can it verify links for all of SLAC and one level out,
can it set up sub-indexes (e.g. how would SLD be able to search only SLD
pages), and can it be set up to search all the WWW servers in the SLAC domain?
Karl will put it up and investigate how it meets our needs, or could be
extended to meet them. Karl estimates he will get a trial version up by the
end of next week.
There is a demonstration service at the University of Colorado.
Status of Log File Analysis - Tony Johnson
The analysis of the server log files (on www1 and SLACVX) is now complete,
using wwwstat and gwstat, and runs automatically.
Bebo and Tony will confer to check whether Tony's analysis is a superset of
Bebo's. As part of the cron job,
Tony can also get rid of the log files.
Then the question is how long to keep
the files and how much disk space there is.
Tony will confer with John to move the log files to another
large disk partition. Pat wants to be sure that
she has the right info extracted for her reports before anything is deleted.
One could consider tarring the log files into a single file on a monthly
basis and then backing up to the silo. John raised the question of privacy
of the information (e.g. who checked out what books)? Pat Kreitz said this is
not an issue, but that
Joan should record the fact in the privacy document.
SPIface also keeps a log of accesses. Maybe this would be a better
source of data to be analyzed for Pat Kreitz.
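The idea above of tarring the log files into a single file each month could look roughly like the following Python sketch. The file naming pattern (`access-YYYY-MM-DD.log`), the archive names, and the function name `archive_logs_by_month` are assumptions for illustration, not the actual layout on www1 or SLACVX. The sketch deliberately leaves the original log files in place, so that reports can be extracted before anything is deleted.

```python
import tarfile
from collections import defaultdict
from pathlib import Path


def archive_logs_by_month(log_dir, archive_dir):
    """Bundle log files into one tar.gz per month; return the archives written.

    Assumes hypothetical file names such as access-1995-09-30.log, where
    the month is the YYYY-MM part that follows the first dash.
    """
    log_dir, archive_dir = Path(log_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Group the log files by month, e.g. "1995-09".
    by_month = defaultdict(list)
    for path in sorted(log_dir.glob("access-*.log")):
        month = path.stem.split("-", 1)[1][:7]
        by_month[month].append(path)

    # Write one compressed archive per month.
    written = []
    for month, paths in sorted(by_month.items()):
        archive = archive_dir / f"logs-{month}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            for path in paths:
                tar.add(path, arcname=path.name)
        written.append(archive)
    return written
```

A monthly cron job could call `archive_logs_by_month` with the site's own log and archive directories, and the resulting archives could then be backed up to the silo. Deleting the originals would be a separate step, taken only after the required report data has been extracted.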
Obsolete Links in InfoSeek & Lycos - Les Cottrell
Agostini reported that InfoSeek and Lycos were pointing
to obsolete SLAC home pages. Joan wrote to Lycos.
The problem appears to have been fixed.
SLAC Visitors Center Web - Tony Johnson
Some concerns were raised about Web browsers in the visitors center
being able to access pages which would be inappropriate to be
left on the screen when a congressman walks in. Also there may be an issue
of escaping to the shell, and the issue of being able to see SLAC pages
which are supposed to be invisible to the outside world.
Tony will set up a special proxy server and see whether Netscape has a kiosk
mode (if not, we may need to use Mosaic, which does have a kiosk mode).
SLAC Icon Library - Joan Winters
Joan handed out a proposal. She asked that we read it and discuss at the
next meeting.
Report from CHEP
There was not an enormous amount of Web activity at CHEP95.
Carl Malamud made an interesting presentation on the
Internet World's Fair. Bebo has a CHEP95 trip report in the SLAC
netnews group slac.scs.trips.
Status of Steering Committee - Pat Kreitz
The ADCOC has a revised charter. David Leith hopes it will be established
this Friday.
Status of Home Page Survey - Kathryn Henniss
Kathryn handed out a summary of the questionnaire. Joan and Kathryn are
working on this.
Next Meeting
Action Items
- Bebo will install the latest versions of SunOS and AIX Lynx.
[The binaries do not exist; they may exist on campus.
Bebo will complete this by the end of the week.]
- Bebo is investigating the Unix news to WWW gateway, which is
unreliable. This is causing loss of news postings
to the mailing list. [Bebo hopes to resolve this within the next 2
weeks, i.e. before the next meeting.]
- Bebo will investigate the interactions of Netscape and the
SLAC news server.
AOB
Karen Heidenreich went to an Oracle fair and got a lot of information
that she will present to the WWW-Tech.
Bebo went to the Seybold conference last week. There was a heavy Web
presence from many vendors.
Netscape 2.0 is almost ready for release. It has Java support (not clear
on which platforms) and Macromedia support.
They also have a lot of tools for publishers (e.g. to update links etc.)
Les Cottrell