# The files in the KEHOME/knowledge/ApplicationInternet directory
# contain partial internet directory hierarchies from several public
# web sites, and KR programs to edit and compare the hierarchies.
# As discussed below, there are many practical problems in making
# the comparisons.
# The hierarchy categories are often ad hoc, with no definitions
# (or attributes) available. The particular categories chosen
# vary widely between directories.
1. The first problem is lexical variations such as
capitalization and singular/plural forms.
Example:
games, Games, Games@
Autos, Automotive
This problem can be overcome by folding to lower case
and using aliases:
set kcase = LOWER
games is Games, Games@
Autos is Automotive
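
For readers without Knowledge Explorer at hand, the same idea can be sketched in Python. This is only an illustration, not how KR implements kcase or aliases; the ALIASES table and the normalize helper are hypothetical names introduced here.

```python
# Hypothetical alias table: maps a lower-cased variant to its canonical name.
# The entries mirror the KR aliases shown above.
ALIASES = {
    "games@": "games",
    "automotive": "autos",
}


def normalize(name: str) -> str:
    """Fold a category name to lower case, then resolve any alias."""
    folded = name.strip().lower()
    return ALIASES.get(folded, folded)


if __name__ == "__main__":
    for raw in ["games", "Games", "Games@", "Autos", "Automotive"]:
        print(raw, "->", normalize(raw))
```

Folding case first keeps the alias table small, since only one spelling of each variant needs an entry.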
2. Another problem is related categories which have been
combined into a single category.
Example:
Computers & Internet
This problem can be overcome by separating the categories:
set kformat = ho
Computers & Internet
/ Computers
/ Internet
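
A minimal Python sketch of the same separation step follows. It assumes that "&" is the only separator used for combined categories, which may not hold for every directory; split_combined is a hypothetical helper, not a KR command.

```python
import re
from typing import List


def split_combined(name: str) -> List[str]:
    """Split a combined category such as 'Computers & Internet' into parts.

    Assumption: '&' is the only joining symbol that needs handling.
    """
    parts = [p.strip() for p in re.split(r"\s*&\s*", name)]
    return [p for p in parts if p]


if __name__ == "__main__":
    combined = "Computers & Internet"
    # Print the combined category followed by its separated parts.
    print(combined)
    for part in split_combined(combined):
        print("/", part)
```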
3. The same word may have many different meanings.
This situation can be represented as a hierarchy with
the same word occupying multiple positions in the
hierarchy. (The "hierarchy" is actually a lattice.)
Example:
Excite
    Search
        Horoscopes
    Channels
        Horoscopes
    More Services:
        Horoscopes
This situation can be detected using
do read from excite.ho done; do check od genus done
and corrected by deleting the more general meaning
from the hierarchy, or by choosing a different word.
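
The kind of check involved can be sketched in Python as follows. This is not a description of what "check od genus" does internally. It assumes that the hierarchy is available as (parent, child) pairs and that each Horoscopes entry in the example sits under the heading listed just before it; EDGES and multiple_parents are hypothetical names.

```python
from collections import defaultdict

# Assumed reading of the Excite example as (parent, child) pairs.
EDGES = [
    ("Excite", "Search"),
    ("Search", "Horoscopes"),
    ("Excite", "Channels"),
    ("Channels", "Horoscopes"),
    ("Excite", "More Services:"),
    ("More Services:", "Horoscopes"),
]


def multiple_parents(edges):
    """Return every category that appears under more than one parent."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


if __name__ == "__main__":
    for child, found in multiple_parents(EDGES).items():
        print(child, "appears under:", ", ".join(found))
```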
4. In some situations the same word has very different
meanings in two hierarchies.
Example:
Excite          InfoSeek
Shopping        Shopping
  Bargains        Gifts
  Clothes         music CDs
  Music           online malls
This can be detected by comparing the units of the
categories using the UNIX diff command
!diff with -ib od excite.ho,infoseek.ho out diffei.txt done
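
As a rough Python counterpart, the units of one category can be compared as sets. This swaps the line-based diff for a set comparison, and it folds case and collapses whitespace in the spirit of the -i and -b options. The column assignment of the example (Bargains, Clothes, Music under Excite; Gifts, music CDs, online malls under InfoSeek) is an assumption, and compare_units is a hypothetical helper.

```python
def fold(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def compare_units(first, second):
    """Compare the units of the same category in two hierarchies."""
    a = {fold(x) for x in first}
    b = {fold(x) for x in second}
    return {
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
        "shared": sorted(a & b),
    }


if __name__ == "__main__":
    excite_shopping = ["Bargains", "Clothes", "Music"]  # assumed column
    infoseek_shopping = ["Gifts", "music CDs", "online malls"]  # assumed column
    for key, names in compare_units(excite_shopping, infoseek_shopping).items():
        print(key, names)
```

An empty shared list for two categories with the same name suggests that the word is being used with different meanings.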
5. There may be contradictions between hierarchies,
i.e., A isa B in one hierarchy, and A isa not B in another.
Example:
Excite            InfoSeek
Entertainment     Entertainment
  Movies            Books
  Music             games
  TV                great movies
  Games             music
...
This can be detected using
do read from excite.ho, infoseek.ho done;
Excite is InfoSeek; do check od genus done
Knowledge Explorer automatically checks for
explicit declarations of the form
A is B; ...; A is not B
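
The explicit form of this check can be sketched in Python. It is only an illustration and makes no claim about how Knowledge Explorer implements it. It assumes that statements are available as (subject, object, is_positive) triples; find_contradictions is a hypothetical helper.

```python
def find_contradictions(statements):
    """Return (A, B) pairs declared both as 'A is B' and 'A is not B'."""
    positive = set()
    negative = set()
    for subject, obj, is_positive in statements:
        if is_positive:
            positive.add((subject, obj))
        else:
            negative.add((subject, obj))
    return sorted(positive & negative)


if __name__ == "__main__":
    declared = [
        ("A", "B", True),   # A is B
        ("C", "D", True),   # C is D
        ("A", "B", False),  # A is not B
    ]
    for subject, obj in find_contradictions(declared):
        print("contradiction:", subject, "is /", "is not", obj)
```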