# KEHOME/knowledge/ApplicationInternet/ihreadme.htm # 1999/2/28</font> # new syntax Sep/30/2002</font>

# The files in the KEHOME/knowledge/ApplicationInternet directory
# contain partial internet directory hierarchies from several public
# web sites, and KR programs to edit and compare the hierarchies.
# As discussed below, there are many practical problems in making
# the comparisons.
# The hierarchy categories are often ad hoc, with no definitions
# (or attributes) available.  The particular categories chosen
# vary widely between directories.

 1. The first problem is lexical variations such as
 capitalization and singular/plural forms.
 Example:
    games, Games, Games@
    Autos, Automotive
 This problem can be overcome by folding to lower case
 and using aliases
    set kcase = LOWER
    games is Games, Games@
    Autos is Automotive

2. Another problem is related categories which have been
combined into a single category.
Example:
    Computers & Internet
This problem can be overcome by separating the categories:
    set kformat = ho
    Computers & Internet
    /    Computers
    /    Internet

3. The same word may have many different meanings.
This situation can be represented as a hierarchy with
the same word occupying multiple positions in the
hierarchy.  (The "hierarchy" is actually a lattice.)
Example:
    Excite
        Search
            Horoscopes
        Channels
            Horoscopes
        More Services:
            Horoscopes
This situation can be detected using
    do read from excite.ho done; do check od genus done
and corrected by deleting the more general meaning
from the hierarchy, or by choosing a different word.

4. In some situations the same word has very different
meanings in two hierarchies.
Example:
   Excite                         InfoSeek
      Shopping                      Shopping
          Bargains                       Gifts
          Clothes                         music CDs
          Music                           online malls
This can be detected by comparing the units of the
categories using the UNIX diff command
    !diff with -ib od excite.ho,infoseek.ho out diffei.txt done

5. There may be contradictions between hierarchies,
i.e., A isa B in one hierarchy, and A isa not B in another.
Example:
    Excite                            InfoSeek
        Entertainment                 Entertainment
            Movies                            Books
            Music                              games
            TV                                  great movies
        Games                                 music
                                                   ...
This can be detected using
    do read from excite.ho, infoseek.ho done; Excite is InfoSeek; do check od genus done
Knowledge Explorer automatically checks for
explicit declarations of the form
    A is B; ...; A is not B