Modern hypermedia systems encompassing the ability to adapt to the properties of human memory and cognition
Dr Piotr Wozniak
Prof. Witold Abramowicz
Feb 28, 1997
This text first appeared as part of Wozniak, P.A., 1995, Economics of learning (Doctoral Dissertation at the University of Economics in Wroclaw) and has since become a theoretical inspiration for the development of SuperMemo 8 for Windows. The current version has been updated with new observations and puts more emphasis on distributed hypermedia systems.
In this text we would like to show the need for developing knowledge access systems that account for the imperfections of human perception, information processing and memory. The implementation of such systems will result in enormous savings in the process of learning at all three stages of knowledge acquisition by the mind: (1) knowledge access, (2) learning and (3) knowledge retention. In particular, we will stress the importance of repetition spacing algorithms, as well as the application of the newly introduced concepts of processing, semantic and ordinal attributes in hypertext documents.
Fusion of the hypertext paradigm with techniques targeted against human forgetfulness
Implementation shortcomings evident in generic hypertext interfaces
Historically, the development of repetition spacing algorithms proceeded from common-sense paper-and-pencil applications to increasingly sophisticated computer algorithms, finally implemented in commercial products that have gained substantial popularity among students of languages, medicine, and many other fields.
This development process was almost entirely oriented towards the maintenance of the acquired knowledge in the student's memory. A similar development process can now be initiated with reference to the retrieval and acquisition of knowledge.
Effective learning is based not only on being able to retain the learned material in one's memory. Before that, the to-be-learned knowledge must be identified, pre-processed with a view to understanding, classified and selected with respect to its relevance and importance. This process can be greatly enhanced by means of simple techniques, which make excellent material for computer implementation.
This implementation becomes increasingly urgent with the diminishing role of printed materials in the wake of the growing role of the World Wide Web and the vast market for CD-ROM titles across all subject domains. The straightforward use of a pencil, which is often instrumental in one's work with printed matter, becomes impossible with more and more multimedia titles appearing on CD-ROM and with the rapid growth of hypermedia available via global computer networks. Some visionaries even predict the death of printed matter as we know it. The gap between the effectiveness of browsing printed and hypertext documents seems to grow by the minute, yet still very little attention is paid to the reader's or user's ability to leave a trace of their work in the document. Most hypertext systems distributed on CD-ROM provide the user only with annotation and bookmark tools, which leave much room for improvement.
Let us briefly present examples of tools and techniques that can be used in working with printed textbooks, and the inspiration these might provide for the design of future hypermedia documents.
-
The first problem with books to read is that there are usually too many of them. A good selection of the most applicable material is the first step to effective acquisition of knowledge. We will, however, leave this subject out of consideration, because we would like to focus entirely on authoring systems for the development of hypertext documents, as well as on tools that would enhance such documents and make them more attractive from the student's standpoint. New technologies, most notably CD-ROM, will make the author's choice easier in the sense that the vast capacity of the media imposes less stringent constraints on what not to include in the final shape of the document. When we extend the picture to the World Wide Web, the question becomes irrelevant: with appropriate navigation and search tools, hyperspace may remain virtually unlimited.
-
After selecting the learning material, an important tool to use is the bookmark. Apart from reference materials such as encyclopedias, dictionaries, computer documentation, etc., most printed material allows, and often requires, a substantial dose of linear progress through the contents. As the time slices allocated for reading often break one's work in the middle of a linear section, bookmarks are of indispensable value. With the advent of hypertext applications, the average length of a linearly processed text is likely to drop dramatically. However, bookmarks do not only serve as pointers to interrupted reading; they also provide the means for a customizable table of contents, which can be used to quickly access the sections of greatest interest. Bookmarks have been an early and ubiquitous child of hypertext documents; therefore, we will also not consider them in the reasoning that follows.
-
After picking a book and selecting the relevant bookmark, the process of reading or browsing begins. First of all, the same bookmark that was used in accessing a particular chapter or section may serve as a pointer that helps keep the sight focused on the processed paragraph. This is particularly useful in richly illustrated texts, or at moments when external interruptions require frequently shifting the sight beyond the reading area. In a hypertext document, the counterpart of a paper bookmark used in reading a textbook should be a cursor that highlights the single semantic unit currently being processed. The importance of such a cursor may go far beyond the sight guidance of a traditional bookmark. Such a cursor will later on be called a semantic focus. It is not difficult to notice that modern textbooks go further and further in making particular semantic units of the text less context dependent. In other words, by picking up a randomly selected sentence from a modern textbook, we are more likely to understand it than would be possible in textbooks written in the style of a few decades ago. The general trend is to shift from prose to more precise forms of expression. This is most visible in the proliferation of illustrations, formulas, insert boxes, enumerations, underlined text, etc. The trend comes from the increasing tendency to convert linear textbooks into pick'n'read reference materials. This makes the job of a hypertext document author much easier. It will also let semantic units live a life of their own, to the benefit of knowledge retrieval and acquisition.
-
The most important part of a good textbook processing technique is to leave traces of one's work in the text. After all, the book itself should record the reader's progress, rather than leaving the entire burden on the reader's memory. First of all, it is useful to prepare a page chart for every carefully studied book. The page chart keeps a record of each page processed and its current processing status. The processing status may assume at least the three following values:
-
intact - the page has not yet been
processed
-
processed - the page has been read
at least once and all its semantic units have been reviewed and marked
with processing attributes, which, very much like in page charts, indicate
the processing status (e.g. irrelevant, important, memorized, etc.)
-
done - the page has been fully
processed, and needs no further reference. For example, all its semantic
units have been marked as irrelevant, or all its relevant semantic units
have been memorized
In some cases, it may also be worthwhile to distinguish a few degrees of the attribute processed (or read). After all, the page might have been read once, twice, or several times, with all its semantic units changing their processing attributes during each pass.
The rationale behind page charts is
to have a constant opportunity to control the speed and direction of processing
a particular textbook; the greatest advantages being: (1) no need to refer
to fully processed pages marked with done, and (2) giving priority to new
material (intact) as opposed to the material that has already been, at
least partly, processed (read).
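To make the idea concrete, here is a minimal sketch in Python of such a page chart (the class and function names are ours, purely for illustration; no existing software is implied). It records the processing status and read count of every page and returns the pages still worth visiting in the priority order described above: intact pages first, done pages never.

```python
from enum import Enum

class PageStatus(Enum):
    INTACT = "intact"        # not yet processed
    PROCESSED = "processed"  # read at least once, units marked with attributes
    DONE = "done"            # fully processed, no further reference needed

class PageChart:
    """A per-book record of which pages have been processed and how often."""

    def __init__(self, page_count: int):
        self.status = {p: PageStatus.INTACT for p in range(1, page_count + 1)}
        self.read_count = {p: 0 for p in range(1, page_count + 1)}

    def mark_read(self, page: int) -> None:
        """Register one reading pass over a page."""
        self.read_count[page] += 1
        if self.status[page] is PageStatus.INTACT:
            self.status[page] = PageStatus.PROCESSED

    def mark_done(self, page: int) -> None:
        """All units on the page are irrelevant or memorized."""
        self.status[page] = PageStatus.DONE

    def next_pages(self):
        """Pages still worth visiting: intact pages first, then the least-read ones."""
        pending = [p for p, s in self.status.items() if s is not PageStatus.DONE]
        return sorted(pending, key=lambda p: (self.status[p] is not PageStatus.INTACT,
                                              self.read_count[p]))

# Example: a 5-page booklet; pages 1 and 2 read once, page 2 finished.
chart = PageChart(5)
chart.mark_read(1)
chart.mark_read(2)
chart.mark_done(2)
print(chart.next_pages())   # -> [3, 4, 5, 1]
```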
-
As mentioned earlier, all semantic units are marked with processing attributes in the course of reading. These are:
-
irrelevant - the semantic unit
is not worth future reference.
-
relevant - the semantic unit is
worth future reference (which may change its attribute to irrelevant, to-be-memorized
or memorized).
-
to-be-memorized - the semantic unit seems worth remembering, and should be put into a database of learned material associated with the currently processed book. The actual transfer of the unit to the database will take place once processing of the book moves to more advanced stages. On occasion, this may happen much earlier or never.
-
memorized - the semantic unit has been transferred to the database of learned material, memorized and subjected to a repetition spacing algorithm. In other words, it needs no future reference.
The obvious rationale behind marking semantic units with processing attributes is never to refer to irrelevant or memorized units, to focus reading attention on relevant units, and to use to-be-memorized units only during the process of selecting new material for memorization.
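The same bookkeeping carries over to individual semantic units. The sketch below (again only our illustration, not code from any existing system) tags units with the above attributes and applies the stated selection rules: irrelevant and memorized units are never returned, intact and relevant units form the reading queue, and to-be-memorized units form the candidate list for the learning database.

```python
from dataclasses import dataclass
from enum import Enum

class UnitStatus(Enum):
    INTACT = "intact"
    IRRELEVANT = "irrelevant"
    RELEVANT = "relevant"
    TO_BE_MEMORIZED = "to-be-memorized"
    MEMORIZED = "memorized"

@dataclass
class SemanticUnit:
    text: str
    status: UnitStatus = UnitStatus.INTACT

def reading_queue(units):
    """Units still worth the reader's attention (intact or relevant)."""
    return [u for u in units
            if u.status in (UnitStatus.INTACT, UnitStatus.RELEVANT)]

def memorization_candidates(units):
    """Units waiting to be transferred to the learned-material database."""
    return [u for u in units if u.status is UnitStatus.TO_BE_MEMORIZED]

units = [
    SemanticUnit("Definition of repetition spacing", UnitStatus.TO_BE_MEMORIZED),
    SemanticUnit("Anecdote about the author", UnitStatus.IRRELEVANT),
    SemanticUnit("Table of review intervals", UnitStatus.RELEVANT),
]
print([u.text for u in reading_queue(units)])            # only the relevant unit
print([u.text for u in memorization_candidates(units)])  # only the to-be-memorized unit
```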
In the majority of presently available hypertext systems, it is difficult to develop an equivalent of page charts. Such documents still leave the impression of straying in a sea of information with little chance of managing access in a rational way. The main problems here are: (1) how to make sure that one does not wade again through once-processed material (during the reading process, it is easy to have the pleasant impression of knowing everything beforehand, only to discover that some of the formulations merely evoke a déjà vu effect), and (2) how to make sure that no important section is missed (perhaps the increasing drive toward large hypertext documents that cannot be encompassed in any way will eliminate this concern altogether). Sooner or later, developers of hypertext tools will discover that there is much more to reading printed books than what has until now been encapsulated in standard hypertext technologies.
New solutions proposed for hypertext systems
Let us consider a collection of proposed
enhancements to generic hypertext systems that would provide solutions
to the problems mentioned in the preceding section.
-
The first of the mentioned problems concerned the selection of material. What generic systems have to offer in that respect is: (1) the possibility of choosing a title, (2) collapsible tables of contents, (3) search tools, and (4) bookmarks. All that still leaves the reader with the entire document to work with.
The first and easiest step toward customized content is an editable table of contents. We will discuss possible add-ons to tables of contents in Point 4 as we address the problem of page charts.
A much more complicated, but probably more desirable, approach to customizing documents to particular needs is the document filter. Boolean and fuzzy search procedures standardly included in hypertext documents are usually able to yield a list of the topics collected in the search. Such a list is usually presented in sorted form using one of two criteria: (1) semantic order, and (2) number of search hits. Indeed, such a newly generated list of topics can be viewed as a customized table of contents. However, such a table has no persistence; in other words, it is usually destroyed by repeating the search procedure. Moreover, if the newly generated table of contents were all the reader was interested in, there is, as a rule, no way of hiding the remaining contents of the document from other browsing procedures.
A document filter might have search abilities similar to the standard search procedures mentioned above; however, the output of the search would take the form of a new document with a new table of contents. Additionally, a keyword system, or better yet, semantic attributes associated with particular topics or even semantic units, might be used in the search. In other words, instead of looking for words or phrases, the search would look for keywords or even for semantic content expressed through semantic attributes.
The ultimate solution with respect to document filters is to let them collect all relevant semantic units and, literally, generate a new document from the collected pieces. Before such a solution can be implemented, a great deal of progress in natural language processing will be required. In contrast, as will be demonstrated in Points 4 and 5, some handy solutions concerned with processing attributes might be just a few steps away.
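Even without progress in natural language processing, a document filter operating on semantic attributes can be approximated along the following lines (a hedged sketch; the topic structure and attribute names are our own assumptions). Topics whose attributes match the query are collected into a customized table of contents that persists independently of subsequent searches.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    title: str
    text: str
    semantic_attributes: set = field(default_factory=set)  # e.g. assigned keywords

def filter_document(topics, required_attributes):
    """Return the topics carrying all required semantic attributes."""
    required = set(required_attributes)
    return [t for t in topics if required <= t.semantic_attributes]

def build_contents(topics):
    """A customized table of contents built from the filtered topics."""
    return [t.title for t in topics]

document = [
    Topic("Forgetting curves", "...", {"memory", "theory"}),
    Topic("Repetition spacing", "...", {"memory", "algorithm"}),
    Topic("CD-ROM market", "...", {"publishing"}),
]
selected = filter_document(document, {"memory"})
print(build_contents(selected))   # -> ['Forgetting curves', 'Repetition spacing']
```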
-
As mentioned earlier, bookmarks are already a standard fixture in all documents that have anything to do with hypertext capability. Bookmarks may serve as a way of constructing a customized table of contents upon locating the most relevant topics used in one's work with the document. In the context of document filters, one might only propose that one of the possible outcomes of a search should be an editable bookmark table that would make it possible to use the results of the search long after it actually took place.
-
The important role of the semantic focus will be shown only later, when we consider the link between a hypertext document and a database of learned material generated during the browsing process. At this point we only note that its function can be compared to a selection bar in menus or a caret cursor in edit controls or a word processor. The position of the semantic focus indicates the currently processed semantic unit. Very much as in the case of cursors or selection bars, the actions undertaken by the user or reader will affect only the selected unit. These actions might be: (1) changing the processing attributes of the unit, (2) changing the semantic attributes of the unit (e.g. to determine future search outcomes), (3) transferring semantic items associated with the unit to a database of learned material, and (4) performing an editing action on the unit (delete, print, transfer to another document, etc.).
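As a rough illustration of how such a focus might behave (a sketch under our own assumptions, not a description of any existing interface), the class below keeps a cursor over the document's semantic units and applies each of the listed actions only to the unit under the cursor.

```python
class SemanticFocus:
    """A cursor over semantic units; every action applies only to the focused unit."""

    def __init__(self, units):
        # each unit: {"text": ..., "status": ..., "keywords": set()}
        self.units = units
        self.position = 0

    @property
    def current(self):
        return self.units[self.position]

    def move(self, step=1):
        """Shift the focus within the document, clamped to its bounds."""
        self.position = max(0, min(len(self.units) - 1, self.position + step))

    def set_processing_attribute(self, status):    # action (1)
        self.current["status"] = status

    def add_semantic_attribute(self, keyword):     # action (2)
        self.current["keywords"].add(keyword)

    def transfer_to_database(self, database):      # action (3)
        database.append(dict(self.current))
        # action (4), editing, would similarly operate on self.current only

# Example: focus the second unit, mark it and queue it for memorization.
units = [{"text": "Intro", "status": "intact", "keywords": set()},
         {"text": "Key formula", "status": "intact", "keywords": set()}]
learned = []
focus = SemanticFocus(units)
focus.move(1)
focus.set_processing_attribute("to-be-memorized")
focus.add_semantic_attribute("memory")
focus.transfer_to_database(learned)
print(learned)
```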
-
Page charts are most painfully missing upon moving from printed matter to hyperspace. The division of books into pages seemed quite artificial, but the benefits of charting are definitely worth this little inconvenience.
In the case of hypertext documents, the concept of a page has ceased to exist, being replaced with the concept of a topic. The best link to the entire semantic structure of topics, from the human standpoint, comes via the table of contents; hence it is the most obvious implementation target for a counterpart of page charts. A flexible table of contents that would make paper a commodity of the past should meet the following conditions:
-
collapsibility (this feature, allowing chapters to be expanded to sections or collapsed to the title level, is increasingly apparent in modern hypertext systems)
-
editability that would let the user choose the sequence of topics, as well as choose topics that should disappear from view, not only at the contents level but also from the document itself
-
awareness of the reader's progress through the application of processing attributes
The last point seems the least obvious and worth the most attention.
As in the case of page charts, the reader should be able to mark topics with processing attributes (which are initially set to intact). Marking a topic as irrelevant or done would be equivalent to erasing it from the table of contents or leaving it in an easily distinguishable form, e.g. grayed. Marking a topic as processed might be enhanced by an indicator of the degree of processing, which might also be reflected in the appearance of the topic's title in the table (e.g. through coloring). Obviously, the process of tagging topics with processing attributes should be available both at the contents level and at the topic level.
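A flexible table of contents meeting these conditions could be modeled roughly as follows (an illustrative sketch only; the rendering conventions are our own). Each entry carries a collapsed flag and a processing attribute, and entries marked irrelevant or done simply drop out of the rendered table.

```python
class ContentsEntry:
    """One node in a flexible, progress-aware table of contents."""

    def __init__(self, title, children=None):
        self.title = title
        self.children = children or []
        self.collapsed = False
        self.status = "intact"     # intact / processed / irrelevant / done
        self.read_count = 0        # degree of processing

    def render(self, indent=0, hide_done=True):
        """Return the visible lines of the table, honouring collapsing and attributes."""
        if hide_done and self.status in ("done", "irrelevant"):
            return []                                   # erased from the table
        marker = {"intact": " ", "processed": "*", "done": "x", "irrelevant": "-"}
        lines = ["  " * indent + f"[{marker[self.status]}] {self.title}"]
        if not self.collapsed:
            for child in self.children:
                lines.extend(child.render(indent + 1, hide_done))
        return lines

# Example: one chapter finished, so it disappears from the rendered contents.
book = ContentsEntry("Memory and learning", [
    ContentsEntry("Forgetting curves"),
    ContentsEntry("Repetition spacing"),
])
book.children[0].status = "done"
print("\n".join(book.render()))
```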
-
Finally, individual semantic units should also be markable with processing attributes. Initially, all semantic units would be marked as intact. Upon the first reading, irrelevant items should be marked as irrelevant and, depending on the user's choice, disappear from the text or appear grayed in their original place. Semantic units of utmost importance might be immediately transferred to a database of to-be-memorized items. At the very least, this process would allow the user to paste the content of the semantic unit, re-edit it and place it in a selected database. However, a much more desirable solution is to associate all semantic units in a hypertext document with ready-made collections of items that might be transferred to, or linked with, the student's database with a keystroke (e.g. after optional selection and pre-processing). Items marked as memorized could also, depending on the set-up, become invisible or be distinguished by different coloring. The remaining items could be marked with a degree of relevance (or number of reading passes), the highest degree being equivalent to the attribute to-be-memorized. The degree of relevance might contribute to the application of ordinal attributes that might later be used in prioritizing once-accessed items for secondary access. Similarly, to-be-memorized items might also be tagged with ordinal attributes that, in this case, would determine the memorization order. With processing attributes applied, the user would be able to quickly skip the parts once identified as irrelevant, as well as pay less attention to those sections that have already been entirely mastered by means of a module using repetition spacing algorithms.
The usual situation is that, at the early stages of processing the document, intact topics and units have the highest processing priority. As the work progresses, the once-referred-to units may increasingly come into the focus of attention (e.g. in the order determined by their ordinal attributes). This will, in all likelihood, move their processing status to increasing degrees of relevance, up to the point where a decision is made to memorize a particular semantic unit. In an optimum situation, a collection of simple techniques should be developed to make sure that the flexible table of contents makes it possible to quantitatively assess the progress of processing the semantic units in a given topic. For example, the topic title in the table could be associated with a bar chart showing the proportion of semantic units in the intact, irrelevant, relevant and memorized categories.
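Such a quantitative assessment is straightforward to compute. The sketch below (purely illustrative) counts a topic's semantic units per category and renders a crude fixed-width text bar that could stand in for the proposed bar chart next to the topic's title.

```python
from collections import Counter

CATEGORIES = ("intact", "irrelevant", "relevant", "memorized")

def progress_bar(unit_statuses, width=20):
    """Proportion of units per category, rendered as a fixed-width text bar."""
    counts = Counter(s for s in unit_statuses if s in CATEGORIES)
    total = sum(counts.values()) or 1
    symbols = {"intact": ".", "irrelevant": "-", "relevant": "+", "memorized": "#"}
    bar = ""
    for category in CATEGORIES:
        bar += symbols[category] * round(width * counts[category] / total)
    return bar.ljust(width, ".")[:width]

# Example topic: 4 intact, 2 irrelevant, 3 relevant, 1 memorized unit.
statuses = ["intact"] * 4 + ["irrelevant"] * 2 + ["relevant"] * 3 + ["memorized"]
print(progress_bar(statuses))   # -> '........----++++++##'
```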
Our experience shows that there is great potential for an increase in the effectiveness of using hypertext documents if the proposed tools are provided both in the software shell and in the document in question.
Integrating repetition spacing technology with a hypertext interface
Our hope is that, in the future, the student will never have to work with repetition spacing algorithms through a dedicated program like SuperMemo. The optimum situation is one in which the student obtains access to a hypermedia knowledge base (e.g. within the framework of the World Wide Web) with seamlessly integrated algorithms for the optimum spacing of repetitions (e.g. as a plug-in to a Web browser). In other words, the focus should shift from software and its options to knowledge itself.
Naturally, the development of a hypermedia interface for a knowledge base associated with a database used in learning will put a much greater burden on the authors of a particular learning system. However, the increase in the effectiveness of accessing and learning knowledge will certainly fully compensate for the higher development costs.
In the optimum case, all semantic
units relevant to learning should be associated with predefined, well-structured
items (often in the standard question-answer form). A single semantic unit
might generate from one to several individual to-be-memorized items. In
other words, developing a seamless hypermedia knowledge base integrated
with repetition spacing algorithms would triple or quadruple the authors’
effort and costs.
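The following sketch shows what such an association might look like in code (our own illustration; the one-day initial interval is a deliberately naive placeholder and not the SuperMemo repetition spacing algorithm). Each semantic unit carries predefined question-answer items that can be transferred, with an initial review date, into the student's learning collection.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Item:
    question: str
    answer: str
    next_review: date = field(default_factory=date.today)
    interval_days: int = 1   # placeholder; a real algorithm adapts this per item

@dataclass
class LearningUnit:
    text: str
    items: list              # predefined, well-structured question-answer items

def memorize(unit, collection):
    """Transfer a unit's ready-made items into the student's learning collection."""
    for item in unit.items:
        item.next_review = date.today() + timedelta(days=item.interval_days)
        collection.append(item)

unit = LearningUnit(
    text="Repetition spacing increases the intervals between reviews of an item.",
    items=[Item("What does repetition spacing control?",
                "The intervals between consecutive reviews of an item.")],
)
collection = []
memorize(unit, collection)
print(collection[0].next_review)   # first review scheduled for tomorrow
```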
A subset of the aforementioned technological solutions is currently available in SuperMemo 98.