With the establishment of international networks of sufficient capacity to allow provision of multimedia services, there are many opportunities for distributing information and services. This paper discusses the important issue of how to provide these services in a way that is best for both service providers and potential customers. It is important that a network based multimedia shop allow fair advertising of products, and easy access for customers. We propose one possible solution based on the use of probabilistic protocols, allowing fair access by advertisers while guaranteeing network stability. The use of user and population models assists this process and provides for easy access for seekers of information.
We take the concept of a multimedia shop as a metaphor for how multimedia presentations and interactions may be conducted in the future. With continuing dramatic drops in the cost of communicating at a distance, this can create new markets and new ways for product and service providers to meet with potential customers. A network based shop must allow for the sort of accidental encounter we now have when we stroll through a shopping mall to discover that a new shop has opened offering cakes that we like. In the spirit of the dictum that it is better to travel than to arrive, we offer a possible approach to network based exploration of multimedia services.
In order for such a network based multimedia commerce to prosper, we need two fundamental mechanisms. Sellers need to efficiently distribute information about their products, in a way that is fair and reasonable. Purchasers need means of locating products suitable for their needs in a way that is easy and convenient, without being badgered by aggressive sales campaigns or boxed into local purchases. Since Australian companies are typically at a distance disadvantage in world markets, they should be strongly motivated to participate in a network based marketplace that attempts to overcome the tyranny of distance. The costs of our multimedia shop must be born by advertisers, so we tackle the central problem of how to provide a medium and mechanism that satisfies both advertisers and purchasers.
If we allow unrestricted access for advertisers to promote themselves on the network, then clients can easily be badgered. Particularly if it is cheap for an advertiser to send advertising messages. With a prevalence of junk faxes, junk email and junk advertising generally it is necessary to somehow restrict the connectivity between any single advertiser and every consumer on the network. In a very large network it is also possible for advertisers to congest the network either intentionally or unintentionally. In aggressively promoting their latest product range they could completely block a less financially powerful competitor from gaining access to the network. Clearly this is a situation to be avoided. With very large networks we must follow an approach that avoids rather than attempts to control network congestion. Probabilistic protocols (Schwartz 1990) serve the purpose of guaranteeing the avoidance of network congestion, and provide fair network access for advertisers.
If I am searching for a bicycle shop that supplies the new sprockets for my bicycle then should I send out a search message that specifies "sprockets" or "bicycles"? If I have a guide to the categories under which information is classified then I can consult this guide to better determine which category to search. But in a future international networked marketplace it is difficult to arrange coordination of these categories between different information suppliers. Unless we somehow establish some world authority for the categorisation of documents, and somehow enforce this categorisation then we win have a plethora of categories. Even in a narrow search this categorisation is difficult. Experimental research (Gomez and Lochbaum 1990, Chen and Dhar 1990) clearly shows that new users find great difficulty in coming to grips with this rigid categorisation. Far better to develop new approaches that overcome this difficulty. Systems for user modelling (Jennings 1991) are the basis for this approach.
The approach we outline for a multimedia shop is based on the use of probabilistic protocols with user and population modelling. It offers an approach that will scale up to a multimedia shop that win allow a user to browse for products on a machine in Paris just as easily as one in Argentina.
How do purchasers encounter the product of their choice on the network? Of course we could not allow any advertising on the network and simply require people to locate the services via other means, for example by looking in a newspaper. This is not a preferred solution since there are enormous advantages to being able to locate services using the network, but there are important constraints:
Brokers advertise services or information that is of use to others on the network. A typical broker may be a desktop publishing company that will format your multimedia document in any publishing format you desire and then distribute it for you. To attract business the broker needs to advertise to potential customers on the network. Agents seek out information, as we shall see in the next section.
The process of advertising follows the probabilistic protocol. Each broker contacts a small number of network nodes and advertises the availability of a piece of information. It might announce a new magazine, or the latest issue of the magazine. Or perhaps a special offer for new subscribers. At randomly selected intervals the broker node announces the availability of this information to other nodes in the network to which it is connected. These contact nodes are selected at random from the total number of possible contacts. To restrict traffic flow we restrict the total number of messages transmitted at each stage. Each of these nodes then in turn propagate this information at a randomly chosen time to further connected nodes. Nodes then exchange information and update their lists of broker information. An advertisement message keeps a count of the number of stations it has traversed, and when the number exceeds a specified limit, the message is not transmitted further.
As Figure 1 illustrates this process of dissemination is analogous to gas diffusion in a container. By a gradual process the information is spread throughout the network. Each node holds a list of information about services that are available on other nodes in the network. Although this information is not completely current, it is updated regularly through this process of sparse dissemination. The process can be carefully controlled by limiting the number of nodes to be contacted at each stage of spreading.
Figure 1: Dissemination of advertisements
Each agent holds a list of information about brokers and their services, in the form of a list of advertisements that have been circulated by brokers. Since this list must necessarily be limited in size, Schwartz in the original proposal followed a Least Frequently Used policy for discarding brokers from the list. We follow an alternative approach of allowing nodes to specialise in those areas of most interest to users attached to that node. At a node with several hundred customers then a population model is intended to represent the general spheres of interest of users associated with that node. This model is gradually developed on the basis of advertisements retrieved, and services delivered. So at a History Department the population model will most likely have a strong representation appropriate to history courseware. The population model need not be visible to other nodes in the network since the population model is only used to determine which broker advertisements to keep. Of course there will have to eventually be some discarding of old items from the list, but each node can decide independently the amount of network information it can afford to store. To carefully restrict the circulation of broker information, our implementation follows a policy of keeping track of nodes already contacted. This avoids the recirculation of information.
An advertiser cannot dominate the network if he follows this protocol of distribution. On average he is restricted to a very small net rate of transmission. However in reality if a broker for some reason transmits in excess of the allowed number of advertisements. then this will become apparent to other nodes. These nodes can then refuse to receive the broker messages, in effect defeating the purpose of the excess announcements. It is not necessary to police this in any other fashion. Of course in many instances an advertiser may wish to only advertise locally, but this can be implemented by restricting the circulation of advertisements using the normal domain addressing restrictions.
If we allowed unrestricted distribution of advertisements then a focused overload could possibly develop. A furious competition for a set of customers could result in large traffic toward their node. However with the protocol network stability is guaranteed by the random nature of the process of diffusion. With the protocol we have outlined it is not possible for advertisements to follow any deterministic path on the network. This leaves the vast majority of network bandwidth free for the important task of distributing multimedia information and services. Whilst it is not possible to instantaneously advertise to a large number of nodes, it is possible to circulate this information very quickly.
By circulating the advertisements widely, we guarantee that there is some likelihood that a node will have a wide range of advertisements that are suitable for its population model. Advertisements circulate randomly, but are persistent only at those nodes where the population model is compatible with the information or services being offered. So the node with many bicycle enthusiasts will eventually specialise in storing advertisements relevant to this interest. Essentially our approach follows Schwartz. but with this important specialisation process. In Schwartz's original proposal for the use of probabilistic protocols the network specialises paths between nodes. In our approach nodes tend to specialise in particular subject areas. Only a detailed trial could determine which of these schemes is to be preferred.
The economics of this set of protocols are such that the advertisements are forced to follow a smooth traffic pattern that effectively soaks up bandwidth that is unlikely to be used by other network applications. It quite comfortably coexists with time critical applications such as the delivery of real time video across the network, if the network is capable of supporting this application. By regulating the maximum average rate at which advertisements can be transmitted we can either reduce or increase the proportion of network bandwidth allocated to advertising information and services.
With a very large number of advertisements, when a user sends a request we have to match his/her request against the available information. If the user is interested in courseware then they may specify a request with the words "Early English history 1700---". A request must be matched against the available sets of information available. Traditionally systems have used keyword matching to locate relevant documents. But in our possible future of a very large network with many thousands of information suppliers, the keyword approach is difficult. For the keyword approach to be successful, the query must use the "proper" keywords for the area of the request. In most cases the keywords the supplier chooses may be recommended by some authority, but inevitably it is difficult to ensure compliance. The experimental evidence (Gomez, 1990) shows that this is not an effective method for inexperienced users. Particularly first time users (Chen, 1990) face enormous difficulty in selecting the "correct" words for their request. Far better to develop systems that accommodate the user rather than require them to learn some arcane keyword classification system.
User modelling (Jennings, 1991) offers a means of overcoming this problem of specifying what you want. The model accumulates over time the words that I read and use, and their relationship. For example if a user is a frequent reader of history advertisements then there will be a clear link between the words "Elizabethan" and "period". If the user then uses "Elizabethan' in a query then the user model ensures that this is interpreted as a period of history, rather than as a style of architecture. The individual user model can partly overcome the need to be specific about keywords, by modelling the most common use of words by the user. The process of user model formation is discussed in detail in (Jennings 1991).
This user model can be used at a node to rank advertisements in order of interest for the user. The normal user model is augmented by the words that specify the information request, by adding these as extra words or reinforcing the word in the user model if it is already present. The features of the document are extracted, which win consist of a set of words. For example a history courseware document could have the features ["English", "History", "1770", ...]. Typically a set of features could extend to several hundred words, and is not limited to a small set of keywords. Since the features are automatically extracted from the document we remove the cost of selecting keywords. Each of these features is then examined for membership of the user model network. Those words that are present receive an initial activation. This is then spread throughout the network for several iterations. Figure 2 illustrates this process. Each word spreads energy proportional to its connection strength. If a word in the user model accumulates sufficient energy, it then "fires" and takes a high energy. This word then spreads its energy to its connected words. We can then take the final energy of the network as a degree of match between the user's interests and the document. A typical user model might consist of a user model for each of the following categories:
Mail: New mailing lists, New nodeswhere each would contain a detailed model as illustrated in Figure 2. The categories here correspond to categories for which advertisements circulate. There may be many thousands of courseware offerings, and many magazines.
Courseware: Nihongo, Philosophy
Magazines: Nihongo, Travel
Music releases: Soft rock, Australian
Each of the nodes tends to accumulate information that is particular to its own population model. A population model (Orwant, 1991) is an averaging of user models over the entire population of users at that node. So if a large number of users are interested in bicycling then the population model will be highly representative of the terms of bicycling, and will tend to attract advertisements of that nature. Population models are a statistical picture of the user population. Advertisements are circulated under categories such as those for the illustrative user model above.
Forming the population model requires that we form a user model for each of the users of that node. Recall that our model was a site of several hundred users, say those associated with a branch operation of a company. Each of the users will read or interact with a large number of documents and applications. The user model simply forms a representation that encompasses these interests. If the user reads a lot of documents associated with bicycling or looks at a lot of pictures or videos of bicycling, then there will be a strong representation of that interest in the user model.
There are important privacy considerations in the widespread use of user models. It is more than likely unacceptable for user models to be widely available on the network. So we compromise and make use of a population model (Orwant). Here each of the user models associated with a node are accumulated and averaged. We simply take all of the user model networks and average them into a large network. The population model can then form a useful ranking for the distribution of advertisements. To rank an advertisement we follow the process of first extracting the key features of the document, and then applying them to the population network. Through the spreading activation approach outlined above we can take the network energy as the level of interest in this advertisement. In our case of bicycling all the population model would indicate is a strong interest in bicycling amongst a number of the population of the site, which is relatively innocuous information. Since an advertiser is unaware of which nodes have retained his advertisement. he cannot use network information as marketing guidance without soliciting assistance from customers.
The protocol and discovery mechanism are one possible method of implementing network based exploration of multimedia services and information. In implementing text based distribution of information, such as the present Internet news distribution system, there are very few impediments to the widespread distribution of information. This is because text is very inexpensive to store, and the average news article is short. When we contemplate international distribution of multimedia applications, courseware and presentations we must seriously consider the costs of distribution. Even though the cost of transport of information is dropping rapidly, we will still think seriously about transporting 30 minutes of video internationally across the network, to browse through it. More likely we would read an advertisement and ask our friends if its worth buying, and then purchase it.
Our proposal attempts to address the important issues of advertising availability of services, and customers finding satisfactory products. But of course there are other very serious considerations that restrict this possible future. If service providers take the short sighted view that vertical integration, or bundling the network with the information and service, is a way of guaranteeing a future market, then widespread network distribution will not become a possibility. Public networks are partly driven by sharing costs. If many participants share the cost then the cost per customer can be small, especially in a global market. A network based marketplace has the enormous advantage of universal connectivity. If a supplier of courseware can advertise to a large proportion of students and instructors, then his marketing costs could be very small. Similarly from the student or instructor side, there are very cheap ways of exploring alternative courseware or information.
Gomez, L. M. and Lochbaum, C. C. (1990). All the Right Words: Finding what you want as a function of richness of indexing vocabulary. Journal of the American Society for Information Science, 41(8), 547-559.
Jennings, A., Higuchi, H. and Liu, H. (1991). A personal news service based on a user model neural network. Proceedings of the IJCAI Workshop on Agent Modelling for Intelligent Interaction, pp. 1-33, Sydney, August 24-30.
Orwant, J. (1991). The Doppelganger user modelling system. Proceedings of the IJCAI Workshop on Agent Modelling for Intelligent Interaction, pp. 164-168, Sydney, August 24-30.
Schwartz, M. (1990). A scalable, non-hierarchical resource discovery mechanism based on probabilistic protocols. Technical Report CU-CS-474-90, Department of Computer Science, University of Colorado, Boulder, Colorado.
|Authors: A. Jennings, M. Flower and P. Nicholson|
Telecom Australia Research Laboratories
770 Blackburn Rd, Clayton Vic 3168, Australia
Please cite as: A. Jennings, M. Flower and P. Nicholson (1992). A multimedia shop. In Promaco Conventions (Ed.), Proceedings of the International Interactive Multimedia Symposium, 573-583. Perth, Western Australia, 27-31 January. Promaco Conventions. http://www.aset.org.au/confs/iims/1992/jennings.html