How fast is the Internet? Can it perform well enough to serve as a vehicle for multimedia? Three Perth based clients logged in to nine Web servers for a period of three months, measuring the number of seconds required to download home pages. Of 375 connections, 12% failed, with successful connections often taking more than a minute to complete. Relationships between download time and server location, time of day, and client type (Ethernet and SLIP dialup) are reported, with SLIP performance found to lag much less than anticipated. The best Internet performance obtainable during the period of the study gave an effective throughput rate only about one percent of that commonly delivered by late model CD-ROM drives.
The Internet's World Wide Web, and its expected successor, the so-called Information Superhighway, or Infobahn, continue to be touted as vehicles which will carry multimedia to homes, schools, and businesses. This very conference, the 3rd International Interactive Multimedia Symposium, is dedicated to the "learning superhighway", and to exploring an expected fusing of two emerging technologies, multimedia and the Web.
Is the Web truly a suitable vehicle for transporting multimedia? Are we satisfied that the Web can presently deliver graphics and text at an acceptable rate? Are the Web, and our browsers, really multimedia capable? Rich media capable? Truly interactive?
Most of us know that the Web, as found mid-1995, handled rich media only via the use of "helpers", and was only beginning to show signs of interactivity beyond hyperlinks. Multimedia tyros might be excused for believing that such facilities constitute interactive multimedia, but the rest of us know better. Let's put aside thoughts of rich media for the moment, and interactivity, and focus just on the delivery infrastructure on which we expect the Infobahn to grow. Let's agree to be content with text and graphics, and focus on access speeds to Internet Web servers.
Speeds? I say ... pulling down typical Web pages with our "show graphics" browser option on is more often than not a chelonian activity, not something one would undertake at the front of a lecture theatre (unless, perhaps, it was empty). Web browsing is for people with time to spare, not for surfies, not for someone who needs to be somewhere else in ten minutes. Web browsing is rather like going to the museum on a wintry afternoon; interesting things to see, but we don't necessarily expect to see them in a hurry. Or, perhaps it's more like going for a ride on the Ferris wheel at Royal Show time: we think it'll be fun once we're on, but we expect to queue up, and to wait what might be many minutes before the action begins.
One doesn't surf the Web. Not even in Perth, a city known for its propensity to good surfing beaches. One browses the Web in a leisurely manner, at best, and is never surprised to end up crawling. Blasphemy? Some people get red in the face on hearing such talk. We have all heard that the Web is the future, and no one expects to get there on hands and knees.
It was with such thoughts in mind that I decided to knuckle down and collect some hard data on the time it takes to access typical Web sites, both at home and abroad. I selected nine sites, three in Perth, one in New Zealand, four in the United States, and one in the United Kingdom. I accessed these sites via a SLIP connection from my Perth residence, and from my office via our university's Ethernet backbone. I was assisted by Christy Pinfold, a colleague at another Perth based university, who logged into the same Web sites from her office, working over her university's Ethernet network.
We attempted several hundred connections over a period spanning close to three months, mid-May through early July, 1995. Our dependent variable was the number of seconds it took for our browsers to announce that they had completed the downloading of a site's initial Web page. Independent variables included time of day for both client and server; day of the week; and, for SLIP, connection speed.
The only surprise likely to be found among our results is that the SLIP connection gave performance figures which were essentially, or at least tolerably, close to those obtained over a direct Ethernet wire-up. SLIP connection speed varied considerably, but in the end had an unnoticeable impact on the results. I summarise our results in detail below. Perhaps as much as anything else they may come to serve as reference points against which we can compare future soundings.
This was Curtin University's mirror of "Welcome to the Planets", the "collection of many of the best images from NASA's planetary exploration program" (as quoted from the welcome screen itself). Curtin University is in Perth, Western Australia. [verified 21 Feb 2001]
The University of Massachusetts' mirror of "Welcome to the Planets" (same material as at Site 1). [verified 21 Feb 2001]
Cambridge University's mirror of "Welcome to the Planets" (same material as at Sites 1 and 2). [verified 21 Feb 2001]
Site 4: http://www.state.wi.us/agencies/dpi
Home page for the Wisconsin Department of Public Instruction, in Madison, Wisconsin. The address given above is the new URL for http://badger.state.wi.us:8010/agencies/dpi/www/dpi_home.html which was the URL used during our data collection stage. [19 Feb 2001: home page is now http://www.dpi.state.wi.us/]
Site 5: http://infopages.com/
Maximised Online's InfoPages guide to current events in Orange County and the Southern California area.
Site 6: http://www.otago.ac.nz/home-page.html
University of Otago (Dunedin, New Zealand) home page. [19 Feb 2001 at http://www.otago.ac.nz/]
Site 7: http://ipl.sils.umich.edu/
The home page for the IPL, Internet Public Library, located at the University of Michigan. [21 Feb 2001 now at http://www.ipl.org/]
Site 8: http://cleo.murdoch.edu.au/murdoch/murd_map.html
Murdoch University's home page. Murdoch is located in Perth, Western Australia. This site has a simple home page, consisting of only one file. The pages of all the other sites were typical of current Web servers in that they involved the transfer of multiple files. (Some readers may be unaware of the fact that current http protocol requires a new handshake for each file to be transferred; because of this it is advantageous to design Web pages as single files, but this is not always feasible.) [19 Feb 2001 Murdoch University home page is http://www.murdoch.edu.au/ and http://cleo.murdoch.edu.au/ ceased on 30 Sep 2002]
Site 9: http://184.108.40.206/vetsc/indexlst.htm
Home page for veterinary science at Murdoch University. (Same university as Site 8, but different server.) [21 Feb 2001 home page is at http://wwwvet.murdoch.edu.au/]
We both used Netscape as our browser, Christy version 0.94B2, me version 1.0N. We did not update the browser over the course of the data collection. Nor did we change the browser's settings during the study.
At the time of the study, Curtin University was using a dedicated 128 kbps ISDN line to Perth's Internet gateway, while Edith Cowan University was using a 256 kbps microwave link. Curtin, Edith Cowan, and Murdoch are expected to upgrade to 34 Mbps microwave links before the end of 1995.
Local and host times of day are important variables in this sort of study. Perth is eight hours ahead of GMT; when it's 8 AM in Perth on Monday, it's 7 PM Sunday in Chicago (same as Madison, Wisconsin). One would think this time differential would give Perth Internetters good access to sites in the United States - as we head into our day, the US is heading into evening, and, presumably, US Web users are logging off. Eight of a Perth morning equates to 12 midnight of the evening before in Cambridge (UK), while 8 AM Perth maps to 1 PM of the same day in the beautiful city of Dunedin, New Zealand.
We programmed our browsers with the URLs given above. We'd point the browser at one of the sites, double click with the mouse and simultaneously start a stop watch. We then watched the bottom left side of the Netscape screen for the "Document done" message to appear. When it did we stopped the watch and recorded data.
Of course the "Document done" message did not always appear. We agreed that we would abandon a download after five minutes, although I at times hung on longer in those cases where there remained some evidence of traffic. (If some day you decide you might have time on your side, and patience, you'll see rather quickly that it's a pretty straightforward matter to predict whether or not a site access is going to hold up for the time needed... in the majority of those cases where we waited five minutes before giving up, it was evident long beforehand that the connection had failed ... on the other hand, there were a few cases where things were still moving at the five-minute mark... in a couple of these cases I stayed tuned just to see how long it might take, and was once rewarded 11.5 minutes later, and once 22.25 minutes later.)
In these tables, the 25th, 50th, and 75th percentiles head columns two through four; IQR is the inter-quartile range; n is the number of connections which were completed in less than five minutes; mean is the average; s.d. refers to standard deviation; bad is the number of connections which either could not be made, or which exceeded five minutes; and %bad is the number of bad connections as a percentage of the total number of connections, good and bad.
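As a rough illustration of how the columns just described could be derived from a list of stopwatch readings, here is a minimal Python sketch. It is not the procedure actually used in the study: the function name `summarise`, the quartile method (`inclusive`), and the example timings are all assumptions made for illustration; only the five-minute cutoff and the column names come from the text.

```python
import statistics

CUTOFF_SECONDS = 300  # the five-minute abandonment rule described in the text


def summarise(attempts):
    """Summarise one client's attempts at one site.

    `attempts` holds elapsed seconds per attempt; None marks a connection
    that could not be made at all. (Hypothetical input format.)
    """
    # "n" counts only connections completed in less than five minutes.
    good = [t for t in attempts if t is not None and t < CUTOFF_SECONDS]
    bad = len(attempts) - len(good)
    # Quartile method is an assumption; statistical packages differ here.
    q25, q50, q75 = statistics.quantiles(good, n=4, method="inclusive")
    return {
        "%25": q25,
        "%50": q50,
        "%75": q75,
        "IQR": q75 - q25,
        "n": len(good),
        "mean": statistics.mean(good),
        "s.d.": statistics.stdev(good),
        "bad": bad,
        "%bad": 100 * bad / len(attempts),
    }


# Hypothetical timings, NOT data from the study.
example = [20, 30, 40, 50, 60, None, 400, 35, 45, 55]
print(summarise(example))
```

Note that %bad is taken over all attempts, good and bad, while the percentiles, mean, and s.d. use only the successful ones.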
The labels of the Client variable appear in these tables as Shelley, Curtin, and Cowan. Shelley was the SLIP connection, Curtin was Ethernet over the Curtin University backbone, and Cowan was Ethernet over the backbone at the Claremont Campus of Edith Cowan University.
As an example of interpreting the information in Tables 1-9: Table 1 shows that 17 successful connections were made from Shelley, the SLIP client; the median (%50) time to document done was 57 seconds from Shelley to Site 1, with an average of 69.41 seconds. The same table shows that 12 successful connections were made from Curtin, with a median time of 26 seconds, mean of 26.67 - note, however, that two (2) bad connections were recorded.
Table 1 indicates that connecting to Site 1 from Cowan was similar to connecting from Shelley in terms of time. Of the 25 connections made from Cowan, 23 (92%) were successful. Table 1 might be considered a maverick table in that the Site, Curtin, was the same as one of the clients. The times seen for the Curtin client in Table 1 might be regarded as Ethernet LAN (local area network) access times.
If one sets Table 1 aside and tries to discern patterns from the other tables using the %75 column to avoid problems of skewing, I suggest there to be some evidence to support these statements:
|Table 1: Perth, Curtin Planets (Site 1)|
|Table 2: Massachusetts Planets (Site 2)|
|Table 3: Cambridge Planets (Site 3)|
|Table 4: Wisconsin DPI (Site 4)|
|Table 5: California InfoPages (Site 5)|
|Table 6: New Zealand U Otago (Site 6)|
|Table 7: Michigan Internet Public Library (Site 7)|
|Table 8: Perth, Murdoch U Images (Site 8)|
|Table 9: Perth, Murdoch U Vet Science (Site 9)|
The great majority of "bad" SLIP connections were ones which started well, but had not completed after five minutes. Three times an "unable to locate host" message was noted from Shelley.
Of the two Ethernet based clients, connection failures from Curtin were similar to those from Shelley SLIP: time outs, not complete after five minutes. Over at Cowan, Christy Pinfold noted five "TCP errors"; three "connection refused by host" messages; and three time outs (not complete after five minutes). All of the Cowan time outs were related to Site 9, a server which was problem free for Shelley and Curtin clients.
I looked at the potential effects of the time of day (client) variable by first of all blocking the variable into time intervals. Then I made a table which looked at the number of bad connections crossed with the time intervals, followed by several box and whisker plots which displayed the time intervals along the x-axis, with percentiles for the main dependent variable, seconds to document done, along the y-axis.
Table 10 displays the cross with "bad", the number of bad connections. For purposes of this cross tabulation, "bad" was broken into connections which failed to come through in less than five minutes ("time outs"), and connections which could not be completed for "other" reasons, including TCP errors, "unable to locate host" messages, and "connection refused by host" messages.
|Client Time Interval|Total Connections|Number Bad|Bad "Other"|Bad "Time outs"|
|---|---|---|---|---|
|05:30 - 08:00|49|2 (4%)|1 (2%)|1 (2%)|
|08:01 - 10:00|54|2 (4%)|1 (2%)|1 (2%)|
|10:01 - 12:00|78|10 (13%)|4 (5%)|6 (8%)|
|12:01 - 17:00|184|31 (17%)|12 (7%)|19 (10%)|
|17:01 - 21:00|10|0|0|0|
|Total|375|45 (12%)|18 (5%)|27 (7%)|
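The percentages in Table 10 can be rechecked from the raw counts. The short Python sketch below does so; the counts are transcribed from the table, while the helper names `pct` and `failure_rates` are illustrative inventions, not part of the original analysis.

```python
# Counts transcribed from Table 10:
# (client time interval, total connections, bad "other", bad "time outs")
TABLE_10 = [
    ("05:30 - 08:00", 49, 1, 1),
    ("08:01 - 10:00", 54, 1, 1),
    ("10:01 - 12:00", 78, 4, 6),
    ("12:01 - 17:00", 184, 12, 19),
    ("17:01 - 21:00", 10, 0, 0),
]


def pct(part, whole):
    """Whole-number percentage, as printed in the table."""
    return round(100 * part / whole)


def failure_rates(rows):
    """Return per-interval rows plus a totals row, each as
    (label, total, bad, %bad, %other, %time outs)."""
    out = []
    for label, total, other, timeouts in rows:
        bad = other + timeouts
        out.append((label, total, bad,
                    pct(bad, total), pct(other, total), pct(timeouts, total)))
    all_total = sum(r[1] for r in rows)
    all_other = sum(r[2] for r in rows)
    all_timeouts = sum(r[3] for r in rows)
    all_bad = all_other + all_timeouts
    out.append(("Total", all_total, all_bad,
                pct(all_bad, all_total), pct(all_other, all_total),
                pct(all_timeouts, all_total)))
    return out


for row in failure_rates(TABLE_10):
    print(row)
```

The totals row reproduces the 375 connections and 12% failure rate quoted in the abstract, and the rise in failures after 10:00 client time is visible in the fourth field of each row.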
Figures 1 through 3 are box and whisker plots, showing the range in access times which all three clients combined experienced when logging in to the three Planets servers.
If it's been a while since you've seen such graphs, the thick line in the shaded part of the box indicates where the median is (%50 in the tables above); the bottom of the shaded box represents the 25th percentile, while the top of the box corresponds to the 75th percentile.
The boxes which correspond to 'Before 8 AM' and 'After 2000' are all based on data from the Shelley SLIP client.
Figure 1 suggests that the best time to access Site 1 was before 10 in the morning. Figure 2 mirrors my own view of using the Web from my office - best get work done early as performance suffers as the day moves along. Figure 3, however, doesn't support this statement.
It is interesting to note that the three figures suggest that a Perth based Web user wanting Planets information would not necessarily benefit from accessing the local site, Curtin University. Performance from the Cambridge server closely rivalled that found in Perth (compare Figures 1 and 3).
One would hope there would be an inverse linear relationship between SLIP connection speed and the number of seconds to document done: the faster the connection, the shorter the download. Connect speeds ranged from a low of 7200 bps to a high of 24000 bps, with a median of 19200 bps, but there was no consistent relationship between speed and the number of seconds to download home pages. In fact, of the nine correlations, eight were positive, ranging from a value of .07 (Site 5), to a maximum of .40 (Site 3); the median of the positive correlations was .21. The single negative coefficient, -.48, corresponded to the site which took the longest for all three clients to access: Site 2.
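For readers who want to compute such a coefficient from their own logs, the sketch below shows one way to do it in Python. It assumes the coefficients reported above are ordinary product-moment (Pearson) correlations, which the text does not state explicitly; the function name `pearson_r` and the sample readings are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical SLIP connect speeds (bps) and download times (seconds),
# NOT data from the study. Here faster connections do download quicker,
# so the coefficient comes out negative.
speeds = [7200, 9600, 14400, 19200, 24000]
seconds = [120, 100, 70, 50, 30]
print(pearson_r(speeds, seconds))
```

A negative value is what one would hope for: as speed rises, seconds fall. A positive value, as found for eight of the nine sites, means the faster connections were if anything the slower downloads.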
Taken as an individual variable, Site, by itself, could account for 26% of the variance in seconds; Client, by itself, could explain only 3%; and Interval (time of day) could, alone, account for but 4% of the variance in the dependent variable. Increasing the number of independent variables to two, a model with Site and Interval together, with interaction, accounted for 38% of the variance in seconds; a model with Site and Client, plus interaction, explained less variance, 32%. A tri-variate model, Site, Interval, Client, plus all interactions, could explain 60%.
These variance partitioning results serve to confirm the data summarisations presented earlier. The amount of time required to download a Web page clearly depends on a number of factors, with the single most important factor being the Site itself. The second most important factor found in this study was Interval, interacting with Site.
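To make "variance accounted for" concrete, the Python sketch below computes the proportion of variance explained by a single factor as the between-group sum of squares divided by the total sum of squares. This is only an illustration of the single-factor case under that assumption; the text does not say which package or model was used, the function name `variance_explained` is invented, and the timings are hypothetical.

```python
def variance_explained(groups):
    """Proportion of variance in the dependent variable explained by one factor.

    `groups` maps each level of the factor (e.g. a site) to a list of
    download times in seconds.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    # Total variation of every observation about the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Variation of the group means about the grand mean, weighted by group size.
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical download times by site, NOT data from the study.
by_site = {"Site A": [10, 20], "Site B": [30, 40]}
print(variance_explained(by_site))
```

Models with two or three factors and their interactions extend the same idea, partitioning the total sum of squares among more terms.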
Time to download depended first of all on the site being accessed, and then on the time of day the access was attempted. If we take into account site, time of day, client, and all possible interactions among these variables, we end up with a model which accounts for more than half of the variance in the time required to download pages.
I started this study with the idea of gathering factual data to back up a personal impression: the Internet, as we found it in Perth at the time of the study, was hardly in a position to present itself as a suitable means for conveying multimedia information.
If we take one of the better sites in this study, Site 8, a site whose home page remained unchanged over the course of the study, effective throughput ranged from an average of 7,561 bytes/second (60,488 bps) for the Cowan client, to 1,288 bytes/second (10,303 bps) for the Shelley client (these figures were computed by dividing the size of the home page, in bytes, by the average number of seconds to download, as seen under the "mean" column of Table 8).
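The throughput calculation just described is simple enough to sketch in a few lines of Python. The function name `throughput` and the page size and timing in the example are hypothetical (the size of the Site 8 home page is not given here); only the method, bytes divided by mean seconds, then multiplied by eight for bits, follows the text.

```python
BITS_PER_BYTE = 8


def throughput(page_bytes, mean_seconds):
    """Return effective throughput as (bytes/second, bits/second)."""
    bytes_per_second = page_bytes / mean_seconds
    return bytes_per_second, bytes_per_second * BITS_PER_BYTE


# Hypothetical page size and mean download time, NOT figures from Table 8.
print(throughput(30_000, 4))

# Converting the reported Cowan figure from bytes/second to bits/second.
print(7_561 * BITS_PER_BYTE)
```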
CD-ROM drives, in comparison, have throughputs which range from 150,000 bytes/second in their original format, to the 600,000 bytes/second found on standard mid-1995 multimedia PCs. In not too rough terms, modern CD-ROM drives have almost one hundred times the best throughput which this study was able to obtain over the Internet. In comparative terms, a 20-second, multimedia AVI file (video clip) occupying 1,930,036 bytes would take just over four minutes (255 seconds) to download using this study's best Internet access figures, compared to just over three seconds from a CD-ROM drive.
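The AVI comparison can be reproduced with a short Python sketch. The rates and file size are those quoted above; the constant and function names are illustrative only, and the calculation assumes a steady transfer at the quoted rate with no connection overhead.

```python
BEST_INTERNET_BPS = 7_561    # bytes/second, best mean rate found in this study
CDROM_BPS = 600_000          # bytes/second, mid-1995 multimedia PC drive
AVI_BYTES = 1_930_036        # the 20-second video clip


def seconds_to_transfer(file_bytes, bytes_per_second):
    """Transfer time at a steady rate, ignoring any setup overhead."""
    return file_bytes / bytes_per_second


internet_seconds = seconds_to_transfer(AVI_BYTES, BEST_INTERNET_BPS)
cdrom_seconds = seconds_to_transfer(AVI_BYTES, CDROM_BPS)
print(f"Internet: {internet_seconds:.0f} s, CD-ROM: {cdrom_seconds:.1f} s")
```

The same function can be applied to any other file size to compare the two delivery routes.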
A 23-second multimedia MMM animation file (no sound) of 202,118 bytes would take, using this study's best figures, some 26 seconds to haul down from the Internet, compared to a third of a second from a CD-ROM. I find a clear message in these figures. If I had a class dealing with the solar system, would I have students access one of the Planets Web servers, or would I suggest they use the Planets CD-ROM instead?
Will Web access speeds improve? Friends at our Computing Centre assure me they will. Lots of people are said to be working on the bandwidth problem. But there's obviously a long way to go, much work to do. It would be interesting to repeat this study in mid-1996. In the meantime, for Perth based multimedia users, CD-ROMs have an absolute advantage over the Internet.
|Author: Dr Larry R Nelson|
Faculty of Education
Curtin University of Technology
GPO Box U 1987
Perth, Western Australia 6001
Please cite as: Nelson, L. R. (1996). The Internet as a multimedia snailway. In C. McBeath and R. Atkinson (Eds), The Learning Superhighway: New world? New worries? Proceedings of the Third International Interactive Multimedia Symposium, 291-297. Perth, Western Australia, 21-25 January. Promaco Conventions. http://www.aset.org.au/confs/iims/1996/lp/nelson.html