[home]

Tracking the Navigation Behavior of Web Communities - Related Work


Different Ways of Structuring the Contents

  1. Overview - By Relevance
    Documents are grouped by their relevance for one of the following topics: Tracking, Semantic Modeling, Visualization, Web Communities or Other.
  2. Overview - By Document Type Classification
    Documents are grouped by the background they come from, which is one of: Science, Scientific Projects, Standards, Specifications, Recommendations and documents about a particular one of those, Software, Tools (non-commercial), Commercial Products / Papers about Commercial Work, Link Collections (as entry points to further research), Work from lay people for a lay audience, and Books (which are not available online).
  3. Overview - In reverse time order
    The documents ordered by when I first found them, newest document first. This is also the order in which the documents are listed sequentially in this document.

Overview - By Relevance

Note: Documents which are relevant for multiple topics are listed multiple times.

Overview - By Document Type Classification

Note: The colors are just an experiment with a partitioning classification (which is very limited)!

Overview - In reverse time order


[top]

title

Original URL:
Local URL:
Found via:
Type of document:
Time spent with this document: YY-MM-DD: hh:mm-hh:mm = xh

Summary:

Relevance:

Tracking:

Visualization:

Semantic Modeling:


[top]

PowerBookmarks: A System for Personalizable Web Information Organization, Sharing, and Management

Citation: Li, W-S., Q. Vu, E. Chang, D. Agrawal, Y. Hara, and H. Takano. PowerBookmarks: A System for Personalizable Web Information Organization, Sharing, and Management. In Proceedings of the Eighth International World-Wide Web Conference, May 1999.
Original URL: http://citeseer.nj.nec.com/260729.html
Local URL: relatedWork/power.pdf
Found via: Search Engine: www.citeseer.com, keywords: Web and Navigation Behavior - 5th hit.
Type of document: Paper in PDF. The URL is the citeseer page with links to the document in different formats.
Time spent with this document: 01-06-27: 22:00-23:00 = 1h

Summary:

In the introduction, some figures are given showing that users have trouble organizing and re-finding pages they have visited. "PowerBookmarks is based on the concept of augmented hypermedia; where the system extracts useful metadata from Web documents and observes user behavior to provide valuable personalized services."; "PowerBookmarks utilizes a proxy server to monitor users' access pattern and an external classifier for organizing bookmarks." User behavior is analyzed by the following information for each URL: "(1) number of visits and dates of visit; (2) URLs referring (i.e. navigating) to this URL; (3) URLs referred from this URL; and (4) dates when such navigation occur."; "'Connectedness' is a useful measure to define importance of pages. It is defined as the number of pages a user can reach from or to a page within a predefined distance in term of links."; "We classify the information associated with a bookmarked URL into document specific metadata, owner specific metadata, and user specific information."; "Many usability studies, such as [11], have pointed out that a deep hierarchy yields less effective information retrieval since it requires many traversal steps and users tend to make mistakes."; "[...] This functionality allows collaborative work and social filtering in a workplace."; "Access frequency is defined as the number of accesses for a page during a period of time. Popularity is defined as the percentage of users accessing a page during a period of time." Finally, some interesting related work is mentioned.
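
To make the "connectedness" measure concrete, here is a minimal sketch (my own reconstruction in Java, not code from the paper) that counts the pages reachable from a given page within a predefined link distance using breadth-first search; the "to a page" direction would use the inverted link graph:

    import java.util.*;

    // Sketch of the paper's "connectedness" measure: the number of pages
    // reachable from a page within a predefined distance in terms of links.
    // The graph representation and all names here are my own illustration.
    public class Connectedness {

        // links: URL -> URLs it links to ("reachable to" would use reversed edges)
        static int connectedness(Map<String, List<String>> links,
                                 String start, int maxDistance) {
            Set<String> visited = new HashSet<>(Set.of(start));
            Queue<String> frontier = new ArrayDeque<>(List.of(start));
            Map<String, Integer> distance = new HashMap<>(Map.of(start, 0));
            while (!frontier.isEmpty()) {
                String page = frontier.poll();
                if (distance.get(page) == maxDistance) continue;
                for (String next : links.getOrDefault(page, List.of())) {
                    if (visited.add(next)) {            // true iff not seen before
                        distance.put(next, distance.get(page) + 1);
                        frontier.add(next);
                    }
                }
            }
            return visited.size() - 1;                  // do not count the start page
        }

        public static void main(String[] args) {
            Map<String, List<String>> links = Map.of(
                "a.html", List.of("b.html", "c.html"),
                "b.html", List.of("d.html"));
            System.out.println(connectedness(links, "a.html", 2)); // prints 3
        }
    }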

Relevance:

Tracking:

They use a proxy on the client to track user data. Unfortunately, not much is explained about it.

Visualization:

Semantic Modeling:


[top]

Hosting Web Communities

Citation: Figallo, Cliff: Hosting Web Communities: building relationships, increasing customer loyalty, and maintaining a competitive edge. John Wiley & Sons. New York, 1998.
Found via: Had it already...
Type of document: Book, 400 pages
Time spent with this document: 01-06-27: 15:00-15:30 = 0.5h (checked community definition)
Excerpts: relatedWork/books/HostingWebCommunities.html (does not exist, yet!)

Summary:

Relevance:

The first chapter, and possibly the second, are relevant for defining the term Web Communities. There is an extensive discussion and a definition that may be quite useful.


[top]

The Ontology Inference Layer OIL

Original URL: http://www.ontoknowledge.org/oil/TR/oil.long.html
Local URL: relatedWork/oil.long.html
Found via: Link from Ontology and Knowledge Representation Formalisms.
Type of document: Paper, distributed on several HTML-pages.
Time spent with this document: 01-06-21: 13:10-14:00 = 1h

Summary:

Relevance:

Semantic Modeling:


[top]

Ontology and Knowledge Representation Formalisms

Original URL: http://www.pms.informatik.uni-muenchen.de/mitarbeiter/ohlbach/Ontology/
Local URL: relatedWork/ohlbachOntology.html
Found via: Browsing the Teaching Unit Websites
Type of document: Collection of links with comments
Time spent with this document: 01-06-21: 13:00-13:10 = 0h

Summary:

Relevance:

Semantic Modeling:

A good entry point for research on Ontologies.


[top]

On the Definition of Semantic Network Semantics

Original URL: http://citeseer.nj.nec.com/analyti97definition.html
Note: this link goes to a database where this paper is stored. From there, several links to versions of the document can be followed.
Found via: Search Engine: www.google.de, keywords: Definition of Semantic Modeling - 5th hit.
Type of document: Page linking to different versions of paper
Time spent with this document: 01-06-16: 18:00-18:15 = 0.25h

Summary:

Relevance:

Semantic Modeling:

This is about semantic networks on a very theoretical level...


[top]

WebQuery: Searching and Visualizing the Web through Connectivity

Original URL: http://www.cgl.uwaterloo.ca/Projects/Vanish/webquery-1.html
Local URL: relatedWork/webquery-1.html
Found via: Search Engine: www.google.de, keywords: Web Communities Definition - 33rd hit.
Type of document: Paper as single HTML-document.
Time spent with this document: 01-06-16: 17:30-18:00 = 0.5h

Summary:

Relevance:

Visualization:

Discusses some visualization techniques for connections between websites.


[top]

Making a Semantic Web

Original URL: http://www.netcrucible.com/semantic.html
Local URL: relatedWork/semantic.html
Found via: Search Engine: www.google.de, keywords: Web Communities Definition - 28th hit.
Type of document: Paper in single HTML-document.
Time spent with this document: 01-06-16: 17:00-17:30 = 0.5h

Summary:

This is a very simple introduction, in example style, to some of the issues around the semantic web: Context-aware Links, Site Information, Collaborative Filtering, Collaborative Categorization, Annotations, Related Links, Corrections. There is also a section on "How does meta-data get there": On The Fly Text Parsing, Embedded in Page, User Published Files, Service Provided by Site, Batch Text Parsing (Crawler), Specialized Metadata Server, Generic Metadata Server. A further section on the architecture discusses issues with centralization and scaling of metadata. All in all, the article is quite interesting but completely non-scientific...

Relevance:

Semantic Modeling:


[top]

Virtual-Communities, Virtual Settlements & Cyber-Archaeology: A Theoretical Outline

Original URL: http://www.ascusc.org/jcmc/vol3/issue3/jones.html
Local URL: relatedWork/jones.html
Found via: Search Engine: www.google.de, keywords: Web Communities Definition - 23rd hit.
Type of document: Paper in a single HTML-document.
Time spent with this document: 01-06-16: 16:30-17:00 = 0.5h

Summary:

The paper's abstract:

If useful explanations are to be provided about the relationship between computer mediated communication (CMC) technologies and online behavior, then a longer-term perspective needs to be taken than the current focus of CMC researchers. This paper provides such a perspective by outlining in theoretical terms how a cyber-archaeology of virtual communities can be conducted. In archaeology, researchers focus on cultural artifacts. A similar focus on the cultural artifacts of virtual communities should be a focus for CMC researchers as these artifacts can provide an integrative framework for a community's life, be it virtual or real. It is proposed that CMC researchers pursue cyber-archaeology by systematically examining and modeling the framework for virtual community life provided by their cultural artifacts.

The systematic exploration of cyber-space via cyber-archaeology cannot proceed without adequate linguistic tools that allow for taxonomy. The first step in the creation of such a taxonomy is to distinguish between virtual communities and their cyber-place, the virtual settlement. The second, is to define and operationalize the term virtual settlement so that they can be systematically characterized and modeled. With this new terminology, it is possible to detail a cyber-archaeology where technological determinism is replaced with the notion of bounded hierarchies and material behavior. The theoretical outline will show how cultural artifacts can play a role in constraining the forms virtual settlements can sustain. The modeling of the boundaries of virtual settlements via cyber-archaeology should dramatically increase our understanding of communication in general.

In particular, the paper introduces as defining characteristics a minimum set of conditions consisting of:

  1. a minimum level of interactivity
  2. a variety of communicators
  3. a minimum level of sustained membership
  4. a virtual common-public-space where a significant portion of interactive group-CMCs occur

Relevance:

This is about "virtual communities" and can help a lot defining Web Communities. Also provides link to broader works on "communities" (non-technical but sociological), which may be interesting to look at.


[top]

Trawling the web for emerging cyber-communities

Original URL: http://www8.org/w8-papers/4a-search-mining/trawling/trawling.html
Local URL: relatedWork/trawling.html
Found via: Search Engine: www.google.de, keywords: Web Communities Definition - 6th hit.
Type of document: Paper as single HTML document.
Time spent with this document: 01-06-16: 15:30-16:00 = 0.5h

Summary:

The paper's abstract:

The web harbors a large number of communities -- groups of content-creators sharing a common interest -- each of which manifests itself as a set of interlinked web pages. Newsgroups and commercial web directories together contain of the order of 20000 such communities; our particular interest here is on emerging communities -- those that have little or no representation in such fora. The subject of this paper is the systematic enumeration of over 100,000 such emerging communities from a web crawl: we call our process trawling. We motivate a graph-theoretic approach to locating such communities, and describe the algorithms, and the algorithmic engineering necessary to find structures that subscribe to this notion, the challenges in handling such a huge data set, and the results of our experiment.

The term co-citation is introduced: "The main idea is that pages that are related are frequently referenced together." The authors state that Web communities are characterized by dense directed bipartite subgraphs.
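
To illustrate the co-citation idea, here is a toy sketch (my own illustration in Java; the paper's actual trawling algorithm for enumerating bipartite cores is far more involved) that counts how often pairs of pages are referenced together:

    import java.util.*;

    // Toy illustration of co-citation: two pages are co-cited whenever some
    // third page links to both, and a high co-citation count suggests they
    // are related. All data and names here are made up for the example.
    public class CoCitation {
        public static void main(String[] args) {
            // citing page -> pages it links to
            Map<String, List<String>> links = Map.of(
                "fan1.html", List.of("siteA", "siteB"),
                "fan2.html", List.of("siteA", "siteB"),
                "fan3.html", List.of("siteA", "siteC"));

            // for each unordered pair of cited pages, count common citing pages
            Map<String, Integer> coCitation = new TreeMap<>();
            for (List<String> cited : links.values()) {
                List<String> pages = new ArrayList<>(cited);
                Collections.sort(pages);
                for (int i = 0; i < pages.size(); i++)
                    for (int j = i + 1; j < pages.size(); j++)
                        coCitation.merge(pages.get(i) + " & " + pages.get(j),
                                         1, Integer::sum);
            }
            System.out.println(coCitation); // {siteA & siteB=2, siteA & siteC=1}
        }
    }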

Relevance:

This is mostly relevant for defining Web Communities, and also interesting because it says some things about what hyperlinks can mean. It also links to interesting prior work: Link Analysis and Information Foraging (the latter in particular about annotating websites with the type of information they provide).


[top]

Real-Time Geographic Visualization of World Wide Web Traffic

Original URL: http://www5conf.inria.fr/fich_html/papers/P49/Overview.html
Local URL:
Found via: Search Engine: www.google.de, keyword: xerox parc france visualization web - 4th hit. I was actually looking for Xerox Parc in Grenoble, but this also seems worthwhile!
Type of document: Paper in one HTML-page
Time spent with this document: 01-06-15: 10:30-10:45 = 0.5h

Summary:

Relevance:

Tracking:

Visualization:

Semantic Modeling:


[top]

WebMap - A Graphical Hypertext Navigation Tool

Original URL: http://www.tm.informatik.uni-frankfurt.de/~doemel/Papers/WWWFall94/www-fall94.html
Local URL:
Found via: Link from Cybergeography, section Surf Maps Visualising Web Browsing.
Type of document: Paper distributed on multiple pages
Time spent with this document: 01-06-14: 19:30-20:00 = 0.5h

Summary:

This is a tool from 1994 that communicated with Mosaic to track a single user's navigation behavior on the Web.

Relevance:

Tracking:

Visualization:


[top]

WebTraffic Visualization

Original URL: http://research3.gsd.harvard.edu/webtraffic/prototypes.htm
Local URL: relatedWork/prototypes.htm
Found via: Link from Cybergeography, section Three-Dimensional Information Spaces.
Type of document: Project home page with downloads
Time spent with this document: 01-06-14: 19:00-19:30 = 0.5h

Summary:

Project description: a prototype 3D visualization tool for on-line behavior. The goal of the project is to investigate the potential of 3D visualization to enhance the understanding of on-line behavior, thus allowing designers to adjust web-marketing strategies and design better websites. Contact: avisonneau@gsd.harvard.edu.

Relevance:

Visualization:

High. This is one way of visualizing the navigation behavior. However, the project is still in progress and thus may not help. Possibly, the people involved in the project should be contacted.


[top]

VISVIP: 3D Visualization of Paths through Web Sites

Citation: Cugini, J. and J. Scholtz: "VISVIP: 3D Visualization of Paths through Web Sites", Proceedings of the International Workshop on Web-Based Information Visualization (WebVis'99), pp. 259-263, (in conjunction with DEXA'99, Tenth International Workshop on Database and Expert Systems Applications, eds A.M. Tjoa, A. Cammelli, R.R. Wagner) Florence, Italy, September 1-3, 1999, IEEE Computer Society.
Original URL: http://www.itl.nist.gov/iaui/vvrg/cugini/webmet/visvip/webvis-paper.html
This is part of a project of the National Institute of Standards and Technology called NIST WebMetrics. The project homepage is located at
http://zing.ncsl.nist.gov/WebTools/
and provides a lot of useful information, in particular on WebVIP and FLUD.
Local URL: relatedWork/webvis-paper.html
Found via: Link from Cybergeography, section Three-Dimensional Information Spaces. This actually links to the VISVIP home page, from which the paper, downloads, etc. are linked.
Type of document: Paper on one HTML page
Time spent with this document: 01-06-14: 17:00-hh:mm = xh

Summary:

The paper's abstract:

VISVIP allows web site developers and usability engineers to visualize the paths taken through the site by the subjects of usability experiments. They can dynamically customize and simplify the graphical layout of the web site, and select which subjects' paths to view. An animated representation of progress along the path through the web site is also available. The third dimension of the 3D display is used to represent the time spent on each page visit. The graph layout provided by VISVIP can be governed by either the web site topology or the intrinsic structure of the subject's path.

Relevance:

Tracking:

High relevance: this tool uses another tool, the Web Variable Instrumenter Program (WebVIP), for collecting data on a user browsing a particular website consisting of multiple pages. This is done by augmenting the website's pages with JavaScript. The collected data is stored in the Framework for Logging Usability Data (FLUD), a mid-level format for storing user behavior on websites.

Visualization:

High relevance: this is a tool that is used for visualizing the path of a user on a website.


[top]

valence

Original URL: http://acg.media.mit.edu/people/fry/valence/
Local URL: relatedWork/fryValence.html
Found via: Link from Cybergeography, section Three-Dimensional Information Spaces.
Type of document: Project introduction, single web page
Time spent with this document: 01-06-14: 15:30-16:00 = 0.5h (read and summary written)

Summary:

Valence is a software experiment addressing the issue of visualizing dynamic data or very large datasets.

For this work, [Ben Fry is] employing behavioral methods and distributed systems which treat individual pieces of information as elements in an environment that produce a representation based on their interactions.
The most important information comes from providing context and setting up the interrelationships between elements of the data.
One application of the software was visualizing the user traffic on a website. The software built a self-evolving map of how people had been using the site, the layout of which was driven by traffic patterns instead of the site's original structure. In that, it was also a measure of how well the site had been constructed.

Ben Fry can be contacted via e-mail at fry@media.mit.edu.

Relevance:

Tracking:

Probably not relevant, as the data was presumably collected from web-server logs.

Visualization:

High relevance: this provides one way of visualizing the navigation behavior of a community of users.


[top]

The Order of Things: Activity-Centred Information Access

Citation: Chalmers, Matthew, Kerry Rodden & Dominique Brodbeck: The Order of Things: Activity-Centred Information Access. In: Proc. 7th Intl. Conf. on the World Wide Web, Brisbane, April 1998, pp. 359-367.
Original URL: http://www.dcs.gla.ac.uk/~matthew/papers/WWW7/www98.html
Matthew Chalmers' homepage, with information on Recer and Equator, which are probably also interesting related projects, is located at:
http://www.dcs.gla.ac.uk/~matthew/
Local URL: relatedWork/www98.html
Found via: Link from Cybergeography, section Information Space Maps.
Type of document: Paper as single HTML-page
Time spent with this document: 01-06-14: 15:00-15:30, 16:00-16:30 = 1h (read + summary)

Summary:

The paper's abstract:

This paper focuses on the representation and access of Web-based information, and how to make such a representation adapt to the activities or interests of individuals within a community of users. The heterogeneous mix of information on the Web restricts the coverage of traditional indexing techniques and so limits the power of search engines. In contrast to traditional methods, and in a way that extends collaborative filtering approaches, the path model centres representation on usage histories rather than content analysis. By putting activity at the centre of representation and not the periphery, the path model concentrates on the reader not the author and the browser not the site. We describe metrics of similarity based on the path model, and their application in a URL recommender tool and in visualising sets of URLs.

The paper suggests that tracking the information on the client/browser side has advantages over analyzing web logs, as related work using the latter method ran into problems. Systems that try to "understand" the content of pages also have limitations. Instead, the use of pages in a certain context should be exploited, to find out, for example, that other users looking for the same information liked a particular page. Collaborative Filtering is introduced as a method where relevance of information is defined for a specific group of people and ratings of information value are strictly subjective. The Path Model is introduced as a model taking into account not only pairwise links but the complete path of an activity such as browsing the web. The basic idea is that people who moved along similar paths in the past are likely to move along similar paths in the future, which allows predicting the path from a particular step, making recommendations of links to follow possible. The user's own path is also taken into account, allowing pages not recently viewed to be preferred - unlike in collaborative filtering, where the user's own path is not used. A Java tool recommending links is introduced as well as a tool for visualizing paths. The recommender's data is gathered by logging changes to the browser's history file. Two metrics of similarity of paths are explained. The visualization technique used in the project is extensively explained. Some of the given references may be interesting.
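
As a toy sketch of the path model's central idea (my own simplification in Java, not Chalmers et al.'s actual similarity metrics): pages that followed the current page in other users' paths become recommendation candidates.

    import java.util.*;

    // Simplified sketch of path-based recommendation: pages that followed
    // the current page in other users' paths are suggested as next steps.
    // The real path model uses richer similarity metrics over whole paths.
    public class PathRecommender {

        static List<String> recommend(List<List<String>> communityPaths,
                                      String currentPage) {
            List<String> candidates = new ArrayList<>();
            for (List<String> path : communityPaths)
                for (int i = 0; i < path.size() - 1; i++)
                    if (path.get(i).equals(currentPage)
                            && !candidates.contains(path.get(i + 1)))
                        candidates.add(path.get(i + 1)); // page visited next
            return candidates;
        }

        public static void main(String[] args) {
            List<List<String>> paths = List.of(          // toy usage histories
                List.of("home", "search", "paper1"),
                List.of("home", "search", "paper2"));
            System.out.println(recommend(paths, "search")); // [paper1, paper2]
        }
    }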

Relevance:

Tracking:

In that project, information was gathered by logging changes in the browser's history file. This may also be an option for this project.

Visualization:

High relevance: the visualization technique used in that project is extensively explained.

Semantic Modeling:

Low relevance: Using paths in a certain way provides semantic information. However, that information is only implicit and thus cannot really be extracted.


[top]

Graphviz - open source graph drawing software

Original URL: http://www.research.att.com/sw/tools/graphviz/
Local URL:
Found via: Sebastian Schaffert
Type of document: Product home page with links, downloads etc.
Time spent with this document: 01-06-12: 13:30-14:00 = 0.5h

Summary:

Graphviz is an open source tool to draw all kinds of graphs.

Relevance:

Visualization:

This might be used to visualize the collected data.


[top]

HotJava Browser Product Family

Original URL: http://java.sun.com/products/hotjava/
Local URL:
Found via: Search on Web Site: java.sun.com, keyword: HotJava - first two hits
Type of document: Product home page / description with links
Time spent with this document: 01-06-12: 13:15-13:30 = 0.25h

Summary:

HotJava Browser 3.0 is a Web Browser written in Java that supports JavaScript (Full ECMA 1.4 standard support) and has been updated with HTML rendering fidelity improvements, adding support for critical Netscape 4.0 and Internet Explorer 4.0 extensions to the W3C HTML 3.2 specification. Unfortunately, the HotJava HTML Component 1.1.2 is no longer available.

Relevance:

Tracking:

Unfortunately, the HotJava HTML Component is no longer available. There are plans to release the complete source code of HotJava 3.0, which would make this relevant for writing a custom enhanced browser.


[top]

NetClue - Java Web Browser

Original URL: http://www.netcluesoft.com/wbc/
Local URL:
Found via: Search Engine: www.google.de, keyword: web browser java components - first hit
Type of document: Commercial product web page with a couple of sub-pages
Time spent with this document: 01-06-12: 13:00-13:15 = 0.25h

Summary:

Java Web Browser is a set of components that can be used for displaying websites in one's own applications. A lot of features are implemented, including style sheets and JavaScript. An evaluation copy that works for 30 days can be downloaded.

Relevance:

Tracking:

This could be used for implementing a custom web browser with enhanced capabilities. There is an interface for intercepting hyperlinks, which could very well be used for tracking. However, this is an expensive commercial product, making it much less interesting - especially for a prototype. Possibly, it may become interesting when developing a "high-end product".


[top]

Web Proxy Servers

Citation: Luotonen, Ari: Web Proxy Servers. Prentice Hall, 1997.
Found via: Search Engine: www.google.de, keyword: proxy java - 20th hit: Geocrawler.com - mozilla-java - a JAVA proxy, a posting in some mailing list where someone asked about implementing a proxy in Java. The reply linked to this book.
Type of document: Book, 400 pages
Time spent with this document: 01-06-12: 12:50-13:00 = 0h (discovered and ordered from amazon)
Excerpts: relatedWork/books/WebProxyServers.html (does not exist, yet!)

Summary:

Relevance:

Tracking:

This is relevant for writing a custom proxy or modifying an existing one.


[top]

Muffin - World Wide Web Filtering System

Original URL: http://muffin.doit.org/
Local URL: relatedWork/muffin.html relatedWork/muffin-0.9.3a.tar.gz (Source, doc, etc.)
Found via: Search Engine: www.google.de, keyword: proxy java - 13th hit
Type of document: project home page with some links to downloads, doc, etc.
Time spent with this document: 01-06-12: 12:30-12:50 = 0.5h

Summary:

This is the project page of Muffin, an HTTP proxy with filtering capabilities. It is completely written in Java and freely available under the GNU General Public License. The proxy supports HTTP/0.9, HTTP/1.0, HTTP/1.1, and SSL (https). It is possible to write custom filters. There is also a very useful link to the related RFCs and specifications concerning proxies.

Relevance:

Tracking:

This could either be used directly with minor modifications, or it could serve as an example of how to implement a custom (much simpler) proxy in Java; a toy version of the latter is sketched below.
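
For a sense of how small such a proxy can be, here is a minimal sketch (my own toy code, not Muffin's; it assumes plain HTTP/1.0 GET requests and handles one connection at a time) that logs every requested URL - exactly the hook a tracking proxy needs:

    import java.io.*;
    import java.net.*;

    // Minimal sketch of a logging HTTP proxy (my own toy code, not Muffin's):
    // it accepts one connection at a time, logs the request line -- the data a
    // tracking proxy would collect -- and relays a plain HTTP/1.0 GET to the
    // origin server. A real proxy (see the RFCs linked from the Muffin page)
    // must handle headers, persistent connections, errors and concurrency.
    public class LoggingProxy {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(8080)) {
                while (true) {
                    try (Socket client = server.accept()) {
                        handle(client);
                    } catch (IOException e) {
                        System.err.println("request failed: " + e);
                    }
                }
            }
        }

        static void handle(Socket client) throws IOException {
            BufferedReader in = new BufferedReader(
                new InputStreamReader(client.getInputStream()));
            String requestLine = in.readLine();   // e.g. "GET http://host/x HTTP/1.0"
            if (requestLine == null) return;
            System.out.println("tracked: " + requestLine);   // the tracking hook
            URL url = new URL(requestLine.split(" ")[1]);
            String file = url.getFile().isEmpty() ? "/" : url.getFile();
            int port = url.getPort() == -1 ? 80 : url.getPort();
            try (Socket origin = new Socket(url.getHost(), port)) {
                Writer out = new OutputStreamWriter(origin.getOutputStream());
                out.write("GET " + file + " HTTP/1.0\r\n"
                        + "Host: " + url.getHost() + "\r\n\r\n");
                out.flush();
                origin.getInputStream().transferTo(client.getOutputStream());
            }
        }
    }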


[top]

Mapping Cyberspace

Citation: Dodge, Martin and Rob Kitchin: Mapping Cyberspace. Routledge, London and New York 2001.
Book Website: http://www.MappingCyberspace.com
Found via: Mentioned on Cyber Geography Research
Type of document: Book, 260 pages, 11 chapters
Time spent with this document: 01-06-09: 15:00-18:00 = 3h (created own page for books, scanned 1st chapter, wrote notes)
Excerpts: relatedWork/books/MappingCyberspace.html

Summary:

From the back of the book:

Space is central to our lives. Because of this, much attention is directed at understanding and explaining the geographic world. Mapping Cyberspace extends this analysis to provide a geographic exploration and critical reading of cyberspace and information and communication technologies. It draws together the findings and theories of researchers from geography, cartography, sociology, cultural studies, computer-mediated communications, information visualisation, literary theory and cognitive psychology. It is highly illustrated with 8 pages of colour plates and over 50 black and white figures.

Relevance:

Tracking:

Visualization:

Semantic Modeling:


[top]

Resource Description Framework (RDF) Model and Syntax Specification

Original URL: http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
Local URL: relatedWork/REC-rdf-syntax-19990222.html
Found via: Linked by Semantic Web Activity
Type of document: Single document (w3c recommendation).
Time spent with this document: 01-06-08: 18:00-19:00 = 1h

Summary:

Everything on the Web is machine-readable but not machine-understandable, making it hard to automate anything on the web. Enhancing the Web with metadata may be a solution. RDF and XML are complementary in that RDF provides a model of metadata while XML provides a syntax and covers encoding issues. RDF will have a class system like object-oriented programming and modeling systems. Collections of such classes are called schemas. Multiple inheritance will be allowed to mix definitions. The main influences on RDF are from the Web standardization community, the library community, the structured document community and the knowledge representation community. Furthermore, object-oriented programming and modeling as well as database technologies have contributed to the RDF design. RDF does not specify a mechanism for reasoning, but reasoning mechanisms could be built on top of RDF seen as a simple frame system. The data model of RDF consists of three basic data types: resources (any entity that can be referenced by a URI), properties (a specific aspect, characteristic, attribute, or relation used to describe a resource; it has a specific meaning and defines its permitted values, the types of resources it can describe, and its relationship with other properties) and statements (connecting resources with properties and property-values as subject, predicate, object; objects can be literals or other resources). A syntax for describing RDF models in XML is given; there is a serialization syntax and an abbreviated syntax. RDF defines three types of container objects which can be used to refer to a collection of resources: bag, sequence and alternative. Containers can also be defined by URI patterns, e.g. for saying "all pages of my website" without having to list each particular page. The difference between using multiple statements and using containers is that a container states that the elements it contains belong together in some way, while multiple statements do not carry that implication. Statements about statements (higher-order statements) are possible to allow reification. Finally, a formal model and grammar as well as examples are given.
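
A bare-bones rendering of the data model in Java (an illustration of the triple concept only, not a conforming RDF implementation; the subject and predicate URIs of the first statement are taken from the specification's introductory example, the rest is made up):

    import java.util.ArrayList;
    import java.util.List;

    // Illustration of the RDF data model: resources and properties are
    // identified by URIs; a statement is a (subject, predicate, object)
    // triple whose object may be a resource URI, a literal, or -- for
    // reification (statements about statements) -- another statement.
    public class RdfSketch {
        record Statement(String subject, String predicate, Object object) {}

        public static void main(String[] args) {
            List<Statement> model = new ArrayList<>();
            // The spec's introductory example: "Ora Lassila is the creator
            // of the resource http://www.w3.org/Home/Lassila".
            Statement s = new Statement(
                "http://www.w3.org/Home/Lassila",         // subject (resource)
                "http://description.org/schema/Creator",  // predicate (property)
                "Ora Lassila");                           // object (a literal)
            model.add(s);
            // A higher-order statement about the statement itself
            // (subject and property URIs are hypothetical):
            model.add(new Statement("http://example.org/~me",
                "http://example.org/schema/believes", s));
            model.forEach(System.out::println);
        }
    }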

Relevance:

Semantic Modeling:

High relevance: contains a lot of concepts that I may have to consider for my semantic model. However, right now it seems that using RDF in my context would be kind of overkill.


[top]

Semantic Web Activity

Original URL: http://www.w3.org/2001/sw/
Found via: Linked by The Semantic Web
Type of document: mostly a link-base
Time spent with this document: 01-06-08: 17:30-18:00 = 0.5h

Summary:

The home base of Semantic Web with a lot of useful links to documents (specifications, publications and presentations) as well as discussion groups.

Relevance:

Semantic Modeling:

High relevance. Links to a lot of resources that may be quite interesting.


[top]

W3C's Math Home Page

Original URL: http://www.w3.org/Math/
Local URL:
Found via: Search Engine: www.google.de, keyword: MathML - first hit
Type of document: single document, followed a few links
Time spent with this document: 01-06-08: 14:00-15:00 = 1h

Summary:

MathML is an XML-based language for describing mathematical content. MathML is not only capable of describing the format of formulas but also allows capturing their semantic content. There are currently 17 implementations of MathML available; interesting for the project may be Amaya, E-Lite and the MathML-enabled version of Mozilla.

Relevance:

If the thesis contains mathematical formulas, and it shall be formatted in XHTML, an XHTML viewer that supports MathML and is capable of printing nicely would be required.


[top]

The SHOE FAQ

Original URL: http://www.cs.umd.edu/projects/plus/SHOE/faq.html
Local URL: relatedWork/shoe_faq.html
Found via: Linked by The Semantic Web, Prof. Bry suggested taking a look.
Type of document: single document, FAQ-style - links to example ontologies
Time spent with this document: 01-06-07: 01:00-02:00 = 1h
01-06-08: 17:00-17:30 = 0.5h (wrote summary)

Summary:

Everything one needs to know about SHOE: "SHOE is an HTML-based knowledge representation language." It defines tags both for constructing ontologies and for annotating web pages. Examples are given for web searches that will not work with current search engines but will work with SHOE. Furthermore, SHOE could be used for giving users additional information while or before browsing websites and for providing information that agents could process. There is already a tool available, called Knowledge Annotator, which can be used to add SHOE tags to websites. SHOE ontologies declare: classifications, relationships between entities, inferences in the form of horn clauses, inheritance from other ontologies and versioning. HTML pages with embedded SHOE data may: declare arbitrary data entities, declare the ontologies that are used within them, categorize entities, declare relationships between entities or between entities and data. "SHOE allows n-ary relations, horn clause inference, simple inheritance in the form of classification, multi-valued relations, and a conjunctive knowledge base. It does not currently allow negation, disjunction, or arbitrary functions and predicates." SHOE is XML compliant and can thus be used well in XHTML using namespaces. A list of example ontologies defined in SHOE is given as well as a list of ontologies defined in other languages. Finally, a few questions about the Knowledge Annotator are answered.

Relevance:

Tracking:

Low relevance: This could be used to store the links between pages. However, it is probably not the most elegant solution for doing so.

Semantic Modeling:

High relevance: This could be used directly, possibly enhanced with ratings.


[top]

XML Linking: State of the Art

Original URL: http://www.sun.com/software/xml/developers/xlink.html
Local URL: relatedWork/xlink.html
Found via: Michael Kraus
Type of document: single document
Time spent with this document: 01-06-06: 00:00-00:30 = 0.5h

Summary:

Provides an easy introduction to XLink and XPointer with an extensive example.

Relevance:

This may be relevant for the implementation of a browser and for the data-structure used to store information. However, this is probably beyond the scope of the project.

Tracking:

Might be used to store data.

Visualization:

Might be used for browser.

Semantic Modeling:

Might be used to connect entities via hyperlinks.


[top]

Cyber Geography Research

Original URL: http://www.cybergeography.org/
Local URL: relatedWork/cybergeography/artistic.html
relatedWork/cybergeography/census.html
relatedWork/cybergeography/conceptual.html
relatedWork/cybergeography/geographic.html
relatedWork/cybergeography/info_landscapes.html (alright)
relatedWork/cybergeography/info_maps.html (good)
relatedWork/cybergeography/info_spaces.html (best)
relatedWork/cybergeography/topology.html
relatedWork/cybergeography/more_topology.html
relatedWork/cybergeography/surf.html (very good)
relatedWork/cybergeography/web_sites.html (good)
Found via: Jens Abraham via Michael Kraus
Type of document: domain with many subsections
Time spent with this document: 01-06-06: 21:00-24:00 = 3h
01-06-12: 15:00-16:00 = 1h (summary / relevance for Atlas of Cyberspace)

Summary:

[Note: The book Mapping Cyberspace is advertised on the page. I ordered that book as well as Mapping Websites, Readings in Information Visualization, Envisioning Information and The Humane Interface. These books will serve as further reading for the project!]

A lot of useful material about visualization!!!

Relevance:

This is a very good starting point for research!

Tracking:

High relevance.

Visualization:

High relevance. The objective of the domain is to provide a topological map of the web, which is exactly one part of what I am planning to do. On Info Spaces, a few traffic-analyzing projects are mentioned and a few interesting visualizations thereof are shown. In Info Maps, there is also a link to a project that tracked the navigation behavior of a few people (look for Matthew Chalmers), and there are also some links to site maps and the like. On Surf Maps, a few tools for mapping the trails of individual users are presented with screenshots. In the section Info Landscapes, a few 3D visualizations of websites are shown. On More Topology, there is an interesting link to the HyperSpace World-Wide Web Visualiser and a link to WebBrain, which may also be interesting. On Web Sites, there is a link to the User Interface Research Group of Xerox Parc in Palo Alto.


[top]

Markup Languages and Ontologies

Original URL: http://www.semanticweb.org/knowmarkup.html
Local URL: relatedWork/knowmarkup.html
Found via: Linked by The Semantic Web
Type of document: 3 sequential documents
Time spent with this document: 01-06-06: 20:00-20:30 = 0.5h

Summary:

XML in itself provides no formalism for adding semantics to its tags. The tag names may help humans but are mostly useless for computers. Exchanging information in well-defined formats helps in specific domains, but even within such domains there are often different formats with different semantics, making conversions necessary. This can be handled by ontologies defining how the terms used in the specifications are related to each other. Definition: An ontology is a specification of a conceptualization. Ontologies are used to establish a common terminology for a community of interest (e.g. agents). Examples of representation languages and systems based on first-order logic are given. A better approach is using a standard syntax like XML; examples of languages based on XML are: SHOE, Ontology Exchange Language (XOL), Ontology Markup Language (OML and CKML), Resource Description Framework Schema Language (RDFS), and Riboweb (all examples with links to pages with further information). A set of ontology editors is introduced. Inference engines are required to process the knowledge stored on the Semantic Web. Higher-order logics are most expressive but do not have nice computational properties. Distinguishing between higher-order semantics and higher-order syntax helps a bit. Even full first-order logic is not suitable in the domain of the Semantic Web, though, because it won't scale. Thus, using subsets of FOL with nice properties is recommended. Further approaches are discussed...
A set of links to Semantic Web companies is given.

Relevance:

In particular, this document provides a lot of links to other resources and can thus be used for a lot of further research about the Semantic Web and potentially also visualization.

Visualization:

Indirectly: there is at least one link to a company that also does visualisation of search results (those companies weren't that interesting, though).

Semantic Modeling:

Very high: discusses problems concerning the processing of the generated data and potential solutions.


[top]

The Semantic Web

Original URL: http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html
Local URL: relatedWork/0501berners-lee.html
Found via: Article in a Slashdot news-update
Type of document: Single document
Time spent with this document: 01-06-06: 19:00-20:00 = 1h

Summary:

The term Semantic Web is explained extensively. In the near future, semantically enhanced websites will be used by agents to help people arrange their lives. This works because the "semantic enhancements" are machine-processable. The article argues that knowledge representation must be distributed instead of kept central, and that unanswerable questions and paradoxes must be accepted in order to allow a broad knowledge base. The main technologies behind the Semantic Web will be XML and RDF (Resource Description Framework). Within RDF, knowledge is represented through triples of URIs which name two objects and how they are connected. Ontologies, consisting of taxonomies and inference rules, are used to solve the problem of different terms having equivalent semantics by allowing the agents to deduce information. Searches on the web can thus be improved by looking for precise concepts instead of sets of keywords. Finally, the Semantic Web unifying language is mentioned, which will allow agents to exchange information, in particular proofs of how valid the information they have retrieved is.

Relevance:

Semantic Modeling:

Very high: this article proposes using XML and RDF to store the semantic data and explains the technologies on an abstract level. Ontologies in such a format could be used to define types of web-pages, links and user-groups.


Last modified: Saturday October 13 2001
by Holger (David) Wagner