semantic website

Friday, 9 October 2009

About Wolfram|Alpha (Semantic search)

Goals
Wolfram|Alpha's long-term goal is to make all systematic knowledge immediately computable and accessible to everyone. We aim to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything. Our goal is to build on the achievements of science and other systematizations of knowledge to provide a single source that can be relied on by everyone for definitive answers to factual queries.

Wolfram|Alpha aims to bring expert-level knowledge and capabilities to the broadest possible range of people—spanning all professions and education levels. Our goal is to accept completely free-form input, and to serve as a knowledge engine that generates powerful results and presents them with maximum clarity.

Wolfram|Alpha is an ambitious, long-term intellectual endeavor that we intend will deliver increasing capabilities over the years and decades to come. With a world-class team and participation from top outside experts in countless fields, our goal is to create something that will stand as a major milestone of 21st century intellectual achievement.

Status
That it should be possible to build Wolfram|Alpha as it exists today in the first decade of the 21st century was far from obvious. And yet there is much more to come.

As of now, Wolfram|Alpha contains 10+ trillion pieces of data, 50,000+ types of algorithms and models, and linguistic capabilities for 1000+ domains. Built with Mathematica—which is itself the result of more than 20 years of development at Wolfram Research—Wolfram|Alpha's core code base now exceeds 5 million lines of symbolic Mathematica code. Running on supercomputer-class compute clusters, Wolfram|Alpha makes extensive use of the latest generation of web and parallel computing technologies, including webMathematica and gridMathematica.

Wolfram|Alpha's knowledge base and capabilities already span a great many domains, and its underlying framework has the power and flexibility to support ready extension to essentially any domain that is based on systematic knowledge. More »

The universe of potentially computable knowledge is, however, almost endless, and in creating Wolfram|Alpha as it is today, we needed to start somewhere. Our approach so far has been to emphasize domains where computation has traditionally had a more significant role. As we have developed Wolfram|Alpha, we have in effect been systematically covering the content areas of reference libraries and handbooks. In going forward, we plan broader and deeper coverage, both of traditionally scientific, technical, economic, and otherwise quantitative knowledge, and of more everyday, popular, and cultural knowledge.

Wolfram|Alpha's ability to understand free-form input is based on algorithms that are informed by our analysis of linguistic usage in large volumes of material on the web and elsewhere. As the usage of Wolfram|Alpha grows, we will capture a whole new level of linguistic data, which will allow us to greatly enhance Wolfram|Alpha's linguistic capabilities.

Today's Wolfram|Alpha is just the beginning. We have ambitious plans, for data, for computation, for linguistics, for presentation, and more. As we go forward, we'll be discussing what we're doing on the Wolfram|Alpha Blog, and we encourage suggestions and participation, especially through the Wolfram|Alpha Community.

Less »

Future
Wolfram|Alpha, as it exists today, is just the beginning. We have both short- and long-term plans to dramatically expand all aspects of Wolfram|Alpha, broadening and deepening our data, our computation, our linguistics, our presentation, and more.

Wolfram|Alpha is built on solid foundations. And as we go forward, we see more and more that can be made computable using the basic paradigms of Wolfram|Alpha—and a faster and faster path for development as we leverage the broad capabilities already in place.

Wolfram|Alpha was made possible in part by the achievements of Mathematica and A New Kind of Science (NKS). In their different ways, both of these point to far-reaching future opportunities for Wolfram|Alpha—whether a radically new kind of programming or the systematic automation of invention and discovery.

Wolfram|Alpha is being introduced first in the form of the wolframalpha.com website. But Wolfram|Alpha is really a technology and a platform that can be used and presented in many different ways. Among short-term plans are developer APIs, professional and corporate versions, custom versions for internal data, connections with other forms of content, and deployment on emerging mobile and other platforms.

History & Background
The quest to make knowledge computable has a long and distinguished history. Indeed, when computers were first imagined, it was almost taken for granted that they would eventually have the kinds of question-answering capabilities that we now begin to see in Wolfram|Alpha. See timeline »

What has now made Wolfram|Alpha possible today is a somewhat unique set of circumstances—and the singular vision of Stephen Wolfram.

For the first time in history, we have computers that are powerful enough to support the capabilities of Wolfram|Alpha, and we have the web as a broad-based means of delivery. But this technology alone was not enough to make Wolfram|Alpha possible.

What was needed were also two developments that have been driven by Stephen Wolfram over the course of nearly 30 years. More »

The first was Mathematica—the system in which all of Wolfram|Alpha is implemented. Mathematica has three crucial roles in Wolfram|Alpha. First, its very general symbolic language provides the framework in which all the diverse knowledge of Wolfram|Alpha is represented, and all its capabilities are implemented. Second, Mathematica's vast web of built-in algorithms provides the computational foundation that makes it even conceivably practical to implement the methods and models of so many fields. And finally, the strength of Mathematica as a software engineering and deployment platform makes it possible to take the technical achievements of Wolfram|Alpha and deliver them broadly and robustly.

Beyond Mathematica, another key to Wolfram|Alpha was NKS. Many specific ideas from NKS—particularly related to algorithms discovered by exploring the computational universe—are used in the implementation of Wolfram|Alpha. But still more important is that the very paradigm of NKS was crucial in imagining that Wolfram|Alpha might be possible.

Wolfram|Alpha represents a substantial technical and intellectual achievement. But to build it required not just unique technology and ideas, but also the experience of 20 years of long-term R&D and ongoing development of robust technology at Wolfram Research. Wolfram|Alpha's world-class team draws from many fields and disciplines, and has unique access to experts across the globe. But what ultimately made Wolfram|Alpha possible was a singular commitment to the goal of making all the world's systematic knowledge computable.

Sunday, 4 January 2009

Ontology

Ontologies are considered as one of the pillars of the Semantic Web, although they do not have a universally accepted definition. A (Semantic Web) vocabulary can be considered as a special form of (usually light-weight) ontology, or sometimes also merely as a collection of URIs with an (usually informally) described meaning.

Ontologies on semanticweb.org are usually assumed to be accompanied by some document in a formal ontology language, though some ontologies do not use standardised formats for that purpose. A list of documents that are considered ontologies in this wiki is given below.

Definitions

There have been several definitions of what an ontology is.

An ontology is a formal specification of a shared conceptualization (Tom Gruber, see [1]). Many authors cite this definition.
The main thread of ontology in the philosophical sense is the study of entities and their relations. The question ontology asks is: What kinds of things exist or can exist in the world, and what manner of relations can those things have to each other? Ontology is less concerned with what is than with what is possible. Clay Shirky in his critical entry Ontology is overrated [2]. He says this is the first meaning. AI people took the word and rather define it as "an explicit specification of a conceptualization.", pointing to Gruber.
...

(If you add another one, please always cite the source as well)

Ontologies on semanticweb.org

The following vocabularies and ontologies are currently described on semanticweb.org, ordered by the occurrence on the web as estimated by Swoogle. See also the UMBC Top 100 of common namespaces. As the revision dates given here indicate, some of the below data may require update: feel free to contribute!

Semantic Web Activity: Advanced Development

"Now, miraculously, we have the Web. For the documents in our lives, everything is simple and smooth. But for data, we are still pre-Web." -- Tim Berners-Lee, Business Model for the Semantic Web

"The bane of my existence is doing things that I know the computer could do for me." -- Dan Connolly, The XML Revolution

Just as the early development of the Web depended on code modules such as libwww, W3C is devoting resources to the creation and distribution of similar core components that will form the basis for the Semantic Web. Our approach is Live Early Adoption and Demonstration (LEAD) -- using these tools in our own work.

If you're doing related work, please let us know!

Schedule Coordination and Dependency Tracking

The information and communication intensive environment of the W3C provides challenges for effective scheduling. Tools that facilitate and automate the process of calendaring and schedule management are increasingly helpful towards a more effective, collaborative environment.

TR Publication process Paper Trail, in progress Nov 2002
Working Group Home Page Markup in progress Feb 2003
contributes to
- Calendar View of Current and Future Deliverables
W3C Roadmap Diagrams which illustrate, broadly, dependencies between W3C activities based on RDF data representing these entities and their relationships. We expect to integrate a number of related projects with these diagrams; see
- W3C operational projects related to coordination and activity tracking, Dec 2001
Site Summaries from XHTML to RSS using XSLT, Aug 2000 announcement 11 Sep 2000
W3C At A Glance prototype May 2002
RDF Calendar workspace, www-rdf-calendar list archive, RdfCalendar wiki topic
- Leigh Dodds' XML.com report on the RDF Calendar task force
- ILRT's Calendaring Work: notes/ demo of Jan 2001
- Relevant selected rdf-interest posts: Jan 2000, more

Less mature bits include

Palmagent, integrating palmpilot data with RDF and HTTP
- Palm Pilot datebook notes in progress Sep 2000
Tools for supporting Semantic Web enabled meeting records announced May 25, 2001
An event-based model of Internet RFC Publication by Dan Connolly, April 2001
An early draft of an event-based model of the W3C process, including the state of the W3C Semantic Web Activity and its surroundings as of Feb 2001
RDF calendar experiment for SWWS, in progress Jul 2001
a travel itinerary, from n3 to RDF via python, from RDF to XHTML via XSLT; Makefile; includes a sort of RDF normalizer in XSLT; more travel itinerary automation tools
index of appearances with an RDF model for conferences/events, presentations, etc.

Metadata: Photos, Music, Documents, ...

Describing and retrieving photos using RDF and HTTP, W3C Note, 28 September 2000; demo site, photo metadata editor
Dublin Core Extraction Service, announcement of Jun 09, 2000
Site Summaries from XHTML to RSS using XSLT, Aug 2000 announcement 11 Sep 2000
Me Llamo -- an RDF Schema (in progress Sep 2000)
extracting RDF from RFC822 formatted email, announced, Jul 14, 2000

Annotation, Collaboration and Automated Knowledge Access

The web provides the capability for anyone to say anything about anything. Knowing who is making these assertions, is increasingly important in trusting these descriptions. The Annotea advanced development project provides the basis for asserting descriptive information, comments, notes, reviews, explanations, or other types of external remarks to any resource. Together with XML digital signatures, the Annotea project will provide a test-bed for 'web-of-trust' semantic web applications.

Real-time teleconferences are an integral part of the W3C environment, in addition to sharing documents in the Web and engaging in e-mail discussions. W3C augments voice teleconferences with simultaneous keyboard-based (irc) communication, for the purpose of facilitating the flow of the meeting, sharing URIs for items under discussion, and keeping meeting records. Meeting preparation, meeting facilitation, and meeting recording present opportunities for capturing data for the Semantic Web that is useful to workflow analysis. The Zakim and RRSAgent teleconference irc agents are SWAD tools that give the teleconference audio system a presence in the meeting irc co-channel and record the progress of the meeting. Zakim provides control over the audio system as well as meeting chair tools for agenda management, speaker (floor) control, and time management. Zakim and RRSAgent together capture meeting data and make it available in RDF/XML for other analysis tools.

Access Control Rules, Logic, and Proof

Resources that are maintained on a Web server may be protected by descriptive rules that express authority to access the document based upon properties of the document in addition to properties of the requester. Examples of this to date include W3C's RDF Access Control mechanisms for supporting team, member and global accessibility. Further descriptive rules are anticipated to support richer access control functions.

Proof that a meeting can occur at which the resources required to reach a decision are able to be present will depend on the ability to identify all the resources; including personnel, meeting facilities (room, teleconference system), and prerequisite documents. Any participant can use this proof to synchronize independent databases including personal planners. Proofs that a meeting took place at which all prerequisites were met and a decision was taken, become messages that state, for example, that a document progressed from Working Draft to Last Call Working Draft.

cwm is a general-purpose data processor for the semantic web, somewhat like sed, awk, etc. for text files or an XSLT processor. See Semantic Web Tutorial Using N3 which starts with the N3 Primer
Logic on the Web, Proof carrying authentication, design note from 1998
An overview of the W3C Access Control System, Jun 2001
An RDF model of popular protocols in N3 in progress June 2001
toward P3P/APPEL rules in RDF/N3: 5 Dec 2001 to p3p-dev
Perl RDF library with to support Acccess Management, Jun 2000
Rules for RDF Standard Terms (in progress Aug 2000)
KIF as an RDF Schema (in progress Aug 2000)
An RDF Model for GET/PUT and Document Management (in in progress Aug 2000)
Converting SHOE to RDF, inference rules discussion, 2000 Jul 15, first announced 2000 May 11

MIT/LCS DAML Project: Semantic Web Development

MIT/LCS has funding for a proposal, Semantic Web Development, under the DARPA Agent Markup Language (DAML) program.

Progress reports include:

Semantic Web Development: Intent of Work Nov, 2002
Semantic Web Development: Intent of Work Feb, 2002
2001 Project Summary Semantic Web Development MIT
Semantic Web Development Intent of Work 23 March, 2001
Semantic Web Development, proposal

The work is bring done in close connection with the W3C and may lead to W3C activities in the future.

Annotated DAML Ontology Markup draft Oct 2000
An Agent Markup Language draft Aug 2000
Using XML Schema Datatypes in RDF and DAML+OIL proposal Jan 2001

Language group resources

MIT/LCS Researchers

Tim Berners-Lee, LCS/W3C
Dan Connolly, LCS/W3C
Sandro Hawke, LCS/W3C
David P. Karger, LCS/TOC
Lynn Andrea Stein, LCS & AI
Ralph R. Swick, LCS/W3C
Eric Prud'hommeaux, LCS/W3C
Daniel J. Weitzner, LCS/W3C

Earlier work includes tools, integration with XML infrastructure, plans, talks, etc.

Semantic Web Advanced Development, May 2002, Hawaii
30 Jul - 1 August: International Semantic Web Working Symposium (SWWS) -- Infrastructure and Applications for the Semantic Web
- RDF calendar experiment for SWWS
RDF Syntax: An XML Schema/XSLT Approach, in progress May 2001
earlier (Aug 2000) work: RDF Syntax: An XML Schema Approach
A DAML+OIL model of the XML Information Set (in progress May 2001)
Circles and arrows diagrams using stylesheet rules
9 Feb 2001: W3C launches Semantic Web Activity, including advanced development.
Using XML Schema Datatypes in RDF and DAML+OIL proposal Jan 2001
Semantic Web, Tim Berners-Lee's XML2000 keynote slides, Dec 6, 2000, see also: technetcast audio/video
HyperRDF: Using XHTML Authoring Tools with XSLT to produce RDF Schemas (in progress Jul 2000)
Xlink to RDF mapping in XSLT, announced, Jun 27, 2000
RDF parser in XSLT, discussion of 2000 May 02
XSLT for screen-scraping RDF out of real-world data Mar 2000, Dan Connolly
RDF Perllib, a perl RDF library with for supporting RDF applications.
an index of appearances by Dan Connolly, with an RDF model for conferences/events, presentations, etc.
swad-chart (in SVG, PNG, ps, RDF/xml)
technical details: Makefile, swad-chart.n3
Weaving the Web
A roadmap to the Semantic Web, Sep 1998, Tim Berners-Lee

reference : http://www.w3.org/2000/01/sw/

Thursday, 1 January 2009

neon toolkit

NeOn is a 14.7 million Euros project involving 14 European partners and co-funded by the European Commission’s Sixth Framework Programme under grant number IST-2005-027595. NeOn started in March 2006 and has a duration of 4 years. Our aim is to advance the state of the art in using ontologies for large-scale semantic applications in the distributed organizations. Particularly, we aim at improving the capability to handle multiple networked ontologies that exist in a particular context, are created collaboratively, and might be highly dynamic and constantly evolving.

This project has not yet categorized itself in the Trove Software Map.

Project Type: Software
Registered: 2007-05-18 14:40
Activity Percentile: 31.25%
View project Statistics or Activity.
View list of RSS feeds available for this project

Tools Used by Project: 0

Developer Info
Project Admins: Michael Erdmann Peter Haase Developers: Holger Lewen Qiu JI Yiorgos Trimponias [View Members]

Latest File Releases

Package	Version	Date	Notes / Monitor	Download
neon-toolkit.sources	NeOn Toolkit Open Source v1.2.1 2008-11-24	November 24, 2008	-	Download
documentation	Sample Plugin	August 23, 2008	-	Download
neon-toolkit.binaries	NeOn Toolkit Binaries 1.2.0 B801 2008-11-24	November 24, 2008	-	Download

[View All Project Files]

TextToOnto, a framework for ontology learning from text.

Publication: Philipp Cimiano, Johanna Völker. Text2Onto - A Framework for Ontology Learning and Data-driven Change Discovery. In Andres Montoyo, Rafael Munoz, Elisabeth Metais, Proceedings of the 10th International Conference on Applications of Natural Language to Information Systems (NLDB), volume 3513 of Lecture Notes in Computer Science, pp. 227-238. Springer, Alicante, Spain, June 2005.

In addition to the standalone version of Text2Onto's graphical user interface, an Eclipse plugin has been published lately, which enables a tighter integration of Text2Onto with the editing and maintenance facilities of the NeOn Toolkit ontology engineering environment.

Development Status: 4 - Beta
Environment: Other Environment
Intended Audience: End Users/Desktop
License: GNU Lesser General Public License (LGPL)
Natural Language: English
Operating System: OS Independent
Programming Language: Java
Topic: Knowledge acquisition

Project Type: Software
Registered: 2004-07-28 17:09
Activity Percentile: 56.25%
View project Statistics or Activity.
View list of RSS feeds available for this project

Tools Used by Project: 0

Developer Info
Project Admins: Johanna Völker Philipp Cimiano Developers: Andreas Hotho Hua Gao Peter Haase Yimin Wang York Sure Yuangang Sheng [View Members]

Latest File Releases

Package	Version	Date	Notes / Monitor	Download
text2onto-core	text2onto-080807-src	August 7, 2008	-	Download
text2onto-plugin	text2onto-plugin-081114	November 14, 2008	-	Download

[View All Project Files]

reference : http://ontoware.org

PROTON ONTOLOGY

This is the home page of the PROTON Ontology (PROTo ONtology). It has been developed in the scope of the SEKT Project. Some important URL-s, relating to PROTON, follow below.

A New Release of PROTON! The PROTON Knowledge Management ontology is the new, fourth module, already part of PROTON. This ontology module is actually the SEKT-specific domain ontology (former SKULO - SEKT Knowledge Management Upper Level Ontology), that has been integrated in PROTON for easier management and consistency. It is only dependent on the System and Top modules and not on the Upper module. The namespaces of PROTON are different, as shown below (check the old ones here).

PROTON is split into 4 modules, which can be accessed as follows:

PROTON System module: http://proton.semanticweb.org/2005/04/protons
PROTON Top module: http://proton.semanticweb.org/2005/04/protont
PROTON Upper module: http://proton.semanticweb.org/2005/04/protonu
PROTON Knowledge Management module: http://proton.semanticweb.org/2005/04/protonkm

Updated Documentation of PROTON! The latest version of the PROTON Guidance ("D1.8.1 Base Upper-level Ontology (BULO) Guidance" - a deliverable document within the SEKT project ; its title contains the initial name that was given to the ontology (BULO), which was later changed to PROTON) is available at:

http://proton.semanticweb.org/D1_8_1.pdf

Note: The current version of the guidance will be duly updated with information, related to the new PROTON module, not later than March 15, 2005.

A PROTON .ppt presentation is available at:

http://proton.semanticweb.org/PROTON.ppt

PROTON is a development of the KIMO ontology, which had been created and used in the scope of the KIM platform for semantic annotation, indexing, and retrieval. A part of the KIMO ontology (the predecessor of PROTON) was specific to the KIM Platform, therefore all this information was taken out of the PROTON ontology and it was organized into two separate, KIM-specific modules, available as follows:

KIM System Ontology:
http://www.ontotext.com/kim/2004/12/kimso

KIM Lexical Ontology:
http://www.ontotext.com/kim/2004/12/kimlo

Mappings of the KIMO URIs to the PROTON URIs: the local names of almost all the classes and properties are preserved and the mappings relate old (KIMO) URIs to the corresponding modules of PROTON.

http://proton.semanticweb.org/mapping-classes.txt

http://proton.semanticweb.org/mapping-properties.txt
reference : http://semanticweb.org

Semantic Web (WikiPedia)

he Semantic Web is an evolving extension of the World Wide Web in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content.^[1]^[2] It derives from World Wide Web Consortium director Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange.^[3]

At its core, the semantic web comprises a set of design principles,^[4] collaborative working groups, and a variety of enabling technologies. Some elements of the semantic web are expressed as prospective future possibilities that are yet to be implemented or realized.^[2] Other elements of the semantic web are expressed in formal specifications.^[5] Some of these include Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.

Purpose

Humans are capable of using the Web to carry out tasks such as finding the Finnish word for "monkey", reserving a library book, and searching for a low price on a DVD. However, a computer cannot accomplish the same tasks without human direction because web pages are designed to be read by people, not machines. The semantic web is a vision of information that is understandable by computers, so that they can perform more of the tedious work involved in finding, sharing and combining information on the web.

Tim Berners-Lee originally expressed the vision of the semantic web as follows:^[6]

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.

– Tim Berners-Lee, 1999

Semantic publishing will benefit greatly from the semantic web. In particular, the semantic web is expected to revolutionize scientific publishing, such as real-time publishing and sharing of experimental data on the Internet. This simple but radical idea is now being explored by W3C HCLS group's Scientific Publishing Task Force.

Tim Berners-Lee has described the semantic web as a component of Web 3.0. ^[7]

Relationship to the hypertext web

Limitations of HTML

Many files on a typical computer can be loosely divided into documents and data. Documents like mail messages, reports, and brochures are read by humans. Data, like calendars, addressbooks, playlists, and spreadsheets are presented using an application program which lets them be viewed, searched and combined in many ways.

Currently, the World Wide Web is based mainly on documents written in Hypertext Markup Language (HTML), a markup convention that is used for coding a body of text interspersed with multimedia objects such as images and interactive forms. Metadata tags, for example

provide a method by which computers can categorise the content of web pages.

With HTML and a tool to render it (perhaps web browser software, perhaps another user agent), one can create and present a page that lists items for sale. The HTML of this catalog page can make simple, document-level assertions such as "this document's title is 'Widget Superstore'". But there is no capability within the HTML itself to assert unambiguously that, for example, item number X586172 is an Acme Gizmo with a retail price of €199, or that it is a consumer product. Rather, HTML can only say that the span of text "X586172" is something that should be positioned near "Acme Gizmo" and "€ 199", etc. There is no way to say "this is a catalog" or even to establish that "Acme Gizmo" is a kind of title or that "€ 199" is a price. There is also no way to express that these pieces of information are bound together in describing a discrete item, distinct from other items perhaps listed on the page.

Semantic HTML refers to the traditional HTML practice of markup following intention, rather than specifying layout details directly. For example, the use of denoting "emphasis" rather than , which specifies italics. Layout details are left up to the browser, in combination with Cascading Style Sheets. But this practice falls short of specifying the semantics of objects such as items for sale or prices.

Microformats represent unofficial attempts to extend HTML syntax to create machine-readable semantic markup about objects such as retail stores and items for sale.

Semantic Web solutions

The Semantic Web takes the solution further. It involves publishing in languages specifically designed for data: Resource Description Framework (RDF), Web Ontology Language (OWL), and Extensible Markup Language (XML). HTML describes documents and the links between them. RDF, OWL, and XML, by contrast, can describe arbitrary things such as people, meetings, or airplane parts. Tim Berners-Lee calls the resulting network of Linked Data the Giant Global Graph, in contrast to the HTML-based World Wide Web.

These technologies are combined in order to provide descriptions that supplement or replace the content of Web documents. Thus, content may manifest as descriptive data stored in Web-accessible databases, or as markup within documents (particularly, in Extensible HTML (XHTML) interspersed with XML, or, more often, purely in XML, with layout/rendering cues stored separately). The machine-readable descriptions enable content managers to add meaning to the content, i.e. to describe the structure of the knowledge we have about that content. In this way, a machine can process knowledge itself, instead of text, using processes similar to human deductive reasoning and inference, thereby obtaining more meaningful results and facilitating automated information gathering and research by computers.

An example of a tag that would be used in a non-semantic web page:

cat

Encoding similar information in a semantic web page might look like this:

Cat

Relationship to object oriented programming

A number of authors highlight the similarities which the Semantic Web shares with object-oriented programming (OOP).^[8]^[9] Both the semantic web and object-oriented programming have classes with attributes and the concept of instances or objects. Linked Data uses Dereferenceable Uniform Resource Identifiers in a manner similar to the common programming concept of pointers or "object identifiers" in OOP. Dereferenceable URIs can thus be used to access "data by reference". The Unified Modeling Language is designed to communicate about object-oriented systems, and can thus be used for both object-oriented programming and semantic web development.

When the web was first being created in the late 1980s and early 1990s, it was done using object-oriented programming languages^{[citation needed]} such as Objective-C, Smalltalk and CORBA. In the mid-1990s this development practise was furthered with the announcement of the Enterprise Objects Framework, Portable Distributed Objects and WebObjects all by NeXT, in addition to the Component Object Model released by Microsoft. XML was then released in 1998, and RDF a year after in 1999.

Similarity to object oriented programming also came from two other routes: the first was the development of the very knowledge-centric "Hyperdocument" systems by Douglas Engelbart ^[10], and the second comes from the usage and development of the Hypertext Transfer Protocol.^[11]^{[clarification needed]}

Skeptical reactions

Practical feasibility

Critics question the basic feasibility of a complete or even partial fulfillment of the semantic web. Some develop their critique from the perspective of human behavior and personal preferences, which ostensibly diminish the likelihood of its fulfillment (see e.g., metacrap). Other commentators object that there are limitations that stem from the current state of software engineering itself (see e.g., Leaky abstraction).

Where semantic web technologies have found a greater degree of practical adoption, it has tended to be among core specialized communities and organizations for intra-company projects.^[12] The practical constraints toward adoption have appeared less challenging where domain and scope is more limited than that of the general public and the World-Wide Web.^[12]

An unrealized idea

The original 2001 Scientific American article by Berners-Lee described an expected evolution of the existing Web to a Semantic Web.^[13] Such an evolution has yet to occur. Indeed, a more recent article from Berners-Lee and colleagues stated that: "This simple idea, however, remains largely unrealized."^[14]

Censorship and privacy

Enthusiasm about the semantic web could be tempered by concerns regarding censorship and privacy. For instance, text-analyzing techniques can now be easily bypassed by using other words, metaphors for instance, or by using images in place of words. An advanced implementation of the semantic web would make it much easier for governments to control the viewing and creation of online information, as this information would be much easier for an automated content-blocking machine to understand. In addition, the issue has also been raised that, with the use of FOAF files and geo location meta-data, there would be very little anonymity associated with the authorship of articles on things such as a personal blog.

Doubling output formats

Another criticism of the semantic web is that it would be much more time-consuming to create and publish content because there would need to be two formats for one piece of data: one for human viewing and one for machines. However, many web applications in development are addressing this issue by creating a machine-readable format upon the publishing of data or the request of a machine for such data. The development of microformats has been one reaction to this kind of criticism.

Specifications such as eRDF and RDFa allow arbitrary RDF data to be embedded in HTML pages. The GRDDL (Gleaning Resource Descriptions from Dialects of Language) mechanism allows existing material (including microformats) to be automatically interpreted as RDF, so publishers only need to use a single format, such as HTML.

Need

The idea of a 'semantic web' necessarily coming from some marking code other than simple HTML is built on the assumption that it is not possible for a machine to appropriately interpret code based on nothing but the order relationships of letters and words. If this is not true, then it may be possible to build a 'semantic web' on HTML alone, making a specially built 'semantic web' coding system unnecessary.

There are latent dynamic network models that can, under certain conditions, be 'trained' to appropriately 'learn' meaning based on order data, in the process 'learning' relationships with order (a kind of rudimentary working grammar). See for example latent semantic analysis.

Components

The Semantic Web Stack.

The semantic web comprises the standards and tools of XML, XML Schema, RDF, RDF Schema and OWL that are organized in the Semantic Web Stack. The OWL Web Ontology Language Overview describes the function and relationship of each of these components of the semantic web:

XML provides an elemental syntax for content structure within documents, yet associates no semantics with the meaning of the content contained within.
XML Schema is a language for providing and restricting the structure and content of elements contained within XML documents.
RDF is a simple language for expressing data models, which refer to objects ("resources") and their relationships. An RDF-based model can be represented in XML syntax.
RDF Schema is a vocabulary for describing properties and classes of RDF-based resources, with semantics for generalized-hierarchies of such properties and classes.
OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
SPARQL is a protocol and query language for semantic web data sources.

Current ongoing standardizations include:

Rule Interchange Format (RIF) as the Rule Layer of the Semantic Web Stack

The intent is to enhance the usability and usefulness of the Web and its interconnected resources through:

Servers which expose existing data systems using the RDF and SPARQL standards. Many converters to RDF exist from different applications. Relational databases are an important source. The semantic web server attaches to the existing system without affecting its operation.
Documents "marked up" with semantic information (an extension of the HTML tags used in today's Web pages to supply information for Web search engines using web crawlers). This could be machine-understandable information about the human-understandable content of the document (such as the creator, title, description, etc., of the document) or it could be purely metadata representing a set of facts (such as resources and services elsewhere in the site). (Note that anything that can be identified with a Uniform Resource Identifier (URI) can be described, so the semantic web can reason about animals, people, places, ideas, etc.) Semantic markup is often generated automatically, rather than manually.
Common metadata vocabularies (ontologies) and maps between vocabularies that allow document creators to know how to mark up their documents so that agents can use the information in the supplied metadata (so that Author in the sense of 'the Author of the page' won't be confused with Author in the sense of a book that is the subject of a book review).
Automated agents to perform tasks for users of the semantic web using this data
Web-based services (often with agents of their own) to supply information specifically to agents (for example, a Trust service that an agent could ask if some online store has a history of poor service or spamming).

Projects

This section provides some example projects and tools, but is very incomplete. The choice of projects is somewhat arbitrary but may serve illustrative purposes.

DBpedia

DBpedia is an effort to publish structured data extracted from Wikipedia: the data is published in RDF and made available on the Web for use under the GNU Free Documentation License, thus allowing Semantic Web agents to provide inferencing and advanced querying over the Wikipedia-derived dataset and facilitating interlinking, re-use and extension in other data-sources.

FOAF

A popular application of the semantic web is Friend of a Friend (or FoaF), which describes relationships among people and other agents in terms of RDF.

SIOC

The SIOC Project - Semantically-Interlinked Online Communities provides a vocabulary of terms and relationships that model web data spaces. Examples of such data spaces include, among others: discussion forums, weblogs, blogrolls / feed subscriptions, mailing lists, shared bookmarks, image galleries.

Open GUID

Aimed at providing context for the Semantic Web, Open GUID^[15] maintains a global Identifier repository for use in the linked web. Domain-specific Ontologies and content publishers establish identity relationships with Open GUIDs.

SIMILE

Semantic Interoperability of Metadata and Information in unLike Environments

SIMILE is a joint project, conducted by the MIT Libraries and MIT CSAIL, which seeks to enhance interoperability among digital assets, schemata/vocabularies/ontologies, meta data, and services.

NextBio

A database consolidating high-throughput life sciences experimental data tagged and connected via biomedical ontologies. Nextbio is accessible via a search engine interface. Researchers can contribute their findings for incorporation to the database. The database currently supports gene or protein expression data and is steadily expanding to support other biological data types.

Linking Open Data

Datasets in the Linking Open Data project, as of April 2008

Class linkages within the Linking Open Data datasets

The Linking Open Data project is a community-led effort to create openly accessible, and interlinked, RDF Data on the Web. The data in question takes the form of RDF Data Sets drawn from a broad collection of data sources. There is a focus on the Linked Data style of publishing RDF on the Web. See #Triplify for a small plugin to expose data from your Web application as Linked Data.

The project is one of several sponsored by the W3C's Semantic Web Education & Outreach Interest Group (SWEO).

Services

Notification Services

Semantic Web Ping Service

The Semantic Web Ping Service is a notification service for the semantic web that tracks the creation and modification of RDF based data sources on the Web. It provides Web Services for loosely coupled monitoring of RDF data. In addition, it provides a breakdown of RDF data sources tracked by vocabulary that includes: SIOC, FOAF, DOAP, RDFS, and OWL.

Piggy Bank

Another freely downloadable tool is the Piggy Bank plug-in to Firefox. Piggy Bank works by extracting or translating web scripts into RDF information and storing this information on the user’s computer. This information can then be retrieved independently of the original context and used in other contexts, for example by using Google Maps to display information. Piggy Bank works with a new service, Semantic Bank, which combines the idea of tagging information with the new web languages. Piggy Bank was developed by the Simile Project, which also provides RDFizers, tools that can be used to translate specific types of information, for example weather reports for US zip codes, into RDF. Efforts like these could ease a potentially troublesome transition between the web of today and its semantic successor.

References

^ Berners-Lee, Tim; James Hendler and Ora Lassila (May 17, 2001). "The Semantic Web". Scientific American Magazine. http://www.sciam.com/article.cfm?id=the-semantic-web&print=true. Retrieved on 26 March 2008.
^ ^a ^b "W3C Semantic Web Frequently Asked Questions". W3C. Retrieved on 2008-03-13.
^ Herman, Ivan (2008-03-07). "Semantic Web Activity Statement". W3C. Retrieved on 2008-03-13.
^ "Design Issues". W3C. Retrieved on 2008-03-13.
^ Herman, Ivan (2008-03-12). "W3C Semantic Web Activity". W3C. Retrieved on 2008-03-13.
^ Berners-Lee, Tim; Fischetti, Mark (1999). Weaving the Web. HarperSanFrancisco. chapter 12. ISBN 9780062515872.
^ "People keep asking what Web 3.0 is. I think maybe when you've got an overlay of scalable vector graphics - everything rippling and folding and looking misty - on Web 2.0 and access to a semantic Web integrated across a huge space of data, you'll have access to an unbelievable data resource." -- Tim Berners-Lee Victoria Shannon (2006-06-26). "A 'more revolutionary' Web". International Herald Tribune. Retrieved on 2006-05-24.
^ "A Semantic Web Primer for Object-Oriented Software Developers". W3C (2006-03-09). Retrieved on 2008-07-30.
^ Connolly, Daniel (2002-08-13). "An Evaluation of the World Wide Web with respect to Engelbart's Requirements". W3C. Retrieved on 2008-07-30.
^ Engelbart, Douglas (1990). "Knowledge-Domain Interoperability and an Open Hyperdocument System". Bootstrap Institute. Retrieved on 2008-07-30.
^ Connolly, Dan. "From the editor... WebApps". W3C. Retrieved on 2008-07-30.
^ ^a ^b Ivan Herman (2007). "State of the Semantic Web". Semantic Days 2007. Retrieved on 2007-07-26.
^ Berners-Lee, Tim (2001-05-01). "The Semantic Web". Scientific American. Retrieved on 2008-03-13.
^ Nigel Shadbolt, Wendy Hall, Tim Berners-Lee (2006). "The Semantic Web Revisited". IEEE Intelligent Systems. Retrieved on 2007-04-13.
^ "Open GUID". OpenGUID.net. Retrieved on 2008-10-19.

This article or section includes a list of references or external links, but its sources remain unclear because it has insufficient inline citations.
You can improve this article by introducing more precise citations where appropriate. (July 2008)

External links

Wikimedia Commons has media related to: Semantic Web

W3C Semantic Web Activity
Semantic Web Interest Group IRC channel
semanticweb.org the Semantic Web community wiki, including descriptions of many related tools, events, and ontologies
The Semantic Web: An Introduction
Shiyong Lu, Ming Dong, and Farshad Fotouhi, “The Semantic Web: Opportunities and Challenges for Next-Generation Web Applications”, Information Research, Special Issue on the Semantic Web, 7(4), 2002.
Semantic Web in c#
Introduction to Ontologies and Semantic Web
GoPubMed: bringing Pubmed and the semantic web together
Semantic Web Video Lectures

Semantic Web Software & Demonstrations

Human Computation Video Luis Von Ahn presents innovative techniques to incorporate RDF info into a database of images, video or other group of data.
Open Source Semantic Tools
Open Source Semantic Search provided by WebGaps
SWED portal provided by WordPressHelp
Recommendation engine based on Semantic Web provided by Mesh Labs
Leading semantic software for online media properties provided by Inform Technolgies, Inc.[1]
Semantic Systems Biology