Semantic systems, and the Semantic Web in particular, are believed to provide appropriate frameworks for information integration and interoperability in ultra-large-scale distributed environments. The Semantic Web is an extension of current Internet technology in which information is given well-defined meaning by making its underlying structure explicit and formal. The Resource Description Framework (RDF), together with structured ontology languages such as the Web Ontology Language (OWL), provides the constructs needed to represent information and knowledge as globally unique resources with clear, unambiguous, precise, and computationally interpretable (formal) semantics. Formal representation of information and knowledge supports computer reasoning for automatic classification, enables greater interoperability, integration, and repurposing of information in large-scale and complex settings, and allows data to be shared and reused across the boundaries of applications, enterprises, and communities. It also makes the information equally 'understandable' to both humans and machines.
Semantic Web technology is generally viewed as a stack of layered technological frameworks, as depicted in the figure below. Each layer extends and builds on the layer below it and tends to be progressively more specialized and more expressive.
Resource Description Framework (RDF): RDF provides a general-purpose framework for representing a web of information. An RDF statement comprises three elements (nodes): subject, predicate, and object, as in an English sentence.
RDF statements are often called RDF triples, and the triple is essentially the only structure used to construct RDF documents.

Put together, all statements (triples) within an RDF document or dataset form a directed labeled graph. These graphs can be given a URI on the web and referred to like any other RDF resource. Automated agents can then use these URIs to find RDF datasets and use them on demand:

An RDF Document is a graph representation of RDF statements
RDF represents each node in a statement using a Uniform Resource Identifier (URI) that uniquely and globally identifies a given resource in a web of distributed information. For example, a URI such as <http://umls.nlm.nih.gov/C0027515> can uniquely identify a specific antibiotic on a network as distributed and complex as the Internet. Since URIs are globally resolvable, they can be used to link, mash up, or integrate nodes (RDF resources), statements (triples), documents (collections of triples forming a graph), and databases (collections of graphs) in novel ways and on widely distributed network architectures such as the Web. Because it is built on top of the Internet architecture, RDF data also makes the underlying network and storage infrastructure transparent and resilient to change. Since all layers of the Semantic Web layer cake follow well-defined formal semantics, RDF documents are machine processable and can participate directly in automated reasoning processes (no additional translation middleware is necessary).
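As a rough illustration of the graph view described above, the following Python sketch stores triples as plain tuples and reads them as labeled edges. It is not an RDF library: the helper names (`outgoing_edges`, `nodes`) and the `ex:Drug` class are assumptions made for this example, and prefixed names stand in for full URIs.

```python
# Each RDF statement is modeled as a plain (subject, predicate, object) tuple.
# Prefixed names such as "umls:C0027515" are shorthand for full URIs.
triples = {
    ("umls:C0027515", "rdf:type", "sn:Antibiotic"),
    ("umls:C0027515", "skos:prefLabel", '"Neamine"'),
    # Hypothetical triple, added only so the graph has a second subject node.
    ("sn:Antibiotic", "rdfs:subClassOf", "ex:Drug"),
}


def outgoing_edges(graph, node):
    """Return the labeled edges leaving `node` as sorted (predicate, object) pairs."""
    return sorted((p, o) for s, p, o in graph if s == node)


def nodes(graph):
    """Subjects and objects are the nodes; predicates label the directed edges."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


edges = outgoing_edges(triples, "umls:C0027515")
all_nodes = nodes(triples)
```

Because the collection is a set, the order in which triples are added has no effect on the resulting graph.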
OWL (Web Ontology Language): OWL provides a rich and expressive language for defining structured ontologies. OWL extends the RDF Schema language (RDFS) with modeling primitives for defining relationships between properties and resources and for constraining their interpretation by reasoning engines. There are three major flavors of the OWL language, OWL Lite, OWL DL, and OWL Full, which support the construction of ontologies with increasing levels of complexity and expressivity.
SPARQL: SPARQL (pronounced "sparkle") is an RDF query language; its name is a recursive acronym that stands for SPARQL Protocol and RDF Query Language. SPARQL has an SQL-like syntax and queries RDF graphs via pattern matching. Its features include basic conjunctive patterns, value filters, optional patterns, and pattern disjunction. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. The output of a SPARQL query can be a result set (represented as tuples, XML, or JSON) or another RDF graph.
The SPARQL protocol is a method for remote invocation of SPARQL queries. It specifies a simple interface, which can be supported via HTTP or SOAP, that a client can use to issue SPARQL queries against an endpoint. The combination of SPARQL as a language and SPARQL as a protocol creates an extensible query and retrieval mechanism that can retrieve information from a distributed network.
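To make the idea of querying by pattern matching more concrete, here is a minimal Python sketch of a conjunctive triple-pattern matcher. It is not a SPARQL engine and supports none of the language's filters, optional patterns, or disjunction. The function names, the `?`-prefixed variable convention, and the `ex:C0000001` resource are assumptions made for this example.

```python
def match_pattern(graph, pattern, binding=None):
    """Yield variable bindings for one triple pattern.

    Terms that start with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate


def query(graph, patterns):
    """Join several triple patterns: every pattern must match (conjunction)."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for s in solutions for b in match_pattern(graph, pattern, s)]
    return solutions


graph = {
    ("umls:C0027515", "rdf:type", "sn:Antibiotic"),
    ("umls:C0027515", "skos:prefLabel", "Neamine"),
    ("ex:C0000001", "rdf:type", "sn:Antibiotic"),  # hypothetical, has no label
}

# Roughly: SELECT ?drug ?name WHERE { ?drug rdf:type sn:Antibiotic .
#                                     ?drug skos:prefLabel ?name }
results = query(graph, [
    ("?drug", "rdf:type", "sn:Antibiotic"),
    ("?drug", "skos:prefLabel", "?name"),
])
```

The second pattern reuses `?drug`, so only resources that satisfy both patterns survive the join; the unlabeled hypothetical resource is dropped.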
RDF provides the most fine-grained data representation possible. Informational resources can be represented without any schema at all and can be bound to one or more schemas at a later time (late binding). A URI such as <http://umls.nlm.nih.gov/C0023451> is the most atomic form of data that can be published and resolved uniquely and unambiguously on a widely distributed network. RDF statements (<subject> <predicate> <object>) are the most atomic unit of information and can be constructed by mashing up URIs:
<http://umls.nlm.nih.gov/C0027515> <http://w3.org/…/rdf-syntax#type> <http://semanticnetwork.nlm.nih.gov/Antibiotic>
or, to make it easier to read:
<umls:C0027515> <rdf:type> <sn:Antibiotic>

Notice that the three nodes (URIs) in this statement are mashed up from three different locations on the web: <umls:C0027515> is a concept defined in the NLM UMLS Metathesaurus, <rdf:type> is an RDF primitive defined by the RDF specifications located on the W3C website, and <sn:Antibiotic> is another concept, this time from the UMLS Semantic Network. That is, new information is generated by a mash-up of existing data. This mechanism can be used to contextualize any data on the web in a fine-grained, distributed, and extensible way.
Using triples as the single method of data representation, together with the modularity of RDF content, makes RDF an inherently extensible language. New vocabularies or constructs can be added to an RDF document dynamically. In this example we add <skos:prefLabel> to our data to extend it and associate a name with an existing object:
<umls:C0027515> <skos:prefLabel> "Neamine"
Extending RDF graphs to capture new types of information is as easy as adding a new triple and does not require a change in the underlying database schema.
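The point about schema-free extension can be sketched in a few lines of Python. Triples are modeled as tuples in a set, which is an assumption of this example rather than how any particular RDF store works; the `predicates` helper is likewise hypothetical.

```python
# Start with a single statement about the resource.
graph = {("umls:C0027515", "rdf:type", "sn:Antibiotic")}


def predicates(graph):
    """Return the vocabulary (set of predicates) currently used in the graph."""
    return {p for _, p, _ in graph}


before = predicates(graph)

# Extending the data: a new predicate is introduced simply by using it in a
# triple. No table definition or column list has to be altered first.
graph.add(("umls:C0027515", "skos:prefLabel", "Neamine"))

# Adding the same statement again has no effect: a graph is a set of triples.
graph.add(("umls:C0027515", "skos:prefLabel", "Neamine"))

after = predicates(graph)
```

In a relational design, the equivalent change would typically require adding a column or a new table before the label could be stored.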
Existing vocabulary can be mashed up (by adding new RDF statements about existing resources) to impose new semantics or to add new information to the model:
<skos:prefLabel> <rdf:type> <owl:AnnotationProperty>
In this example, by assigning the type <owl:AnnotationProperty> to <skos:prefLabel>, we inform computer programs that this property points to literal data intended for human use rather than for computation. This can reduce the cost of certain types of search and computation over large datasets and improve indexing of the data.
RDFS (RDF Schema Language) and OWL are RDF dialects that introduce primitives with well-defined formal semantics, extending RDF documents in ways that computer programs can automatically interpret and process. Primitives for transitivity, subclass and subproperty relations between concepts and properties, existential and universal quantifiers that constrain properties, sets, intersections, unions, disjointness, complements, cardinality restrictions, property chaining, and symmetric, asymmetric, inverse, reflexive, and irreflexive relations all allow a given RDF document to be extended in many different ways, up to a full-blown ontology with strong semantics that can be processed by a computer reasoning engine.
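As a small illustration of what "automatically interpret and process" can mean, the Python sketch below applies two RDFS-style rules (subclass transitivity and type propagation along subclass links) until no new triples appear. It is a naive fixed-point loop, not a real reasoner, and the two `ex:` classes and the subclass statements are hypothetical data invented for the example.

```python
def rdfs_closure(graph):
    """Apply two RDFS-style entailment rules until no new triple is produced."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # Only chain through statements of the form: o rdfs:subClassOf o2
                if o != s2 or p2 != "rdfs:subClassOf":
                    continue
                if p == "rdfs:subClassOf":
                    new.add((s, "rdfs:subClassOf", o2))  # subclass transitivity
                elif p == "rdf:type":
                    new.add((s, "rdf:type", o2))  # instance inherits superclass
        if new <= inferred:
            return inferred
        inferred |= new


asserted = {
    ("umls:C0027515", "rdf:type", "sn:Antibiotic"),
    # Hypothetical class hierarchy, for illustration only.
    ("sn:Antibiotic", "rdfs:subClassOf", "ex:PharmacologicSubstance"),
    ("ex:PharmacologicSubstance", "rdfs:subClassOf", "ex:ChemicalSubstance"),
}

closed = rdfs_closure(asserted)
entailed = closed - asserted
```

Only three statements were asserted, yet the closure also contains facts such as the resource being an `ex:ChemicalSubstance`, which no one wrote down explicitly.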
- Modular and Highly distributable
Every RDF triple is meaningful on its own and carries no implied meaning based on where in the document or on the network it is located. That is, the order in which triples are recorded in an RDF document does not matter. Therefore, an RDF document can be segmented arbitrarily into several smaller subsets of RDF triples, and each subset can be maintained in a different document, database, or network and under a different governance, access, and protection structure. RDF documents can be linked to each other by making RDF statements about RDF documents:
<a:RDFDocument_1> <owl:imports> <b:RDFDocument_3>
This graph shows how new RDF models can be generated by reusing and sharing other RDF documents on a distributed network (through linking). RDF 1, for example, is a dataset that consists of all the contents of RDF A, B, 3, and 5, plus its own unique triples. RDF 2 consists of all triples contained in RDF 3, 4, and 5, plus its own triples. Each RDF document may reside on a different network and be linked to the others only through an import statement (the arrow represents importing, for example a:RDF1 owl:imports b:RDF3). Notice that this whole figure is itself an RDF graph that represents the distribution of information in a distributed network!
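The import mechanism described above can be sketched as a recursive merge. In this Python example the three documents, their contents, and the `import_closure` helper are all hypothetical; a plain dictionary stands in for documents that would normally be fetched from different locations on the network, and the import chain is simpler than the one in the figure.

```python
# A hypothetical "network" of RDF documents, keyed by document URI.
documents = {
    "a:RDF1": {
        ("a:RDF1", "owl:imports", "b:RDF3"),
        ("a:item1", "rdf:type", "a:Thing"),
    },
    "b:RDF3": {
        ("b:RDF3", "owl:imports", "c:RDF5"),
        ("b:item3", "rdf:type", "a:Thing"),
    },
    "c:RDF5": {
        ("c:item5", "rdf:type", "a:Thing"),
    },
}


def import_closure(doc_uri, documents, seen=None):
    """Merge a document with everything it imports, directly or indirectly."""
    seen = set() if seen is None else seen
    if doc_uri in seen:  # guard against circular imports
        return set()
    seen.add(doc_uri)
    triples = set(documents.get(doc_uri, ()))
    for s, p, o in list(triples):
        if s == doc_uri and p == "owl:imports":
            triples |= import_closure(o, documents, seen)
    return triples


merged = import_closure("a:RDF1", documents)
```

Because triples carry no positional meaning, the merge is a plain set union; the order in which the imported documents are visited does not change the result.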
RDF documents can also be linked by RDF statements between URIs within each document (<a:URI1> <rdf:type> <b:URI2>). This creates a highly distributed and modular architecture in which a dataset can be modularized arbitrarily and reused, linked, shared, mixed, and matched with many other modules on a widely distributed network, in novel and ad hoc ways:
In this example, two RDF documents are linked through two statements within RDF B that link to URIs defined by the RDF A document (red arrows).