Tuesday, September 19, 2006

Google Adsense Ontology

by Anderson Paulen

Applied Semantics was founded in 1998, under the name Oingo, by Adam Weissman and Gil Elbaz, who wanted to make computers more "human-literate". They set out to build a new architecture, drawing on their expertise in scalable information systems design, database application development, software engineering, and natural language processing (NLP). Together with a team of linguists and software engineers, they developed the company's patented technology, CIRCA, which serves as the common platform for all of Applied Semantics' products.

Google bought Applied Semantics in April 2003, acquiring both the AdSense technology and CIRCA (Conceptual Information Retrieval and Communication Architecture), the technology AdSense is built on.

CIRCA is built on a language-independent, scalable ontology consisting of millions of words, together with what the words mean and how those meanings relate conceptually to one another. Ontologies are commonly used in artificial intelligence and knowledge representation to define a hierarchical data structure containing all the relevant entities, their relationships, and their rules. The relationships captured in the CIRCA ontology include:

Synonymy/antonymy ("good" is an antonym of "bad")

Similarity ("gluttonous" is similar to "greedy")

Hypernymy (is a kind of / has kind) ("horse" has kind "Arabian")

Membership ("commissioner" is a member of "commission")

Meronymy (whole/part relations) ("motor vehicle" has part "clutch pedal")

Substance (e.g. "lumber" has substance "wood")

Product (e.g. "Microsoft Corporation" produces "Microsoft Access")

Attribute ("past", "preceding" are attributes of "timing")

Causation (e.g. travel causes displacement/motion)

Entailment (e.g. buying entails paying)

Lateral bonds (concepts closely related to one another, e.g. "dog" and "dog collar")
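To make the relation types listed above concrete, here is a minimal sketch of how typed links between concepts could be stored and queried. The `Ontology` class, its method names, and the relation labels such as `has_kind` are hypothetical and chosen only for illustration; nothing here describes how CIRCA is actually implemented.

```python
from collections import defaultdict


class Ontology:
    """Toy store of typed, directed relations between concepts (illustrative only)."""

    def __init__(self):
        # Maps (source concept, relation type) -> set of target concepts.
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        """Record that `source` is linked to `target` by `relation`."""
        self._edges[(source, relation)].add(target)

    def related(self, source, relation):
        """Return the targets linked to `source` by `relation`, sorted by name."""
        return sorted(self._edges.get((source, relation), set()))


# Populate the store with the examples from the list above.
# The relation labels are assumptions, not CIRCA's own vocabulary.
ontology = Ontology()
ontology.add("good", "antonym", "bad")
ontology.add("gluttonous", "similar_to", "greedy")
ontology.add("horse", "has_kind", "Arabian")
ontology.add("commissioner", "member_of", "commission")
ontology.add("motor vehicle", "has_part", "clutch pedal")
ontology.add("lumber", "has_substance", "wood")
ontology.add("Microsoft Corporation", "produces", "Microsoft Access")
ontology.add("buying", "entails", "paying")
ontology.add("dog", "lateral", "dog collar")

print(ontology.related("horse", "has_kind"))
print(ontology.related("motor vehicle", "has_part"))
```

Keying the store on the pair of concept and relation type keeps each kind of link separate, so a query for the parts of a concept never returns its kinds or its products.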

A typical example of an ambiguous term is the word Java, which has several meanings: a synonym for coffee, an Indonesian island, and a computer programming language.
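One simple way to picture how context could settle the meaning of a word like Java is to count how many words on the page overlap with the words associated with each sense. The sketch below does exactly that. The `SENSES` table and the `disambiguate` function are hypothetical, and the overlap count is a stand-in technique; the actual CIRCA method is not described here.

```python
import re

# Hypothetical sense inventory for "Java": each sense maps to a small set
# of words one might expect to see nearby. Invented for illustration.
SENSES = {
    "coffee": {"coffee", "cup", "brew", "roast", "espresso", "caffeine"},
    "island": {"indonesia", "island", "jakarta", "travel", "volcano"},
    "programming language": {"code", "class", "compiler", "software", "programming"},
}


def disambiguate(context_text, senses=SENSES):
    """Pick the sense whose related words overlap most with the context.

    Returns None when no sense shares any word with the context.
    """
    words = set(re.findall(r"[a-z]+", context_text.lower()))
    scores = {sense: len(words & related) for sense, related in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("Compile the Java class with a recent compiler."))
print(disambiguate("A fresh cup of java with a shot of espresso."))
```

A real system would weight the overlaps and draw on the relations in the ontology rather than on hand-written word lists, but the principle is the same: the surrounding text decides which meaning applies.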

With a word like Ford, however, the system also has to rank the relationships it generates. Ford is a car manufacturer as well as a company. The concept "car manufacturer" is more specific than "company", so it receives a stronger value. This entire scheme of how concepts relate is called an ontology, and it forms the core of most linguistic engines produced today.
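The Ford example can be sketched by treating specificity as depth in a "kind of" hierarchy: the further a concept sits from the root, the more specific it is and the higher it ranks. The `PARENT` hierarchy and both function names below are assumptions made for illustration, and depth is only one plausible way to assign a "stronger value".

```python
# Hypothetical "is a kind of" chain; each concept points to its more
# general parent, and None marks the root of this toy hierarchy.
PARENT = {
    "car manufacturer": "manufacturer",
    "manufacturer": "company",
    "company": "organization",
    "organization": None,
}


def depth(concept, parent=PARENT):
    """Count the steps from `concept` up to the root of the hierarchy."""
    steps = 0
    while parent.get(concept) is not None:
        concept = parent[concept]
        steps += 1
    return steps


def rank_by_specificity(concepts, parent=PARENT):
    """Order concepts from most specific (deepest) to most general."""
    return sorted(concepts, key=lambda c: depth(c, parent), reverse=True)


# Both concepts apply to Ford; the deeper one is listed first.
print(rank_by_specificity(["company", "car manufacturer"]))
```

Here "car manufacturer" sits three steps below the root and "company" only one, so "car manufacturer" comes first.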

What makes the CIRCA ontology a clever choice for web advertising?

The CIRCA ontology understands a page and extracts its key themes

CIRCA resolves ambiguous terms

CIRCA uses context to deliver relevant keywords

For more: http://www.adsense-digest.com
