The development of WAGI, Web Artificial General Intelligence, can for instance involve an intelligence algorithm with two metasystem transitions, as I explained in my earlier article “Bloom’s Beehive – Intelligence is an algorithm”. In his book “Creating Internet Intelligence” Ben Goertzel also implicitly describes this. Steps 3 and 6 of my earlier article are the most important, in that differences, correspondences and spatio-temporal relations are there identified and mapped as patterns. A pattern P is said to be grounded in a mind when the mind contains a number of specific entities wherein P is in fact a pattern. From correspondences, shared meaning and grounded patterns, abstractions and simplification rules can be derived, whereas differences prompt an evaluation of possible modifications.
Artificial Intelligence programs for the abstraction and simplification processes, wherein patterns are derived from numerous data events, exist and are being developed, but they are often dedicated to a very specific niche. When it comes to numerical data, such as stock market analysis, commercial activity analysis or scientific experimental data, spatio-temporal data such as traffic systems, or rule- and pattern-based data such as in games, these programs work fairly well within their specific niche. What Goertzel is attempting in the OpenCog software and the Novamente project is bringing these features to the world of Artificial General Intelligence (AGI). Here the data mining, which involves a great deal of analysis of a linguistic and semantic nature, is of a quite different order. Although quite a number of programs exist (e.g. DOGMA, OBO, OWL: the Web Ontology Language) and a lot of work has been done in the field of Ontology (an ontology in the field of AI is a “formal, explicit specification of a shared conceptualisation”), there is still room for improvement of the rules and schemes that help in establishing ontologies.
It is here that the daily work of patent attorneys and patent examiners can provide ideas for development in the field of Ontology. In fact, a great deal of the work of patent attorneys and patent examiners involves establishing ontologies. When a patent attorney drafts a claim for an invention, which is a specific entity, he tries to conceptualise how the invention can be described in the most general way, whilst maintaining all features essential to defining the invention. Upon drafting an application he has to take into account all components of an ontology, commonly known as individuals, classes, attributes, relations, function terms, restrictions, rules, axioms and events, as illustrated hereunder:
- The specific entities in which a pattern is grounded, of which at least one must be described in a detailed manner and which can be claimed in dependent claims, can be considered the “individuals”, the ground-level objects.
- The claim dependency structure, the so-called claim-tree has various kinds of intermediate generalisations before arriving at individual specific entities and can be considered as providing the “classes”.
- A claim most essentially consists of a list of features, which qualify as “attributes”.
- By means of the dependency in the claim tree the “relations” are provided.
- The so-called “functional features” which encompass a series of specific entities provide the “function terms”.
- Disclaimers and provisos qualify as “restrictions”.
- If-then “rules” result in dependent claims covering particular combinations of conditional requirements.
- The provision of “axioms” is most often done in the description; it amounts to giving a plausible explanation of why the structural and functional features give rise to the described technical effect the invention has over the prior art.
- Changes in attributes or relations which lead to the drafting of different independent claims qualify as “events”.
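The mapping above can be sketched in code. The following is a minimal illustration, with invented class names and an invented screw example, of how a claim tree supplies “classes” (claims), “attributes” (features), “relations” (dependencies) and “individuals” (described embodiments); it is a sketch of the analogy, not a model of real patent practice.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    number: int
    features: list                    # "attributes": the claimed features
    depends_on: Optional[int] = None  # "relations": dependency in the claim tree

@dataclass
class Embodiment:
    name: str    # an "individual": a concrete entity described in detail
    claim: int   # the (dependent) claim under whose scope it falls

# A hypothetical claim tree: each dependent claim is a subclass of its parent.
claims = [
    Claim(1, ["fastening means", "elongate body"]),   # broadest class
    Claim(2, ["screw thread"], depends_on=1),         # subclass
    Claim(3, ["self-tapping tip"], depends_on=2),     # narrower subclass
]
embodiments = [Embodiment("self-tapping wood screw", claim=3)]

claims_by_no = {c.number: c for c in claims}

def all_features(claim_no, claims_by_no):
    """Collect the features inherited along the dependency chain,
    i.e. walk up the class hierarchy from a subclass to the broadest class."""
    feats = []
    c = claims_by_no[claim_no]
    while c is not None:
        feats = c.features + feats
        c = claims_by_no.get(c.depends_on)
    return feats

# The narrowest claim accumulates every feature of the claims it depends on.
print(all_features(3, claims_by_no))
# → ['fastening means', 'elongate body', 'screw thread', 'self-tapping tip']
```

Walking the dependency chain shows why a dependent claim defines a subclass: it inherits all attributes of its ancestors and adds its own.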
Patent attorneys are extremely proficient in this process. With a minimum of features and functional relations between those features, so as to obtain a claim that is as broad as possible without encroaching on teachings from the prior art, they arrive at an ontological definition of an invention.
The whole process of drafting a patent application, and especially a successful claim tree, depends on the patent attorney’s proficiency in identifying classes and sub-classes: hypernyms and hyponyms. In the feature descriptions he will have to use holonyms and meronyms. And in the ideal situation the broadest independent claim has been generalised to such an extent that, prima facie, it is difficult to see which concrete types of inventions fall under the conceptualisation.
And it does not stop there: the differences with respect to the prior art prompt an evaluation of possible modifications and/or additional industrial applications.
When a patent examiner has to evaluate a patent application, he has to go through this process in reverse order. He has to find out which specific entities have allowed for the generalisation, and he has to imagine what existing types of inventions could possibly fall under the scope of the generalised claims. He has to identify which features (structural and/or functional) are responsible for the technical effect over the prior art. From those notions he can then build a search strategy for identifying relevant prior art which anticipates, or falls within the scope of, the subject-matter of the claims. For this search strategy to be complete he must combine a set of search concepts which together reflect all individual essential features describing the invention. The search will start with some concrete examples of individual entities and synonyms at one level, but when simple search strategies fail he will have to define (in so far as the patent attorney has not already done so) hypernyms and hyponyms of the features and combine these. Or he will have to describe a feature as a set of meronyms, or conversely a set of features as a holonym. Nasty problems often occur with acronyms that have more than one meaning, i.e. homonyms, which lead to hit sets containing too many documents. The Boolean operator NOT must then be added in an additional search statement so as to filter out the irrelevant documents, the so-called noise. Antonyms in close proximity to negating terms such as “not” or “non” can also lead to positive results. If hit sets contain too many members, they must be narrowed down by adding more search terms or more specific search terms. Additionally, search terms that have a defined relationship can be combined in a specified manner so as to guarantee a proximity between the terms: this is done with so-called proximity operators, which in those instances are more powerful than simple Boolean “AND” operators.
Conversely, if a hit set has too few members, it can be expanded by using more general terms, fewer search statements or less strict proximity constraints.
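The difference between a plain Boolean AND and a proximity operator can be made concrete with a toy retrieval sketch. The function names, the NEAR semantics (maximum token distance) and the two example “documents” below are all invented for illustration; real patent search engines expose richer and differently specified operators.

```python
import re

def positions(doc, term):
    """0-based token positions at which `term` occurs in `doc`."""
    tokens = re.findall(r"\w+", doc.lower())
    return [i for i, t in enumerate(tokens) if t == term.lower()]

def matches_and(doc, *terms):
    """Boolean AND: every term occurs somewhere in the document."""
    return all(bool(positions(doc, t)) for t in terms)

def matches_not(doc, term, excluded):
    """Boolean NOT: keep hits on `term` while filtering out `excluded` (noise)."""
    return bool(positions(doc, term)) and not positions(doc, excluded)

def matches_near(doc, a, b, max_dist):
    """Proximity: some occurrence of `a` lies within `max_dist` tokens of `b`."""
    pa, pb = positions(doc, a), positions(doc, b)
    return any(abs(i - j) <= max_dist for i in pa for j in pb)

docs = [
    "A screw with a self tapping thread for fastening wooden panels.",
    "The thread of the discussion turned to a loose screw in the argument.",
]

# Plain AND is too coarse: both documents contain "screw" and "thread" ...
print([matches_and(d, "screw", "thread") for d in docs])        # → [True, True]
# ... but a proximity constraint keeps only the genuinely relevant hit.
print([matches_near(d, "screw", "thread", 5) for d in docs])    # → [True, False]
```

The second query illustrates why proximity operators outperform a bare AND when a homonym (“thread”, “screw” in a figurative sense) would otherwise flood the hit set; tightening `max_dist` narrows the set, relaxing it expands the set, exactly as described above.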
In fact, in building a search strategy the search examiner is making a very detailed partial ontology, and it is a pity (but a logical consequence of the requirement of secrecy) that these ontologies are not stored in a publicly accessible database, in analogy to the Semantic Web. In addition, the community of patent examiners has created, and still creates, a very detailed classification scheme, the IPC, which can suitably serve as inspiration in the development of ontological classification schemes. It would also be useful for everybody (not just patent professionals, scientists, inventors and AI ontologists) if search engines such as Google and Yahoo would finally make proximity operators available. There is a lot of criticism from scientists and inventors about the inadequate results that web-based search engines deliver (see Grivell, L. in EMBO reports (2006) 7, pp. 10-13). The search engines employed by the patent offices are in many respects far superior. Unfortunately, they are not accessible to the public. In any case, AI-bot-based crawlers and spiders do not reach the deep-web databases, where extremely relevant information may be waiting.
Ontologies stored in a dedicated database, with links to other deep-web databases that are fully searchable in combination with non-spider/non-crawler data-mining bots, could be a great step forward in information provision.
The proximity relation is a concept that may require further attention in the field of ontology, as it is an indicator of how closely certain terms are semantically connected. For instance, it would be useful to know, for each term defined in a semantic web, its average distance to each other term across all documents on the web. Perhaps such data mining would reveal that certain terms have very close average proximities even though neither term has been defined in the semantic web as relating to the other. It would provide a further degree of ontological mapping. On a more concrete level, involving geographical data, such processes are already under way (e.g. Arpinar et al. in “Handbook of geographic information science”: Geospatial Ontology Development and Semantic Analytics).
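The proposed proximity mapping can be sketched as follows: for a pair of terms, take the minimal token distance between their occurrences in each document where both occur, and average over the corpus. The three-sentence “corpus” and the function names are invented for illustration; a real implementation would run over web-scale text with terms taken from a semantic web.

```python
import re
from itertools import product

def min_distance(doc, a, b):
    """Minimal token distance between terms `a` and `b` in `doc`,
    or None if the pair does not co-occur in this document."""
    tokens = re.findall(r"\w+", doc.lower())
    pa = [i for i, t in enumerate(tokens) if t == a]
    pb = [i for i, t in enumerate(tokens) if t == b]
    if not pa or not pb:
        return None
    return min(abs(i - j) for i, j in product(pa, pb))

def average_proximity(corpus, a, b):
    """Average minimal distance over the documents in which both terms occur."""
    dists = [d for d in (min_distance(doc, a, b) for doc in corpus) if d is not None]
    return sum(dists) / len(dists) if dists else None

corpus = [
    "the ontology maps each concept to a class",      # distance 3
    "a concept hierarchy underlies every ontology",   # distance 4
    "weather report for tomorrow",                    # pair absent: ignored
]
print(average_proximity(corpus, "ontology", "concept"))  # → 3.5
```

A small average proximity between two terms that share no declared relation in the semantic web would then flag a candidate relation for the ontological mapping described above.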
The ontologies are in any case required to build a Webmind based on WAGI, and it is about time that AI developers at Google, Yahoo etc. start working on these issues, lest Hegel’s OWLs only fly at dusk.