We created an RDF ontology for the telecom domain to support broader AI activities, such as a chatbot system for customer support. The ontology creation process is semi-automatic: the general layout, i.e. the entities and edges, is defined by domain experts. The first input is then collected by mirroring the website structure of a customer help portal. Several strategies for automatic expansion are then explored: XML product sheets are crawled and their content is added via mapping rules, and the Telekom help forum is crawled and label synonyms for classes and instances are mined with NLP techniques. We also investigate semi-automatic expansion by suggesting keywords as new concepts.
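The mapping of crawled XML product sheets to ontology triples can be sketched roughly as follows. This is a minimal illustration, not the actual mapping rules: the namespace, element names and example product sheet are invented for the sketch.

```python
# Hypothetical sketch: mapping a crawled XML product sheet to RDF triples
# in N-Triples syntax. The namespace and element names are invented for
# illustration; the real mapping rules are domain-specific.
import xml.etree.ElementTree as ET

BASE = "http://example.org/telecom#"  # placeholder namespace

def sheet_to_triples(xml_text):
    """Turn a flat product sheet into subject-predicate-object triples."""
    root = ET.fromstring(xml_text)
    product = root.attrib["name"].replace(" ", "_")
    subject = f"<{BASE}{product}>"
    # every sheet yields one typing triple for the product itself
    triples = [f"{subject} <{BASE}type> <{BASE}Product> ."]
    for feature in root:  # each child element becomes one property triple
        triples.append(f'{subject} <{BASE}{feature.tag}> "{feature.text}" .')
    return triples

# invented example sheet
sheet = '<product name="Speedport Router"><downstream>250 Mbit/s</downstream></product>'
for t in sheet_to_triples(sheet):
    print(t)
```

In practice each element type would be mapped to a curated ontology property rather than reused verbatim as a predicate name, but the principle of rule-driven triple generation is the same.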
On the one hand, the ontology will be queried directly with SPARQL queries that are derived from natural language searches. On the other hand, it serves as the semantic basis for a variety of other applications such as semantic search, agent content assistance, a virtual digital assistant, social media mining and an intelligent chatbot.
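A toy version of deriving a SPARQL query from a natural language search might look like this. The question template, prefix and names are assumptions made for the sketch; real question parsing is considerably more involved.

```python
# Hypothetical sketch: deriving a SPARQL query from a natural-language
# search via a simple pattern template. The tel: prefix and the names
# it produces are invented; real parsing needs far richer NLP.
import re

def question_to_sparql(question):
    """Map 'What is the <property> of <product>?' to a SPARQL query."""
    m = re.match(r"What is the (\w+) of (?:the )?(.+)\?", question)
    if not m:
        return None  # question does not match the supported template
    prop, product = m.group(1), m.group(2).replace(" ", "_")
    return (
        "PREFIX tel: <http://example.org/telecom#>\n"
        "SELECT ?value WHERE {\n"
        f"  tel:{product} tel:{prop} ?value .\n"
        "}"
    )

print(question_to_sparql("What is the downstream of the Speedport Router?"))
```

The generated query would then be run against the triple store holding the ontology.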
The aforementioned applications naturally use their own data pools, and considerable effort will be needed to interface the ontology with these systems. This could have been avoided had we stored and maintained the data directly in them. But the advantages of maintaining our own centralized ontology are manifold: by storing the knowledge in an open standard format we strengthen our independence from proprietary technology, can keep parts of the data private and on-site, and can re-use the data. Many applications thus profit from a centralized update and maintenance process.
The talk will describe the general creation process. Further, to automate the expansion process, we compared traditional machine learning techniques with newer deep neural network approaches and will report on the results. We will give examples of the ontology in use, fostering semantic search in a call center agent environment and driving the question answering capabilities of a chatbot. For maintenance, integration and further expansion, the PoolParty platform is used, and we will report on our experiences so far.
Felix Burkhardt does tutoring, consulting, research and development in the fields of human-machine dialog systems, text-to-speech synthesis, speaker classification, ontology-based natural language modeling, voice search and emotional human-machine interfaces.
Originally an expert in speech synthesis at the Technical University of Berlin, he wrote his PhD thesis on the simulation of emotional speech by machines, recorded the Berlin database of acted emotional speech, "EmoDB", and maintains several open source projects, including the emotional speech synthesizer "Emofilt" and the speech labeling and annotation tool "Speechalyzer". He has been working for Deutsche Telekom AG since 2000, currently at the Telekom Innovation Laboratories in Berlin. He was a member of the European Network of Excellence HUMAINE on emotion-oriented computing and is the editor of the W3C Emotion Markup Language specification.
He is also a lecturer at the Technical University of Berlin, teaching a course on auditory human-machine interaction.