Mary-Ann Grosset, Thierry Vebr, Jan-Anno Schuur

Industry

Unleash the Triple: Leveraging a corporate discovery interface. The OECD case

“O.N.E Sight”, a fully semantic reading assistant developed at the OECD, relies on a semantic layer to enable concept searching. It crawls millions of resources, consistently tagged to enable the discovery of new facts. The use of corporate taxonomies and ontologies lets you search in your preferred language and vocabulary without having to worry about the language the information is written in or the words the content contains. Search in English, find results in French. Search for articles on agriculture in Australia, and be prompted with information on coastal fishery in Sydney. The semantic layer also allows discovery algorithms to expand your search with ideas and suggestions, so that you can search for one thing and find something else of interest that you were not aware of. Search for agriculture in Australia and be prompted to discover content on agritourism in New South Wales.

The talk and demonstration will highlight the development, at the Organisation for Economic Co-operation and Development, of “O.N.E Sight”, a fully semantic reading assistant that unleashes the power of triples. It is the result of three years of capacity building, development and cross-functional teamwork.
Analysts use “O.N.E Sight” to assist with the drafting of reports. They identify, view in context and extract relevant knowledge, regardless of language, contained within large volumes of structured and unstructured information from internal or external sources. Content is enriched at fragment level with semantic services based on organisation-wide or domain-specific ontologies and on specific linguistic rules used to identify multilingual knowledge contained within texts. Analysts have reported time savings and the discovery of sources they would not otherwise have known of.
We will outline the project approach, the learning curve the team went through, and the intellectual and technical challenges it faced. The issues that had to be addressed included new ways of handling information, silos, traditional text indexation, the lack of text fragmentation and semantic links, the reconciliation of semantic and textual searches, and representation issues, among others.
We will describe the long march towards semantic annotation and the emphasis placed on the quality of the tagging. This will include: i) development, maintenance and use of the OECD central taxonomies and ontologies in the semantic analysis tools, ii) hazards of semantics (fuzziness, context, acronyms and disambiguation), iii) creation of gold-standard corpora, annotation quality testing and multi-view annotation graphs, and iv) development of tools to identify ‘knowledge nuggets’, such as socio-economic indicators, by tagging semantic relationships within texts. We will also describe the methodology used to develop these quality tagging applications, which consistently return high precision and recall (around 95%), making the tags reliable enough for use in a production environment.
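
As an illustration of how precision and recall can be measured against a gold-standard corpus, here is a minimal sketch. It assumes, purely for the example, that an annotation is a tuple of document id, start offset, end offset and concept, and that only exact matches count; the function name `precision_recall` and the sample data are hypothetical and do not describe the OECD test harness.

```python
# An annotation is (document id, start offset, end offset, concept id).
Annotation = tuple[str, int, int, str]


def precision_recall(
    gold: set[Annotation], predicted: set[Annotation]
) -> tuple[float, float]:
    """Compare predicted tags with gold-standard tags using exact matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical sample: four gold tags, four predicted tags, three in common.
gold_tags = {
    ("doc1", 0, 11, "agriculture"),
    ("doc1", 15, 24, "australia"),
    ("doc2", 3, 14, "agritourism"),
    ("doc2", 40, 47, "fishery"),
}
predicted_tags = {
    ("doc1", 0, 11, "agriculture"),
    ("doc1", 15, 24, "australia"),
    ("doc2", 3, 14, "agritourism"),
    ("doc2", 50, 58, "education"),   # a false positive
}

precision, recall = precision_recall(gold_tags, predicted_tags)
```

Precision is the share of predicted tags that are correct, and recall is the share of gold tags that were found; a tagger is only trustworthy in production when both stay high on a corpus it was not tuned on.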

This discovery framework relies on the combined use of RDF triples and XML in a MarkLogic environment. It is a fully metadata-based approach that enables atomisation for searching, contextualisation of fragmented resources for viewing, and robust reasoning, with acceptable response times for rendering.
“O.N.E Sight” was developed by the Knowledge Management team in close collaboration with the subject-matter experts and stakeholders who directly benefit from the application. The application is tailored to their needs and incorporates feedback in an Agile way.
Future developments include adding new resources and semantic tags, a mobile-friendly version, and the development of externally facing OECD semantic applications based on “O.N.E Sight”.

CV

Mary-Ann Grosset is currently Knowledge Management Practice Team Manager in the Digital Knowledge and Information Services Department at the Organisation for Economic Co-operation and Development (OECD) in Paris, France. She leads the team responsible for the development and maintenance of taxonomies and ontologies, as well as the development and implementation of semantic enrichment services and related knowledge discovery and intelligent search applications at the OECD. The Knowledge Management unit is also responsible for the Library, Records and Information Management services for the Organisation, where Mary-Ann has, for the past 10 years, been moving from a mainly paper-based service to a fully digital service delivery environment for staff. This includes the management of electronic records, online library services for staff, and access to the Organisation’s records and archives through linked data services. Mary-Ann can be reached at mary-ann.grosset@oecd.org.

Interested in this talk?

Register for SEMANTiCS conference