The initiative will help generate a return on the investment in open data made by all kinds of organizations (photo: Pradamas Gifarry / unsplash.com)
The BODI project aims to end the paradox that, despite the constantly increasing amount of open data, people cannot use them because of the lack of infrastructure
Chatbots and voicebots use simple interfaces based on natural language and machine learning to let people access sources of open data
Developing a platform that creates "conversational" access to open data sources is the aim of a new technological research project at the Universitat Oberta de Catalunya (UOC). Bots for open data interaction - Conversational interfaces to facilitate access to public data (BODI) is an initiative, with funding from the Spanish Ministry of Science and Innovation and the EU's Next Generation programme, to develop the role of chatbots in dealing with large open data sources. The idea is to enhance the social impact of open data in applications related to government transparency, smart cities or citizen empowerment.
What is a chatbot?
Chatbots are applications that converse with people by providing more or less complex automatic responses to their requests. These answers are established beforehand by a group of experts who anticipate the specific questions the bot must be able to answer. As the technology has progressed, so has the bots' ability to respond to questions that were not predefined, and even to learn from them.
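As a rough illustration of the predefined-answer model described above, the following minimal sketch matches a message against keyword sets and returns the response prepared for them, with a fallback when nothing matches. The keywords, responses and function name are hypothetical examples and are not taken from Xatkit or BODI.

```python
import re

# Hypothetical predefined answers: each keyword set maps to the response
# that experts prepared in advance for that kind of question.
PREDEFINED = {
    frozenset({"opening", "hours"}): "We are open from 9:00 to 17:00.",
    frozenset({"return", "policy"}): "You can return items within 30 days.",
}

FALLBACK = "Sorry, I do not know how to answer that yet."


def reply(message: str) -> str:
    """Return the first predefined response whose keywords all appear in the message."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    for keywords, response in PREDEFINED.items():
        if keywords <= tokens:  # every keyword is present in the message
            return response
    return FALLBACK  # the question was not anticipated


print(reply("What are your opening hours?"))
print(reply("Can you recommend a book?"))
```

A bot that learns from questions that were not predefined would replace the exact keyword match with a trained language model, which is beyond the scope of this sketch.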
Xatkit, a UOC spin-off company that provides pre-trained bots for e-commerce, is a clear example of how useful chatbots can be. When installed in stores, they can determine the products being sold and automatically configure themselves to start helping customers. "They can also continue to learn by themselves about changes made to the store and the products," said Jordi Cabot, the person behind the spin-off company and coordinator of the BODI project. Cabot is an ICREA researcher and leader of the Systems, Software and Models Research Lab (SOM Research Lab) at the UOC's Internet Interdisciplinary Institute (IN3).
BODI will also use chatbots as a mechanism for queries, using solutions based on open data, and automatically generating an interface that is simple and familiar for the user. "These technical contributions will have a very positive social and economic impact," said Cabot.
"First, they will help citizens to improve their decision-making by increasing independent access to data. They will also improve the return on the investment in open data projects made by all kinds of institutions and official bodies, by increasing their dissemination and use. In addition, they will enable private projects focusing on the curation and publication of additional open data to be carried out, by reducing the costs for publication and use," he added.
Democratization of information
An increasing amount of open data is published by both the public sector and private sources every day. For example, the official European data portal currently contains more than a million data sets, ranging from connection information, weather data streams and logs to contact numbers and traffic information.
The open data movement aims to make as much information as possible available to the public for countless uses, ranging from planning a journey to overseeing the actions of a government. Although major steps have already been taken in this direction, the problem with working with these large data sets is that the technological infrastructure to manage them is not always in place. This situation creates a paradox: data are increasingly available, but there are no adequate tools to handle them.
At present, only people with technical skills and specialized training are able to take advantage of heterogeneous data sources, while the majority of the population is forced to depend on third-party applications or businesses, which means that a large part of the spirit of open data is overlooked. "BODI seeks to change this situation: our objective is to empower citizens to exploit and benefit from open data. By eliminating the technical barrier to accessing data, our project expands the economic opportunities provided by open data and increases their net worth," said Cabot.
In order to achieve this ambitious goal, the project aims to take advantage of the latest technological breakthroughs in conversational interfaces: not only chatbots but also voicebots, which use an interface based on spoken conversation so that users can ask questions in natural language.
"The BODI platform will process the questions and break them down into a set of technical requests that will draw from open data sources through APIs or other available sources," said Cabot. "At the end of the process, the technology will compile the data and respond to the user in simple and natural language, ensuring smooth communication." This builds a natural bridge between users and vast sources of data which have hitherto been inaccessible, despite the best intentions, enabling a true democratization of the use of information.
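The flow Cabot describes (process the question, break it down into technical requests, retrieve the data, respond in natural language) can be sketched roughly as follows. This is only an illustrative sketch and not the BODI implementation: the in-memory data set stands in for a real open data API, its values are made up, and every name (`AIR_QUALITY`, `DataRequest`, `parse_question`, `fetch`, `answer`) is hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stand-in for an open data source, with made-up sample values.
# A real deployment would query a data portal's API instead.
AIR_QUALITY = {
    "barcelona": {"no2": 38, "pm10": 21},
    "girona": {"no2": 17, "pm10": 14},
}


@dataclass
class DataRequest:
    """One technical request derived from the user's question."""
    dataset: str
    city: str
    field: str


def parse_question(question: str) -> list[DataRequest]:
    """Break a question into one request per (city, pollutant) pair it mentions."""
    words = question.lower().replace("?", "").replace(",", "").split()
    cities = [w for w in words if w in AIR_QUALITY]
    fields = [w for w in words if w in ("no2", "pm10")]
    return [DataRequest("air_quality", c, f) for c in cities for f in fields]


def fetch(request: DataRequest) -> int:
    """Resolve a single request against the stand-in data source."""
    return AIR_QUALITY[request.city][request.field]


def answer(question: str) -> str:
    """Compile the retrieved values into a plain-language reply."""
    requests = parse_question(question)
    if not requests:
        return "Sorry, I could not find matching open data for that question."
    parts = [f"{r.field.upper()} in {r.city.title()} is {fetch(r)}" for r in requests]
    return "According to the data set, " + "; ".join(parts) + "."


print(answer("What is the NO2 level in Barcelona?"))
```

Keeping the parsing, retrieval and response steps in separate functions mirrors the stages named in the quote, so the keyword matching used here could be swapped for a trained language model without touching the other two.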
A firm commitment to everything open
"The tools created by the project will be available as free software," said Cabot. "Templates and example projects taken from the pilots will be produced to help all types of companies and institutions interested in applying BODI technologies to their own data sets." He confirmed that, by the time the project ends, the platform is expected to be mature enough to be applied immediately in industrial projects.
"One of BODI's objectives is to ensure the technology transfer of its results to the public administration and the business community. To do this, we will work on a strategy for the commercialization and exploitation of the project results, which will contribute to ensuring its continuity and future sustainability," he continued. "All the results will be published, as well as the open source part of the platform, providing the free technical infrastructure needed to have conversations about open data. In addition, to maximize its outreach, we'll also define a strategy that provides customized solutions for companies and institutions," he said.
This philosophy of open sharing therefore has a significant social benefit. "The results of this project will have a major impact on providing the population with unrestricted access to huge amounts of open data that are available online, helping them with their day-to-day decision-making," he explained. "Improving access to open data in the public sector will boost the success of open governance initiatives. And in the private sector, the use of open data will help all types of businesses to strengthen their markets and their relationship with their clients," said Cabot.
"We believe that chatbots are the perfect interface for open data," he said. "Because by definition they're a conversational interface, they don't impose any type of entry barrier that excludes any part of the population. They're the perfect complement to provide citizens with more comprehensive resources thanks to access to data."
Project PDC2021-121404-I00, funded by the Spanish Ministry of Science and Innovation and the EU's Next Generation programme.
The Xatkit: Massive generation of chatbots on-demand (with reference number 2019 INNOV 00001) project is supported by the Secretariat for Universities and Research of the Government of Catalonia's Ministry of Business and Knowledge, and has been co-financed by the European Union through the European Regional Development Fund (ERDF).
Xatkit was also the winning project of the jury prize at SpinUOC 2021, the UOC's annual entrepreneurship, innovation and knowledge transfer programme, organized through the Hubbik platform.
This project promotes sustainable development goal (SDG) 9: build resilient infrastructure, promote sustainable industrialization and foster innovation.
The UOC's research and innovation (R&I) is helping overcome pressing challenges faced by global societies in the 21st century, by studying interactions between technology and human & social sciences with a specific focus on the network society, e-learning and e-health.
Over 500 researchers and 52 research groups work among the University's seven faculties and two research centres: the Internet Interdisciplinary Institute (IN3) and the eHealth Center (eHC).
The University also cultivates online learning innovations at its eLearn Center (eLC), as well as UOC community entrepreneurship and knowledge transfer via the Hubbik platform.
The United Nations' 2030 Agenda for Sustainable Development and open knowledge serve as strategic pillars for the UOC's teaching, research and innovation. More information: research.uoc.edu #UOC25years
Jordi Cabot Sagrera
ICREA research professor
Expert in: All kinds of techniques for the analysis, design and implementation of software with the greatest possible quality and productivity. This includes areas such as handling open data, analysing free software projects, and automatic code generation.
Knowledge area: Software engineering.