Abstract: | The amount and variety of digital data currently being generated, stored, and analyzed, including images, videos, and time series, pose challenges to data administrators, analysts, and developers, who struggle to meet the expectations of both data owners and end users. Most applications need to search complex data with queries that analyze different aspects of the data, and they need the answers in a timely manner. Content-based similarity retrieval techniques are well suited to large databases because they support queries and analyses over features extracted automatically from the data, without user intervention. In this paper, we review and discuss the challenges that the database and related communities face in providing techniques and tools that can handle the variety and veracity characteristics of big and complex data, while also preserving the semantics and completeness of the data. We present and discuss examples and results drawn from two decades of experience with real applications. |