Results

WP1 Identification of relevant requirements, techniques, and methods, as well as the limitations and improvements introduced

D1.1 This activity analyzed and interpreted the requirements of the project and of the beneficiary, starting from the project’s terms of reference and validating them through meetings with the beneficiary and through written question-and-answer documents. The report covers the following aspects: (i) the general requirements of the project; (ii) the specific requirements related to algorithms, data, and data formats, as well as the software platform requirements; (iii) the scenarios within the project; (iv) the requirements related to data and application security.

D1.2 This activity analyzes the main databases from the literature that are useful for the scenarios described in this project. Specifically, a series of databases was analyzed for each algorithm, covering their main attributes: the number of training, validation, and testing sequences, the type of data, the metrics used, and how the data were acquired. Methods and techniques for data augmentation, preprocessing, manipulation, and aggregation are also presented. Finally, the activity reviews the scientific literature on Artificial Intelligence algorithms that meet the project’s requirements.

D1.3 This deliverable aims to define and generate specifications for the integrated software application, focusing on the following aspects: (i) defining the application architecture, including the integration with data sources, processing modules, data storage, and auditing; (ii) defining the JSON data format; (iii) presenting the benchmarking platform provided additionally as part of this project.

D1.4 This deliverable defines the main requirements and specifications related to GDPR legislation and Artificial Intelligence, as well as data security.

WP2 Research, design, and development of datasets, AI algorithms, and software services

D2.1 Report on specifications of databases developed for the target scenarios

This deliverable describes and analyzes, in a unified form, the specifications of the databases and output formats for the Artificial Intelligence algorithms developed in the project, defining for each target scenario annotation protocols in JSON format, with fields adapted to the type of data and the type of algorithm. Overall, the document establishes a coherent framework for representing the outputs of all modules, necessary for their integration into the beneficiary’s infrastructure and for comparable performance evaluation.
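As an illustration of what a unified JSON output record could look like, the following is a minimal sketch only. The deliverable's actual schema is not reproduced in this summary, so every field name here (scenario, algorithm, data_type, source_id, detections, bbox) is a hypothetical placeholder, not the project's format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Detection:
    """One predicted entity. All field names are hypothetical."""
    label: str
    confidence: float
    bbox: List[int]  # assumed convention: [x, y, width, height] in pixels


@dataclass
class AlgorithmOutput:
    """Hypothetical common envelope shared by all algorithm outputs."""
    scenario: str
    algorithm: str
    data_type: str
    source_id: str
    detections: List[Detection] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses to plain dicts and lists.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


record = AlgorithmOutput(
    scenario="example-scenario",
    algorithm="example-detector",
    data_type="video",
    source_id="camera-01",
    detections=[Detection(label="person", confidence=0.91, bbox=[10, 20, 64, 128])],
)
print(record.to_json())
```

Keeping a shared envelope with a type-specific payload is one way to make outputs from different modules comparable; other layouts are equally possible.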

D2.2 Report on web services for data aggregation, manipulation, and pre-processing

The report describes the role of the suite of web services developed in the DeepDataRomania project, whose main objective is to strengthen national security by aggregating, manipulating, and preprocessing complex collections of datasets and subsequently processing them with artificial intelligence algorithms. It presents the types of sources that can be integrated into the software platform (live and recorded video streams from IP cameras, VMS systems, and media files; textual sources; and network traffic data specific to the cyber domain), as well as how the user manages them through the graphical interface (adding, configuring, editing, deleting, and integrating with MinIO and local directories). The report details the protocols and standards used for data manipulation (RTSP and HTTP for transport, and the ONVIF standard, especially Profiles G and S, for interoperability with IP video devices) and explains how compatibility is ensured with a wide range of hardware resources and media formats. Finally, the document describes the preprocessing steps applied to the data before they are sent to the AI algorithms integrated into the platform: demultiplexing, decompression, and sampling for video streams, and segmentation into fragments of appropriate size for texts, so that the input data meet the requirements of the algorithms and of the beneficiary.
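The two preprocessing ideas named above, sampling of video streams and segmentation of texts into fragments, can be sketched in a few lines. This is an illustrative sketch, not the project's implementation: the function names, the fixed-rate sampling strategy, and the word-based fragment size with optional overlap are all assumptions.

```python
from typing import List


def sample_frame_indices(total_frames: int, source_fps: float,
                         target_fps: float) -> List[int]:
    """Return the indices of the frames kept when reducing a stream
    from source_fps to roughly target_fps (assumed strategy)."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample faster than the source provides.
    step = max(source_fps / target_fps, 1.0)
    indices = []
    position = 0.0
    while int(position) < total_frames:
        indices.append(int(position))
        position += step
    return indices


def segment_text(text: str, max_words: int, overlap: int = 0) -> List[str]:
    """Split text into fragments of at most max_words words,
    with `overlap` words shared between consecutive fragments."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")
    words = text.split()
    stride = max_words - overlap
    fragments = []
    for start in range(0, len(words), stride):
        fragments.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last fragment already reaches the end of the text
    return fragments


print(sample_frame_indices(total_frames=10, source_fps=25.0, target_fps=5.0))
print(segment_text("a b c d e f g", max_words=3, overlap=1))
```

In practice the fragment size would be chosen to match the input limits of the downstream text algorithm, and the sampling rate to match the needs of each video algorithm.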

D2.3 Report on AI algorithms for processing the target scenarios

The deliverable presents, in a unified and in-depth manner, the technical specifications of all Artificial Intelligence algorithms developed in the project for processing the data corresponding to the target scenarios. Emphasis is placed both on the conceptual description of each algorithm and on the justification of the chosen architectures, the detailing of the processing pipelines, and the presentation of the experimental results obtained. Since the project aims at the coherent integration of heterogeneous AI modules — with roles ranging from visual analysis to audio processing, textual analysis, entity detection, or identification of abnormal behaviors — this chapter defines the technical basis necessary for the development of the Artificial Intelligence algorithms.

D2.4 Report on integration and validation of AI algorithms in the web services

The report presents how the AI algorithms are integrated and containerized in the system architecture, using Docker images orchestrated through Kubernetes to obtain a scalable and easy-to-manage solution in which each algorithm becomes an independent, reproducible, and configurable service. It describes the central technologies of the platform (Kafka for communication between services, Redis for distributed caching, MinIO for multimedia file storage, and PostgreSQL for managing application data), as well as how the operator can add, configure, and run processing instances. An essential element of the solution is the manual validation stage, in which operators verify and correct the predictions of the algorithms to ensure the accuracy of the generated datasets, eliminating annotation errors before the final save. Overall, the report describes a fully containerized, scalability-oriented platform, built to integrate heterogeneous AI algorithms and to guarantee the quality of the produced data, in accordance with the beneficiary’s requirements.

D2.5 Report on TRL5 status of the AI benchmarking web services

The activity described concerns the development and testing of the DeepDataRomania Bench web platform, an Evaluation-as-a-Service solution that offers a standardized, reproducible, and reliable framework for benchmarking artificial intelligence algorithms, allowing them to be evaluated on the same datasets and with the same metrics. The platform is built as an orchestrated architecture of Docker containers (Django, Caddy, PostgreSQL, MinIO, RabbitMQ, compute/site workers, Flower, front-end build components), which together manage competitions, users, and data, and execute submissions in isolated and controlled environments. Organizers can define competitions and tasks, configure ingestion and evaluation modules, upload complete bundles or use the graphical interface, and manage participants, datasets, competition stages, submissions, and leaderboards. Participants manage their account and profile, their participation in competitions, and the submission of results or code. Evaluation is performed automatically, based on clearly defined metrics, and leaderboards are built and updated transparently. The platform also provides communication mechanisms (forum), data access control, versioning, and export of results, so that the benchmarking process is robust, fair, and easy to reproduce in diverse experimental contexts.
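The core benchmarking idea, scoring every submission against the same reference data with the same metric and ranking the results, can be illustrated with a short sketch. It is not the platform's code: the accuracy metric, the function names, and the tie-breaking rule are assumptions chosen for the example.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def build_leaderboard(submissions: Dict[str, Sequence[str]],
                      references: Sequence[str]) -> List[Tuple[str, float]]:
    """Score every submission with the same metric on the same data."""
    scores = [(name, accuracy(preds, references))
              for name, preds in submissions.items()]
    # Highest score first; ties are ordered by name so the ranking is deterministic.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


references = ["a", "b", "c", "d"]
submissions = {
    "team_a": ["a", "b", "c", "x"],
    "team_b": ["a", "x", "x", "x"],
    "team_c": ["a", "b", "c", "d"],
}
for rank, (team, score) in enumerate(build_leaderboard(submissions, references), 1):
    print(rank, team, score)
```

Because the reference labels and the metric are fixed for all participants, any difference in rank reflects the submissions themselves, which is what makes the comparison reproducible.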

D2.6 Report on dataset libraries for testing and validating algorithms in the operational environment

This activity analyzes the datasets used for testing and validating the algorithms in the operational environment: datasets taken from the open literature, modified versions of those datasets, and datasets developed from scratch within the project to cover its specific requirements. It thus presents the datasets used for training and testing all Artificial Intelligence modules developed within the project.

D2.7 Report on final testing procedures of the proposed solution, databases, web services, and AI algorithms

This deliverable describes and specifies the methods for testing the proposed solution, analyzing the provided JSON specifications, the web services, and the Artificial Intelligence algorithms, as well as the manner in which these were delivered to the beneficiary. It reproduces the general specifications of the project, which cover integration methods, module invocation methods, and the integration options offered, together with the ways in which the final solutions will be provided for testing, validation, and integration into the beneficiary’s infrastructure.