Anomaly Detector¶
1. Product Description¶
1.1. Solution Overview¶
The AIDEAS Anomaly Detector (AIAD) is a toolkit for detecting anomalies at component level or in the machine as a whole while it operates under working conditions in the factory where it is used. The main problem this solution addresses is the identification of outliers: observations that deviate significantly from the rest. With this approach the user will be able to detect, for example:
Production failures.
Defects.
Undesirable events.
Machinery degradation.
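As a minimal illustration of the outlier concept (not AIAD's actual algorithms, which include clustering and LSTM-based models), a simple z-score rule over a single process variable might look like this; the temperature values are invented:

```python
from statistics import mean, stdev

def find_outliers(values, threshold=3.0):
    """Return indices of values deviating more than `threshold`
    standard deviations from the mean (a classic z-score rule)."""
    mu = mean(values)
    sigma = stdev(values)
    return [i for i, v in enumerate(values)
            if abs(v - mu) > threshold * sigma]

# A stable temperature signal with one spike at index 5.
temps = [20.1, 20.3, 19.9, 20.2, 20.0, 35.0, 20.1, 19.8]
print(find_outliers(temps, threshold=2.0))  # → [5]
```

Here "normal behaviour" is whatever the statistics were computed from, which mirrors how AIAD treats the training data as the reference for normality.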
1.2. Features¶
This AIDEAS Anomaly Detector offers the following features:
Providing the capability of defining the current machine configuration, i.e. the different components of the machine and their associated process variables, which helps the user contextualize the obtained results.
Providing the capability of importing data from external databases (e.g. MongoDB) and from different data sources such as CSV or Excel files.
Providing data validation and pre-processing functionality to ensure that the input data fed to the model is in the correct format.
Providing the capability of training models with different algorithms and different data sources, and of saving them for later use.
Obtaining predictions and displaying the results in a user-friendly way. There are two operating modes: on-demand (offline) mode and cyclic (online) mode.
The solution determines whether there are outliers in the system relative to the data the models were trained with, which is considered normal behaviour.
1.3. Prerequisites¶
• Technical Specifications¶
AIAD is fed with time-series data in which process variables are distributed in columns and each row represents a single timestamp. Anomaly detection is performed under specific operating circumstances, which are considered the normal behaviour; if these conditions change, the models must be retrained to capture the new operating conditions. In addition to raw data, a machine configuration file is needed to give context to the evaluation, i.e. to know exactly which component each variable affects.
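For illustration, the expected layout — one timestamp per row, one process variable per column — could be read like this (the header and values below are made up):

```python
import csv
import io

# Hypothetical input file: each row is a timestamp, each column a process variable.
raw = """timestamp,Temp.table.bear.1,Temp.table.motor
2022-06-20 20:41:27,41.2,55.0
2022-06-20 20:41:28,41.3,55.1
"""

rows = list(csv.DictReader(io.StringIO(raw)))
variables = [c for c in rows[0] if c != "timestamp"]
print(variables)   # process variables found in the header
print(len(rows))   # number of timestamps
```

Each variable column can then be matched against the machine configuration file to know which component it belongs to.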
The AI-AD characterizes the normal behaviour during operation for a set of variables of interest. The model files can be saved, in MinIO storage, to the backend of the solution as a .pkl file to be reused with new data. The solution's outputs are displayed in the UI and also sent to the AIMP.
The backend of the AI-AD is developed in Python using Flask as the framework for the API server. The backend provides the API endpoints through which the frontend communicates, sends requests, and obtains results.
The frontend of the solution is developed in React.
For deployment, Docker is used since it is the most widely used containerization solution. Docker also makes it easy to deploy the packaged application into the runtime environment and is widely supported by deployment tools and technologies.
For internal storage a MinIO server is used. MinIO is a high-performance object storage system.
• Technical Development¶
This AIDEAS Solution has the following development requirements:
Development Language: Python and Javascript.
Libraries: Numpy, Pandas, Scikit-Learn, PyTorch, Flask, PyYAML, SciPy, Pickle.
Container and Orchestration: Docker, Kubernetes.
User Interface: React, PrimeReact, Redux.
Application Interfaces: RestAPI.
Database engine: MongoDB, MinIO.
• Hardware Requirements¶
AI-AD can run on any platform that supports Docker containers.
• Software Requirements¶
Docker Desktop (Windows, Mac, or Linux)
npm (for frontend deployment)
• External Dependencies¶
MongoDB (optional, for external data storage)
MinIO (for object storage)
2. Installation¶
2.1. Environment Preparation¶
Ensure that all dependencies, including Docker, Python, and npm, are installed. Clone the repository from the official GitLab project and configure the backend and frontend environments as needed.
2.2. Step-by-Step Installation Process¶
Local Installation: Requires configuring backend and frontend, installing dependencies, and launching services manually.
Docker Installation: Uses a docker-compose.yml file to deploy the application.
Kubernetes Installation: Pending implementation.
3. Initial Configuration¶
3.1. First Steps¶
• Login¶
Users must log in using GitLab authentication before accessing secured application features.

HOME¶
Dashboard → Tab that displays an introduction to AIAD and from which the other tabs can also be accessed.

Help → Tab with guidelines.

COMMON¶
Machine Configuration → Tab in which the machine on which the application runs is defined.

Data → Tab in which data is imported and visualized in table and graph formats. These options are supported:
Data Files: .csv, .xls or .xlsx.

Establishing a connection with a MongoDB database and accessing its collections.

AI-AD¶
Training → Tab in which models can be trained.

Anomaly Detection → In this tab Anomaly Detection is set and performed and its results are visualized.

Results Overview → Given a Machine Configuration file, the results can be visualized and contextualized within the machine in which the solution is applied.


The tabs “Machine Configuration”, “Data”, “Training” and “Anomaly Detection” cannot be accessed without logging in. Log in by clicking the user button in the top-right corner. A GitLab user is needed.
Machine Configuration¶
In the Machine Configuration screen the following actions can be performed:
Create a new Machine Configuration from scratch¶
Add Components, Variables (name and ID), and their relations using the widgets under the “Machine Configuration” accordion.

Once everything is defined, click on “Create” to see the hierarchy tree table under “Machine Configuration Overview” accordion tab.

If needed, modify the min, max, unit or description columns of the existing variables in the hierarchy tree table.

Click on “Export” to save it in MinIO.

Import an existing Machine Configuration¶
Drag and drop it under the upload widget or click on “+ Choose”. Multiple files are supported.

Click “Upload”.

The file will be saved in MinIO.
If the extension is not .json or the size exceeds the limit, an error will be raised.
Visualize an existing Machine Configuration file¶
Select a Machine Configuration file from the Machine Configuration files dropdown widget.

Once selected, click on “Select File” to load it.

Once loaded, the Machine Configuration hierarchy tree will be available under the “Machine Configuration Overview” accordion tab. The machine's components, process variables and their relations will also be available under the “Machine Configuration” accordion tab.
Edit an existing Machine Configuration file¶
Follow the steps described in the previous functionality.

Modify the desired parameters, e.g. adding new components or process variables, or modifying the min, max, unit or description columns of the existing variables in the hierarchy tree table, as in the create-a-new-machine-configuration functionality. If components, variables or relations between them are added, click on “Create” to update the hierarchy tree table.
Click on “Export” to save it in MinIO.
Delete an existing Machine Configuration file¶
Select a Machine Configuration file from the Machine Configuration files dropdown widget.

Once selected, click on “Delete File” to delete it from MinIO.

Reset Machine Configuration screen¶
Click on the “Reset Screen” button, located in the top right corner, to reset the screen.
Data Files¶
In the Data Files screen the following actions can be performed:
Import a Data File¶
Drag and drop or click “+ Choose” to upload.
Delete an existing Data File¶
Make sure “Data Files” is selected in the top left buttons.
Select a data file from the data files dropdown widget.
Once selected, click on “Delete File” to delete it from MinIO.
Visualize Data File¶
Make sure “Data Files” is selected in the top left buttons.
Select a data file from the dropdown.

Data file will be shown in table format.

Establish a connection to a MongoDB database¶
Make sure “MongoDB” is selected in the top left buttons.
Parametrize the connection by defining: the IP address, the port, the username, the password and the name of the database to connect to.

Once connected, available collections will be visible.

Visualize Collection data¶
Make sure “MongoDB” is selected in the top left buttons.
Once the connection is established select the desired collection from the collection selection dropdown and click on “Select Collection”.

Collection data will be shown in table format under “Data Visualization Table” accordion tab.

If variables are not sorted in columns, the following information is needed:
First, check the checkbox below “Connect” button.

Define the name of the columns containing the variables names, the values and the index of the time series.

Click on “Select Collection” again.
Collection data will be shown in table format under the “Data Visualization Table” accordion tab. The data will not be rearranged, but it is now possible to plot the variables.
Visualize Data Chart¶
Once data is loaded, independently of the selected source:
Under the “Data Visualization Chart” accordion tab, select the desired variables (up to 5) to display from the “Y axis variables” dropdown. The X axis variable can be selected too by enabling the “Choose X axis variable” checkbox and selecting the variable from the “X axis variable” dropdown.

After clicking on the “Plot” button a warning pops up; after accepting it, a scatter plot will be shown.

Training¶
In the Training screen the following actions can be performed:
Train a model¶
First of all, the data source must be selected using the radio buttons on the top left side of the “Data Source Selection” section: “MongoDB” or “Data Files”. Depending on the selection, the dropdown widget will show the existing files under the specific folder inside MinIO or the available collections inside the database. To be able to select the “MongoDB” radio button, the user must first be connected to the database (see Establish a connection to a MongoDB database above). Finally, click on the “Select Data” button.
Once the data is loaded the model can be parameterized, specifying the following information:

The algorithm, clustering or AI (LSTMs), and the metric, if the training data contains outliers (if not, leave it as None), using the “Algorithm” and “Metric” dropdowns.

The time variable, using “Time Variable” dropdown.
The different variables to study, using “Select Variables” dropdown.
To enable or disable AI explainability.
See an example of parameterization below.
Finally, click on “Training” to start the training process.

Once finished, a report will be shown in the “Training Results” section.

Visualize training results¶
Under the “Data Visualization Chart” accordion tab, select the desired variables (up to 5) from the “Select Variable” dropdown.

After clicking on “Plot” button a plot with the signal will be shown.

Save a trained model¶
Click “Save” to store the model.

Reset Training screen¶
Click on the “Reset Screen” button, located in the top right corner, to reset the screen.
3.2. Main Workflows¶
Anomaly Detection¶
In the Anomaly Detection screen the following actions can be performed:
Import an existing model¶
Drag and drop it under the upload widget or click on “+ Choose”. Multiple files are supported.
Click on “Upload”.
The file will be saved in MinIO.
If the extension is not .pkl or the size exceeds the limit, an error will be raised.
Delete an existing model¶
Select a model file from the model files dropdown widget.
Once selected, click on “Delete Model” to delete it from MinIO.
Perform anomaly detection¶
First of all, a model has to be selected using the “Model Selection” dropdown widget. Once selected, a report of the selected model will be shown at the bottom of the “Model Selection” section.
Then, the data source must be selected using the radio buttons on the top left side of the “Data Source Selection” section: “MongoDB” or “Data Files”. Depending on the selection, the dropdown widget will show the existing files under the specified folder inside MinIO or the available collections. To be able to select the “MongoDB” radio button, the user must first be connected to the database (see Establish a connection to a MongoDB database above). Finally, click on the “Select Data” button. There are two working modes:
Offline or on demand mode.
Cyclic results are disabled.
Finally, click on “Obtain Results” to perform the anomaly detection.
A report will be shown in the “Anomaly Detection Results” section.

Online or cyclic mode.
Only works if a database is selected as the data source.
Cyclic results are enabled.
Finally, click on play button to perform the anomaly detection cyclically. The interval is defined in the config.yml file.
Results are shown cyclically. A report will be shown in the “Anomaly Detection Results” section. If visualization has been parameterized, it will be updated automatically. Contextualized results in the results overview screen are also updated.
Clicking on the stop button stops the operation.

Visualize the results¶
Under the “Data Visualization Chart” accordion tab, select the desired variables (up to 5).

After clicking on “Plot” button a plot with the signals and the identified outliers will be shown.

Finally, under “Data Visualization Table Format” accordion tab, the information displayed above is shown in table format. The data rows where the anomalies have been identified are shown.

The table can be exported as .csv, .xlsx or .pdf, using the buttons in the top left corner.
Reset Anomaly Detection screen¶
Click on “Reset Screen” button to reset it, located in the top right corner.
Results Overview¶
Given a Machine Configuration file, in the Results Overview screen the results can be visualized and contextualized within the machine in which the solution is applied. The different components are visualized. If no machine configuration file is selected, results cannot be contextualized, and therefore nothing is displayed.

If online mode is selected, contextualized results are also updated automatically.

4. General Queries¶
4.1. Installation and Configuration Contact (If Service Provided)¶
For installation and configuration support, users should refer to the official GitLab project or the associated organization (IKERLAN).
4.2. Support¶
| Company | Website | Logo |
|---|---|---|
| IKERLAN | | |
4.3. Licensing¶
The solution is licensed under AGPLv3 or PRIVATE licensing models.
Pricing and licensing details are available upon request.
| Subject | Value |
|---|---|
| Payment Model | Quotation under request |
| Price | Quotation under request |
5. User Manual¶
5.1. Glossary of Terms¶
– COMPLETE –
5.2. API Documentation¶
By default, the backend server is served on port 5000 and allows the following API methods. These methods are accessible through the application frontend, or by sending the proper request using tools like Postman, or directly with Python code.
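For example, a request to the backend can be prepared from Python with only the standard library. The payload below matches the /machineConfig POST shape documented later; the username and filename are illustrative, and the request is only constructed here, not sent:

```python
import json
import urllib.request

BASE = "http://localhost:5000"  # default backend port

payload = {"username": "ikerlan", "filename": "lookUpTable.json"}
req = urllib.request.Request(
    f"{BASE}/machineConfig/ikerlan",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```

Tools like Postman build the same kind of request interactively; the frontend issues equivalent calls on the user's behalf.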
| Resource | GET | POST | PUT | DELETE |
|---|---|---|---|---|
| /machineConfig/<user_id> | Supported | Supported | Supported | |
| /machineConfig/<user_id>/upload_machineConfig | | Supported | | Supported |
| /dataFiles/<user_id> | Supported | Supported | | |
| /dataFiles/<user_id>/upload_dataFile | | Supported | | Supported |
| /dataMongo | Supported | Supported | | |
| /dataMongo/connection | | Supported | | |
| /algorithmListAD | Supported | Supported | | |
| /metricsListAD | Supported | | | |
| /trainingAD | | Supported | | |
| /trainingAD/results | | Supported | | |
| /trainingAD/saveModel | | Supported | | |
| /models/<user_id> | Supported | Supported | | Supported |
| /anomalyDetection/model | | Supported | | |
| /anomalyDetection | | Supported | | |
| /anomalyDetection/results | | Supported | | |
| /startJob | Supported | Supported | | |
| /stopJob | | Supported | | |
| /jobResults | | Supported | | |
5.3 Machine Configuration¶
/machineConfig/<user_id>¶
GET → Gets a list of the Machine Configuration file names existing under the MinIO folder, which is specified in config.yml.
POST → Sends the selected Machine Configuration filename to the backend.
PUT → Sends the defined Machine Configuration to be stored as a JSON file under MinIO.
GET response type: [{ "id": 0, "name": "myFileName.json" }, {}, {}]
POST request type: { "username": "ikerlan", "filename": "lookUpTable.json" } POST response type: { "machine": "myMachine", "components": [], "variables": [], "componentDependantVariables": [] }
PUT request type: { "username": "ikerlan", "filename": "fileName.json", "machineTree": {} }
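A PUT payload matching the shape above could be assembled as follows; the machine, component and variable names are invented for illustration:

```python
import json

# Hypothetical machine tree following the documented response shape.
machine_tree = {
    "machine": "myMachine",
    "components": ["Structure"],
    "variables": [{"id": 10106, "name": "Temp.table.bear.1"}],
    "componentDependantVariables": [],
}
put_body = {
    "username": "ikerlan",
    "filename": "myMachine.json",
    "machineTree": machine_tree,
}
# Serialize exactly as it would travel over the wire.
wire = json.dumps(put_body)
print(json.loads(wire)["machineTree"]["machine"])
```

The backend stores this JSON document under MinIO so it can later be selected from the Machine Configuration dropdown.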
/machineConfig/<user_id>/upload_machineConfig¶
POST → Stores a list of Machine Configuration files (machineConfig[]) in JSON format under MinIO.
DELETE → Deletes the selected Machine Configuration file under MinIO.
DELETE request type: { "username": "ikerlan", "filename": "fileName.json" }
5.4 Data Files¶
/dataFiles/<user_id>¶
GET → Get a list of data file names existing under MinIO.
POST → Given a data file name, returns the column names and the data to be visualized.
Obtain column names
Obtain data file rows (given the number of rows to get and the starting row)
Obtain data file columns (given the column names to get)
Obtain data (given the number of rows to get)
GET response type: [{ "id": 0, "name": "myFileName.csv" }, {}, {}, ...]
POST obtain column names Request type: { "username": "ikerlan", "filename": "myFileName.csv", "colNames": true } Response type: { "filename": "myFileName.csv", "totalRows": int, "colNames": [] }
POST obtain data file rows Request type: { "username": "ikerlan", "filename": "myFileName.csv", "nRows": int, "startingRow": int } Response type: { "filename": "myFileName.csv", "dfDict": {} }
POST obtain data file columns Request type: { "username": "ikerlan", "filename": "myFileName.csv", "xVar": "xVar", "yVars": [] } Response type: { "xVarName": "xVar", "xVarValues": [], "yVarsNames": ["yVar", "yVar1", ...], "yVarsValues": [[],[],...] }
POST obtain data Request type: { "username": "ikerlan", "filename": "myFileName.csv", "nRows": int, } Response type: { "filename": "myFileName.csv", "totalRows": int, "colNames": [], "dfDict": {} }
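The row-based POST variant allows paging through large files. As a sketch, a client could build the sequence of "obtain data file rows" payloads like this (the filename and page size are illustrative):

```python
def page_requests(username, filename, total_rows, n_rows):
    """Build the sequence of 'obtain data file rows' payloads needed
    to fetch `total_rows` rows in pages of at most `n_rows`."""
    return [
        {"username": username, "filename": filename,
         "nRows": min(n_rows, total_rows - start), "startingRow": start}
        for start in range(0, total_rows, n_rows)
    ]

pages = page_requests("ikerlan", "myFileName.csv", total_rows=250, n_rows=100)
print([(p["startingRow"], p["nRows"]) for p in pages])
# → [(0, 100), (100, 100), (200, 50)]
```

The totalRows value returned by the column-names request tells the client when to stop paging.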
/dataFiles/<user_id>/upload_dataFile¶
POST → Stores a list of data files (dataFile[]) in .csv, .xls, or .xlsx format under MinIO.
DELETE → Deletes the selected data file under MinIO.
DELETE request type: { "username": "ikerlan", "filename": "dataFile.csv" }
5.5 MongoDB Data¶
/dataMongo¶
GET → Get a list of MongoDB collections in the established DB connections.
POST → Given a collection name, returns the column names and the data to be visualized.
Obtain collection data
Obtain collection data → given the name of the columns where the variable names, the variable values and the collection index (Time variable) are.
Obtain collection data by columns → given the column names to get.
POST obtain collection data (given the columns containing the variable names, the values and the index) Request type: { "collection": "collectionName", "nRows": int, "selectedColVarNames": "", "selectedColValues": "", "selectedColIndex": "" } Response type: { "collection": "collectionName", "totalRows": int, "colNames": [], "dfDict": {}, "varsInCol": [] }
POST obtain collection data (rows) Request type: { "collection": "collectionName", "nRows": int, "startingRow": int } Response type: { "collection": "collectionName", "totalRows": int, "colNames": [], "dfDict": {} }
POST obtain collection data by columns Request type: { "collection": "collectionName", "xVar": "", "yVars": [], "selectedColVarNames": "", "selectedColValues": "", "selectedColIndex": "" } Response type: { "xVarName": "xVar", "xVarValues": [], "yVarsNames": ["yVar", "yVar1", ...], "yVarsValues": [[],[],...] }
/dataMongo/connection¶
POST → Establishes a connection with a MongoDB database given a set of connection parameters.
POST request type: { "user": "", "password": "", "ip": "", "port": int, "dbName": "" }
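A small helper that validates the connection parameters before POSTing them could look like this; the field names follow the request shape above, while the validation rules and example values are assumptions:

```python
def build_connection_params(user, password, ip, port, db_name):
    """Validate and assemble the /dataMongo/connection payload."""
    if not isinstance(port, int) or not (0 < port < 65536):
        raise ValueError(f"invalid port: {port!r}")
    if not db_name:
        raise ValueError("dbName must not be empty")
    return {"user": user, "password": password,
            "ip": ip, "port": port, "dbName": db_name}

params = build_connection_params("ikerlan", "secret", "127.0.0.1", 27017, "plantData")
print(params["port"])  # → 27017
```

Validating on the client side gives faster feedback than waiting for the backend to reject a malformed connection attempt.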
5.6 Anomaly Detection Resources¶
/algorithmListAD¶
GET → Get a list of available algorithms to train models with.
POST → Given an algorithm, returns the list of parameters and their default values.
GET response type: [{ "name": "algorithm" }, {}, {}, ...]
POST obtain algorithm parameters Request type: { "selectedAlgorithm": "algorithm" } Response type: { "algorithmParamsList": [{ "parameter1": { "description": "", "dataType": "", "rangeMin": "", "rangeMax": int, "defaultValue": int } }, {}, {}, ...] }
/metricsListAD¶
GET → Get a list of available metrics during model training.
GET response type: [{ "name": "metric" }, {}, {}, ...]
5.7 Training Anomaly Detection Models¶
/trainingAD¶
POST → Sends training parameters and obtains information about the trained model.
POST request type: { "username": "ikerlan", "selectedDataFile": "dataFile.csv", "selectedCollection": "collectionName", "selectedColVarNames": "", "selectedColValues": "", "selectedColIndex": "", "connectionParams": {}, "selectedTimeVar": "", "selectedVarTime": "", "selectedVars": "", "selectedAlgorithm": "", "selectedAlgorithmParams": [], "selectedMetric": "", "machineConfig": {"components": [], "variables": []}, "explainability": bool } POST response type: { "trainingReport": "", "modelName": "", "modelVars": [] }
/trainingAD/results¶
POST → Get the results of the training given the desired variables.
POST request type: { "yVars": [] } POST response type: { "signalXName": "", "signalYNames": [], "signalXValues": [], "signalYValues": [], "outliersXValues": [], "outliersYValues": [] }
/trainingAD/saveModel¶
POST → Stores the trained model in .pkl under MinIO folder, which is specified in config.yml file.
POST request type: { "username": "ikerlan", "modelName": "myModel.pkl" }
5.8 Anomaly Detection, testing models¶
/models/<user_id>¶
GET → Gets the list of Model files under MinIO folder, which is specified in config.yml file.
POST → Stores a list of Model files, modelFile[], in .pkl format under MinIO folder, which is specified in config.yml file.
DELETE → Deletes the selected Model file under MinIO folder, which is specified in config.yml file.
GET response type: [{ "id": 0, "name": "myFileName.pkl" }, {}, {}]
DELETE request type: { "username": "ikerlan", "filename": "model.pkl" }
/anomalyDetection/model¶
POST → Sends the selected model and gets the model information and the training results.
POST request type: { "username": "ikerlan", "selectedModel": "model.pkl", "machineConfig": {"components": [], "variables": []} } POST response type: { "trainingReport": "", "modelName": "", "modelVars": [] }
/anomalyDetection¶
POST → Get the results from performing anomaly detection, given a model and a dataset.
POST request type: { "username": "ikerlan", "selectedDataFile": "myFile.csv", "selectedCollection": "myCollection", "selectedColVarNames": "", "selectedColValues": "", "selectedColIndex": "", "connectionParams": {}, "selectedModel": "myModel.pkl" } POST response type: { "anomalyReport": "", "dataVariables": [], "outliersTableValues": [], "outliersTableColNames": [], "contextTestResults": {"resultsByVar": [], "resultsByComponent": []} }
resultsByVar, one per variable, example: { "var_id": 10106, "var_name": "Temp.table.bear.1", "component": "Structure", "startTS": "2022-06-20 20:41:27", "endTS": "2022-06-20 20:41:28", "status": "ANOMALY" }
resultsByComponent, one per component, example: { "component": "Structure", "var_id": [10106, 10107], "var_name": ["Temp.table.bear.1", "Temp.table.motor"], "startTS": ["2022-06-20 20:41:27", "2022-06-20 20:41:27"], "endTS": ["2022-06-20 20:41:28", "2022-06-20 20:41:28"], "status": "ANOMALY" }
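Conceptually, the per-component results can be derived from the per-variable ones by grouping on the component field. The sketch below shows one way such an aggregation could be computed, using the example records above; the actual backend logic may differ:

```python
from collections import defaultdict

# Per-variable records, copied from the documented example.
results_by_var = [
    {"var_id": 10106, "var_name": "Temp.table.bear.1", "component": "Structure",
     "startTS": "2022-06-20 20:41:27", "endTS": "2022-06-20 20:41:28",
     "status": "ANOMALY"},
    {"var_id": 10107, "var_name": "Temp.table.motor", "component": "Structure",
     "startTS": "2022-06-20 20:41:27", "endTS": "2022-06-20 20:41:28",
     "status": "ANOMALY"},
]

grouped = defaultdict(lambda: {"var_id": [], "var_name": [],
                               "startTS": [], "endTS": [], "status": "OK"})
for r in results_by_var:
    g = grouped[r["component"]]
    for key in ("var_id", "var_name", "startTS", "endTS"):
        g[key].append(r[key])
    if r["status"] == "ANOMALY":  # one anomalous variable flags the component
        g["status"] = "ANOMALY"

print(grouped["Structure"]["var_id"])  # → [10106, 10107]
```

This is what allows the Results Overview screen to colour whole components based on their variables' statuses.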
/anomalyDetection/results¶
POST → Get anomaly detection results given the desired variables.
POST request type: { "yVars": [] } POST response type: { "signalXName": "", "signalYNames": [], "signalXValues": [], "signalYValues": [], "outliersXValues": [], "outliersYValues": [], "outliersTableValues": [], "outliersTableColNames": [] }
5.9 Cyclic Mode¶
/startJob¶
GET → Configures the cyclic task, adds it to the queue and starts the job. Sockets are used to update the UI every time a new result is obtained. If the cyclic task has been stopped, the job is resumed.
POST → Sends the necessary data to obtain the results, such as the machine configuration file, the collection where data is gathered from, the model to be evaluated, and so on.
GET response type: { "message": "" }
POST Request type: { "username": "ikerlan", "machineConfig": {"components": [], "variables": []}, "selectedCollection": "myCollection", "selectedColVarNames": "", "selectedColValues": "", "selectedColIndex": "", "connectionParams": {}, "selectedModel": "myModel.pkl", "selectedModelVars": [] } Response type: { "message": "" }
/stopJob¶
POST → Stops the cyclic task. If a job has already been scheduled, it will be processed.
POST response type: { "message": "" }
/jobResults¶
POST → Sends the results for visualization purposes.
POST Request type: { "yVars": [] } Response type: [{ "selectedVar": "", "xVarValues": [], "yVarValues": [], "xOutliersValues": [], "yOutliersValues": [] }]
5.10 Sockets message (only in Cyclic Mode)¶
The sockets routine is handled by the my_scheduled_results() function in socket_server.py. This function is executed cyclically, as defined when calling the /startJob method, and performs the following operations:
Gets the time interval used to read data from the database. During the first run, the current datetime is taken as the end time and one month prior as the start time, both datetimes in UTC+0. In subsequent runs these values are updated, e.g. the second run goes from the previous end time to that time plus the defined interval.
Data is read from the database.
Results are obtained.
The following messages are sent:
Ok message:
{ "status": "ok", "message": f"Results obtained.\nFrom: {iniTS_string}, To: {endTS_string}\n{list(report.values())[-1]}", "anomaly_detection_results": {"anomalyReport": "", "dataVariables": [], "outliersTableValues": [], "outliersTableColNames": [], "contextTestResults": {"resultsByVar": [], "resultsByComponent": []}} }
Warning message, if there is no data:
{ "status": "warn", "message": f"No data found for the current time period.\nFrom: {iniTS_string}, To: {endTS_string}" }
Error message, if an error or exception happened:
{ "status": "error", "message": f"Error getting data!\n{ msg['message']},\nFrom: {iniTS_string}, To: {endTS_string}" } { "status": "error", "message": f"Error getting results!\n{ msg['message']},\nFrom: {iniTS_string}, To: {endTS_string}" } { "status": "error", "message": f"Exception happened while obtaining cyclic results.\nException: {str(e)},\nFrom: {iniTS_string}, To: {endTS_string}" }
5.11. Console Commands List¶
npm install → installs frontend dependencies.
pip install -r dev.txt → installs backend dependencies.
docker-compose up --build → Docker-based deployment.
python server.py → launches the backend server.
npm run dev → starts the frontend server.
