Federated Learning · Tool Condition Monitoring

Federated learning for tool condition monitoring in machining

Scalable, data-sovereign learning across machines and sites, without transferring raw data.

Use case

Tool wear monitoring in milling

Learning mode

Federated: local training, shared models

Data flow

Raw data stays on the shop floor

Status quo: tool-wear models in machining

Single-site models reach good lab accuracy but rarely transfer to new machines and jobs.

Typical ML models (single site / single setup) · Accuracy

CNN / LSTM: F1 ~ 0.78
Gradient Boosting: F1 ~ 0.74
Autoencoder: AUC ~ 0.70
Hybrid: F1 ~ 0.81

Good results in the lab or on one dataset, but often unstable when transferred to new machines or jobs.

Why "one model per use case" rarely scales

Obstacles

Domain shift: a different machine, tool, material, coolant or parameter set changes the signal statistics.
Label scarcity: wear labels require measurement and machine downtime, so ground truth is expensive.
Rare failure cases: chatter and breakage are rare, so the data is heavily imbalanced.
Data silos and governance: production data often cannot be merged into one central store.
Drift: bearings, spindle and sensors change over time, so models age in operation.
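
The class imbalance mentioned above is often countered by weighting rare classes more heavily during training. The sketch below uses the common "balanced" heuristic, n_samples / (n_classes * class_count). The label names and counts are hypothetical and are not taken from any dataset on this page.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical label distribution: breakage is rare compared to normal cutting.
labels = ["ok"] * 90 + ["worn"] * 9 + ["breakage"] * 1
weights = balanced_class_weights(labels)

for cls, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cls}: {weight:.2f}")
```

Weighting does not create new information about rare failures; it only keeps the frequent "ok" class from dominating the loss.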

A model tuned perfectly to one setup generalizes poorly to others, so maintenance and retraining effort rises sharply as machines and jobs are added.

What is federated learning?

Federated learning trains a shared model across many data sources without moving the raw data.

Federated Learning (FL)

What it is

Decentralized training across several data sources (machines or sites).
Training happens locally on each machine or site.
Only model updates (weights or gradients) are shared.
An aggregation server builds a global model (for example via FedAvg).
Suited to data restrictions (IP, privacy, network or IT constraints).
Goal: better generalization across heterogeneous production environments.
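
The FedAvg aggregation named above is commonly written as a sample-weighted average of the local models. The notation is an assumption introduced here: K participating clients, n_k training samples at client k, and w_k^{t+1} the weights client k returns after local training in round t.

```latex
w^{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_k^{t+1},
\qquad n = \sum_{k=1}^{K} n_k
```

Clients with more local data therefore pull the global model more strongly, which is one reason non-IID data needs attention.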

Transfer Learning (TL)

How it differs

A model is pre-trained on a source domain and fine-tuned on the target.
Typically the data (or features) must be available at the target.
It does not protect data sovereignty on its own.

Which problems does FL solve?

Generalization across many machines without centralizing raw data.
Continuous learning under drift and changing jobs.
Fast rollout: one global model for the whole fleet.

Obstacles and pitfalls

Non-IID data: data distributions differ between machines and sites, which slows convergence.
System heterogeneity: compute power and cycle times vary.
Communication: bandwidth, outages and round coordination.
Privacy and security: updates can leak information, so use secure aggregation or differential privacy.
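
One idea behind secure aggregation is pairwise masking: each pair of clients agrees on a random mask that one adds and the other subtracts, so the server sees only masked updates while their sum stays unchanged. This is a minimal sketch under simplifying assumptions: a single seeded generator stands in for the pairwise shared secrets, updates are integers (for example after fixed-point quantization), and no client drops out. Real protocols handle key agreement and dropouts explicitly.

```python
import random

def masked_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum; individual updates are hidden."""
    rng = random.Random(seed)  # stand-in for pairwise shared secrets
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.randint(-10**6, 10**6) for _ in range(dim)]
            for d in range(dim):
                masked[i][d] += mask[d]  # client i adds the shared mask
                masked[j][d] -= mask[d]  # client j subtracts the same mask
    return masked

# Integer updates from three hypothetical clients.
updates = [[10, -3], [4, 7], [-2, 5]]
masked = masked_updates(updates)

server_sum = [sum(column) for column in zip(*masked)]
true_sum = [sum(column) for column in zip(*updates)]
print(server_sum, true_sum)
```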

Transfer learning means "transfer knowledge and adapt locally"; federated learning means "learn together without sharing raw data".

The federated learning process

Local training at the source machines, aggregation in the cloud, and an improved global model distributed back to every site.

Collect Data

Each site collects data from its machines

Local Training

Each site trains a local model on its own data

Upload Local Models

Upload the local models to the cloud

Model Aggregation

Aggregate the local models into a global model

Download Global Model

Download the global model from the cloud
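
The five steps above can be sketched as a toy simulation. It assumes a one-parameter model y ≈ w * x, plain gradient descent as local training, and sample-weighted averaging as the aggregation rule. The data, learning rate and function names are hypothetical, chosen only to keep the example small.

```python
def local_train(w, data, lr=0.01, epochs=5):
    """A few gradient-descent steps on squared error for y ≈ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(global_w, client_data):
    """One round: download, train locally, upload, aggregate by sample count."""
    local_ws = [local_train(global_w, data) for data in client_data]
    sizes = [len(data) for data in client_data]
    return sum(w * n for w, n in zip(local_ws, sizes)) / sum(sizes)

# Hypothetical toy data: three sites, all roughly following y = 2 * x.
client_data = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(1.5, 3.0), (3.0, 6.2), (2.5, 4.9)],
    [(0.5, 1.0)],
]

global_w = 0.0
for round_idx in range(50):
    global_w = federated_round(global_w, client_data)

print(f"global weight after 50 rounds: {global_w:.3f}")
```

Only the scalar weight crosses the site boundary in each round; the (x, y) pairs never leave `client_data`.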

Key building blocks

Client training on local data.
Aggregation of model updates into a global model.
Versioning of models and training rounds.
Monitoring of model and data drift.

Security

Secure aggregation of model updates.
Transport encryption (TLS).
Access control for participants and assets.
Auditing of rounds and artifacts.

Data generation at a single machine

Each machine is instrumented to turn raw process signals into feature streams that can be labeled and learned from.

Milling process · video

Machine M1 · raw data → feature streams

Recording of the cutting process, used to illustrate where the sensor data originates.

Live telemetry · Grafana

Spindle current · vibration · feed

Telemetry such as spindle current, vibration and feed is streamed and feature-extracted per time window.
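
Feature extraction per time window can be illustrated with the RMS value, one of the features listed for the machines on this page. The sketch assumes fixed-length windows with a configurable hop. The synthetic signal is hypothetical and only mimics a vibration amplitude that doubles halfway through the recording.

```python
import math

def window_rms(signal, window_size, hop):
    """Return the RMS value of each (possibly overlapping) window."""
    features = []
    for start in range(0, len(signal) - window_size + 1, hop):
        window = signal[start:start + window_size]
        features.append(math.sqrt(sum(v * v for v in window) / window_size))
    return features

# Hypothetical vibration trace: a sine whose amplitude doubles after sample 200.
signal = [
    math.sin(2 * math.pi * i / 20) * (1.0 if i < 200 else 2.0)
    for i in range(400)
]
rms = window_rms(signal, window_size=100, hop=100)
print([round(value, 3) for value in rms])
```

Using the same `window_size` and `hop` on every machine is what makes the resulting feature streams comparable across sites.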

Scaling: data generation across three machines

The same instrumentation and pipeline building block extends from one machine to a fleet.

M1 · 3-axis milling

Signals: vibration (XYZ), spindle current, PLC. Features: windowing, RMS, FFT bands. Labelling: tool-life proxy and periodic measurement.

Edge agent → MQTT

M2 · 5-axis / aluminum

Signals: acoustic emission, vibration, torque. Features: same API and time base. Local storage: ring buffer plus batch upload.

Edge agent → MQTT

M3 · robot cell

Signals: force/torque, motor currents. Features: same pipeline modules. Events: anomaly flags and process windows.

Edge agent → MQTT

What matters for scaling

A unified time base (clock sync), the same window definition and the same units.
Schema and metadata per job: tool ID, material, cutting parameters, sensor position.
"Same code, different config": the pipeline as a template per machine type.

Application: condition monitoring with a trained FL model

A federated global model is more stable across machines and jobs than a single-site baseline.

Before: single-site models

Baseline

M1 → M1 (seen): F1 ~ 0.82
M1 → M2 (unseen): F1 ~ 0.61
M1 → M3 (unseen): F1 ~ 0.57

Typical pattern: strong overfitting to one setup and process, which leads to poor transfer to other machines.

After: federated global model

Global → M1: F1 ~ 0.86
Global → M2: F1 ~ 0.81
Global → M3: F1 ~ 0.79

More stable performance across machines and jobs, with less local retraining and faster commissioning.
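
The stability claim can be checked with simple arithmetic on the approximate F1 values quoted above. This sketch computes the mean and the spread (best minus worst machine) for both cases. It adds no new measurements; the helper name `summarize` is introduced here for illustration.

```python
from statistics import mean

# Approximate F1 values as quoted on this page.
baseline = {"M1": 0.82, "M2": 0.61, "M3": 0.57}
federated = {"M1": 0.86, "M2": 0.81, "M3": 0.79}

def summarize(scores):
    """Return (mean, spread) where spread is max minus min across machines."""
    values = list(scores.values())
    return round(mean(values), 3), round(max(values) - min(values), 3)

baseline_mean, baseline_spread = summarize(baseline)
federated_mean, federated_spread = summarize(federated)

print("baseline :", baseline_mean, baseline_spread)
print("federated:", federated_mean, federated_spread)
```

On these figures the spread between the best and worst machine shrinks from about 0.25 to about 0.07, while the mean rises from about 0.67 to 0.82.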

Operating mode

Inference at the edge: condition indicator, remaining tool life or anomaly score in real time.
Round-based training: locally during idle times or adaptively on drift and job changes.
Rollout: the global model is signed, versioned and deployed to the fleet.
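
The "signed and versioned" rollout step could look roughly like this. As a stand-in for a real signature scheme, the sketch uses an HMAC-SHA256 tag from the Python standard library. This is a symmetric construction, so a production rollout would more likely use asymmetric signatures. The key, version string and weights are hypothetical.

```python
import hashlib
import hmac
import json

def sign_model(weights, version, key):
    """Serialize the model with its version and attach an HMAC-SHA256 tag."""
    payload = json.dumps(
        {"version": version, "weights": weights}, sort_keys=True
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def verify_model(artifact, key):
    """Return True only if the payload still matches its tag."""
    expected = hmac.new(key, artifact["payload"], hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, artifact["tag"])

key = b"demo-key-not-for-production"  # hypothetical shared secret
artifact = sign_model([0.12, -0.5, 1.7], version="1.4.0", key=key)

# Simulate tampering in transit: one weight is changed, the tag is not.
tampered = {
    "payload": artifact["payload"].replace(b"1.7", b"9.9"),
    "tag": artifact["tag"],
}

print(verify_model(artifact, key), verify_model(tampered, key))
```

An edge device would run the verification before loading a new model and keep the previous version available for rollback.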

Outlook: challenges and pitfalls

Federated learning avoids raw-data centralization, but its success depends on standardized data, reliable orchestration and protected aggregation.

Data and semantics

Standardization

A unified data description: units, sensor position, sampling, job and tool metadata.
Label definitions: what counts as a "wear state" or an "anomaly"?
Quality: missing data, synchronization, drift, sensor changes.
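
A unified data description can be enforced in code by giving every feature window the same typed record and validating it before training. The field names, the unit vocabulary and the sample values below are assumptions made for illustration; they are not a schema defined by this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WindowRecord:
    """One feature window plus the metadata needed to compare it across sites."""
    machine_id: str
    tool_id: str
    material: str
    sensor_position: str
    sample_rate_hz: int
    unit: str
    rms: float

ALLOWED_UNITS = {"m/s^2", "A", "N", "Nm"}  # hypothetical unit vocabulary

def validate(record):
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    if record.unit not in ALLOWED_UNITS:
        problems.append(f"unknown unit: {record.unit}")
    if record.sample_rate_hz <= 0:
        problems.append("sample rate must be positive")
    if not record.tool_id:
        problems.append("missing tool ID")
    return problems

good = WindowRecord("M1", "T-017", "steel", "spindle housing", 10_000, "m/s^2", 0.42)
bad = WindowRecord("M2", "", "aluminum", "table", 50_000, "g", 0.10)

print(validate(good))
print(validate(bad))
```

Rejecting records at the edge keeps inconsistent units or missing tool metadata from silently entering a training round.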

MLOps and operations

Orchestration

Orchestration of model deployment (versioning, rollback, canary).
Round control: when to train, who participates, how to handle outages?
Monitoring: model drift, data drift, performance per machine and job.
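
Data-drift monitoring can start very simply. The sketch below compares the mean of a recent feature window with a reference period, expressed in units of the reference standard deviation. The threshold of 3.0 and the sample values are hypothetical, and this score is a deliberately crude assumption; it is not a statistical test.

```python
from statistics import mean, stdev

def drift_score(reference, recent):
    """Shift of the recent mean, in units of the reference standard deviation."""
    return abs(mean(recent) - mean(reference)) / stdev(reference)

def has_drifted(reference, recent, threshold=3.0):
    """Flag drift when the score exceeds a (hypothetical) threshold."""
    return drift_score(reference, recent) > threshold

# Hypothetical RMS vibration feature: reference period vs. two later windows.
reference = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00]
stable = [1.01, 0.99, 1.00, 1.02]
shifted = [1.20, 1.22, 1.19, 1.21]

print(has_drifted(reference, stable), has_drifted(reference, shifted))
```

A flag like this could trigger the adaptive training rounds mentioned under the operating mode, evaluated per machine and per job.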

Transferability of the knowledge

How well does global knowledge fit new tools and materials (out-of-distribution)?
Strategies: personalized FL, adapter layers, cluster-FL by machine type.
Security and privacy: secure aggregation, differential privacy, robust aggregation against poisoning.
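
Robust aggregation against poisoning can be illustrated by swapping the mean for a coordinate-wise median, one of several known robust rules. The updates below are hypothetical: four similar clients and one outlier that stands in for a poisoned update.

```python
from statistics import mean, median

def aggregate(updates, reducer):
    """Apply a reducer (mean or median) independently to each coordinate."""
    return [reducer(column) for column in zip(*updates)]

# Hypothetical updates: four honest clients and one poisoned outlier.
updates = [
    [0.10, -0.20],
    [0.12, -0.18],
    [0.09, -0.22],
    [0.11, -0.19],
    [5.00, 4.00],  # poisoned
]

mean_update = aggregate(updates, mean)
median_update = aggregate(updates, median)

print("mean  :", [round(value, 3) for value in mean_update])
print("median:", median_update)
```

The mean is dragged far from the honest updates, while the median stays among them. The trade-off is that a median ignores sample counts and can converge more slowly on benign non-IID data.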

Takeaway

Federated learning fits deployments where raw data must not be centralized.
Standardized data, reliable orchestration and secured aggregation remain prerequisites in production.

Explore further

Model Aggregation

Try model aggregation

Combine the per-machine XGBoost models with different aggregation strategies and compare the result.

Data Ecosystems

Sovereign data ecosystems

See how Gaia-X, Ocean Protocol and Pontus-X keep raw data on the shop floor while models are shared.