Federated learning for tool condition monitoring in machining
Scalable, data-sovereign learning across machines and sites, without transferring raw data.
Status quo: tool-wear models in machining
Single-site models reach good lab accuracy but rarely transfer to new machines and jobs.
Typical ML models (single site / single setup)
Accuracy: good results in the lab or on one dataset, but often unstable when transferred to new machines or jobs.
Why "one model per use case" rarely scales
Obstacles
- Domain shift: a different machine, tool, material, coolant or parameter set changes the signal statistics.
- Label scarcity: wear labels require measurement and machine downtime, so ground truth is expensive.
- Rare failure cases: chatter and breakage are rare, so the data is heavily imbalanced.
- Data silos and governance: production data often cannot be merged into one central store.
- Drift: bearings, spindle and sensors change over time, so models age in operation.
A model that is near-perfect on its own setup generalizes poorly, so maintenance and retraining effort increases sharply as more machines are added.
What is federated learning?
Federated learning trains a shared model across many data sources without moving the raw data.
Federated Learning (FL)
What it is
- Decentralized training across several data sources (machines or sites).
- Training happens locally on each machine or site.
- Only model updates (weights or gradients) are shared.
- An aggregation server builds a global model (for example via FedAvg).
- Suited to data restrictions (IP, privacy, network or IT constraints).
- Goal: better generalization across heterogeneous production environments.
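The aggregation step named above (FedAvg) can be sketched as a sample-size-weighted average of the client weight vectors. This is a minimal illustration, not the implementation used here: the function name `fed_avg`, the flat-list weight format and the example numbers are assumptions.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Average client weight vectors, weighted by local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of equal length")
        for i, w in enumerate(weights):
            global_weights[i] += w * n_samples / total
    return global_weights


# Three hypothetical machines; the third holds twice as much local data.
updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 100, 200]
print(fed_avg(updates, sizes))
```

Only the weight vectors and sample counts reach the server in this sketch; the raw signals stay on each machine.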
Transfer Learning (TL)
How it differs
- A model is pre-trained on a source domain and fine-tuned on the target.
- Typically the data (or features) must be available at the target.
- It does not protect data sovereignty on its own.
Which problems does FL solve?
- Generalization across many machines without centralizing raw data.
- Continuous learning under drift and changing jobs.
- Fast rollout: one global model for the whole fleet.
Obstacles and pitfalls
- Non-IID data: differing distributions slow convergence.
- System heterogeneity: compute power and cycle times vary.
- Communication: bandwidth, outages and round coordination.
- Privacy and security: updates can leak information, so use secure aggregation or differential privacy.
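One common building block behind differential privacy for model updates is to clip each update to a fixed L2 norm and add Gaussian noise before it leaves the client. The sketch below shows only that mechanical step. The names `clip_and_noise`, `clip_norm` and `noise_std` are illustrative, the numbers are arbitrary, and the noise calibration and privacy accounting needed for a formal guarantee are omitted.

```python
import math
import random
from typing import List, Sequence


def clip_and_noise(update: Sequence[float], clip_norm: float,
                   noise_std: float, rng: random.Random) -> List[float]:
    """Clip an update to an L2 norm bound, then add Gaussian noise per coordinate."""
    norm = math.sqrt(sum(u * u for u in update))
    # Scale down only if the update exceeds the bound; never scale up.
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    return [u * scale + rng.gauss(0.0, noise_std) for u in update]


rng = random.Random(0)      # seeded only to make the example repeatable
raw_update = [3.0, 4.0]     # L2 norm is 5.0, above the bound of 1.0
print(clip_and_noise(raw_update, clip_norm=1.0, noise_std=0.05, rng=rng))
```

Clipping bounds how much any single client can influence the aggregate; the noise level then trades accuracy against how much an individual update can reveal.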
Transfer learning means "transfer knowledge and adapt locally"; federated learning means "learn together without sharing raw data".
The federated learning process
Models are trained locally at the source machines, the updates are aggregated in the cloud, and the improved global model is distributed back to the machines.
Key building blocks
- Client training on local data.
- Aggregation of model updates into a global model.
- Versioning of models and training rounds.
- Monitoring of model and data drift.
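The first three building blocks can be put together in a toy round loop: each client trains on its own data, the server averages the results, and every round produces a new model version. This is a sketch under strong simplifications (a one-parameter linear model, made-up data, hypothetical names such as `GlobalModel` and `run_round`); drift monitoring is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature value, wear label), illustrative only


@dataclass
class GlobalModel:
    version: int   # incremented once per training round
    weight: float  # single parameter of the toy model y = weight * x


def local_train(weight: float, data: List[Sample],
                lr: float = 0.1, epochs: int = 5) -> float:
    """Gradient descent on mean squared error, using only local data."""
    for _ in range(epochs):
        grad = sum(2.0 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def run_round(model: GlobalModel, clients: List[List[Sample]]) -> GlobalModel:
    """One round: local training on each client, then weighted averaging."""
    results = [(local_train(model.weight, data), len(data)) for data in clients]
    total = sum(n for _, n in results)
    new_weight = sum(w * n for w, n in results) / total
    return GlobalModel(version=model.version + 1, weight=new_weight)


# Two hypothetical machines with slightly different local data.
clients = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(1.0, 1.9), (3.0, 6.2), (2.0, 3.9)],
]
model = GlobalModel(version=0, weight=0.0)
for _ in range(20):
    model = run_round(model, clients)
print(model.version, round(model.weight, 3))
```

The server only ever sees the trained weight and the sample count of each client, never the samples themselves.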
Security
- Secure aggregation of model updates.
- Transport encryption (TLS).
- Access control for participants and assets.
- Auditing of rounds and artifacts.
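Secure aggregation can be illustrated with pairwise masking: every pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum and the server learns only the aggregate. The sketch below is a teaching example under loud assumptions: the pair seed is derived from a shared `round_seed` instead of a key agreement, Python's `random` is not cryptographically secure, and client dropouts are not handled.

```python
import random
from typing import Dict, List, Sequence

MOD = 2 ** 32     # all arithmetic is done modulo this value
SCALE = 10_000    # fixed-point scale used to turn floats into integers


def encode(update: Sequence[float]) -> List[int]:
    """Quantize a float update to fixed-point integers modulo MOD."""
    return [round(u * SCALE) % MOD for u in update]


def decode(total: Sequence[int]) -> List[float]:
    """Map modular sums back to signed floats."""
    return [(v - MOD if v >= MOD // 2 else v) / SCALE for v in total]


def mask_update(cid: str, update: Sequence[float],
                all_ids: Sequence[str], round_seed: int) -> List[int]:
    """Add one shared mask per peer; the lower ID adds, the higher ID subtracts."""
    masked = encode(update)
    for other in all_ids:
        if other == cid:
            continue
        lo, hi = sorted((cid, other))
        pair_rng = random.Random(f"{round_seed}:{lo}:{hi}")  # same stream for both peers
        sign = 1 if cid == lo else -1
        for k in range(len(masked)):
            masked[k] = (masked[k] + sign * pair_rng.randrange(MOD)) % MOD
    return masked


def aggregate(masked_updates: Sequence[Sequence[int]]) -> List[int]:
    """Server-side sum; the pairwise masks cancel out."""
    dim = len(masked_updates[0])
    return [sum(m[k] for m in masked_updates) % MOD for k in range(dim)]


updates: Dict[str, List[float]] = {
    "M1": [0.5, -1.0], "M2": [1.5, 2.0], "M3": [-0.25, 0.75],
}
ids = sorted(updates)
masked = [mask_update(cid, updates[cid], ids, round_seed=7) for cid in ids]
print(decode(aggregate(masked)))
```

Each masked vector looks random on its own; only the sum over all participants decodes to a meaningful value.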
Data generation at a single machine
Each machine is instrumented to turn raw process signals into feature streams that can be labeled and learned from.
Milling process · video
Machine M1 · raw data → feature streams

Recording of the cutting process, used to illustrate where the sensor data originates.
Live telemetry · Grafana
Spindle current · vibration · feed

Live dashboard available on the internal network
Telemetry such as spindle current, vibration and feed is streamed and feature-extracted per time window.
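Per-window feature extraction can be sketched with a root-mean-square (RMS) feature over fixed-length windows. The function name `window_rms`, the window and hop lengths and the sample values are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def window_rms(signal: Sequence[float], window: int, hop: int) -> List[float]:
    """Return the RMS of each complete window; a trailing partial window is dropped."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    features = []
    for start in range(0, len(signal) - window + 1, hop):
        chunk = signal[start:start + window]
        features.append(math.sqrt(sum(x * x for x in chunk) / window))
    return features


# Hypothetical vibration samples: the amplitude doubles in the second window.
samples = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
print(window_rms(samples, window=4, hop=4))
```

Setting `hop` smaller than `window` gives overlapping windows, which trades more features per second against more computation at the edge.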
Scaling: data generation across three machines
The same instrumentation and pipeline building block extends from one machine to a fleet.
M1 · 3-axis milling
Signals: vibration (XYZ), spindle current, PLC. Features: windowing, RMS, FFT bands. Labelling: tool-life proxy and periodic measurement.
Edge agent → MQTT
M2 · 5-axis / aluminum
Signals: acoustic emission, vibration, torque. Features: same API and time base. Local storage: ring buffer plus batch upload.
Edge agent → MQTT
M3 · robot cell
Signals: force/torque, motor currents. Features: same pipeline modules. Events: anomaly flags and process windows.
Edge agent → MQTT
What matters for scaling
- A unified time base (clock sync), the same window definition and the same units.
- Schema and metadata per job: tool ID, material, cutting parameters, sensor position.
- "Same code, different config": the pipeline as a template per machine type.
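"Same code, different config" can be sketched as one record-building function driven by a per-machine configuration and per-job metadata. All field names, sample rates, IDs and values below are hypothetical; they only show how a shared window definition and job schema could travel with every feature record.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PipelineConfig:
    machine_id: str
    signals: Tuple[str, ...]
    sample_rate_hz: int
    window_s: float          # same window definition on every machine

    @property
    def window_samples(self) -> int:
        """Window length in samples, derived from the shared window duration."""
        return int(round(self.sample_rate_hz * self.window_s))


@dataclass(frozen=True)
class JobMetadata:
    tool_id: str
    material: str
    cutting_speed_m_min: float
    sensor_position: str


def make_record(cfg: PipelineConfig, job: JobMetadata,
                features: Dict[str, float]) -> Dict[str, object]:
    """Attach machine and job context to one window of features."""
    return {"machine_id": cfg.machine_id, "window_s": cfg.window_s,
            **asdict(job), "features": dict(features)}


# Same code path, two machine-specific configs (values are made up).
m1 = PipelineConfig("M1", ("vibration_x", "spindle_current"), 10_000, 0.5)
m2 = PipelineConfig("M2", ("acoustic_emission", "torque"), 50_000, 0.5)
job = JobMetadata("T-042", "aluminum", 300.0, "spindle housing")

print(m1.window_samples, m2.window_samples)
print(make_record(m1, job, {"vibration_x_rms": 0.12}))
```

Because the window is defined in seconds, machines with different sample rates still produce features on the same time base.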
Application: condition monitoring with a trained FL model
A federated global model is more stable across machines and jobs than a single-site baseline.
Before: single-site models
Baseline: typically strong overfitting to one setup and process, leading to poor transfer.
After: federated global model
FL: more stable performance across machines and jobs, with less local retraining and faster commissioning.
Operating mode
- Inference at the edge: condition indicator, remaining tool life or anomaly score in real time.
- Round-based training: locally during idle times or adaptively on drift and job changes.
- Rollout: the global model is signed, versioned and deployed to the fleet.
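The rollout step ("signed, versioned and deployed") can be sketched with an integrity tag that each edge device checks before loading a model. This example uses an HMAC from the Python standard library purely for illustration; it assumes a shared secret key, whereas a production rollout would more likely use asymmetric signatures. The function names and the artifact layout are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, List


def sign_model(weights: List[float], version: int, key: bytes) -> Dict[str, str]:
    """Serialize the model with its version and attach an HMAC-SHA256 tag."""
    payload = json.dumps({"version": version, "weights": weights}, sort_keys=True)
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify_model(artifact: Dict[str, str], key: bytes) -> bool:
    """Recompute the tag on the edge device and compare in constant time."""
    expected = hmac.new(key, artifact["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, artifact["signature"])


key = b"demo-key-not-for-production"   # placeholder secret for the example
artifact = sign_model([0.1, -0.2, 0.3], version=12, key=key)
print(verify_model(artifact, key))

tampered = dict(artifact, payload=artifact["payload"].replace("12", "13"))
print(verify_model(tampered, key))
```

Including the version inside the signed payload means a device can reject both modified weights and a relabelled older model.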
Outlook: challenges and pitfalls
Federated learning avoids raw-data centralization, but its success depends on standardized data, reliable orchestration and protected aggregation.
Data and semantics
Standardization
- A unified data description: units, sensor position, sampling, job and tool metadata.
- Label definitions: what counts as a "wear state" or an "anomaly"?
- Quality: missing data, synchronization, drift, sensor changes.
MLOps and operations
Orchestration
- Model deployment: versioning, rollback, canary.
- Round control: when to train, who participates, how to handle outages?
- Monitoring: model drift, data drift, performance per machine and job.
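A very simple data-drift monitor compares the mean of a feature in recent windows with a reference period, measured in reference standard deviations. The sketch below is one possible indicator among many; the function names, the threshold of 2.0 and the sample values are assumptions.

```python
import statistics
from typing import Sequence


def drift_score(reference: Sequence[float], current: Sequence[float]) -> float:
    """Absolute shift of the mean, in units of the reference standard deviation."""
    ref_std = statistics.pstdev(reference)
    if ref_std == 0.0:
        raise ValueError("reference data has zero variance")
    shift = statistics.fmean(current) - statistics.fmean(reference)
    return abs(shift) / ref_std


def needs_retraining(reference: Sequence[float], current: Sequence[float],
                     threshold: float = 2.0) -> bool:
    """Flag a machine for a training round when the score exceeds the threshold."""
    return drift_score(reference, current) > threshold


# Hypothetical RMS values of spindle current per window.
reference_rms = [1.0, 2.0, 3.0, 4.0, 5.0]
recent_rms = [5.0, 6.0, 7.0]
print(round(drift_score(reference_rms, recent_rms), 2))
print(needs_retraining(reference_rms, recent_rms))
```

Computing the score per machine and per job keeps a shift on one machine from being averaged away across the fleet.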
Transferability of the knowledge
- How well does global knowledge fit new tools and materials (out-of-distribution)?
- Strategies: personalized FL, adapter layers, cluster-FL by machine type.
- Security and privacy: secure aggregation, differential privacy, robust aggregation against poisoning.
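Robust aggregation against poisoning can be illustrated with a coordinate-wise median, which is one of several known robust rules and is shown here only as an example. A single extreme update barely moves the median, while it dominates a plain mean. The function names and the numbers are hypothetical.

```python
import statistics
from typing import List, Sequence


def median_aggregate(updates: Sequence[Sequence[float]]) -> List[float]:
    """Take the median of each coordinate across all client updates."""
    dim = len(updates[0])
    return [statistics.median(u[k] for u in updates) for k in range(dim)]


def mean_aggregate(updates: Sequence[Sequence[float]]) -> List[float]:
    """Plain unweighted mean, included only for comparison."""
    dim = len(updates[0])
    return [statistics.fmean(u[k] for u in updates) for k in range(dim)]


# Four plausible updates and one poisoned update (last row).
client_updates = [
    [1.0, 1.0],
    [1.2, 0.8],
    [0.9, 1.1],
    [1.1, 0.9],
    [100.0, -100.0],
]
print(median_aggregate(client_updates))
print(mean_aggregate(client_updates))
```

The median ignores sample counts, so it gives up the size weighting of FedAvg in exchange for resistance to outliers.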
Takeaways
- Federated learning fits deployments where raw data must not be centralized.
- Standardized data, reliable orchestration and secured aggregation remain prerequisites in production.
Explore further
Model Aggregation
Try model aggregation
Combine the per-machine XGBoost models with different aggregation strategies and compare the result.
Data Ecosystems
Sovereign data ecosystems
See how Gaia-X, Ocean Protocol and Pontus-X keep raw data on the shop floor while models are shared.