Summary
Edge AI inference requires computing power that traditional server architectures cannot deliver at scale. NEXCOM's FTA 5190 is a 1U rackmount edge server powered by Intel Xeon 6 SoC, delivering up to 36 cores with integrated AI acceleration via Intel Advanced Matrix Extensions (AMX) and cryptographic acceleration through Intel QAT Gen 5. The server combines eight 25GbE fiber ports, eight 1GbE copper ports, and optional LAN module expansion to 100GbE, enabling simultaneous high-performance inference, content delivery, and security acceleration for cybersecurity, edge analytics, and distributed computing workloads.
Problem / Requirements
Organizations deploying edge AI face three critical bottlenecks:
1. Inference Latency: Latency-sensitive AI applications need fast, predictable response times (under 50 ms end to end in the scenarios covered here); sending data to the cloud for processing adds round-trip delays that break that budget.
2. Throughput Constraints: Standard processors cannot sustain multi-stream AI inference while handling concurrent network traffic and cryptographic operations.
3. Network Integration: Edge servers must process incoming data (10/25/100GbE) while simultaneously delivering encrypted inference results without performance degradation.
Without purpose-built edge AI infrastructure, organizations default to expensive cloud processing or accept degraded inference quality and higher latency.
Technical Approach
The FTA 5190 architecture integrates specialized compute functions at the hardware level:
Processor Core:
- Intel Xeon 6 SoC: up to 36 cores optimized for parallel processing
- Base/turbo frequency tuned for sustained edge workloads
- NUMA-aware memory architecture for multi-socket scalability
AI Acceleration:
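- Intel AMX (Advanced Matrix Extensions): Hardware-accelerated matrix operations cut AI inference latency by a factor of 3 to 8 compared with standard processors
- Native support for INT8 and BF16 matrix operations, suited to quantized and mixed-precision models (FP32 layers still run on the regular vector units)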
- Optimized for TensorFlow, PyTorch, and ONNX inference frameworks
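Before relying on AMX, it can help to confirm that the operating system actually exposes it. The following is a minimal sketch, assuming a Linux host where the kernel reports the AMX feature flags (amx_tile, amx_int8, amx_bf16) in /proc/cpuinfo; the function name amx_features is illustrative and not part of any NEXCOM or Intel tool.

```python
from pathlib import Path

# Linux kernel flag names for the AMX feature set as reported in /proc/cpuinfo.
AMX_FLAGS = ("amx_tile", "amx_int8", "amx_bf16")


def amx_features(cpuinfo_text: str) -> dict[str, bool]:
    """Report which AMX flags appear in the first 'flags' line of cpuinfo text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line looks like "flags\t\t: fpu vme ... amx_tile ..."
            flags = set(line.split(":", 1)[1].split())
            break
    return {name: name in flags for name in AMX_FLAGS}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    for name, present in amx_features(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

If any flag is reported as missing, inference frameworks will typically fall back to non-AMX code paths, so the latency figures quoted here should not be expected.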
Cryptographic Acceleration:
- Intel QAT Gen 5: Dedicated hardware for AES, SHA, and public-key operations
- Offloads encryption from CPU cores, preserving compute capacity for AI inference
- Supports IPSec, TLS 1.3, and custom cryptographic pipelines
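To make the offload argument concrete, the sketch below converts a cryptography CPU overhead into the number of cores left for inference. The 36-core count and the under-5% overhead figure come from this brief; the function name and the 40% software-only overhead are hypothetical values chosen purely for comparison, not measurements.

```python
def cores_left_for_inference(total_cores: int, crypto_cpu_fraction: float) -> float:
    """Cores remaining after reserving a fraction of CPU time for cryptography."""
    if not 0.0 <= crypto_cpu_fraction <= 1.0:
        raise ValueError("crypto_cpu_fraction must be between 0 and 1")
    return total_cores * (1.0 - crypto_cpu_fraction)


TOTAL_CORES = 36          # from the brief: up to 36 cores
QAT_OVERHEAD = 0.05       # from the brief: CPU overhead under 5% with QAT
SOFTWARE_OVERHEAD = 0.40  # hypothetical: crypto handled entirely on the cores

with_qat = cores_left_for_inference(TOTAL_CORES, QAT_OVERHEAD)
without_qat = cores_left_for_inference(TOTAL_CORES, SOFTWARE_OVERHEAD)

print(f"with QAT offload: {with_qat:.1f} cores for inference")
print(f"software crypto (assumed): {without_qat:.1f} cores for inference")
```

The real software-only overhead depends on cipher suite, packet size, and traffic mix, so the second figure should be replaced with a measured value before drawing conclusions.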
Network Architecture:
- 8x 25GbE fiber ports: Primary data ingestion and inter-appliance connectivity
- 8x 1GbE copper ports: Legacy system integration and out-of-band management
- Optional LAN module extensions: Scalable to 100GbE for data center aggregation
- Hardware-based packet filtering and load balancing
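The base port inventory above can be summed to check the aggregate bandwidth figure used later in this brief. This is a small sketch using only the port counts and speeds listed here; the PortGroup class is an illustrative helper, and the optional LAN module is left out because its port count is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortGroup:
    name: str
    count: int
    speed_gbps: int

    @property
    def aggregate_gbps(self) -> int:
        """Combined line rate of all ports in the group."""
        return self.count * self.speed_gbps


# Base configuration as listed in the brief (optional LAN module excluded).
BASE_PORTS = [
    PortGroup("25GbE fiber", 8, 25),
    PortGroup("1GbE copper", 8, 1),
]


def total_gbps(groups: list[PortGroup]) -> int:
    """Sum the line rate across all port groups."""
    return sum(g.aggregate_gbps for g in groups)


for group in BASE_PORTS:
    print(f"{group.count}x {group.name}: {group.aggregate_gbps} Gbps")
print(f"total line rate: {total_gbps(BASE_PORTS)} Gbps")
```

These are nominal line rates; sustained throughput will be lower once protocol overhead and packet processing are taken into account.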
Implementation Notes
Deployment Scenarios:
1. Cybersecurity Analytics: Inline packet inspection with AI-driven threat detection; simultaneous encryption of egress traffic via QAT acceleration
2. Edge AI Inference: Multi-model inference (object detection, anomaly detection, classification) with sub-50 ms combined latency
3. Content Delivery: DPI-aware traffic optimization combined with real-time cache intelligence
4. Privacy-Preserving Analytics: Local inference prevents raw data transmission; encrypted results shipped to cloud
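A latency budget is one way to reason about the sub-50 ms target in these scenarios. The sketch below is illustrative only: the 50 ms budget and the 2.5 ms inference figure are taken from this brief, while every other stage timing is a hypothetical placeholder to be replaced with measurements from a real deployment.

```python
BUDGET_MS = 50.0  # from the brief: sub-50 ms combined latency

# Per-stage timings in milliseconds. Only the inference value is quoted in the
# brief (ResNet-50, INT8, batch=1); the rest are hypothetical placeholders.
STAGES_MS = {
    "network ingest": 1.0,
    "preprocessing": 4.0,
    "inference": 2.5,
    "postprocessing": 1.5,
    "encrypt and send": 1.0,
}


def latency_headroom(stages_ms: dict[str, float], budget_ms: float) -> float:
    """Return the remaining budget in ms; a negative value means over budget."""
    return budget_ms - sum(stages_ms.values())


headroom = latency_headroom(STAGES_MS, BUDGET_MS)
status = "within" if headroom >= 0 else "over"
print(f"pipeline total: {sum(STAGES_MS.values()):.1f} ms ({status} budget, "
      f"headroom {headroom:.1f} ms)")
```

Running several models back to back adds one inference entry per model, which is where the headroom is most likely to be consumed.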
Configuration Examples:
- Dual-model inference: Two concurrent AI models (30 FPS video + 1000 req/sec tabular inference) on single server
- Cryptographic gateway: 50+ Gbps throughput with AES-GCM encryption via QAT, CPU overhead <5%
- Distributed training feeder: Automated feature extraction for model training pipelines at edge
Performance Metrics:
- ResNet-50 inference: 2.5 ms per image (INT8, batch=1) on single core
- Throughput scaling: Linear performance increase up to 32 concurrent inference streams
- Memory: Up to 1TB DRAM per server via multi-socket expansion
- Power efficiency: 3-5 TFLOPS per watt for AI workloads
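The per-image latency above can be turned into rough capacity numbers. This is a back-of-the-envelope sketch, assuming each stream runs batch=1 inference back to back on its own core and that the linear scaling claim holds exactly up to 32 streams; the function names and the 30 FPS camera rate used for sizing are illustrative.

```python
import math

RESNET50_LATENCY_MS = 2.5  # from the brief: INT8, batch=1, single core
MAX_LINEAR_STREAMS = 32    # from the brief: linear scaling up to 32 streams


def images_per_second(latency_ms: float) -> float:
    """Throughput of one core processing images back to back."""
    return 1000.0 / latency_ms


def streams_per_core(latency_ms: float, fps: float) -> int:
    """Whole video streams at the given frame rate that fit on one core."""
    return math.floor(images_per_second(latency_ms) / fps)


per_core = images_per_second(RESNET50_LATENCY_MS)
aggregate = per_core * MAX_LINEAR_STREAMS
cameras = streams_per_core(RESNET50_LATENCY_MS, fps=30.0)

print(f"per-core throughput: {per_core:.0f} images/s")
print(f"aggregate at {MAX_LINEAR_STREAMS} streams: {aggregate:.0f} images/s")
print(f"30 FPS streams per core: {cameras}")
```

These figures ignore decoding, preprocessing, and memory bandwidth contention, so they are an upper bound rather than a sizing guarantee.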
Challenge-Solution Mapping
/table
Challenge | Requirement | NEXCOM Solution
AI inference latency at edge | <50 ms end-to-end response | Xeon 6 SoC + Intel AMX acceleration
Multi-stream concurrent inference | 30+ parallel inference jobs | 36 cores with NUMA architecture
Simultaneous encryption overhead | Sub-5% CPU impact for cryptography | Intel QAT Gen 5 offload
Data ingestion throughput | 200+ Gbps aggregate network I/O | 8x 25GbE + LAN module to 100GbE
Mixed legacy/modern integration | Support 1GbE and 25GbE simultaneously | Mixed port configuration (8x 1GbE copper + 8x 25GbE fiber)
Scalability to 100GbE deployments | Future-proof networking capacity | LAN module extension path to 100GbE
/endtable
Specifications Snapshot
/table
Specification | Detail
Form Factor | 1U rackmount
Processor | Intel Xeon 6 SoC (up to 36 cores)
Memory | Up to 1TB DRAM (NUMA-aware)
AI Acceleration | Intel AMX (matrix extensions)
Crypto Acceleration | Intel QAT Gen 5
Network Ports | 8x 25GbE fiber, 8x 1GbE copper
Optional Expansion | LAN module extensions to 100GbE
Power Supply | Redundant PSU (2x n+1 configuration)
Operating Temp | 0°C to 55°C
Storage | Up to 4x 2.5" NVMe/SSD internal
/endtable
Key Takeaways
1. Hardware-Accelerated AI: Intel AMX integration eliminates software-only inference bottlenecks, delivering 3-8x latency improvement over standard processors.
2. Cryptography Without Penalty: QAT Gen 5 offload preserves CPU capacity for AI inference, enabling simultaneous high-throughput encryption and analytics.
3. Flexible Networking: Eight 25GbE fiber ports and eight 1GbE copper ports support heterogeneous deployments; optional LAN modules scale to 100GbE without platform replacement.
4. Enterprise-Grade Density: 36 cores in 1U enables consolidation of multiple inference workloads, reducing data center footprint and operational overhead.
Contact NEXCOM
For specifications, availability, and technical inquiries, contact NEXCOM via the official website.
Source: https://www.nexcom.com/news/Detail/achieve-faster-ai-insights-with-nexcom-fta-5190-and-xeon-6-soc
