Singapore Institute of Technology
A Cache-Aware Offloading Strategy for Timely Generative AI Services in IIoT Networks

conference contribution
posted on 2025-07-04, 05:28 authored by Tan Zheng Hui Ernest, A. S. Madhukumar

In this paper, the inference freshness of generative artificial intelligence (gen-AI) services in industrial Internet-of-Things (IIoT) networks is investigated. A freshness metric termed the peak age of inference (PAoIF) is proposed to quantify inference freshness by accounting for the peak age of information and the delays due to transmitting inference requests and results. A cache-aware offloading (CAO) strategy that employs multi-access edge computing in IIoT networks is also proposed for timely inference delivery. Leveraging novel closed-form expressions for the PAoIF violation probability under the proposed CAO and benchmark strategies for IIoT, this study analyzes the impact of the PAoIF violation age, the gen-AI service request rate, and the average transmission rate on inference freshness. We identify scenarios where the proposed CAO strategy exhibits a lower PAoIF violation probability than the benchmark strategies under consideration. Furthermore, we show that the PAoIF violation probability under the proposed CAO strategy is minimized by optimizing the average transmission rate in IIoT networks employing servers with limited computing resources. The analysis therefore shows that the proposed CAO strategy is a viable technique for enabling inference freshness for gen-AI services in IIoT networks.

History

Journal/Conference/Book title

IEEE VTC2025-Spring

Publication date

2025-06-18
