A Comprehensive Analysis of Modern LLM Inference Optimization Techniques: From Model Compression to System-Level Acceleration