
Hands-On LLM Serving and Optimization (Hosting LLMs at Scale)

List Price: $79.99
SKU: 9798341621497
Minimum Purchase: 25 unit(s)
Expected release date is Jun 9th 2026
  • Availability: Confirm prior to ordering
  • Branding: minimum 50 pieces (additional costs below)
  • Check Freight Rates (branded products only)

Branding Options, Availability & Lead Times

  • 1-Color Imprint: $2.00 ea.
  • Promo-Page Insert: $2.50 ea. (full-color printed, single-sided page)
  • Belly-Band Wrap: $2.50 ea. (full-color printed)
  • Set-Up Charge: $45 per decoration
FULL DETAILS
  • Availability: Product availability changes daily, so please confirm your quantity is available prior to placing an order.
  • Branded Products: allow 10 business days from proof approval for production. Branding options may be limited or unavailable based on product design or cover artwork.
  • Unbranded Products: allow 3-5 business days for shipping. All Unbranded items receive FREE ground shipping in the US. Inquire for international shipping.
  • RETURNS/CANCELLATIONS: All orders, branded or unbranded, are NON-CANCELLABLE and NON-RETURNABLE once a purchase order has been received.
  • Product Details

    Author: Chi Wang, Peiheng Hu
    Format: Paperback
    Pages: 372
    Publisher: O'Reilly Media (June 9, 2026)
    Imprint: O'Reilly Media
    Release Date: June 9, 2026
    Language: English
    ISBN-13: 9798341621497
    Weight: 16 oz
    Dimensions: 7" x 9.19"
    List Price: $79.99
    Country of Origin: United States
    Pub Discount: 60
    Case Pack: 20
    As low as: $68.79
    Publisher Identifier: P-PER
    Discount Code: C
  • Overview

    Large language models (LLMs) are rapidly becoming the backbone of AI-driven applications. Without proper optimization, however, LLMs can be expensive to run, slow to serve, and prone to performance bottlenecks. As the demand for real-time AI applications grows, Hands-On LLM Serving and Optimization offers a comprehensive guide to the complexities of deploying and optimizing LLMs at scale.

    In this hands-on book, authors Chi Wang and Peiheng Hu take a real-world approach backed by practical examples and code, and assemble essential strategies for designing robust infrastructures that are equal to the demands of modern AI applications. Whether you're building high-performance AI systems or looking to enhance your knowledge of LLM optimization, this indispensable book will serve as a pillar of your success.

    • Learn the key principles for designing a model-serving system tailored to popular business scenarios
    • Understand the common challenges of hosting LLMs at scale while minimizing costs
    • Pick up practical techniques for optimizing LLM serving performance
    • Build a model-serving system that meets specific business requirements
    • Improve LLM serving throughput and reduce latency
    • Host LLMs in a cost-effective manner, balancing performance and resource efficiency