Hands-On LLM Serving and Optimization (Hosting LLMs at Scale)

Availability: Confirm prior to ordering
Branding: minimum 50 pieces (add’l costs below)
Check Freight Rates (branded products only)

Branding Options, Availability & Lead Times

1-Color Imprint: $2.00 ea.
Promo-Page Insert: $2.50 ea. (full-color printed, single-sided page)
Belly-Band Wrap: $2.50 ea. (full-color printed)
Set-Up Charge: $45 per decoration

FULL DETAILS

Availability: Product availability changes daily, so please confirm your quantity is available prior to placing an order.
Branded Products: allow 10 business days from proof approval for production. Branding options may be limited or unavailable based on product design or cover artwork.
Unbranded Products: allow 3-5 business days for shipping. All Unbranded items receive FREE ground shipping in the US. Inquire for international shipping.
RETURNS/CANCELLATIONS: All orders, branded or unbranded, are NON-CANCELLABLE and NON-RETURNABLE once a purchase order has been received.

Product Details

Author:

Chi Wang, Peiheng Hu

Format:

Paperback

Pages:

372

Publisher:

O'Reilly Media (June 9, 2026)

Imprint:

O'Reilly Media

Release Date:

June 9, 2026

Language:

English

ISBN-13:

9798341621497

Weight:

16 oz

Dimensions:

7" x 9.19"

List Price:

$79.99

Country of Origin:

United States

Pub Discount:

60

Case Pack:

20

As low as:

$68.79

Publisher Identifier:

P-PER

Discount Code:

C

Overview

Large language models (LLMs) are rapidly becoming the backbone of AI-driven applications. Without proper optimization, however, LLMs can be expensive to run, slow to serve, and prone to performance bottlenecks. As demand for real-time AI applications grows, Hands-On LLM Serving and Optimization offers a comprehensive guide to the complexities of deploying and optimizing LLMs at scale.

In this hands-on book, authors Chi Wang and Peiheng Hu take a real-world approach backed by practical examples and code, assembling essential strategies for designing robust infrastructure that can meet the demands of modern AI applications. Whether you're building high-performance AI systems or looking to deepen your knowledge of LLM optimization, this book gives you a practical foundation to build on.

Learn the key principles for designing a model-serving system tailored to popular business scenarios
Understand the common challenges of hosting LLMs at scale while minimizing costs
Pick up practical techniques for optimizing LLM serving performance
Build a model-serving system that meets specific business requirements
Improve LLM serving throughput and reduce latency
Host LLMs in a cost-effective manner, balancing performance and resource efficiency


The Book Company
