Eric Liang

@ Anyscale

Known information

  • Co-authored a blog post on continuous batching for LLM inference.
  • Discussed benchmark results for existing batching systems such as Hugging Face's text-generation-inference and vLLM.
  • Worked on optimizing LLM inference throughput and reducing p50 latency.
  • Involved in research and development of continuous batching, also known as dynamic batching or batching with iteration-level scheduling.
  • Contributed to the development of memory optimizations specific to continuous batching, as implemented in vLLM.
  • Participated in benchmarking experiments to compare static and continuous batching frameworks.
  • Collaborated with other researchers and engineers on projects related to LLM inference and optimization.
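The batching-with-iteration-level-scheduling idea mentioned in the bullets above can be illustrated with a toy step-count simulation. This is a hedged sketch, not code from the blog post or from vLLM: the function names, request lengths, and batch size are all illustrative assumptions. It only shows why refilling a freed batch slot every iteration can finish a workload in fewer GPU steps than static batching, where the whole batch waits for its longest request.

```python
# Toy simulation of static vs. continuous batching (illustrative only;
# real systems like vLLM also manage KV-cache memory, which is not modeled here).

def static_batching_steps(lengths, batch_size):
    """Static batching: each batch holds the GPU until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps

def continuous_batching_steps(lengths, batch_size):
    """Iteration-level scheduling: a finished request's slot is refilled
    before the next decode iteration, so short requests don't wait on long ones."""
    pending = list(lengths)
    running = []
    steps = 0
    while pending or running:
        # Admit waiting requests into any free batch slots.
        while pending and len(running) < batch_size:
            running.append(pending.pop(0))
        steps += 1  # one decode iteration for the whole batch
        # Each request emits one token; requests at their last token exit.
        running = [r - 1 for r in running if r > 1]
    return steps

# One long request mixed with several short ones (hypothetical token counts).
lengths = [10, 2, 2, 2, 2, 2, 2, 2]
print(static_batching_steps(lengths, batch_size=4))      # → 12
print(continuous_batching_steps(lengths, batch_size=4))  # → 10
```

With static batching the second batch of short requests cannot start until the long request drains the first batch; with iteration-level scheduling the short requests slot in behind the finished ones, which is the throughput effect the benchmarking work above measures.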

About Anyscale

Anyscale offers a platform for scaling AI workloads, featuring products like the Anyscale Platform and Ray Open Source, and provides resources and events such as the Ray Summit.
