
Conversation


@ma-hang ma-hang commented Jan 14, 2026

9G7B, single GPU, max_batch_size=32:
[screenshot attachment e73679451a5e665aace0317ed4e6546f]

9G7B, 4 GPUs, max_batch_size=32:
[screenshot attachment d77ca5b3b22ccccf91360ce278067378]

@ma-hang ma-hang requested review from a team, PanZezhong1725 and whjthu January 14, 2026 15:58
@ma-hang ma-hang linked an issue Jan 15, 2026 that may be closed by this pull request
```python
def start(self):
    app = self._create_app()
    logger.info("Starting API Server...")
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
Collaborator commented:

Make the port a script argument, with a default of 8000.
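A minimal sketch of that suggestion, assuming argparse for the CLI; `Server` and its methods are stand-ins for the PR's actual class, whose full definition is not shown in this thread:

```python
import argparse
import logging

import uvicorn
from fastapi import FastAPI

logger = logging.getLogger(__name__)

class Server:
    """Stand-in for the PR's server class; only the port wiring is the point here."""

    def _create_app(self) -> FastAPI:
        # Placeholder app; the real method builds the full API with a lifespan hook.
        return FastAPI()

    def start(self, port: int = 8000):
        app = self._create_app()
        logger.info("Starting API Server on port %d...", port)
        uvicorn.run(app, host="0.0.0.0", port=port)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="InfiniLM API server")
    parser.add_argument("--port", type=int, default=8000,
                        help="port to listen on (default: 8000)")
    args = parser.parse_args()
    Server().start(port=args.port)
```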


```python
def _create_app(self):
    @asynccontextmanager
    async def lifespan(app: FastAPI):
        ...
```
@PanZezhong1725 (Collaborator) commented Jan 15, 2026:

Please wrap this logic in the style of vllm.LLM or sglang.Engine. The server script should, as far as possible, only pass requests through; business-logic code like this should not appear in it.
The wrapper should provide two usage modes (see the sketch after this list):

  1. Network serving: asynchronous streaming output, i.e., how the server script uses it.
  2. Standalone use: a batch-generate method where the user can pass in multiple requests at once and get all results back. The current use case for this is running the C-Eval test in batches.

Please rename the current core/ directory to match this wrapper, and put the wrapper code in it as well.
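A minimal sketch of what such a wrapper could look like; every name here (`Engine`, `generate_stream`, `batch_generate`, `_decode`) is hypothetical and only illustrates the two requested usage modes, not the PR's actual API:

```python
import asyncio
from typing import AsyncIterator

class Engine:
    """Hypothetical wrapper in the spirit of vllm.LLM / sglang.Engine."""

    def __init__(self, model_path: str, max_batch_size: int = 32):
        self.model_path = model_path
        self.max_batch_size = max_batch_size
        # The real engine would load the model and start a batching scheduler here.

    def _decode(self, prompt: str):
        # Placeholder decode loop standing in for real batched inference.
        yield from prompt.split()

    async def generate_stream(self, prompt: str) -> AsyncIterator[str]:
        # Usage 1: network serving -- the server script just forwards the
        # request and streams tokens back, with no business logic of its own.
        for token in self._decode(prompt):
            yield token

    def batch_generate(self, prompts: list[str]) -> list[str]:
        # Usage 2: standalone batch mode, e.g. running C-Eval in batches.
        async def collect(p: str) -> str:
            return " ".join([tok async for tok in self.generate_stream(p)])

        async def run_all() -> list[str]:
            return list(await asyncio.gather(*(collect(p) for p in prompts)))

        return asyncio.run(run_all())
```

With this split, a server endpoint only awaits `generate_stream`, while the C-Eval harness calls `batch_generate(prompts)` once and gets all completions back.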



Development

Successfully merging this pull request may close these issues.

[DEV] Add an inference service to InfiniLM
