Modules

`lazyllm.module.ModuleBase`

Bases: SessionConfigableBase

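ModuleBase is the core base class of LazyLLM. It defines the unified interface and basic capabilities shared by all modules.
It abstracts a module's training, deployment, inference, and evaluation logic, and provides submodule management, hook registration, parameter passing, and recursive updating.
User-defined modules must inherit from ModuleBase and implement the forward method to define their inference logic.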

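Features

Manages submodules uniformly and automatically tracks the ModuleBase instances held as attributes.
Supports hyperparameters of type Option, which makes grid search and automatic tuning convenient.
Provides a hook mechanism for running custom logic before and after a call.
Wraps the update flow for training (train), service deployment (server), and evaluation (eval).
Supports loading an evalset and running inference over it in parallel for evaluation.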

Parameters:

return_trace (bool, default: False ) –

Whether to write inference results to the trace queue for debugging and tracing. Defaults to False.

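Use cases

When you need to combine some or all of training, deployment, inference, and evaluation, for example an Embedding model that needs both training and inference.
When you want to call start, update, eval, and similar methods on a root module to manage the submodules it holds recursively.
When you want user parameters to be passed automatically from an outer module to its inner implementation (see WebModule).
When you want a custom module to support parameter grid search (see TrialModule).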

Examples:

>>> import lazyllm
>>> class Module(lazyllm.module.ModuleBase):
...     pass
... 
>>> class Module2(lazyllm.module.ModuleBase):
...     def __init__(self):
...         super(__class__, self).__init__()
...         self.m = Module()
... 
>>> m = Module2()
>>> m.submodules
[<Module type=Module>]
>>> m.m3 = Module()
>>> m.submodules
[<Module type=Module>, <Module type=Module>]
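The submodule tracking shown in the example above can be pictured with a small pure-Python sketch. `TrackedBase`, `Leaf`, and `Root` are hypothetical stand-ins written for illustration, not LazyLLM classes; the sketch only assumes that tracking happens when an attribute is assigned.

```python
class TrackedBase:
    """Hypothetical stand-in for ModuleBase; not LazyLLM's actual class."""

    def __init__(self):
        # Bypass our own __setattr__ so the list exists before any tracking.
        object.__setattr__(self, "_submodules", [])

    def __setattr__(self, name, value):
        # Record every assigned value that is itself a TrackedBase.
        if isinstance(value, TrackedBase):
            self._submodules.append(value)
        object.__setattr__(self, name, value)

    @property
    def submodules(self):
        return self._submodules


class Leaf(TrackedBase):
    pass


class Root(TrackedBase):
    def __init__(self):
        super().__init__()
        self.m = Leaf()  # tracked during construction


root = Root()
root.m3 = Leaf()  # tracked on later assignment as well
count = len(root.submodules)
```

Because tracking is tied to assignment, the same instance assigned under two attribute names would appear twice in this sketch.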

Source code in lazyllm/module/module.py

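class ModuleBase(SessionConfigableBase, metaclass=_MetaBind):
    """ModuleBase is the core base class of LazyLLM. It defines the unified interface and basic capabilities shared by all modules.
It abstracts a module's training, deployment, inference, and evaluation logic, and provides submodule management, hook registration, parameter passing, and recursive updating.
User-defined modules must inherit from ModuleBase and implement the ``forward`` method to define their inference logic.

Features:
    - Manages submodules uniformly and automatically tracks the ModuleBase instances held as attributes.
    - Supports hyperparameters of type Option, which makes grid search and automatic tuning convenient.
    - Provides a hook mechanism for running custom logic before and after a call.
    - Wraps the update flow for training (train), service deployment (server), and evaluation (eval).
    - Supports loading an evalset and running inference over it in parallel for evaluation.

Args:
    return_trace (bool): Whether to write inference results to the trace queue for debugging and tracing. Defaults to ``False``.

Use cases:
    1. When you need to combine some or all of training, deployment, inference, and evaluation, for example an Embedding model that needs both training and inference.
    2. When you want to call ``start``, ``update``, ``eval``, and similar methods on a root module to manage the submodules it holds recursively.
    3. When you want user parameters to be passed automatically from an outer module to its inner implementation (see WebModule).
    4. When you want a custom module to support parameter grid search (see TrialModule).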


Examples:
    >>> import lazyllm
    >>> class Module(lazyllm.module.ModuleBase):
    ...     pass
    ... 
    >>> class Module2(lazyllm.module.ModuleBase):
    ...     def __init__(self):
    ...         super(__class__, self).__init__()
    ...         self.m = Module()
    ... 
    >>> m = Module2()
    >>> m.submodules
    [<Module type=Module>]
    >>> m.m3 = Module()
    >>> m.submodules
    [<Module type=Module>, <Module type=Module>]
    """
    builder_keys = []  # keys in builder support Option by default

    def __new__(cls, *args, **kw):
        sig = inspect.signature(cls.__init__)
        paras = sig.parameters
        values = list(paras.values())[1:]  # paras.value()[0] is self
        for i, p in enumerate(args):
            if isinstance(p, Option):
                ann = values[i].annotation
                assert ann == Option or (isinstance(ann, (tuple, list)) and Option in ann), \
                    f'{values[i].name} cannot accept Option'
        for k, v in kw.items():
            if isinstance(v, Option):
                ann = paras[k].annotation
                assert ann == Option or (isinstance(ann, (tuple, list)) and Option in ann), \
                    f'{k} cannot accept Option'
        return object.__new__(cls)

    def __init__(self, *, return_trace=False, id: Optional[str] = None, name: Optional[str] = None,
                 group_id: Optional[str] = None):
        super().__init__(id=id, name=name, group_id=group_id)
        self._submodules = []
        self._evalset = None
        self._return_trace = return_trace
        self.mode_list = ('train', 'server', 'eval')
        self._used_by_moduleid = None
        self._options = []
        self.eval_result = None
        self._use_cache: Union[bool, str] = False
        self._hooks = []
        register_hooks(self, resolve_builtin_hooks(self))

    def __setattr__(self, name: str, value):
        if isinstance(value, ModuleBase):
            self._submodules.append(value)
        elif isinstance(value, Option):
            self._options.append(value)
        elif name.endswith('_args') and isinstance(value, dict):
            for v in value.values():
                if isinstance(v, Option):
                    self._options.append(v)
        return super().__setattr__(name, value)

    def __getattr__(self, key):
        def _setattr(v, *, _return_value=self, **kw):
            k = key[:-7] if key.endswith('_method') else key
            if isinstance(v, tuple) and len(v) == 2 and isinstance(v[1], dict):
                kw.update(v[1])
                v = v[0]
            if len(kw) > 0:
                setattr(self, f'_{k}_args', kw)
            setattr(self, f'_{k}', v)
            if hasattr(self, f'_{k}_setter_hook'): getattr(self, f'_{k}_setter_hook')()
            return _return_value
        keys = self.__class__.builder_keys
        if key in keys:
            return _setattr
        elif key.startswith('_') and key[1:] in keys:
            return None
        elif key.startswith('_') and key.endswith('_args') and (key[1:-5] in keys or f'{key[1:-4]}method' in keys):
            return dict()
        raise AttributeError(f'{self.__class__} object has no attribute {key}')

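    def __call__(self, *args, **kw):
        """
Call the module as a functor. Per-module global parameters, files, and chat history are merged into the keyword arguments, then ``forward`` runs wrapped by the registered hooks.

Args:
    *args: Positional arguments passed on to ``forward``.
    **kw: Keyword arguments passed on to ``forward``.

**Returns:**

- The result of ``forward``, or a cached result when caching is enabled. When ``return_trace`` is True, the result is also written to the trace queue.
"""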
        kw.update(locals['global_parameters'].get(self._module_id, dict()))
        if (files := locals['lazyllm_files'].get(self._module_id)) is not None: kw['lazyllm_files'] = files
        if (history := locals['chat_history'].get(self._module_id)) is not None: kw['llm_chat_history'] = history

        if args and isinstance(args[0], kwargs): args, kw = [], {**args[0], **kw}

        r = execution_with_hooks(
            self, *args, map_exception=lambda e: _change_exception_type(e, ModuleExecutionError), **kw,
        )(self._call_impl)(*args, **kw)

        if self._return_trace:
            lazyllm.FileSystemQueue.get_instance('lazy_trace').enqueue(str(r))
        self._clear_usage()
        return r

    def _call_impl(self, *args, **kw):
        if self._use_cache and 'R' in lazyllm.config['cache_mode']:
            try:
                return module_cache.get(self.__cache_hash__, args, kw)
            except CacheNotFoundError:
                self._cache_miss_handler()
        with globals.stack_enter(self.identities):
            r = self.forward(**args[0], **kw) if args and isinstance(args[0], kwargs) else self.forward(*args, **kw)
        if self._use_cache and 'W' in lazyllm.config['cache_mode']:
            module_cache.set(self.__cache_hash__, args, kw, r)
        return r

    def _stream_output(self, text: str, color: Optional[str] = None, *, cls: Optional[str] = 'text'):
        payload = {'tag': cls, 'delta': colored_text(text, color)}
        FileSystemQueue().enqueue(json.dumps(payload))
        return ''

    @contextmanager
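    def stream_output(self, stream_output: Optional[Union[bool, Dict]] = None):
        """Context manager for streaming output during inference or execution.
When ``stream_output`` is a dict, it can specify a prefix and a suffix for the output, along with their colors.

Args:
    stream_output (Optional[Union[bool, Dict]]): Streaming output configuration.

        - If it is the boolean True, default streaming output is enabled.
        - If it is a dict, it may contain the following keys:

            - 'prefix' (str): Prefix text for the output.
            - 'prefix_color' (str, optional): Color of the prefix.
            - 'suffix' (str): Suffix text for the output.
            - 'suffix_color' (str, optional): Color of the suffix.
"""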
        if stream_output and isinstance(stream_output, dict) and (prefix := stream_output.get('prefix')):
            self._stream_output(prefix, stream_output.get('prefix_color'))
        yield
        if isinstance(stream_output, dict) and (suffix := stream_output.get('suffix')):
            self._stream_output(suffix, stream_output.get('suffix_color'))

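    def used_by(self, module_id):
        """Record which module uses the current module, marking the call relationship between modules.
Returns the module itself, so calls can be chained.

Args:
    module_id (str): Unique ID of the upper-level module that calls this module.

**Returns:**

- ModuleBase: The module itself, for chained calls.
"""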
        self._used_by_moduleid = module_id
        return self

    def _clear_usage(self):
        globals['usage'].pop(self._module_id, None)

    # interfaces
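    def forward(self, *args, **kw):
        """Forward computation interface; subclasses must implement it.
This method defines how the module takes input and returns output, and is the core function that makes the module work as a functor.

Args:
    *args: Variable positional arguments; subclasses define the inputs they need.
    **kw: Variable keyword arguments; subclasses define the inputs they need.
"""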
        raise NotImplementedError

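    def register_hook(self, hook_type: Union[LazyLLMHook, Callable]):
        """Register a hook that runs specific logic when the module is called.
A hook is a ``LazyLLMHook`` instance, or a plain callable that is wrapped in ``LazyLLMFuncHook``. Hooks can add custom operations before and after the forward computation, such as logging or statistics.

Args:
    hook_type (Union[LazyLLMHook, Callable]): The hook to register.
"""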
        if not isinstance(hook_type, type) and not isinstance(hook_type, LazyLLMHook) and callable(hook_type):
            hook_type = LazyLLMFuncHook(hook_type)
        if not isinstance(hook_type, LazyLLMHook):
            raise TypeError(f'Invalid hook type: {type(hook_type)}, '
                            'must be subclass or instance of LazyLLMHook, or callable function')
        if hook_type not in self._hooks:
            self._hooks.append(hook_type)

    def unregister_hook(self, hook_type: LazyLLMHook):
        """Unregister a registered hook.
If the hook exists in the module, it is removed and no longer runs when the module is called.

Args:
    hook_type (LazyLLMHook): The hook to unregister.
"""
        if hook_type in self._hooks:
            self._hooks.remove(hook_type)

    def clear_hooks(self):
        """Remove all hooks registered on the module.
After this call, the module no longer runs any hook logic.
"""
        self._hooks = []

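    def _get_train_tasks(self):
        """Define the training task. This function returns the training pipeline; subclasses that override it can be trained or fine-tuned during the update stage.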


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def _get_train_tasks(self):
    ...         return lazyllm.pipeline(lambda : 1, lambda x: print(x))
    ... 
    >>> MyModule().update()
    1
    """
        return None
    def _get_deploy_tasks(self):
        """Define the deployment task. This function returns the deployment pipeline; subclasses that override it can be deployed during the update/start stage.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def _get_deploy_tasks(self):
    ...         return lazyllm.pipeline(lambda : 1, lambda x: print(x))
    ... 
    >>> MyModule().start()
    1
    """
        return None
    def _get_post_process_tasks(self): return None

    def _set_mid(self, mid=None):
        self._config_id = mid if mid else str(uuid.uuid4().hex)
        return self

    @property
    def _module_id(self):
        return self._config_id

    @property
    def submodules(self):
        return self._submodules

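    def evalset(self, evalset, load_f=None, collect_f=lambda x: x):
        """Set the evaluation set for the module.
When ``update`` or ``eval`` is called, the module runs inference over the evaluation set and stores the results in ``eval_result``.

Args:
    evalset (Union[list, str]): A list of evaluation data, or the path to an evaluation data file.
    load_f (Optional[Callable]): When ``evalset`` is a file path, the function that takes the opened file object and returns a list. Defaults to None.
    collect_f (Callable): Function that post-processes the evaluation results. Defaults to ``lambda x: x``.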


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        if isinstance(evalset, str) and os.path.exists(evalset):
            with open(evalset) as f:
                assert callable(load_f)
                self._evalset = load_f(f)
        else:
            self._evalset = evalset
        self.eval_result_collet_f = collect_f

    # TODO: add lazyllm.eval
    def _get_eval_tasks(self):
        def set_result(x): self.eval_result = x

        def parallel_infer():
            with ThreadPoolExecutor(max_workers=200) as executor:
                results = list(executor.map(lambda item: self(**item)
                                            if isinstance(item, dict) else self(item), self._evalset))
            return results
        if self._evalset:
            return Pipeline(parallel_infer,
                            lambda x: self.eval_result_collet_f(x),
                            set_result)
        return None

    # update module(train or finetune),
    def _update(self, *, mode: Optional[Union[str, List[str]]] = None, recursive: bool = True):  # noqa C901
        if not mode: mode = list(self.mode_list)
        if type(mode) is not list: mode = [mode]
        for item in mode:
            assert item in self.mode_list, f'Cannot find {item} in mode list: {self.mode_list}'
        # dfs to get all train tasks
        train_tasks, deploy_tasks, eval_tasks, post_process_tasks = FlatList(), FlatList(), FlatList(), FlatList()
        stack, visited = [(self, iter(self.submodules if recursive else []))], set()
        while len(stack) > 0:
            try:
                top = next(stack[-1][1])
                stack.append((top, iter(top.submodules)))
            except StopIteration:
                top = stack.pop()[0]
                if top._module_id in visited: continue
                visited.add(top._module_id)
                if 'train' in mode: train_tasks.absorb(top._get_train_tasks())
                if 'server' in mode: deploy_tasks.absorb(top._get_deploy_tasks())
                if 'eval' in mode: eval_tasks.absorb(top._get_eval_tasks())
                post_process_tasks.absorb(top._get_post_process_tasks())

        if 'train' in mode and len(train_tasks) > 0:
            Parallel(*train_tasks).set_sync(True)()
        if 'server' in mode and len(deploy_tasks) > 0:
            if redis_client:
                Parallel(*deploy_tasks).set_sync(False)()
            else:
                Parallel.sequential(*deploy_tasks)()
        if 'eval' in mode and len(eval_tasks) > 0:
            Parallel.sequential(*eval_tasks)()
        Parallel.sequential(*post_process_tasks)()
        return self

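    def update(self, *, recursive: bool = True):
        """Update the module and all of its submodules. A module is trained when it overrides ``_get_train_tasks``. After training, deployment and evaluation run automatically.

Args:
    recursive (bool): Whether to update all submodules recursively. Defaults to True.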


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        return self._update(mode=['train', 'server', 'eval'], recursive=recursive)

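    def update_server(self, *, recursive: bool = True):
        """Update the deployment (server) part of the module and its submodules. Modules or submodules that implement deployment have their services started.

Args:
    recursive (bool): Whether to update the deployment tasks of all submodules recursively. Defaults to True.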
"""
        return self._update(mode=['server'], recursive=recursive)
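    def eval(self, *, recursive: bool = True):
        """Evaluate the module and all of its submodules. This takes effect only after an evaluation set has been set through ``evalset``.

Args:
    recursive (bool): Whether to evaluate all submodules recursively. Defaults to True.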


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def forward(self, input):
    ...         return f'reply for input'
    ... 
    >>> m = MyModule()
    >>> m.evalset([1, 2, 3])
    >>> m.eval().eval_result
    ['reply for input', 'reply for input', 'reply for input']
    """
        return self._update(mode=['eval'], recursive=recursive)
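    def start(self):
        """Start the deployment services of the module and all of its submodules. This ensures the server tasks of the module and its submodules are executed, and suits both initial startup and restarting services.

**Returns:**

- ModuleBase: The instance itself, to support chained calls.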


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.start()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self._update(mode=['server'], recursive=True)
    def restart(self):
        """Restart the deployment services of the module and its submodules. Internally this calls ``start``.

**Returns:**

- ModuleBase: The instance itself, to support chained calls.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.restart()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self.start()

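    def wait(self):
        """Wait for the module or its submodules to finish executing. The current implementation is empty; subclasses can implement it according to their deployment logic.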
"""
        pass

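    def stop(self):
        """Stop the module and all of its submodules. This recursively calls ``stop`` on each submodule, and is suited to releasing resources or shutting down services.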
"""
        for m in self.submodules:
            m.stop()

    @property
    def options(self):
        options = self._options.copy()
        for m in self.submodules:
            options += m.options
        return options

    def _overwrote(self, f):
        return getattr(self.__class__, f) is not getattr(__class__, f)

    def __repr__(self):
        return lazyllm.make_repr('Module', self.__class__, name=self.name)

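    def for_each(self, filter, action):
        """Apply an operation to all submodules of the module. All submodules are traversed recursively, and ``action`` runs on each submodule that satisfies ``filter``.

Args:
    filter (Callable): Function that takes a submodule and returns a boolean indicating whether to run the action.
    action (Callable): Function applied to each submodule that satisfies the filter.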
"""
        for submodule in self.submodules:
            if filter(submodule):
                action(submodule)
            submodule.for_each(filter, action)

    @property
    def __cache_hash__(self):
        cache_hash = self.__class__.__name__
        if isinstance(self._use_cache, str): cache_hash += f'@{self._use_cache}'
        if hasattr(self, 'appendix_hash_key'): cache_hash += f'@{self.appendix_hash_key}'
        return cache_hash

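    def use_cache(self, flag: Union[bool, str] = True):
        """Enable or disable caching for the module.

Controls whether the module uses a cache to store and retrieve execution results, which improves performance and avoids repeated computation.

Args:
    flag (bool or str, optional): Cache control flag. True enables the cache; False disables it;
                                 a string is used as a specific cache identifier. Defaults to True.

**Returns:**

- The module instance itself, to support chained calls.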

"""
        self._use_cache = flag or False
        return self

    def _cache_miss_handler(self): pass

`eval(*, recursive=True)`

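Evaluate the module and all of its submodules. This takes effect only after an evaluation set has been set through evalset.

Parameters:

recursive (bool, default: True ) –

Whether to evaluate all submodules recursively. Defaults to True.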

Examples:

>>> import lazyllm
>>> class MyModule(lazyllm.module.ModuleBase):
...     def forward(self, input):
...         return f'reply for input'
... 
>>> m = MyModule()
>>> m.evalset([1, 2, 3])
>>> m.eval().eval_result
['reply for input', 'reply for input', 'reply for input']
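The source listing for this class shows evaluation running the module over every evalset item in a thread pool, expanding dict items into keyword arguments. A minimal sketch of that pattern follows; `run_eval` and `echo` are hypothetical names, and the small `max_workers` value is an arbitrary choice for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_eval(module, evalset, collect_f=lambda x: x, max_workers=8):
    """Hypothetical helper mirroring the eval task; not a LazyLLM API."""

    def infer(item):
        # Dict items become keyword arguments, anything else is positional.
        return module(**item) if isinstance(item, dict) else module(item)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the order of the input items.
        results = list(executor.map(infer, evalset))
    return collect_f(results)


def echo(query, suffix="!"):
    return f"reply for {query}{suffix}"


plain = run_eval(echo, [1, 2])
mixed = run_eval(echo, [{"query": "a", "suffix": "?"}, "b"], collect_f=len)
```

Since the calls run concurrently, a `forward` implementation evaluated this way should be safe to call from multiple threads.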

Source code in lazyllm/module/module.py

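    def eval(self, *, recursive: bool = True):
        """Evaluate the module and all of its submodules. This takes effect only after an evaluation set has been set through ``evalset``.

Args:
    recursive (bool): Whether to evaluate all submodules recursively. Defaults to True.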


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def forward(self, input):
    ...         return f'reply for input'
    ... 
    >>> m = MyModule()
    >>> m.evalset([1, 2, 3])
    >>> m.eval().eval_result
    ['reply for input', 'reply for input', 'reply for input']
    """
        return self._update(mode=['eval'], recursive=recursive)

`evalset(evalset, load_f=None, collect_f=lambda x: x)`

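Set the evaluation set for the module.
When update or eval is called, the module runs inference over the evaluation set and stores the results in eval_result.

Parameters:

evalset (Union[list, str]) –

A list of evaluation data, or the path to an evaluation data file.
load_f (Optional[Callable], default: None ) –

When evalset is a file path, the function that takes the opened file object and returns a list. Defaults to None.
collect_f (Callable, default: lambda x: x ) –

Function that post-processes the evaluation results. Defaults to lambda x: x.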

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> print(m.eval_result)
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
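The example above passes a list directly. The file-path branch can be sketched as follows, assuming the loader receives the opened file object as in the source listing. `resolve_evalset` is a hypothetical helper written for illustration, and `json.load` is just one possible `load_f`.

```python
import json
import os
import tempfile


def resolve_evalset(evalset, load_f=None):
    """Mimic the path-or-list branch; hypothetical helper, not a LazyLLM API."""
    if isinstance(evalset, str) and os.path.exists(evalset):
        # An existing file path needs a loader that accepts the open file.
        assert callable(load_f), "load_f is required when evalset is a file path"
        with open(evalset) as f:
            return load_f(f)
    # Anything else (typically a list) is used as-is.
    return evalset


# Write a throwaway JSON file to stand in for an evaluation data file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump([1, 2, 3], tmp)

from_file = resolve_evalset(tmp.name, load_f=json.load)
from_list = resolve_evalset(["a", "b"])
os.remove(tmp.name)
```

Note that a string that is not an existing path falls through to the second branch and is stored unchanged.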

Source code in lazyllm/module/module.py

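    def evalset(self, evalset, load_f=None, collect_f=lambda x: x):
        """Set the evaluation set for the module.
When ``update`` or ``eval`` is called, the module runs inference over the evaluation set and stores the results in ``eval_result``.

Args:
    evalset (Union[list, str]): A list of evaluation data, or the path to an evaluation data file.
    load_f (Optional[Callable]): When ``evalset`` is a file path, the function that takes the opened file object and returns a list. Defaults to None.
    collect_f (Callable): Function that post-processes the evaluation results. Defaults to ``lambda x: x``.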


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        if isinstance(evalset, str) and os.path.exists(evalset):
            with open(evalset) as f:
                assert callable(load_f)
                self._evalset = load_f(f)
        else:
            self._evalset = evalset
        self.eval_result_collet_f = collect_f

`forward(*args, **kw)`

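Forward computation interface; subclasses must implement it.
This method defines how the module takes input and returns output, and is the core function that makes the module work as a functor.

Parameters:

*args –

Variable positional arguments; subclasses define the inputs they need.
**kw –

Variable keyword arguments; subclasses define the inputs they need.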
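The functor idea can be sketched without LazyLLM: calling an instance delegates to `forward`, and the base class raises until a subclass provides one. `CallableBase` and `Upper` are hypothetical stand-ins, and the real `__call__` does more than this (hooks, caching, tracing).

```python
class CallableBase:
    """Hypothetical stand-in for ModuleBase's functor behaviour."""

    def __call__(self, *args, **kw):
        # Calling the instance delegates to forward.
        return self.forward(*args, **kw)

    def forward(self, *args, **kw):
        raise NotImplementedError


class Upper(CallableBase):
    def forward(self, text, *, suffix=""):
        # The subclass decides which inputs it accepts.
        return text.upper() + suffix


result = Upper()("hello", suffix="!")

try:
    CallableBase()("x")
    base_raises = False
except NotImplementedError:
    base_raises = True
```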

Source code in lazyllm/module/module.py

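    def forward(self, *args, **kw):
        """Forward computation interface; subclasses must implement it.
This method defines how the module takes input and returns output, and is the core function that makes the module work as a functor.

Args:
    *args: Variable positional arguments; subclasses define the inputs they need.
    **kw: Variable keyword arguments; subclasses define the inputs they need.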
"""
        raise NotImplementedError

`start()`

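Start the deployment services of the module and all of its submodules. This ensures the server tasks of the module and its submodules are executed, and suits both initial startup and restarting services.

Returns:

ModuleBase: The instance itself, to support chained calls.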

Examples:

>>> import lazyllm
>>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
>>> m.start()
<Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"

Source code in lazyllm/module/module.py

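    def start(self):
        """Start the deployment services of the module and all of its submodules. This ensures the server tasks of the module and its submodules are executed, and suits both initial startup and restarting services.

**Returns:**

- ModuleBase: The instance itself, to support chained calls.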


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.start()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self._update(mode=['server'], recursive=True)

`restart()`

Restart the deployment services of the module and its submodules. Internally this calls start.

Returns:

ModuleBase: The instance itself, to support chained calls.

Examples:

>>> import lazyllm
>>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
>>> m.restart()
<Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"

Source code in lazyllm/module/module.py

    def restart(self):
        """Restart the deployment services of the module and its submodules. Internally this calls ``start``.

**Returns:**

- ModuleBase: The instance itself, to support chained calls.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.restart()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self.start()

`update(*, recursive=True)`

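Update the module and all of its submodules. A module is trained when it overrides _get_train_tasks. After training, deployment and evaluation run automatically.

Parameters:

recursive (bool, default: True ) –

Whether to update all submodules recursively. Defaults to True.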

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> print(m.eval_result)
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
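The recursive update in the class source listing walks the module tree depth-first, collecting tasks from children before their parent and skipping modules it has already seen. The sketch below reproduces only that traversal order on plain dicts; `collect_post_order` and the dict layout are assumptions made for illustration.

```python
def collect_post_order(root, recursive=True):
    """Hypothetical helper: visit children before parents, each node once."""
    order, visited = [], set()
    stack = [(root, iter(root["children"] if recursive else []))]
    while stack:
        try:
            # Descend into the next unvisited child of the node on top.
            child = next(stack[-1][1])
            stack.append((child, iter(child["children"])))
        except StopIteration:
            # All children handled, so the node itself can be processed.
            node = stack.pop()[0]
            if node["id"] in visited:
                continue
            visited.add(node["id"])
            order.append(node["id"])
    return order


shared = {"id": "shared", "children": []}
a = {"id": "a", "children": [shared]}
b = {"id": "b", "children": [shared]}
root = {"id": "root", "children": [a, b]}

full = collect_post_order(root)
shallow = collect_post_order(root, recursive=False)
```

The `visited` set is what keeps a submodule shared by two parents from contributing its tasks twice.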

Source code in lazyllm/module/module.py

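    def update(self, *, recursive: bool = True):
        """Update the module and all of its submodules. A module is trained when it overrides ``_get_train_tasks``. After training, deployment and evaluation run automatically.

Args:
    recursive (bool): Whether to update all submodules recursively. Defaults to True.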


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        return self._update(mode=['train', 'server', 'eval'], recursive=recursive)

`stream_output(stream_output=None)`

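Context manager for streaming output during inference or execution.
When stream_output is a dict, it can specify a prefix and a suffix for the output, along with their colors.

Parameters:

stream_output (Optional[Union[bool, Dict]], default: None ) –
Streaming output configuration.
- If it is the boolean True, default streaming output is enabled.
- If it is a dict, it may contain the following keys:
  - 'prefix' (str): Prefix text for the output.
  - 'prefix_color' (str, optional): Color of the prefix.
  - 'suffix' (str): Suffix text for the output.
  - 'suffix_color' (str, optional): Color of the suffix.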
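The prefix and suffix behaviour can be sketched as a free-standing context manager. The `emitted` list stands in for the output queue, and the function below is a simplified assumption for illustration, not the method itself.

```python
from contextlib import contextmanager

emitted = []  # stands in for the output queue


@contextmanager
def stream_output(config=None):
    """Hypothetical free-function version of the context manager."""
    # Only a dict with a non-empty 'prefix' emits anything before the body.
    if isinstance(config, dict) and (prefix := config.get("prefix")):
        emitted.append((prefix, config.get("prefix_color")))
    yield
    # Likewise for 'suffix' after the body finishes normally.
    if isinstance(config, dict) and (suffix := config.get("suffix")):
        emitted.append((suffix, config.get("suffix_color")))


with stream_output({"prefix": "[start]", "prefix_color": "green", "suffix": "[end]"}):
    emitted.append(("body", None))

with stream_output(True):  # a bare bool adds no prefix or suffix here
    pass
```

As written, the suffix is skipped if the body raises, because the code after `yield` is not wrapped in `try`/`finally`.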

Source code in lazyllm/module/module.py

    @contextmanager
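    def stream_output(self, stream_output: Optional[Union[bool, Dict]] = None):
        """Context manager for streaming output during inference or execution.
When ``stream_output`` is a dict, it can specify a prefix and a suffix for the output, along with their colors.

Args:
    stream_output (Optional[Union[bool, Dict]]): Streaming output configuration.

        - If it is the boolean True, default streaming output is enabled.
        - If it is a dict, it may contain the following keys:

            - 'prefix' (str): Prefix text for the output.
            - 'prefix_color' (str, optional): Color of the prefix.
            - 'suffix' (str): Suffix text for the output.
            - 'suffix_color' (str, optional): Color of the suffix.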
"""
        if stream_output and isinstance(stream_output, dict) and (prefix := stream_output.get('prefix')):
            self._stream_output(prefix, stream_output.get('prefix_color'))
        yield
        if isinstance(stream_output, dict) and (suffix := stream_output.get('suffix')):
            self._stream_output(suffix, stream_output.get('suffix_color'))

`used_by(module_id)`

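Record which module uses the current module, marking the call relationship between modules.
Returns the module itself, so calls can be chained.

Parameters:

module_id (str) –

Unique ID of the upper-level module that calls this module.

Returns:

ModuleBase: The module itself, for chained calls.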

Source code in lazyllm/module/module.py

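    def used_by(self, module_id):
        """Record which module uses the current module, marking the call relationship between modules.
Returns the module itself, so calls can be chained.

Args:
    module_id (str): Unique ID of the upper-level module that calls this module.

**Returns:**

- ModuleBase: The module itself, for chained calls.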
"""
        self._used_by_moduleid = module_id
        return self

`register_hook(hook_type)`

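Register a hook that runs specific logic when the module is called.
A hook is a LazyLLMHook instance, or a plain callable that is wrapped in LazyLLMFuncHook. Hooks can add custom operations before and after the forward computation, such as logging or statistics.

Parameters:

hook_type (Union[LazyLLMHook, Callable]) –

The hook to register.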
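The registration rules (wrap plain callables, reject anything that is not a hook instance, ignore duplicates) can be sketched with stand-in classes. `Hook`, `FuncHook`, `HookRegistry`, the `pre_hook`/`post_hook` method names, and the `run` method are all hypothetical and chosen for illustration; they are not claimed to match the interface of `LazyLLMHook` or `LazyLLMFuncHook`.

```python
class Hook:
    """Hypothetical hook base; the method names are illustrative only."""

    def pre_hook(self, *args, **kw):
        pass

    def post_hook(self, output):
        pass


class FuncHook(Hook):
    """Wraps a plain callable so it can be stored like any other hook."""

    def __init__(self, func):
        self._func = func

    def post_hook(self, output):
        self._func(output)


class HookRegistry:
    def __init__(self):
        self._hooks = []

    def register_hook(self, hook):
        # Plain callables (but not classes) are wrapped first.
        if not isinstance(hook, (type, Hook)) and callable(hook):
            hook = FuncHook(hook)
        if not isinstance(hook, Hook):
            raise TypeError(f"Invalid hook type: {type(hook)}")
        if hook not in self._hooks:
            self._hooks.append(hook)

    def run(self, func, *args, **kw):
        # Assumed call order: every pre_hook, the function, every post_hook.
        for hook in self._hooks:
            hook.pre_hook(*args, **kw)
        output = func(*args, **kw)
        for hook in self._hooks:
            hook.post_hook(output)
        return output


seen = []
registry = HookRegistry()
logger = Hook()
registry.register_hook(logger)
registry.register_hook(logger)       # duplicate instance, ignored
registry.register_hook(seen.append)  # plain callable, wrapped in FuncHook
value = registry.run(lambda x: x * 2, 21)

try:
    registry.register_hook(Hook)  # a class rather than an instance
    rejected = False
except TypeError:
    rejected = True
```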

Source code in lazyllm/module/module.py

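    def register_hook(self, hook_type: Union[LazyLLMHook, Callable]):
        """Register a hook that runs specific logic when the module is called.
A hook is a ``LazyLLMHook`` instance, or a plain callable that is wrapped in ``LazyLLMFuncHook``. Hooks can add custom operations before and after the forward computation, such as logging or statistics.

Args:
    hook_type (Union[LazyLLMHook, Callable]): The hook to register.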
"""
        if not isinstance(hook_type, type) and not isinstance(hook_type, LazyLLMHook) and callable(hook_type):
            hook_type = LazyLLMFuncHook(hook_type)
        if not isinstance(hook_type, LazyLLMHook):
            raise TypeError(f'Invalid hook type: {type(hook_type)}, '
                            'must be subclass or instance of LazyLLMHook, or callable function')
        if hook_type not in self._hooks:
            self._hooks.append(hook_type)

`unregister_hook(hook_type)`

Unregister a registered hook.
If the hook exists in the module, it is removed and no longer runs when the module is called.

Parameters:

hook_type (LazyLLMHook) –

The hook to unregister.

Source code in lazyllm/module/module.py

    def unregister_hook(self, hook_type: LazyLLMHook):
        """Unregister a registered hook.
If the hook exists in the module, it is removed and no longer runs when the module is called.

Args:
    hook_type (LazyLLMHook): The hook to unregister.
"""
        if hook_type in self._hooks:
            self._hooks.remove(hook_type)

`clear_hooks()`

Remove all hooks registered on the module.
After this call, the module no longer runs any hook logic.

Source code in lazyllm/module/module.py

    def clear_hooks(self):
        """Remove all hooks registered on the module.
After this call, the module no longer runs any hook logic.
"""
        self._hooks = []

`update_server(*, recursive=True)`

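Update the deployment (server) part of the module and its submodules. Modules or submodules that implement deployment have their services started.

Parameters:

recursive (bool, default: True ) –

Whether to update the deployment tasks of all submodules recursively. Defaults to True.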

Source code in lazyllm/module/module.py

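    def update_server(self, *, recursive: bool = True):
        """Update the deployment (server) part of the module and its submodules. Modules or submodules that implement deployment have their services started.

Args:
    recursive (bool): Whether to update the deployment tasks of all submodules recursively. Defaults to True.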
"""
        return self._update(mode=['server'], recursive=recursive)

`wait()`

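Wait for the module or its submodules to finish executing. The current implementation is empty; subclasses can implement it according to their deployment logic.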

Source code in lazyllm/module/module.py

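    def wait(self):
        """Wait for the module or its submodules to finish executing. The current implementation is empty; subclasses can implement it according to their deployment logic.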
"""
        pass

`stop()`

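Stop the module and all of its submodules. This recursively calls stop on each submodule, and is suited to releasing resources or shutting down services.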

Source code in lazyllm/module/module.py

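    def stop(self):
        """Stop the module and all of its submodules. This recursively calls ``stop`` on each submodule, and is suited to releasing resources or shutting down services.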
"""
        for m in self.submodules:
            m.stop()

`for_each(filter, action)`

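Apply an operation to all submodules of the module. All submodules are traversed recursively, and action runs on each submodule that satisfies filter.

Parameters:

filter (Callable) –

Function that takes a submodule and returns a boolean indicating whether to run the action.
action (Callable) –

Function applied to each submodule that satisfies the filter.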
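A short sketch of the traversal, using a hypothetical `Node` class in place of real modules, shows which objects the filter sees and in what order.

```python
class Node:
    """Hypothetical stand-in for a module that holds submodules."""

    def __init__(self, name, *children):
        self.name = name
        self.submodules = list(children)

    def for_each(self, filter, action):
        for sub in self.submodules:
            if filter(sub):
                action(sub)
            # Recurse whether or not this submodule matched the filter.
            sub.for_each(filter, action)


tree = Node("root", Node("llm-a", Node("embed")), Node("llm-b"))

matched = []
tree.for_each(lambda m: m.name.startswith("llm"), lambda m: matched.append(m.name))

visited = []
tree.for_each(lambda m: True, lambda m: visited.append(m.name))
```

The module on which `for_each` is called is not itself passed to the filter; only its descendants are.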

Source code in lazyllm/module/module.py

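    def for_each(self, filter, action):
        """Apply an operation to all submodules of the module. All submodules are traversed recursively, and ``action`` runs on each submodule that satisfies ``filter``.

Args:
    filter (Callable): Function that takes a submodule and returns a boolean indicating whether to run the action.
    action (Callable): Function applied to each submodule that satisfies the filter.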
"""
        for submodule in self.submodules:
            if filter(submodule):
                action(submodule)
            submodule.for_each(filter, action)

`lazyllm.module.servermodule.LLMBase`

Bases: object

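Base class for large language model modules.
Manages the initialization and switching of streaming output, the Prompt, and the formatter, handles file information in the input, and supports instance sharing.

Parameters:

stream (bool or dict, default: False ) –

Whether to enable streaming output, or a streaming configuration. Defaults to False.
return_trace (bool) –

Whether to return the execution trace. Defaults to False. This parameter is documented in the docstring but is not declared in the __init__ signature of this source.
init_prompt (bool, default: True ) –

Whether to create a default Prompt automatically at initialization. Defaults to True.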
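Instance sharing, as mentioned in the class description, relies on a shallow copy. The sketch below illustrates what that implies: the copy is a new object whose attributes can be rebound independently, while objects it references are shared with the original. `Client` and its attributes are hypothetical and only model the copy semantics, not LLMBase itself.

```python
import copy


class Client:
    """Hypothetical stand-in; only illustrates shallow-copy sharing."""

    def __init__(self, backend, prompt=None, stream=False):
        self.backend = backend  # stands in for a heavy shared resource
        self.prompt = prompt
        self.stream = stream

    def share(self, prompt=None, stream=None):
        new = copy.copy(self)  # new object, same attribute references
        if prompt is not None:
            new.prompt = prompt
        if stream is not None:
            new.stream = stream
        return new


base = Client(backend={"name": "dummy"}, prompt="You are helpful.")
agent = base.share(prompt="You are a translator.", stream=True)
```

Rebinding `agent.prompt` leaves `base.prompt` untouched, but mutating `agent.backend` in place would be visible through `base.backend` as well.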

Source code in lazyllm/module/servermodule.py

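class LLMBase(object):
    """Base class for large language model modules.
Manages the initialization and switching of streaming output, the Prompt, and the formatter, handles file information in the input, and supports instance sharing.

Args:
    stream (bool or dict): Whether to enable streaming output, or a streaming configuration. Defaults to False.
    return_trace (bool): Whether to return the execution trace. Defaults to False. This parameter is not declared in the ``__init__`` signature of this source.
    init_prompt (bool): Whether to create a default Prompt automatically at initialization. Defaults to True.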
"""
    def __init__(self, stream: Union[bool, Dict[str, str]] = False, init_prompt: bool = True,
                 type: Optional[Union[str, LLMType]] = None, static_params: Optional[StaticParams] = None):
        self._stream = stream
        self._type = LLMType(type) if type else LLMType.LLM
        if init_prompt: self.prompt()
        self._static_params = static_params or {}
        __class__.formatter(self)

    def _get_files(self, input, lazyllm_files):
        if isinstance(input, package):
            assert not lazyllm_files, 'Duplicate `files` argument provided by args and kwargs'
            input, lazyllm_files = input

        if isinstance(input, list):
            has_images = any(_is_image_path(item) for item in input)
            if has_images:
                assert not lazyllm_files, 'Cannot use both interleaved input and lazyllm_files parameter'
                input, files = _parse_interleaved_input(input)
                return input, files

        if isinstance(input, str) and input.startswith(LAZYLLM_QUERY_PREFIX):
            assert not lazyllm_files, 'Argument `files` is already provided by query'
            deinput = decode_query_with_filepaths(input)
            assert isinstance(deinput, dict), 'decode_query_with_filepaths must return a dict.'
            input, files = deinput['query'], deinput['files']
        else:
            files = _lazyllm_get_file_list(lazyllm_files) if lazyllm_files else []
        return input, files

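    def prompt(self, prompt: Optional[str] = None, history: Optional[List[List[str]]] = None):
        """Set or switch the Prompt. Accepts None, an instance of a PrompterBase subclass, or a string/dict from which a ChatPrompter is created.

Args:
    prompt (str/dict/PrompterBase/None): The Prompt to set.
    history (list): Conversation history; only valid when prompt is a string or dict.

**Returns:**

- self: Returned to allow method chaining.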
"""
        if prompt is None:
            assert not history, 'history is not supported in EmptyPrompter'
            self._prompt = EmptyPrompter()
        elif isinstance(prompt, PrompterBase):
            assert not history, 'history is not supported in user defined prompter'
            self._prompt = prompt
        elif isinstance(prompt, (str, dict)):
            self._prompt = ChatPrompter(prompt, history=history)
        else:
            raise TypeError(f'{prompt} type is not supported.')
        return self

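    def formatter(self, format: Optional[FormatterBase] = None):
        """Set or switch the output formatter. Accepts None, an instance of a FormatterBase subclass, or a callable.

Args:
    format (FormatterBase/Callable/None): Formatter object or function. Defaults to None.

**Returns:**

- self: Returned to allow method chaining.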
"""
        assert format is None or isinstance(format, FormatterBase) or callable(format), 'format must be None or Callable'
        self._formatter = format or EmptyFormatter()
        return self

    def share(self, prompt: Optional[Union[str, dict, PrompterBase]] = None, format: Optional[FormatterBase] = None,
              stream: Optional[Union[bool, Dict[str, str]]] = None, history: Optional[List[List[str]]] = None,
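              copy_static_params: bool = False):
        """Create a shallow copy of the current instance, optionally resetting attributes such as prompt, formatter, and stream.  
Useful when multiple sessions or multiple Agents share a base configuration but customize some parameters.

Args:
    prompt (str/dict/PrompterBase/None): New Prompt, optional.
    format (FormatterBase/None): New formatter, optional.
    stream (bool/dict/None): New streaming setting, optional.
    history (list/None): New conversation history; only used when a Prompt is set.
    copy_static_params (bool): Whether to deep-copy static_params into the new instance. Defaults to False.

**Returns:**

- LLMBase: The new shared instance.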
"""
        new = copy.copy(self)
        new._hooks = self._hooks.copy()
        new._set_mid()
        if prompt is not None: new.prompt(prompt, history=history)
        if format is not None: new.formatter(format)
        if stream is not None: new.stream = stream
        if copy_static_params: new._static_params = copy.deepcopy(self._static_params)
        return new

    @property
    def type(self):
        return self._type.value

    @property
    def stream(self):
        return self._stream

    @stream.setter
    def stream(self, v: Union[bool, Dict[str, str]]):
        self._stream = v

    async def astream_call(self, *args, **kwargs):
        """Async generator that yields tokens as they arrive. Suitable for FastAPI/asyncio contexts."""
        from .stream_helper import StreamCallHelper
        llm = self.share()
        kwargs.setdefault('stream_output', True)
        async for item in StreamCallHelper(llm).astream(*args, **kwargs):
            if item.get('tag', '') in ('text', 'think'):
                yield item.get('delta', '')

    @property
    def static_params(self) -> StaticParams:
        return self._static_params

    @static_params.setter
    def static_params(self, value: StaticParams):
        if not isinstance(value, dict):
            raise TypeError('static_params must be a dict (TypedDict)')
        self._static_params = value

    def __or__(self, other):
        if not isinstance(other, FormatterBase):
            return NotImplemented
        return self.share(format=(other if isinstance(self._formatter, EmptyFormatter) else (self._formatter | other)))

    @property
    def appendix_hash_key(self):
        try:
            prompts = self._prompt.generate_prompt('x')
        except Exception:
            prompts = self._prompt._instruction_template
        if not isinstance(prompts, str):
            try:
                content = json.dumps(prompts, sort_keys=True)
            except Exception:
                content = str(prompts)
        else:
            content = prompts
        return hashlib.md5(content.encode()).hexdigest()

`prompt(prompt=None, history=None)`

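Set or switch the Prompt. Accepts None, an instance of a PrompterBase subclass, or a string/dict from which a ChatPrompter is created.

Parameters:

prompt (str / dict / PrompterBase / None, default: None ) –

The Prompt to set.
history (list, default: None ) –

Conversation history; only valid when prompt is a string or dict.

Returns:

self: Returned to allow method chaining.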
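
The way `prompt` dispatches on the type of its argument can be illustrated without LazyLLM installed. The following is a minimal sketch, not the library's implementation: `FakePrompter`, `FakeEmptyPrompter`, `FakeChatPrompter`, and `PromptHolder` are hypothetical stand-ins that only mirror the branching described above.

```python
class FakePrompter:
    """Hypothetical stand-in for PrompterBase."""


class FakeEmptyPrompter(FakePrompter):
    """Hypothetical stand-in for EmptyPrompter."""


class FakeChatPrompter(FakePrompter):
    """Hypothetical stand-in for ChatPrompter."""

    def __init__(self, instruction, history=None):
        self.instruction = instruction
        self.history = history or []


class PromptHolder:
    """Mirrors the three accepted inputs: None, a prompter instance, or str/dict."""

    def prompt(self, prompt=None, history=None):
        if prompt is None:
            if history:
                raise ValueError('history requires a str or dict prompt')
            self._prompt = FakeEmptyPrompter()
        elif isinstance(prompt, FakePrompter):
            if history:
                raise ValueError('history requires a str or dict prompt')
            self._prompt = prompt
        elif isinstance(prompt, (str, dict)):
            self._prompt = FakeChatPrompter(prompt, history=history)
        else:
            raise TypeError(f'{type(prompt).__name__} is not supported')
        return self  # returning self is what makes chaining possible


holder = PromptHolder().prompt('You are a helpful assistant.', history=[['hi', 'hello']])
prompt_kind = type(holder._prompt).__name__
empty_kind = type(PromptHolder().prompt()._prompt).__name__
```

Because the setter returns the instance, a call such as `PromptHolder().prompt('...')` can be followed directly by further configuration calls.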

Source code in lazyllm/module/servermodule.py

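    def prompt(self, prompt: Optional[str] = None, history: Optional[List[List[str]]] = None):
        """Set or switch the Prompt. Accepts None, an instance of a PrompterBase subclass, or a string/dict from which a ChatPrompter is created.

Args:
    prompt (str/dict/PrompterBase/None): The Prompt to set.
    history (list): Conversation history; only valid when prompt is a string or dict.

**Returns:**

- self: Returned to allow method chaining.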
"""
        if prompt is None:
            assert not history, 'history is not supported in EmptyPrompter'
            self._prompt = EmptyPrompter()
        elif isinstance(prompt, PrompterBase):
            assert not history, 'history is not supported in user defined prompter'
            self._prompt = prompt
        elif isinstance(prompt, (str, dict)):
            self._prompt = ChatPrompter(prompt, history=history)
        else:
            raise TypeError(f'{prompt} type is not supported.')
        return self

`formatter(format=None)`

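Set or switch the output formatter. Accepts None, an instance of a FormatterBase subclass, or a callable.

Parameters:

format (FormatterBase / Callable / None, default: None ) –

Formatter object or function. Defaults to None.

Returns:

self: Returned to allow method chaining.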

Source code in lazyllm/module/servermodule.py

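    def formatter(self, format: Optional[FormatterBase] = None):
        """Set or switch the output formatter. Accepts None, an instance of a FormatterBase subclass, or a callable.

Args:
    format (FormatterBase/Callable/None): Formatter object or function. Defaults to None.

**Returns:**

- self: Returned to allow method chaining.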
"""
        assert format is None or isinstance(format, FormatterBase) or callable(format), 'format must be None or Callable'
        self._formatter = format or EmptyFormatter()
        return self

`share(prompt=None, format=None, stream=None, history=None, copy_static_params=False)`

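Create a shallow copy of the current instance, optionally resetting attributes such as prompt, formatter, and stream.
Useful when multiple sessions or multiple Agents share a base configuration but customize some parameters.

Parameters:

prompt (str / dict / PrompterBase / None, default: None ) –

New Prompt, optional.
format (FormatterBase / None, default: None ) –

New formatter, optional.
stream (bool / dict / None, default: None ) –

New streaming setting, optional.
history (list / None, default: None ) –

New conversation history; only used when a Prompt is set.
copy_static_params (bool, default: False ) –

Whether to deep-copy static_params into the new instance.

Returns:

LLMBase: The new shared instance.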
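
The difference between a shallow copy and the optional deep copy of the static parameters is easiest to see on a toy object. The sketch below assumes nothing about LazyLLM beyond the copy strategy described above; `SharedConfig` and its attributes are hypothetical names used only for illustration.

```python
import copy


class SharedConfig:
    """Hypothetical stand-in that mimics the copy strategy of share()."""

    def __init__(self, prompt, static_params=None):
        self.prompt_text = prompt
        self.static_params = static_params or {}

    def share(self, prompt=None, copy_static_params=False):
        # Shallow copy: attributes of the copy still reference the same objects.
        new = copy.copy(self)
        if prompt is not None:
            new.prompt_text = prompt  # rebinding affects only the copy
        if copy_static_params:
            # Deep copy: the new instance gets its own independent dict.
            new.static_params = copy.deepcopy(self.static_params)
        return new


base = SharedConfig('base prompt', {'temperature': 0.1})
linked = base.share(prompt='agent A prompt')
isolated = base.share(prompt='agent B prompt', copy_static_params=True)

# Mutating the original dict in place is visible through the shallow copy only.
base.static_params['temperature'] = 0.7
```

After the mutation, `linked.static_params` reflects the new value because it is the same dict as `base.static_params`, while `isolated.static_params` keeps the value it had when it was copied.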

Source code in lazyllm/module/servermodule.py

    def share(self, prompt: Optional[Union[str, dict, PrompterBase]] = None, format: Optional[FormatterBase] = None,
              stream: Optional[Union[bool, Dict[str, str]]] = None, history: Optional[List[List[str]]] = None,
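              copy_static_params: bool = False):
        """Create a shallow copy of the current instance, optionally resetting attributes such as prompt, formatter, and stream.  
Useful when multiple sessions or multiple Agents share a base configuration but customize some parameters.

Args:
    prompt (str/dict/PrompterBase/None): New Prompt, optional.
    format (FormatterBase/None): New formatter, optional.
    stream (bool/dict/None): New streaming setting, optional.
    history (list/None): New conversation history; only used when a Prompt is set.
    copy_static_params (bool): Whether to deep-copy static_params into the new instance. Defaults to False.

**Returns:**

- LLMBase: The new shared instance.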
"""
        new = copy.copy(self)
        new._hooks = self._hooks.copy()
        new._set_mid()
        if prompt is not None: new.prompt(prompt, history=history)
        if format is not None: new.formatter(format)
        if stream is not None: new.stream = stream
        if copy_static_params: new._static_params = copy.deepcopy(self._static_params)
        return new

`lazyllm.module.ActionModule`

Bases: ModuleBase

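Wraps callable objects such as functions, modules, and flows into a single Module. Any wrapped Module (including Modules inside a flow) becomes a submodule of this Module.

Parameters:

action (Callable | list[Callable], default: () ) –

The object to wrap: one callable or a group of callables.
return_trace (bool, default: False ) –

Whether to enable trace mode, which records the call stack. Defaults to False.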
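
The wrapping behaviour can be pictured with a small stand-alone sketch. `MiniPipeline` and `MiniActionModule` below are hypothetical stand-ins, not LazyLLM classes; the sketch assumes that a group of callables is executed sequentially, each step receiving the previous step's output.

```python
class MiniPipeline:
    """Hypothetical sequential flow: each step feeds the next."""

    def __init__(self, *steps):
        self.steps = steps

    def __call__(self, value):
        for step in self.steps:
            value = step(value)
        return value


class MiniActionModule:
    """Hypothetical wrapper that forwards calls to the wrapped callables."""

    def __init__(self, *action):
        # A group of callables is combined into one sequential flow.
        self.action = MiniPipeline(*action)

    def forward(self, *args, **kw):
        return self.action(*args, **kw)

    # Calling the module is the same as calling forward.
    __call__ = forward


double_then_inc = MiniActionModule(lambda x: x * 2, lambda x: x + 1)
result = double_then_inc(5)
```

Here `result` is produced by doubling the input and then adding one, which shows why calling the wrapper and calling `forward` are interchangeable.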

Source code in lazyllm/module/module.py

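class ActionModule(ModuleBase):
    """Wraps callable objects such as functions, modules, and flows into a single Module. Any wrapped Module (including Modules inside a flow) becomes a submodule of this Module.

Args:
    action (Callable|list[Callable]): The object to wrap: one callable or a group of callables.
    return_trace (bool): Whether to enable trace mode, which records the call stack. Defaults to ``False``.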
"""
    def __init__(self, *action, return_trace=False):
        super().__init__(return_trace=return_trace)
        if len(action) == 1 and isinstance(action, FlowBase): action = action[0]
        if isinstance(action, (tuple, list)):
            action = Pipeline(*action)
        assert isinstance(action, FlowBase), f'Invalid action type {type(action)}'
        self.action = action

    def forward(self, *args, **kw):
        """Run the wrapped action on the given inputs. Equivalent to calling the module itself.

Args:
    args (Any): Positional arguments passed to the wrapped action.
    kw (Any): Keyword arguments passed to the wrapped action.

**Returns:**

- Any: The result of executing the wrapped action.
"""
        return self.action(*args, **kw)

    @property
    def submodules(self):
        """Return all submodules of type ModuleBase contained in the wrapped action. Modules nested inside a Pipeline are expanded automatically.

**Returns:**

- list[ModuleBase]: The list of submodules.
"""
        try:
            if isinstance(self.action, FlowBase):
                submodule = []
                self.action.for_each(lambda x: isinstance(x, ModuleBase), lambda x: submodule.append(x))
                return submodule
        except Exception as e:
            raise RuntimeError(f'{str(e)}\nOriginal traceback:\n{"".join(traceback.format_tb(e.__traceback__))}')
        return super().submodules

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Action', subs=[repr(self.action)],
                                 name=self.name, return_trace=self._return_trace)

`submodules` `property`

Return all submodules of type ModuleBase contained in the wrapped action. Modules nested inside a Pipeline are expanded automatically.

Returns:

list[ModuleBase]: The list of submodules.
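
Collecting submodules from a nested flow amounts to a filtered traversal. The sketch below is an assumption about how such a traversal could work, not the library's code: `Mod` and `Flow` are hypothetical stand-ins, and `for_each(predicate, action)` here simply recurses into nested flows and applies `action` to every item that satisfies `predicate`.

```python
class Mod:
    """Hypothetical stand-in for a ModuleBase instance."""

    def __init__(self, name):
        self.name = name

    def __call__(self, x):
        return x


class Flow:
    """Hypothetical flow that holds callables and possibly nested flows."""

    def __init__(self, *items):
        self.items = items

    def for_each(self, predicate, action):
        for item in self.items:
            if isinstance(item, Flow):
                item.for_each(predicate, action)  # expand nested flows
            elif predicate(item):
                action(item)


# Plain functions are skipped; modules at any nesting depth are collected.
flow = Flow(Mod('a'), len, Flow(Mod('b'), str.upper))
found = []
flow.for_each(lambda x: isinstance(x, Mod), found.append)
names = [m.name for m in found]
```

The resulting list contains the modules in traversal order, regardless of how deeply they were nested.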

`forward(*args, **kw)`

Run the wrapped action on the given inputs. Equivalent to calling the module itself.

Parameters:

args (Any, default: () ) –

Positional arguments passed to the wrapped action.
kw (Any) –

Keyword arguments passed to the wrapped action.

Returns:

Any: The result of executing the wrapped action.

Source code in lazyllm/module/module.py

    def forward(self, *args, **kw):
        """Run the wrapped action on the given inputs. Equivalent to calling the module itself.

Args:
    args (Any): Positional arguments passed to the wrapped action.
    kw (Any): Keyword arguments passed to the wrapped action.

**Returns:**

- Any: The result of executing the wrapped action.
"""
        return self.action(*args, **kw)

`lazyllm.module.TrainableModule`

Bases: UrlModule

Trainable module. All models (including LLMs, Embedding models, and so on) are served through TrainableModule.

TrainableModule(base_model='', target_path='', *, stream=False, return_trace=False)

Parameters:

base_model (str, default: '' ) –

Name or path of the base model.
target_path (str, default: '' ) –

Path where fine-tuning tasks are saved.
stream (bool, default: False ) –

Whether to stream the output.
return_trace (bool, default: False ) –

Whether to record results in the trace.
trust_remote_code (bool, default: True ) –

Whether to trust remote code.
type (str / LLMType, default: None ) –

Model type.
source (str, default: None ) –

Model source. If not set, it is read from the environment variable LAZYLLM_MODEL_SOURCE.

TrainableModule.trainset(v):

Sets the training set of the TrainableModule.

Parameters:

v (str) –

Path to the training/fine-tuning dataset.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset('/file/to/path').deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}

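TrainableModule.train_method(v, **kw):

Sets the training method of the TrainableModule (continued pre-training is not yet supported; support is expected in the next release).

Parameters:

v (LazyLLMTrainBase) –

Training method; available values include train.auto and others.
kw (**dict) –

Parameters required by the training method, matching the parameters of v.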

TrainableModule.finetune_method(v, **kw):

Sets the fine-tuning method of the TrainableModule and its parameters.

Parameters:

v (LazyLLMFinetuneBase) –

Fine-tuning method; available values include finetune.auto / finetune.alpacalora / finetune.collie and others.
kw (**dict) –

Parameters required by the fine-tuning method, matching the parameters of v.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}

TrainableModule.deploy_method(v, **kw):

Sets the deployment method of the TrainableModule and its parameters.

Parameters:

v (LazyLLMDeployBase) –

Deployment method; available values include deploy.auto / deploy.lightllm / deploy.vllm and others.
kw (**dict) –

Parameters required by the deployment method, matching the parameters of v.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).mode('finetune')
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]

TrainableModule.mode(v):

Sets whether the TrainableModule trains or fine-tunes during update.

Parameters:

v (str) –

Whether update performs training or fine-tuning. Accepted values are 'finetune' and 'train'; defaults to 'finetune'.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
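
The setters above can be chained because each one returns the module itself. The source listing further down shows `__getattr__` handing out `functools.partial(..., _return_value=self)` for every name in `builder_keys`; the sketch below reproduces only that delegation pattern. `_Impl` and `Builder` are hypothetical stand-ins, and the stored `settings` dict is an illustrative assumption rather than the library's internal state.

```python
import functools


class _Impl:
    """Hypothetical implementation object that stores the settings."""

    def __init__(self):
        self.settings = {}

    def mode(self, value, *, _return_value=None):
        self.settings['mode'] = value
        return _return_value

    def trainset(self, value, *, _return_value=None):
        self.settings['trainset'] = value
        return _return_value


class Builder:
    """Hypothetical outer module that forwards setter calls to _Impl."""

    builder_keys = ('mode', 'trainset')

    def __init__(self):
        self._impl = _Impl()

    def __getattr__(self, key):
        # Only called for attributes not found normally, so _impl is unaffected.
        if key in type(self).builder_keys:
            return functools.partial(getattr(self._impl, key), _return_value=self)
        raise AttributeError(f'{type(self).__name__} object has no attribute {key}')


builder = Builder()
chained = builder.trainset('/file/to/path').mode('finetune')
```

Each forwarded call returns the outer object rather than the implementation object, so the chain always continues on the public module.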

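eval(*, recursive=True)

Evaluates the module (and all of its submodules). This takes effect only after an evaluation set has been configured through evalset.

Parameters:

recursive (bool) –

Whether to evaluate all submodules recursively. Defaults to True.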

evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)

Sets the evaluation set for the module. A module with an evaluation set is evaluated when update or eval runs, and the results are stored in the eval_result attribute.

evalset(evalset, collect_f=lambda x: ...)→ None

Parameters:

evalset (list) –

The evaluation dataset.
collect_f (Callable) –

Post-processing applied to the evaluation results. By default no post-processing is performed.

evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None

Parameters:

evalset (str) –

Path to the evaluation set.
load_f (Callable) –

Loader for the evaluation set; it parses the file format and converts the content to a list.
collect_f (Callable) –

Post-processing applied to the evaluation results. By default no post-processing is performed.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
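
The two call forms of `evalset` differ only in where the list of samples comes from. The following sketch is a simplified model under stated assumptions, not the library's implementation: `MiniEvalModule` is a hypothetical class, the string passed as a "path" is parsed directly by a toy loader instead of being read from disk, and evaluation runs sequentially.

```python
class MiniEvalModule:
    """Hypothetical module that evaluates a function over an evaluation set."""

    def __init__(self, fn):
        self._fn = fn
        self._evalset = []
        self._collect_f = lambda x: x
        self.eval_result = None

    def evalset(self, evalset, load_f=None, collect_f=lambda x: x):
        if isinstance(evalset, str):
            # Path form: the loader turns the source into a list of samples.
            if load_f is None:
                raise ValueError('load_f is required when evalset is a str')
            evalset = load_f(evalset)
        self._evalset = list(evalset)
        self._collect_f = collect_f

    def eval(self):
        outputs = [self._fn(item) for item in self._evalset]
        self.eval_result = self._collect_f(outputs)


square = MiniEvalModule(lambda x: x * x)

# List form with a post-processing step that sums the outputs.
square.evalset([1, 2, 3], collect_f=sum)
square.eval()
summed = square.eval_result

# String form with a toy loader that parses comma-separated integers.
square.evalset('2,3', load_f=lambda s: [int(v) for v in s.split(',')])
square.eval()
loaded = square.eval_result
```

With the default `collect_f`, the raw list of outputs is stored unchanged, which matches the "no post-processing" default described above.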

restart()

Restarts the module and all of its submodules.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.restart()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"

start()

Deploys the module and all of its submodules.

Example:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.start()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"

Source code in lazyllm/module/llms/trainablemodule.py

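class TrainableModule(UrlModule):
    """Trainable module. All models (including LLMs, Embedding models, and so on) are served through TrainableModule.

<span style="font-size: 20px;">**`TrainableModule(base_model='', target_path='', *, stream=False, return_trace=False)`**</span>


Args:
    base_model (str): Name or path of the base model.
    target_path (str): Path where fine-tuning tasks are saved.
    stream (bool): Whether to stream the output.
    return_trace (bool): Whether to record results in the trace.
    trust_remote_code (bool): Whether to trust remote code.
    type (str/LLMType): Model type.
    source (str): Model source. If not set, it is read from the environment variable LAZYLLM_MODEL_SOURCE.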

<span style="font-size: 20px;">**`TrainableModule.trainset(v):`**</span>

Sets the training set of the TrainableModule.

Args:
    v (str): Path to the training/fine-tuning dataset.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset('/file/to/path').deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
```

<span style="font-size: 20px;">**`TrainableModule.train_method(v, **kw):`**</span>

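Sets the training method of the TrainableModule (continued pre-training is not yet supported; support is expected in the next release).

Args:
    v (LazyLLMTrainBase): Training method; available values include ``train.auto`` and others.
    kw (**dict): Parameters required by the training method, matching the parameters of v.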

<span style="font-size: 20px;">**`TrainableModule.finetune_method(v, **kw):`**</span>

Sets the fine-tuning method of the TrainableModule and its parameters.

Args:
    v (LazyLLMFinetuneBase): Fine-tuning method; available values include ``finetune.auto`` / ``finetune.alpacalora`` / ``finetune.collie`` and others.
    kw (**dict): Parameters required by the fine-tuning method, matching the parameters of v.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
```

<span style="font-size: 20px;">**`TrainableModule.deploy_method(v, **kw):`**</span>

Sets the deployment method of the TrainableModule and its parameters.

Args:
    v (LazyLLMDeployBase): Deployment method; available values include ``deploy.auto`` / ``deploy.lightllm`` / ``deploy.vllm`` and others.
    kw (**dict): Parameters required by the deployment method, matching the parameters of v.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).mode('finetune')
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
```

<span style="font-size: 20px;">**`TrainableModule.mode(v):`**</span>

Sets whether the TrainableModule trains or fine-tunes during update.

Args:
    v (str): Whether update performs training or fine-tuning. Accepted values are 'finetune' and 'train'; defaults to 'finetune'.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
```

<span style="font-size: 20px;">**`eval(*, recursive=True)`**</span>

Evaluates the module (and all of its submodules). This takes effect only after an evaluation set has been configured through evalset.

Args:
    recursive (bool): Whether to evaluate all submodules recursively. Defaults to True.

<span style="font-size: 20px;">**`evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)`**</span>

Sets the evaluation set for the module. A module with an evaluation set is evaluated when ``update`` or ``eval`` runs, and the results are stored in the eval_result attribute.

<span style="font-size: 18px;">&ensp;**`evalset(evalset, collect_f=lambda x: ...)→ None `**</span>

Args:
    evalset (list): The evaluation dataset.
    collect_f (Callable): Post-processing applied to the evaluation results. By default no post-processing is performed.

<span style="font-size: 18px;">&ensp;**`evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None`**</span>

Args:
    evalset (str): Path to the evaluation set.
    load_f (Callable): Loader for the evaluation set; it parses the file format and converts the content to a list.
    collect_f (Callable): Post-processing applied to the evaluation results. By default no post-processing is performed.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
```

<span style="font-size: 20px;">**`restart() `**</span>

Restarts the module and all of its submodules.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.restart()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
```

<span style="font-size: 20px;">**`start()`**</span>

Deploys the module and all of its submodules.

**Example:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.start()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
```

"""
    builder_keys = _TrainableModuleImpl.builder_keys

    def __init__(self, base_model: Option = '', target_path='', *, stream: Union[bool, Dict[str, str]] = False,
                 return_trace: bool = False, trust_remote_code: bool = True,
                 type: Optional[Union[str, LLMType]] = None, source: Optional[str] = None,
                 use_model_map: Union[str, bool] = True):
        super().__init__(url=None, stream=stream, return_trace=return_trace, init_prompt=False)
        self._template = _UrlTemplateStruct()
        self._impl = _TrainableModuleImpl(base_model, target_path, stream, None, lazyllm.finetune.auto,
                                          lazyllm.deploy.auto, self._template, self._url_wrapper,
                                          trust_remote_code, type, source=source, use_model_map=use_model_map)
        self._stream = stream
        self.prompt()
        if config['cache_local_module']:
            self.use_cache()

    template_message = property(lambda self: self._template.template_message)
    keys_name_handle = property(lambda self: self._template.keys_name_handle)
    template_headers = property(lambda self: self._template.template_headers)
    extract_result_func = property(lambda self: self._template.extract_result_func)
    stream_parse_parameters = property(lambda self: self._template.stream_parse_parameters)
    stream_url_suffix = property(lambda self: self._template.stream_url_suffix)

    base_model = property(lambda self: self._impl._base_model)
    target_path = property(lambda self: self._impl._target_path)
    finetuned_model_path = property(lambda self: self._impl._finetuned_model_path)
    _url_id = property(lambda self: self._impl._module_id)

    @property
    def series(self):
        return re.sub(r'\d+$', '', ModelManager._get_model_name(self.base_model).split('-')[0].upper())

    @property
    def type(self):
        if self._impl._type is not None: return self._impl._type.value
        return ModelManager.get_model_type(self.base_model).upper()

    def get_all_models(self):
        """get_all_models() -> List[str]

Return the paths of all fine-tuned models under the current target path.

**Returns:**

- List[str]: Names or paths of all fine-tuned models.
"""
        return self._impl._get_all_finetuned_models()

    def set_specific_finetuned_model(self, model_path):
        """set_specific_finetuned_model(model_path: str) -> None

Set the path of the specific fine-tuned model to use.

Args:
    model_path (str): Path of the fine-tuned model to use.
"""
        return self._impl._set_specific_finetuned_model(model_path)

    @property
    def _deploy_type(self):
        if self._impl._deploy is not lazyllm.deploy.AutoDeploy:
            return self._impl._deploy
        elif self._impl._deployer:
            return type(self._impl._deployer)
        else:
            return lazyllm.deploy.AutoDeploy

    def wait(self):
        """Wait for the model deployment task to finish. This method blocks the current thread until deployment completes.


Examples:
    >>> import lazyllm
    >>> class Mywait(lazyllm.module.llms.TrainableModule):
    ...    def forward(self):
    ...        self.wait()
    """
        if launcher := self._impl._launchers['default'].get('deploy'):
            launcher.wait()

    def stop(self, task_name: Optional[str] = None):
        """Stop a specific task of the model.

Args:
    task_name (str): Name of the task to stop. Defaults to None, in which case the deploy task is stopped.


Examples:
    >>> import lazyllm
    >>> class Mystop(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, task):
    ...        self.stop(task)
    """
        try:
            launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        except KeyError:
            raise RuntimeError('Cannot stop an unstarted task')
        if not task_name: self._impl._get_deploy_tasks.flag.reset()
        launcher.cleanup()

    def status(self, task_name: Optional[str] = None):
        """status(task_name: Optional[str] = None) -> str

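Return the current status of the specified task in the module.

Args:
    task_name (Optional[str]): Task name (for example 'deploy'). By default the status of the 'deploy' task is returned.

**Returns:**

- str: Status string, for example 'running', 'finished', or 'stopped'.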
"""
        launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        return launcher.status

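    def log_path(self, task_name: Optional[str] = None):
        """Get the log path of a task.

Returns the log file path for the given task name. Both the default deploy task and manually specified tasks are supported.

Args:
    task_name (Optional[str]): Task name. Defaults to None, in which case the log of the default deploy task is returned.

Returns:
    str: Path of the log file.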
"""
        launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        return launcher.log_path

    # modify default value to ''
    def prompt(self, prompt: Union[str, dict] = '', history: Optional[List[List[str]]] = None):
        """Process the input prompt into the format required by the model.

Args:
    prompt (str/dict): The input prompt. Defaults to an empty string.
    history (list): Conversation history.


Examples:
    >>> import lazyllm
    >>> class Myprompt(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, prompt, history):
    ...        self.prompt(prompt,history)
    """
        if self.base_model != '' and prompt == '' and self.type != 'LLM':
            prompt = None
        clear_system = isinstance(prompt, dict) and prompt.get('drop_builtin_system')
        prompter = super(__class__, self).prompt(prompt, history)._prompt
        self._tools = getattr(prompter, '_tools', None)
        keys = ModelManager.get_model_prompt_keys(self.base_model).copy()
        if keys:
            if clear_system: keys['system'] = ''
            prompter._set_model_configs(**keys)
            for key in ['tool_start_token', 'tool_args_token', 'tool_end_token']:
                if key in keys: setattr(self, f'_{key}', keys[key])
        if hasattr(self, '_openai_module'):
            self._openai_module.prompt(prompt, history=history)
        return self

    def formatter(self, format: Optional[FormatterBase] = None):
        super(__class__, self).formatter(format)
        if hasattr(self, '_openai_module'):
            self._openai_module.formatter(format)
        return self

    def share(self, *args, **kwargs):
        new = super(__class__, self).share(*args, **kwargs)
        if hasattr(self, '_openai_module'):
            new._openai_module = self._openai_module.share(*args, **kwargs)
        return new

    def _loads_str(self, text: str) -> Union[str, Dict]:
        try:
            ret = json.loads(text)
            return self._loads_str(ret) if isinstance(ret, str) else ret
        except Exception:
            LOG.error(f'{text} is not a valid json string.')
            return text

    def _parse_arguments_with_args_token(self, output: str) -> tuple[str, dict]:
        items = output.split(self._tool_args_token)
        func_name = items[0].strip()
        if len(items) == 1:
            return func_name.split(self._tool_end_token)[0].strip() if getattr(self, '_tool_end_token', None)\
                else func_name, {}
        args = (items[1].split(self._tool_end_token)[0].strip() if getattr(self, '_tool_end_token', None)
                else items[1].strip())
        return func_name, self._loads_str(args) if isinstance(args, str) else args

    def _parse_arguments_without_args_token(self, output: str) -> tuple[str, dict]:
        items = output.split(self._tool_end_token)[0] if getattr(self, '_tool_end_token', None) else output
        func_name = ''
        args = {}
        try:
            items = json.loads(items.strip())
            func_name = items.get('name', '')
            args = items.get('parameters', items.get('arguments', {}))
        except Exception:
            LOG.error(f'tool calls info {items} parse error')

        return func_name, self._loads_str(args) if isinstance(args, str) else args

    def _parse_arguments_with_tools(self, output: Dict[str, Any], tools: List[str]) -> bool:
        func_name = ''
        args = {}
        is_tc = False
        tc = {}
        if output.get('name', '') in tools:
            is_tc = True
            func_name = output.get('name', '')
            args = output.get('parameters', output.get('arguments', {}))
            tc = {'name': func_name, 'arguments': self._loads_str(args) if isinstance(args, str) else args}
            return is_tc, tc
        return is_tc, tc

    def _parse_tool_start_token(self, output: str) -> tuple[str, List[Dict]]:
        tool_calls = []
        segs = output.split(self._tool_start_token)
        content = segs[0]
        for seg in segs[1:]:
            func_name, arguments = self._parse_arguments_with_args_token(seg.strip())\
                if getattr(self, '_tool_args_token', None)\
                else self._parse_arguments_without_args_token(seg.strip())
            if func_name:
                tool_calls.append({'name': func_name, 'arguments': arguments})

        return content, tool_calls

    def _resolve_tools(self):
        from ...components.prompter.builtinPrompt import _DynamicValue
        tools = self._tools
        if isinstance(tools, _DynamicValue):
            tools = tools.resolve(None, None)
        return tools or []

    def _parse_tools(self, output: str) -> tuple[str, List[Dict]]:
        tool_calls = []
        tools = {tool['function']['name'] for tool in self._resolve_tools()}
        lines = output.strip().split('\n')
        content = []
        is_tool_call = False
        for idx, line in enumerate(lines):
            if line.startswith('{') and idx > 0:
                func_name = lines[idx - 1].strip()
                if func_name in tools:
                    is_tool_call = True
                    if func_name == content[-1].strip():
                        content.pop()
                    arguments = '\n'.join(lines[idx:]).strip()
                    tool_calls.append({'name': func_name, 'arguments': arguments})
                    continue
            if '{' in line and 'name' in line:
                try:
                    items = json.loads(line.strip())
                    items = [items] if isinstance(items, dict) else items
                    if isinstance(items, list):
                        for item in items:
                            is_tool_call, tc = self._parse_arguments_with_tools(item, tools)
                            if is_tool_call:
                                tool_calls.append(tc)
                except Exception:
                    LOG.error(f'tool calls info {line} parse error')
            if not is_tool_call:
                content.append(line)
        content = '\n'.join(content) if len(content) > 0 else ''
        return content, tool_calls

    def _extract_tool_calls(self, output: str) -> tuple[str, List[Dict]]:
        tool_calls = []
        content = ''
        if getattr(self, '_tool_start_token', None) and self._tool_start_token in output:
            content, tool_calls = self._parse_tool_start_token(output)
        elif self._tools:
            content, tool_calls = self._parse_tools(output)
        else:
            content = output

        return content, tool_calls

    def _decode_base64_to_file(self, content: str) -> str:
        decontent = decode_query_with_filepaths(content)
        files = [_base64_to_file(file_content) if _is_base64_with_mime(file_content) else file_content
                 for file_content in decontent['files']]
        return encode_query_with_filepaths(query=decontent['query'], files=files)

    def _extract_and_format(self, output: str) -> str:
        """
        1.extract tool calls information;
            a. If 'tool_start_token' exists, the boundary of tool_calls can be found according to 'tool_start_token',
               and then the function name and arguments of tool_calls can be extracted according to 'tool_args_token'
               and 'tool_end_token'.
            b. If 'tool_start_token' does not exist, the text is segmented using '\\n' according to the incoming tools
               information, and then processed according to the rules.
        """
        content, tool_calls = self._extract_tool_calls(output)
        if isinstance(content, str) and content.startswith(LAZYLLM_QUERY_PREFIX):
            content = self._decode_base64_to_file(content)
        tc = [{'id': str(uuid.uuid4().hex), 'type': 'function', 'function': tool_call} for tool_call in tool_calls]
        return dict(role='assistant', content=content, tool_calls=tc)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Trainable', mode=self._impl._mode, basemodel=self.base_model,
                                 target=self.target_path, name=self.name, deploy_type=self._deploy_type,
                                 stream=bool(self._stream), return_trace=self._return_trace)

    def __getattr__(self, key):
        if key in self.__class__.builder_keys:
            return functools.partial(getattr(self._impl, key), _return_value=self)
        raise AttributeError(f'{__class__} object has no attribute {key}')

    def _record_usage(self, text_input_for_token_usage: str, temp_output: str):
        usage = {'prompt_tokens': self._estimate_token_usage(text_input_for_token_usage)}
        usage['completion_tokens'] = self._estimate_token_usage(temp_output)
        self._record_usage_impl(usage)

    def _record_usage_impl(self, usage: dict):
        globals['usage'][self._module_id] = usage
        par_muduleid = self._used_by_moduleid
        if par_muduleid is None:
            return
        if par_muduleid not in globals['usage']:
            globals['usage'][par_muduleid] = usage
            return
        existing_usage = globals['usage'][par_muduleid]
        if existing_usage['prompt_tokens'] == -1 or usage['prompt_tokens'] == -1:
            globals['usage'][par_muduleid] = {'prompt_tokens': -1, 'completion_tokens': -1}
        else:
            for k in globals['usage'][par_muduleid]:
                globals['usage'][par_muduleid][k] += usage[k]

    def forward(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False,
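                max_retries: int = 3, **kw):
        """Automatically builds an input data structure that meets the model's requirements, including multimodal inputs.

Dispatches to ``forward_openai`` when the service URL ends with ``/v1/``, and to ``forward_standard`` otherwise.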


Examples:
    >>> import lazyllm
    >>> from lazyllm.module import TrainableModule
    >>> class MyModule(TrainableModule):
    ...     def forward(self, __input, **kw):
    ...         return f"processed: {__input}"
    ...
    >>> MyModule()("Hello")
    'processed: Hello'
    """
        if self._url.endswith('/v1/'):
            return self.forward_openai(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                       tools=tools, stream_output=stream_output, max_retries=max_retries, **kw)
        else:
            return self.forward_standard(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                         tools=tools, stream_output=stream_output, max_retries=max_retries, **kw)

    def forward_openai(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                       *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False,
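                       max_retries: int = 3, **kw):
        """Runs forward inference through the OpenAI-compatible interface.

Calls the deployed model service using the standard OpenAI API format. Supports chat history, file handling, tool calls, and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data; may be text, a dict, or packaged data.
    llm_chat_history: Chat history.
    lazyllm_files: File data.
    tools: Tool-call configuration.
    stream_output (bool): Whether to stream the output.
    max_retries (int): Passed through unchanged to the underlying OpenAI-compatible module's ``forward``. Defaults to 3.
    **kw: Additional keyword arguments.

Returns:
    The model inference result.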
"""
        if not getattr(self, '_openai_module', None):
            model_type = self.type.lower()
            if model_type in ['llm', 'vlm']:
                self._openai_module = lazyllm.OnlineChatModule(
                    source='openai', model='lazyllm', base_url=self._url, skip_auth=True, type=model_type,
                    stream=self._stream).share(prompt=self._prompt, format=self._formatter)
                self._openai_module._prompt._set_model_configs(
                    system='You are LazyLLM, a large language model developed by SenseTime.'
                )
            elif model_type in ['embed', 'rerank']:
                self._openai_module = lazyllm.OnlineEmbeddingModule(
                    source='openai', embed_model_name='lazyllm', embed_url=self._url, type=model_type)
            else:
                raise ValueError(f'Unsupported type: {model_type} for openai compatible module')
            self._openai_module.used_by(self._module_id)
        return self._openai_module.forward(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                           tools=tools, stream_output=stream_output, max_retries=max_retries, **kw)

    def forward_standard(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                         *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False,
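                         max_retries: int = 3, **kw):
        """Runs forward inference through the standard interface.

Calls the deployed model service using LazyLLM's custom standard API format. Supports template messages, file encoding, and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data; may be text, a dict, or packaged data.
    llm_chat_history: Chat history.
    lazyllm_files: File data.
    tools: Tool-call configuration.
    stream_output (bool): Whether to stream the output.
    max_retries (int): Maximum number of request attempts. Connection and request failures are retried after a delay. Defaults to 3.
    **kw: Additional keyword arguments.

Returns:
    The model inference result.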
"""
        __input, files = self._get_files(__input, lazyllm_files)
        text_input_for_token_usage = __input = self._prompt.generate_prompt(__input, llm_chat_history, tools)
        url = self._url

        if self.template_message:
            data = self._modify_parameters(copy.deepcopy(self.template_message), kw, optional_keys='modality')
            data[self.keys_name_handle.get('inputs', 'inputs')] = __input
            if files and (keys := list(set(self.keys_name_handle).intersection(LazyLLMDeployBase.encoder_map.keys()))):
                assert len(keys) == 1, 'Only one key is supported for encoder_mapping'
                data[self.keys_name_handle[keys[0]]] = encode_files(files, LazyLLMDeployBase.encoder_map[keys[0]])

            if stream_output:
                if self.stream_url_suffix and not url.endswith(self.stream_url_suffix):
                    url += self.stream_url_suffix
                if 'stream' in data: data['stream'] = stream_output
        else:
            data = __input
            if stream_output: LOG.warning('stream_output is not supported when template_message is not set, ignore it')
            assert not kw, 'kw is not supported when template_message is not set'

        if tools or self._tools:
            stop_key = next((k for k in data if k.startswith('stop')), 'stop')
            data[stop_key] = (data.get(stop_key) or []) + [self._tool_end_token]

        inputs_key = self.keys_name_handle.get('inputs', 'inputs')
        original_input = data.get(inputs_key, '') if isinstance(data, dict) else data
        _RETRY_DELAYS = [3, 10, 30]
        partial_out: List = []

        with self.stream_output((stream_output := (stream_output or self._stream))):
            for attempt in range(max_retries):
                if attempt > 0 and partial_out:
                    partial_messages = partial_out[0]
                    if isinstance(data, dict):
                        data = dict(data)
                        data[inputs_key] = original_input + partial_messages
                    else:
                        data = original_input + partial_messages
                    partial_out.clear()
                try:
                    return self._forward_impl(data, stream_output=stream_output, url=url,
                                              text_input=text_input_for_token_usage, partial_out=partial_out)
                except (requests.exceptions.ChunkedEncodingError, requests.exceptions.ConnectionError):
                    if attempt < max_retries - 1:
                        LOG.warning(f'Stream interrupted (attempt {attempt + 1}), retrying...')
                        time.sleep(_RETRY_DELAYS[min(attempt, len(_RETRY_DELAYS) - 1)])
                        continue
                    raise
                except requests.RequestException:
                    if attempt < max_retries - 1:
                        LOG.warning(f'Request failed (attempt {attempt + 1}), retrying...')
                        time.sleep(_RETRY_DELAYS[min(attempt, len(_RETRY_DELAYS) - 1)])
                        continue
                    raise

    def _maybe_has_fc(self, token: str, chunk: str) -> bool:
        return token and (token.startswith(chunk if token.startswith('\n') else chunk.lstrip('\n')) or token in chunk)

    def _forward_impl(self, data: Union[Tuple[Union[str, Dict], str], str, Dict] = package(), *,  # noqa B008
                      url: str, stream_output: Optional[Union[bool, Dict]] = None,
                      text_input: Optional[str] = None,
                      partial_out: Optional[List] = None) -> Tuple[Any, str]:
        headers = self.template_headers or {'Content-Type': 'application/json'}
        parse_parameters = self.stream_parse_parameters if stream_output else {'delimiter': b'<|lazyllm_delimiter|>'}

        # context bug with httpx, so we use requests
        with requests.post(url, json=data, stream=True, headers=headers, proxies={'http': None, 'https': None}) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            messages, cache = '', ''
            token = getattr(self, '_tool_start_token', '')
            color = stream_output.get('color') if isinstance(stream_output, dict) else None

            for line in r.iter_lines(**parse_parameters):
                if not line: continue
                line = self._decode_line(line)

                chunk = self._prompt.get_response(self.extract_result_func(line, data))
                chunk = chunk[len(messages):] if isinstance(chunk, str) and chunk.startswith(messages) else chunk
                messages = chunk if not isinstance(chunk, str) else messages + chunk
                if partial_out is not None: partial_out[:] = [messages]

                if not stream_output: continue
                if not cache: cache = chunk if self._maybe_has_fc(token, chunk) else self._stream_output(chunk, color)
                elif token in cache:
                    stream_output = False
                    if not cache.startswith(token): self._stream_output(cache.split(token)[0], color)
                else:
                    cache += chunk
                    if not self._maybe_has_fc(token, cache): cache = self._stream_output(cache, color)

        if text_input: self._record_usage(text_input, messages)
        temp_output = self._extract_and_format(messages)
        return self._formatter(temp_output)

    def _modify_parameters(self, paras: dict, kw: dict, *, optional_keys: Union[List[str], str] = None):
        for key, value in paras.items():
            if key == self.keys_name_handle['inputs']: continue
            elif isinstance(value, dict):
                if key in kw:
                    assert set(kw[key].keys()).issubset(set(value.keys()))
                    value.update(kw.pop(key))
                else: [setattr(value, k, kw.pop(k)) for k in value.keys() if k in kw]
            elif key in kw: paras[key] = kw.pop(key)

        optional_keys = [optional_keys] if isinstance(optional_keys, str) else (optional_keys or [])
        assert set(kw.keys()).issubset(set(optional_keys)), f'{kw.keys()} is not in {optional_keys}'
        paras.update(kw)
        return paras

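    def set_default_parameters(self, *, optional_keys: Optional[List[str]] = None, **kw):
        """set_default_parameters(*, optional_keys: Optional[List[str]] = None, **kw) -> None

Sets the default parameters used for inference or evaluation.

Args:
    optional_keys (Optional[List[str]]): Keys that may be passed as extra parameters even though they are not in the template message.
    **kw: Key-value pairs to set as default parameters, such as temperature or top_k.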
"""
        self._modify_parameters(self.template_message, kw, optional_keys=optional_keys or [])

    def _cache_miss_handler(self):
        if not self._url or self._url == fake_url:
            raise RuntimeError('Cache miss, please use `start()` to deploy the module first')

    def __getstate__(self):
        state = self.__dict__.copy()
        state['base_model'] = self._impl._base_model
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._impl._base_model = state['base_model']
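
The usage bookkeeping in `_record_usage_impl` above is easier to follow in isolation. The following is a minimal sketch, not LazyLLM's implementation: a plain dict stands in for `globals['usage']`, and `record_usage`, `store`, `module_id`, and `parent_id` are hypothetical names. It assumes that `-1` marks an unknown token count, which the source suggests but does not state.

```python
from typing import Dict, Optional

Usage = Dict[str, int]


def record_usage(store: Dict[str, Usage], module_id: str, usage: Usage,
                 parent_id: Optional[str] = None) -> None:
    """Store a module's usage and fold it into its parent's running total."""
    store[module_id] = usage
    if parent_id is None:
        return
    if parent_id not in store:
        # Deviation from the source: copy so the parent total never aliases the child's dict.
        store[parent_id] = dict(usage)
        return
    existing = store[parent_id]
    if existing['prompt_tokens'] == -1 or usage['prompt_tokens'] == -1:
        # Assumed meaning: once any count is unknown, the whole total is unknown.
        store[parent_id] = {'prompt_tokens': -1, 'completion_tokens': -1}
    else:
        for key in existing:
            existing[key] += usage[key]


store: Dict[str, Usage] = {}
record_usage(store, 'a', {'prompt_tokens': 10, 'completion_tokens': 5}, parent_id='root')
record_usage(store, 'b', {'prompt_tokens': 3, 'completion_tokens': 2}, parent_id='root')
print(store['root'])  # {'prompt_tokens': 13, 'completion_tokens': 7}
```

Once a child reports `-1`, the parent total stays at `-1` for every later call, because the sentinel check runs before any addition.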

`wait()`

Waits for the model deployment task to finish. This method blocks the current thread until deployment completes.

Examples:

>>> import lazyllm
>>> class Mywait(lazyllm.module.llms.TrainableModule):
...    def forward(self):
...        self.wait()

Source code in lazyllm/module/llms/trainablemodule.py

    def wait(self):
        """Waits for the model deployment task to finish. This method blocks the current thread until deployment completes.


Examples:
    >>> import lazyllm
    >>> class Mywait(lazyllm.module.llms.TrainableModule):
    ...    def forward(self):
    ...        self.wait()
    """
        if launcher := self._impl._launchers['default'].get('deploy'):
            launcher.wait()

`stop(task_name=None)`

Stops a specific task of the model.

Parameters:

task_name (str, default: None ) –

Name of the task to stop. Defaults to None, which stops the deploy task.

Examples:

>>> import lazyllm
>>> class Mystop(lazyllm.module.llms.TrainableModule):
...    def forward(self, task):
...        self.stop(task)

Source code in lazyllm/module/llms/trainablemodule.py

    def stop(self, task_name: Optional[str] = None):
        """Stops a specific task of the model.

Args:
    task_name(str): Name of the task to stop. Defaults to None, which stops the deploy task.


Examples:
    >>> import lazyllm
    >>> class Mystop(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, task):
    ...        self.stop(task)
    """
        try:
            launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        except KeyError:
            raise RuntimeError('Cannot stop an unstarted task')
        if not task_name: self._impl._get_deploy_tasks.flag.reset()
        launcher.cleanup()
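
The launcher lookup in `stop` chooses between two groups of launchers depending on whether a task name is given. A minimal sketch of that lookup follows; the nested dict and the string placeholders stand in for the real launcher objects, and `find_launcher` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional


def find_launcher(launchers: Dict[str, Dict[str, Any]], task_name: Optional[str] = None) -> Any:
    """Pick the 'manual' group for named tasks and the 'default' deploy launcher otherwise."""
    try:
        return launchers['manual' if task_name else 'default'][task_name or 'deploy']
    except KeyError:
        # Same message as the source: the task was never started.
        raise RuntimeError('Cannot stop an unstarted task')


# Placeholder strings instead of launcher objects (illustrative only).
launchers = {
    'default': {'deploy': 'deploy-launcher'},
    'manual': {'finetune': 'finetune-launcher'},
}
print(find_launcher(launchers))              # deploy-launcher
print(find_launcher(launchers, 'finetune'))  # finetune-launcher
```

An unknown task name surfaces as a `RuntimeError` rather than a bare `KeyError`, so callers get a message about the task state instead of a dictionary miss.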

`prompt(prompt='', history=None)`

Processes the input prompt into the format the model requires.

Parameters:

prompt (str, default: '' ) –

The input prompt. Defaults to an empty string.
history (List[List[str]], default: None ) –

Conversation history.

Examples:

>>> import lazyllm
>>> class Myprompt(lazyllm.module.llms.TrainableModule):
...    def forward(self, prompt, history):
...        self.prompt(prompt, history)

Source code in lazyllm/module/llms/trainablemodule.py

    def prompt(self, prompt: Union[str, dict] = '', history: Optional[List[List[str]]] = None):
        """Processes the input prompt into the format the model requires.

Args:
    prompt(str): The input prompt. Defaults to an empty string.
    history(List[List[str]]): Conversation history.


Examples:
    >>> import lazyllm
    >>> class Myprompt(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, prompt, history):
    ...        self.prompt(prompt, history)
    """
        if self.base_model != '' and prompt == '' and self.type != 'LLM':
            prompt = None
        clear_system = isinstance(prompt, dict) and prompt.get('drop_builtin_system')
        prompter = super(__class__, self).prompt(prompt, history)._prompt
        self._tools = getattr(prompter, '_tools', None)
        keys = ModelManager.get_model_prompt_keys(self.base_model).copy()
        if keys:
            if clear_system: keys['system'] = ''
            prompter._set_model_configs(**keys)
            for key in ['tool_start_token', 'tool_args_token', 'tool_end_token']:
                if key in keys: setattr(self, f'_{key}', keys[key])
        if hasattr(self, '_openai_module'):
            self._openai_module.prompt(prompt, history=history)
        return self

`log_path(task_name=None)`

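Gets the log path of a task.

Looks up the log file path for the given task name. Supports both the default deploy task and manually specified tasks.

Parameters:

task_name (Optional[str], default: None ) –

Task name. Defaults to None, which returns the log of the default deploy task.

Returns:

str –

Path to the log file.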

Source code in lazyllm/module/llms/trainablemodule.py

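    def log_path(self, task_name: Optional[str] = None):
        """Gets the log path of a task.

Looks up the log file path for the given task name. Supports both the default deploy task and manually specified tasks.

Args:
    task_name (Optional[str]): Task name. Defaults to None, which returns the log of the default deploy task.

Returns:
    str: Path to the log file.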
"""
        launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        return launcher.log_path

`forward_openai(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)`

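Runs forward inference through the OpenAI-compatible interface.

Calls the deployed model service using the standard OpenAI API format. Supports chat history, file handling, tool calls, and streaming output.

Parameters:

__input (Union[Tuple[Union[str, Dict], str], str, Dict], default: package() ) –

Input data; may be text, a dict, or packaged data.
llm_chat_history –

Chat history.
lazyllm_files –

File data.
tools –

Tool-call configuration.
stream_output (bool, default: False ) –

Whether to stream the output.
max_retries (int, default: 3 ) –

Passed through unchanged to the underlying OpenAI-compatible module's forward.
**kw –

Additional keyword arguments.

Returns:

–

The model inference result.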

Source code in lazyllm/module/llms/trainablemodule.py

    def forward_openai(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                       *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False,
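                       max_retries: int = 3, **kw):
        """Runs forward inference through the OpenAI-compatible interface.

Calls the deployed model service using the standard OpenAI API format. Supports chat history, file handling, tool calls, and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data; may be text, a dict, or packaged data.
    llm_chat_history: Chat history.
    lazyllm_files: File data.
    tools: Tool-call configuration.
    stream_output (bool): Whether to stream the output.
    max_retries (int): Passed through unchanged to the underlying OpenAI-compatible module's ``forward``. Defaults to 3.
    **kw: Additional keyword arguments.

Returns:
    The model inference result.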
"""
        if not getattr(self, '_openai_module', None):
            model_type = self.type.lower()
            if model_type in ['llm', 'vlm']:
                self._openai_module = lazyllm.OnlineChatModule(
                    source='openai', model='lazyllm', base_url=self._url, skip_auth=True, type=model_type,
                    stream=self._stream).share(prompt=self._prompt, format=self._formatter)
                self._openai_module._prompt._set_model_configs(
                    system='You are LazyLLM, a large language model developed by SenseTime.'
                )
            elif model_type in ['embed', 'rerank']:
                self._openai_module = lazyllm.OnlineEmbeddingModule(
                    source='openai', embed_model_name='lazyllm', embed_url=self._url, type=model_type)
            else:
                raise ValueError(f'Unsupported type: {model_type} for openai compatible module')
            self._openai_module.used_by(self._module_id)
        return self._openai_module.forward(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                           tools=tools, stream_output=stream_output, max_retries=max_retries, **kw)

`forward_standard(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)`

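Runs forward inference through the standard interface.

Calls the deployed model service using LazyLLM's custom standard API format. Supports template messages, file encoding, and streaming output.

Parameters:

__input (Union[Tuple[Union[str, Dict], str], str, Dict], default: package() ) –

Input data; may be text, a dict, or packaged data.
llm_chat_history –

Chat history.
lazyllm_files –

File data.
tools –

Tool-call configuration.
stream_output (bool, default: False ) –

Whether to stream the output.
max_retries (int, default: 3 ) –

Maximum number of request attempts. Connection and request failures are retried after a delay.
**kw –

Additional keyword arguments.

Returns:

–

The model inference result.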

Source code in lazyllm/module/llms/trainablemodule.py

    def forward_standard(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                         *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False,
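                         max_retries: int = 3, **kw):
        """Runs forward inference through the standard interface.

Calls the deployed model service using LazyLLM's custom standard API format. Supports template messages, file encoding, and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data; may be text, a dict, or packaged data.
    llm_chat_history: Chat history.
    lazyllm_files: File data.
    tools: Tool-call configuration.
    stream_output (bool): Whether to stream the output.
    max_retries (int): Maximum number of request attempts. Connection and request failures are retried after a delay. Defaults to 3.
    **kw: Additional keyword arguments.

Returns:
    The model inference result.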
"""
        __input, files = self._get_files(__input, lazyllm_files)
        text_input_for_token_usage = __input = self._prompt.generate_prompt(__input, llm_chat_history, tools)
        url = self._url

        if self.template_message:
            data = self._modify_parameters(copy.deepcopy(self.template_message), kw, optional_keys='modality')
            data[self.keys_name_handle.get('inputs', 'inputs')] = __input
            if files and (keys := list(set(self.keys_name_handle).intersection(LazyLLMDeployBase.encoder_map.keys()))):
                assert len(keys) == 1, 'Only one key is supported for encoder_mapping'
                data[self.keys_name_handle[keys[0]]] = encode_files(files, LazyLLMDeployBase.encoder_map[keys[0]])

            if stream_output:
                if self.stream_url_suffix and not url.endswith(self.stream_url_suffix):
                    url += self.stream_url_suffix
                if 'stream' in data: data['stream'] = stream_output
        else:
            data = __input
            if stream_output: LOG.warning('stream_output is not supported when template_message is not set, ignore it')
            assert not kw, 'kw is not supported when template_message is not set'

        if tools or self._tools:
            stop_key = next((k for k in data if k.startswith('stop')), 'stop')
            data[stop_key] = (data.get(stop_key) or []) + [self._tool_end_token]

        inputs_key = self.keys_name_handle.get('inputs', 'inputs')
        original_input = data.get(inputs_key, '') if isinstance(data, dict) else data
        _RETRY_DELAYS = [3, 10, 30]
        partial_out: List = []

        with self.stream_output((stream_output := (stream_output or self._stream))):
            for attempt in range(max_retries):
                if attempt > 0 and partial_out:
                    partial_messages = partial_out[0]
                    if isinstance(data, dict):
                        data = dict(data)
                        data[inputs_key] = original_input + partial_messages
                    else:
                        data = original_input + partial_messages
                    partial_out.clear()
                try:
                    return self._forward_impl(data, stream_output=stream_output, url=url,
                                              text_input=text_input_for_token_usage, partial_out=partial_out)
                except (requests.exceptions.ChunkedEncodingError, requests.exceptions.ConnectionError):
                    if attempt < max_retries - 1:
                        LOG.warning(f'Stream interrupted (attempt {attempt + 1}), retrying...')
                        time.sleep(_RETRY_DELAYS[min(attempt, len(_RETRY_DELAYS) - 1)])
                        continue
                    raise
                except requests.RequestException:
                    if attempt < max_retries - 1:
                        LOG.warning(f'Request failed (attempt {attempt + 1}), retrying...')
                        time.sleep(_RETRY_DELAYS[min(attempt, len(_RETRY_DELAYS) - 1)])
                        continue
                    raise
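
The retry loop in `forward_standard` does two things at once: it waits according to a fixed delay schedule, and it appends any partially received output to the original input so the next attempt can continue from there. The sketch below isolates that pattern under stated assumptions: `call_with_resume` and `flaky_send` are hypothetical names, the built-in `ConnectionError` stands in for the `requests` exceptions, and the `sleep` parameter exists only so the example runs without waiting.

```python
import time
from typing import Callable, List, Sequence


def call_with_resume(send: Callable[[str, List[str]], str], original_input: str,
                     max_retries: int = 3, delays: Sequence[float] = (3, 10, 30),
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry `send`, appending any partial output to the input before each retry."""
    partial_out: List[str] = []
    data = original_input
    for attempt in range(max_retries):
        if attempt > 0 and partial_out:
            # Resume from what was already received, then reset the buffer.
            data = original_input + partial_out[0]
            partial_out.clear()
        try:
            return send(data, partial_out)
        except ConnectionError:
            if attempt < max_retries - 1:
                # Clamp the index so attempts beyond the schedule reuse the last delay.
                sleep(delays[min(attempt, len(delays) - 1)])
                continue
            raise
    # Sketch-only guard: reached only when max_retries is 0 or negative.
    raise ValueError('max_retries must be at least 1')


calls: List[str] = []
waits: List[float] = []


def flaky_send(data: str, partial_out: List[str]) -> str:
    """Fail once after 'receiving' partial text, then succeed."""
    calls.append(data)
    if len(calls) == 1:
        partial_out[:] = ['Hel']
        raise ConnectionError('stream dropped')
    return data + 'lo'


result = call_with_resume(flaky_send, 'Q: ', sleep=waits.append)
print(result)  # Q: Hello
```

In this run the second attempt receives `'Q: Hel'` as input, and the recorded wait is the first entry of the delay schedule.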

`forward(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)`

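Automatically builds an input data structure that meets the model's requirements, including multimodal inputs.

Dispatches to `forward_openai` when the service URL ends with `/v1/`, and to `forward_standard` otherwise.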

Examples:

>>> import lazyllm
>>> from lazyllm.module import TrainableModule
>>> class MyModule(TrainableModule):
...     def forward(self, __input, **kw):
...         return f"processed: {__input}"
...
>>> MyModule()("Hello")
'processed: Hello'

Source code in lazyllm/module/llms/trainablemodule.py

    def forward(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False,
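                max_retries: int = 3, **kw):
        """Automatically builds an input data structure that meets the model's requirements, including multimodal inputs.

Dispatches to ``forward_openai`` when the service URL ends with ``/v1/``, and to ``forward_standard`` otherwise.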


Examples:
    >>> import lazyllm
    >>> from lazyllm.module import TrainableModule
    >>> class MyModule(TrainableModule):
    ...     def forward(self, __input, **kw):
    ...         return f"processed: {__input}"
    ...
    >>> MyModule()("Hello")
    'processed: Hello'
    """
        if self._url.endswith('/v1/'):
            return self.forward_openai(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                       tools=tools, stream_output=stream_output, max_retries=max_retries, **kw)
        else:
            return self.forward_standard(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                         tools=tools, stream_output=stream_output, max_retries=max_retries, **kw)
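
The dispatch in `forward` depends only on the URL suffix. A small sketch of that check, with a hypothetical helper name `pick_forward_path` and made-up local URLs, shows that the trailing slash matters:

```python
def pick_forward_path(url: str) -> str:
    """Return the name of the forward variant that the suffix check would select."""
    return 'forward_openai' if url.endswith('/v1/') else 'forward_standard'


# Illustrative URLs only; they do not refer to a real deployment.
print(pick_forward_path('http://127.0.0.1:8000/v1/'))       # forward_openai
print(pick_forward_path('http://127.0.0.1:8000/v1'))        # forward_standard
print(pick_forward_path('http://127.0.0.1:8000/generate'))  # forward_standard
```

A URL ending in `/v1` without the trailing slash takes the standard path, so a deployment that is meant to be OpenAI-compatible needs the URL stored with the slash.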

`lazyllm.module.UrlModule`

Bases: ModuleBase, LLMBase, _UrlHelper

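Wraps the URL obtained by deploying a ServerModule into a Module. Calling `__call__` sends a request to that service.

Parameters:

url (str, default: '' ) –

URL of the service to wrap. Defaults to an empty string.
stream (bool | Dict[str, str], default: False ) –

Whether to request and output in streaming mode. Defaults to non-streaming.
return_trace (bool, default: False ) –

Whether to record the result in the trace. Defaults to False.
init_prompt (bool, default: True ) –

Whether to initialize the prompt. Defaults to True.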

Examples:

>>> import lazyllm
>>> def demo(input): return input * 2
... 
>>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
>>> s.start()
INFO:     Uvicorn running on http://0.0.0.0:35485
>>> u = lazyllm.UrlModule(url=s._url)
>>> print(u(1))
2
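
According to the class source, `UrlModule.__new__` returns a `ServerModule` whenever `UrlModule` itself (rather than a subclass) is instantiated, so the object `u` in the example is a `ServerModule`. The sketch below reproduces the pattern with two stand-in classes, `Url` and `Server`, which are hypothetical and carry none of LazyLLM's behavior.

```python
class Url:
    def __new__(cls, *args, **kw):
        if cls is not Url:
            # Subclasses are constructed normally.
            return super().__new__(cls)
        # Instantiating the base class directly yields the subclass instead.
        return Server(*args, **kw)

    def __init__(self, *, url: str = ''):
        self.url = url


class Server(Url):
    init_calls = 0  # counts how often __init__ runs, for illustration

    def __init__(self, *, url: str = ''):
        super().__init__(url=url)
        Server.init_calls += 1


u = Url(url='http://127.0.0.1:8000/generate')
print(type(u).__name__)   # Server
print(Server.init_calls)  # 2
```

In plain Python, the object returned by `__new__` is still an instance of `Url`, so `__init__` runs a second time with the same arguments. That is harmless in this sketch, but it is worth keeping in mind for any initializer with side effects.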

Source code in lazyllm/module/servermodule.py

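class UrlModule(ModuleBase, LLMBase, _UrlHelper):
    """Wraps the URL obtained by deploying a ServerModule into a Module. Calling ``__call__`` sends a request to that service.

Args:
    url (str): URL of the service to wrap. Defaults to an empty string.
    stream (bool|Dict[str, str]): Whether to request and output in streaming mode. Defaults to non-streaming.
    return_trace (bool): Whether to record the result in the trace. Defaults to False.
    init_prompt (bool): Whether to initialize the prompt. Defaults to True.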


Examples:
    >>> import lazyllm
    >>> def demo(input): return input * 2
    ... 
    >>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
    >>> s.start()
    INFO:     Uvicorn running on http://0.0.0.0:35485
    >>> u = lazyllm.UrlModule(url=s._url)
    >>> print(u(1))
    2
    """

    def __new__(cls, *args, **kw):
        if cls is not UrlModule:
            return super().__new__(cls)
        return ServerModule(*args, **kw)

    def __init__(self, *, url: Optional[str] = '', stream: Union[bool, Dict[str, str]] = False,
                 return_trace: bool = False, init_prompt: bool = True):
        super().__init__(return_trace=return_trace)
        LLMBase.__init__(self, stream=stream, init_prompt=init_prompt)
        _UrlHelper.__init__(self, url)

    def _estimate_token_usage(self, text):
        if not isinstance(text, str):
            return 0
        # extract english words, number and comma
        pattern = r'\b[a-zA-Z0-9]+\b|,'
        ascii_words = re.findall(pattern, text)
        ascii_ch_count = sum(len(ele) for ele in ascii_words)
        non_ascii_pattern = r'[^\x00-\x7F]'
        non_ascii_chars = re.findall(non_ascii_pattern, text)
        non_ascii_char_count = len(non_ascii_chars)
        return int(ascii_ch_count / 3.0 + non_ascii_char_count + 1)

    def _decode_line(self, line: bytes):
        try:
            return pickle.loads(codecs.decode(line, 'base64'))
        except Exception:
            return line.decode('utf-8')

    def _extract_and_format(self, output: str) -> str:
        return output

    def forward(self, *args, **kw):
        """Defines the computation performed on every call. All subclasses of ModuleBase must override this function.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...    def forward(self, input):
    ...        return input + 1
    ...
    >>> MyModule()(1)
    2
    """
        raise NotImplementedError

    def __call__(self, *args, **kw):
        assert self._url is not None, f'Please start {self.__class__} first'
        if len(args) > 1:
            return super(__class__, self).__call__(package(args), **kw)
        return super(__class__, self).__call__(*args, **kw)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Url', name=self.name, url=self._url,
                                 stream=self._stream, return_trace=self._return_trace)
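
The heuristic in `_estimate_token_usage` can be tried out on its own. The sketch below copies the two regular expressions from the source into a standalone function; the name `estimate_token_usage` and the sample strings are illustrative. The result is a rough estimate, not a tokenizer count.

```python
import re


def estimate_token_usage(text) -> int:
    """Approximate a token count: ASCII word characters / 3, plus one per non-ASCII character, plus one."""
    if not isinstance(text, str):
        return 0
    # English words, numbers, and commas.
    ascii_words = re.findall(r'\b[a-zA-Z0-9]+\b|,', text)
    ascii_char_count = sum(len(word) for word in ascii_words)
    # Every non-ASCII character (for example a CJK character) counts as one token.
    non_ascii_count = len(re.findall(r'[^\x00-\x7F]', text))
    return int(ascii_char_count / 3.0 + non_ascii_count + 1)


print(estimate_token_usage('hello, world'))  # 4
print(estimate_token_usage('你好'))           # 3
print(estimate_token_usage(None))            # 0
```

For `'hello, world'` the matched pieces are `hello`, `,`, and `world`, 11 characters in total, which gives `int(11 / 3.0 + 0 + 1) = 4`.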

`forward(*args, **kw)`

Defines the computation performed on every call. All subclasses of ModuleBase must override this function.

Examples:

>>> import lazyllm
>>> class MyModule(lazyllm.module.ModuleBase):
...    def forward(self, input):
...        return input + 1
...
>>> MyModule()(1)
2

Source code in lazyllm/module/servermodule.py

    def forward(self, *args, **kw):
        """Defines the computation performed on every call. All subclasses of ModuleBase must override this function.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...    def forward(self, input):
    ...        return input + 1
    ...
    >>> MyModule()(1)
    2
    """
        raise NotImplementedError

`lazyllm.module.ServerModule`

Bases: UrlModule

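ServerModule inherits from UrlModule and wraps the ability to deploy any callable as an API service.
It is implemented with FastAPI, can start one main service plus multiple satellite services, and supports streaming calls as well as pre-processing and post-processing logic.
You can either pass a local callable to start a service or connect directly to a remote service by URL.

Parameters:

m (Optional[Union[str, ModuleBase]], default: None ) –

The module to wrap as a service, or a URL string. A string is treated as a URL, in which case url must be None; a ModuleBase is wrapped as a service.
pre (Optional[Callable], default: None ) –

Pre-processing function, executed in the service process. Defaults to None.
post (Optional[Callable], default: None ) –

Post-processing function, executed in the service process. Defaults to None.
stream (Union[bool, Dict], default: False ) –

Whether to enable streaming output. May be a boolean or a dict of streaming configuration. Defaults to False.
return_trace (Optional[bool], default: False ) –

Whether to return debug trace information. Defaults to False.
port (Optional[int], default: None ) –

Port on which to deploy the service. Defaults to None, in which case a port is assigned automatically.
pythonpath (Optional[str], default: None ) –

PYTHONPATH environment variable passed to the subprocess. Defaults to None.
launcher (Optional[LazyLLMLaunchersBase], default: None ) –

Launcher used to start the service. Defaults to asynchronous remote deployment.
url (Optional[str], default: None ) –

URL of an already deployed service. If provided, m must be None.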

Examples:

>>> import lazyllm
>>> def demo(input): return input * 2
...
>>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
>>> s.start()
INFO:     Uvicorn running on http://0.0.0.0:35485
>>> print(s(1))
2

>>> class MyServe(object):
...     def __call__(self, input):
...         return 2 * input
...
...     @lazyllm.FastapiApp.post
...     def server1(self, input):
...         return f'reply for {input}'
...
...     @lazyllm.FastapiApp.get
...     def server2(self):
...         return 'get method'
...
>>> m = lazyllm.ServerModule(MyServe(), launcher=lazyllm.launchers.empty(sync=False))
>>> m.start()
INFO:     Uvicorn running on http://0.0.0.0:32028
>>> print(m(1))
2
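
The constructor accepts the target either as `m` or as `url`, and a string passed as `m` is treated as a URL. The sketch below mirrors only that argument handling from `__init__`; `resolve_target` is a hypothetical helper, the URL is made up, and the URL validity check from the source is left out.

```python
from typing import Any, Optional, Tuple


def resolve_target(m: Any = None, url: Optional[str] = None) -> Tuple[Any, Optional[str]]:
    """Normalize (m, url) so that a string `m` becomes the URL and the two are never both set."""
    if isinstance(m, str):
        assert url is None, 'url should be None when m is a url'
        url, m = m, None
    if url:
        assert m is None, 'm should be None when url is provided'
    return m, url


def demo(x):
    return x * 2


print(resolve_target('http://127.0.0.1:8000/generate'))      # (None, 'http://127.0.0.1:8000/generate')
print(resolve_target(url='http://127.0.0.1:8000/generate'))  # (None, 'http://127.0.0.1:8000/generate')
print(resolve_target(demo)[1])                               # None
```

Passing both a callable and a URL trips the second assertion, which matches the rule stated in the parameter list that the two are mutually exclusive.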

Source code in lazyllm/module/servermodule.py

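class ServerModule(UrlModule):
    """ServerModule inherits from UrlModule and wraps the ability to deploy any callable as an API service.  
It is implemented with FastAPI, can start one main service plus multiple satellite services, and supports streaming calls as well as pre-processing and post-processing logic.  
You can either pass a local callable to start a service or connect directly to a remote service by URL.

Args:
    m (Optional[Union[str, ModuleBase]]): The module to wrap as a service, or a URL string. A string is treated as a URL, in which case `url` must be None; a ModuleBase is wrapped as a service.
    pre (Optional[Callable]): Pre-processing function, executed in the service process. Defaults to ``None``.
    post (Optional[Callable]): Post-processing function, executed in the service process. Defaults to ``None``.
    stream (Union[bool, Dict]): Whether to enable streaming output. May be a boolean or a dict of streaming configuration. Defaults to ``False``.
    return_trace (Optional[bool]): Whether to return debug trace information. Defaults to ``False``.
    port (Optional[int]): Port on which to deploy the service. Defaults to ``None``, in which case a port is assigned automatically.
    pythonpath (Optional[str]): PYTHONPATH environment variable passed to the subprocess. Defaults to ``None``.
    launcher (Optional[LazyLLMLaunchersBase]): Launcher used to start the service. Defaults to asynchronous remote deployment.
    url (Optional[str]): URL of an already deployed service. If provided, `m` must be None.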


Examples:
    >>> import lazyllm
    >>> def demo(input): return input * 2
    ...
    >>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
    >>> s.start()
    INFO:     Uvicorn running on http://0.0.0.0:35485
    >>> print(s(1))
    2

    >>> class MyServe(object):
    ...     def __call__(self, input):
    ...         return 2 * input
    ...
    ...     @lazyllm.FastapiApp.post
    ...     def server1(self, input):
    ...         return f'reply for {input}'
    ...
    ...     @lazyllm.FastapiApp.get
    ...     def server2(self):
    ...         return 'get method'
    ...
    >>> m = lazyllm.ServerModule(MyServe(), launcher=lazyllm.launchers.empty(sync=False))
    >>> m.start()
    INFO:     Uvicorn running on http://0.0.0.0:32028
    >>> print(m(1))
    2
    """
    def __init__(self, m: Optional[Union[str, ModuleBase]] = None, pre: Optional[Callable] = None,
                 post: Optional[Callable] = None, stream: Union[bool, Dict] = False,
                 return_trace: bool = False, port: Optional[int] = None, pythonpath: Optional[str] = None,
                 launcher: Optional[LazyLLMLaunchersBase] = None, url: Optional[str] = None,
                 num_replicas: int = 1, security_key: Optional[Union[str, bool]] = None):
        assert stream is False or return_trace is False, 'Module with stream output has no trace'
        assert (post is None) or (stream is False), 'Stream cannot be true when post-action exists'
        if isinstance(m, str):
            assert url is None, 'url should be None when m is a url'
            url, m = m, None
        if url:
            assert is_valid_url(url), f'Invalid url: {url}'
            assert m is None, 'm should be None when url is provided'
        super().__init__(url=url, stream=stream, return_trace=return_trace)
        self._security_key = f'sk-{str(uuid.uuid4().hex)}' if security_key is True else security_key
        self._impl = _ServerModuleImpl(m, pre, post, launcher, port, pythonpath, self._url_wrapper,
                                       num_replicas=num_replicas, security_key=self._security_key)
        if url: self._impl._get_deploy_tasks.flag.set()

    _url_id = property(lambda self: self._impl._module_id)

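    def wait(self):
        """Waits for the current module service to finish starting or running.  
Typically used to block the main thread until the service ends normally or is interrupted.  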
"""
        self._impl._launcher.wait()

    def stop(self):
        """Stops the current module service and its related subprocesses.  
After this call, the module no longer responds to requests.  
"""
        self._impl.stop()

    @property
    def status(self):
        return self._impl._launcher.status

    def _call(self, fname, *args, **kwargs):
        args, kwargs = lazyllm.dump_obj(args), lazyllm.dump_obj(kwargs)
        url = urljoin(self._url.rsplit('/', 1)[0], '_call')
        headers = {
            'Content-Type': 'application/json',
            'Global-Parameters': globals.pickled_data,
            'Session-ID': globals._sid,
        }
        r = requests.post(url, json=(fname, args, kwargs), headers=headers)
        if r.status_code != 200:
            try:
                error_info = r.json()
            except ValueError:
                error_info = r.text
            raise requests.RequestException(f'{r.status_code}: {error_info}')
        return pickle.loads(codecs.decode(r.content, 'base64'))

    def forward(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(), **kw):  # noqa B008
        headers = {
            'Content-Type': 'application/json',
            'Global-Parameters': globals.pickled_data,
            'Session-ID': globals._sid,
            'Security-Key': self._security_key,
        }
        data = obj2str((__input, kw))

        # context bug with httpx, so we use requests
        with requests.post(self._url, json=data, stream=True, headers=headers,
                           proxies={'http': None, 'https': None}) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            messages = ''
            with self.stream_output(self._stream):
                for line in r.iter_lines(delimiter=b'<|lazyllm_delimiter|>'):
                    line = self._decode_line(line)
                    if self._stream:
                        self._stream_output(str(line), getattr(self._stream, 'get', lambda x: None)('color'))
                    messages = (messages + str(line)) if self._stream else line

                temp_output = self._extract_and_format(messages)
                return self._formatter(temp_output)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Server', subs=[repr(self._impl._m)], name=self.name,
                                 stream=self._stream, return_trace=self._return_trace)
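
`_call` decodes the response body with `pickle.loads(codecs.decode(r.content, 'base64'))`. The round trip can be sketched with the standard library alone. The encoding side shown here is an assumption inferred from that decode call, not the server's confirmed implementation, and `encode_result` / `decode_result` are hypothetical names.

```python
import codecs
import pickle


def encode_result(obj):
    # Assumed server side: pickle the object, then base64-encode the bytes.
    return codecs.encode(pickle.dumps(obj), 'base64')


def decode_result(content):
    # Mirrors the client-side decode in ServerModule._call.
    return pickle.loads(codecs.decode(content, 'base64'))


payload = {'answer': 2, 'tags': ['a', 'b']}
wire = encode_result(payload)      # bytes that are safe to send in a response body
restored = decode_result(wire)
print(restored == payload)
```

Unpickling executes code chosen by the sender, so this scheme is only appropriate when the server is trusted.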

`wait()`

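Wait for the module's service to finish starting up or running.
Typically used to block the main thread until the service exits normally or is interrupted.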

Source code in lazyllm/module/servermodule.py

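    def wait(self):
        """Wait for the module's service to finish starting up or running.
Typically used to block the main thread until the service exits normally or is interrupted.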
"""
        self._impl._launcher.wait()

`stop()`

Stop the module's service and its related child processes.
After this call, the module no longer responds to requests.

Source code in lazyllm/module/servermodule.py

    def stop(self):
        """Stop the module's service and its related child processes.
After this call, the module no longer responds to requests.
"""
        self._impl.stop()

`lazyllm.module.AutoModel`

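A factory class for quickly creating either an online inference module (OnlineModule) or a local TrainableModule. It gives priority to the arguments passed by the user; if config is enabled, the settings in auto_model_config_map override them. It then decides automatically whether to build an online or a local module:

When an online module is selected, the arguments are passed through to OnlineModule (which dispatches to OnlineChatModule / OnlineEmbeddingModule / OnlineMultiModalModule automatically).
When a local module is selected, TrainableModule is initialized with model and the user arguments, and the parameters in the config map are read as well.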
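
The selection order used in the source listing further down (a trainable config entry first, then an online config entry, then a plain fallback) can be sketched with stand-in factories. Everything here is hypothetical scaffolding: `pick_module`, `make_trainable`, and `make_online` are not LazyLLM APIs, and the real code additionally returns a TrainableModule only when it already has a URL or a deploy flag.

```python
def pick_module(trainable_entry, online_entry, make_trainable, make_online):
    """Return (kind, module) following a trainable -> online -> fallback order."""
    # 1) Prefer a trainable config entry when one exists and it can be built.
    if trainable_entry is not None:
        try:
            return 'trainable', make_trainable(trainable_entry)
        except Exception as exc:
            print(f'trainable failed, trying online: {exc}')
    # 2) Otherwise use the online config entry if one was found.
    if online_entry is not None:
        return 'online', make_online(online_entry)
    # 3) No usable config: try online without an entry, then fall back to local.
    try:
        return 'online', make_online(None)
    except Exception as exc:
        print(f'online failed, falling back to local: {exc}')
        return 'trainable', make_trainable(None)


def broken(entry):
    raise RuntimeError('not reachable')


def ok(entry):
    return {'entry': entry}


print(pick_module({'url': 'http://x'}, None, ok, ok))
print(pick_module({'url': 'http://x'}, {'source': 'qwen'}, broken, ok))
print(pick_module(None, None, ok, broken))
```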

Parameters:

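model (str) –

The model name, for example Qwen3-32B. Required unless a source alone is enough: when model is omitted, AutoModel tries OnlineModule(source=source, type=type).
config_id (Optional[str]) –

The id of an entry in the configuration file. Defaults to None.
source (Optional[str]) –

The service provider to use. For online modules (OnlineModule), specify qwen / glm / openai, etc.; if set to local, a local TrainableModule is forced.
type (Optional[str]) –

The model type. If not specified, it is taken from kwargs when present, or inferred automatically by the online module.
config (Union[str, bool]) –

Whether to enable the auto_model_config_map override logic, or the path to a user-specified config file. Defaults to True.
**kwargs –

Only the synonyms of model are accepted: base_model, embed_model_name and model_name. No other user-defined fields are accepted. Other model parameters (such as stream, type, url) should be set in the configuration file (auto_model_config_map) and are injected automatically when referenced by config_id.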
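
The alias handling can be tried in isolation. The sketch below reuses the exact expression from `AutoModel.__new__` inside a hypothetical wrapper, `resolve_model`, to show the priority order (`base_model`, then `embed_model_name`, then `model_name`) and one subtlety: the aliases are only popped from `kwargs` when `model` itself is empty.

```python
def resolve_model(model=None, **kwargs):
    # Same expression as in AutoModel.__new__. The innermost pop runs first,
    # so every alias key is removed from kwargs whenever `model` is falsy.
    model = model or kwargs.pop('base_model', kwargs.pop('embed_model_name', kwargs.pop('model_name', None)))
    return model, kwargs


print(resolve_model('Qwen3-32B'))
print(resolve_model(model_name='m1'))
print(resolve_model(base_model='b', embed_model_name='e', model_name='m'))
# `model` is truthy here, so `or` short-circuits and the alias stays in kwargs.
print(resolve_model('Qwen3-32B', model_name='m1'))
```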

Source code in lazyllm/module/llms/automodel.py

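class AutoModel:
    """A factory class for quickly creating either an online inference module (OnlineModule) or a local TrainableModule. It gives priority to the arguments passed by the user; if ``config`` is enabled, the settings in ``auto_model_config_map`` override them. It then decides automatically whether to build an online or a local module:

- When an online module is selected, the arguments are passed through to OnlineModule (which dispatches to OnlineChatModule / OnlineEmbeddingModule / OnlineMultiModalModule automatically).

- When a local module is selected, TrainableModule is initialized with ``model`` and the user arguments, and the parameters in the config map are read as well.

Args:
    model (str): The model name, for example ``Qwen3-32B``. Required unless a source alone is enough: when ``model`` is omitted, AutoModel tries ``OnlineModule(source=source, type=type)``.
    config_id (Optional[str]): The id of an entry in the configuration file. Defaults to None.
    source (Optional[str]): The service provider to use. For online modules (``OnlineModule``), specify ``qwen`` / ``glm`` / ``openai``, etc.; if set to ``local``, a local TrainableModule is forced.
    type (Optional[str]): The model type. If not specified, it is taken from kwargs when present, or inferred automatically by the online module.
    config (Union[str, bool]): Whether to enable the ``auto_model_config_map`` override logic, or the path to a user-specified config file. Defaults to True.
    **kwargs: Only the synonyms of `model` are accepted: `base_model`, `embed_model_name` and `model_name`. No other user-defined fields are accepted. Other model parameters (such as ``stream``, ``type``, ``url``) should be set in the configuration file (``auto_model_config_map``) and are injected automatically when referenced by ``config_id``.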
"""

    def __new__(cls, model: Optional[str] = None, *, config_id: Optional[str] = None, source: Optional[str] = None,  # noqa C901
                type: Optional[str] = None, config: Union[str, bool] = True, **kwargs: Any):
        # check and accommodate user params
        model = model or kwargs.pop('base_model', kwargs.pop('embed_model_name', kwargs.pop('model_name', None)))

        if source == 'dynamic':
            from .onlinemodule import OnlineChatModule
            dynamic_auth = kwargs.pop('dynamic_auth', False)
            return OnlineChatModule(source='dynamic', dynamic_auth=dynamic_auth, type=type, **kwargs)

        if model in lazyllm.online.chat:
            if source is not None:
                raise ValueError(
                    f'`{model!r}` is a recognised source name; pass it as `source=` and '
                    f'do not also set `source={source!r}`.')
            source, model = model, None

        if not model:
            try:
                return lazyllm.OnlineModule(source=source, type=type)
            except Exception as e:
                raise RuntimeError(f'`model` is not provided in AutoModel, and {e}') from None

        trainable_entry, online_entry = get_candidate_entries(model, config_id, source, config)

        # 1) first: try TrainableModule with trainable config (for directly connecting deployed endpoint)
        if trainable_entry is not None:
            trainable_args = process_trainable_args(
                model=model, type=type, source=source, config=config, entry=trainable_entry
            )
            try:
                module = TrainableModule(**trainable_args)
                if module._url or module._impl._get_deploy_tasks.flag: return module
            except Exception as e:
                LOG.warning('Fail to create `TrainableModule`, will try to '
                            f'load model {model} with `OnlineModule`. Since the error: {e}')

        # 2) second: try OnlineModule with online config if found
        if online_entry is not None:
            online_args = process_online_args(model=model, source=source, type=type, entry=online_entry)
            if online_args: return OnlineModule(**online_args)

        # 3) finally: fallback (no config or config unusable)
        try:
            return OnlineModule(model=model, source=source, type=type)
        except Exception as e:
            LOG.warning('`OnlineModule` creation failed, and will try to '
                        f'load model {model} with local `TrainableModule`. Since the error: {e}')
            return TrainableModule(model, type=type)

`lazyllm.module.TrialModule`

Bases: object

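A parameter grid-search module. It traverses all of its submodules, collects every searchable parameter, and iterates over the parameter combinations to run fine-tuning, deployment, and evaluation.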

Parameters:

m (Callable) –

The submodule whose parameters are grid-searched; fine-tuning, deployment, and evaluation are all performed on this module.

Examples:

>>> import lazyllm
>>> from lazyllm import finetune, deploy
>>> m = lazyllm.TrainableModule('b1', 't').finetune_method(finetune.dummy, **dict(a=lazyllm.Option(['f1', 'f2'])))
>>> m.deploy_method(deploy.dummy).mode('finetune').prompt(None)
>>> s = lazyllm.ServerModule(m, post=lambda x, ori: f'post2({x})')
>>> s.evalset([1, 2, 3])
>>> t = lazyllm.TrialModule(s)
>>> t.update()
>>>
dummy finetune!, and init-args is {a: f1}
dummy finetune!, and init-args is {a: f2}
[["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"], ["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"]]

Source code in lazyllm/module/trialmodule.py

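class TrialModule(object):
    """A parameter grid-search module. It traverses all of its submodules, collects every searchable parameter, and iterates over the parameter combinations to run fine-tuning, deployment, and evaluation.

Args:
    m (Callable): The submodule whose parameters are grid-searched; fine-tuning, deployment, and evaluation are all performed on this module.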


Examples:
    >>> import lazyllm
    >>> from lazyllm import finetune, deploy
    >>> m = lazyllm.TrainableModule('b1', 't').finetune_method(finetune.dummy, **dict(a=lazyllm.Option(['f1', 'f2'])))
    >>> m.deploy_method(deploy.dummy).mode('finetune').prompt(None)
    >>> s = lazyllm.ServerModule(m, post=lambda x, ori: f'post2({x})')
    >>> s.evalset([1, 2, 3])
    >>> t = lazyllm.TrialModule(s)
    >>> t.update()
    >>>
    dummy finetune!, and init-args is {a: f1}
    dummy finetune!, and init-args is {a: f2}
    [["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"], ["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"]]
    """
    def __init__(self, m):
        self.m = m

    @staticmethod
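    def work(m, q):
        """A static method that, in a child process, copies the module, runs its update, and puts the evaluation result on the queue.

Args:
    m (Callable): The module to update.
    q (multiprocessing.Queue): The queue that receives the evaluation result.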
"""
        # update option at module.update()
        m = copy.deepcopy(m)
        m.update()
        q.put(m.eval_result)

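    def update(self):
        """Iterate over all option configurations of the module, run the module update for each one in a separate process, and collect the evaluation result of every configuration.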
"""
        options = get_options(self.m)
        q = multiprocessing.Queue()
        ps = []
        for _ in OptionIter(options, get_options):
            p = ForkProcess(target=TrialModule.work, args=(self.m, q), sync=True)
            ps.append(p)
            p.start()
            time.sleep(1)
        [p.join() for p in ps]
        result = [q.get() for p in ps]
        LOG.info(f'{result}')

`update()`

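Iterate over all option configurations of the module, run the module update for each one in a separate process, and collect the evaluation result of every configuration.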
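
As a rough illustration of what iterating over all option configurations means, the sketch below enumerates every combination of candidate values with `itertools.product`. It does not use LazyLLM's `Option` or `OptionIter`; the `grid` helper and the parameter names in `search_space` are hypothetical.

```python
from itertools import product


def grid(options):
    """Yield one dict per combination of candidate values."""
    names = list(options)
    for values in product(*(options[n] for n in names)):
        yield dict(zip(names, values))


search_space = {'a': ['f1', 'f2'], 'lr': [1e-4, 1e-5]}
combos = list(grid(search_space))
for combo in combos:
    print(combo)
```

Each yielded dict corresponds to one trial, so the number of trials is the product of the candidate-list lengths.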

Source code in lazyllm/module/trialmodule.py

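    def update(self):
        """Iterate over all option configurations of the module, run the module update for each one in a separate process, and collect the evaluation result of every configuration.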
"""
        options = get_options(self.m)
        q = multiprocessing.Queue()
        ps = []
        for _ in OptionIter(options, get_options):
            p = ForkProcess(target=TrialModule.work, args=(self.m, q), sync=True)
            ps.append(p)
            p.start()
            time.sleep(1)
        [p.join() for p in ps]
        result = [q.get() for p in ps]
        LOG.info(f'{result}')

`work(m, q)` `staticmethod`

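A static method that, in a child process, copies the module, runs its update, and puts the evaluation result on the queue.

Parameters:

m (Callable) –

The module to update.
q (Queue) –

The queue that receives the evaluation result.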
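
The copy, update, report pattern of `work` can be sketched without LazyLLM. This version swaps in `threading.Thread` and `queue.Queue` for `ForkProcess` and `multiprocessing.Queue` so that it runs on any platform; `ToyModule` is a hypothetical stand-in for a module that exposes `update()` and `eval_result`.

```python
import copy
import queue
import threading


class ToyModule:
    def __init__(self, scale):
        self.scale = scale
        self.eval_result = None

    def update(self):
        # Pretend to train, deploy, and evaluate on a fixed eval set.
        self.eval_result = [self.scale * x for x in (1, 2, 3)]


def work(m, q):
    m = copy.deepcopy(m)   # keep the caller's module untouched
    m.update()
    q.put(m.eval_result)


original = ToyModule(scale=10)
q = queue.Queue()
workers = [threading.Thread(target=work, args=(original, q)) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
results = [q.get() for _ in workers]
print(results)
print(original.eval_result)
```

Because each worker operates on a deep copy, the module passed in is left unchanged and only the queued results carry the outcome back.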

Source code in lazyllm/module/trialmodule.py

    @staticmethod
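    def work(m, q):
        """A static method that, in a child process, copies the module, runs its update, and puts the evaluation result on the queue.

Args:
    m (Callable): The module to update.
    q (multiprocessing.Queue): The queue that receives the evaluation result.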
"""
        # update option at module.update()
        m = copy.deepcopy(m)
        m.update()
        q.put(m.eval_result)

`lazyllm.module.OnlineChatModule`

Bases: _DynamicSourceRouterMixin, LLMBase

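Manages and creates access modules for publicly available LLM platforms. Currently supported: openai, sensenova, glm, kimi, qwen, doubao, ppio, and deepseek (deepseek is temporarily unavailable because the platform has suspended account top-ups). For how to obtain each platform's API key, see Getting Started.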

Parameters:

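model (str, default: None ) –

The model to access. (For Doubao, use a Model ID or Endpoint ID; see "Obtaining an inference endpoint" for how to get one. Enable the corresponding service on the Doubao platform before using a model.) Defaults to gpt-3.5-turbo(openai) / SenseChat-5(sensenova) / glm-4(glm) / moonshot-v1-8k(kimi) / qwen-plus(qwen) / doubao-1-5-pro-32k-250115(doubao) / deepseek/deepseek-v3.2(ppio).
source (str, default: None ) –

The type of module to create. Options: openai / sensenova / glm / kimi / qwen / doubao / ppio / deepseek (temporarily unavailable). A source name may also be passed directly as model; it is recognized automatically and the two are swapped.
url (str, default: None ) –

The base URL of the platform to access. Defaults to the official URL. The alias base_url is also accepted.
system_prompt (str) –

The system prompt for requests. Defaults to the official system prompt.
api_key (str, default: None ) –

An API key may be passed explicitly. When set to auto or dynamic, the key is read from the configuration at runtime, which allows switching keys dynamically.
stream (bool, default: True ) –

Whether to request and output in streaming mode. Defaults to streaming.
dynamic_auth (bool, default: False ) –

Whether to enable dynamic authentication. True is equivalent to api_key='dynamic'.
return_trace (bool, default: False ) –

Whether to record the result in the trace. Defaults to False.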
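
The source/model swap and the API-key fallback described in the parameter list can be sketched as follows. The swap rule mirrors the check visible in the `AutoModel` source, and the two key rules mirror `OnlineChatModule.__new__`. `KNOWN_SOURCES`, `DEFAULT_KEYS`, and `resolve` are hypothetical and do not reproduce `resolve_online_params` or `select_source_with_default_key`, whose bodies are not shown here.

```python
KNOWN_SOURCES = {'openai', 'sensenova', 'glm', 'kimi', 'qwen', 'doubao', 'ppio'}
DEFAULT_KEYS = {'qwen': 'sk-from-config'}  # placeholder value, not a real key


def resolve(model=None, source=None, api_key=None):
    # A recognised source name passed as `model` is treated as the source.
    if model in KNOWN_SOURCES:
        if source is not None:
            raise ValueError(f'{model!r} is a source name; do not also set source={source!r}')
        source, model = model, None
    # An explicit key without a source is ambiguous, so reject it.
    if source is None and api_key is not None:
        raise ValueError('No source is given but an api_key is provided.')
    # An explicit key wins; otherwise fall back to the configured default.
    if api_key is None:
        api_key = DEFAULT_KEYS.get(source)
    return model, source, api_key


print(resolve('qwen'))
print(resolve('qwen-plus', source='qwen', api_key='sk-mine'))
```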

Examples:

>>> import lazyllm
>>> from functools import partial
>>> m = lazyllm.OnlineChatModule(source="sensenova", stream=True)
>>> query = "Hello!"
>>> with lazyllm.ThreadPoolExecutor(1) as executor:
...     future = executor.submit(partial(m, llm_chat_history=[]), query)
...     while True:
...         if value := lazyllm.FileSystemQueue().dequeue():
...             print(f"output: {''.join(value)}")
...         elif future.done():
...             break
...     print(f"ret: {future.result()}")
...
output: Hello
output: ! How can I assist you today?
ret: Hello! How can I assist you today?
>>> from lazyllm.components.formatter import encode_query_with_filepaths
>>> vlm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-Vision")
>>> query = "what is it?"
>>> inputs = encode_query_with_filepaths(query, ["/path/to/your/image"])
>>> print(vlm(inputs))

Source code in lazyllm/module/llms/onlinemodule/chat.py

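class OnlineChatModule(_DynamicSourceRouterMixin, LLMBase, metaclass=_ChatModuleMeta):
    """Manages and creates access modules for publicly available LLM platforms. Currently supported: openai, sensenova, glm, kimi, qwen, doubao, ppio, and deepseek (deepseek is temporarily unavailable because the platform has suspended account top-ups). For how to obtain each platform's API key, see [Getting Started](/#platform)

Args:
    model (str): The model to access. (For Doubao, use a Model ID or Endpoint ID; see [Obtaining an inference endpoint](https://www.volcengine.com/docs/82379/1099522) for how to get one. Enable the corresponding service on the Doubao platform before using a model.) Defaults to ``gpt-3.5-turbo(openai)`` / ``SenseChat-5(sensenova)`` / ``glm-4(glm)`` / ``moonshot-v1-8k(kimi)`` / ``qwen-plus(qwen)`` / ``doubao-1-5-pro-32k-250115(doubao)`` / ``deepseek/deepseek-v3.2(ppio)``
    source (str): The type of module to create. Options: ``openai`` / ``sensenova`` / ``glm`` / ``kimi`` / ``qwen`` / ``doubao`` / ``ppio`` / ``deepseek`` (temporarily unavailable). A source name may also be passed directly as ``model``; it is recognized automatically and the two are swapped.
    url (str): The base URL of the platform to access. Defaults to the official URL. The alias ``base_url`` is also accepted.
    system_prompt (str): The system prompt for requests. Defaults to the official system prompt.
    api_key (str): An API key may be passed explicitly. When set to ``auto`` or ``dynamic``, the key is read from the configuration at runtime, which allows switching keys dynamically.
    stream (bool): Whether to request and output in streaming mode. Defaults to streaming.
    dynamic_auth (bool): Whether to enable dynamic authentication. True is equivalent to ``api_key='dynamic'``.
    return_trace (bool): Whether to record the result in the trace. Defaults to False.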


Examples:
    >>> import lazyllm
    >>> from functools import partial
    >>> m = lazyllm.OnlineChatModule(source="sensenova", stream=True)
    >>> query = "Hello!"
    >>> with lazyllm.ThreadPoolExecutor(1) as executor:
    ...     future = executor.submit(partial(m, llm_chat_history=[]), query)
    ...     while True:
    ...         if value := lazyllm.FileSystemQueue().dequeue():
    ...             print(f"output: {''.join(value)}")
    ...         elif future.done():
    ...             break
    ...     print(f"ret: {future.result()}")
    ...
    output: Hello
    output: ! How can I assist you today?
    ret: Hello! How can I assist you today?
    >>> from lazyllm.components.formatter import encode_query_with_filepaths
    >>> vlm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-Vision")
    >>> query = "what is it?"
    >>> inputs = encode_query_with_filepaths(query, ["/path/to/your/image"])
    >>> print(vlm(inputs))
    """
    _dynamic_module_slot = 'chat'
    _dynamic_source_error = 'No source is configured for dynamic LLM source.'

    def __new__(cls, model: str = None, source: str = None, url: str = None, stream: bool = True,
                return_trace: bool = False, skip_auth: bool = False, type: Optional[str] = None,
                api_key: str = None, static_params: Optional[StaticParams] = None, id: Optional[str] = None,
                name: Optional[str] = None, group_id: Optional[str] = None, dynamic_auth: bool = False, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, url_aliases='base_url', source_registry=lazyllm.online.chat)
        if cls._should_use_dynamic(source, dynamic_auth, skip_auth):
            return super().__new__(cls)

        if source is None and api_key is not None:
            raise ValueError('No source is given but an api_key is provided.')
        source, default_key = select_source_with_default_key(lazyllm.online.chat, source, LLMType.CHAT)
        api_key = api_key if api_key is not None else default_key
        if skip_auth and not url:
            raise ValueError('url must be set for local serving.')

        type = cls._resolve_type_name(type, model, options=[LLMType.LLM, LLMType.CHAT, LLMType.VLM])
        return getattr(lazyllm.online.chat, source)(
            base_url=url, model=model, stream=stream, return_trace=return_trace,
            api_key=api_key, skip_auth=skip_auth, type=type, **kwargs)

    def __init__(self, model: str = None, source: str = None, url: str = None, stream: bool = True,
                 return_trace: bool = False, skip_auth: bool = False, type: Optional[str] = None,
                 api_key: str = None, static_params: Optional[StaticParams] = None, id: Optional[str] = None,
                 name: Optional[str] = None, group_id: Optional[str] = None, dynamic_auth: bool = False, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, url_aliases='base_url', source_registry=lazyllm.online.chat)
        normalized_type = self._resolve_type_name(type, model, options=[LLMType.LLM, LLMType.CHAT, LLMType.VLM])
        _DynamicSourceRouterMixin.__init__(self, id=id, name=name, group_id=group_id, return_trace=return_trace)
        LLMBase.__init__(self, stream=stream, type=normalized_type, static_params=static_params)
        self._kwargs = kwargs
        self._base_url = url
        self._model_name = model
        self._skip_auth = skip_auth
        self._init_dynamic_auth(api_key, dynamic_auth)

    def _build_supplier(self, source: str, skip_auth: bool):
        params = {
            'base_url': self._base_url, 'model': self._model_name, 'stream': self._stream, 'type': self._type,
            'static_params': self._static_params, 'skip_auth': skip_auth, 'api_key': self._api_key,
            'return_trace': self._return_trace, **self._kwargs}
        supplier = getattr(lazyllm.online.chat, source)(**params)
        supplier.prompt(getattr(self, '_prompt', None))
        supplier.formatter(getattr(self, '_formatter', None))
        return supplier

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoChat`

Bases: OnlineChatModuleBase

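Doubao online chat module, inheriting from OnlineChatModuleBase.
Wraps calls to ByteDance's Doubao API for multi-turn question answering. Uses the model doubao-1-5-pro-32k-250115 by default and supports streaming output and call-chain tracing.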

Parameters:

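model (str, default: None ) –

The model name to use. Defaults to lazyllm.config['doubao_model_name'] if set, otherwise doubao-1-5-pro-32k-250115.
base_url (str, default: None ) –

The API base URL. Defaults to "https://ark.cn-beijing.volces.com/api/v3/".
api_key (Optional[str], default: None ) –

The Doubao API key. If not provided, it is read from lazyllm.config['doubao_api_key'].
stream (bool, default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (bool, default: False ) –

Whether to return call-chain trace information. Defaults to False.
**kwargs –

Additional arguments passed to the base class OnlineChatModuleBase.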
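
The constructor picks the model name with a three-step `or` chain: the explicit argument, then the configured name, then the class default. A minimal sketch, using a plain dict as a stand-in for `lazyllm.config` and a hypothetical helper `pick_model`:

```python
DEFAULT_MODEL = 'doubao-1-5-pro-32k-250115'


def pick_model(model, config):
    # Explicit argument first, then the configured name, then the class default.
    return model or config.get('doubao_model_name') or DEFAULT_MODEL


print(pick_model('my-endpoint-id', {'doubao_model_name': 'from-config'}))
print(pick_model(None, {'doubao_model_name': 'from-config'}))
print(pick_model(None, {'doubao_model_name': ''}))
```

Because the chain uses `or`, an empty string is treated the same as a missing value at every step.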

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py

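class DoubaoChat(OnlineChatModuleBase):
    """Doubao online chat module, inheriting from OnlineChatModuleBase.
Wraps calls to ByteDance's Doubao API for multi-turn question answering. Uses the model `doubao-1-5-pro-32k-250115` by default and supports streaming output and call-chain tracing.

Args:
    model (str): The model name to use. Defaults to lazyllm.config['doubao_model_name'] if set, otherwise `doubao-1-5-pro-32k-250115`.
    base_url (str): The API base URL. Defaults to "https://ark.cn-beijing.volces.com/api/v3/".
    api_key (Optional[str]): The Doubao API key. If not provided, it is read from lazyllm.config['doubao_api_key'].
    stream (bool): Whether to enable streaming output. Defaults to True.
    return_trace (bool): Whether to return call-chain trace information. Defaults to False.
    **kwargs: Additional arguments passed to the base class OnlineChatModuleBase.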
"""
    MODEL_NAME = 'doubao-1-5-pro-32k-250115'
    VLM_MODEL_PREFIX = ['doubao-seed-1-6-vision', 'doubao-1-5-ui-tars']

    def __init__(self, model: Optional[str] = None, base_url: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://ark.cn-beijing.volces.com/api/v3/'
        super().__init__(api_key=api_key or self._default_api_key(), base_url=base_url,
                         model_name=model or lazyllm.config['doubao_model_name'] or DoubaoChat.MODEL_NAME,
                         stream=stream, return_trace=return_trace, **kwargs)

    def _get_system_prompt(self):
        return ('You are Doubao, an AI assistant. Your task is to provide appropriate responses '
                'and support to user\'s questions and requests.')

    def _validate_api_key(self):
        """Validate API Key by sending a minimal request"""
        try:
            # Doubao (Volcano Engine) validates API key using a minimal chat request
            data = {
                'model': self._model_name,
                'messages': [{'role': 'user', 'content': 'hi'}],
                'max_tokens': 1  # Only generate 1 token for validation
            }
            response = requests.post(self._chat_url, headers=self._header, json=data, timeout=10)
            return response.status_code == 200
        except Exception:
            return False

`lazyllm.module.llms.onlinemodule.supplier.ppio.PPIOChat`

Bases: OnlineChatModuleBase

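PPIO (Paiou Cloud, 派欧云) online chat module, inheriting from OnlineChatModuleBase.
Wraps calls to the PPIO (Paiou Cloud) API for multi-turn question answering. Uses the model deepseek/deepseek-v3.2 by default and supports streaming output and call-chain tracing. PPIO provides an OpenAI-compatible API.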

Parameters:

model (str, default: None ) –

The model name to use. Defaults to deepseek/deepseek-v3.2.
base_url (str, default: None ) –

The API base URL. Defaults to "https://api.ppinfra.com/openai".
api_key (Optional[str], default: None ) –

The PPIO API key. If not provided, it is read from lazyllm.config['ppio_api_key'].
stream (bool, default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (bool, default: False ) –

Whether to return call-chain trace information. Defaults to False.
**kwargs –

Additional arguments passed to the base class OnlineChatModuleBase.

Examples:

>>> import lazyllm
>>> # Set environment variable: export LAZYLLM_PPIO_API_KEY=your_api_key
>>> # Or create config file ~/.lazyllm/config.json: {"ppio_api_key": "your_api_key"}
>>> chat = lazyllm.OnlineChatModule(source='ppio', model='deepseek/deepseek-v3.2')
>>> response = chat('Hello, how are you?')
>>> print(response)
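
How a base URL becomes the chat endpoint is easiest to see by running the branching from `_get_chat_url` on a few inputs. The function below is a standalone copy of that logic under the hypothetical name `chat_url`; the `localhost` URL is an invented example, not a documented endpoint.

```python
from urllib.parse import urljoin


def chat_url(url):
    # Standalone copy of the branching in PPIOChat._get_chat_url.
    base = (url or '').rstrip('/')
    if base.endswith('/chat/completions'):
        return url                      # already a full endpoint
    if not base.endswith(('/openai', '/v1')):
        base = f'{base}/openai/'        # bare host: add the /openai prefix
    else:
        base = f'{base}/'               # keep the prefix, restore the slash
    return urljoin(base, 'chat/completions')


for candidate in ('https://api.ppinfra.com',
                  'https://api.ppinfra.com/openai/',
                  'http://localhost:8000/v1',
                  'https://api.ppinfra.com/openai/chat/completions'):
    print(candidate, '->', chat_url(candidate))
```

The trailing slash matters because `urljoin` replaces the last path segment when the base does not end in `/`.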

Source code in lazyllm/module/llms/onlinemodule/supplier/ppio.py

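class PPIOChat(OnlineChatModuleBase):
    """PPIO (Paiou Cloud, 派欧云) online chat module, inheriting from OnlineChatModuleBase.
Wraps calls to the PPIO (Paiou Cloud) API for multi-turn question answering. Uses the model `deepseek/deepseek-v3.2` by default and supports streaming output and call-chain tracing. PPIO provides an OpenAI-compatible API.

Args:
    model (str): The model name to use. Defaults to `deepseek/deepseek-v3.2`.
    base_url (str): The API base URL. Defaults to "https://api.ppinfra.com/openai".
    api_key (Optional[str]): The PPIO API key. If not provided, it is read from lazyllm.config['ppio_api_key'].
    stream (bool): Whether to enable streaming output. Defaults to True.
    return_trace (bool): Whether to return call-chain trace information. Defaults to False.
    **kwargs: Additional arguments passed to the base class OnlineChatModuleBase.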


Examples:
    >>> import lazyllm
    >>> # Set environment variable: export LAZYLLM_PPIO_API_KEY=your_api_key
    >>> # Or create config file ~/.lazyllm/config.json: {"ppio_api_key": "your_api_key"}
    >>> chat = lazyllm.OnlineChatModule(source='ppio', model='deepseek/deepseek-v3.2')
    >>> response = chat('Hello, how are you?')
    >>> print(response)
    """
    TRAINABLE_MODEL_LIST = []
    NO_PROXY = False

    # Initialize PPIO module.
    # Args:
    #     base_url: API base URL, defaults to 'https://api.ppinfra.com/openai'
    #     model: Model name, defaults to 'deepseek/deepseek-v3.2'
    #     api_key: API key, if not provided, will be read from config
    #     stream: Whether to use streaming output, defaults to True
    #     return_trace: Whether to return execution trace, defaults to False
    #     skip_auth: Whether to skip authentication, defaults to False
    #     **kw: Other parameters
    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True,
                 return_trace: bool = False, skip_auth: bool = False, **kw):
        base_url = base_url or 'https://api.ppinfra.com/openai/'
        model = model or 'deepseek/deepseek-v3.2'
        super().__init__(api_key=api_key or self._default_api_key(), base_url=base_url,
                         model_name=model, stream=stream, return_trace=return_trace, skip_auth=skip_auth, **kw)

    # Return PPIO system prompt.
    def _get_system_prompt(self):
        return 'You are a helpful AI assistant.'

    # Validate API key by sending a minimal chat request.
    def _validate_api_key(self):
        try:
            data = {'model': self._model_name, 'messages': [{'role': 'user', 'content': 'hi'}], 'max_tokens': 1}
            response = requests.post(self._chat_url, headers=self._header, json=data, timeout=10)
            return response.status_code == 200
        except Exception:
            return False

    # Chat API URL - PPIO endpoint is /openai/chat/completions.
    def _get_chat_url(self, url):
        base = (url or '').rstrip('/')
        if base.endswith('/chat/completions'):
            return url
        if not base.endswith(('/openai', '/v1')):
            base = f'{base}/openai/'
        else:
            base = f'{base}/'
        return urljoin(base, 'chat/completions')

    # PPIO does not support deployment, return model name and running status.
    def _create_deployment(self) -> Tuple[str, str]:
        return (self._model_name, 'RUNNING')

    # PPIO does not support deployment query, return running status.
    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'

    def __repr__(self):
        return lazyllm.make_repr('Module', 'PPIO', name=self._model_name, url=self._base_url,
                                 stream=bool(self._stream), return_trace=self._return_trace)

`lazyllm.module.OnlineEmbeddingModule`

Bases: _DynamicSourceRouterMixin

Manages and creates modules for online embedding services. Currently supported: openai, sensenova, glm, qwen, doubao.

Parameters:

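model (str, default: None ) –

The model to access. (For Doubao, use a Model ID or Endpoint ID; see "Obtaining an inference endpoint" for how to get one. Enable the corresponding service on the Doubao platform before using a model.) Defaults to text-embedding-ada-002(openai) / nova-embedding-stable(sensenova) / embedding-2(glm) / text-embedding-v1(qwen) / doubao-embedding-text-240715(doubao). The aliases embed_model_name and model_name are also accepted. A source name may also be passed directly as model; it is recognized automatically and the two are swapped.
source (str, default: None ) –

The type of module to create. Options: openai / sensenova / glm / qwen / doubao.
url (str, default: None ) –

The base URL of the platform to access. Defaults to the official URL. The aliases embed_url and base_url are also accepted.
type (str, default: None ) –

The model service type, either embed or rerank. By default it is inferred from the model name.
api_key (str, default: None ) –

An API key may be passed explicitly. When set to auto or dynamic, the key is read from the configuration at runtime, which allows switching keys dynamically.
dynamic_auth (bool, default: False ) –

Whether to enable dynamic authentication. True is equivalent to api_key='dynamic'.
return_trace (bool, default: False ) –

Whether to record the result in the trace. Defaults to False.
batch_size (int, default: 32 ) –

The number of items per batch in batched requests. Defaults to 32.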
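
`batch_size` caps how many inputs go into one request. How LazyLLM splits the inputs is not shown in this section, so the following is only a generic chunking sketch; `chunked` is a hypothetical helper, not part of the library.

```python
def chunked(items, batch_size=32):
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError('batch_size must be positive')
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


texts = [f'doc {i}' for i in range(70)]
batches = chunked(texts, batch_size=32)
print([len(b) for b in batches])
```

With 70 inputs and the default size of 32, this yields two full batches and one partial batch.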

Examples:

>>> import lazyllm
>>> m = lazyllm.OnlineEmbeddingModule(source="sensenova")
>>> emb = m("hello world")
>>> print(f"emb: {emb}")
emb: [0.0010528564, 0.0063285828, 0.0049476624, -0.012008667, ..., -0.009124756, 0.0032043457, -0.051696777]
>>> m2 = lazyllm.OnlineEmbeddingModule("sensenova")
>>> emb2 = m2("hello world")

Source code in lazyllm/module/llms/onlinemodule/embedding.py

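class OnlineEmbeddingModule(_DynamicSourceRouterMixin, metaclass=__EmbedModuleMeta):
    """Manages and creates modules for online embedding services. Currently supported: openai, sensenova, glm, qwen, doubao.

Args:
    model (str): The model to access. (For Doubao, use a Model ID or Endpoint ID; see [Obtaining an inference endpoint](https://www.volcengine.com/docs/82379/1099522) for how to get one. Enable the corresponding service on the Doubao platform before using a model.) Defaults to ``text-embedding-ada-002(openai)`` / ``nova-embedding-stable(sensenova)`` / ``embedding-2(glm)`` / ``text-embedding-v1(qwen)`` / ``doubao-embedding-text-240715(doubao)``. The aliases ``embed_model_name`` and ``model_name`` are also accepted. A source name may also be passed directly as ``model``; it is recognized automatically and the two are swapped.
    source (str): The type of module to create. Options: ``openai`` / ``sensenova`` / ``glm`` / ``qwen`` / ``doubao``.
    url (str): The base URL of the platform to access. Defaults to the official URL. The aliases ``embed_url`` and ``base_url`` are also accepted.
    type (str): The model service type, either ``embed`` or ``rerank``. By default it is inferred from the model name.
    api_key (str): An API key may be passed explicitly. When set to ``auto`` or ``dynamic``, the key is read from the configuration at runtime, which allows switching keys dynamically.
    dynamic_auth (bool): Whether to enable dynamic authentication. True is equivalent to ``api_key='dynamic'``.
    return_trace (bool): Whether to record the result in the trace. Defaults to False.
    batch_size (int): The number of items per batch in batched requests. Defaults to 32.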


Examples:
    >>> import lazyllm
    >>> m = lazyllm.OnlineEmbeddingModule(source="sensenova")
    >>> emb = m("hello world")
    >>> print(f"emb: {emb}")
    emb: [0.0010528564, 0.0063285828, 0.0049476624, -0.012008667, ..., -0.009124756, 0.0032043457, -0.051696777]
    >>> m2 = lazyllm.OnlineEmbeddingModule("sensenova")
    >>> emb2 = m2("hello world")
    """
    _dynamic_module_slot = 'embed'
    _dynamic_source_error = 'No source is configured for dynamic embedding source.'

    @staticmethod
    def _resolve_type_name(type_name: Optional[str], embed_model_name: Optional[str]) -> str:
        if type_name is not None:
            if type_name == LLMType.CROSS_MODAL_EMBED:
                return 'cross_modal_embed'
            return type_name
        resolved = get_model_type(embed_model_name) if embed_model_name else 'embed'
        return resolved if resolved in ('embed', 'rerank', 'cross_modal_embed') else 'embed'

    @staticmethod
    def _create_supplier(source: str, type_name: str, embed_model_name: str, params: dict):
        if type_name == 'cross_modal_embed':
            if source == 'doubao':
                return DoubaoMultimodalEmbed(**params)
            if source == 'qwen':
                return QwenMultimodalEmbed(**params)
            if source == 'siliconflow':
                return SiliconFlowMultimodalEmbed(**params)
            # OpenAI-compatible self-hosted cross-modal embedding (e.g. siglip via source=openai).
            if source in lazyllm.online.embed:
                return getattr(lazyllm.online.embed, source)(**params)
            raise ValueError(f'Source {source!r} does not support CROSS_MODAL_EMBED.')
        if type_name == 'embed':
            if source == 'doubao' and embed_model_name and embed_model_name.startswith('doubao-embedding-vision'):
                return DoubaoMultimodalEmbed(**params)
            if source == 'qwen' and _is_qwen_multimodal_embed_model(embed_model_name):
                return QwenMultimodalEmbed(**params)
            if source == 'siliconflow' and _is_siliconflow_multimodal_embed_model(embed_model_name):
                return SiliconFlowMultimodalEmbed(**params)
            if source == 'doubao':
                return DoubaoEmbed(**params)
            return getattr(lazyllm.online.embed, source)(**params)
        if type_name == 'rerank':
            return getattr(lazyllm.online.rerank, source)(**params)
        raise ValueError('Unknown type of online embedding module.')

    @staticmethod
    def _is_embed_source(name: str) -> bool:
        return name in lazyllm.online.embed or name in lazyllm.online.rerank

    def __new__(cls, model: str = None, source: str = None, url: str = None,
                return_trace: bool = False, api_key: str = None, dynamic_auth: bool = False,
                skip_auth: bool = False, id: Optional[str] = None, name: Optional[str] = None,
                group_id: Optional[str] = None, type: Optional[str] = None, batch_size: int = 32,
                num_worker: int = 4, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs,
            model_aliases=('embed_model_name', 'model_name'), url_aliases=('embed_url', 'base_url'),
            source_registry=OnlineEmbeddingModule._is_embed_source)
        if cls._should_use_dynamic(source, dynamic_auth, skip_auth):
            return super().__new__(cls)
        if source is None and api_key is not None:
            raise ValueError('No source is given but an api_key is provided.')
        type_name = OnlineEmbeddingModule._resolve_type_name(type, model)
        if type_name in ('embed', 'cross_modal_embed'):
            source, default_key = select_source_with_default_key(lazyllm.online.embed, source, LLMType.EMBED)
        elif type_name == 'rerank':
            source, default_key = select_source_with_default_key(lazyllm.online.rerank, source, LLMType.RERANK)
        else:
            raise ValueError('Unknown type of online embedding module.')
        api_key = api_key if api_key is not None else default_key
        if skip_auth and not url:
            raise ValueError('url must be set for local serving.')
        params = {'embed_url': url, 'embed_model_name': model, 'return_trace': return_trace,
                  'batch_size': batch_size, 'num_worker': num_worker,
                  'api_key': api_key, 'skip_auth': skip_auth, **kwargs}
        return OnlineEmbeddingModule._create_supplier(source, type_name, model, params)

    def __init__(self, model: str = None, source: str = None, url: str = None,
                 return_trace: bool = False, api_key: str = None, dynamic_auth: bool = False,
                 skip_auth: bool = False, id: Optional[str] = None, name: Optional[str] = None,
                 group_id: Optional[str] = None, type: Optional[str] = None, batch_size: int = 32,
                 num_worker: int = 4, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs,
            model_aliases=('embed_model_name', 'model_name'), url_aliases=('embed_url', 'base_url'),
            source_registry=OnlineEmbeddingModule._is_embed_source)
        _DynamicSourceRouterMixin.__init__(self, id=id, name=name, group_id=group_id, return_trace=return_trace)
        self._embed_url = url
        self._embed_model_name = model
        if source == 'dynamic' and type is None:
            raise ValueError('type must be explicitly provided when source is dynamic.')
        self._type = OnlineEmbeddingModule._resolve_type_name(type, model)
        self._skip_auth = skip_auth
        self._kwargs = kwargs
        self._kwargs.setdefault('num_worker', num_worker)
        self._batch_size = batch_size
        self._init_dynamic_auth(api_key, dynamic_auth)

    def _build_supplier(self, source: str, skip_auth: bool):
        params = {'embed_url': self._embed_url, 'embed_model_name': self._embed_model_name,
                  'return_trace': self._return_trace, 'batch_size': self._batch_size,
                  'api_key': self._api_key, 'skip_auth': skip_auth, **self._kwargs}
        return OnlineEmbeddingModule._create_supplier(source, self._type, self._embed_model_name, params)

`lazyllm.module.OnlineMultiModalModule`

Bases: _DynamicSourceRouterMixin

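Manages the creation of online multimodal service modules. The currently supported types are stt / tts / text2image / image_editing.

Parameters:

model (str, default: None ) –

Name of the model to access.
source (str, default: None ) –

Source (platform) of the module to create, such as qwen / glm / minimax / siliconflow / doubao.
type (str, default: None ) –

Multimodal task type, one of stt / tts / text2image / image_editing. If omitted, the type is inferred from model.
url (str, default: None ) –

Base URL of the platform to access. Defaults to each platform's official URL. The alias base_url is also accepted.
api_key (str, default: None ) –

API key, passed explicitly. When set to auto or dynamic, the key is read from the configuration at runtime, which allows switching keys dynamically.
dynamic_auth (bool, default: False ) –

Whether to enable dynamic authentication. Setting it to True is equivalent to api_key='dynamic'.
return_trace (bool, default: False ) –

Whether to record the result in the trace. Defaults to False.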

Examples:

>>> import lazyllm
>>> stt = lazyllm.OnlineMultiModalModule(source='qwen', type='stt', api_key='dynamic')
>>> tts = lazyllm.OnlineMultiModalModule(source='qwen', type='tts', dynamic_auth=True)
>>> img = lazyllm.OnlineMultiModalModule(source='qwen', type='text2image')
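As a rough illustration of how the `type` argument selects a registry group and a default-model attribute, the following standalone sketch mirrors the lookup shown in the source listing for this class. It does not import lazyllm. The map values are plain strings standing in for `LLMType` members (an assumption), and `resolve_registry_group` and `default_model_attr` are hypothetical helper names.

```python
from typing import Optional

# Assumed stand-in for OnlineMultiModalModule.TYPE_GROUP_MAP. The real values are
# LLMType members; plain lowercase strings are used here for illustration only.
TYPE_GROUP_MAP = {
    'stt': 'stt',
    'tts': 'tts',
    'text2image': 'text2image',
    'image_editing': 'text2image',
}


def resolve_registry_group(type_name: Optional[str]) -> str:
    """Return the registry group that serves the given task type (hypothetical helper)."""
    if type_name not in TYPE_GROUP_MAP:
        raise ValueError(f'Invalid type: {type_name!r}. Must be one of: {list(TYPE_GROUP_MAP)}')
    return TYPE_GROUP_MAP[type_name].lower()


def default_model_attr(type_name: str) -> str:
    """Return the class attribute consulted for the default model (hypothetical helper)."""
    return 'IMAGE_EDITING_MODEL_NAME' if type_name == 'image_editing' else 'MODEL_NAME'


# image_editing shares the text2image registry group but has its own default model attribute.
print(resolve_registry_group('image_editing'), default_model_attr('image_editing'))
```

The point of the sketch is that `image_editing` is not a separate registry group: it reuses the `text2image` suppliers and differs only in which default model name is picked up.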

Source code in lazyllm/module/llms/onlinemodule/multimodal.py

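class OnlineMultiModalModule(_DynamicSourceRouterMixin, metaclass=_OnlineMultiModalMeta):
    """Manages the creation of online multimodal service modules. The currently supported types are ``stt`` / ``tts`` / ``text2image`` / ``image_editing``.

Args:
    model (str): Name of the model to access.
    source (str): Source (platform) of the module to create, such as ``qwen`` / ``glm`` / ``minimax`` / ``siliconflow`` / ``doubao``.
    type (str): Multimodal task type, one of ``stt`` / ``tts`` / ``text2image`` / ``image_editing``. If omitted, the type is inferred from ``model``.
    url (str): Base URL of the platform to access. Defaults to each platform's official URL. The alias ``base_url`` is also accepted.
    api_key (str): API key, passed explicitly. When set to ``auto`` or ``dynamic``, the key is read from the configuration at runtime, which allows switching keys dynamically.
    dynamic_auth (bool): Whether to enable dynamic authentication. Setting it to True is equivalent to ``api_key='dynamic'``.
    return_trace (bool): Whether to record the result in the trace. Defaults to False.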


Examples:
    >>> import lazyllm
    >>> stt = lazyllm.OnlineMultiModalModule(source='qwen', type='stt', api_key='dynamic')
    >>> tts = lazyllm.OnlineMultiModalModule(source='qwen', type='tts', dynamic_auth=True)
    >>> img = lazyllm.OnlineMultiModalModule(source='qwen', type='text2image')
    """
    _dynamic_module_slot = 'multimodal'
    _dynamic_source_error = 'No source is configured for dynamic multimodal source.'
    TYPE_GROUP_MAP = {
        'stt': LLMType.STT,
        'tts': LLMType.TTS,
        'text2image': LLMType.TEXT2IMAGE,
        'image_editing': LLMType.TEXT2IMAGE,
    }

    @staticmethod
    def _resolve_type_name(type_name: Optional[str], model: Optional[str]) -> str:
        if type_name is not None:
            return LLMType._normalize(type_name)
        resolved = get_model_type(model) if model else None
        if resolved == 'sd':
            return 'text2image'
        if resolved not in OnlineMultiModalModule.TYPE_GROUP_MAP:
            raise ValueError(
                f'Cannot infer multimodal type from model {model!r}. '
                f'Please provide `type` explicitly (one of: {list(OnlineMultiModalModule.TYPE_GROUP_MAP.keys())}).')
        return resolved

    @staticmethod
    def _validate_parameters(source: Optional[str], model: Optional[str], type: str, url: Optional[str],
                             skip_auth: bool = False, **kwargs) -> tuple:
        if type not in OnlineMultiModalModule.TYPE_GROUP_MAP:
            raise ValueError(
                f'Invalid type: {type!r}. Must be one of: {list(OnlineMultiModalModule.TYPE_GROUP_MAP.keys())}')
        register_type = OnlineMultiModalModule.TYPE_GROUP_MAP.get(type).lower()
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, source_registry=lazyllm.online[register_type])
        source, default_key = select_source_with_default_key(lazyllm.online[register_type], source, type)
        if default_key and not kwargs.get('api_key'):
            kwargs['api_key'] = default_key
        if skip_auth and not url:
            raise ValueError('url must be set for local serving.')
        default_module_cls = getattr(lazyllm.online[register_type], source)
        default_model_name = getattr(default_module_cls, 'IMAGE_EDITING_MODEL_NAME' if type == 'image_editing'
                                     else 'MODEL_NAME', None)
        if model is None and default_model_name:
            model = default_model_name
            lazyllm.LOG.info(f'For type {type}, source {source}. Automatically selected default model: {model}')
        if url is not None:
            kwargs['base_url'] = url
        return source, model, kwargs

    def __new__(cls, model: str = None, source: str = None, url: str = None, type: str = None,
                return_trace: bool = False, api_key: str = None, dynamic_auth: bool = False,
                skip_auth: bool = False, id: Optional[str] = None, name: Optional[str] = None,
                group_id: Optional[str] = None, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, url_aliases='base_url')
        if cls._should_use_dynamic(source, dynamic_auth, skip_auth):
            return super().__new__(cls)
        if source is None and api_key is not None:
            raise ValueError('No source is given but an api_key is provided.')
        if api_key is not None:
            kwargs['api_key'] = api_key
        type = OnlineMultiModalModule._resolve_type_name(
            type if type is not None else kwargs.pop('function', None), model)
        source, model, kwargs_normalized = OnlineMultiModalModule._validate_parameters(
            source=source, model=model, type=type, url=url, skip_auth=skip_auth, **kwargs)
        params = {'return_trace': return_trace, 'type': type}
        if model is not None:
            params['model'] = model
        params.update(kwargs_normalized)
        register_type = OnlineMultiModalModule.TYPE_GROUP_MAP.get(type).lower()
        return getattr(lazyllm.online[register_type], source)(**params)

    def __init__(self, model: str = None, source: str = None, url: str = None, type: str = None,
                 return_trace: bool = False, api_key: str = None, dynamic_auth: bool = False,
                 skip_auth: bool = False, id: Optional[str] = None, name: Optional[str] = None,
                 group_id: Optional[str] = None, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, url_aliases='base_url')
        _DynamicSourceRouterMixin.__init__(self, id=id, name=name, group_id=group_id, return_trace=return_trace)
        self._model_name = model
        self._base_url = url
        self._skip_auth = skip_auth
        self._type = self._resolve_type_name(type, model)
        self._kwargs = kwargs
        self._init_dynamic_auth(api_key, dynamic_auth)

    def _build_supplier(self, source: str, skip_auth: bool):
        params = {'base_url': self._base_url, 'model': self._model_name, 'return_trace': self._return_trace,
                  'type': self._type, 'api_key': self._api_key, 'skip_auth': skip_auth, **self._kwargs}
        register_type = OnlineMultiModalModule.TYPE_GROUP_MAP.get(self._type).lower()
        return getattr(lazyllm.online[register_type], source)(**params)

`lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIEmbed`

Bases: LazyLLMOnlineEmbedModuleBase

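OpenAI online embedding module. This class wraps calls to the OpenAI embeddings API and uses the text-embedding-ada-002 model by default to encode text into vector representations.

Parameters:

embed_url (str, default: None ) –

Base URL of the OpenAI API. Defaults to "https://api.openai.com/v1/"; the embeddings path is appended to it, giving "https://api.openai.com/v1/embeddings".
embed_model_name (str, default: None ) –

Name of the embedding model to use. Defaults to "text-embedding-ada-002".
api_key (str, default: None ) –

OpenAI API key. If not provided, it is read from lazyllm.config.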
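Because the source listing for this class builds the final endpoint with `urljoin(self._embed_url, 'embeddings')`, the trailing slash of a custom base URL matters. The sketch below uses only the standard library to show the difference; `build_embeddings_url` is a hypothetical helper name, not part of lazyllm.

```python
from urllib.parse import urljoin


def build_embeddings_url(base_url: str) -> str:
    """Append the `embeddings` path to a base URL the same way urljoin does."""
    return urljoin(base_url, 'embeddings')


# With a trailing slash, the last path segment (`v1`) is kept.
print(build_embeddings_url('https://api.openai.com/v1/'))
# Without a trailing slash, urljoin replaces the last path segment.
print(build_embeddings_url('https://api.openai.com/v1'))
```

The first call yields `https://api.openai.com/v1/embeddings` and the second yields `https://api.openai.com/embeddings`, so a custom `embed_url` should normally end with `/`.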

Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py

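class OpenAIEmbed(LazyLLMOnlineEmbedModuleBase):
    """OpenAI online embedding module.
This class wraps calls to the OpenAI embeddings API and uses the `text-embedding-ada-002` model by default to encode text into vector representations.

Args:
    embed_url (str): Base URL of the OpenAI API. Defaults to "https://api.openai.com/v1/"; the embeddings path is appended to it, giving "https://api.openai.com/v1/embeddings".
    embed_model_name (str): Name of the embedding model to use. Defaults to "text-embedding-ada-002".
    api_key (str, optional): OpenAI API key. If not provided, it is read from lazyllm.config.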
"""
    NO_PROXY = True

    def __init__(self, embed_url: Optional[str] = None, embed_model_name: Optional[str] = None,
                 api_key: str = None, batch_size: int = 16, **kw):
        embed_url = embed_url or 'https://api.openai.com/v1/'
        embed_model_name = embed_model_name or 'text-embedding-ada-002'
        super().__init__(embed_url, api_key or self._default_api_key(),
                         embed_model_name, batch_size=batch_size, **kw)

    def _set_embed_url(self):
        self._embed_url = urljoin(self._embed_url, 'embeddings')

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenSTT`

Bases: LazyLLMOnlineSTTModuleBase

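Speech-to-text (STT) module based on the Qwen multimodal interface. It uses the paraformer-v2 model by default.

Parameters:

model (str, default: None ) –

Model name. Defaults to None, in which case the name falls back first to lazyllm.config['qwen_stt_model_name'] and then to QwenSTT.MODEL_NAME.
api_key (str, default: None ) –

API key for Qwen. Defaults to None.
return_trace (bool, default: False ) –

Whether to return intermediate trace information from inference. Defaults to False.
**kwargs –

Additional arguments passed to the parent class LazyLLMOnlineSTTModuleBase.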
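Two small pieces of logic in the source listing for this class are easy to miss: the model-name fallback chain and the removal of angle-bracket tags from each transcript before the texts are concatenated. The standalone sketch below reproduces both without calling dashscope. `pick_model_name` and `join_transcripts` are hypothetical helper names, and the sample response is invented for illustration.

```python
import re
from typing import Dict, Optional


def pick_model_name(model: Optional[str], configured: Optional[str],
                    default: str = 'paraformer-v2') -> str:
    """Prefer the explicit model, then the configured one, then the class default."""
    return model or configured or default


def join_transcripts(response: Dict) -> str:
    """Concatenate transcript texts after stripping `<...>` tags from each one."""
    return ''.join(re.sub(r'<[^>]+>', '', item['text']) for item in response['transcripts'])


# Invented sample payload shaped like the parsed transcription JSON.
sample = {'transcripts': [{'text': '<|Speech|>hello'}, {'text': ' world<b>'}]}
print(pick_model_name(None, None))
print(join_transcripts(sample))
```

With the sample above, the fallback returns `paraformer-v2` and the joined text is `hello world`.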

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

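class QwenSTT(LazyLLMOnlineSTTModuleBase):
    """Speech-to-text (STT) module based on the Qwen multimodal interface. It uses the ``paraformer-v2`` model by default.

Args:
    model (str): Model name. Defaults to ``None``, in which case the name falls back first to ``lazyllm.config['qwen_stt_model_name']`` and then to ``QwenSTT.MODEL_NAME``.
    api_key (str): API key for Qwen. Defaults to ``None``.
    return_trace (bool): Whether to return intermediate trace information from inference. Defaults to ``False``.
    **kwargs: Additional arguments passed to the parent class ``LazyLLMOnlineSTTModuleBase``.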
"""
    MODEL_NAME = 'paraformer-v2'

    def __init__(self, model: str = None, api_key: str = None, return_trace: bool = False,
                 base_url: Optional[str] = None,
                 base_websocket_url: Optional[str] = None, **kwargs):
        _ensure_dashscope_urls_initialized()
        base_url = base_url or _DASHSCOPE_DEFAULT_HTTP_URL
        base_websocket_url = base_websocket_url or _DASHSCOPE_DEFAULT_WEBSOCKET_URL
        if base_url and base_url != _DASHSCOPE_DEFAULT_HTTP_URL:
            LOG.warning('QwenSTT ignores `base_url`; use `set_dashscope_urls` instead.')
        if base_websocket_url and base_websocket_url != _DASHSCOPE_DEFAULT_WEBSOCKET_URL:
            LOG.warning('QwenSTT ignores `base_websocket_url`; use `set_dashscope_urls` instead.')
        model_name = model or lazyllm.config['qwen_stt_model_name'] or QwenSTT.MODEL_NAME
        super().__init__(
            api_key=api_key or self._default_api_key(),
            model_name=model_name,
            return_trace=return_trace,
            base_url=base_url,
            **kwargs,
        )

    def _forward(self, files: List[str] = [], url: str = None, model: str = None, **kwargs):  # noqa B006
        assert any(file.startswith('http') for file in files), 'QwenSTT only supports http file urls'
        if url and url != self._base_url:
            raise Exception('Qwen STT forward() does not support overriding the `url` parameter, please remove it.')
        if 'base_websocket_url' in kwargs:
            raise Exception('Qwen STT forward() does not support overriding the `base_websocket_url` parameter.')
        call_params = {'model': model, 'file_urls': files, **kwargs}
        if self._api_key: call_params['api_key'] = self._api_key
        task_response = dashscope.audio.asr.Transcription.async_call(**call_params)
        transcribe_response = dashscope.audio.asr.Transcription.wait(task=task_response.output.task_id,
                                                                     api_key=self._api_key)
        if transcribe_response.status_code == HTTPStatus.OK:
            result_text = ''
            for task in transcribe_response.output.results:
                assert task['subtask_status'] == 'SUCCEEDED', 'subtask_status is not SUCCEEDED'
                response = json.loads(requests.get(task['transcription_url']).text)
                for transcript in response['transcripts']:
                    result_text += re.sub(r'<[^>]+>', '', transcript['text'])
            return result_text
        else:
            LOG.error(f'failed to transcribe: {transcribe_response.output}')
            raise Exception(f'failed to transcribe: {transcribe_response.output.message}')

`lazyllm.module.OnlineChatModuleBase = LazyLLMOnlineChatModuleBase` `module-attribute`

`lazyllm.module.OnlineEmbeddingModuleBase`

Bases: LazyLLMOnlineBase

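OnlineEmbeddingModuleBase is the base class for managing the embedding model interfaces of open platforms. It sends text to the platform and returns the embedding vectors. Instantiating this class directly is not recommended; a platform-specific class should inherit from it and be instantiated instead.

To support the embedding models of a new open platform, have your custom class inherit from OnlineEmbeddingModuleBase:

If the new platform's request and response formats are the same as OpenAI's, no extra handling is needed; pass only the URL and the model name.
If the new platform's request or response format differs from OpenAI's, override the _encapsulated_data or _parse_response method.
Register the new platform's api_key as a global configuration item with lazyllm.config.add(variable name, type, default value, environment variable name).

Parameters:

embed_url (str) –

URL of the embedding API.
api_key (str) –

API access key.
embed_model_name (str) –

Name of the embedding model.
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.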
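When a `url` is passed to `forward` at call time, the source listing for this class normalizes it before use: a leading `!` marks a raw URL that is used verbatim, a URL that already ends in a known suffix is left alone, and anything else gets `embeddings` (or `rerank`) appended. The standalone sketch below restates that rule with the standard library only. `resolve_embed_url` is a hypothetical helper name and the example.com URLs are placeholders.

```python
from urllib.parse import urljoin

RAW_URL_PREFIX = '!'
EMBED_SUFFIXES = ('embeddings', 'embed', 'sparse_embed', 'rerank')


def resolve_embed_url(url: str, module_type: str = 'EMBED') -> str:
    """Turn a base URL into a full endpoint URL (hypothetical helper)."""
    # A leading '!' means: use the rest of the string exactly as given.
    if url.startswith(RAW_URL_PREFIX):
        return url[len(RAW_URL_PREFIX):]
    # URLs that already end in a known endpoint suffix are returned unchanged.
    if any(url.rstrip('/').endswith(suffix) for suffix in EMBED_SUFFIXES):
        return url
    suffix = 'rerank' if module_type == 'RERANK' else 'embeddings'
    # Ensure a trailing slash so urljoin appends instead of replacing a segment.
    if not url.endswith('/'):
        url += '/'
    return urljoin(url, suffix)


print(resolve_embed_url('https://example.com/v1'))
print(resolve_embed_url('!https://example.com/custom/path'))
```

The raw prefix is useful for platforms whose embedding endpoint does not end in one of the recognized suffixes.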

Examples:

>>> import lazyllm
>>> from typing import Any, Dict, List, Union
>>> from lazyllm.module import OnlineEmbeddingModuleBase
>>> class NewPlatformEmbeddingModule(OnlineEmbeddingModuleBase):
...     def __init__(self,
...                 embed_url: str = '<new platform embedding url>',
...                 embed_model_name: str = '<new platform embedding model name>'):
...         super().__init__(embed_url, lazyllm.config['new_platform_api_key'], embed_model_name)
...
>>> class NewPlatformEmbeddingModule1(OnlineEmbeddingModuleBase):
...     def __init__(self,
...                 embed_url: str = '<new platform embedding url>',
...                 embed_model_name: str = '<new platform embedding model name>'):
...         super().__init__(embed_url, lazyllm.config['new_platform_api_key'], embed_model_name)
...
...     def _encapsulated_data(self, input: Union[List, str], **kwargs):
...         json_data = {}  # build the platform-specific request body here
...         return json_data
...
...     def _parse_response(self, response: Dict[str, Any], input: Union[List, str]):
...         embedding = []  # extract the embedding(s) from the platform-specific response here
...         return embedding
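For orientation when overriding `_encapsulated_data` or `_parse_response`, the sketch below restates the default OpenAI-style behavior from the source listing for this class as two standalone functions. It is an approximation for illustration, not the library code: `encapsulate` and `parse_response` are hypothetical names, and `ValueError` is used where the source raises a plain `Exception`.

```python
from typing import Any, Dict, List, Union


def encapsulate(input: Union[List[str], str], model: str, batch_size: int = 32,
                **kwargs) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Build one request body for a string, or one body per batch for a list."""
    if isinstance(input, str):
        return {'input': [input], 'model': model, **kwargs}
    batches = [input[i:i + batch_size] for i in range(0, len(input), batch_size)]
    return [{'input': texts, 'model': model, **kwargs} for texts in batches]


def parse_response(response: Dict[str, Any],
                   input: Union[List[str], str]) -> Union[List[float], List[List[float]]]:
    """Return one vector for a string input, or a list of vectors for a list input."""
    data = response.get('data', [])
    if not data:
        raise ValueError('no data received')
    if isinstance(input, str):
        return data[0].get('embedding', [])
    return [item.get('embedding', []) for item in data]


# Three texts with batch_size=2 produce two request bodies.
print(encapsulate(['a', 'b', 'c'], 'demo-model', batch_size=2))
```

A string input yields a single dict and a single vector; a list input yields a list of request bodies, which is what routes `forward` to `run_embed_batch`.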

Source code in lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

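class OnlineEmbeddingModuleBase(LazyLLMOnlineBase):
    """OnlineEmbeddingModuleBase is the base class for managing the embedding model interfaces of open platforms. It sends text to the platform and returns the embedding vectors. Instantiating this class directly is not recommended; a platform-specific class should inherit from it and be instantiated instead.

To support the embedding models of a new open platform, have your custom class inherit from OnlineEmbeddingModuleBase:

1. If the new platform's request and response formats are the same as OpenAI's, no extra handling is needed; pass only the URL and the model name.
2. If the new platform's request or response format differs from OpenAI's, override the _encapsulated_data or _parse_response method.
3. Register the new platform's api_key as a global configuration item with lazyllm.config.add(variable name, type, default value, environment variable name).

Args:
    embed_url (str): URL of the embedding API.
    api_key (str): API access key.
    embed_model_name (str): Name of the embedding model.
    return_trace (bool, optional): Whether to return trace information. Defaults to False.


Examples:
    >>> import lazyllm
    >>> from typing import Any, Dict, List, Union
    >>> from lazyllm.module import OnlineEmbeddingModuleBase
    >>> class NewPlatformEmbeddingModule(OnlineEmbeddingModuleBase):
    ...     def __init__(self,
    ...                 embed_url: str = '<new platform embedding url>',
    ...                 embed_model_name: str = '<new platform embedding model name>'):
    ...         super().__init__(embed_url, lazyllm.config['new_platform_api_key'], embed_model_name)
    ...
    >>> class NewPlatformEmbeddingModule1(OnlineEmbeddingModuleBase):
    ...     def __init__(self,
    ...                 embed_url: str = '<new platform embedding url>',
    ...                 embed_model_name: str = '<new platform embedding model name>'):
    ...         super().__init__(embed_url, lazyllm.config['new_platform_api_key'], embed_model_name)
    ...
    ...     def _encapsulated_data(self, input: Union[List, str], **kwargs):
    ...         json_data = {}  # build the platform-specific request body here
    ...         return json_data
    ...
    ...     def _parse_response(self, response: Dict[str, Any], input: Union[List, str]):
    ...         embedding = []  # extract the embedding(s) from the platform-specific response here
    ...         return embedding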
    """
    NO_PROXY = True
    __lazyllm_registry_disable__ = True
    RAW_URL_PREFIX = '!'
    _EMBED_SUFFIXES = ('embeddings', 'embed', 'sparse_embed', 'rerank')

    def __init__(self, embed_url: str, api_key: str, embed_model_name: str, skip_auth: bool = False,
                 return_trace: bool = False, batch_size: int = 32, num_worker: int = 1, timeout: int = 60):
        super().__init__(api_key=api_key, skip_auth=skip_auth, return_trace=return_trace)
        self._embed_url = embed_url
        self._embed_model_name = embed_model_name
        self._batch_size = batch_size
        self._num_worker = num_worker
        self._timeout = timeout

        if ModelManager.get_model_type(embed_model_name) == 'rerank':
            self._batch_size = 1
        else:
            self._batch_size = batch_size

        if hasattr(self, '_set_embed_url'): self._set_embed_url()

    @property
    def type(self):
        return 'EMBED'

    def _normalize_embed_url(self, url: str) -> Tuple[str, bool]:
        if url.startswith(self.RAW_URL_PREFIX):
            return url[len(self.RAW_URL_PREFIX):], True
        if any(url.rstrip('/').endswith(s) for s in self._EMBED_SUFFIXES):
            return url, True
        return url, False

    def _get_embed_url(self, url: str) -> str:
        url, done = self._normalize_embed_url(url)
        if done: return url
        suffix = 'rerank' if self.type == 'RERANK' else 'embeddings'
        if not url.endswith('/'): url += '/'
        return urljoin(url, suffix)

    @property
    def batch_size(self):
        return self._batch_size

    @batch_size.setter
    def batch_size(self, value: int):
        self._batch_size = value

    def forward(self, input: Union[List, str], url: str = None, model: str = None, **kwargs
                ) -> Union[List[float], List[List[float]]]:
        model, _, url, kwargs = resolve_online_params(
            model, None, url, kwargs,
            model_aliases=('model_name', 'embed_model_name', 'embed_name'),
            url_aliases=('base_url', 'embed_url'))
        runtime_url = self._get_embed_url(url) if url else self._embed_url
        runtime_model = model or self._embed_model_name

        if runtime_model is not None:
            kwargs['model'] = runtime_model
        data = self._encapsulated_data(input, **kwargs)
        proxies = {'http': None, 'https': None} if self.NO_PROXY else None
        if isinstance(data, list):
            return self.run_embed_batch(input, data, proxies, runtime_url, **kwargs)
        else:
            with requests.post(runtime_url, json=data, headers=self._header, proxies=proxies,
                               timeout=self._timeout) as r:
                if r.status_code == 200:
                    return self._parse_response(r.json(), input=input)
                else:
                    err_body = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                    LOG.error(f'[OnlineEmbeddingModuleBase] HTTP {r.status_code} url={runtime_url!r} body={err_body!r}')
                    raise requests.RequestException(err_body)

    def _encapsulated_data(self, input: Union[List, str], **kwargs):
        if isinstance(input, str):
            json_data = {
                'input': [input],
                'model': self._embed_model_name
            }
            if len(kwargs) > 0:
                json_data.update(kwargs)
            return json_data
        else:
            text_batch = [input[i: i + self._batch_size] for i in range(0, len(input), self._batch_size)]
            json_data = [{'input': texts, 'model': self._embed_model_name} for texts in text_batch]
            if len(kwargs) > 0:
                for i in range(len(json_data)):
                    json_data[i].update(kwargs)
            return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
        data = response.get('data', [])
        if not data:
            raise Exception('no data received')
        if isinstance(input, str):
            return data[0].get('embedding', [])
        else:
            return [res.get('embedding', []) for res in data]

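    def run_embed_batch(self, input: List, data: List, proxies, url: str = None, **kwargs):
        """Internal method that performs batched embedding.

This method handles batched text embedding requests in either single-threaded or multi-threaded mode.
When a request fails, it halves the batch size and retries; it raises immediately if the batch size is already 1 or the status code is 401 or 429.

Args:
    input (List): The original list of input texts.
    data (List): The list of prepared batch request payloads.
    proxies: Proxy settings passed to requests. forward passes {'http': None, 'https': None} when NO_PROXY is True, and None otherwise.
    url (str, optional): Full endpoint URL for this request. Defaults to the embed_url given at initialization.
    **kwargs: Additional keyword arguments.

**Returns:**

- A list of embedding vectors, one per input text.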
"""
        ret = [[] for _ in range(len(input))]
        flag = False
        url = url or self._embed_url
        if self._num_worker == 1:
            with requests.Session() as session:
                while not flag:
                    for i in range(len(data)):
                        r = session.post(url, json=data[i], headers=self._header,
                                         proxies=proxies, timeout=self._timeout)
                        if r.status_code == 200:
                            vec = self._parse_response(r.json(), input=input)
                            start = i * self._batch_size
                            ret[start: start + len(vec)] = vec
                            if i == len(data) - 1:
                                flag = True
                        else:
                            error_msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                            if self._batch_size == 1 or r.status_code in [401, 429]:
                                raise requests.RequestException(error_msg)
                            else:
                                msg = f'Online embedding:{self._embed_model_name} post failed, adjust batch_size: '
                                msg = msg + f' from {self._batch_size} to {max(self._batch_size // 2, 1)}'
                                LOG.warning(msg)
                                self._batch_size = max(self._batch_size // 2, 1)
                                data = self._encapsulated_data(input, **kwargs)
                                break
        else:
            with ThreadPoolExecutor(max_workers=self._num_worker) as executor:
                while not flag:
                    futures = [executor.submit(requests.post, url, json=t, headers=self._header,
                                               proxies=proxies, timeout=self._timeout) for t in data]
                    fut_to_index = {fut: idx for idx, fut in enumerate(futures)}
                    for fut in as_completed(futures):
                        r = fut.result()
                        i = fut_to_index.pop(fut)
                        if r.status_code == 200:
                            vec = self._parse_response(r.json(), input=input)
                            start = i * self._batch_size
                            ret[start: start + len(vec)] = vec
                            if len(fut_to_index) == 0:
                                flag = True
                        else:
                            wait(futures)
                            error_msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                            if self._batch_size == 1 or r.status_code in [401, 429]:
                                raise requests.RequestException(error_msg)
                            else:
                                msg = f'Online embedding:{self._embed_model_name} post failed, adjust batch_size: '
                                msg = msg + f' from {self._batch_size} to {max(self._batch_size // 2, 1)}'
                                LOG.warning(msg)
                                self._batch_size = max(self._batch_size // 2, 1)
                                data = self._encapsulated_data(input, **kwargs)
                                break
        return ret

`run_embed_batch(input, data, proxies, url=None, **kwargs)`

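Internal method that performs batched embedding.

This method handles batched text embedding requests in either single-threaded or multi-threaded mode. When a request fails, it halves the batch size and retries; it raises immediately if the batch size is already 1 or the status code is 401 or 429.

Parameters:

input (List) –

The original list of input texts.
data (List) –

The list of prepared batch request payloads.
proxies –

Proxy settings passed to requests. forward passes {'http': None, 'https': None} when NO_PROXY is True, and None otherwise.
url (str, default: None ) –

Full endpoint URL for this request. Defaults to the embed_url given at initialization.
**kwargs –

Additional keyword arguments.

Returns:

A list of embedding vectors, one per input text.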
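The retry strategy can be sketched without any network access. The example below is a simplified, single-threaded approximation of the idea, not the library implementation: `embed_with_shrinking_batches` and `fake_post` are hypothetical names, and a `RuntimeError` stands in for a failed HTTP request.

```python
from typing import Callable, List


def embed_with_shrinking_batches(texts: List[str],
                                 post: Callable[[List[str]], List[List[float]]],
                                 batch_size: int = 32) -> List[List[float]]:
    """Embed texts batch by batch, halving the batch size whenever a batch fails."""
    results: List[List[float]] = [[] for _ in texts]
    while True:
        batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
        try:
            for index, batch in enumerate(batches):
                vectors = post(batch)
                start = index * batch_size
                # Write each batch's vectors into the slots of its input texts.
                results[start:start + len(vectors)] = vectors
            return results
        except RuntimeError:
            if batch_size == 1:
                raise  # nothing left to shrink, give up
            batch_size = max(batch_size // 2, 1)  # re-split all inputs and start over


def fake_post(batch: List[str]) -> List[List[float]]:
    """Stand-in for the HTTP call: rejects batches with more than two texts."""
    if len(batch) > 2:
        raise RuntimeError('batch too large')
    return [[float(len(text))] for text in batch]


vectors = embed_with_shrinking_batches(['a', 'bb', 'ccc', 'dddd', 'eeeee'], fake_post, batch_size=4)
print(vectors)
```

Here the first attempt with a batch size of 4 fails, the batch size drops to 2, and all five texts are then embedded in three requests. As in the method described above, a failure restarts the whole pass with re-split batches instead of retrying only the failed batch.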

Source code in lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

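    def run_embed_batch(self, input: List, data: List, proxies, url: str = None, **kwargs):
        """Internal method that performs batched embedding.

This method handles batched text embedding requests in either single-threaded or multi-threaded mode.
When a request fails, it halves the batch size and retries; it raises immediately if the batch size is already 1 or the status code is 401 or 429.

Args:
    input (List): The original list of input texts.
    data (List): The list of prepared batch request payloads.
    proxies: Proxy settings passed to requests. forward passes {'http': None, 'https': None} when NO_PROXY is True, and None otherwise.
    url (str, optional): Full endpoint URL for this request. Defaults to the embed_url given at initialization.
    **kwargs: Additional keyword arguments.

**Returns:**

- A list of embedding vectors, one per input text.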
"""
        ret = [[] for _ in range(len(input))]
        flag = False
        url = url or self._embed_url
        if self._num_worker == 1:
            with requests.Session() as session:
                while not flag:
                    for i in range(len(data)):
                        r = session.post(url, json=data[i], headers=self._header,
                                         proxies=proxies, timeout=self._timeout)
                        if r.status_code == 200:
                            vec = self._parse_response(r.json(), input=input)
                            start = i * self._batch_size
                            ret[start: start + len(vec)] = vec
                            if i == len(data) - 1:
                                flag = True
                        else:
                            error_msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                            if self._batch_size == 1 or r.status_code in [401, 429]:
                                raise requests.RequestException(error_msg)
                            else:
                                msg = f'Online embedding:{self._embed_model_name} post failed, adjust batch_size: '
                                msg = msg + f' from {self._batch_size} to {max(self._batch_size // 2, 1)}'
                                LOG.warning(msg)
                                self._batch_size = max(self._batch_size // 2, 1)
                                data = self._encapsulated_data(input, **kwargs)
                                break
        else:
            with ThreadPoolExecutor(max_workers=self._num_worker) as executor:
                while not flag:
                    futures = [executor.submit(requests.post, url, json=t, headers=self._header,
                                               proxies=proxies, timeout=self._timeout) for t in data]
                    fut_to_index = {fut: idx for idx, fut in enumerate(futures)}
                    for fut in as_completed(futures):
                        r = fut.result()
                        i = fut_to_index.pop(fut)
                        if r.status_code == 200:
                            vec = self._parse_response(r.json(), input=input)
                            start = i * self._batch_size
                            ret[start: start + len(vec)] = vec
                            if len(fut_to_index) == 0:
                                flag = True
                        else:
                            wait(futures)
                            error_msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                            if self._batch_size == 1 or r.status_code in [401, 429]:
                                raise requests.RequestException(error_msg)
                            else:
                                msg = f'Online embedding:{self._embed_model_name} post failed, adjust batch_size: '
                                msg = msg + f' from {self._batch_size} to {max(self._batch_size // 2, 1)}'
                                LOG.warning(msg)
                                self._batch_size = max(self._batch_size // 2, 1)
                                data = self._encapsulated_data(input, **kwargs)
                                break
        return ret

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoEmbed`

Bases: LazyLLMOnlineEmbedModuleBase

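Doubao embedding class, derived from OnlineEmbeddingModuleBase, which wraps calls to the Doubao online text embedding service.
Given the service endpoint URL, the model name, and an API key, it retrieves vector representations of text remotely.

Parameters:

embed_url (Optional[str], default: None ) –

Endpoint URL of the Doubao text embedding service. Defaults to the Beijing-region endpoint, "https://ark.cn-beijing.volces.com/api/v3/embeddings".
embed_model_name (Optional[str], default: None ) –

Name of the Doubao embedding model to use. Defaults to "doubao-embedding-text-240715".
api_key (Optional[str], default: None ) –

API key for the Doubao service. If not provided, it is read from the lazyllm configuration.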

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py

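class DoubaoEmbed(LazyLLMOnlineEmbedModuleBase):
    """Doubao embedding class, derived from OnlineEmbeddingModuleBase, which wraps calls to the Doubao online text embedding service.
Given the service endpoint URL, the model name, and an API key, it retrieves vector representations of text remotely.

Args:
    embed_url (Optional[str]): Endpoint URL of the Doubao text embedding service. Defaults to the Beijing-region endpoint, "https://ark.cn-beijing.volces.com/api/v3/embeddings".
    embed_model_name (Optional[str]): Name of the Doubao embedding model to use. Defaults to "doubao-embedding-text-240715".
    api_key (Optional[str]): API key for the Doubao service. If not provided, it is read from the lazyllm configuration.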
"""
    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: Optional[str] = None,
                 api_key: str = None,
                 batch_size: int = 16,
                 **kw):
        embed_url = embed_url or 'https://ark.cn-beijing.volces.com/api/v3/embeddings'
        embed_model_name = embed_model_name or 'doubao-embedding-text-240715'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name,
                         batch_size=batch_size, **kw)

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoMultimodalEmbed`

Bases: LazyLLMOnlineMultimodalEmbedModuleBase

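Doubao multimodal embedding class, derived from OnlineEmbeddingModuleBase, which wraps calls to the Doubao online multimodal (text + image) embedding service.
It converts text and image inputs into a single unified vector representation, retrieved remotely given the service endpoint URL, the model name, and an API key.

Parameters:

embed_url (Optional[str], default: None ) –

Endpoint URL of the Doubao multimodal embedding service. Defaults to the Beijing-region endpoint, "https://ark.cn-beijing.volces.com/api/v3/embeddings/multimodal".
embed_model_name (Optional[str], default: None ) –

Name of the Doubao multimodal embedding model to use. If not provided, it falls back first to lazyllm.config['doubao_multimodal_embed_model_name'] and then to "doubao-embedding-vision-241215".
api_key (Optional[str], default: None ) –

API key for the Doubao service. If not provided, it is read from the lazyllm configuration.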
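The input rules enforced by the source listing for this class (a plain string, or a list of one or two items) and the single fused vector in the response can be sketched as standalone functions. `build_multimodal_payload` and `parse_fused_embedding` are hypothetical names, the model name is a placeholder, and the item dictionaries use only the `text` key seen in the source; the shape of image items is not shown here.

```python
from typing import Any, Dict, List, Union


def build_multimodal_payload(input: Union[str, List[Dict[str, Any]]], model: str,
                             **kwargs) -> Dict[str, Any]:
    """Validate the input and build the request body (hypothetical helper)."""
    if isinstance(input, str):
        # A bare string is wrapped as a single text item.
        input = [{'text': input}]
    elif isinstance(input, list):
        if len(input) == 0:
            raise ValueError('Input list cannot be empty')
        if len(input) > 2:
            raise ValueError('Input list must contain at most 2 items (1 text and/or 1 image)')
    else:
        raise ValueError('Input must be either a string or a list of dictionaries')
    payload: Dict[str, Any] = {'input': input, 'model': model}
    payload.update(kwargs)
    return payload


def parse_fused_embedding(response: Dict[str, Any]) -> List[float]:
    """The service returns one fused embedding under data.embedding."""
    return response['data']['embedding']


print(build_multimodal_payload('a photo of a cat', 'demo-vision-model'))
```

Unlike the OpenAI-style base class, `data` here is a single object, not a list, so the result is always one vector regardless of how many items were sent.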

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py

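class DoubaoMultimodalEmbed(LazyLLMOnlineMultimodalEmbedModuleBase):
    """DoubaoMultimodalEmbed wraps the Doubao online multimodal (text and image) embedding service. It inherits from LazyLLMOnlineMultimodalEmbedModuleBase.  
It converts text and image inputs into a single fused vector representation, fetched from the remote service given a service URL, a model name, and an API key.

Args:
    embed_url (Optional[str]): URL of the Doubao multimodal embedding endpoint. If omitted, the Beijing-region endpoint is used.
    embed_model_name (Optional[str]): Name of the Doubao multimodal embedding model. If omitted, the 'doubao_multimodal_embed_model_name' config entry is used, falling back to "doubao-embedding-vision-241215".
    api_key (Optional[str]): API key for the Doubao service. If omitted, it is read from the lazyllm configuration.
"""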
    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: str = None,
                 api_key: str = None):
        embed_url = embed_url or 'https://ark.cn-beijing.volces.com/api/v3/embeddings/multimodal'
        embed_model_name = (embed_model_name or lazyllm.config['doubao_multimodal_embed_model_name']
                            or 'doubao-embedding-vision-241215')
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name)

    def _encapsulated_data(self, input: Union[List, str], **kwargs) -> Dict[str, str]:
        if isinstance(input, str):
            input = [{'text': input}]
        elif isinstance(input, list):
            # Validate input format, at most 1 text segment + 1 image
            if len(input) == 0:
                raise ValueError('Input list cannot be empty')
            if len(input) > 2:
                raise ValueError('Input list must contain at most 2 items (1 text and/or 1 image)')
        else:
            raise ValueError('Input must be either a string or a list of dictionaries')

        json_data = {
            'input': input,
            'model': self._embed_model_name
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[float]:
        # Doubao multimodal embedding returns a single fused embedding
        return response['data']['embedding']
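
The input checks in `_encapsulated_data` can be tried without any network access. The sketch below is a standalone re-statement of that logic under the assumption that only the shape of the request body matters; `build_multimodal_payload` is a hypothetical name, and the exact item format expected by the Doubao service for images is not shown here.

```python
from typing import Any, Dict, List, Union


def build_multimodal_payload(input: Union[str, List[Dict[str, Any]]],
                             model: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical standalone version of the `_encapsulated_data` checks."""
    if isinstance(input, str):
        # A bare string is wrapped as a single text item.
        input = [{'text': input}]
    elif isinstance(input, list):
        if len(input) == 0:
            raise ValueError('Input list cannot be empty')
        if len(input) > 2:
            raise ValueError('Input list must contain at most 2 items (1 text and/or 1 image)')
    else:
        raise ValueError('Input must be either a string or a list of dictionaries')

    payload: Dict[str, Any] = {'input': input, 'model': model}
    # Extra keyword arguments are merged into the request body as-is.
    payload.update(kwargs)
    return payload
```

Note that the list branch only checks the number of items; it does not inspect what each item contains.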

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMChat`

Bases: OnlineChatModuleBase, FileHandlerBase

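GLMChat wraps online access to the Zhipu GLM model family. It inherits from OnlineChatModuleBase and FileHandlerBase.
It supports chat generation, fine-tuning data file handling, and model fine-tuning. The default model is GLM-4; trainable models such as chatglm3-6b and chatglm_12b can also be specified.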

Parameters:

base_url (Optional[str], default: None ) –

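Base URL of the Zhipu GLM API. Defaults to "https://open.bigmodel.cn/api/paas/v4/".
model (Optional[str], default: None ) –

Name of the GLM model. If omitted, the 'glm_model_name' entry of the lazyllm configuration is used, falling back to "glm-4". Models listed in TRAINABLE_MODEL_LIST can also be chosen.
api_key (Optional[str], default: None ) –

API key for the GLM service. If omitted, it is read from the lazyllm configuration.
stream (Optional[bool], default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (Optional[bool], default: False ) –

Whether to return debug trace information. Defaults to False.
**kwargs –

Additional optional arguments passed to OnlineChatModuleBase.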

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py

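class GLMChat(OnlineChatModuleBase, FileHandlerBase):
    """GLMChat wraps online access to the Zhipu GLM model family. It inherits from OnlineChatModuleBase and FileHandlerBase.  
It supports chat generation, fine-tuning data file handling, and model fine-tuning. The default model is GLM-4; trainable models such as chatglm3-6b and chatglm_12b can also be specified.

Args:
    base_url (Optional[str]): Base URL of the Zhipu GLM API. Defaults to "https://open.bigmodel.cn/api/paas/v4/".
    model (Optional[str]): Name of the GLM model. If omitted, the 'glm_model_name' config entry is used, falling back to "glm-4". Models listed in TRAINABLE_MODEL_LIST can also be chosen.
    api_key (Optional[str]): API key for the GLM service. If omitted, it is read from the lazyllm configuration.
    stream (Optional[bool]): Whether to enable streaming output. Defaults to True.
    return_trace (Optional[bool]): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional optional arguments passed to OnlineChatModuleBase.
"""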
    TRAINABLE_MODEL_LIST = ['chatglm3-6b', 'chatglm_12b', 'chatglm_32b', 'chatglm_66b', 'chatglm_130b']
    VLM_MODEL_PREFIX = ['glm-4.5v', 'glm-4.1v', 'glm-4v']
    MODEL_NAME = 'glm-4'

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://open.bigmodel.cn/api/paas/v4/'
        super().__init__(api_key=api_key or self._default_api_key(),
                         model_name=model or lazyllm.config['glm_model_name'] or GLMChat.MODEL_NAME,
                         base_url=base_url, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self.default_train_data = {
            'model': None,
            'training_file': None,
            'validation_file': None,
            'extra_hyperparameters': {
                'fine_tuning_method': None,  # lora\full, default: lora,
                'fine_tuning_parameters': {
                    'max_sequence_length': None  # [1, 8192](int), default: 8192
                }
            },
            'hyperparameters': {
                'learning_rate_multiplier': 0.01,  # (0,5] , default: 1.0
                'batch_size': None,  # [1, 32], default: 8
                'n_epochs': 1,  # [1, 10], default: 3
            },
            'suffix': None,
            'request_id': None
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return ('You are ChatGLM, an AI assistant developed based on a language model trained by Zhipu AI. '
                'Your task is to provide appropriate responses and support for user\'s questions and requests.')

    def _get_models_list(self):
        return ['glm-4', 'glm-4v', 'glm-3-turbo', 'chatglm-turbo', 'cogview-3', 'embedding-2', 'text-embedding']

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        url = urljoin(self._base_url, 'files')
        self.get_finetune_data(train_file)

        file_object = {
            'purpose': (None, 'fine-tune', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        with requests.post(url, headers=self._get_empty_header(), files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            # delete temporary training file
            self._dataHandler.close()
            return r.json()['id']

    def _update_kw(self, data, normal_config):
        cur_data = self.default_train_data.copy()
        cur_data.update(data)

        cur_data['extra_hyperparameters']['fine_tuning_method'] = normal_config['finetuning_type'].strip().lower()
        cur_data['extra_hyperparameters']['fine_tuning_parameters']['max_sequence_length'] = normal_config['cutoff_len']
        cur_data['hyperparameters']['learning_rate_multiplier'] = normal_config['learning_rate']
        cur_data['hyperparameters']['batch_size'] = normal_config['batch_size']
        cur_data['hyperparameters']['n_epochs'] = normal_config['num_epochs']
        cur_data['suffix'] = str(uuid.uuid4())[:7]
        return cur_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine_tuning/jobs')
        data = {'model': train_model, 'training_file': train_file_id}
        if len(kw) > 0:
            if 'finetuning_type' in kw:
                data = self._update_kw(data, kw)
            else:
                data.update(kw)

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = self._status_mapping(r.json()['status'])
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/cancel')
        with requests.post(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['status']
        if status == 'cancelled':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = os.path.join(self._base_url, 'fine_tuning/jobs/')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        for model in model_data['data']:
            res.append([model['id'], model['fine_tuned_model'], self._status_mapping(model['status'])])
        return res

    def _status_mapping(self, status):
        if status == 'succeeded':
            return 'Done'
        elif status == 'failed':
            return 'Failed'
        elif status == 'cancelled':
            return 'Cancelled'
        elif status == 'running':
            return 'Running'
        else:  # create, validating_files, queued
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/events')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{fine_tuning_job_id}')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        fine_tuned_model = info['fine_tuned_model'] if 'fine_tuned_model' in info else None
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'trained_tokens' in info and info['trained_tokens']:
            return info['trained_tokens']
        else:
            return None

    def _create_deployment(self) -> Tuple[str]:
        return (self._model_name, 'RUNNING')

    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'
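
The conversion step `_convert_file_format` above keeps only `system`, `user`, and `assistant` messages and reduces each message to its `role` and `content` keys. A minimal in-memory sketch of the same filtering follows; `filter_messages_jsonl` is a hypothetical function that takes JSONL lines instead of a file path so it can be run without touching disk.

```python
import json
from typing import Iterable

# Roles kept by GLMChat._convert_file_format in the source above.
ALLOWED_ROLES = ('system', 'user', 'assistant')


def filter_messages_jsonl(lines: Iterable[str]) -> str:
    """Hypothetical in-memory variant of `_convert_file_format`."""
    out = []
    for line in lines:
        example = json.loads(line)
        # Drop messages with other roles and strip any extra keys.
        kept = [{'role': m.get('role', ''), 'content': m.get('content', '')}
                for m in example.get('messages', [])
                if m.get('role', '') in ALLOWED_ROLES]
        out.append(json.dumps({'messages': kept}, ensure_ascii=False))
    return '\n'.join(out)
```

Messages with the `knowledge` role pass the FileHandlerBase validation but are dropped by this conversion, which follows from the role list used in the source.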

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMText2Image`

Bases: LazyLLMOnlineText2ImageModuleBase

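GLMText2Image is the GLM text-to-image module. It inherits from LazyLLMOnlineText2ImageModuleBase and wraps image generation with the GLM CogView-4 model.
It generates a specified number of images at a specified resolution from a text prompt, calling the remote service with an API key.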

Parameters:

model_name (Optional[str], default: None ) –

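Name of the GLM model. Defaults to "cogview-4-250304" or the 'glm_text_to_image_model_name' entry of the lazyllm configuration.
api_key (Optional[str], default: None ) –

API key for the GLM image generation service.
return_trace (bool, default: False ) –

Whether to return debug trace information. Defaults to False.
base_url (Optional[str], default: None ) –

Base URL of the GLM API. Defaults to 'https://open.bigmodel.cn/api/paas/v4'.
**kwargs –

Additional arguments passed to the base class LazyLLMOnlineText2ImageModuleBase.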

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py

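class GLMText2Image(LazyLLMOnlineText2ImageModuleBase):
    """GLMText2Image is the GLM text-to-image module. It inherits from LazyLLMOnlineText2ImageModuleBase and wraps image generation with the GLM CogView-4 model.  
It generates a specified number of images at a specified resolution from a text prompt, calling the remote service with an API key.

Args:
    model_name (Optional[str]): Name of the GLM model. Defaults to "cogview-4-250304" or the 'glm_text_to_image_model_name' config entry.
    api_key (Optional[str]): API key for the GLM image generation service.
    return_trace (bool): Whether to return debug trace information. Defaults to False.
    base_url (Optional[str]): Base URL of the GLM API. Defaults to 'https://open.bigmodel.cn/api/paas/v4'.
    **kwargs: Additional arguments passed to the base class LazyLLMOnlineText2ImageModuleBase.
"""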
    MODEL_NAME = 'cogview-4-250304'

    def __init__(self, model_name: str = None, api_key: str = None, return_trace: bool = False,
                 base_url: Optional[str] = None, **kwargs):
        base_url = base_url or 'https://open.bigmodel.cn/api/paas/v4'
        super().__init__(model_name=model_name or GLMText2Image.MODEL_NAME, api_key=api_key or self._default_api_key(),
                         return_trace=return_trace, base_url=base_url, **kwargs)
        if self._type == LLMType.IMAGE_EDITING:
            raise ValueError('GLM series models do not support image editing now.')

    def _forward(self, input: str = None, n: int = 1, size: str = '1024x1024',
                 url: str = None, model: str = None, **kwargs):
        runtime_url = url or self._base_url
        runtime_model = model or self._model_name
        client = _zhipu_client(base_url=runtime_url, api_key=self._api_key)
        call_params = {
            'model': runtime_model,
            'prompt': input,
            'n': n,
            'size': size,
            **kwargs
        }
        response = client.images.generations(**call_params)
        return encode_query_with_filepaths(None, bytes_to_file([requests.get(result.url).content
                                                                for result in response.data]))
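
The way `_forward` above assembles its request can be shown in isolation: a per-call `model` wins over the instance default, and any extra keyword arguments are appended to the request. This is only a sketch; `build_call_params` is a hypothetical helper and no client call is made.

```python
from typing import Any, Dict, Optional


def build_call_params(prompt: str, default_model: str, model: Optional[str] = None,
                      n: int = 1, size: str = '1024x1024', **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper mirroring the `call_params` dict in `_forward`."""
    return {
        'model': model or default_model,  # per-call override, else instance default
        'prompt': prompt,
        'n': n,
        'size': size,
        **kwargs,  # extra options are passed through unchanged
    }
```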

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenText2Image`

Bases: LazyLLMOnlineText2ImageModuleBase

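QwenText2Image is the Qwen text-to-image and image editing module. It inherits from LazyLLMOnlineText2ImageModuleBase and wraps image generation with the Qwen wanx2.1-t2i-turbo model and image editing with the qwen-image-edit-plus model.
It generates a specified number of images at a specified resolution from a text prompt, supports image editing, and accepts a negative prompt, a random seed, and prompt extension. The service is called remotely through the DashScope API.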

Parameters:

model (Optional[str], default: None ) –

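Name of the Qwen model. By default it is read from the 'qwen_text2image_model_name' entry of the lazyllm configuration, falling back to "wanx2.1-t2i-turbo".
api_key (Optional[str], default: None ) –

API key for the DashScope service.
return_trace (bool, default: False ) –

Whether to return debug trace information. Defaults to False.
**kwargs –

Additional arguments passed to LazyLLMOnlineText2ImageModuleBase.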

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

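class QwenText2Image(LazyLLMOnlineText2ImageModuleBase):
    """QwenText2Image is the Qwen text-to-image and image editing module. It inherits from LazyLLMOnlineText2ImageModuleBase and wraps image generation with the Qwen wanx2.1-t2i-turbo model and image editing with the qwen-image-edit-plus model.  
It generates a specified number of images at a specified resolution from a text prompt, supports image editing, and accepts a negative prompt, a random seed, and prompt extension. The service is called remotely through the DashScope API.

Args:
    model (Optional[str]): Name of the Qwen model. By default it is read from the 'qwen_text2image_model_name' config entry, falling back to "wanx2.1-t2i-turbo".
    api_key (Optional[str]): API key for the DashScope service.
    return_trace (bool): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional arguments passed to LazyLLMOnlineText2ImageModuleBase.
"""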
    MODEL_NAME = 'wanx2.1-t2i-turbo'
    IMAGE_EDITING_MODEL_NAME = 'qwen-image-edit-plus'

    def __init__(self, model: str = None, api_key: str = None, return_trace: bool = False,
                 base_url: Optional[str] = None,
                 base_websocket_url: Optional[str] = None,
                 **kwargs):
        _ensure_dashscope_urls_initialized()
        base_url = base_url or _DASHSCOPE_DEFAULT_HTTP_URL
        base_websocket_url = base_websocket_url or _DASHSCOPE_DEFAULT_WEBSOCKET_URL
        if base_url and base_url != _DASHSCOPE_DEFAULT_HTTP_URL:
            LOG.warning('QwenText2Image ignores `base_url`; use `set_dashscope_urls` instead.')
        if base_websocket_url and base_websocket_url != _DASHSCOPE_DEFAULT_WEBSOCKET_URL:
            LOG.warning('QwenText2Image ignores `base_websocket_url`; use `set_dashscope_urls` instead.')
        super().__init__(api_key=api_key or self._default_api_key(), model_name=model,
                         return_trace=return_trace, base_url=base_url, **kwargs)

    def _call_sync_text2image(self, call_params):
        task_response = dashscope.MultiModalConversation.call(**call_params)
        if task_response.status_code != HTTPStatus.OK:
            raise RuntimeError(
                f'Failed to create image synthesis task, '
                f'status: {task_response.status_code}, message: {task_response.message}'
            )
        return task_response

    def _call_async_text2image(self, call_params):
        task_response = dashscope.ImageSynthesis.async_call(**call_params)
        if task_response.status_code != HTTPStatus.OK:
            raise RuntimeError(
                f'Failed to create image synthesis task, '
                f'status: {task_response.status_code}, message: {task_response.message}'
            )
        task_id = getattr(task_response.output, 'task_id', None)
        if not task_id:
            raise RuntimeError('No task_id returned from async image synthesis call')
        response = dashscope.ImageSynthesis.wait(task=task_id, api_key=self._api_key)
        return response

    def _extract_sync_image_urls(self, response):
        try:
            image_urls = []
            for idx, content in enumerate(response.output.choices[0].message.content):
                try:
                    image_url = content['image']
                    if image_url:
                        image_urls.append(image_url)
                except Exception as e:
                    LOG.warning(f'Failed to extract image URL from item {idx}: {str(e)}')
                    continue
            if not image_urls:
                LOG.warning('No image URLs found in content')
            return image_urls
        except Exception as e:
            LOG.error(f'Failed to extract sync image URLs: {str(e)}')
            return []

    def _extract_async_image_urls(self, response):
        try:
            output = getattr(response, 'output', None)
            if not output:
                return []
            results = getattr(output, 'results', [])
            if not results:
                LOG.warning('No results in async response output')
                return []
            return [getattr(result, 'url', None) for result in results if getattr(result, 'url', None)]
        except Exception as e:
            LOG.error(f'Failed to extract async image URLs: {str(e)}')
            return []

    def _forward(self, input: str = None, files: List[str] = None, negative_prompt: str = None, n: int = 1,
                 prompt_extend: bool = True, size: str = '1024*1024', seed: int = None,
                 url: str = None, model: str = None, **kwargs):
        has_ref_image = files is not None and len(files) > 0
        reference_image_data = None
        messages = []
        if url and url != self._base_url:
            raise Exception('Qwen Text2Image forward() does not support overriding the `url` parameter, '
                            'please remove it.')
        if 'base_websocket_url' in kwargs:
            raise Exception('Qwen Text2Image forward() does not support overriding the `base_websocket_url` parameter.')
        if self._type == LLMType.IMAGE_EDITING and not has_ref_image:
            raise ValueError(
                f'Image editing is enabled for model {self._model_name}, but no image file was provided. '
                f'Please provide an image file via the "files" parameter.'
            )
        if self._type != LLMType.IMAGE_EDITING and has_ref_image:
            raise ValueError(
                f'Image file was provided, but image editing is not enabled for model {self._model_name}. '
                f'Please use default image-editing model {self.IMAGE_EDITING_MODEL_NAME} or other image-editing model'
            )
        if has_ref_image:
            image_results = self._load_images(files)
            content = []
            for base64_str, _ in image_results:
                reference_image_data = f'data:image/png;base64,{base64_str}'
                content.append({'image': reference_image_data})
            content.append({'text': input})
            messages = [
                {
                    'role': 'user',
                    'content': content
                }
            ]

        call_params = {
            'model': model,
            'negative_prompt': negative_prompt,
            'n': n,
            'prompt_extend': prompt_extend,
            'size': size,
            **kwargs
        }
        if self._api_key: call_params['api_key'] = self._api_key
        if seed: call_params['seed'] = seed
        if has_ref_image:
            call_params['messages'] = messages
            response = self._call_sync_text2image(call_params)
            image_urls = self._extract_sync_image_urls(response)
        else:
            call_params['prompt'] = input
            response = self._call_async_text2image(call_params)
            image_urls = self._extract_async_image_urls(response)
        if response.status_code != HTTPStatus.OK:
            error_msg = getattr(response.output, 'message', 'Unknown error')
            raise Exception(f'Image generation failed: {error_msg}')
        image_results = self._load_images(image_urls)
        image_bytes = [data for _, data in image_results]
        return encode_query_with_filepaths(None, bytes_to_file(image_bytes))
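
`_forward` above chooses between two paths: image editing when reference files are supplied, plain generation otherwise, and it rejects mismatches between the model type and the presence of files. The sketch below restates that decision and the message layout for editing as standalone functions. Both `check_mode` and `build_edit_messages` are hypothetical names, and the boolean `is_editing_model` stands in for the `self._type == LLMType.IMAGE_EDITING` test.

```python
from typing import Any, Dict, List, Optional


def check_mode(is_editing_model: bool, files: Optional[List[str]]) -> str:
    """Hypothetical helper: return 'edit' or 'generate', or raise on a mismatch."""
    has_ref_image = files is not None and len(files) > 0
    if is_editing_model and not has_ref_image:
        raise ValueError('An image editing model requires at least one image file.')
    if not is_editing_model and has_ref_image:
        raise ValueError('Image files were given, but the model is not an image editing model.')
    return 'edit' if has_ref_image else 'generate'


def build_edit_messages(prompt: str, images_base64: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical helper: all images first, then the text prompt, in one user message."""
    content: List[Dict[str, str]] = [
        {'image': f'data:image/png;base64,{b64}'} for b64 in images_base64
    ]
    content.append({'text': prompt})
    return [{'role': 'user', 'content': content}]
```

As in the source, every reference image is labelled `image/png` in the data URL regardless of its actual format.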

`lazyllm.module.llms.onlinemodule.supplier.kimi.KimiChat`

Bases: OnlineChatModuleBase

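KimiChat wraps the Kimi chat service provided by Moonshot AI. It inherits from OnlineChatModuleBase.
Given an API key, a model name, and a service URL, it supports safe question answering in Chinese and English, and accepts image input in base64 format.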

Parameters:

base_url (str, default: None ) –

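Base URL of the Kimi service. Defaults to "https://api.moonshot.cn/".
model (str, default: None ) –

Name of the Kimi model. Defaults to "moonshot-v1-8k".
api_key (Optional[str], default: None ) –

API key for the Kimi service. If omitted, it is read from the lazyllm configuration.
stream (bool, default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (bool, default: False ) –

Whether to return debug trace information. Defaults to False.
**kwargs –

Additional arguments passed to OnlineChatModuleBase.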

Source code in lazyllm/module/llms/onlinemodule/supplier/kimi.py

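class KimiChat(OnlineChatModuleBase):
    """KimiChat wraps the Kimi chat service provided by Moonshot AI. It inherits from OnlineChatModuleBase.  
Given an API key, a model name, and a service URL, it supports safe question answering in Chinese and English, and accepts image input in base64 format.

Args:
    base_url (str): Base URL of the Kimi service. Defaults to "https://api.moonshot.cn/".
    model (str): Name of the Kimi model. Defaults to "moonshot-v1-8k".
    api_key (Optional[str]): API key for the Kimi service. If omitted, it is read from the lazyllm configuration.
    stream (bool): Whether to enable streaming output. Defaults to True.
    return_trace (bool): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional arguments passed to OnlineChatModuleBase.
"""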

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://api.moonshot.cn/'
        model = model or 'moonshot-v1-8k'
        super().__init__(api_key=api_key or self._default_api_key(), base_url=base_url,
                         model_name=model, stream=stream, return_trace=return_trace, **kwargs)

    def _get_system_prompt(self):
        return ('You are Kimi, an AI assistant provided by Moonshot AI. You are better at speaking '
                'Chinese and English. You will provide users with safe, helpful, and accurate answers. '
                'At the same time, you will reject all answers involving terrorism, racial discrimination, '
                'pornographic violence, etc. Moonshot AI is a proper noun and cannot be translated '
                'into other languages.')

    def _get_chat_url(self, url):
        if url.rstrip('/').endswith('v1/chat/completions'):
            return url
        return urljoin(url, 'v1/chat/completions')

    def _format_vl_chat_image_url(self, image_url, mime):
        assert not image_url.startswith('http'), 'Kimi vision model only supports base64 format'
        assert mime is not None, 'Kimi Module requires mime info.'
        image_url = f'data:{mime};base64,{image_url}'
        return [{'type': 'image_url', 'image_url': {'url': image_url}}]

    def _format_vl_chat_query(self, query: str):
        return query

    def _validate_api_key(self):
        try:
            models_url = urljoin(self._base_url, 'v1/models')
            response = requests.get(models_url, headers=self._header, timeout=10)
            return response.status_code == 200
        except Exception:
            return False
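
Two details of the class above are easy to check in isolation: how `_get_chat_url` resolves the endpoint with `urllib.parse.urljoin`, and how `_format_vl_chat_image_url` turns base64 data into a data URL. The functions below are standalone copies of that logic written for illustration; the `example.com` hosts are placeholders.

```python
from urllib.parse import urljoin


def get_chat_url(url: str) -> str:
    """Standalone copy of the `_get_chat_url` logic."""
    if url.rstrip('/').endswith('v1/chat/completions'):
        return url
    # urljoin resolves the relative path against the base URL.
    return urljoin(url, 'v1/chat/completions')


def to_data_url(image_base64: str, mime: str) -> str:
    """Standalone copy of the data URL construction."""
    return f'data:{mime};base64,{image_base64}'


print(get_chat_url('https://api.moonshot.cn/'))
print(get_chat_url('https://example.com/proxy'))
print(get_chat_url('https://example.com/proxy/'))
```

Because `urljoin` replaces the last path segment of a base URL that has no trailing slash, a custom `base_url` with a path prefix needs to end in `/` for the prefix to be kept.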

`lazyllm.module.llms.onlinemodule.fileHandler.FileHandlerBase`

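FileHandlerBase is the base class for handling fine-tuning data files. It validates fine-tuning data and converts it to the target format.
The class is not intended to be instantiated directly; subclasses must inherit from it and implement the file format conversion logic in _convert_file_format.

Features:

1. Checks that the fine-tuning data file has the .jsonl extension.
2. Checks that every message follows the expected format (contains the role and content fields).
3. Checks that every role is one of the allowed values (system, knowledge, user, assistant).
4. Checks that every conversation example contains at least one assistant reply.
5. Stores the converted data in a temporary file for later processing.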
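
The per-example checks can be sketched without the class. The function below is a simplified, standalone approximation of the checks in `_validate_json` (it omits the unrecognized-key check); `collect_format_errors` is a hypothetical name, and the error keys are taken from the source shown later in this section.

```python
from collections import defaultdict
from typing import Any, Dict, List

ROLES = ('system', 'knowledge', 'user', 'assistant')


def collect_format_errors(dataset: List[Any]) -> Dict[str, List[int]]:
    """Map each error type to the 1-based indices of the offending examples."""
    errors: Dict[str, List[int]] = defaultdict(list)
    for index, example in enumerate(dataset, start=1):
        if not isinstance(example, dict):
            errors['data_type'].append(index)
            continue
        messages = example.get('messages')
        if messages is None:
            errors['missing_messages_list'].append(index)
            continue
        for message in messages:
            if 'role' not in message or 'content' not in message:
                errors['message_missing_key'].append(index)
            if message.get('role') not in ROLES:
                errors['unrecognized_role'].append(index)
            if not isinstance(message.get('content'), str):
                errors['missing_content'].append(index)
        # Every example needs at least one assistant reply.
        if not any(m.get('role') == 'assistant' for m in messages):
            errors['example_missing_assistant_message'].append(index)
    return dict(errors)
```

In the source, these format errors are written to the log rather than raised; only a wrong file extension or a missing file raises an exception.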

Examples:

>>> import lazyllm
>>> from lazyllm.module.llms.onlinemodule.fileHandler import FileHandlerBase
>>> import tempfile
>>> import json
>>> sample_data = [
...     {"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi there!"}]},
...     {"messages": [{"role": "user", "content": "How are you?"}, {"role": "assistant", "content": "I'm doing well, thank you!"}]}
... ] 
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
...     for item in sample_data:
...         f.write(json.dumps(item, ensure_ascii=False) + '\n')
...     temp_file_path = f.name
>>> class CustomFileHandler(FileHandlerBase):
...     def _convert_file_format(self, filepath: str) -> str:
...         with open(filepath, 'r', encoding='utf-8') as f:
...             data = [json.loads(line) for line in f]
...         converted_data = []
...         for item in data:
...             messages = item.get('messages', [])
...             conversation = []
...             for msg in messages:
...                 conversation.append(f"{msg['role']}: {msg['content']}")
...             converted_data.append('\n'.join(conversation))
...         return '\n---\n'.join(converted_data)
>>> handler = CustomFileHandler()
>>> try:
...     result = handler.get_finetune_data(temp_file_path)
...     print("Data validated and converted successfully")
... except Exception as e:
...     print(f"Error: {e}")
... finally:
...     import os
...     os.unlink(temp_file_path)

Source code in lazyllm/module/llms/onlinemodule/fileHandler.py

class FileHandlerBase:
    """FileHandlerBase is the base class for handling fine-tuning data files. It validates fine-tuning data and converts it to the target format.  
The class is not intended to be instantiated directly; subclasses must inherit from it and implement the file format conversion logic in ``_convert_file_format``.

Features:
    1. Checks that the fine-tuning data file has the `.jsonl` extension.
    2. Checks that every message follows the expected format (contains the `role` and `content` fields).
    3. Checks that every role is one of the allowed values (system, knowledge, user, assistant).
    4. Checks that every conversation example contains at least one assistant reply.
    5. Stores the converted data in a temporary file for later processing.


Examples:
    >>> import lazyllm
    >>> from lazyllm.module.llms.onlinemodule.fileHandler import FileHandlerBase
    >>> import tempfile
    >>> import json
    >>> sample_data = [
    ...     {"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi there!"}]},
    ...     {"messages": [{"role": "user", "content": "How are you?"}, {"role": "assistant", "content": "I'm doing well, thank you!"}]}
    ... ]
    >>> with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
    ...     for item in sample_data:
    ...         f.write(json.dumps(item, ensure_ascii=False) + '\\n')
    ...     temp_file_path = f.name
    >>> class CustomFileHandler(FileHandlerBase):
    ...     def _convert_file_format(self, filepath: str) -> str:
    ...         with open(filepath, 'r', encoding='utf-8') as f:
    ...             data = [json.loads(line) for line in f]
    ...         converted_data = []
    ...         for item in data:
    ...             messages = item.get('messages', [])
    ...             conversation = []
    ...             for msg in messages:
    ...                 conversation.append(f"{msg['role']}: {msg['content']}")
    ...             converted_data.append('\\n'.join(conversation))
    ...         return '\\n---\\n'.join(converted_data)
    >>> handler = CustomFileHandler()
    >>> try:
    ...     result = handler.get_finetune_data(temp_file_path)
    ...     print("Data validated and converted successfully")
    ... except Exception as e:
    ...     print(f"Error: {e}")
    ... finally:
    ...     import os
    ...     os.unlink(temp_file_path)
    """

    def __init__(self):
        self._roles = ['system', 'knowledge', 'user', 'assistant']

    def _validate_json(self, data_path: str) -> None:  # noqa C901
        # Check the file name format
        if os.path.splitext(data_path)[-1] != '.jsonl':
            raise ValueError('The file name must end with .jsonl')
        # Check if the file exists
        if not os.path.exists(data_path):
            raise FileNotFoundError(f'File {data_path} does not exist.')

        # Load dataset
        with open(data_path, 'r', encoding='utf-8') as f:
            dataset = [json.loads(line) for line in f]

        # Initial dataset stats
        lazyllm.LOG.info(f'Num examples: {len(dataset)}')
        lazyllm.LOG.info('First example:')
        for message in dataset[0]['messages']:
            lazyllm.LOG.info(message)

        # Format error checks
        format_error: Dict[str, list[int]] = defaultdict(list)
        for index, line in enumerate(dataset, start=1):
            # Check if example is a dictionary type
            if not isinstance(line, dict):
                format_error['data_type'].append(index)
                continue

            messages = line.get('messages', None)
            # Check if messages keyword exists
            if messages is None:
                format_error['missing_messages_list'].append(index)
                continue

            for message in messages:
                if 'role' not in message or 'content' not in message:
                    format_error['message_missing_key'].append(index)

                if any(k not in ('role', 'content') for k in message):
                    format_error['message_unrecognized_key'].append(index)

                if message.get('role', None) not in self._roles:
                    format_error['unrecognized_role'].append(index)

                content = message.get('content', None)
                if content is None or not isinstance(content, str):
                    format_error['missing_content'].append(index)

            if not any(message.get('role', None) == 'assistant' for message in messages):
                format_error['example_missing_assistant_message'].append(index)

        if format_error:
            lazyllm.LOG.error('Found errors: ')
            for k, v in format_error.items():
                lazyllm.LOG.error(f'Error Type: {k}, Error number: {len(v)}')
                lazyllm.LOG.error(f'Error Type: {k}, Error line number: {v}')
        else:
            lazyllm.LOG.info('No errors found')

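    def get_finetune_data(self, filepath: str) -> None:
        """Validate a fine-tuning data file and convert it to the format supported by the target platform.
The converted data is stored in a temporary file (``self._dataHandler``); nothing is returned.

Args:
    filepath (str): Path to the fine-tuning data file. It must be a .jsonl file.
"""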
        self._validate_json(filepath)
        self._save_tempfile(self._convert_file_format(filepath))

    def _save_tempfile(self, data: str):
        self._dataHandler = tempfile.TemporaryFile()
        self._dataHandler.write(data.encode())
        self._dataHandler.seek(0)

    def _convert_file_format(self, filepath: str) -> str:
        raise NotImplementedError

`get_finetune_data(filepath)`

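Fetch and process a fine-tuning data file: validate its format and convert it to the format supported by the target platform.

Parameters:

filepath (str) –

Path to the fine-tuning data file; it must be in .jsonl format.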
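
The validation step can be illustrated with a small standalone sketch. It mirrors only a subset of the checks performed by `_validate_json` in the source listing, and it does not import lazyllm. The names `ROLES` and `check_examples` are hypothetical, and the role set is an assumption (the real one is stored in `self._roles`).

```python
import json
from collections import defaultdict

# Assumed role set for illustration; the library keeps its own list in self._roles.
ROLES = ('system', 'user', 'assistant')


def check_examples(lines):
    """Return {error_type: [1-based line numbers]} for a list of JSONL strings."""
    errors = defaultdict(list)
    for index, raw in enumerate(lines, start=1):
        example = json.loads(raw)
        # Every example must be a JSON object.
        if not isinstance(example, dict):
            errors['data_type'].append(index)
            continue
        messages = example.get('messages')
        if messages is None:
            errors['missing_messages_list'].append(index)
            continue
        for message in messages:
            if 'role' not in message or 'content' not in message:
                errors['message_missing_key'].append(index)
            if message.get('role') not in ROLES:
                errors['unrecognized_role'].append(index)
        # Each example needs at least one assistant turn to learn from.
        if not any(m.get('role') == 'assistant' for m in messages):
            errors['example_missing_assistant_message'].append(index)
    return dict(errors)


good = json.dumps({'messages': [{'role': 'user', 'content': 'hi'},
                                {'role': 'assistant', 'content': 'hello'}]})
no_assistant = json.dumps({'messages': [{'role': 'user', 'content': 'hi'}]})
report = check_examples([good, no_assistant, '[]'])
print(report)
```

Collecting line numbers per error type, rather than failing on the first problem, lets a user fix the whole file in one pass.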

Source code in lazyllm/module/llms/onlinemodule/fileHandler.py

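    def get_finetune_data(self, filepath: str) -> str:
        """Fetch and process a fine-tuning data file: validate its format and convert it to the format supported by the target platform.

Args:
    filepath (str): Path to the fine-tuning data file; it must be in .jsonl format.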
"""
        self._validate_json(filepath)
        self._save_tempfile(self._convert_file_format(filepath))

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMChat`

Bases: OnlineChatModuleBase, FileHandlerBase

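GLMChat inherits from OnlineChatModuleBase and FileHandlerBase and wraps online access to the Zhipu GLM family of models.
It supports chat generation, file handling, and model fine-tuning. The default model is glm-4; one of the trainable models (such as chatglm3-6b or chatglm_12b) can be specified instead.

Parameters:

base_url (Optional[str], default: None ) –

API endpoint of the Zhipu GLM service; defaults to "https://open.bigmodel.cn/api/paas/v4/".
model (Optional[str], default: None ) –

Name of the GLM model to use. If omitted, lazyllm.config['glm_model_name'] is used when set, otherwise "glm-4". A model from TRAINABLE_MODEL_LIST can also be chosen.
api_key (Optional[str], default: None ) –

API key for the GLM service; if not provided, it is read from the lazyllm configuration.
stream (Optional[bool], default: True ) –

Whether to enable streaming output; defaults to True.
return_trace (Optional[bool], default: False ) –

Whether to return debug trace information; defaults to False.
**kwargs –

Additional optional arguments passed through to OnlineChatModuleBase.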
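
Before a training file is uploaded, `_convert_file_format` in the source listing rewrites each JSONL example so that only `system`, `user`, and `assistant` messages remain, each reduced to its `role` and `content`. The following is a minimal standalone sketch of that transformation; `to_glm_jsonl` is a hypothetical helper name, and it works on in-memory strings instead of a file path.

```python
import json

GLM_ROLES = ('system', 'user', 'assistant')


def to_glm_jsonl(lines):
    """Convert JSONL strings to the {'messages': [...]} shape, dropping other roles and keys."""
    converted = []
    for raw in lines:
        example = json.loads(raw)
        kept = [
            {'role': m.get('role', ''), 'content': m.get('content', '')}
            for m in example.get('messages', [])
            if m.get('role', '') in GLM_ROLES
        ]
        # ensure_ascii=False keeps non-ASCII text readable in the output file.
        converted.append(json.dumps({'messages': kept}, ensure_ascii=False))
    return '\n'.join(converted)


source = [json.dumps({'messages': [
    {'role': 'system', 'content': 'Be brief.'},
    {'role': 'tool', 'content': 'dropped'},
    {'role': 'user', 'content': '你好'},
    {'role': 'assistant', 'content': 'Hello!', 'name': 'bot'},
]})]
converted = to_glm_jsonl(source)
print(converted)
```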

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py

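class GLMChat(OnlineChatModuleBase, FileHandlerBase):
    """GLMChat inherits from OnlineChatModuleBase and FileHandlerBase and wraps online access to the Zhipu GLM family of models.  
It supports chat generation, file handling, and model fine-tuning. The default model is glm-4; one of the trainable models (such as chatglm3-6b or chatglm_12b) can be specified instead.

Args:
    base_url (Optional[str]): API endpoint of the Zhipu GLM service; defaults to "https://open.bigmodel.cn/api/paas/v4/".
    model (Optional[str]): Name of the GLM model to use. If omitted, lazyllm.config['glm_model_name'] is used when set, otherwise "glm-4". A model from TRAINABLE_MODEL_LIST can also be chosen.
    api_key (Optional[str]): API key for the GLM service; if not provided, it is read from the lazyllm configuration.
    stream (Optional[bool]): Whether to enable streaming output; defaults to True.
    return_trace (Optional[bool]): Whether to return debug trace information; defaults to False.
    **kwargs: Additional optional arguments passed through to OnlineChatModuleBase.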
"""
    TRAINABLE_MODEL_LIST = ['chatglm3-6b', 'chatglm_12b', 'chatglm_32b', 'chatglm_66b', 'chatglm_130b']
    VLM_MODEL_PREFIX = ['glm-4.5v', 'glm-4.1v', 'glm-4v']
    MODEL_NAME = 'glm-4'

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://open.bigmodel.cn/api/paas/v4/'
        super().__init__(api_key=api_key or self._default_api_key(),
                         model_name=model or lazyllm.config['glm_model_name'] or GLMChat.MODEL_NAME,
                         base_url=base_url, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self.default_train_data = {
            'model': None,
            'training_file': None,
            'validation_file': None,
            'extra_hyperparameters': {
                'fine_tuning_method': None,  # lora\full, default: lora,
                'fine_tuning_parameters': {
                    'max_sequence_length': None  # [1, 8192](int), default: 8192
                }
            },
            'hyperparameters': {
                'learning_rate_multiplier': 0.01,  # (0,5] , default: 1.0
                'batch_size': None,  # [1, 32], default: 8
                'n_epochs': 1,  # [1, 10], default: 3
            },
            'suffix': None,
            'request_id': None
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return ('You are ChatGLM, an AI assistant developed based on a language model trained by Zhipu AI. '
                'Your task is to provide appropriate responses and support for user\'s questions and requests.')

    def _get_models_list(self):
        return ['glm-4', 'glm-4v', 'glm-3-turbo', 'chatglm-turbo', 'cogview-3', 'embedding-2', 'text-embedding']

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        url = urljoin(self._base_url, 'files')
        self.get_finetune_data(train_file)

        file_object = {
            'purpose': (None, 'fine-tune', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        with requests.post(url, headers=self._get_empty_header(), files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            # delete temporary training file
            self._dataHandler.close()
            return r.json()['id']

    def _update_kw(self, data, normal_config):
        cur_data = self.default_train_data.copy()
        cur_data.update(data)

        cur_data['extra_hyperparameters']['fine_tuning_method'] = normal_config['finetuning_type'].strip().lower()
        cur_data['extra_hyperparameters']['fine_tuning_parameters']['max_sequence_length'] = normal_config['cutoff_len']
        cur_data['hyperparameters']['learning_rate_multiplier'] = normal_config['learning_rate']
        cur_data['hyperparameters']['batch_size'] = normal_config['batch_size']
        cur_data['hyperparameters']['n_epochs'] = normal_config['num_epochs']
        cur_data['suffix'] = str(uuid.uuid4())[:7]
        return cur_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine_tuning/jobs')
        data = {'model': train_model, 'training_file': train_file_id}
        if len(kw) > 0:
            if 'finetuning_type' in kw:
                data = self._update_kw(data, kw)
            else:
                data.update(kw)

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = self._status_mapping(r.json()['status'])
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/cancel')
        with requests.post(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['status']
        if status == 'cancelled':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = os.path.join(self._base_url, 'fine_tuning/jobs/')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        for model in model_data['data']:
            res.append([model['id'], model['fine_tuned_model'], self._status_mapping(model['status'])])
        return res

    def _status_mapping(self, status):
        if status == 'succeeded':
            return 'Done'
        elif status == 'failed':
            return 'Failed'
        elif status == 'cancelled':
            return 'Cancelled'
        elif status == 'running':
            return 'Running'
        else:  # create, validating_files, queued
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/events')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{fine_tuning_job_id}')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        fine_tuned_model = info['fine_tuned_model'] if 'fine_tuned_model' in info else None
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'trained_tokens' in info and info['trained_tokens']:
            return info['trained_tokens']
        else:
            return None

    def _create_deployment(self) -> Tuple[str]:
        return (self._model_name, 'RUNNING')

    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'
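
The status handling above can also be read as a table lookup plus a fallback for the job ID. The sketch below restates that logic outside the class, as an illustration rather than the library's implementation. `map_status` and `resolve_job_id` are hypothetical names, and the job ID in the usage lines is made up.

```python
# Raw platform statuses that have a dedicated label; anything else counts as pending.
_STATUS_LABELS = {
    'succeeded': 'Done',
    'failed': 'Failed',
    'cancelled': 'Cancelled',
    'running': 'Running',
}


def map_status(status: str) -> str:
    """Collapse a raw job status into Done / Failed / Cancelled / Running / Pending."""
    return _STATUS_LABELS.get(status, 'Pending')


def resolve_job_id(explicit_id=None, current_id=None):
    """Prefer the explicitly passed job ID, otherwise fall back to the tracked one."""
    job_id = explicit_id or current_id
    if not job_id:
        raise RuntimeError('No job ID specified.')
    return job_id


print(map_status('succeeded'))
print(map_status('validating_files'))
print(resolve_job_id(None, 'ftjob-123'))
```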

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMRerank`

Bases: LazyLLMOnlineRerankModuleBase

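Zhipu AI's rerank module, derived from LazyLLMOnlineRerankModuleBase. It reorders documents by their relevance to a query.

Parameters:

embed_url (str, default: None ) –

Base URL of the rerank API; defaults to "https://open.bigmodel.cn/api/paas/v4/rerank".
embed_model_name (str, default: 'rerank' ) –

Name of the model to use; defaults to "rerank".
api_key (str, default: None ) –

Zhipu AI API key; if not provided, it is read from lazyllm.config['glm_api_key'].

Attributes:

type: The model type; always "RERANK".

Main features:

- Reranks a list of documents by relevance to the input query
- Accepts custom ranking parameters
- Returns a relevance score for each document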
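
To make the request and response shapes concrete, here is a standalone sketch of the two helpers in the source listing, `_encapsulated_data` and `_parse_response`. It performs no network call: `fake_response` is hand-written sample data in the shape the parser expects, not real API output, and the function names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def build_payload(query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, Any]:
    """Assemble the rerank request body; extra keyword arguments override the defaults."""
    payload = {
        'query': query,
        'documents': documents,
        'top_n': top_n,
        'return_documents': False,
        'return_raw_scores': True,
    }
    payload.update(kwargs)
    return payload


def parse_response(response: Dict[str, Any]) -> List[Tuple[int, float]]:
    """Extract (document index, relevance score) pairs from a rerank response."""
    return [(r['index'], r['relevance_score']) for r in response['results']]


payload = build_payload('what is LazyLLM?', ['doc a', 'doc b'], top_n=2, return_documents=True)
fake_response = {'results': [{'index': 1, 'relevance_score': 0.92},
                             {'index': 0, 'relevance_score': 0.13}]}
pairs = parse_response(fake_response)
print(payload)
print(pairs)
```

The returned index refers to a position in the original `documents` list, so callers can map scores back to documents even when the response omits the document text.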

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py

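class GLMRerank(LazyLLMOnlineRerankModuleBase):
    """Zhipu AI's rerank module, derived from LazyLLMOnlineRerankModuleBase. It reorders documents by their relevance to a query.

Args:
    embed_url (str): Base URL of the rerank API; defaults to "https://open.bigmodel.cn/api/paas/v4/rerank".
    embed_model_name (str): Name of the model to use; defaults to "rerank".
    api_key (str): Zhipu AI API key; if not provided, it is read from lazyllm.config['glm_api_key'].

Attributes:
    type: The model type; always "RERANK".

Main features:
    - Reranks a list of documents by relevance to the input query
    - Accepts custom ranking parameters
    - Returns a relevance score for each document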
"""
    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: str = 'rerank',
                 api_key: str = None, **kw):
        embed_url = embed_url or 'https://open.bigmodel.cn/api/paas/v4/rerank'
        embed_model_name = embed_model_name or 'rerank'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name, **kw)

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, str]:
        json_data = {
            'query': query,
            'documents': documents,
            'top_n': top_n,
            'return_documents': False,
            'return_raw_scores': True
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        return [(result['index'], result['relevance_score']) for result in response['results']]

`lazyllm.module.OnlineMultiModalModule`

Bases: _DynamicSourceRouterMixin

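Manages the creation of online multimodal service modules. The supported types are currently stt / tts / text2image / image_editing.

Parameters:

model (str, default: None ) –

Name of the model to access.
source (str, default: None ) –

Which supplier's module to create, such as qwen / glm / minimax / siliconflow / doubao.
type (str, default: None ) –

Multimodal task type: one of stt / tts / text2image / image_editing.
url (str, default: None ) –

Base URL of the platform to access; defaults to each platform's official URL. The alias base_url is also accepted.
api_key (str, default: None ) –

An API key can be passed explicitly. When set to auto or dynamic, the key is read from the configuration at runtime, which allows keys to be switched dynamically.
dynamic_auth (bool, default: False ) –

Whether to enable dynamic authentication; True is equivalent to api_key='dynamic'.
return_trace (bool, default: False ) –

Whether to record results in the trace; defaults to False.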

Examples:

>>> import lazyllm
>>> stt = lazyllm.OnlineMultiModalModule(source='qwen', type='stt', api_key='dynamic')
>>> tts = lazyllm.OnlineMultiModalModule(source='qwen', type='tts', dynamic_auth=True)
>>> img = lazyllm.OnlineMultiModalModule(source='qwen', type='text2image')

Source code in lazyllm/module/llms/onlinemodule/multimodal.py

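class OnlineMultiModalModule(_DynamicSourceRouterMixin, metaclass=_OnlineMultiModalMeta):
    """Manages the creation of online multimodal service modules. The supported types are currently ``stt`` / ``tts`` / ``text2image`` / ``image_editing``.

Args:
    model (str): Name of the model to access.
    source (str): Which supplier's module to create, such as ``qwen`` / ``glm`` / ``minimax`` / ``siliconflow`` / ``doubao``.
    type (str): Multimodal task type: one of ``stt`` / ``tts`` / ``text2image`` / ``image_editing``.
    url (str): Base URL of the platform to access; defaults to each platform's official URL. The alias ``base_url`` is also accepted.
    api_key (str): An API key can be passed explicitly. When set to ``auto`` or ``dynamic``, the key is read from the configuration at runtime, which allows keys to be switched dynamically.
    dynamic_auth (bool): Whether to enable dynamic authentication; True is equivalent to ``api_key='dynamic'``.
    return_trace (bool): Whether to record results in the trace; defaults to False.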


Examples:
    >>> import lazyllm
    >>> stt = lazyllm.OnlineMultiModalModule(source='qwen', type='stt', api_key='dynamic')
    >>> tts = lazyllm.OnlineMultiModalModule(source='qwen', type='tts', dynamic_auth=True)
    >>> img = lazyllm.OnlineMultiModalModule(source='qwen', type='text2image')
    """
    _dynamic_module_slot = 'multimodal'
    _dynamic_source_error = 'No source is configured for dynamic multimodal source.'
    TYPE_GROUP_MAP = {
        'stt': LLMType.STT,
        'tts': LLMType.TTS,
        'text2image': LLMType.TEXT2IMAGE,
        'image_editing': LLMType.TEXT2IMAGE,
    }

    @staticmethod
    def _resolve_type_name(type_name: Optional[str], model: Optional[str]) -> str:
        if type_name is not None:
            return LLMType._normalize(type_name)
        resolved = get_model_type(model) if model else None
        if resolved == 'sd':
            return 'text2image'
        if resolved not in OnlineMultiModalModule.TYPE_GROUP_MAP:
            raise ValueError(
                f'Cannot infer multimodal type from model {model!r}. '
                f'Please provide `type` explicitly (one of: {list(OnlineMultiModalModule.TYPE_GROUP_MAP.keys())}).')
        return resolved

    @staticmethod
    def _validate_parameters(source: Optional[str], model: Optional[str], type: str, url: Optional[str],
                             skip_auth: bool = False, **kwargs) -> tuple:
        if type not in OnlineMultiModalModule.TYPE_GROUP_MAP:
            raise ValueError(
                f'Invalid type: {type!r}. Must be one of: {list(OnlineMultiModalModule.TYPE_GROUP_MAP.keys())}')
        register_type = OnlineMultiModalModule.TYPE_GROUP_MAP.get(type).lower()
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, source_registry=lazyllm.online[register_type])
        source, default_key = select_source_with_default_key(lazyllm.online[register_type], source, type)
        if default_key and not kwargs.get('api_key'):
            kwargs['api_key'] = default_key
        if skip_auth and not url:
            raise ValueError('url must be set for local serving.')
        default_module_cls = getattr(lazyllm.online[register_type], source)
        default_model_name = getattr(default_module_cls, 'IMAGE_EDITING_MODEL_NAME' if type == 'image_editing'
                                     else 'MODEL_NAME', None)
        if model is None and default_model_name:
            model = default_model_name
            lazyllm.LOG.info(f'For type {type}, source {source}. Automatically selected default model: {model}')
        if url is not None:
            kwargs['base_url'] = url
        return source, model, kwargs

    def __new__(cls, model: str = None, source: str = None, url: str = None, type: str = None,
                return_trace: bool = False, api_key: str = None, dynamic_auth: bool = False,
                skip_auth: bool = False, id: Optional[str] = None, name: Optional[str] = None,
                group_id: Optional[str] = None, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, url_aliases='base_url')
        if cls._should_use_dynamic(source, dynamic_auth, skip_auth):
            return super().__new__(cls)
        if source is None and api_key is not None:
            raise ValueError('No source is given but an api_key is provided.')
        if api_key is not None:
            kwargs['api_key'] = api_key
        type = OnlineMultiModalModule._resolve_type_name(
            type if type is not None else kwargs.pop('function', None), model)
        source, model, kwargs_normalized = OnlineMultiModalModule._validate_parameters(
            source=source, model=model, type=type, url=url, skip_auth=skip_auth, **kwargs)
        params = {'return_trace': return_trace, 'type': type}
        if model is not None:
            params['model'] = model
        params.update(kwargs_normalized)
        register_type = OnlineMultiModalModule.TYPE_GROUP_MAP.get(type).lower()
        return getattr(lazyllm.online[register_type], source)(**params)

    def __init__(self, model: str = None, source: str = None, url: str = None, type: str = None,
                 return_trace: bool = False, api_key: str = None, dynamic_auth: bool = False,
                 skip_auth: bool = False, id: Optional[str] = None, name: Optional[str] = None,
                 group_id: Optional[str] = None, **kwargs):
        model, source, url, kwargs = resolve_online_params(
            model, source, url, kwargs, url_aliases='base_url')
        _DynamicSourceRouterMixin.__init__(self, id=id, name=name, group_id=group_id, return_trace=return_trace)
        self._model_name = model
        self._base_url = url
        self._skip_auth = skip_auth
        self._type = self._resolve_type_name(type, model)
        self._kwargs = kwargs
        self._init_dynamic_auth(api_key, dynamic_auth)

    def _build_supplier(self, source: str, skip_auth: bool):
        params = {'base_url': self._base_url, 'model': self._model_name, 'return_trace': self._return_trace,
                  'type': self._type, 'api_key': self._api_key, 'skip_auth': skip_auth, **self._kwargs}
        register_type = OnlineMultiModalModule.TYPE_GROUP_MAP.get(self._type).lower()
        return getattr(lazyllm.online[register_type], source)(**params)
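
Two details of the routing above are easy to miss: `image_editing` shares the `text2image` registry group, and the default model is taken from `IMAGE_EDITING_MODEL_NAME` rather than `MODEL_NAME` for that type. The sketch below isolates those two rules. It is an illustration under assumptions: the `LLMType` members are replaced by plain strings, and `FakeSupplier` with its model names is invented for the example.

```python
from typing import Optional

# Assumed string stand-ins for the LLMType members used by TYPE_GROUP_MAP.
TYPE_GROUP_MAP = {
    'stt': 'STT',
    'tts': 'TTS',
    'text2image': 'TEXT2IMAGE',
    'image_editing': 'TEXT2IMAGE',
}


class FakeSupplier:
    """Hypothetical supplier class exposing the two default-model attributes."""
    MODEL_NAME = 'demo-text2image'
    IMAGE_EDITING_MODEL_NAME = 'demo-image-edit'


def registry_group(task_type: str) -> str:
    """Map a task type to the lowercase registry group it is looked up in."""
    if task_type not in TYPE_GROUP_MAP:
        raise ValueError(f'Invalid type: {task_type!r}. Must be one of: {list(TYPE_GROUP_MAP)}')
    return TYPE_GROUP_MAP[task_type].lower()


def default_model(supplier_cls, task_type: str, model: Optional[str] = None) -> Optional[str]:
    """Keep an explicit model; otherwise pick the supplier's default for the task type."""
    if model is not None:
        return model
    attr = 'IMAGE_EDITING_MODEL_NAME' if task_type == 'image_editing' else 'MODEL_NAME'
    return getattr(supplier_cls, attr, None)


print(registry_group('image_editing'))
print(default_model(FakeSupplier, 'image_editing'))
print(default_model(FakeSupplier, 'text2image'))
```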

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenRerank`

Bases: LazyLLMOnlineRerankModuleBase

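Qwen (Tongyi Qianwen) rerank module, derived from LazyLLMOnlineRerankModuleBase. It reorders documents by their relevance to a query.

Parameters:

embed_url (str, default: None ) –

Base URL of the rerank API; defaults to "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank".
embed_model_name (str, default: None ) –

Name of the model to use; defaults to "gte-rerank-v2".
api_key (str, default: None ) –

Qwen API key; if not provided, it is read from lazyllm.config['qwen_api_key'].
**kwargs –

Additional arguments passed to the base class.

Attributes:

type: The model type; always "RERANK".

Main features:

- Reranks a list of documents by relevance to the input query
- Accepts custom ranking parameters
- Returns the index and relevance score of each document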

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

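class QwenRerank(LazyLLMOnlineRerankModuleBase):
    """Qwen (Tongyi Qianwen) rerank module, derived from LazyLLMOnlineRerankModuleBase. It reorders documents by their relevance to a query.

Args:
    embed_url (str): Base URL of the rerank API; defaults to "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank".
    embed_model_name (str): Name of the model to use; defaults to "gte-rerank-v2".
    api_key (str): Qwen API key; if not provided, it is read from lazyllm.config['qwen_api_key'].
    **kwargs: Additional arguments passed to the base class.

Attributes:
    type: The model type; always "RERANK".

Main features:
    - Reranks a list of documents by relevance to the input query
    - Accepts custom ranking parameters
    - Returns the index and relevance score of each document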
"""

    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: Optional[str] = None,
                 api_key: str = None, **kw):
        embed_url = (embed_url or 'https://dashscope.aliyuncs.com/api/v1/services/'
                                  'rerank/text-rerank/text-rerank')
        embed_model_name = embed_model_name or 'gte-rerank-v2'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name, **kw)

    def _get_embed_url(self, url: str) -> str:
        url, done = self._normalize_embed_url(url)
        if done: return url
        if url.rstrip('/').endswith('api/v1'):
            return urljoin(url, 'services/rerank/text-rerank/text-rerank')
        return urljoin(url, 'api/v1/services/rerank/text-rerank/text-rerank')

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, str]:
        json_data = {
            'input': {
                'query': query,
                'documents': documents
            },
            'parameters': {
                'top_n': top_n,
            },
            'model': self._embed_model_name
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        results = response['output']['results']
        return [(result['index'], result['relevance_score']) for result in results]
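
`_get_embed_url` builds the endpoint with `urljoin`, whose result depends on whether the base URL ends with a slash. The sketch below demonstrates that standard-library behavior. The helper `_normalize_embed_url` is not shown in this listing and may already handle the trailing slash, so the explicit normalization in `rerank_endpoint` (a hypothetical function) is an assumption of this example, not a description of the library.

```python
from urllib.parse import urljoin

RERANK_PATH = 'services/rerank/text-rerank/text-rerank'


def rerank_endpoint(base_url: str) -> str:
    """Build the rerank endpoint from a base URL with or without the api/v1 prefix."""
    # Assumption: force a trailing slash so urljoin appends instead of replacing.
    base = base_url.rstrip('/') + '/'
    if base.rstrip('/').endswith('api/v1'):
        return urljoin(base, RERANK_PATH)
    return urljoin(base, 'api/v1/' + RERANK_PATH)


# Without a trailing slash, urljoin replaces the last path segment ("v1").
no_slash = urljoin('https://dashscope.aliyuncs.com/api/v1', RERANK_PATH)
# With a trailing slash, the relative path is appended.
with_slash = urljoin('https://dashscope.aliyuncs.com/api/v1/', RERANK_PATH)

print(no_slash)
print(with_slash)
print(rerank_endpoint('https://dashscope.aliyuncs.com'))
```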

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenTTS`

Bases: LazyLLMOnlineTTSModuleBase

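Qwen (Tongyi Qianwen) text-to-speech module, derived from LazyLLMOnlineTTSModuleBase, with support for several speech-synthesis models.

Parameters:

model (str, default: None ) –

Model name; defaults to "qwen-tts". Available models:

- cosyvoice-v2
- cosyvoice-v1
- sambert
- qwen-tts
- qwen-tts-latest
api_key (str, default: None ) –

API key; defaults to None, in which case it is read from lazyllm.config['qwen_api_key'].
return_trace (bool, default: False ) –

Whether to return call-trace information; defaults to False.
**kwargs –

Additional arguments passed to the base class.

Speech-synthesis parameters:

input (str): The text to convert.
voice (str): Speaker voice; defaults to the model's default voice.
speech_rate (float): Speaking rate; defaults to 1.0.
volume (int): Volume; defaults to 50.
pitch (float): Pitch; defaults to 1.0.

Notes:

- Different models may support different voice options
- The returned audio data is automatically encoded as a file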
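
Model selection in this class is a dispatch table: each model name maps to a synthesizer function and a default voice, and an unknown name is rejected up front. The following is a minimal sketch of that pattern. The two `_stub_*` functions are placeholders that return tagged bytes instead of audio; they are not DashScope calls, and `synthesize` is a hypothetical wrapper.

```python
def _stub_cosyvoice(input, voice, **params) -> bytes:
    """Placeholder synthesizer; returns tagged bytes rather than real audio."""
    return f'cosyvoice|{voice}|{input}'.encode('utf-8')


def _stub_qwentts(input, voice, **params) -> bytes:
    """Placeholder synthesizer; returns tagged bytes rather than real audio."""
    return f'qwen-tts|{voice}|{input}'.encode('utf-8')


# model name -> (synthesizer function, default voice)
SYNTHESIZERS = {
    'cosyvoice-v2': (_stub_cosyvoice, 'longxiaochun_v2'),
    'qwen-tts': (_stub_qwentts, 'Cherry'),
}


def synthesize(text, model='qwen-tts', voice=None, speech_rate=1.0, volume=50, pitch=1.0):
    """Look up the synthesizer for `model` and call it, falling back to its default voice."""
    if model not in SYNTHESIZERS:
        raise ValueError(f'unsupported model: {model}. supported models: {sorted(SYNTHESIZERS)}')
    func, default_voice = SYNTHESIZERS[model]
    return func(input=text, voice=voice or default_voice,
                speech_rate=speech_rate, volume=volume, pitch=pitch)


print(synthesize('hi'))
print(synthesize('hi', model='cosyvoice-v2', voice='custom'))
```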

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

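class QwenTTS(LazyLLMOnlineTTSModuleBase):
    """Qwen (Tongyi Qianwen) text-to-speech module, derived from LazyLLMOnlineTTSModuleBase, with support for several speech-synthesis models.

Args:
    model (str): Model name; defaults to "qwen-tts". Available models:
        - cosyvoice-v2
        - cosyvoice-v1
        - sambert
        - qwen-tts
        - qwen-tts-latest
    api_key (str): API key; defaults to None, in which case it is read from lazyllm.config['qwen_api_key'].
    return_trace (bool): Whether to return call-trace information; defaults to False.
    **kwargs: Additional arguments passed to the base class.

Speech-synthesis parameters:

    input (str): The text to convert.
    voice (str): Speaker voice; defaults to the model's default voice.
    speech_rate (float): Speaking rate; defaults to 1.0.
    volume (int): Volume; defaults to 50.
    pitch (float): Pitch; defaults to 1.0.

Notes:
    - Different models may support different voice options
    - The returned audio data is automatically encoded as a file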
"""
    MODEL_NAME = 'qwen-tts'
    SYNTHESIZERS = {
        'cosyvoice-v2': (synthesize_v2, 'longxiaochun_v2'),
        'cosyvoice-v1': (synthesize_v2, 'longxiaochun'),
        'sambert': (synthesize, 'zhinan-v1'),
        'qwen-tts': (synthesize_qwentts, 'Cherry'),
        'qwen-tts-latest': (synthesize_qwentts, 'Cherry')
    }

    def __init__(self, model: str = None, api_key: str = None, return_trace: bool = False,
                 base_url: Optional[str] = None,
                 base_websocket_url: Optional[str] = None,
                 **kwargs):
        _ensure_dashscope_urls_initialized()
        base_url = base_url or _DASHSCOPE_DEFAULT_HTTP_URL
        base_websocket_url = base_websocket_url or _DASHSCOPE_DEFAULT_WEBSOCKET_URL
        if base_url and base_url != _DASHSCOPE_DEFAULT_HTTP_URL:
            LOG.warning('QwenTTS ignores `base_url`; use `set_dashscope_urls` instead.')
        if base_websocket_url and base_websocket_url != _DASHSCOPE_DEFAULT_WEBSOCKET_URL:
            LOG.warning('QwenTTS ignores `base_websocket_url`; use `set_dashscope_urls` instead.')
        super().__init__(api_key=api_key or self._default_api_key(), model_name=model,
                         return_trace=return_trace, base_url=base_url, **kwargs)
        if self._model_name not in self.SYNTHESIZERS:
            raise ValueError(f'unsupported model: {self._model_name}. '
                             f'supported models: {QwenTTS.SYNTHESIZERS.keys()}')
        self._synthesizer_func, self._voice = QwenTTS.SYNTHESIZERS[self._model_name]

    def _forward(self, input: str = None, voice: str = None, speech_rate: float = 1.0, volume: int = 50,
                 pitch: float = 1.0, url: str = None, model: str = None, **kwargs):
        if url and url != self._base_url:
            raise Exception('Qwen TTS forward() does not support overriding the `url` parameter, please remove it.')
        if 'base_websocket_url' in kwargs:
            raise Exception('Qwen TTS forward() does not support overriding the `base_websocket_url` parameter.')
        # double check for the forward model name
        if model == self._model_name:
            synthesizer_func, default_voice = self._synthesizer_func, self._voice
        else:
            if model not in self.SYNTHESIZERS:
                raise ValueError(f'unsupported model: {model}. '
                                 f'supported models: {QwenTTS.SYNTHESIZERS.keys()}')
            synthesizer_func, default_voice = QwenTTS.SYNTHESIZERS[model]

        call_params = {
            'input': input,
            'model_name': model,
            'voice': voice or default_voice,
            'speech_rate': speech_rate,
            'volume': volume,
            'pitch': pitch,
            **kwargs
        }
        if self._api_key: call_params['api_key'] = self._api_key
        return encode_query_with_filepaths(None, bytes_to_file(synthesizer_func(**call_params)))

`lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaChat`

Bases: OnlineChatModuleBase, FileHandlerBase, _SenseNovaBase

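SenseNovaChat is the LLM interface component for SenseTime's open platform. It inherits from OnlineChatModuleBase and FileHandlerBase and provides chat and file-handling capabilities.

Parameters:

base_url (str, default: None ) –

Base URL of the API; defaults to "https://api.sensenova.cn/compatible-mode/v1/".
model (str, default: None ) –

Name of the model to use; defaults to "SenseChat-5".
api_key (str, default: None ) –

SenseTime API key; if not provided, it is read from lazyllm.config['sensenova_api_key'].
secret_key (str, default: None ) –

SenseTime secret key; if not provided, it is read from lazyllm.config['sensenova_secret_key'].
stream (bool, default: True ) –

Whether to enable streaming output; defaults to True.
return_trace (bool, default: False ) –

Whether to return call-trace information; defaults to False.
**kwargs –

Additional arguments passed to the base class.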
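
When a training file is uploaded, `_upload_train_file` in the source listing creates a dataset and then polls its status every 10 seconds until it reads `ready`. The sketch below shows the same polling pattern in isolation. The upper bound `max_polls` is an addition of this example (the loop in the listing has no such limit), and `fetch_status` is a hypothetical callable standing in for the HTTP request.

```python
import time
from typing import Callable


def wait_until_ready(fetch_status: Callable[[], str], interval: float = 10.0,
                     max_polls: int = 60, sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll fetch_status() until it reports 'ready' (case-insensitive) or polls run out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.lower() == 'ready':
            return status
        sleep(interval)
    raise TimeoutError(f'dataset not ready after {max_polls} polls')


# Simulated status sequence; a no-op sleep keeps the demo instant.
statuses = iter(['PENDING', 'PROCESSING', 'READY'])
result = wait_until_ready(lambda: next(statuses), sleep=lambda _seconds: None)
print(result)
```

Passing `sleep` as a parameter keeps the function testable without real delays, and bounding the number of polls avoids waiting forever if the dataset never becomes ready.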

Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py

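class SenseNovaChat(OnlineChatModuleBase, FileHandlerBase, _SenseNovaBase):
    """SenseNovaChat is the LLM interface component for SenseTime's open platform. It inherits from OnlineChatModuleBase and FileHandlerBase and provides chat and file-handling capabilities.

Args:
    base_url (str): Base URL of the API; defaults to "https://api.sensenova.cn/compatible-mode/v1/".
    model (str): Name of the model to use; defaults to "SenseChat-5".
    api_key (str): SenseTime API key; if not provided, it is read from lazyllm.config['sensenova_api_key'].
    secret_key (str): SenseTime secret key; if not provided, it is read from lazyllm.config['sensenova_secret_key'].
    stream (bool): Whether to enable streaming output; defaults to True.
    return_trace (bool): Whether to return call-trace information; defaults to False.
    **kwargs: Additional arguments passed to the base class.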
"""
    TRAINABLE_MODEL_LIST = ['nova-ptc-s-v2']
    VLM_MODEL_PREFIX = ['SenseNova-V6-Turbo', 'SenseChat-Vision', 'SenseNova-V6-Pro', 'SenseNova-V6-Reasoner',
                        'SenseNova-V6-5-Pro', 'SenseNova-V6-5-Turbo']

    def _materialize_lazy_api_key(self) -> str:
        return self._get_api_key(None, None)

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, secret_key: str = None, stream: bool = True,
                 return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://api.sensenova.cn/compatible-mode/v1/'
        model = model or 'SenseChat-5'
        if secret_key and isinstance(api_key, (tuple, list)):
            raise KeyError('multi-key is not support when secret_key is provided, please use single-key mode!')
        if api_key not in LAZY_API_KEY_TOKENS:
            api_key = self._get_api_key(api_key, secret_key)
        super().__init__(api_key=api_key, base_url=base_url, model_name=model,
                         stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self._deploy_paramters = None
        self._vlm_force_format_input_with_files = True

    def _get_system_prompt(self):
        return 'You are an AI assistant, developed by SenseTime.'

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = []
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'knowledge', 'user', 'assistant']:
                    lineEx.append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        url = self._train_parameters.get('upload_url', 'https://file.sensenova.cn/v1/files')
        self.get_finetune_data(train_file)
        file_object = {
            # The correct format should be to pass in a tuple in the format of:
            # (<fileName>, <fileObject>, <Content-Type>),
            # where fileObject refers to the specific value.

            'description': (None, 'train_file', None),
            'scheme': (None, 'FINE_TUNE_2', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        train_file_id = None
        with requests.post(url, headers=self._get_empty_header(), files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException(r.text)

            train_file_id = r.json()['id']
            # delete temporary training file
            self._dataHandler.close()
            lazyllm.LOG.info(f'train file id: {train_file_id}')

        def _create_finetuning_dataset(description, files):
            url = urljoin(self._base_url, 'fine-tune/datasets')
            data = {
                'description': description,
                'files': files
            }
            with requests.post(url, headers=self._header, json=data) as r:
                if r.status_code != 200:
                    raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

                dataset_id = r.json()['dataset']['id']
                status = r.json()['dataset']['status']
                url = url + f'/{dataset_id}'
                while status.lower() != 'ready':
                    try:
                        time.sleep(10)
                        with requests.get(url, headers=self._header) as r:
                            if r.status_code != 200:
                                raise requests.RequestException(r.text)

                            dataset_id = r.json()['dataset']['id']
                            status = r.json()['dataset']['status']
                    except Exception as e:
                        lazyllm.LOG.error(f'error: {e}')
                        raise ValueError(f'created datasets {dataset_id} failed')
                return dataset_id

        return _create_finetuning_dataset('fine-tuning dataset', [train_file_id])

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine-tunes')
        data = {
            'model': train_model,
            'training_file': train_file_id,
            'suffix': kw.get('suffix', 'ft-' + str(uuid.uuid4().hex))
        }
        if 'training_parameters' in kw.keys():
            data.update(kw['training_parameters'])

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException(r.text)

            fine_tuning_job_id = r.json()['job']['id']
            status = r.json()['job']['status']
            return (fine_tuning_job_id, status)

    def _validate_api_key(self):
        fine_tune_url = urljoin('https://api.sensenova.cn/v1/llm/', 'models')
        response = requests.get(fine_tune_url, headers=self._header)
        if response.status_code == 200:
            return True
        return False

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        fine_tune_url = urljoin(self._base_url, f'fine-tunes/{fine_tuning_job_id}')
        with requests.get(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            status = r.json()['job']['status']
            fine_tuned_model = None
            if status.lower() == 'succeeded':
                fine_tuned_model = r.json()['job']['fine_tuned_model']
            return (fine_tuned_model, status)

    def set_deploy_parameters(self, **kw):
        """Set the parameters used for model deployment.

Args:
    **kw: Key-value pairs of deployment parameters, applied when the deployment is created.
"""
        self._deploy_paramters = kw

    def _create_deployment(self) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine-tune/servings')
        data = {
            'model': self._model_name,
            'config': {
                'run_time': 0
            }
        }
        if self._deploy_paramters and len(self._deploy_paramters) > 0:
            data.update(self._deploy_paramters)

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['job']['id']
            status = r.json()['job']['status']
            return (fine_tuning_job_id, status)

    def _query_deployment(self, deployment_id) -> str:
        fine_tune_url = urljoin(self._base_url, f'fine-tune/servings/{deployment_id}')
        with requests.get(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            status = r.json()['job']['status']
            return status

    def _format_vl_chat_image_url(self, image_url, mime):
        if image_url.startswith('http'):
            return [{'type': 'image_url', 'image_url': image_url}]
        else:
            return [{'type': 'image_base64', 'image_base64': image_url}]

`set_deploy_parameters(**kw)`

Set the parameters used for model deployment.

Parameters:

**kw –

Key-value pairs of deployment parameters, applied when the deployment is created.
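
As a rough illustration of how these parameters take effect, the sketch below mirrors the payload-building step of `_create_deployment` in the SenseNova source above: the stored keyword arguments are merged over a default payload with `dict.update`. `build_deploy_payload` is a hypothetical standalone helper written for this example, not part of LazyLLM, and the model name and `run_time` value are placeholders.

```python
from typing import Any, Dict, Optional


def build_deploy_payload(model_name: str,
                         deploy_params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper: merge user-supplied deployment parameters over defaults."""
    # Default payload, following the shape used in _create_deployment.
    data: Dict[str, Any] = {'model': model_name, 'config': {'run_time': 0}}
    if deploy_params:
        # dict.update is a shallow merge: a user-supplied 'config' replaces
        # the default 'config' dict as a whole instead of being merged into it.
        data.update(deploy_params)
    return data


default_payload = build_deploy_payload('my-finetuned-model')
custom_payload = build_deploy_payload('my-finetuned-model', {'config': {'run_time': 60}})
```

Because the merge is shallow, callers that override `config` would need to supply every key they want inside it.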

Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py

    def set_deploy_parameters(self, **kw):
        """Set the parameters used for model deployment.

Args:
    **kw: Key-value pairs of deployment parameters, applied when the deployment is created.
"""
        self._deploy_paramters = kw

`lazyllm.module.llms.onlinemodule.base.onlineMultiModalBase.OnlineMultiModalBase`

Bases: LazyLLMOnlineBase, LLMBase

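Base class for online multimodal models. It inherits from LazyLLMOnlineBase and LLMBase and implements the functionality shared by multimodal models.

Parameters:

model_name (str) –

Model name, passed as `model` (or as the `model_name` keyword). Defaults to None; a warning is logged if it is not specified.
return_trace (bool, default: False ) –

Whether to return call-trace information. Defaults to False.
**kwargs –

Additional keyword arguments; `model_name` and `base_url` are read from them as aliases for `model` and `url`.

Attributes:

series: Returns the model series name.
type: Returns the model type, always "MultiModal".

Main methods:

share(): Creates a shared instance of the module.
forward(input, lazyllm_files, **kwargs): Main entry point that processes the input and files.
_forward(input, files, **kwargs): The concrete forward implementation, which subclasses must provide.

Notes:

Subclasses must implement the _forward method.
If the model name (model_name) is not specified, a warning is logged.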
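
The division of labor between `forward` and `_forward` can be sketched as follows. To stay runnable without LazyLLM installed, the example uses `MultiModalSketch`, a simplified stand-in written for this illustration rather than the real `OnlineMultiModalBase`; `EchoModel`, the model names, and the URL are likewise hypothetical.

```python
from typing import Any, Dict, List, Optional


class MultiModalSketch:
    """Simplified stand-in for OnlineMultiModalBase (illustration only)."""

    def __init__(self, model: Optional[str] = None, url: Optional[str] = None):
        self._model_name = model
        self._base_url = url

    def _forward(self, input: Any = None, files: Optional[List[str]] = None, **kwargs) -> Any:
        raise NotImplementedError(f'Subclass {self.__class__.__name__} must implement this method')

    def forward(self, input: Any = None, *, files: Optional[List[str]] = None,
                model: Optional[str] = None, url: Optional[str] = None, **kwargs) -> Any:
        # Per-call values win; otherwise fall back to the values given at construction.
        call_params: Dict[str, Any] = {'input': input, **kwargs}
        if files:
            call_params['files'] = files
        return self._forward(**call_params, model=model or self._model_name,
                             url=url or self._base_url)


class EchoModel(MultiModalSketch):
    """Hypothetical subclass that only reports what it was asked to do."""

    def _forward(self, input=None, files=None, model=None, url=None, **kwargs):
        return {'input': input, 'files': files or [], 'model': model, 'url': url}


m = EchoModel(model='demo-model', url='https://example.invalid/v1')
result = m.forward('describe this image', files=['cat.png'], model='other-model')
```

The subclass only decides what to do with the resolved model, URL, and files; argument resolution stays in the base class.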

Source code in lazyllm/module/llms/onlinemodule/base/onlineMultiModalBase.py

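class OnlineMultiModalBase(LazyLLMOnlineBase, LLMBase):
    """Base class for online multimodal models. It inherits from LazyLLMOnlineBase and LLMBase and implements the functionality shared by multimodal models.

Args:
    model_name (str): Model name, passed as ``model`` (or as the ``model_name`` keyword). Defaults to None; a warning is logged if it is not specified.
    return_trace (bool): Whether to return call-trace information. Defaults to False.
    **kwargs: Additional keyword arguments; ``model_name`` and ``base_url`` are read from them as aliases for ``model`` and ``url``.

Attributes:

    series: Returns the model series name.
    type: Returns the model type, always "MultiModal".

Main methods:

    share(): Creates a shared instance of the module.
    forward(input, lazyllm_files, **kwargs): Main entry point that processes the input and files.
    _forward(input, files, **kwargs): The concrete forward implementation, which subclasses must provide.

Notes:
    - Subclasses must implement the _forward method.
    - If the model name (model_name) is not specified, a warning is logged.
"""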
    __lazyllm_registry_disable__ = True

    def __init__(self, model: str = None, return_trace: bool = False, skip_auth: bool = False,
                 api_key: Optional[Union[str, List[str]]] = None, url: str = None, type: Optional[str] = None, **kwargs):
        super().__init__(api_key=api_key, skip_auth=skip_auth, return_trace=return_trace)
        LLMBase.__init__(self, stream=False, init_prompt=False, type=type)
        self._model_name = model if model is not None else kwargs.get('model_name')
        self._base_url = url if url is not None else kwargs.get('base_url')
        if not self._model_name:
            lazyllm.LOG.warning(f'model_name not specified for {self.series}')

    @property
    def type(self):
        return 'MultiModal'

    def _forward(self, input: Union[Dict, str] = None, files: List[str] = None, **kwargs):
        raise NotImplementedError(f'Subclass {self.__class__.__name__} must implement this method')

    def forward(self, input: Union[Dict, str] = None, *, lazyllm_files=None,
                url: str = None, model: str = None, **kwargs):
        input, files = self._get_files(input, lazyllm_files or kwargs.pop('files', None))
        model, _, url, kwargs = resolve_online_params(model, None, url, kwargs,
                                                      model_aliases='model_name', url_aliases='base_url')
        runtime_url = url or self._base_url
        runtime_model = model or self._model_name
        call_params = {'input': input, **kwargs}
        if files: call_params['files'] = files
        return self._forward(**call_params, model=runtime_model, url=runtime_url)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'OnlineMultiModalModule',
                                 series=self.series,
                                 name=self._model_name,
                                 return_trace=self._return_trace)

    def _is_internal_address(self, hostname: str) -> bool:
        try:
            ip_addresses = socket.gethostbyname_ex(hostname)[2]
            for ip_str in ip_addresses:
                ip = ipaddress.ip_address(ip_str)
                if ip.is_private or ip.is_loopback or ip.is_reserved or ip.is_link_local:
                    return True
            return False
        except Exception as e:
            lazyllm.LOG.warning(f'Failed to parse hostname={hostname}: {e}')
            return True

    def _validate_url_security(self, url: str) -> None:
        if not lazyllm.config['allow_internal_network']:
            parse = urlparse(url)
            hostname = parse.hostname
            if hostname and self._is_internal_address(hostname):
                raise ValueError(
                    f'Access to internal network address is not allowed: {hostname}. '
                    f'Set LAZYLLM_ALLOW_INTERNAL_NETWORK=True to enable internal network access.'
                )

    def _validate_image_content_type(self, content_type: str, source: str) -> None:
        if not content_type.startswith('image/'):
            raise ValueError(
                f'Invalid content type for image: {content_type} from {source}. '
                f'Expected content type starting with "image/".'
            )

    def _validate_image_data(self, data: bytes, source: str) -> None:
        try:
            with PIL.Image.open(BytesIO(data)) as img:
                img.verify()
        except Exception:
            raise ValueError(
                f'Invalid image data from {source}. '
                f'The file does not appear to be a valid image.'
            )

    def _get_image_data_from_url(self, url: str, timeout: int = 30) -> bytes:
        self._validate_url_security(url)
        resp = requests.get(url, timeout=timeout, allow_redirects=False)
        resp.raise_for_status()
        content_type = resp.headers.get('Content-Type', '')
        self._validate_image_content_type(content_type, url)
        data = resp.content
        self._validate_image_data(data, url)
        return data

    def _load_images(self, image_paths: Union[str, List[str]]) -> List[tuple]:
        if isinstance(image_paths, str):
            image_paths = [image_paths]
        results = []
        for image_path in image_paths:
            try:
                if image_path.startswith('http://') or image_path.startswith('https://'):
                    data = self._get_image_data_from_url(image_path)
                else:
                    p = Path(image_path)
                    if not p.exists():
                        raise FileNotFoundError(f'Image file not found: {image_path}')
                    data = p.read_bytes()
                    self._validate_image_data(data, image_path)
                base64_str = base64.b64encode(data).decode('utf-8')
                results.append((base64_str, data))
            except Exception as e:
                lazyllm.LOG.error(f'Unexpected error loading image from {image_path}: {str(e)}')
                raise ValueError(f'Failed to load image from {image_path}: {str(e)}')
        return results
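
The address test at the heart of `_is_internal_address` above can be sketched with the standard `ipaddress` module. This minimal version works on IP literals only; the method in the source additionally resolves hostnames with `socket.gethostbyname_ex` and treats a failed resolution as internal. `is_internal_ip` is a hypothetical helper name used for this example.

```python
import ipaddress


def is_internal_ip(ip_str: str) -> bool:
    """Return True if the IP literal is private, loopback, reserved, or link-local."""
    ip = ipaddress.ip_address(ip_str)
    return ip.is_private or ip.is_loopback or ip.is_reserved or ip.is_link_local


checks = {addr: is_internal_ip(addr)
          for addr in ('127.0.0.1', '10.0.0.5', '169.254.1.1', '8.8.8.8')}
```

Only the public address `8.8.8.8` passes the check; the other three would be rejected unless internal network access is explicitly allowed.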

`lazyllm.module.llms.onlinemodule.base.utils.LazyLLMOnlineBase`

Bases: ModuleBase

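Base class for LazyLLM online modules. It inherits from ModuleBase, uses LazyLLMRegisterMetaClass as its metaclass, and provides common base functionality for all online service modules.
The class encapsulates behavior shared by online modules, including caching and debug tracing, and serves as the foundation for modules that wrap online API services.

Features

Inherits all base functionality of ModuleBase, including submodule management and hook registration.
Supports caching for online modules, which can be enabled or disabled through configuration.
Provides debug tracing to help with troubleshooting and performance analysis.
Serves as the common base class for all online service modules (chat, embedding, multimodal, and so on).

Parameters:

return_trace (bool, default: False ) –

Whether to write inference results to the trace queue for debugging and tracing. Defaults to False.

Use cases

As the base class of the online chat module (OnlineChatModuleBase).
As the base class of the online embedding module (OnlineEmbeddingModuleBase).
As the base class of the online multimodal module (OnlineMultiModalBase).
To provide common base functionality for custom online service modules.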
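
One piece of shared behavior is how request headers are built from one or more API keys. The sketch below follows the logic of `_get_header` and `_header` in the source of this class, written as free functions so that it runs without LazyLLM; `pick_header` is a hypothetical name and the keys are placeholders.

```python
import random
from typing import Dict, List, Union


def get_header(api_key: str) -> Dict[str, str]:
    """Build JSON request headers; omit Authorization when no key is given."""
    header = {'Content-Type': 'application/json'}
    if api_key:
        header['Authorization'] = 'Bearer ' + api_key
    return header


def pick_header(api_keys: Union[str, List[str]]) -> Dict[str, str]:
    """Build one header per key, then choose one at random for a request."""
    keys = api_keys if isinstance(api_keys, list) else [api_keys]
    return random.choice([get_header(k) for k in keys])


single = pick_header('key-a')
rotated = pick_header(['key-a', 'key-b'])
```

Choosing a header at random on each access spreads requests across the supplied keys.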

Source code in lazyllm/module/llms/onlinemodule/base/utils.py

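class LazyLLMOnlineBase(ModuleBase, metaclass=LazyLLMRegisterMetaClass):
    """Base class for LazyLLM online modules. It inherits from ModuleBase, uses LazyLLMRegisterMetaClass as its metaclass, and provides common base functionality for all online service modules.  
The class encapsulates behavior shared by online modules, including caching and debug tracing, and serves as the foundation for modules that wrap online API services.

Features:
    - Inherits all base functionality of ModuleBase, including submodule management and hook registration.
    - Supports caching for online modules, which can be enabled or disabled through configuration.
    - Provides debug tracing to help with troubleshooting and performance analysis.
    - Serves as the common base class for all online service modules (chat, embedding, multimodal, and so on).

Args:
    return_trace (bool): Whether to write inference results to the trace queue for debugging and tracing. Defaults to ``False``.

Use cases:
    1. As the base class of the online chat module (OnlineChatModuleBase).
    2. As the base class of the online embedding module (OnlineEmbeddingModuleBase).
    3. As the base class of the online multimodal module (OnlineMultiModalBase).
    4. To provide common base functionality for custom online service modules.
"""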
    _model_series = None

    def __init__(self, api_key: Optional[Union[str, List[str]]],
                 skip_auth: Optional[bool] = False, return_trace: bool = False):
        super().__init__(return_trace=return_trace)
        if not skip_auth and not api_key: raise ValueError('api_key is required')
        self._dynamic_auth = (not skip_auth) and isinstance(api_key, str) and api_key in LAZY_API_KEY_TOKENS
        self.__api_keys = '' if skip_auth else api_key
        self.__headers = ([self._get_header('')] if skip_auth else None if self._dynamic_auth else
                          [self._get_header(key) for key in (api_key if isinstance(api_key, list) else [api_key])])
        if config['cache_online_module']: self.use_cache()

    @classmethod
    def _default_api_key(cls) -> str:
        if cls._model_series is None:
            raise ValueError(f'{cls.__name__} has no _model_series; cannot resolve default api_key.')
        return globals.config[f'{cls._model_series}_api_key']

    @property
    def series(self):
        return self.__class__._model_series

    def _materialize_lazy_api_key(self) -> str:
        return self._default_api_key()

    @property
    def _api_key(self):
        if self._dynamic_auth:
            return self._materialize_lazy_api_key()
        if isinstance(self.__api_keys, list):
            return random.choice(self.__api_keys)
        return self.__api_keys

    @staticmethod
    def _get_header(api_key: str) -> dict:
        return {'Content-Type': 'application/json', **({'Authorization': 'Bearer ' + api_key} if api_key else {})}

    def _get_empty_header(self, api_key: Optional[str] = None) -> dict:
        api_key = api_key or self._api_key
        return {'Authorization': f'Bearer {api_key}'} if api_key else None

    @property
    def _header(self):
        if self._dynamic_auth:
            return self._get_header(self._api_key)
        return random.choice(self.__headers)

    @staticmethod
    def __lazyllm_after_registry_hook__(cls, group_name: str, name: str, isleaf: bool):

        allowed = set(list(LLMType))
        config_type_dict = {
            LLMType.CHAT: ('', 'The default model name for '),
            LLMType.EMBED: ('', 'The default embed model name for '),
            LLMType.RERANK: ('', 'The default rerank model name for '),
            LLMType.MULTIMODAL_EMBED: ('_multimodal_embed', 'The default multimodal embed model name for '),
            LLMType.STT: ('_stt', 'The default stt model name for '),
            LLMType.TTS: ('_tts', 'The default tts model name for '),
            LLMType.TEXT2IMAGE: ('_text2image', 'The default text2image model name for '),
        }

        check_and_add_config(key='default_source', description='The default model source for online modules.')
        check_and_add_config(key='default_key', description='The default API key for online modules.')

        if group_name == '':
            assert name == 'online'
        elif not isleaf:
            assert group_name == 'online', 'The group can only be "online" here.'
            assert name.lower() in allowed, f'Registry key {name} not in list {allowed}'
        else:
            subgroup = group_name.split('.')[-1]
            assert name.lower().endswith(subgroup), (
                f'Class name {name} must follow the schema of <SupplierType>, '
                f'like <Qwen{subgroup.capitalize()}>')
            cls._model_series = supplier = name[:-len(subgroup)].lower()

            check_and_add_config(key=f'{supplier}_api_key',
                                 description=f'The API key for {supplier}', cfg=globals.config)

            if subgroup in config_type_dict:
                key_suffix, description = config_type_dict[subgroup]
                check_and_add_config(key=f'{supplier}{key_suffix}_model_name',
                                     description=f'{description}{supplier}', cfg=globals.config)
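
The naming rule enforced for leaf classes in `__lazyllm_after_registry_hook__` above can be sketched in isolation: the class name must end with the subgroup name, and the remaining prefix becomes the supplier. `derive_config_keys` is a hypothetical helper, and the group name `'online.chat'` is an assumed example value.

```python
from typing import Dict


def derive_config_keys(class_name: str, group_name: str) -> Dict[str, str]:
    """Split a SupplierType class name into a supplier and its API-key config key."""
    subgroup = group_name.split('.')[-1]
    if not class_name.lower().endswith(subgroup):
        raise ValueError(f'Class name {class_name} must end with {subgroup!r}')
    # Everything before the subgroup suffix is taken as the supplier name.
    supplier = class_name[:-len(subgroup)].lower()
    return {'supplier': supplier, 'api_key_config': f'{supplier}_api_key'}


keys = derive_config_keys('QwenChat', 'online.chat')
```

Under this convention a class named `QwenChat` yields the supplier `qwen` and the configuration key `qwen_api_key`.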

`lazyllm.module.module.ModuleCache`

Bases: object

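Module cache manager that provides a unified interface for storing and retrieving cached data.
The class wraps several cache strategies (memory, file, SQLite, Redis), selects the storage backend from configuration, and gives module execution results an efficient caching mechanism.

Features

Supports multiple cache strategies: in-memory, file, SQLite database, and Redis.
Selects the cache strategy automatically from configuration; the default is the in-memory cache.
Supports cache mode control (read-write, read-only, write-only, disabled).
Exposes a unified cache interface that hides the underlying storage implementation.
Hashes call arguments to build cache keys that distinguish different inputs.

Parameters:

strategy (Optional[str], default: None ) –

Cache strategy. Valid values are 'memory', 'file', 'sqlite', and 'redis'. Defaults to None, in which case the strategy from configuration is used.

Use cases

Caching module execution results to avoid repeated computation.
Sharing a cache across a distributed environment with Redis.
Persisting cached data with the file or database strategy.
Choosing a cache strategy according to performance requirements.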
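
Strategy selection amounts to a case-insensitive lookup in a name-to-class table, as in `_create_strategy` in the source of this class. A minimal sketch, using two toy backend classes (`MemoryStore` and `FileStore`, both hypothetical) in place of the real strategy classes:

```python
from typing import Dict, Type


class MemoryStore:
    """Toy stand-in for an in-memory cache backend."""


class FileStore:
    """Toy stand-in for a file-based cache backend."""


STRATEGIES: Dict[str, Type] = {'memory': MemoryStore, 'file': FileStore}


def create_strategy(name: str):
    """Look up a backend class by case-insensitive name and instantiate it."""
    name = name.lower()
    if name not in STRATEGIES:
        raise ValueError(f'Unsupported cache strategy: {name}. '
                         f'Available strategies: {list(STRATEGIES)}')
    return STRATEGIES[name]()


store = create_strategy('MEMORY')
```

An unknown name fails immediately with a message listing the supported strategies, rather than silently falling back to a default.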

Source code in lazyllm/module/module.py

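class ModuleCache(object):
    """Module cache manager that provides a unified interface for storing and retrieving cached data.  
The class wraps several cache strategies (memory, file, SQLite, Redis), selects the storage backend from configuration, and gives module execution results an efficient caching mechanism.

Features:
    - Supports multiple cache strategies: in-memory, file, SQLite database, and Redis.
    - Selects the cache strategy automatically from configuration; the default is the in-memory cache.
    - Supports cache mode control (read-write, read-only, write-only, disabled).
    - Exposes a unified cache interface that hides the underlying storage implementation.
    - Hashes call arguments to build cache keys that distinguish different inputs.

Args:
    strategy (Optional[str]): Cache strategy. Valid values are 'memory', 'file', 'sqlite', and 'redis'. Defaults to None, in which case the strategy from configuration is used.

Use cases:
    1. Caching module execution results to avoid repeated computation.
    2. Sharing a cache across a distributed environment with Redis.
    3. Persisting cached data with the file or database strategy.
    4. Choosing a cache strategy according to performance requirements.
"""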
    def __init__(self, strategy: Optional[str] = None):
        self._strategy = self._create_strategy(strategy or lazyllm.config['cache_strategy'])

    def _create_strategy(self, strategy: str) -> _CacheStorageStrategy:
        strategy = strategy.lower()
        strategies = {
            'memory': _MemoryCacheStrategy,
            'file': _FileCacheStrategy,
            'sqlite': _SQLiteCacheStrategy,
            'redis': _RedisCacheStrategy,
        }

        if strategy not in strategies:
            raise ValueError(f'Unsupported cache strategy: {strategy}. '
                             f'Available strategies: {list(strategies.keys())}')
        return strategies[strategy]()

    def _hash(self, args, kw):
        def process_value(value, hash_obj):
            meta = ''
            if isinstance(value, (list, tuple, dict, set)):
                meta = str(type(value)) + str(len(value))
            if isinstance(value, str):
                hash_obj.update(str(file_content_hash(value)).encode())
            elif isinstance(value, set):
                hash_obj.update((meta + '>').encode())
                for item in sorted(value):
                    process_value(item, hash_obj)
                hash_obj.update(('<' + meta).encode())
            elif isinstance(value, (list, tuple)):
                hash_obj.update((meta + '>').encode())
                for item in value:
                    process_value(item, hash_obj)
                hash_obj.update(('<' + meta).encode())
            elif isinstance(value, dict):
                hash_obj.update((meta + '>').encode())
                for k, v in sorted(value.items()):
                    key_meta = 'key:' + str(type(k)) + str(k)
                    hash_obj.update(key_meta.encode())
                    process_value(v, hash_obj)
                hash_obj.update(('<' + meta).encode())
            else:
                value_meta = str(type(value)) + str(value)
                hash_obj.update(value_meta.encode())
        hash_obj = hashlib.md5()
        process_value(args, hash_obj)
        if kw:
            process_value(kw, hash_obj)
        return hash_obj.hexdigest()

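    def get(self, key, args, kw):
        """Retrieve data from the cache.

Looks up cached data using the given key and arguments. Raises an exception if the cache mode does not allow reading or if the data does not exist.

Args:
    key: Cache key that identifies the cached data.
    args: Positional arguments, used to compute the cache hash key.
    kw: Keyword arguments, used to compute the cache hash key.

**Returns:**

- Any: The data stored in the cache.

**Raises:** 

- CacheNotFoundError: Raised when the requested data is not in the cache, or when the cache mode does not allow reading (for example, write-only, WO).
"""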
        if 'R' not in lazyllm.config['cache_mode']:
            raise CacheNotFoundError('Cannot read cache due to `LAZYLLM_CACHE_MODE = WO`')
        hash_key = self._hash(args, kw)
        value = self._strategy.get(key, hash_key)
        return transform_path(value, mode='r2a')

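    def set(self, key, args, kw, value):
        """Store data in the cache.

Stores the data under the given key and arguments. If the cache mode does not allow writing, the method returns immediately without storing anything.

Args:
    key: Cache key that identifies the cached data.
    args: Positional arguments, used to compute the cache hash key.
    kw: Keyword arguments, used to compute the cache hash key.
    value: The data to store.

**Note:** 

- If the cache mode is read-only (RO) or disabled (NONE), this method returns immediately without storing anything.
"""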
        if 'W' not in lazyllm.config['cache_mode']: return
        hash_key = self._hash(args, kw)
        value = transform_path(value, mode='a2r')
        self._strategy.set(key, hash_key, value)

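    def close(self):
        """Close the cache storage strategy.

Releases the resources held by the cache storage strategy, such as database connections or in-memory data.

**Note:** 

- After this method is called, the cache instance can no longer be used.
- Resource cleanup behavior may differ between cache strategies.
"""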
        self._strategy.close()
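
The idea behind `_hash` above, hashing nested arguments so that dictionary and set ordering do not affect the key while list ordering does, can be sketched as follows. This is a simplified approximation, not the library's implementation: strings are hashed directly here, whereas the source routes them through `file_content_hash`, and items are sorted by `repr` so that mixed-type keys do not raise. `stable_hash` is a hypothetical name.

```python
import hashlib
from typing import Any


def stable_hash(args: tuple, kw: dict) -> str:
    """Hash nested arguments so that dict key order and set order do not matter."""
    h = hashlib.md5()

    def feed(value: Any) -> None:
        if isinstance(value, dict):
            h.update(f'dict{len(value)}>'.encode())
            for k, v in sorted(value.items(), key=lambda kv: repr(kv[0])):
                h.update(f'key:{type(k)}{k}'.encode())
                feed(v)
            h.update(b'<dict')
        elif isinstance(value, (set, frozenset)):
            h.update(f'set{len(value)}>'.encode())
            for item in sorted(value, key=repr):
                feed(item)
            h.update(b'<set')
        elif isinstance(value, (list, tuple)):
            h.update(f'{type(value).__name__}{len(value)}>'.encode())
            for item in value:
                feed(item)
            h.update(b'<seq')
        else:
            # Include the type so that 1 and '1' hash differently.
            h.update(f'{type(value)}{value}'.encode())

    feed(args)
    if kw:
        feed(kw)
    return h.hexdigest()


same = stable_hash((1, 'a'), {'x': 1, 'y': 2}) == stable_hash((1, 'a'), {'y': 2, 'x': 1})
different = stable_hash(([1, 2],), {}) != stable_hash(([2, 1],), {})
typed = stable_hash((1,), {}) != stable_hash(('1',), {})
```

Writing container type and length markers around each collection keeps differently nested inputs, such as `[[1], 2]` and `[1, [2]]`, from feeding identical byte streams to the hash.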

`close()`

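Close the cache storage strategy.

Releases the resources held by the cache storage strategy, such as database connections or in-memory data.

Note:

After this method is called, the cache instance can no longer be used.
Resource cleanup behavior may differ between cache strategies.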

Source code in lazyllm/module/module.py

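    def close(self):
        """Close the cache storage strategy.

Releases the resources held by the cache storage strategy, such as database connections or in-memory data.

**Note:** 

- After this method is called, the cache instance can no longer be used.
- Resource cleanup behavior may differ between cache strategies.
"""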
        self._strategy.close()

`get(key, args, kw)`

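Retrieve data from the cache.

Looks up cached data using the given key and arguments. Raises an exception if the cache mode does not allow reading or if the data does not exist.

Parameters:

key –

Cache key that identifies the cached data.
args –

Positional arguments, used to compute the cache hash key.
kw –

Keyword arguments, used to compute the cache hash key.

Returns:

Any: The data stored in the cache.

Raises:

CacheNotFoundError: Raised when the requested data is not in the cache, or when the cache mode does not allow reading (for example, write-only, WO).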
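
Since a miss and a disabled read path both surface as an exception, callers typically wrap the lookup in a get-or-compute pattern. The sketch below shows that pattern against `TinyCache`, a toy cache written for this example; `CacheNotFoundError` is redefined locally as a stand-in for the library's exception, and the keys are placeholders.

```python
from typing import Any, Callable, Dict, Tuple


class CacheNotFoundError(Exception):
    """Local stand-in for the exception raised on a cache miss."""


class TinyCache:
    """Toy cache whose read path applies the same kind of mode check as get()."""

    def __init__(self, mode: str = 'RW'):
        self.mode = mode
        self._data: Dict[Tuple[str, str], Any] = {}

    def get(self, key: str, hash_key: str) -> Any:
        if 'R' not in self.mode:
            raise CacheNotFoundError('reading is disabled by the cache mode')
        try:
            return self._data[(key, hash_key)]
        except KeyError:
            raise CacheNotFoundError(f'no entry for {key}/{hash_key}') from None

    def set(self, key: str, hash_key: str, value: Any) -> None:
        self._data[(key, hash_key)] = value


def get_or_compute(cache: TinyCache, key: str, hash_key: str,
                   compute: Callable[[], Any]) -> Any:
    """Return the cached value, computing and storing it on a miss."""
    try:
        return cache.get(key, hash_key)
    except CacheNotFoundError:
        value = compute()
        cache.set(key, hash_key, value)
        return value


calls = []
cache = TinyCache()
first = get_or_compute(cache, 'module-a', 'h1', lambda: calls.append(1) or 'result')
second = get_or_compute(cache, 'module-a', 'h1', lambda: calls.append(1) or 'result')
```

The second call is served from the cache, so the compute function runs only once.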

Source code in lazyllm/module/module.py

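    def get(self, key, args, kw):
        """Retrieve data from the cache.

Looks up cached data using the given key and arguments. Raises an exception if the cache mode does not allow reading or if the data does not exist.

Args:
    key: Cache key that identifies the cached data.
    args: Positional arguments, used to compute the cache hash key.
    kw: Keyword arguments, used to compute the cache hash key.

**Returns:**

- Any: The data stored in the cache.

**Raises:** 

- CacheNotFoundError: Raised when the requested data is not in the cache, or when the cache mode does not allow reading (for example, write-only, WO).
"""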
        if 'R' not in lazyllm.config['cache_mode']:
            raise CacheNotFoundError('Cannot read cache due to `LAZYLLM_CACHE_MODE = WO`')
        hash_key = self._hash(args, kw)
        value = self._strategy.get(key, hash_key)
        return transform_path(value, mode='r2a')

`set(key, args, kw, value)`

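Store data in the cache.

Stores the data under the given key and arguments. If the cache mode does not allow writing, the method returns immediately without storing anything.

Parameters:

key –

Cache key that identifies the cached data.
args –

Positional arguments, used to compute the cache hash key.
kw –

Keyword arguments, used to compute the cache hash key.
value –

The data to store.

Note:

If the cache mode is read-only (RO) or disabled (NONE), this method returns immediately without storing anything.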
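
The write check is a substring test on the mode string: a value is stored only when the mode contains `'W'`. A minimal sketch with a plain dictionary as the store; `cache_set` is a hypothetical helper, and `'RW'` is assumed here as the name of the read-write mode alongside the RO, WO, and NONE values mentioned in this document.

```python
from typing import Any, Dict


def cache_set(store: Dict[str, Any], mode: str, key: str, value: Any) -> bool:
    """Store value only if the mode string contains 'W'; report whether it was stored."""
    if 'W' not in mode:
        return False
    store[key] = value
    return True


store: Dict[str, Any] = {}
written = {mode: cache_set(store, mode, f'k-{mode}', 1)
           for mode in ('RW', 'RO', 'WO', 'NONE')}
```

Only the `RW` and `WO` calls leave an entry in the store; the other two return without side effects.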

Source code in lazyllm/module/module.py

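    def set(self, key, args, kw, value):
        """Store data in the cache.

Stores the data under the given key and arguments. If the cache mode does not allow writing, the method returns immediately without storing anything.

Args:
    key: Cache key that identifies the cached data.
    args: Positional arguments, used to compute the cache hash key.
    kw: Keyword arguments, used to compute the cache hash key.
    value: The data to store.

**Note:** 

- If the cache mode is read-only (RO) or disabled (NONE), this method returns immediately without storing anything.
"""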
        if 'W' not in lazyllm.config['cache_mode']: return
        hash_key = self._hash(args, kw)
        value = transform_path(value, mode='a2r')
        self._strategy.set(key, hash_key, value)

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenChat`

Bases: OnlineChatModuleBase, FileHandlerBase

TODO: The Qianwen model has been finetuned and deployed successfully, but it is not compatible with the OpenAI interface and can only be accessed through the Dashscope SDK.
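
Before a training file is uploaded, `_convert_file_format` in the source below rewrites each JSONL record so that only `system`, `user`, and `assistant` messages remain. A minimal sketch of that filtering step, operating on an in-memory list of lines instead of a file; `convert_lines` is a hypothetical helper and the sample record is made up for illustration.

```python
import json
from typing import Iterable

ALLOWED_ROLES = ('system', 'user', 'assistant')


def convert_lines(lines: Iterable[str]) -> str:
    """Keep only system/user/assistant messages from each JSONL record."""
    out = []
    for line in lines:
        record = json.loads(line)
        messages = [{'role': m.get('role', ''), 'content': m.get('content', '')}
                    for m in record.get('messages', [])
                    if m.get('role', '') in ALLOWED_ROLES]
        # ensure_ascii=False keeps non-ASCII text readable in the output file.
        out.append(json.dumps({'messages': messages}, ensure_ascii=False))
    return '\n'.join(out)


sample = ['{"messages": [{"role": "user", "content": "hi"}, {"role": "tool", "content": "x"}]}']
converted = convert_lines(sample)
```

Messages with any other role, such as the `tool` entry in the sample, are dropped from the converted record.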

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

class QwenChat(OnlineChatModuleBase, FileHandlerBase):
    """
    #TODO: The Qianwen model has been finetuned and deployed successfully,
           but it is not compatible with the OpenAI interface and can only
           be accessed through the Dashscope SDK.
    """
    TRAINABLE_MODEL_LIST = ['qwen-turbo', 'qwen-7b-chat', 'qwen-72b-chat']
    VLM_MODEL_PREFIX = ['qwen-vl-plus', 'qwen-vl-max', 'qvq-max', 'qvq-plus']
    MODEL_NAME = 'qwen-plus'

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://dashscope.aliyuncs.com/'
        super().__init__(api_key=api_key or self._default_api_key(),
                         model_name=model or lazyllm.config['qwen_model_name'] or QwenChat.MODEL_NAME,
                         base_url=base_url, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self._deploy_paramters = dict()
        if stream:
            self._model_optional_params['incremental_output'] = True
        self.default_train_data = {
            'model': 'qwen-turbo',
            'training_file_ids': None,
            'validation_file_ids': None,
            'training_type': 'efficient_sft',  # sft or efficient_sft
            'hyper_parameters': {
                'n_epochs': 1,
                'batch_size': 16,
                'learning_rate': '1.6e-5',
                'split': 0.9,
                'warmup_ratio': 0.0,
                'eval_steps': 1,
                'lr_scheduler_type': 'linear',
                'max_length': 2048,
                'lora_rank': 8,
                'lora_alpha': 32,
                'lora_dropout': 0.1,
            }
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return ('You are a large-scale language model from Alibaba Cloud, '
                'your name is Tongyi Qianwen, and you are a useful assistant.')

    def _get_chat_url(self, url):
        if url.rstrip('/').endswith('compatible-mode/v1/chat/completions'):
            return url
        return urljoin(url, 'compatible-mode/v1/chat/completions')

    def _convert_file_format(self, filepath: str) -> None:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        url = urljoin(self._base_url, 'api/v1/files')
        self.get_finetune_data(train_file)
        file_object = {
            # The correct format should be to pass in a tuple in the format of:
            # (<fileName>, <fileObject>, <Content-Type>),
            # where fileObject refers to the specific value.
            'files': (os.path.basename(train_file), self._dataHandler, 'application/json'),
            'descriptions': (None, 'training file', None)
        }

        with requests.post(url, headers=self._get_empty_header(), files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            if 'data' not in r.json().keys():
                raise ValueError('No data found in response')
            if 'uploaded_files' not in r.json()['data'].keys():
                raise ValueError('No uploaded_files found in response')
            # delete temporary training file
            self._dataHandler.close()
            return r.json()['data']['uploaded_files'][0]['file_id']

    def _update_kw(self, data, normal_config):
        current_train_data = self.default_train_data.copy()
        current_train_data.update(data)

        current_train_data['hyper_parameters']['n_epochs'] = normal_config['num_epochs']
        current_train_data['hyper_parameters']['learning_rate'] = str(normal_config['learning_rate'])
        current_train_data['hyper_parameters']['lr_scheduler_type'] = normal_config['lr_scheduler_type']
        current_train_data['hyper_parameters']['batch_size'] = normal_config['batch_size']
        current_train_data['hyper_parameters']['max_length'] = normal_config['cutoff_len']
        current_train_data['hyper_parameters']['lora_rank'] = normal_config['lora_r']
        current_train_data['hyper_parameters']['lora_alpha'] = normal_config['lora_alpha']

        return current_train_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'api/v1/fine-tunes')
        data = {
            'model': train_model,
            'training_file_ids': [train_file_id]
        }
        if 'training_parameters' in kw.keys():
            data.update(kw['training_parameters'])
        elif 'finetuning_type' in kw:
            data = self._update_kw(data, kw)

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['output']['job_id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = r.json()['output']['status']
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'api/v1/fine-tunes/{job_id}/cancel')
        with requests.post(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['output']['status']
        if status == 'success':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = urljoin(self._base_url, 'api/v1/fine-tunes')
        with requests.get(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        if 'jobs' not in model_data['output']:
            return res
        for model in model_data['output']['jobs']:
            status = self._status_mapping(model['status'])
            if status == 'Done':
                model_id = model['finetuned_output']
            else:
                model_id = model['model'] + '-' + model['job_id']
            res.append([model['job_id'], model_id, status])
        return res

    def _status_mapping(self, status):
        if status == 'SUCCEEDED':
            return 'Done'
        elif status == 'FAILED':
            return 'Failed'
        elif status in ('CANCELING', 'CANCELED'):
            return 'Cancelled'
        elif status == 'RUNNING':
            return 'Running'
        else:  # PENDING, QUEUING
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'api/v1/fine-tunes/{job_id}/logs')
        with requests.get(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = urljoin(self._base_url, f'api/v1/fine-tunes/{fine_tuning_job_id}')
        with requests.get(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()['output']

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        # Qwen only returns `finetuned_output` once the job status is 'SUCCEEDED'
        if 'finetuned_output' in info:
            fine_tuned_model = info['finetuned_output']
        else:
            fine_tuned_model = info['model'] + '-' + info['job_id']
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'usage' in info and info['usage']:
            return info['usage']
        else:
            return None

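    def set_deploy_parameters(self, **kw):
        """Set the model deployment parameters.

Configures the parameters of the deployment job, such as the capacity specification, for use in a later model deployment.

Args:
    **kw: Deployment parameters as key-value pairs.
"""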
        self._deploy_paramters = kw

    def _create_deployment(self) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'api/v1/deployments')
        data = {
            'model_name': self._model_name,
            'capacity': self._deploy_paramters.get('capacity', 2)
        }
        if self._deploy_paramters and len(self._deploy_paramters) > 0:
            data.update(self._deploy_paramters)

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            deployment_id = r.json()['output']['deployed_model']
            status = r.json()['output']['status']
            return (deployment_id, status)

    def _query_deployment(self, deployment_id) -> str:
        fine_tune_url = urljoin(self._base_url, f'api/v1/deployments/{deployment_id}')
        with requests.get(fine_tune_url, headers=self._header) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            status = r.json()['output']['status']
            return status

    def _format_vl_chat_image_url(self, image_url, mime):
        assert mime is not None, 'Qwen Module requires mime info.'
        image_url = f'data:{mime};base64,{image_url}'
        return [{'type': 'image_url', 'image_url': {'url': image_url}}]
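
The `_status_mapping` method in the listing above folds the raw Qwen job statuses into a small set of unified labels. The same rule can be sketched as a table lookup. The names `QWEN_STATUS_MAP` and `map_qwen_status` are hypothetical and not part of LazyLLM; the status strings are taken from the listing.

```python
# Hypothetical standalone helper; it mirrors the branches of `_status_mapping` above.
QWEN_STATUS_MAP = {
    'SUCCEEDED': 'Done',
    'FAILED': 'Failed',
    'CANCELING': 'Cancelled',
    'CANCELED': 'Cancelled',
    'RUNNING': 'Running',
}


def map_qwen_status(status: str) -> str:
    """Translate a raw job status into a unified label; anything else counts as pending."""
    return QWEN_STATUS_MAP.get(status, 'Pending')


print(map_qwen_status('SUCCEEDED'))  # Done
print(map_qwen_status('QUEUING'))    # Pending
```

A lookup table keeps the fallback explicit: any status that is not listed, such as PENDING or QUEUING, is reported as 'Pending', just as in the `else` branch of the original method.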

`set_deploy_parameters(**kw)`

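Set the model deployment parameters.

Configures the parameters of the deployment job, such as the capacity specification, for use in a later model deployment.

Parameters:

**kw –

Deployment parameters as key-value pairs.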
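
In the `_create_deployment` listing, the stored parameters are merged into the request body on top of a default capacity of 2. A minimal sketch of that merge, assuming a hypothetical standalone helper `build_deploy_payload` and a placeholder model name (neither is part of LazyLLM):

```python
from typing import Any, Dict, Optional


def build_deploy_payload(model_name: str,
                         deploy_params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper: start from the defaults, then let user parameters override them."""
    payload: Dict[str, Any] = {'model_name': model_name, 'capacity': 2}  # 2 is the fallback in the listing
    payload.update(deploy_params or {})
    return payload


# 'my-finetuned-model' is a placeholder, not a real model identifier.
print(build_deploy_payload('my-finetuned-model'))
print(build_deploy_payload('my-finetuned-model', {'capacity': 4}))
```

Because the user parameters are applied last, any key passed to `set_deploy_parameters` takes precedence over the defaults.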

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

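    def set_deploy_parameters(self, **kw):
        """Set the model deployment parameters.

Configures the parameters of the deployment job, such as the capacity specification, for use in a later model deployment.

Args:
    **kw: Deployment parameters as key-value pairs.
"""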
        self._deploy_paramters = kw

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenEmbed`

Bases: LazyLLMOnlineEmbedModuleBase

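Online text embedding module for Tongyi Qianwen (Qwen).

This class derives from LazyLLMOnlineEmbedModuleBase and wraps the Qwen text embedding API, converting text into vector representations.

Parameters:

embed_url (str, default: None ) –

URL of the embedding API. Defaults to the official Qwen API endpoint.
embed_model_name (str, default: None ) –

Name of the embedding model. Defaults to 'text-embedding-v1'.
api_key (str, default: None ) –

API key. Defaults to the 'qwen_api_key' value from the configuration.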

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py

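class QwenEmbed(LazyLLMOnlineEmbedModuleBase):
    """Online text embedding module for Tongyi Qianwen (Qwen).

This class derives from LazyLLMOnlineEmbedModuleBase and wraps the Qwen text embedding API, converting text into vector representations.

Args:
    embed_url (str, optional): URL of the embedding API. Defaults to the official Qwen API endpoint.
    embed_model_name (str, optional): Name of the embedding model. Defaults to 'text-embedding-v1'.
    api_key (str, optional): API key. Defaults to the 'qwen_api_key' value from the configuration.
"""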

    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: Optional[str] = None,
                 api_key: str = None,
                 batch_size: int = 16,
                 **kw):
        embed_url = (embed_url or 'https://dashscope.aliyuncs.com/api/v1/services/'
                                  'embeddings/text-embedding/text-embedding')
        embed_model_name = embed_model_name or 'text-embedding-v1'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name,
                         batch_size=batch_size, **kw)

    def _get_embed_url(self, url: str) -> str:
        url, done = self._normalize_embed_url(url)
        if done: return url
        if url.rstrip('/').endswith('api/v1'):
            return urljoin(url, 'services/embeddings/text-embedding/text-embedding')
        return urljoin(url, 'api/v1/services/embeddings/text-embedding/text-embedding')

    def _encapsulated_data(self, text: Union[List, str], **kwargs):
        if isinstance(text, str):
            json_data = {
                'input': {
                    'texts': [text]
                },
                'model': self._embed_model_name
            }
            if len(kwargs) > 0:
                json_data.update(kwargs)
            return json_data
        else:
            text_batch = [text[i: i + self._batch_size] for i in range(0, len(text), self._batch_size)]
            json_data = [{'input': {'texts': texts}, 'model': self._embed_model_name} for texts in text_batch]
            if len(kwargs) > 0:
                for i in range(len(json_data)):
                    json_data[i].update(kwargs)
            return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
        output = response.get('output', {})
        if not output:
            return []
        embeddings = output.get('embeddings', [])
        if not embeddings:
            return []
        if isinstance(input, str):
            return embeddings[0].get('embedding', [])
        else:
            return [res.get('embedding', []) for res in embeddings]
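
The `_encapsulated_data` method above returns one request body for a single string and a list of bodies, one per batch, for a list of texts. The batching step can be sketched on its own. `build_embed_payloads` is a hypothetical standalone function, not a LazyLLM API, and it leaves out the extra keyword arguments that the original merges into each body.

```python
from typing import Any, Dict, List, Union


def build_embed_payloads(text: Union[str, List[str]], model: str,
                         batch_size: int = 16) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Hypothetical sketch of the batching rule used in `_encapsulated_data`."""
    if isinstance(text, str):
        # A single string becomes one request body holding a one-element list.
        return {'input': {'texts': [text]}, 'model': model}
    # A list is cut into consecutive slices of at most `batch_size` items.
    batches = [text[i:i + batch_size] for i in range(0, len(text), batch_size)]
    return [{'input': {'texts': batch}, 'model': model} for batch in batches]


payloads = build_embed_payloads([f'doc {i}' for i in range(5)], 'text-embedding-v1', batch_size=2)
print([len(p['input']['texts']) for p in payloads])  # [2, 2, 1]
```

The last batch is simply shorter when the number of texts is not a multiple of `batch_size`.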

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMEmbed`

Bases: LazyLLMOnlineEmbedModuleBase

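Interface class for GLM embedding models, used to call Zhipu AI's text embedding service.

Parameters:

embed_url (str, default: None ) –

URL of the embedding service API. Defaults to "https://open.bigmodel.cn/api/paas/v4/embeddings".
embed_model_name (str, default: None ) –

Name of the embedding model. Defaults to "embedding-2".
api_key (str, default: None ) –

API key.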

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py

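class GLMEmbed(LazyLLMOnlineEmbedModuleBase):
    """Interface class for GLM embedding models, used to call Zhipu AI's text embedding service.

Args:
    embed_url (str): URL of the embedding service API. Defaults to "https://open.bigmodel.cn/api/paas/v4/embeddings".
    embed_model_name (str): Name of the embedding model. Defaults to "embedding-2".
    api_key (str): API key.
"""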
    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: Optional[str] = None,
                 api_key: str = None,
                 batch_size: int = 16,
                 **kw):
        embed_url = embed_url or 'https://open.bigmodel.cn/api/paas/v4/embeddings'
        embed_model_name = embed_model_name or 'embedding-2'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name,
                         batch_size=batch_size, **kw)

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMSTT`

Bases: LazyLLMOnlineSTTModuleBase

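GLM speech recognition module, derived from LazyLLMOnlineSTTModuleBase.

Provides speech-to-text (STT) backed by Zhipu AI and transcribes audio files.

Parameters:

model_name (str, default: None ) –

Model name. Defaults to the model name from the configuration, or "glm-asr".
api_key (str, default: None ) –

API key. Defaults to the key from the configuration.
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional model parameters.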

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py

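class GLMSTT(LazyLLMOnlineSTTModuleBase):
    """GLM speech recognition module, derived from LazyLLMOnlineSTTModuleBase.

Provides speech-to-text (STT) backed by Zhipu AI and transcribes audio files.

Args:
    model_name (str, optional): Model name. Defaults to the model name from the configuration, or "glm-asr".
    api_key (str, optional): API key. Defaults to the key from the configuration.
    return_trace (bool, optional): Whether to return trace information. Defaults to False.
    **kwargs: Additional model parameters.
"""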
    MODEL_NAME = 'glm-asr'

    def __init__(self, model_name: str = None, api_key: str = None,
                 base_url: Optional[str] = None,
                 return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://open.bigmodel.cn/api/paas/v4'
        super().__init__(model_name=model_name or GLMSTT.MODEL_NAME,
                         api_key=api_key or self._default_api_key(), return_trace=return_trace,
                         base_url=base_url, **kwargs)

    def _forward(self, files: List[str] = [], url: str = None, model: str = None, **kwargs):  # noqa B006
        assert len(files) == 1, 'GLMSTT only supports one file'
        assert os.path.exists(files[0]), f'File {files[0]} not found'
        client = _zhipu_client(base_url=url or self._base_url, api_key=self._api_key)
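        # Open the audio file in a context manager so the handle is always closed.
        with open(files[0], 'rb') as audio_file:
            transcriptResponse = client.audio.transcriptions.create(
                model=model,
                file=audio_file,
            )
        return transcriptResponse.text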
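
`_forward` accepts exactly one audio file that must exist, and it enforces this with `assert`. Assertions are skipped when Python runs with `-O`, so one alternative is to raise explicit exceptions. The sketch below is a hypothetical standalone helper (`pick_single_audio_file` is not a LazyLLM function) that applies the same two checks.

```python
import os
import tempfile
from typing import List


def pick_single_audio_file(files: List[str]) -> str:
    """Hypothetical helper: enforce the one-existing-file rule with exceptions instead of asserts."""
    if len(files) != 1:
        raise ValueError(f'expected exactly one audio file, got {len(files)}')
    if not os.path.exists(files[0]):
        raise FileNotFoundError(f'File {files[0]} not found')
    return files[0]


# A temporary file stands in for a real recording.
with tempfile.NamedTemporaryFile(suffix='.wav') as tmp:
    print(pick_single_audio_file([tmp.name]) == tmp.name)  # True
```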

`lazyllm.module.llms.onlinemodule.supplier.deepseek.DeepSeekChat`

Bases: OnlineChatModuleBase

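Interface module for DeepSeek large language models.

Parameters:

base_url (str, default: None ) –

Base URL of the API. Defaults to "https://api.deepseek.com".
model (str, default: None ) –

Model name. Defaults to "deepseek-chat".
api_key (str, default: None ) –

API key. If None, it is read from the configuration.
stream (bool, default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional arguments passed to the base class.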

Source code in lazyllm/module/llms/onlinemodule/supplier/deepseek.py

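class DeepSeekChat(OnlineChatModuleBase):
    """Interface module for DeepSeek large language models.

Args:
    base_url (str): Base URL of the API. Defaults to "https://api.deepseek.com".
    model (str): Model name. Defaults to "deepseek-chat".
    api_key (str): API key. If None, it is read from the configuration.
    stream (bool): Whether to enable streaming output. Defaults to True.
    return_trace (bool): Whether to return trace information. Defaults to False.
    **kwargs: Additional arguments passed to the base class.
"""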
    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://api.deepseek.com'
        model = model or 'deepseek-chat'
        if model in ('deepseek-chat', 'deepseek-reasoner'):
            LOG.warning(
                f'Model "{model}" is deprecated and will be removed after 2026/07/24. '
                'Please use "deepseek-v4-flash" or "deepseek-v4-pro" instead.')
        super().__init__(api_key=api_key or self._default_api_key(),
                         base_url=base_url, model_name=model, stream=stream, return_trace=return_trace, **kwargs)

    def _get_system_prompt(self):
        return 'You are an intelligent assistant developed by China\'s DeepSeek. You are a helpful assistant.'

    def _validate_api_key(self):
        try:
            models_url = urljoin(self._base_url, 'models')
            response = requests.get(models_url, headers=self._header, timeout=10)
            return response.status_code == 200
        except Exception:
            return False
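
`_validate_api_key` builds its endpoint with `urljoin(self._base_url, 'models')`. How `urljoin` resolves the relative part depends on whether the base URL ends with a slash, which matters when a custom `base_url` contains a path. The sketch below uses only the standard library. `models_url` is a hypothetical helper and the `example.com` addresses are placeholders.

```python
from urllib.parse import urljoin


def models_url(base_url: str) -> str:
    """Hypothetical helper: resolve the 'models' endpoint the way the listing does."""
    return urljoin(base_url, 'models')


print(models_url('https://api.deepseek.com'))       # https://api.deepseek.com/models
print(models_url('https://example.com/proxy/v1/'))  # https://example.com/proxy/v1/models
print(models_url('https://example.com/proxy/v1'))   # https://example.com/proxy/models
```

In the last call the final path segment `v1` is replaced, because `urljoin` treats a base without a trailing slash as pointing at a file rather than a directory. A custom base URL with a path should therefore end with `/`.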

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoText2Image`

Bases: LazyLLMOnlineText2ImageModuleBase

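ByteDance Doubao text-to-image module, supporting both text-only image generation and image-editing models.

Built on ByteDance's Doubao multimodal models, this module provides text-to-image generation and image editing. It derives from LazyLLMOnlineText2ImageModuleBase and calls the Doubao API through the Volcano Engine Ark SDK.

Parameters:

api_key (str, default: None ) –

Doubao API key. Defaults to None.
model (str, default: None ) –

Model name. When omitted, falls back to lazyllm.config['doubao_text2image_model_name'] and then to "doubao-seedream-3-0-t2i-250415".
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional arguments passed to the parent class.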

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py

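class DoubaoText2Image(LazyLLMOnlineText2ImageModuleBase):
    """ByteDance Doubao text-to-image module, supporting both text-only image generation and image-editing models.

Built on ByteDance's Doubao multimodal models, this module provides text-to-image generation and image editing.
It derives from LazyLLMOnlineText2ImageModuleBase and calls the Doubao API through the Volcano Engine Ark SDK.

Args:
    api_key (str, optional): Doubao API key. Defaults to None.
    model (str, optional): Model name. When omitted, falls back to lazyllm.config['doubao_text2image_model_name']
        and then to "doubao-seedream-3-0-t2i-250415".
    return_trace (bool, optional): Whether to return trace information. Defaults to False.
    **kwargs: Additional arguments passed to the parent class.
"""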
    MODEL_NAME = 'doubao-seedream-3-0-t2i-250415'
    IMAGE_EDITING_MODEL_NAME = 'doubao-seedream-3-0-t2i-250415'

    def __init__(self, api_key: str = None, model: Optional[str] = None, url: Optional[str] = None,
                 return_trace: bool = False, **kwargs):
        url = url or 'https://ark.cn-beijing.volces.com/api/v3'
        resolved_model = model or lazyllm.config['doubao_text2image_model_name'] or DoubaoText2Image.MODEL_NAME
        super().__init__(model=resolved_model, api_key=api_key or self._default_api_key(),
                         return_trace=return_trace, url=url, **kwargs)

    def _ark_client(self, base_url=None):
        return volcenginesdkarkruntime.Ark(base_url=(base_url or self._base_url), api_key=self._api_key)

    def _forward(self, input: str = None, files: List[str] = None, n: int = 1, size: str = '1024x1024', seed: int = -1,
                 guidance_scale: float = 2.5, watermark: bool = True, model: str = None, url: str = None, **kwargs):
        has_ref_image = files is not None and len(files) > 0
        if self._type == LLMType.IMAGE_EDITING and not has_ref_image:
            LOG.warning(
                f'Image editing is enabled for model {self._model_name}, but no image file was provided. '
                f'Please provide an image file via the "files" parameter.'
            )
        if self._type != LLMType.IMAGE_EDITING and has_ref_image:
            msg = str(f'Image file was provided, but image editing is not enabled for model {self._model_name}. Please '
                      f'use default image-editing model {self.IMAGE_EDITING_MODEL_NAME} or other image-editing model.')
            raise ValueError(msg)

        if has_ref_image:
            image_results = self._load_images(files)
            contents = [f'data:image/png;base64,{base64_str}' for base64_str, _ in image_results]
        api_params = {
            'model': model,
            'prompt': input,
            'size': size,
            'seed': seed,
            'guidance_scale': guidance_scale,
            'watermark': watermark,
            **kwargs
        }
        if has_ref_image:
            api_params['image'] = contents
            if n > 1:
                api_params['sequential_image_generation'] = 'auto'
                max_images = min(n, 15)
                sigo = volcenginesdkarkruntime.types.images.SequentialImageGenerationOptions
                api_params['sequential_image_generation_options'] = sigo(max_images=max_images)
        imagesResponse = self._ark_client(base_url=url).images.generate(**api_params)
        image_contents = [requests.get(result.url).content for result in imagesResponse.data]
        return encode_query_with_filepaths(None, bytes_to_file(image_contents))
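
The branching in `_forward` can be read as three rules: reference images are rejected unless an image-editing model is active, reference images are attached to the request when present, and a request for several edited images is capped at 15. The sketch below restates those rules without the Ark SDK. `build_image_params` is a hypothetical function, and the flat `max_images` key is a simplification of the SDK's `SequentialImageGenerationOptions` object used in the listing.

```python
from typing import Any, Dict, List, Optional


def build_image_params(prompt: str, model: str, editing_enabled: bool,
                       images: Optional[List[str]] = None, n: int = 1) -> Dict[str, Any]:
    """Hypothetical sketch of the parameter assembly in `_forward`; it sends no request."""
    has_ref_image = bool(images)
    if has_ref_image and not editing_enabled:
        raise ValueError('Reference images require an image-editing model.')
    params: Dict[str, Any] = {'model': model, 'prompt': prompt, 'size': '1024x1024'}
    if has_ref_image:
        params['image'] = images
        if n > 1:
            params['sequential_image_generation'] = 'auto'
            params['max_images'] = min(n, 15)  # the listing caps the count at 15
    return params


# 'demo-model' and the data URL are placeholders for illustration.
print(build_image_params('a cat', 'demo-model', editing_enabled=True,
                         images=['data:image/png;base64,AAAA'], n=20))
```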

`lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIChat`

Bases: OnlineChatModuleBase, FileHandlerBase

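OpenAI API integration module for chat completion and fine-tuning.

Provides an interface to OpenAI chat models and supports both inference and fine-tuning. Derives from OnlineChatModuleBase and FileHandlerBase.

Parameters:

base_url (str, default: None ) –

Base URL of the OpenAI API. Defaults to "https://api.openai.com/v1/".
model (str, default: None ) –

Name of the model used for chat completion. Defaults to "gpt-3.5-turbo".
api_key (str, default: None ) –

OpenAI API key. Defaults to lazyllm.config['openai_api_key'].
stream (bool, default: True ) –

Whether to use streaming responses. Defaults to True.
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional arguments passed to OnlineChatModuleBase.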

Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py

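class OpenAIChat(OnlineChatModuleBase, FileHandlerBase):
    """OpenAI API integration module for chat completion and fine-tuning.

Provides an interface to OpenAI chat models and supports both inference and fine-tuning. Derives from OnlineChatModuleBase and FileHandlerBase.

Args:
    base_url (str, optional): Base URL of the OpenAI API. Defaults to "https://api.openai.com/v1/".
    model (str, optional): Name of the model used for chat completion. Defaults to "gpt-3.5-turbo".
    api_key (str, optional): OpenAI API key. Defaults to lazyllm.config['openai_api_key'].
    stream (bool, optional): Whether to use streaming responses. Defaults to True.
    return_trace (bool, optional): Whether to return trace information. Defaults to False.
    **kwargs: Additional arguments passed to OnlineChatModuleBase.
"""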
    TRAINABLE_MODEL_LIST = ['gpt-3.5-turbo-0125', 'gpt-3.5-turbo-1106',
                            'gpt-3.5-turbo-0613', 'babbage-002',
                            'davinci-002', 'gpt-4-0613']
    NO_PROXY = False

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, skip_auth: bool = False, **kw):
        base_url = base_url or 'https://api.openai.com/v1/'
        model = model or 'gpt-3.5-turbo'
        super().__init__(api_key=api_key or self._default_api_key(),
                         base_url=base_url, model_name=model, stream=stream, return_trace=return_trace,
                         skip_auth=skip_auth, **kw)
        FileHandlerBase.__init__(self)
        self.default_train_data = {
            'model': 'gpt-3.5-turbo-0613',
            'training_file': None,
            'validation_file': None,
            'hyperparameters': {
                'n_epochs': 1,
                'batch_size': 16,
                'learning_rate_multiplier': '1.6e-5',
            }
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return 'You are ChatGPT, a large language model trained by OpenAI. You are a helpful assistant.'

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        url = urljoin(self._base_url, 'files')
        self.get_finetune_data(train_file)
        file_object = {
            'purpose': (None, 'fine-tune', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        with requests.post(url, headers=self._get_empty_header(), files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            # delete temporary training file
            self._dataHandler.close()
            return r.json()['id']

    def _update_kw(self, data, normal_config):
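        current_train_data = self.default_train_data.copy()
        # dict.copy() is shallow: copy the nested dict so the shared defaults are not mutated.
        current_train_data['hyperparameters'] = dict(current_train_data['hyperparameters'])
        current_train_data.update(data)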

        current_train_data['hyperparameters']['n_epochs'] = normal_config['num_epochs']
        current_train_data['hyperparameters']['learning_rate_multiplier'] = str(normal_config['learning_rate'])
        current_train_data['hyperparameters']['batch_size'] = normal_config['batch_size']
        current_train_data['suffix'] = str(uuid.uuid4())[:7]

        return current_train_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine_tuning/jobs')
        data = {
            'model': train_model,
            'training_file': train_file_id
        }
        if len(kw) > 0:
            if 'finetuning_type' in kw:
                data = self._update_kw(data, kw)
            else:
                data.update(kw)

        with requests.post(url, headers=self._header, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = r.json()['status']
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'fine_tuning/jobs/{job_id}/cancel')
        with requests.post(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['status']
        if status == 'cancelled':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = urljoin(self._base_url, 'fine_tuning/jobs')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        for model in model_data['data']:
            res.append([model['id'], model['fine_tuned_model'], self._status_mapping(model['status'])])
        return res

    def _status_mapping(self, status):
        if status == 'succeeded':
            return 'Done'
        elif status == 'failed':
            return 'Failed'
        elif status == 'cancelled':
            return 'Cancelled'
        elif status == 'running':
            return 'Running'
        else:  # validating_files, queued
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'fine_tuning/jobs/{job_id}/events')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = urljoin(self._base_url, f'fine_tuning/jobs/{fine_tuning_job_id}')
        with requests.get(fine_tune_url, headers=self._get_empty_header()) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        fine_tuned_model = info['fine_tuned_model'] if 'fine_tuned_model' in info else None
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'trained_tokens' in info and info['trained_tokens']:
            return info['trained_tokens']
        else:
            return None

    def _create_deployment(self) -> Tuple[str, str]:
        return (self._model_name, 'RUNNING')

    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'
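
`_convert_file_format` reads a JSONL training file and keeps only the `role` and `content` of messages whose role is `system`, `user`, or `assistant`. The sketch below applies the same filter to in-memory lines instead of a file. `to_finetune_jsonl` is a hypothetical name and the sample record is made up for illustration.

```python
import json
from typing import Iterable

ALLOWED_ROLES = ('system', 'user', 'assistant')


def to_finetune_jsonl(lines: Iterable[str]) -> str:
    """Hypothetical sketch: keep role/content of allowed roles and re-serialize as JSONL."""
    out = []
    for line in lines:
        example = json.loads(line)
        messages = [
            {'role': m.get('role', ''), 'content': m.get('content', '')}
            for m in example.get('messages', [])
            if m.get('role', '') in ALLOWED_ROLES
        ]
        out.append(json.dumps({'messages': messages}, ensure_ascii=False))
    return '\n'.join(out)


raw = ['{"messages": [{"role": "user", "content": "hi", "id": 1}, {"role": "tool", "content": "x"}]}']
print(to_finetune_jsonl(raw))  # {"messages": [{"role": "user", "content": "hi"}]}
```

Extra keys such as `id` are dropped and messages with other roles are skipped, so each output line contains only the fields kept by the original method.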

`lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIRerank`

Bases: LazyLLMOnlineRerankModuleBase

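The OpenAIRerank class calls an OpenAI-style reranking endpoint to re-rank a list of texts.

It derives from LazyLLMOnlineRerankModuleBase. Its main responsibilities are:

Setting the URL and name of the model;
Packaging the request data and calling the rerank API;
Parsing the returned ranking results.

Parameters:

embed_url (str, default: None ) –

Base URL of the OpenAI API. Defaults to 'https://api.openai.com/v1/'.
embed_model_name (str, default: None ) –

Model name, used to select the rerank model.
api_key (str, default: None ) –

OpenAI API key. Optional; if omitted, the default from the lazyllm configuration is used.
**kw –

Additional optional keyword arguments passed to the parent constructor.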

Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py

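class OpenAIRerank(LazyLLMOnlineRerankModuleBase):
    """
The OpenAIRerank class calls an OpenAI-style reranking endpoint to re-rank a list of texts.

It derives from `LazyLLMOnlineRerankModuleBase`. Its main responsibilities are:

- Setting the URL and name of the model;
- Packaging the request data and calling the rerank API;
- Parsing the returned ranking results.

Args:
    embed_url (str): Base URL of the OpenAI API. Defaults to 'https://api.openai.com/v1/'.
    embed_model_name (str): Model name, used to select the rerank model.
    api_key (str): OpenAI API key. Optional; if omitted, the default from the lazyllm configuration is used.
    **kw: Additional optional keyword arguments passed to the parent constructor.
"""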
    NO_PROXY = True

    def __init__(self, embed_url: Optional[str] = None, embed_model_name: Optional[str] = None,
                 api_key: str = None, **kw):
        embed_url = embed_url or 'https://api.openai.com/v1/'
        embed_model_name = embed_model_name or 'rerank-multilingual-v3.0'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name, **kw)

    def _set_embed_url(self):
        self._embed_url = urljoin(self._embed_url, 'rerank')

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, str]:
        json_data = {
            'query': query,
            'documents': documents,
            'top_n': top_n,
            'model': self._embed_model_name
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        results = response['results']
        return [(result['index'], result['relevance_score']) for result in results]
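
`_parse_response` returns `(index, relevance_score)` pairs, where each index points back into the list of documents that was sent. A minimal sketch of turning those pairs into re-ordered documents follows. The response dictionary is a made-up example in the shape the listing expects, and `reorder` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def parse_rerank_response(response: Dict) -> List[Tuple[int, float]]:
    """Same extraction as `_parse_response`: one (index, score) pair per result."""
    return [(r['index'], r['relevance_score']) for r in response['results']]


def reorder(documents: List[str], ranked: List[Tuple[int, float]]) -> List[str]:
    """Hypothetical helper: return the documents sorted by descending relevance score."""
    return [documents[i] for i, _ in sorted(ranked, key=lambda pair: pair[1], reverse=True)]


docs = ['apples', 'bananas', 'cherries']
fake_response = {'results': [{'index': 0, 'relevance_score': 0.15},
                             {'index': 2, 'relevance_score': 0.91}]}
ranked = parse_rerank_response(fake_response)
print(reorder(docs, ranked))  # ['cherries', 'apples']
```

Only documents that appear in the results are returned, which matches the effect of a `top_n` smaller than the number of documents.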

`lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaEmbed`

Bases: LazyLLMOnlineEmbedModuleBase, _SenseNovaBase

SenseTime SenseNova embedding module for text vectorization. It provides an interface to SenseNova embedding models and converts text into vectors. Derives from LazyLLMOnlineEmbedModuleBase and _SenseNovaBase.

Parameters:

embed_url (str, default: None ) –

URL of the embedding API. Defaults to "https://api.sensenova.cn/v1/llm/embeddings".
embed_model_name (str, default: None ) –

Name of the embedding model. Defaults to "nova-embedding-stable".
api_key (str, default: None ) –

API access key. Defaults to None.
secret_key (str, default: None ) –

API secret key. Defaults to None.

Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py

class SenseNovaEmbed(LazyLLMOnlineEmbedModuleBase, _SenseNovaBase):
    """SenseTime SenseNova embedding module for text vectorization. It provides an interface to SenseNova embedding models and converts text into vectors. Derives from LazyLLMOnlineEmbedModuleBase and _SenseNovaBase.

Args:
    embed_url (str, optional): URL of the embedding API. Defaults to "https://api.sensenova.cn/v1/llm/embeddings".
    embed_model_name (str, optional): Name of the embedding model. Defaults to "nova-embedding-stable".
    api_key (str, optional): API access key. Defaults to None.
    secret_key (str, optional): API secret key. Defaults to None.
"""

    def _materialize_lazy_api_key(self) -> str:
        return self._get_api_key(None, None)

    def __init__(self,
                 embed_url: Optional[str] = None,
                 embed_model_name: Optional[str] = None,
                 api_key: str = None,
                 secret_key: str = None,
                 batch_size: int = 16,
                 **kw):
        embed_url = embed_url or 'https://api.sensenova.cn/v1/llm/embeddings'
        embed_model_name = embed_model_name or 'nova-embedding-stable'
        if api_key not in LAZY_API_KEY_TOKENS:
            api_key = self._get_api_key(api_key, secret_key)
        super().__init__(embed_url, api_key, embed_model_name, batch_size=batch_size, **kw)

    def _get_embed_url(self, url: str) -> str:
        url, done = self._normalize_embed_url(url)
        if done: return url
        return urljoin(url, 'v1/llm/embeddings')

    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
        embeddings = response.get('embeddings', [])
        if not embeddings:
            return []
        if isinstance(input, str):
            return embeddings[0].get('embedding', [])
        else:
            return [res.get('embedding', []) for res in embeddings]
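
The return shape of `_parse_response` follows the input: a single string yields one vector, and a list yields a list of vectors. The sketch below repeats that logic as a standalone function with a made-up response. `parse_embeddings` is a hypothetical name and the numbers are illustrative.

```python
from typing import Dict, List, Union


def parse_embeddings(response: Dict, original_input: Union[List[str], str]):
    """Standalone restatement of `_parse_response` for a SenseNova-style response."""
    embeddings = response.get('embeddings', [])
    if not embeddings:
        return []
    if isinstance(original_input, str):
        # One input string: return the first vector directly.
        return embeddings[0].get('embedding', [])
    # A list of inputs: return one vector per entry.
    return [item.get('embedding', []) for item in embeddings]


fake = {'embeddings': [{'embedding': [0.1, 0.2]}, {'embedding': [0.3, 0.4]}]}
print(parse_embeddings(fake, 'hello'))     # [0.1, 0.2]
print(parse_embeddings(fake, ['a', 'b']))  # [[0.1, 0.2], [0.3, 0.4]]
```

An empty or missing `embeddings` field yields an empty list in both cases, so callers may want to check for that before indexing into the result.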

`lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowTTS`

Bases: LazyLLMOnlineTTSModuleBase

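SiliconFlow text-to-speech module, derived from LazyLLMOnlineTTSModuleBase.

Provides text-to-speech (TTS) backed by SiliconFlow and converts text into audio files.

Parameters:

api_key (str, default: None ) –

API key. Defaults to siliconflow_api_key from the configuration.
model_name (str, default: None ) –

Model name. Defaults to "fnlp/MOSS-TTSD-v0.5".
base_url (str, default: None ) –

Base URL of the API. Defaults to "https://api.siliconflow.cn/v1/".
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional model parameters.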

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py

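class SiliconFlowTTS(LazyLLMOnlineTTSModuleBase):
    """SiliconFlow text-to-speech module, derived from LazyLLMOnlineTTSModuleBase.

Provides text-to-speech (TTS) backed by SiliconFlow and converts text into audio files.

Args:
    api_key (str, optional): API key. Defaults to siliconflow_api_key from the configuration.
    model_name (str, optional): Model name. Defaults to "fnlp/MOSS-TTSD-v0.5".
    base_url (str, optional): Base URL of the API. Defaults to "https://api.siliconflow.cn/v1/".
    return_trace (bool, optional): Whether to return trace information. Defaults to False.
    **kwargs: Additional model parameters.
"""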
    MODEL_NAME = 'fnlp/MOSS-TTSD-v0.5'

    def __init__(self, api_key: str = None, model_name: str = None,
                 base_url: Optional[str] = None,
                 return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://api.siliconflow.cn/v1/'
        super().__init__(api_key=api_key or self._default_api_key(),
                         model_name=model_name or SiliconFlowTTS.MODEL_NAME,
                         return_trace=return_trace, base_url=base_url, **kwargs)
        self._endpoint = 'audio/speech'

    def _make_binary_request(self, endpoint, payload, base_url=None, timeout=180):
        url = f'{(base_url or self._base_url)}{endpoint}'
        try:
            response = requests.post(url, headers=self._header, json=payload, timeout=timeout)
            response.raise_for_status()
            return response.content
        except Exception as e:
            LOG.error(f'API request failed: {str(e)}')
            raise

    def _forward(self, input: str = None, response_format: str = 'mp3',
                 sample_rate: int = 44100, speed: float = 1.0,
                 voice: str = None, references=None, out_path: str = None,
                 url: str = None, model: str = None, **kwargs):

        if not voice:
            active_model = model
            if active_model == 'fnlp/MOSS-TTSD-v0.5':
                voice = 'fnlp/MOSS-TTSD-v0.5:alex'
            elif active_model == 'FunAudioLLM/CosyVoice2-0.5B':
                voice = 'FunAudioLLM/CosyVoice2-0.5B:alex'
            else:
                raise ValueError(
                    f'Default voice is only supported for models "fnlp/MOSS-TTSD-v0.5" and '
                    f'"FunAudioLLM/CosyVoice2-0.5B". For model "{active_model}", '
                    f'please provide a valid voice parameter.')
        payload = {
            'model': model,
            'input': input,
            'response_format': response_format,
            'sample_rate': sample_rate,
            'speed': speed,
            'voice': voice
        }

        if references:
            payload['references'] = references

        payload.update(kwargs)
        audio_content = self._make_binary_request(self._endpoint, payload, base_url=url, timeout=180)
        file_path = bytes_to_file([audio_content])[0]

        if out_path:
            with open(file_path, 'rb') as src, open(out_path, 'wb') as dst:
                dst.write(src.read())
            file_path = out_path

        result = encode_query_with_filepaths(None, [file_path])

        return result
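
The default-voice rule in `_forward` can be tried in isolation. The sketch below is a standalone reimplementation for illustration, not part of lazyllm: `resolve_voice` and `DEFAULT_VOICES` are hypothetical names, the custom voice string is a placeholder, and only the two model IDs and their voice strings are copied from the source above.

```python
from typing import Optional

# Models for which `_forward` supplies a default voice (values copied from the source above).
DEFAULT_VOICES = {
    'fnlp/MOSS-TTSD-v0.5': 'fnlp/MOSS-TTSD-v0.5:alex',
    'FunAudioLLM/CosyVoice2-0.5B': 'FunAudioLLM/CosyVoice2-0.5B:alex',
}


def resolve_voice(model: Optional[str], voice: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the default-voice branch of `_forward`."""
    if voice:
        return voice  # an explicit voice is used unchanged
    try:
        return DEFAULT_VOICES[model]
    except KeyError:
        raise ValueError(f'No default voice for model "{model}"; pass `voice` explicitly.') from None


print(resolve_voice('fnlp/MOSS-TTSD-v0.5'))
print(resolve_voice('some/other-model', voice='some/other-model:custom'))  # placeholder names
```

For any model outside the two listed, a call without `voice` raises `ValueError`, as in the original method.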

`lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowChat`

Bases: OnlineChatModuleBase, FileHandlerBase

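SiliconFlow chat module, derived from OnlineChatModuleBase and FileHandlerBase.

Provides chat with large language models hosted on the SiliconFlow platform. It supports multiple models, including vision-language models, and includes file handling.

Parameters:

base_url (str, default: None ) –

Base URL of the API. Falls back to "https://api.siliconflow.cn/v1/".
model (str, default: None ) –

Name of the model to use. Falls back to "Qwen/Qwen3-8B".
api_key (str, default: None ) –

API key. Falls back to the configuration entry lazyllm.config['siliconflow_api_key'].
stream (bool, default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional model parameters.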

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py

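class SiliconFlowChat(OnlineChatModuleBase, FileHandlerBase):
    """SiliconFlow chat module, derived from OnlineChatModuleBase and FileHandlerBase.

Provides chat with large language models hosted on the SiliconFlow platform. It supports multiple models, including vision-language models, and includes file handling.

Args:
    base_url (str, optional): Base URL of the API. Falls back to "https://api.siliconflow.cn/v1/".
    model (str, optional): Name of the model to use. Falls back to "Qwen/Qwen3-8B".
    api_key (str, optional): API key. Falls back to the configuration entry lazyllm.config['siliconflow_api_key'].
    stream (bool, optional): Whether to enable streaming output. Defaults to True.
    return_trace (bool, optional): Whether to return trace information. Defaults to False.
    **kwargs: Additional model parameters.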
"""
    VLM_MODEL_PREFIX = ['Qwen/Qwen2.5-VL-72B-Instruct', 'Qwen/Qwen3-VL-30B-A3B-Instruct', 'deepseek-ai/deepseek-vl2',
                        'Qwen/Qwen3-VL-30B-A3B-Thinking', 'THUDM/GLM-4.1V-9B-Thinking']

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://api.siliconflow.cn/v1/'
        model = model or 'Qwen/Qwen3-8B'
        super().__init__(api_key=api_key or self._default_api_key(), base_url=base_url, model_name=model,
                         stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        if stream:
            self._model_optional_params['stream'] = True

    def _get_system_prompt(self):
        return 'You are an intelligent assistant provided by SiliconFlow. You are a helpful assistant.'

    def _validate_api_key(self):
        """Validate the API key with a lightweight GET on the models endpoint."""
        try:
            # No chat request is sent: a 200 from `models` means the key was accepted.
            models_url = urljoin(self._base_url, 'models')
            response = requests.get(models_url, headers=self._header, timeout=10)
            return response.status_code == 200
        except Exception:
            return False
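
`_validate_api_key` builds its URL with `urljoin`, so the trailing slash on `base_url` matters for this check. The snippet below uses only the standard library to show the difference. The second base URL is a deliberately slash-less variant added for comparison and does not appear in the source.

```python
from urllib.parse import urljoin

# With the trailing slash, the relative path is appended to the base path.
with_slash = urljoin('https://api.siliconflow.cn/v1/', 'models')
# Without it, urljoin replaces the last path segment ("v1") instead.
without_slash = urljoin('https://api.siliconflow.cn/v1', 'models')

print(with_slash)     # https://api.siliconflow.cn/v1/models
print(without_slash)  # https://api.siliconflow.cn/models
```

A custom `base_url` should therefore keep its trailing slash, as the default value does.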

`lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowRerank`

Bases: LazyLLMOnlineRerankModuleBase

SiliconFlow rerank module, derived from LazyLLMOnlineRerankModuleBase.

Provides text reranking on the SiliconFlow platform: given a query, it reorders a list of documents by their relevance to that query.

Parameters:

embed_url (str, default: None ) –

URL of the rerank API. Falls back to "https://api.siliconflow.cn/v1/rerank".
embed_model_name (str, default: None ) –

Name of the rerank model. Falls back to "BAAI/bge-reranker-v2-m3".
api_key (str, default: None ) –

API key. Falls back to the configuration entry lazyllm.config['siliconflow_api_key'].
**kw –

Additional rerank module parameters.

Returns:

–

List[Tuple]: The ranking results. Each element is an (index, relevance_score) tuple.

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py

class SiliconFlowRerank(LazyLLMOnlineRerankModuleBase):
    """SiliconFlow rerank module, derived from LazyLLMOnlineRerankModuleBase.

Provides text reranking on the SiliconFlow platform: given a query, it reorders a list of documents by their relevance to that query.

Args:
    embed_url (str, optional): URL of the rerank API. Falls back to "https://api.siliconflow.cn/v1/rerank".
    embed_model_name (str, optional): Name of the rerank model. Falls back to "BAAI/bge-reranker-v2-m3".
    api_key (str, optional): API key. Falls back to the configuration entry lazyllm.config['siliconflow_api_key'].
    **kw: Additional rerank module parameters.

Returns:
    List[Tuple]: The ranking results. Each element is an (index, relevance_score) tuple.
"""
    def __init__(self, embed_url: Optional[str] = None, embed_model_name: Optional[str] = None,
                 api_key: str = None, **kw):
        embed_url = embed_url or 'https://api.siliconflow.cn/v1/rerank'
        embed_model_name = embed_model_name or 'BAAI/bge-reranker-v2-m3'
        super().__init__(embed_url, api_key or self._default_api_key(), embed_model_name, **kw)

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict:
        json_data = {
            'model': self._embed_model_name,
            'query': query,
            'documents': documents,
            'top_n': top_n
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)
        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        results = response.get('results', [])
        return [(result['index'], result['relevance_score']) for result in results]
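
The request and response shapes handled by `_encapsulated_data` and `_parse_response` can be exercised without a network call. The sketch below uses hypothetical stand-in functions (`build_rerank_payload`, `parse_rerank_response`) that mirror the two methods. The sample response is hand-written to match the keys the parser reads and is not real API output.

```python
from typing import Any, Dict, List, Tuple


def build_rerank_payload(model: str, query: str, documents: List[str],
                         top_n: int, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for `_encapsulated_data`: extra kwargs are merged last."""
    payload: Dict[str, Any] = {'model': model, 'query': query, 'documents': documents, 'top_n': top_n}
    payload.update(kwargs)
    return payload


def parse_rerank_response(response: Dict[str, Any]) -> List[Tuple[int, float]]:
    """Hypothetical stand-in for `_parse_response`: keep only index and score."""
    return [(item['index'], item['relevance_score']) for item in response.get('results', [])]


payload = build_rerank_payload('BAAI/bge-reranker-v2-m3', 'what is a reranker?',
                               ['doc about cooking', 'doc about rerankers'], top_n=2)

# Hand-written sample in the shape the parser expects; not real API output.
sample_response = {'results': [{'index': 1, 'relevance_score': 0.92},
                               {'index': 0, 'relevance_score': 0.13}]}
print(parse_rerank_response(sample_response))  # [(1, 0.92), (0, 0.13)]
print(parse_rerank_response({}))               # []
```

Because `kwargs` are merged after the fixed keys, a caller-supplied keyword with the same name (for example `top_n`) would override the value set earlier.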

`lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowText2Image`

Bases: LazyLLMOnlineText2ImageModuleBase

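SiliconFlow text-to-image module, derived from LazyLLMOnlineText2ImageModuleBase.

Provides image generation on SiliconFlow from a text description. It supports both text-only generation and image editing.

Parameters:

api_key (str, default: None ) –

API key. Falls back to siliconflow_api_key from the configuration.
model (str, default: None ) –

Model name. Falls back to "Qwen/Qwen-Image".
url (str, default: None ) –

Base URL of the API. Falls back to "https://api.siliconflow.cn/v1/".
return_trace (bool, default: False ) –

Whether to return trace information. Defaults to False.
**kwargs –

Additional model parameters.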

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py

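class SiliconFlowText2Image(LazyLLMOnlineText2ImageModuleBase):
    """SiliconFlow text-to-image module, derived from LazyLLMOnlineText2ImageModuleBase.

Provides image generation on SiliconFlow from a text description. It supports both text-only generation and image editing.

Args:
    api_key (str, optional): API key. Falls back to siliconflow_api_key from the configuration.
    model (str, optional): Model name. Falls back to "Qwen/Qwen-Image".
    url (str, optional): Base URL of the API. Falls back to "https://api.siliconflow.cn/v1/".
    return_trace (bool, optional): Whether to return trace information. Defaults to False.
    **kwargs: Additional model parameters.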
"""
    MODEL_NAME = 'Qwen/Qwen-Image'
    IMAGE_EDITING_MODEL_NAME = 'Qwen/Qwen-Image-Edit-2509'

    def __init__(self, api_key: str = None, model: str = None,
                 url: Optional[str] = None,
                 return_trace: bool = False, **kwargs):
        url = url or 'https://api.siliconflow.cn/v1/'
        super().__init__(api_key=api_key or self._default_api_key(),
                         model=model or SiliconFlowText2Image.MODEL_NAME, url=url, return_trace=return_trace, **kwargs)
        self._endpoint = 'images/generations'

    def _get_image_data_from_url(self, url: str, timeout: int = 30) -> bytes:
        """
        Override parent implementation because SiliconFlow S3 temporary URLs
        may return application/octet-stream instead of image/* content type.
        """
        self._validate_url_security(url)

        resp = requests.get(
            url,
            timeout=timeout,
            allow_redirects=True
        )
        resp.raise_for_status()
        data = resp.content
        self._validate_image_data(data, url)
        return data

    def _make_request(self, endpoint, payload, base_url=None, timeout=180):
        url = f'{(base_url or self._base_url)}{endpoint}'
        try:
            response = requests.post(url, headers=self._header, json=payload, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except Exception as e:
            LOG.error(f'API request failed: {str(e)}')
            raise

    def _forward(self, input: str = None, files: List[str] = None, size: str = '1024x1024', url: str = None,
                 model: str = None, **kwargs):
        has_ref_image = files is not None and len(files) > 0
        reference_image_data = None
        if self._type == LLMType.IMAGE_EDITING and not has_ref_image:
            raise ValueError(
                f'Image editing is enabled for model {self._model_name}, but no image file was provided. '
                f'Please provide an image file via the "files" parameter.'
            )
        if self._type != LLMType.IMAGE_EDITING and has_ref_image:
            raise ValueError(
                f'Image file was provided, but image editing is not enabled for model {self._model_name}. '
                f'Please use default image-editing model {self.IMAGE_EDITING_MODEL_NAME} or other image-editing model'
            )

        payload = {
            'model': model,
            'prompt': input,
            **kwargs
        }
        if has_ref_image:
            for i, file in enumerate(files):
                reference_image_base64, _ = self._load_images(file)[0]
                reference_image_data = f'data:image/png;base64,{reference_image_base64}'
                if i == 0:
                    payload['image'] = reference_image_data
                elif i > 0:
                    payload[f'image{i + 1}'] = reference_image_data
        result = self._make_request(self._endpoint, payload)
        image_urls = [item['url'] for item in result.get('data', [])]
        if not image_urls:
            raise Exception('No images returned from API')
        image_results = self._load_images(image_urls)
        image_bytes = [data for _, data in image_results]
        if not image_bytes:
            raise Exception('Failed to download any images')
        ai_img_path = os.path.join(config['temp_dir'], 'ai_img')
        file_paths = bytes_to_file(image_bytes, target_dir=ai_img_path)
        return encode_query_with_filepaths(None, file_paths)
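
When reference images are supplied, `_forward` stores the first one under `image` and later ones under `image2`, `image3`, and so on. The sketch below reproduces only that key naming. `attach_reference_images` is a hypothetical helper, and it base64-encodes raw bytes itself, whereas the real method gets the encoded string from `_load_images`. The prompt text is a placeholder.

```python
import base64
from typing import Dict, List


def attach_reference_images(payload: Dict[str, str], images: List[bytes]) -> Dict[str, str]:
    """Hypothetical helper: add images under 'image', 'image2', 'image3', ... as `_forward` does."""
    for i, raw in enumerate(images):
        data_url = 'data:image/png;base64,' + base64.b64encode(raw).decode('ascii')
        key = 'image' if i == 0 else f'image{i + 1}'
        payload[key] = data_url
    return payload


payload = attach_reference_images(
    {'model': 'Qwen/Qwen-Image-Edit-2509', 'prompt': 'make the sky purple'},
    [b'first', b'second', b'third'],  # stand-ins for real image bytes
)
print(sorted(k for k in payload if k.startswith('image')))  # ['image', 'image2', 'image3']
```

Note that the key `image1` never appears: numbering starts at `image2` for the second file.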

`lazyllm.module.llms.onlinemodule.supplier.aiping.AipingChat`

Bases: OnlineChatModuleBase, FileHandlerBase

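AipingChat is the online chat module for AIPing, derived from OnlineChatModuleBase and FileHandlerBase.

Provides an interface to large language models on the AIPing platform, with support for chat generation, file handling, and model fine-tuning. It supports multiple models, including vision-language models (VLMs) such as Qwen2.5-VL, Qwen3-VL, GLM-4.5V, and GLM-4.6V.

Parameters:

base_url (str, default: None ) –

Base URL of the API. Falls back to "https://aiping.cn/api/v1/".
model (str, default: None ) –

Name of the model to use. Falls back to "DeepSeek-R1".
api_key (Optional[str], default: None ) –

API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
stream (bool, default: True ) –

Whether to enable streaming output. Defaults to True.
return_trace (bool, default: False ) –

Whether to return debug trace information. Defaults to False.
**kwargs –

Additional parameters passed to OnlineChatModuleBase.

Features

Supports multiple large language models, including general chat models and vision-language models
Supports streaming output
Integrates file handling, including validation and conversion of fine-tuning data formats
Built-in system prompt: "You are an intelligent assistant developed by AIPing. You are a helpful assistant."
Supports API key validation through a minimal chat request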

Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py

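class AipingChat(OnlineChatModuleBase, FileHandlerBase):
    """AipingChat is the online chat module for AIPing, derived from OnlineChatModuleBase and FileHandlerBase.

Provides an interface to large language models on the AIPing platform, with support for chat generation, file handling, and model fine-tuning. It supports multiple models, including vision-language models (VLMs) such as Qwen2.5-VL, Qwen3-VL, GLM-4.5V, and GLM-4.6V.

Args:
    base_url (str): Base URL of the API. Falls back to "https://aiping.cn/api/v1/".
    model (str): Name of the model to use. Falls back to "DeepSeek-R1".
    api_key (Optional[str]): API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
    stream (bool): Whether to enable streaming output. Defaults to True.
    return_trace (bool): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional parameters passed to OnlineChatModuleBase.

Features:
    1. Supports multiple large language models, including general chat models and vision-language models
    2. Supports streaming output
    3. Integrates file handling, including validation and conversion of fine-tuning data formats
    4. Built-in system prompt: "You are an intelligent assistant developed by AIPing. You are a helpful assistant."
    5. Supports API key validation through a minimal chat request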
"""
    VLM_MODEL_PREFIX = [
        'Qwen2.5-VL-',
        'Qwen3-VL-',
        'GLM-4.5V',
        'GLM-4.6V'
    ]

    def __init__(self, base_url: Optional[str] = None, model: Optional[str] = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        base_url = base_url or 'https://aiping.cn/api/v1/'
        model = model or 'DeepSeek-R1'
        super().__init__(api_key=api_key or self._default_api_key(), base_url=base_url, model_name=model,
                         stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        if stream:
            self._model_optional_params['stream'] = True

    def _get_system_prompt(self):
        return 'You are an intelligent assistant developed by AIPing. You are a helpful assistant.'

    def _validate_api_key(self):
        try:
            data = {
                'model': self._model_name,
                'messages': [{'role': 'user', 'content': 'hi'}],
                'max_tokens': 1
            }
            response = requests.post(self._chat_url, headers=self._header, json=data, timeout=TIMEOUT)
            return response.status_code == 200
        except Exception:
            return False
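
The source above only declares `VLM_MODEL_PREFIX`; how the base class consumes it is not shown. The sketch below therefore assumes the simplest reading, that a model counts as a VLM when its name starts with one of the listed prefixes. `looks_like_vlm` is a hypothetical helper, and the first model name is an illustrative example rather than a confirmed AIPing model ID.

```python
# Prefix list copied from AipingChat above.
VLM_MODEL_PREFIX = ['Qwen2.5-VL-', 'Qwen3-VL-', 'GLM-4.5V', 'GLM-4.6V']


def looks_like_vlm(model_name: str) -> bool:
    """Assumed rule: a model is a VLM when its name starts with a listed prefix."""
    # str.startswith accepts a tuple and returns True if any entry matches.
    return model_name.startswith(tuple(VLM_MODEL_PREFIX))


print(looks_like_vlm('Qwen3-VL-30B-A3B-Instruct'))  # True
print(looks_like_vlm('DeepSeek-R1'))                # False
```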

`lazyllm.module.llms.onlinemodule.supplier.aiping.AipingEmbed`

Bases: LazyLLMOnlineEmbedModuleBase

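AIPing text embedding module, derived from LazyLLMOnlineEmbedModuleBase.

Provides an interface to the AIPing text embedding service. It converts text into vector representations and supports batch processing.

Parameters:

embed_url (str, default: 'https://aiping.cn/api/v1/embeddings' ) –

URL of the embedding API. Defaults to "https://aiping.cn/api/v1/embeddings".
embed_model_name (str, default: 'text-embedding-v1' ) –

Name of the embedding model. Defaults to "text-embedding-v1".
api_key (Optional[str], default: None ) –

API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
batch_size (int, default: 16 ) –

Batch size. Defaults to 16.
**kw –

Additional parameters passed to the base class.

Features

Converts text into high-dimensional vectors
Supports batch processing of texts
Configurable batch size to suit different performance needs
Integrates with the AIPing API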

Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py

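class AipingEmbed(LazyLLMOnlineEmbedModuleBase):
    """AIPing text embedding module, derived from LazyLLMOnlineEmbedModuleBase.

Provides an interface to the AIPing text embedding service. It converts text into vector representations and supports batch processing.

Args:
    embed_url (str): URL of the embedding API. Defaults to "https://aiping.cn/api/v1/embeddings".
    embed_model_name (str): Name of the embedding model. Defaults to "text-embedding-v1".
    api_key (Optional[str]): API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
    batch_size (int): Batch size. Defaults to 16.
    **kw: Additional parameters passed to the base class.

Features:
    1. Converts text into high-dimensional vectors
    2. Supports batch processing of texts
    3. Configurable batch size to suit different performance needs
    4. Integrates with the AIPing API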
"""
    def __init__(self, embed_url: str = 'https://aiping.cn/api/v1/embeddings',
                 embed_model_name: str = 'text-embedding-v1', api_key: str = None,
                 batch_size: int = 16, **kw):
        embed_url = embed_url or 'https://aiping.cn/api/v1/embeddings'
        embed_model_name = embed_model_name or 'text-embedding-v1'
        super().__init__(embed_url, api_key or self._default_api_key(),
                         embed_model_name, batch_size=batch_size, **kw)
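
`batch_size` is forwarded to the base class, whose batching code is not shown here. As a rough picture of what a batch size of 16 implies, the sketch below splits a list of texts into consecutive chunks. `iter_batches` is a hypothetical helper, and the real base class may batch differently.

```python
from typing import Iterator, List


def iter_batches(texts: List[str], batch_size: int = 16) -> Iterator[List[str]]:
    """Hypothetical helper: yield consecutive slices of at most `batch_size` texts."""
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


texts = [f'doc {i}' for i in range(40)]
print([len(batch) for batch in iter_batches(texts)])  # [16, 16, 8]
```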

`lazyllm.module.llms.onlinemodule.supplier.aiping.AipingRerank`

Bases: LazyLLMOnlineRerankModuleBase

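AIPing rerank module, derived from LazyLLMOnlineRerankModuleBase.

Provides an interface to the AIPing rerank service, which reorders a list of documents by their relevance to a query. The module returns a list of tuples, each holding a document index and its relevance score.

Parameters:

embed_url (str, default: None ) –

URL of the rerank API. Falls back to "https://aiping.cn/api/v1/rerank".
embed_model_name (str, default: None ) –

Name of the rerank model. Falls back to "Qwen3-Reranker-0.6B".
api_key (Optional[str], default: None ) –

API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
**kw –

Additional parameters passed to the base class.

Attributes

type (str): The model type, always "RERANK".

Features

Reranks a list of documents by relevance to a query
Supports custom ranking parameters (top_n and others)
Returns the index and relevance score of each document
Suited to search result refinement and document recommendation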

Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py

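class AipingRerank(LazyLLMOnlineRerankModuleBase):
    """AIPing rerank module, derived from LazyLLMOnlineRerankModuleBase.

Provides an interface to the AIPing rerank service, which reorders a list of documents by their relevance to a query. The module returns a list of tuples, each holding a document index and its relevance score.

Args:
    embed_url (str): URL of the rerank API. Falls back to "https://aiping.cn/api/v1/rerank".
    embed_model_name (str): Name of the rerank model. Falls back to "Qwen3-Reranker-0.6B".
    api_key (Optional[str]): API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
    **kw: Additional parameters passed to the base class.

Attributes:
    type (str): The model type, always "RERANK".

Features:
    1. Reranks a list of documents by relevance to a query
    2. Supports custom ranking parameters (top_n and others)
    3. Returns the index and relevance score of each document
    4. Suited to search result refinement and document recommendation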
"""
    def __init__(self, embed_url: Optional[str] = None, embed_model_name: Optional[str] = None,
                 api_key: str = None, **kw):
        embed_url = embed_url or 'https://aiping.cn/api/v1/rerank'
        embed_model_name = embed_model_name or 'Qwen3-Reranker-0.6B'
        super().__init__(embed_url, api_key or self._default_api_key(),
                         embed_model_name, **kw)

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict:
        json_data = {
            'model': self._embed_model_name,
            'query': query,
            'documents': documents,
            'top_n': top_n
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        results = response.get('results', [])
        if not results:
            return []
        return [(result['index'], result['relevance_score']) for result in results]
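
Since `_parse_response` returns `(index, relevance_score)` tuples rather than document text, callers typically map the indices back onto the original list. The sketch below shows one way to do that. `reorder_documents` is a hypothetical helper, and the scores are hand-written sample values, not real API output.

```python
from typing import List, Tuple


def reorder_documents(documents: List[str], ranked: List[Tuple[int, float]]) -> List[Tuple[str, float]]:
    """Hypothetical helper: map each (index, score) pair back to its document text."""
    return [(documents[index], score) for index, score in ranked]


documents = ['pasta recipes', 'train timetable', 'how rerankers score passages']
ranked = [(2, 0.91), (0, 0.07)]  # hand-written sample, e.g. a result for top_n=2
print(reorder_documents(documents, ranked))
# [('how rerankers score passages', 0.91), ('pasta recipes', 0.07)]
```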

`lazyllm.module.llms.onlinemodule.supplier.aiping.AipingText2Image`

Bases: LazyLLMOnlineText2ImageModuleBase

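AIPing text-to-image module, derived from LazyLLMOnlineText2ImageModuleBase.

Provides an interface to the AIPing image generation service, which generates images from a text description. It supports a negative prompt, the number of images, the image size, and a random seed.

Parameters:

api_key (Optional[str], default: None ) –

API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
model_name (str, default: None ) –

Name of the model to use. Falls back to "Qwen-Image".
base_url (str, default: None ) –

Base URL of the API. Falls back to "https://aiping.cn/api/v1/".
return_trace (bool, default: False ) –

Whether to return debug trace information. Defaults to False.
**kwargs –

Additional parameters passed to the base class.

Features

Generates images from a text prompt
Supports a negative prompt to filter out unwanted image features
Configurable number of generated images (the n parameter)
Supports multiple image sizes
Supports a random seed for reproducible results
Downloads the generated images automatically and encodes them as file paths
Default negative prompt: "模糊，低质量" ("blurry, low quality")

Notes

The module downloads the generated images to local files automatically
The returned result contains the file paths for later processing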

Source code in lazyllm/module/llms/onlinemodule/supplier/aiping.py

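class AipingText2Image(LazyLLMOnlineText2ImageModuleBase):
    """AIPing text-to-image module, derived from LazyLLMOnlineText2ImageModuleBase.

Provides an interface to the AIPing image generation service, which generates images from a text description. It supports a negative prompt, the number of images, the image size, and a random seed.

Args:
    api_key (Optional[str]): API key for the AIPing service. If omitted, it is read from the lazyllm configuration.
    model_name (str): Name of the model to use. Falls back to "Qwen-Image".
    base_url (str): Base URL of the API. Falls back to "https://aiping.cn/api/v1/".
    return_trace (bool): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional parameters passed to the base class.

Features:
    1. Generates images from a text prompt
    2. Supports a negative prompt to filter out unwanted image features
    3. Configurable number of generated images (the n parameter)
    4. Supports multiple image sizes
    5. Supports a random seed for reproducible results
    6. Downloads the generated images automatically and encodes them as file paths
    7. Default negative prompt: "模糊，低质量" ("blurry, low quality")

Notes:
    - The module downloads the generated images to local files automatically
    - The returned result contains the file paths for later processing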
"""
    def __init__(self, api_key: str = None, model_name: Optional[str] = None,
                 base_url: Optional[str] = None,
                 return_trace: bool = False, **kwargs):
        model_name = model_name or 'Qwen-Image'
        base_url = base_url or 'https://aiping.cn/api/v1/'
        super().__init__(model_name=model_name, api_key=api_key or self._default_api_key(),
                         return_trace=return_trace, **kwargs)
        self._endpoint = 'images/generations'
        self._base_url = base_url

    def _make_request(self, endpoint, payload, timeout=TIMEOUT):
        url = f'{self._base_url}{endpoint}'

        try:
            response = requests.post(url, headers=self._header, json=payload, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            lazyllm.LOG.error(f'Request failed: {e}')
            raise

    def _forward(self, input: str = None, negative_prompt: str = None, n: int = None,
                 size: str = None, seed: int = None, **kwargs):
        if not input:
            raise ValueError('Prompt is required')

        input_params = {
            'prompt': input,
            'negative_prompt': negative_prompt or '模糊，低质量'
        }

        extra_body = {}

        if n is not None:
            extra_body['n'] = n

        if size is not None:
            extra_body['size'] = size

        if seed is not None:
            extra_body['seed'] = seed

        payload = {
            'model': self._model_name,
            'input': input_params
        }

        if extra_body:
            payload['extra_body'] = extra_body

        try:
            result = self._make_request(self._endpoint, payload)

            images = result.get('data')
            if not isinstance(images, list) or not images:
                raise ValueError(f'Unexpected response format: {result}')

            image_urls = [img.get('url') for img in images if img.get('url')]
            if not image_urls:
                raise ValueError(f'No image URLs found in response: {result}')

            return encode_query_with_filepaths(None, bytes_to_file([requests.get(url).content for url in image_urls]))

        except Exception as e:
            lazyllm.LOG.error(f'Failed to generate image: {e}')
            raise Exception(f'Failed to generate image: {str(e)}') from e
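
The request body that `_forward` assembles nests the prompt fields under `input` and places the optional settings under `extra_body`, which is omitted when none are given. The sketch below rebuilds just that payload so the shape can be inspected offline. `build_image_payload` is a hypothetical helper, and the prompt text is a placeholder.

```python
from typing import Any, Dict, Optional


def build_image_payload(model: str, prompt: str, negative_prompt: Optional[str] = None,
                        n: Optional[int] = None, size: Optional[str] = None,
                        seed: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper mirroring the payload construction in `_forward`."""
    if not prompt:
        raise ValueError('Prompt is required')
    payload: Dict[str, Any] = {
        'model': model,
        # Same default negative prompt as the source ("blurry, low quality").
        'input': {'prompt': prompt, 'negative_prompt': negative_prompt or '模糊，低质量'},
    }
    # Only settings that were actually provided are sent.
    extra_body = {key: value for key, value in (('n', n), ('size', size), ('seed', seed))
                  if value is not None}
    if extra_body:
        payload['extra_body'] = extra_body
    return payload


print(build_image_payload('Qwen-Image', 'a red bicycle', n=2, seed=7)['extra_body'])
# {'n': 2, 'seed': 7}
```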

`lazyllm.module.ModuleBase`

`eval(*, recursive=True)`

`evalset(evalset, load_f=None, collect_f=lambda x: x)`

`forward(*args, **kw)`

`start()`

`restart()`

`update(*, recursive=True)`

`stream_output(stream_output=None)`

`used_by(module_id)`

`register_hook(hook_type)`

`unregister_hook(hook_type)`

`clear_hooks()`

`update_server(*, recursive=True)`

`wait()`

`stop()`

`for_each(filter, action)`

`lazyllm.module.servermodule.LLMBase`

`prompt(prompt=None, history=None)`

`formatter(format=None)`

`share(prompt=None, format=None, stream=None, history=None, copy_static_params=False)`

`lazyllm.module.ActionModule`

`submodules` `property`

`forward(*args, **kw)`

`lazyllm.module.TrainableModule`

`wait()`

`stop(task_name=None)`

`prompt(prompt='', history=None)`

`log_path(task_name=None)`

`forward_openai(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)`

`forward_standard(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)`

`forward(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, max_retries=3, **kw)`

`lazyllm.module.UrlModule`

`forward(*args, **kw)`

`lazyllm.module.ServerModule`

`wait()`

`stop()`

`lazyllm.module.AutoModel`

`lazyllm.module.TrialModule`

`update()`

`work(m, q)` `staticmethod`

`lazyllm.module.OnlineChatModule`

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoChat`

`lazyllm.module.llms.onlinemodule.supplier.ppio.PPIOChat`

`lazyllm.module.OnlineEmbeddingModule`

`lazyllm.module.OnlineMultiModalModule`

`lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIEmbed`

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenSTT`

`lazyllm.module.OnlineChatModuleBase = LazyLLMOnlineChatModuleBase` `module-attribute`

`lazyllm.module.OnlineEmbeddingModuleBase`

`run_embed_batch(input, data, proxies, url=None, **kwargs)`

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoEmbed`

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoMultimodalEmbed`

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMChat`

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMText2Image`

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenText2Image`

`lazyllm.module.llms.onlinemodule.supplier.kimi.KimiChat`

`lazyllm.module.llms.onlinemodule.fileHandler.FileHandlerBase`

`get_finetune_data(filepath)`

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMChat`

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMRerank`

`lazyllm.module.OnlineMultiModalModule`

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenRerank`

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenTTS`

`lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaChat`

`set_deploy_parameters(**kw)`

`lazyllm.module.llms.onlinemodule.base.onlineMultiModalBase.OnlineMultiModalBase`

`lazyllm.module.llms.onlinemodule.base.utils.LazyLLMOnlineBase`

`lazyllm.module.module.ModuleCache`

`close()`

`get(key, args, kw)`

`set(key, args, kw, value)`

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenChat`

`set_deploy_parameters(**kw)`

`lazyllm.module.llms.onlinemodule.supplier.qwen.QwenEmbed`

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMEmbed`

`lazyllm.module.llms.onlinemodule.supplier.glm.GLMSTT`

`lazyllm.module.llms.onlinemodule.supplier.deepseek.DeepSeekChat`

`lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoText2Image`

`lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIChat`

`lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIRerank`