asyncify python sdk v2 by prathikr · Pull Request #819 · microsoft/Foundry-Local

prathikr · 2026-06-18T15:56:07Z

No description provided.

vercel · 2026-06-18T15:56:14Z

Project	Deployment	Actions	Updated (UTC)
foundry-local	Ready	Preview, Comment	Jun 18, 2026 3:56pm

Copilot

Pull request overview

This PR converts the Python SDK v2 from synchronous to async, wrapping all blocking native FFI calls with asyncio.to_thread() so they don't block the event loop. The public API surfaces change from sync methods to async def / async for, and tests are updated to use pytest-asyncio with asyncio_mode = "auto".

Changes:

- All SDK public methods that call into the native layer (load, unload, download, close, shutdown, start_web_service, stop_web_service, discover_eps, complete, stream, transcribe, transcribe_streaming, generate_embedding(s)) are now async and delegate blocking work to asyncio.to_thread.
- The chat APIs are renamed (complete_chat → complete, complete_streaming_chat → stream). Chat and audio streaming change from synchronous generators to async generators, and the implementation now buffers all streamed items before yielding any of them.
- Test infrastructure is updated to pytest-asyncio with a session-scoped event loop, and all integration tests are converted to async def.
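
As a rough illustration of the wrapping pattern described above, here is a minimal sketch. `blocking_load` is a hypothetical stand-in for a blocking native call and is not part of the SDK; only `asyncio.to_thread` and `asyncio.gather` are real library calls.

```python
import asyncio
import time


def blocking_load(model_id: str) -> str:
    """Hypothetical stand-in for a blocking native FFI call."""
    time.sleep(0.05)
    return f"{model_id}:loaded"


async def load(model_id: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_load, model_id)


async def main() -> list:
    # The two loads can overlap because each one blocks its own thread.
    return await asyncio.gather(load("model-a"), load("model-b"))
```

Calling `asyncio.run(main())` returns the two results in argument order. The event loop remains responsive while the threads sleep, which is the property the PR is after.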

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 8 comments.

File	Description
`sdk_v2/python/src/foundry_local_sdk/foundry_local_manager.py`	Converts singleton manager to async: `asyncio.Lock`, async `initialize`/`close`/`shutdown`/web-service/EP methods, `__aenter__`/`__aexit__`, atexit handler, `__del__`
`sdk_v2/python/src/foundry_local_sdk/imodel.py`	Makes `download`, `load`, `unload` async with `asyncio.to_thread`
`sdk_v2/python/src/foundry_local_sdk/openai/chat_client.py`	Renames `complete_chat` → `complete`, `complete_streaming_chat` → `stream`; async via `asyncio.to_thread`
`sdk_v2/python/src/foundry_local_sdk/openai/embedding_client.py`	Makes `generate_embedding(s)` async via `asyncio.to_thread`
`sdk_v2/python/src/foundry_local_sdk/openai/audio_client.py`	Makes `transcribe` / `transcribe_streaming` async via `asyncio.to_thread`
`sdk_v2/python/src/foundry_local_sdk/request.py`	Adds `cancel_async` thin wrapper
`sdk_v2/python/test/conftest.py`	Session-scoped async fixtures, deprecated `event_loop` fixture override
`sdk_v2/python/pyproject.toml`	Adds `pytest-asyncio` dependency and `asyncio_mode = "auto"`
`sdk_v2/python/test/integration/test_chat_client.py`	Converts chat tests to async
`sdk_v2/python/test/integration/test_embedding_client.py`	Converts embedding tests to async
`sdk_v2/python/test/integration/test_audio_client.py`	Converts audio tests to async
`sdk_v2/python/test/integration/test_model_lifecycle.py`	Converts model lifecycle tests to async
`sdk_v2/python/test/integration/test_ep_lifecycle.py`	Converts EP discovery test to async
`sdk_v2/python/test/integration/test_web_service_and_eps.py`	Converts web service/EP tests to async
`sdk_v2/python/test/integration/test_zz_manager_shutdown.py`	Converts shutdown tests to async
`sdk_v2/python/test/integration/test_zz_singleton_recreate.py`	Converts singleton recreate tests to async

+        def _blocking_stream():
+            """Run blocking streaming in a separate thread."""
+            items = []
+            with ChatSession(self._model) as session:
+                session.set_streaming(True)
+                with Request() as request:
+                    request.add_item(TextItem(request_json, TextItemType.OPENAI_JSON))
+                    with session.process_streaming_request(request) as stream:
+                        for item in stream:
+                            items.append(item)
+            return items
+
+        # Run blocking operation in thread and yield each result
+        items = await asyncio.to_thread(_blocking_stream)


+        def _blocking_stream():
+            """Run blocking streaming in a separate thread."""
+            items = []
+            with AudioSession(self._model) as session:
+                session.set_streaming(True)
+                with Request() as request:
+                    request.add_item(TextItem(request_json, TextItemType.OPENAI_JSON))
+                    with session.process_streaming_request(request) as stream:
+                        for item in stream:
+                            items.append(item)
+            return items
+
+        # Run blocking operation in thread and yield each result
+        items = await asyncio.to_thread(_blocking_stream)
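
Both hunks above collect every item into a list inside the worker thread, so a caller would receive nothing until the native stream has finished. One possible alternative is to hand items to the event loop as they arrive. The sketch below assumes the native stream can safely be iterated from a worker thread; `stream_from_thread` and `make_iterable` are hypothetical names, not SDK APIs, and cancellation of the native request is left out.

```python
import asyncio
from typing import AsyncGenerator, Callable, Iterable

_DONE = object()  # sentinel marking the end of the stream


async def stream_from_thread(
    make_iterable: Callable[[], Iterable[str]],
) -> AsyncGenerator[str, None]:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def _producer() -> None:
        # Runs in a worker thread; each item is handed to the loop thread-safely.
        try:
            for item in make_iterable():
                loop.call_soon_threadsafe(queue.put_nowait, item)
        except Exception as exc:
            # Forward errors so the consumer can re-raise them.
            loop.call_soon_threadsafe(queue.put_nowait, exc)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, _DONE)

    producer = asyncio.create_task(asyncio.to_thread(_producer))
    try:
        while True:
            item = await queue.get()
            if item is _DONE:
                break
            if isinstance(item, Exception):
                raise item
            yield item  # delivered as soon as the thread produced it
    finally:
        # Wait for the worker thread to finish before returning.
        await producer
```

A consumer would iterate with `async for item in stream_from_thread(...)`. The queue here is unbounded, so a slow consumer lets items accumulate; a bounded hand-off would need a different mechanism.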


    def __del__(self) -> None:
        # Best-effort safety net — production code should call close() explicitly.
        try:
-            self.close()
+            if self._native_manager is not None:
+                asyncio.run(self.close())
        except Exception:
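
One thing worth checking in the `__del__` hunk above: `asyncio.run()` raises `RuntimeError` when the calling thread already has a running event loop, and the surrounding `except Exception` would swallow that error, so cleanup could be skipped silently if the object is collected inside a coroutine. The sketch below demonstrates only the `asyncio.run()` behaviour; `close` is a hypothetical stand-in, not the manager's method.

```python
import asyncio


async def close() -> str:
    """Hypothetical stand-in for an async close method."""
    return "closed"


async def call_run_inside_loop() -> str:
    coro = close()
    try:
        # Nested use: this coroutine is itself being driven by a running loop.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return type(exc).__name__
    return "no error"
```

Running `asyncio.run(call_run_inside_loop())` returns `"RuntimeError"`. Explicitly awaiting `close()` or using the `__aenter__`/`__aexit__` pair listed in the file summary avoids relying on this path.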


@@ -314,15 +345,18 @@ def close(self) -> None:
                if FoundryLocalManager.instance is self:
                    FoundryLocalManager.instance = None

-    def __enter__(self) -> "FoundryLocalManager":
+        return await asyncio.to_thread(_close)


-import threading
-from typing import Callable
+import asyncio
+from typing import AsyncGenerator, Callable


+import asyncio
 from abc import ABC, abstractmethod
-from typing import TYPE_CHECKING, Callable
+from typing import TYPE_CHECKING, AsyncGenerator, Callable


+@pytest.fixture(scope="session")
+def event_loop():
+    loop = asyncio.new_event_loop()
+    yield loop
+    loop.close()


+    async def stop_web_service(self) -> None:
        """Stop the optional built-in web service.

        Raises:
            FoundryLocalException: If the web service is not currently running.
        """
-        from foundry_local_sdk._native.api import api
+        if self.urls is None:
+            raise FoundryLocalException("Web service is not running.")

-        with FoundryLocalManager._lock:
-            if self.urls is None:
-                raise FoundryLocalException("Web service is not running.")
+        def _stop():
+            from foundry_local_sdk._native.api import api

            api.check_status(api.root.Manager_WebServiceStop(self._native_manager))
            self.urls = None

+        return await asyncio.to_thread(_stop)
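
In the hunk above, the `self.urls is None` check now happens outside any lock, so two concurrent callers could both pass it before either one stops the service. A minimal sketch of one way to keep the check and the stop atomic is shown below, assuming an `asyncio.Lock` like the one the file summary mentions. `WebServiceSketch`, `_native_stop`, and the placeholder URL are hypothetical, and `RuntimeError` stands in for `FoundryLocalException`.

```python
import asyncio
from typing import Optional


class WebServiceSketch:
    """Hypothetical stand-in for the manager; not the SDK's class."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.urls: Optional[list] = ["http://127.0.0.1:0"]  # placeholder value
        self.stop_calls = 0

    def _native_stop(self) -> None:
        # Placeholder for the blocking native stop call.
        self.stop_calls += 1

    async def stop_web_service(self) -> None:
        # Hold the lock across the check and the stop so they act as one step.
        async with self._lock:
            if self.urls is None:
                raise RuntimeError("Web service is not running.")
            await asyncio.to_thread(self._native_stop)
            self.urls = None


async def stop_twice_concurrently() -> tuple:
    svc = WebServiceSketch()
    results = await asyncio.gather(
        svc.stop_web_service(), svc.stop_web_service(), return_exceptions=True
    )
    return svc.stop_calls, [type(r).__name__ for r in results]
```

With the lock in place, the second concurrent caller waits, then sees `urls is None` and raises, so the native stop runs once.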


asyncify

4728804

Copilot AI review requested due to automatic review settings June 18, 2026 15:56

vercel Bot deployed to Preview June 18, 2026 15:56

Copilot started reviewing on behalf of prathikr June 18, 2026 15:57

Copilot AI reviewed Jun 18, 2026

prathikr wants to merge 1 commit into main from prathikrao/asyncify-python-sdk-v2
