Integrate "get model versions" and "download specific model version" into cpp Core with max_versions#816
Integrate "get model versions" and "download specific model version" into cpp Core with max_versions#816selenayang888 wants to merge 14 commits into
Conversation
…o baijumeswani/catalog
…/integrate-get-model-versions-into-cpp
…odel-versions-into-cpp
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
| /// @param model_alias Alias of the model (e.g. "phi-4-mini"). May be NULL to return | ||
| /// all versions of all models (subject to device/EP filtering). |
There was a problem hiding this comment.
Is there a use-case where someone needs to find all versions of all models? That seems like a problematic thing to enable given how much data could be retrieved.
If we require the alias I assume we have no need to expose pagination which is adding a lot of complexity to the code.
If a user really really needs to get all versions for all models they can iterate each alias themselves.
There was a problem hiding this comment.
If we can remove the complexity from public pagination the implementation for fetching all versions of an alias or model name should be pretty similar to AzureCatalogClient::FetchModelsByIds, just with a slightly different filter.
Foundry-Local/sdk_v2/cpp/src/catalog/azure_catalog_client.cc
Lines 373 to 385 in 01939b1
| /// @param max_versions Soft upper bound on the number of variants returned. | ||
| /// Pass 0 (or any negative value) for no cap. |
There was a problem hiding this comment.
Is the max on the versions or the variants? This feels like a really confusing value as I can think of various permutations between version and variant.
e.g. say I have model A with CPU (v1, 2, 3) and GPU variants (v 2, 3, 4).
'latest 2 versions' could be anything that is v3 or v4 (per alias), or it could be v2 and v3 for CPU and v3 and v4 for GPU (per variant).
I'm not sure which is best though. If versioning tends to be per variant, selecting the latest X for the variant might make more sense.
| /// Returned list contains existing flModel handles owned by the catalog; releasing | ||
| /// the list does not invalidate the underlying model handles. | ||
| FL_API_STATUS(GetModelVersions, _In_ const flCatalog* catalog, _In_opt_ const char* model_alias, | ||
| _In_opt_ const char* variant_name, int32_t max_versions, |
There was a problem hiding this comment.
nit: would 'model_name' be more consistent given this value comes from Info_GetName (ModelInfo.Name) doesn't it?
| /// `annotations/tags/alias` filter scopes the result to that one alias; when | ||
| /// empty, no alias filter is added and every versioned model the local hardware | ||
| /// can run is returned across all its versions. | ||
| std::vector<std::vector<CatalogFilter>> BuildAllVersionsFilters(const IEpDetector& ep_detector, |
There was a problem hiding this comment.
Can we add a param to BuildSearchFilters for 'all versions' or 'latest only' and linclude the 'labels' filter based on that instead of duplicating the logic here?
That said, shouldn't the model name be passed in here and used in the filter if provided so all filtering is server side?
| return created1 > created2; | ||
| } | ||
|
|
||
| int CompareCaseInsensitive(const std::string& lhs, const std::string& rhs) { |
There was a problem hiding this comment.
nit: might be good to put this in utils/string_utils.h
| /// Provided with a default `{}` body so an implementation that has not yet | ||
| /// overridden it still compiles. | ||
| virtual PagedModelInfos FetchAllVersionsByAlias( | ||
| const std::string& /*model_alias*/, |
There was a problem hiding this comment.
Would have expected the model name to be passed in here if it can be used in the filter.
| // Subsequent pages are pinned to the region that served page 1 — a filter set | ||
| // never mixes regions (continuation tokens are region-specific). | ||
| response = http_post_response_(BuildRegionalUrl(url_prefix_, url_suffix_, *pinned_region), body); |
There was a problem hiding this comment.
AI likes randomly deleting existing comment. Was it intentional to do so?
Porting the changes "get model versions" and "download specific model version" from C# into C++ Core now: