SDK rejects valid numeric RAG Corpus IDs due to _VALID_RESOURCE_NAME_…#6355
Open
Mauratay wants to merge 1 commit into googleapis:main
…REGEX in _gapic_utils.py
The SDK validates {rag_corpus} names against the following regex: _VALID_RESOURCE_NAME_REGEX = "[a-z][a-zA-Z0-9._-]{0,127}"
Because this regex requires the first character to be a lowercase letter [a-z], it rejects the numeric IDs generated by Google Cloud. As a result, functions like get_corpus_name raise a confusing ValueError that tells the user to provide the correct format, even though the ID itself is valid.
Steps to reproduce:
~~~
import vertexai
from vertexai.rag.utils import _gapic_utils
# 1. Initialize Vertex AI
vertexai.init(project="PROJECT_NAME", location="us-central1")
# 2. Attempt to use a numeric rag_corpus ID
try:
    formatted_name = _gapic_utils.get_corpus_name("123456789")
    print(f"Success: {formatted_name}")
except ValueError as e:
    print(e)
    # Output: name must be of the format `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}` or `{rag_corpus}`
~~~
# BEFORE
_VALID_RESOURCE_NAME_REGEX = "[a-z][a-zA-Z0-9._-]{0,127}"
# AFTER
_VALID_RESOURCE_NAME_REGEX = "[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}"
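The effect of the change can be checked in isolation, without the SDK. The following is a minimal sketch that copies the two patterns above and compares them on a few sample IDs. It assumes the SDK anchors the pattern to the whole ID, modelled here with `re.fullmatch`; the helper `is_valid` and the sample values are hypothetical and not part of `_gapic_utils`.

```python
import re

# Patterns copied verbatim from the BEFORE/AFTER lines of this PR.
OLD_PATTERN = "[a-z][a-zA-Z0-9._-]{0,127}"
NEW_PATTERN = "[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}"


def is_valid(pattern: str, name: str) -> bool:
    """Hypothetical helper: True if the entire name matches the pattern.

    Assumption: the SDK matches the full ID; re.fullmatch models that here.
    """
    return re.fullmatch(pattern, name) is not None


# Illustrative sample IDs (not taken from a real project).
samples = ["123456789", "my-corpus", "Corpus1", "-bad", "a" * 129]

for name in samples:
    shown = name if len(name) <= 12 else name[:12] + "..."
    print(
        f"{shown!r}: old={is_valid(OLD_PATTERN, name)} "
        f"new={is_valid(NEW_PATTERN, name)}"
    )
```

Note that the new pattern also accepts an uppercase first character (for example "Corpus1"), which the old pattern rejected. Whether the backend accepts such IDs is not stated in this PR, so reviewers may want to confirm that this widening is intended.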