Because "10.5" is likely a community-specific version or a mistyped reference to a specific commit, users should rely on the official repositories and trusted hubs to find the correct files.

X-Decoder handles segmentation masks, bounding boxes, and text tokens within a single decoding framework rather than with separate task-specific heads. In practice, you can supply an image together with a text prompt (e.g., "the red car"), and the model can return a mask or a bounding box for the object the prompt refers to.
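To make the relationship between the two output types concrete, the sketch below derives a bounding box from a binary mask. It is not the model's API: the function name `mask_to_bbox`, the `(x_min, y_min, x_max, y_max)` convention, and the toy mask standing in for a prediction for "the red car" are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Assumed convention: inclusive pixel indices (x_min, y_min, x_max, y_max).
BBox = Tuple[int, int, int, int]


def mask_to_bbox(mask: Sequence[Sequence[int]]) -> Optional[BBox]:
    """Return the tightest box around the nonzero pixels, or None if the mask is empty."""
    # Row indices that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    # Column indices of every foreground pixel.
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), ys[0], max(xs), ys[-1]


# Hypothetical 6x8 mask standing in for a predicted mask for "the red car".
mask = [[0] * 8 for _ in range(6)]
for y in range(2, 5):
    for x in range(3, 7):
        mask[y][x] = 1

print(mask_to_bbox(mask))  # (3, 2, 6, 4)
```

Because a box can always be recovered from a mask in this way, a pipeline that only receives masks can still serve box-based downstream tasks. The same logic carries over directly to array libraries if the mask arrives as a tensor.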