Vision AI - REST API Documentation

About Vision AI

Computer Vision API Documentation

asticaVision is a general-purpose computer vision model available through a REST API. It gives developers a comprehensive suite of image analysis and understanding capabilities to offer their users.

Use the asticaVision API to process images and automate image moderation, categorization, face detection, and object detection.

REST API ‐ Examples

Computer Vision Code Samples

Endpoint

https://vision.astica.ai/describe

Basic CURL Usage

curl --location --request POST 'https://vision.astica.ai/describe' \
--header 'Content-Type: application/json' \
--data '{
  "tkn": "your_api_key",
  "modelVersion": "2.5_full",
  "input": "https://www.astica.org/inputs/analyze_3.jpg",
  "visionParams": "gpt, describe, describe_all, tags, objects",
  "objects_custom_kw": "",
  "gpt_prompt": "",
  "gpt_length": "90"
}'
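
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `describe` are hypothetical, and only the fields shown in the curl example are assumed to exist. Optional fields are left out for brevity.

```python
import json
import urllib.request

ENDPOINT = "https://vision.astica.ai/describe"


def build_request(api_key, image, vision_params, model_version="2.5_full"):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "tkn": api_key,                 # API key field name taken from the curl example
        "modelVersion": model_version,
        "input": image,                 # HTTPS URL or base64 string
        "visionParams": vision_params,  # comma-separated string
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def describe(api_key, image, vision_params, timeout=60):
    """Send the request and decode the JSON response body."""
    request = build_request(api_key, image, vision_params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `describe("your_api_key", "https://www.astica.org/inputs/analyze_3.jpg", "describe, tags")` would perform the network call; `build_request` on its own is useful for inspecting the payload first.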

Vision Inputs

Input List

Vision API Inputs

Each call to the asticaVision API should include the computer vision parameters that you want to run. If you do not supply any parameters, the request defaults to all of them, and you will be billed for every parameter on every call.

List of Astica Vision Inputs

token

string

Your astica API key. The curl example above sends it in the "tkn" field.

modelVersion

string

Model versions can be one of the following:

- 2.5_full
- 2.1_full

Older models:

- 1.0_full
- 2.0_full

input

string

Accepts a valid HTTPS link to an image file or a base64-encoded string.

Max Filesize: 20MB
Max Resolution: 16000x16000
Compatible File Types
- .PNG
- .JPG
Version 2.5 or higher:
- .BMP
- .JIFF
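
For a local file, the base64 form of the input can be produced as sketched below. The helper name `encode_image` is hypothetical, the 20MB limit is assumed to mean 20 * 1024 * 1024 bytes, and this document does not say whether the API expects a bare base64 string or a data-URI prefix, so the sketch returns the bare string.

```python
import base64
from pathlib import Path

# Documented limit is 20MB; the exact byte count is an assumption.
MAX_FILE_BYTES = 20 * 1024 * 1024


def encode_image(path):
    """Return the file's contents as a base64 string, rejecting oversized files."""
    data = Path(path).read_bytes()
    if len(data) > MAX_FILE_BYTES:
        raise ValueError(f"{path} is {len(data)} bytes; the documented limit is 20MB")
    return base64.b64encode(data).decode("ascii")
```

The returned string would then be passed as the "input" value in place of an HTTPS link.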

visionParams

string

Comma-separated string of the vision parameters to run, for example "describe, tags, objects".

gpt_prompt (optional)

string

Use a custom prompt to control the GPT-S description produced by Vision AI. Use natural language prompts to request a specific writing style, written language and more.

Min characters: 8
Max characters: 150 (version 2.1 or lower)
Max characters: 325 (version 2.5 or higher)
Required Parameter: gpt or gpt_detailed
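
These limits can be checked on the client before a request is sent. The sketch below is illustrative only: `parse_model_version` and `validate_gpt_prompt` are hypothetical helpers, and versions between 2.1 and 2.5 (which this document does not list) are assumed to use the lower limit.

```python
def parse_model_version(model_version):
    """Extract the numeric part of a version string such as '2.5_full'."""
    return float(model_version.split("_", 1)[0])


def validate_gpt_prompt(prompt, model_version, vision_params):
    """Return the prompt unchanged if it satisfies the documented limits."""
    # 325 characters for version 2.5 or higher, otherwise 150 (assumption for unlisted versions).
    max_chars = 325 if parse_model_version(model_version) >= 2.5 else 150

    # gpt_prompt only applies when gpt or gpt_detailed is requested.
    requested = {name.strip() for name in vision_params.split(",")}
    if not requested & {"gpt", "gpt_detailed"}:
        raise ValueError("gpt_prompt requires the gpt or gpt_detailed parameter")

    if not 8 <= len(prompt) <= max_chars:
        raise ValueError(f"gpt_prompt must be 8 to {max_chars} characters long")
    return prompt
```

Validating locally avoids spending a billed request on a prompt the API would reject.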

prompt_length (optional)

integer

An optional integer that sets the target number of words for the GPT-S description. Defaults to 90 for "gpt" and 125 for "gpt_detailed".

Min value: 8
Max value: 125 (version 2.1 or lower)
Max value: 250 (version 2.5 or higher)

objects_custom_kw (optional)

string

Optional comma-separated string of keywords to be detected. Used to annotate and detect bounding boxes of domain-specific or highly specific objects.

Max characters: 245
Required model version: v2.5 or higher
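
A keyword string that respects these limits could be assembled as follows. This is a sketch under assumptions: `build_custom_keywords` is a hypothetical helper, and joining with a bare comma (no space) is a guess, since this document only says "comma-separated".

```python
def build_custom_keywords(keywords, model_version="2.5_full"):
    """Join keywords into one comma-separated string within the documented limits."""
    version = float(model_version.split("_", 1)[0])
    if version < 2.5:
        raise ValueError("objects_custom_kw requires model version 2.5 or higher")

    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [kw.strip() for kw in keywords if kw.strip()]
    joined = ",".join(cleaned)
    if len(joined) > 245:
        raise ValueError(f"keyword string is {len(joined)} characters; the limit is 245")
    return joined
```

For example, `build_custom_keywords(["frisbee", "dog leash"])` produces a value for the "objects_custom_kw" field.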

Vision Parameters

Parameter List

Vision API Parameters

Each request to the API can contain one or more vision parameters, which determine which capabilities are engaged. You can use multiple Vision AI functions in a single API call.

If left blank, all available parameters will be used. For cost-effective usage, request only the vision parameters that are relevant to your use case.
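
Because an empty value selects every parameter, it can help to build the visionParams string from an explicit list. The sketch below is illustrative: `build_vision_params` is a hypothetical helper, and `KNOWN_PARAMS` contains only the names that appear in this document, which may not be the complete set the API accepts.

```python
# Parameter names that appear in this document; the API may accept others.
KNOWN_PARAMS = (
    "describe", "describe_all", "text_read", "gpt", "gpt_detailed", "faces",
    "objects", "objects_custom", "objects_color", "moderate", "tags",
)


def build_vision_params(names):
    """Return a comma-separated visionParams string from a list of names."""
    if not names:
        # An empty value would select (and bill for) every parameter.
        raise ValueError("select at least one vision parameter")
    unknown = [name for name in names if name not in KNOWN_PARAMS]
    if unknown:
        raise ValueError(f"unrecognized vision parameters: {unknown}")
    # dict.fromkeys removes duplicates while preserving order.
    return ", ".join(dict.fromkeys(names))
```

Refusing an empty list makes the "bill for everything" default an explicit choice rather than an accident.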

Specifying Vision Parameters

Use a comma-separated string:

"visionParams": "gpt_detailed, describe, describe_all, faces, moderate",

List of Computer Vision Parameters

describe

Returns a caption which describes the image.

describe_all (v2.0 or higher)

Returns multiple auxiliary captions that describe the image.

text_read (v2.0 or higher)

Returns all text found in the image. OCR results include word-level bounding boxes.

gpt

Produces a detailed paragraph of roughly 90 words describing the image.

gpt_detailed

Produces a highly detailed paragraph with increased verbosity and improved logic and reasoning for custom prompting. Using this parameter increases the processing time of the request by several seconds.

faces

Returns the age and gender of all faces detected in the image.

objects

Returns the name and bounding box of detected objects.

objects_custom (v2.5 or higher)

Returns the bounding box of each custom object listed in the user-provided objects_custom_kw.

objects_color (v2.5 or higher)

Returns the color codes of all detected objects (including custom objects).

moderate

Returns a calculated score for each type of sensitive material found in the image.
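
As an illustration of consuming the moderate result, the sketch below flags an image when any boolean flag is set or any score reaches a threshold. The field names are taken from the sample response in this document; `is_flagged` and the 0.5 default threshold are hypothetical and should be tuned to the application.

```python
def is_flagged(moderate, threshold=0.5):
    """Return True if any moderation flag is set or any score meets the threshold."""
    flags = (
        moderate.get("isAdultContent"),
        moderate.get("isRacyContent"),
        moderate.get("isGoryContent"),
    )
    scores = (
        moderate.get("adultScore", 0),
        moderate.get("racyScore", 0),
        moderate.get("goreScore", 0),
    )
    return any(flags) or any(score >= threshold for score in scores)
```

Checking the raw scores as well as the flags lets an application apply a stricter policy than the API's own cutoffs.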

categories

list (array)

a list of general categories describing the image

{
    "categories": [
        {
          "name": "people",
          "score": 0.7075
        }
    ]
}

brands

list (array)

a list of known brands and logos detected in the image

{
    "brands": [
        {
          "name": "Coca-Cola",
          "confidence": 0.827,
          "rectangle": {
            "x": 13,
            "y": 390,
            "w": 145,
            "h": 90
          }
        },
        {
          "name": "Coca-Cola",
          "confidence": 0.833,
          "rectangle": {
            "x": 160,
            "y": 390,
            "w": 140,
            "h": 89
          }
        }
    ]
}

landmarks

list (array)

a list of known geographical landmarks found in the image.

{
    "landmarks": [
        {
          "name": "Eiffel Tower",
          "score": "1.00"
        }
    ]
}

celebrities

list (array)

a list of known public figures found in the image.

{
    "celebrities": [
        {
          "name": "Jane Doe",
          "score": "1.00",
          "rectangle": {
            "left": 570,
            "top": 282,
            "width": 392,
            "height": 392
          }
        },
        {
          "name": "John Doe",
          "score": "1.00",
          "rectangle": {
            "left": 270,
            "top": 251,
            "width": 197,
            "height": 291
          }
        }
    ]
}
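
Note that the brands example uses x, y, w, h for its rectangle while the celebrities example uses left, top, width, height. A small helper can normalize both shapes to corner coordinates. This sketch assumes that x/y and left/top both denote the top-left corner in pixels, which this document does not state explicitly; `to_corners` is a hypothetical name.

```python
def to_corners(rect):
    """Convert either rectangle shape to (left, top, right, bottom)."""
    if "x" in rect:
        # Shape used by the brands example: x, y, w, h
        left, top, width, height = rect["x"], rect["y"], rect["w"], rect["h"]
    else:
        # Shape used by the celebrities example: left, top, width, height
        left, top = rect["left"], rect["top"]
        width, height = rect["width"], rect["height"]
    return (left, top, left + width, top + height)
```

Corner tuples in this form can be passed directly to most drawing or cropping routines.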

astica

dict (object)

contains request specification and transaction details

{
    "astica": {
        "request": "vision",
        "requestType": "analyze",
        "modelVersion": "2.5",
        "api_qty": 30.778
    }
}

Vision API Response Output

The following output was produced by submitting this image.

{
  "astica": {
    "request": "vision",
    "requestType": "analyze",
    "modelVersion": "2.5",
    "api_qty": 30.778
  },
  "caption": {
    "text": "a dog standing on its hind legs and a person standing on the sidewalk",
    "confidence": 1
  },
  "caption_GPTS": "In the image, we see a dog standing on its hind legs on a sidewalk, reaching out to a person standing nearby. The person is wearing black sneakers and jeans. The dog appears to be a German Shepherd, and it is interacting playfully with the person. The scene takes place in a park, with green grass and trees in the background. The person is extending their hand towards the dog, possibly offering a treat or engaging in a game of fetch with a frisbee. The interaction between the dog and the person is heartwarming and playful, showcasing a bond between human and animal.",
  "caption_list": [
    {
      "text": "a dog being held by a person in a park",
      "conf": 0.333,
      "bbox": {
        "w": 190,
        "h": 461,
        "x": 166,
        "y": 0
      },
      "guid_object": "0_9aca5"
    },
    {
      "text": "the dog's face is a man's hand and a dog's leg",
      "conf": 0.133,
      "bbox": {
        "w": 190,
        "h": 461,
        "x": 166,
        "y": 0
      },
      "guid_object": "0_9aca5"
    },
    {
      "text": "the image is a german shepherd dog and a man",
      "conf": 0.133,
      "bbox": {
        "w": 256,
        "h": 413,
        "x": 304,
        "y": 45
      },
      "guid_object": "1_fe687"
    },
    {
      "text": "a dog standing on its hind legs to reach a person's hand",
      "conf": 0.133,
      "bbox": {
        "w": 256,
        "h": 413,
        "x": 304,
        "y": 45
      },
      "guid_object": "1_fe687"
    },
    {
      "text": "a dog and a person playing with a frisbee",
      "conf": 0.133,
      "bbox": {
        "w": 256,
        "h": 413,
        "x": 304,
        "y": 45
      },
      "guid_object": "1_fe687"
    },
    {
      "text": "the shoe of a person standing on a skateboard",
      "conf": 0.133,
      "bbox": {
        "w": 124,
        "h": 67,
        "x": 223,
        "y": 394
      },
      "guid_object": "2_dbc99"
    },
    {
      "text": "a person's feet and shoes on a sidewalk",
      "conf": 0.133,
      "bbox": {
        "w": 124,
        "h": 67,
        "x": 223,
        "y": 394
      },
      "guid_object": "2_dbc99"
    },
    {
      "text": "a person wearing black shoes and jeans standing on a sidewalk",
      "conf": 0.133,
      "bbox": {
        "w": 124,
        "h": 67,
        "x": 223,
        "y": 394
      },
      "guid_object": "2_dbc99"
    },
    {
      "text": "the dog's paw is a woman's hand",
      "conf": 0.133,
      "bbox": null,
      "guid_object": ""
    },
    {
      "text": "a dog standing on its hind legs to get a treat from a person",
      "conf": 0.133,
      "bbox": null,
      "guid_object": ""
    },
    {
      "text": "a dog and a person standing on a sidewalk",
      "conf": 0.133,
      "bbox": null,
      "guid_object": ""
    },
    {
      "text": "the dog's face is the same as the human",
      "conf": 0.133,
      "bbox": null,
      "guid_object": ""
    }
  ],
  "objects": [
    {
      "name": "Person",
      "conf": 0.83,
      "guid_object": "0_9aca5",
      "confidence": 0.83,
      "rectangle": {
        "x": 166,
        "y": 0,
        "w": 190,
        "h": 461
      }
    },
    {
      "name": "Cat",
      "conf": 0.63,
      "guid_object": "1_fe687",
      "confidence": 0.63,
      "rectangle": {
        "x": 304,
        "y": 45,
        "w": 256,
        "h": 413
      }
    },
    {
      "name": "Sneakers",
      "conf": 0.54,
      "guid_object": "2_dbc99",
      "confidence": 0.54,
      "rectangle": {
        "x": 223,
        "y": 394,
        "w": 124,
        "h": 67
      }
    }
  ],
  "objects_custom": [],
  "tags": [
    {
      "name": "grass",
      "confidence": 0.87
    },
    {
      "name": "plant",
      "confidence": 0.8
    },
    {
      "name": "reed",
      "confidence": 0.7
    },
    {
      "name": "sky",
      "confidence": 0.71
    },
    {
      "name": "tree",
      "confidence": 0.84
    },
    {
      "name": "weed",
      "confidence": 0.67
    },
    {
      "name": "animal",
      "confidence": 0.72
    },
    {
      "name": "floor",
      "confidence": 0.73
    },
    {
      "name": "ledge",
      "confidence": 0.68
    },
    {
      "name": "log",
      "confidence": 0.65
    },
    {
      "name": "stand",
      "confidence": 0.88
    },
    {
      "name": "blanket",
      "confidence": 0.76
    },
    {
      "name": "fur",
      "confidence": 0.73
    },
    {
      "name": "pillow",
      "confidence": 0.65
    },
    {
      "name": "curtain",
      "confidence": 0.72
    },
    {
      "name": "dress shirt",
      "confidence": 0.7
    },
    {
      "name": "selfie",
      "confidence": 0.6
    },
    {
      "name": "tie",
      "confidence": 0.84
    },
    {
      "name": "woodpecker",
      "confidence": 0.51
    },
    {
      "name": "brown",
      "confidence": 0.84
    },
    {
      "name": "brown bear",
      "confidence": 0.66
    },
    {
      "name": "enclosure",
      "confidence": 0.7
    },
    {
      "name": "stone",
      "confidence": 0.79
    },
    {
      "name": "stare",
      "confidence": 0.66
    },
    {
      "name": "walk",
      "confidence": 0.82
    },
    {
      "name": "german shepherd",
      "confidence": 0.67
    },
    {
      "name": "black",
      "confidence": 0.86
    },
    {
      "name": "catch",
      "confidence": 0.83
    },
    {
      "name": "sheepdog",
      "confidence": 0.72
    },
    {
      "name": "dog",
      "confidence": 1
    },
    {
      "name": "man",
      "confidence": 0.86
    },
    {
      "name": "park",
      "confidence": 0.89
    },
    {
      "name": "paw",
      "confidence": 0.76
    },
    {
      "name": "play",
      "confidence": 0.81
    },
    {
      "name": "shepherd",
      "confidence": 0.79
    },
    {
      "name": "woman",
      "confidence": 0.87
    }
  ],
  "colors": {
    "dominantColors": [
      "#dccbcb",
      "#2e2729",
      "#92766a"
    ],
    "accentColor": "#1c181b",
    "list": [
      "#b89789",
      "#93756a",
      "#1c181b",
      "#ece0e3",
      "#403737",
      "#ccb6b4",
      "#6d584b"
    ]
  },
  "colors_object": [
    {
      "guid_object": "0_9aca5",
      "primary": [
        "#9d8a78",
        "#282529",
        "#ddd6d9"
      ],
      "accent": "#161418",
      "list": [
        "#3a363a",
        "#bea48d",
        "#7c7163",
        "#ddd6d9",
        "#161418"
      ]
    },
    [
      {
        "guid_object": "1_fe687",
        "primary": [
          "#b1957f",
          "#443932",
          "#e9dcd9"
        ],
        "accent": "#282120",
        "list": [
          "#e9dcd9",
          "#282120",
          "#957e6b",
          "#605144",
          "#cdac93"
        ]
      }
    ],
    [
      {
        "guid_object": "2_dbc99",
        "primary": [
          "#433e37",
          "#af987d",
          "#f2dfd7"
        ],
        "accent": "#2a2727",
        "list": [
          "#c5ab92",
          "#5c5547",
          "#9a8569",
          "#f2dfd7",
          "#2a2727"
        ]
      }
    ]
  ],
  "categories": [],
  "faces": [],
  "brands": [],
  "landmarks": [],
  "celebrities": [],
  "readResult": {
    "stringIndexType": "TextElements",
    "content": "",
    "pages": [
      {
        "height": 486,
        "width": 729,
        "angle": 0,
        "pageNumber": 1,
        "words": [],
        "spans": [
          {
            "offset": 0,
            "length": 0
          }
        ],
        "lines": []
      }
    ],
    "styles": []
  },
  "moderate": {
    "isAdultContent": false,
    "isRacyContent": false,
    "isGoryContent": false,
    "adultScore": 0.01,
    "racyScore": 0.01,
    "goreScore": 0
  },
  "metadata": {
    "width": 729,
    "height": 486,
    "format": "jpg"
  },
  "GPT_level": 0,
  "status": "success"
}
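
In the output above, entries in caption_list share a guid_object with entries in objects, so captions can be grouped by the object they describe. The sketch below is one possible reading of that structure: `captions_by_object` is a hypothetical helper, and treating captions with an empty guid_object as whole-image captions is an assumption.

```python
from collections import defaultdict


def captions_by_object(response):
    """Group caption texts under the name of the object they refer to."""
    # Map each object's guid to its detected name.
    names = {obj["guid_object"]: obj["name"] for obj in response.get("objects", [])}

    grouped = defaultdict(list)
    for caption in response.get("caption_list", []):
        # Captions with no matching object are assumed to describe the whole image.
        label = names.get(caption.get("guid_object"), "(whole image)")
        grouped[label].append(caption["text"])
    return dict(grouped)
```

Objects that share a name are merged under one key here; keying by guid_object instead would keep them separate.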

Vision Errors

Error Handling

Vision API Errors

If the API encounters an issue while processing a request, it returns a handled error: a JSON object whose "status" is "error" and whose "error" field describes the problem.

Vision API Error Format

{
    "status": "error",
    "error": "Unable to handle asticaVision v2.1_full request."
}
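
A client can turn this error format into an exception so that failures are not mistaken for results. The sketch below relies only on the "status" and "error" fields shown above; `VisionAPIError` and `parse_response` are hypothetical names.

```python
import json


class VisionAPIError(Exception):
    """Raised when the API reports a handled error."""


def parse_response(body):
    """Decode a response body and raise if it reports an error."""
    result = json.loads(body)
    if result.get("status") == "error":
        raise VisionAPIError(result.get("error", "unknown error"))
    return result
```

Successful responses are returned unchanged, so the caller can read fields such as "caption" or "objects" directly.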

More About Vision AI

asticaVision Features

1. Face Detection (Age and Gender):
The asticaVision API detects faces within images and estimates the age and gender of each detected face. This demographic information can support targeted marketing or personalized user experiences.

2. Object Detection:
The API detects a wide range of objects within images, letting users check for the presence of specific items or elements. Typical applications include inventory management, surveillance, and visual search.

3. Image Tagging and Categorization:
The asticaVision API uses machine learning to generate descriptive tags and assign categories to images based on their visual content. This simplifies organizing and indexing large image databases, and makes it easier to search for and retrieve images by keyword or theme.

4. Content Moderation:
The API includes content moderation capabilities that automatically detect images containing adult content or mature subject matter. This reduces the manual effort required to moderate visual content and helps keep inappropriate material off websites, social platforms, and digital applications.

5. Automatic Image Description and Captioning:
The asticaVision API can automatically generate descriptive text and captions that summarize an image's visual content. This improves the accessibility of images for visually impaired users and gives all users meaningful, contextual information about the images.
