Crawler Settings: Image Extraction

The Image Extraction tab configures how the system chooses a representative image for each page. That image is used in search results and in agent citations.

The crawler first checks standard metadata (e.g. og:image). If no metadata image is present, or you want to override it, configure Image URL Patterns and/or Image XPaths (under Advanced).
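
The lookup order above can be pictured as a fallback chain. The following is a minimal sketch, not the crawler's actual code: `OgImageParser`, `from_metadata`, and `choose_image` are hypothetical names, and the sketch assumes that each extraction method either returns an image URL or nothing, and that the first method to return a URL wins.

```python
from html.parser import HTMLParser
from typing import Callable, Optional, Sequence


class OgImageParser(HTMLParser):
    """Collects the first <meta property="og:image" content="..."> value."""

    def __init__(self) -> None:
        super().__init__()
        self.og_image: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "meta"
                and attributes.get("property") == "og:image"
                and self.og_image is None):
            self.og_image = attributes.get("content")


def from_metadata(html: str) -> Optional[str]:
    """Hypothetical metadata step: return the og:image URL, if any."""
    parser = OgImageParser()
    parser.feed(html)
    return parser.og_image


def choose_image(
    html: str,
    extractors: Sequence[Callable[[str], Optional[str]]],
) -> Optional[str]:
    """Try each extractor in order and return the first URL found."""
    for extract in extractors:
        url = extract(html)
        if url:
            return url
    return None


page = (
    '<html><head>'
    '<meta property="og:image" content="https://example.com/cover.jpg">'
    '</head></html>'
)
print(choose_image(page, [from_metadata]))
```

In a fuller version, the extractor list would continue with a URL-pattern step and an XPath step, in that order.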

Enable AI Auto-Detection

When enabled, the system automatically picks a representative image for pages that don’t have suitable og:image (or similar) metadata. Detection runs after each crawl in two steps: it first learns which images are global and should be excluded (logos, icons), then uses heuristics to pick the best content image for each page.

On — Use auto-detection when metadata is missing or you want a fallback.
Off — Rely only on metadata and any Image URL Patterns or Image XPaths you configure.
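
The two-step idea behind auto-detection (exclude global images, then pick per page) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the actual detection logic is not documented on this page, the function names are hypothetical, "global" is approximated as "appears on at least half of the crawled pages", and the per-page heuristic is reduced to "first remaining image in document order".

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def find_global_images(
    pages: Dict[str, List[str]], min_share: float = 0.5
) -> Set[str]:
    """Return image URLs seen on at least `min_share` of pages (assumed rule)."""
    if len(pages) < 2:
        return set()  # nothing can be called "global" with a single page
    # Count each image at most once per page.
    counts = Counter(url for images in pages.values() for url in set(images))
    threshold = min_share * len(pages)
    return {url for url, seen in counts.items() if seen >= threshold}


def pick_images(
    pages: Dict[str, List[str]], min_share: float = 0.5
) -> Dict[str, Optional[str]]:
    """Pick one image per page after dropping global images."""
    excluded = find_global_images(pages, min_share)
    picks: Dict[str, Optional[str]] = {}
    for page_url, images in pages.items():
        candidates = [url for url in images if url not in excluded]
        # Placeholder heuristic: take the first remaining image.
        picks[page_url] = candidates[0] if candidates else None
    return picks


crawl = {
    "/a": ["/logo.png", "/img/a.jpg"],
    "/b": ["/logo.png", "/img/b.jpg"],
    "/c": ["/logo.png"],
}
print(pick_images(crawl))
```

Here `/logo.png` appears on every page, so it is excluded everywhere, and page `/c` ends up with no image.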

Image URL Patterns

URL substrings that identify product or content images (e.g. /images/products/, /uploads/). The first image whose URL contains any of these patterns is used as the page thumbnail. Patterns are more flexible than XPaths and also match images set through background-image styles.

Add one pattern per row.
Order matters: the first matching image wins.
Leave empty if you only use metadata and/or XPaths.
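
The matching rule can be sketched as a plain substring test over a page's image URLs. This is an illustration under stated assumptions, not the product's implementation: the function names are hypothetical, the candidate URLs are assumed to arrive in document order, and "first matching image wins" is read as "the first image on the page that contains any pattern". The regular expression for background-image styles is likewise only an approximation.

```python
import re
from typing import Iterable, List, Optional, Sequence

# Approximate matcher for url(...) inside a background or background-image declaration.
BACKGROUND_URL = re.compile(
    r"background(?:-image)?\s*:[^;]*url\(\s*['\"]?([^'\")]+)['\"]?\s*\)"
)


def background_urls(style: str) -> List[str]:
    """Extract image URLs from an inline style attribute value."""
    return BACKGROUND_URL.findall(style)


def first_matching_image(
    image_urls: Iterable[str], patterns: Sequence[str]
) -> Optional[str]:
    """Return the first URL that contains any of the configured substrings."""
    for url in image_urls:
        if any(pattern in url for pattern in patterns):
            return url
    return None


candidates = ["/static/logo.png"]
candidates += background_urls("background-image: url('/uploads/hero.jpg')")
candidates += ["/images/products/1.jpg"]

print(first_matching_image(candidates, ["/images/products/", "/uploads/"]))
```

With these inputs the background image `/uploads/hero.jpg` is returned, because it is the first candidate that contains one of the patterns.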

Advanced: Image XPaths

XPath expressions used to find image URLs on pages that lack standard og:image (or similar) metadata. They are evaluated in the order you list them. Image URL Patterns are tried before XPaths.

Examples:
//img[@id='main_image']/@src
//div[@class='article-image']/img/@src
Use XPaths when your pages have a consistent structure but no og:image. Add one XPath per row and reorder as needed.
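
To see how an ordered XPath list behaves, the sketch below evaluates the two example expressions against a small page. It is an approximation using only Python's standard library: `evaluate_xpaths` is a hypothetical helper, `xml.etree.ElementTree` supports only a subset of XPath and requires well-formed markup, and the trailing `/@src` step is handled by hand. A real crawler would more likely use a full XPath engine and a tolerant HTML parser.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Sequence


def evaluate_xpaths(markup: str, xpaths: Sequence[str]) -> Optional[str]:
    """Return the attribute value from the first XPath that matches."""
    root = ET.fromstring(markup)
    for xpath in xpaths:
        # Split "//img[@id='x']/@src" into an element path and an attribute name.
        element_path, separator, attribute = xpath.rpartition("/@")
        if not separator:
            continue  # this sketch only handles expressions ending in /@attribute
        # ElementTree expects a relative path, so "//img" becomes ".//img".
        element = root.find("." + element_path)
        if element is not None and element.get(attribute):
            return element.get(attribute)
    return None


markup = """
<html><body>
  <div class="article-image"><img src="/media/story.jpg"/></div>
</body></html>
"""

xpaths = [
    "//img[@id='main_image']/@src",
    "//div[@class='article-image']/img/@src",
]
print(evaluate_xpaths(markup, xpaths))
```

The first expression finds nothing on this page, so the second one supplies the image URL `/media/story.jpg`.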
