---
name: ecommerce-price-monitor
description: >-
  Monitor prices on a set of e-commerce product URLs. Fetch each product page (including dynamic pages when needed), clean the page content to remove tracking-heavy noise (strip script tags except ld+json, remove style and svg tags) before LLM analysis, use AI to extract the current primary product purchase price from the page contents, compare it to a per-product trigger price, and record both the extracted current price and the trigger price into local history/state files. Use when a user asks to run a daily (cron) price check, or when you need automated price watching from a configured product list.
---

# Ecommerce Price Monitor

## Workflow

### 0) Load configuration

Read this config file (if missing, create a helpful default and report that no products are configured yet):

- `state/ecommerce-price-monitor.products.json`

Expected shape:

- `products[]` entries, each with:
  - `id` (string, required)
  - `url` (string, required)
  - `triggerPrice` (number, required)
  - `triggerCurrency` (string, optional; defaults from top-level `defaultCurrency`)
  - `comparator` (string, optional: `lte` or `gte`; default `lte`)
  - `priceHint` (string, optional; e.g. “main price, exclude shipping”) 

### 1) Fetch page contents

For each product URL:

- Prefer `web_fetch` (fast, text extraction).
- If the price is not clearly extractable from fetched text (common with JS-rendered pages), use the `browser` tool to load the page and obtain rendered DOM/text.

Collect enough surrounding evidence (e.g., a quoted snippet or short HTML fragment) to justify the extracted price.

#### Clean content before LLM analysis

Before sending any page content to the LLM for price extraction:

- If you have HTML, run: `python3 scripts/clean_scraped_content.py`
- Strip wasteful markup (tracking scripts, CSS, icons):
  - Remove `<script>` tags except those with `type="application/ld+json"`
  - Remove `<style>` and `<svg>` tags

Then send only the cleaned HTML/text (or a small relevant subset) to the LLM.

#### Fast-path for JSON endpoints (deterministic)

If the URL looks like it returns JSON (for example it ends with `.json` or contains `/_next/data/`), run a deterministic extraction first by running:

- `python3 scripts/extract_product_price.py --url "<productUrl>"`

If `priceHint` contains `listPrice`, also pass `--pricehint "listPrice"` (or the full `priceHint`) so the script can prefer `listPrice` over `sellingPrice`.

If the script returns `ok=true` with `priceValue` (and currency/evidence), use that as `currentPrice` and skip the AI extraction step for that URL. If the script fails (or returns `price-not-found`), only then fall back to AI extraction.


### 2) AI-assisted price extraction (flexible per URL)

From the fetched content, choose the primary product purchase price (not shipping/tax/secondary offers) and extract:

- `currency` (best-effort)
- `priceValue` as a number (normalize decimal separators)
- `rawPriceText` (verbatim)
- `evidence` (short quote/snippet used to choose the price)
- `extractionNote` (optional, especially if only a range or ambiguous price is visible)

Extraction hierarchy and ignore rules (apply in order):

1. **Hidden structured data (highest priority)**
   - Look for `<script type="application/ld+json">` blocks containing Schema.org `"@type": "Product"`.
   - Extract the price from `offers` / `Offer` (prefer active purchase price; skip unavailable/out-of-stock offers when an available one exists).
   - Prefer `price` / `highPrice/lowPrice` when present. If ranges are present, treat `lowPrice` as the best-effort “current” value.

2. **Open Graph / meta tags**
   - If structured data is missing, look in the HTML head for price meta tags such as `og:price:amount` and `product:price:amount`.

3. **Visible page text (fallback)**
   - Locate the main product heading (H1) and select the closest prominent monetary value.

Ignore rules (for this monitor’s “current active purchase price”):
- Ignore prices from “Recommended Products”, “Customers Also Bought”, and similar blocks.
- Ignore strike-through / original prices unless the product explicitly requests it via `priceHint` (for example `priceHint: "listPrice"`).
- Ignore shipping costs.

Final selection guidance:
- If currency is ambiguous, fall back to the product’s `triggerCurrency` (or config default) and note uncertainty.
- Prefer the single price that most directly represents the active purchase price for the main product.

### 3) Compare to trigger price

Compute:

- `triggerHit`:
  - `lte`: `priceValue <= triggerPrice`
  - `gte`: `priceValue >= triggerPrice`

### 4) Register results (history + latest)

Append one JSON record per product per run to:

- `state/ecommerce-price-history.jsonl`

Also write/update a latest snapshot file for convenience:

- `state/ecommerce-price-monitor.latest.json`

History record shape (one JSON object per line):

```json
{
  "runAt": "ISO-8601",
  "productId": "...",
  "url": "...",
  "triggerPrice": 0,
  "triggerCurrency": "...",
  "comparator": "lte|gte",
  "currentPrice": {
    "priceValue": 0,
    "currency": "...",
    "rawPriceText": "...",
    "evidence": "..."
  },
  "triggerHit": true,
  "extractionNote": "optional"
}
```

### 5) Final response (what to announce)

Return a concise human-readable summary for all products:

- product id (and name if available)
- current price + currency
- trigger price + comparator
- `TRIGGER HIT` when applicable

If extraction fails for a URL, include:

- product id + url
- short reason
- suggestion (e.g. “add priceHint”)
