[agentic-token-optimizer] Optimize Daily Agentic AI Research Digest — reduce 8× AIC variance via fetch caps and content limits

### Target Workflow

**Daily Agentic AI Research Digest** (`daily-agentic-research.md`)

**Why selected:** Third-highest AIC workflow in the repository (981 AIC across 7 runs in the analysis window). Although this workflow was analyzed recently, fresh run data shows continued extreme AIC volatility (an 8× spread between the cheapest and most expensive run) driven by unconstrained web-fetch behavior, a structurally different problem from the ones earlier optimization passes addressed. The AIC monitoring family workflows would otherwise be higher priorities but were excluded from consideration because of 7-day recency.

---

### Analysis Period

**7 runs** from 2026-06-26 to 2026-07-02 (all successful, no failures).

### Spend Profile

| Metric | Value |
|--------|-------|
| Total AIC | 981.00 |
| Avg AIC / run | 140.14 |
| Min AIC / run | 39.11 (run #14, 4 turns) |
| Max AIC / run | 310.66 (run #12, ~20 turns est.) |
| AIC variance ratio (max / min) | **~8×** (310.66 / 39.11 ≈ 7.9) |
| Avg turns / run (5 known) | 9.2 |
| Avg tokens / turn | ~27,050 |
| AIC / 1,000 tokens | ~0.35 |

<details><summary>Per-run breakdown (7 runs)</summary>

| Run | Date | AIC | Tokens | Turns | AIC / turn |
|-----|------|-----|--------|-------|-------|
| [§28232804830](https://github.com/githubnext/agentic-ops/actions/runs/28232804830) | Jun 26 | 121.84 | 343,376 | 11 | 11.1/turn |
| [§28286345553](https://github.com/githubnext/agentic-ops/actions/runs/28286345553) | Jun 27 | 233.75 | ~666K est. | ~17 est. | — |
| [§28319198943](https://github.com/githubnext/agentic-ops/actions/runs/28319198943) | Jun 28 | 310.66 | ~885K est. | ~20 est. | — |
| [§28368310575](https://github.com/githubnext/agentic-ops/actions/runs/28368310575) | Jun 29 | 91.30 | 262,408 | 9 | 10.1/turn |
| [§28438420177](https://github.com/githubnext/agentic-ops/actions/runs/28438420177) | Jun 30 | 39.11 | 110,382 | 4 | **9.8/turn** |
| [§28512043721](https://github.com/githubnext/agentic-ops/actions/runs/28512043721) | Jul 01 | 85.30 | 248,291 | 8 | 10.7/turn |
| [§28583122414](https://github.com/githubnext/agentic-ops/actions/runs/28583122414) | Jul 02 | 99.05 | 279,878 | 14 | 7.1/turn |

Token estimates for runs #11–#12 (Jun 27–28) are derived from the AIC-per-token ratio (0.351 AIC / 1,000 tokens) measured across the other 5 runs.

</details>
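
The Spend Profile figures can be recomputed from the per-run breakdown. The following is a minimal sketch, assuming the table values are exact; the names `RUNS` and `spend_profile` are illustrative and not part of the workflow.

```python
# Per-run data copied from the breakdown table; None marks unmeasured values.
RUNS = [
    # (date, aic, tokens, turns)
    ("Jun 26", 121.84, 343_376, 11),
    ("Jun 27", 233.75, None, None),
    ("Jun 28", 310.66, None, None),
    ("Jun 29", 91.30, 262_408, 9),
    ("Jun 30", 39.11, 110_382, 4),
    ("Jul 01", 85.30, 248_291, 8),
    ("Jul 02", 99.05, 279_878, 14),
]


def spend_profile(runs):
    """Recompute the summary metrics from the per-run rows."""
    aics = [aic for _, aic, _, _ in runs]
    measured = [r for r in runs if r[2] is not None]
    measured_aic = sum(r[1] for r in measured)
    measured_tokens = sum(r[2] for r in measured)
    measured_turns = sum(r[3] for r in measured)
    aic_per_1k = measured_aic / (measured_tokens / 1000)
    return {
        "total_aic": sum(aics),
        "avg_aic": sum(aics) / len(aics),
        "max_min_ratio": max(aics) / min(aics),
        "avg_turns": measured_turns / len(measured),
        "tokens_per_turn": measured_tokens / measured_turns,
        "aic_per_1k_tokens": aic_per_1k,
        # Token estimates for unmeasured runs, using the measured ratio.
        "estimated_tokens": {
            date: aic / aic_per_1k * 1000
            for date, aic, tokens, _ in runs
            if tokens is None
        },
    }


profile = spend_profile(RUNS)
```

Only the 5 measured runs feed the per-token and per-turn ratios, so the two estimated runs do not influence the rate used to estimate them.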

---

### Key Finding: AIC Cost Is Turn-Linear

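AIC tracks turn count roughly linearly: the 5 measured runs total 436.60 AIC over 46 turns, or ~9.5 AIC/turn (per-run values range from 7.1 to 11.1). The 8× AIC spread (39 to 311) is therefore attributed mainly to turn-count variation (4 to ~20 turns) rather than to model choice or prompt size. The agent averages ~27,000 tokens/turn (roughly 20,000 to 31,000 across the measured runs), so every extra turn, such as an additional source fetch, adds roughly 10 AIC.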

The two most expensive runs (Jun 27–28, combined 544 AIC = **55% of weekly total**) likely triggered 3+ source fetches including the broad arxiv search URL, which returns large result pages.
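
The per-turn cost behind this finding can be checked directly from the measured runs. The sketch below is illustrative only: it assumes a purely proportional cost model (no fixed per-run overhead), and the helper names are hypothetical.

```python
# (date, aic, turns) for the 5 runs with measured turn counts.
MEASURED = [
    ("Jun 26", 121.84, 11),
    ("Jun 29", 91.30, 9),
    ("Jun 30", 39.11, 4),
    ("Jul 01", 85.30, 8),
    ("Jul 02", 99.05, 14),
]


def pooled_aic_per_turn(runs):
    """Total AIC divided by total turns across all measured runs."""
    return sum(aic for _, aic, _ in runs) / sum(turns for _, _, turns in runs)


def per_run_aic_per_turn(runs):
    """AIC per turn for each run, keyed by date."""
    return {date: aic / turns for date, aic, turns in runs}


def projected_aic(turns, runs=MEASURED):
    """Projected run cost under the proportional model (an assumption)."""
    return turns * pooled_aic_per_turn(runs)


rate = pooled_aic_per_turn(MEASURED)
rates = per_run_aic_per_turn(MEASURED)
```

Comparing `min(rates.values())` with `max(rates.values())` shows how much the per-turn cost varies between runs, which bounds how far a turn-count projection can be trusted.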

---

### Ranked Recommendations

#### 1. Harden the source-fetch limit and early-exit condition — estimated **~55 AIC/run savings**

**Current prompt (§ Browsing Instructions):**
> Browse 2–3 of these sources (don't fetch all if you find a strong candidate early).

**Problem:** "2–3" is permissive, and "if you find a strong candidate" is vague. On Jun 27 and Jun 28 the agent consumed 17–20 estimated turns (234–311 AIC). Capping at 2 sources and making the early-exit binary would eliminate these outlier runs.

**Proposed change:**
```
Fetch at most 2 sources. After each fetch, stop immediately if you find **any** paper or announcement published within the last 3 days — do not fetch a second source in that case.
```

**Evidence:** The cheapest run (Jun 30, 4 turns, 39 AIC) terminated early. The two most expensive (combined 544 AIC) did not. A hard-stop rule — not a suggestion — removes the ambiguity that causes overrun.

**Estimated savings:** Eliminating the two outlier runs and trimming 1–2 turns from mid-range runs projects to a reduction of ~50–60 AIC/run (36–43% of the 140.14 AIC/run average).
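
The binary early-exit rule proposed above can be expressed as control flow. This is a sketch of the intended behavior, not of how the agent executes the prompt; `fetch_items` is a hypothetical callable that returns items with a `date` field, and the function name is illustrative.

```python
from datetime import date, timedelta


def browse_sources(sources, fetch_items, today, max_sources=2, fresh_days=3):
    """Fetch at most `max_sources` sources, stopping after the first
    fetch that yields an item published within the last `fresh_days` days.

    Returns the collected items and the number of fetches performed.
    """
    cutoff = today - timedelta(days=fresh_days)
    collected = []
    fetches = 0
    for url in list(sources)[:max_sources]:
        items = fetch_items(url)  # hypothetical fetch-and-extract step
        fetches += 1
        collected.extend(items)
        # Hard stop: one fresh item is enough, no second fetch.
        if any(item["date"] >= cutoff for item in items):
            break
    return collected, fetches


# Example with canned data instead of real network calls.
TODAY = date(2026, 7, 2)
PAGES = {
    "source-a": [{"title": "fresh", "date": date(2026, 7, 1)}],
    "source-b": [{"title": "stale", "date": date(2026, 6, 1)}],
    "source-c": [{"title": "never reached", "date": date(2026, 7, 2)}],
}
fresh_first = browse_sources(["source-a", "source-b"], PAGES.get, TODAY)
stale_first = browse_sources(["source-b", "source-b", "source-c"], PAGES.get, TODAY)
```

With a fresh item in the first source, only one fetch happens; with stale sources the loop still stops at two fetches, which is the cap the proposed wording is meant to enforce.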

---

#### 2. Cap per-page content consumption — estimated **~18 AIC/run savings**

**Problem:** At ~27,000 tokens/turn, the agent is consuming full pages including boilerplate, navigation chrome, and many result entries. The arxiv search URL alone can return 10+ full abstracts.

**Proposed addition to §Browsing Instructions:**
```
After fetching each page, extract only the first 6,000 characters. Do not scroll or paginate.
```

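**Evidence:** Even the lowest-AIC run (Jun 30, 110,382 tokens across 4 turns = 27,596 tokens/turn) consumed close to the average per-turn volume while fetching the same pages, which suggests that page size, not only turn count, sets the per-turn cost. A 6,000-character cap per fetch is estimated to roughly halve per-turn token consumption, saving the equivalent of ~2 turns of AIC per run.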

**References:**
- [§28319198943](https://github.com/githubnext/agentic-ops/actions/runs/28319198943) — highest-AIC run, agent artifact 1.84 MB vs 0.80 MB for recent run
- [§28438420177](https://github.com/githubnext/agentic-ops/actions/runs/28438420177) — most efficient run, 4 turns / 110,382 tokens
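
If the cap is enforced in tooling rather than by prompt wording alone, it could look like the sketch below. Both helpers are hypothetical, and the 4-characters-per-token figure is a rough rule of thumb assumed here for illustration, not a measurement from these runs.

```python
def cap_page_text(text: str, limit: int = 6000) -> str:
    """Keep at most `limit` characters, cutting at the last space when possible."""
    if len(text) <= limit:
        return text
    head = text[:limit]
    cut = head.rfind(" ")
    return head[:cut] if cut > 0 else head


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token count; chars_per_token is an assumed heuristic."""
    return round(len(text) / chars_per_token)


page = "word " * 2000          # 10,000 characters of stand-in page text
capped = cap_page_text(page)
```

Cutting at a space avoids handing the model a word split in half, at the cost of returning slightly fewer than 6,000 characters.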

---

#### 3. Replace broad arxiv search URL with max-results API endpoint — estimated **~10 AIC/run savings (when arxiv is fetched)**

**Current:** `(arxiv.org/redacted)`

This returns a full HTML search page with 10 results and full abstracts. The export API (`export.arxiv.org` — already allowlisted) returns structured Atom XML with a configurable result cap.

**Proposed replacement:**
```
(export.arxiv.org/redacted)
```

**Evidence:** `export.arxiv.org` is already in the network allowlist. Limiting the response to 3 results instead of 10 removes about 70% of the result entries on fetches where this source is used. A conservative estimate is a saving of ~1 turn equivalent (~10 AIC) per run in which arxiv is visited.
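
For reference, a capped export-API request and the Atom response it returns can be handled as sketched below. The `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters belong to the public arXiv API; the query string `all:agent` and the sample feed are placeholders, not the redacted URL proposed above.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ARXIV_API = "https://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(search_query: str, max_results: int = 3) -> str:
    """Build an export-API URL that returns at most `max_results` entries."""
    params = {
        "search_query": search_query,
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_entries(atom_xml: str) -> list[dict]:
    """Extract title, URL, and publication date from an Atom feed."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        entries.append({
            "title": " ".join(title.split()),  # collapse wrapped whitespace
            "url": entry.findtext("atom:id", default="", namespaces=ATOM).strip(),
            "date": entry.findtext("atom:published", default="", namespaces=ATOM)[:10],
        })
    return entries


# Placeholder feed standing in for a real response (no network call here).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2026-07-01T00:00:00Z</published>
    <title>Placeholder
      title</title>
  </entry>
</feed>"""

url = build_query_url("all:agent")
entries = parse_entries(SAMPLE)
```

Sorting by submission date in descending order puts the newest papers first, so a 3-result cap still surfaces the items a 3-day freshness filter cares about.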

---

### Structural Optimization — Inline Sub-Agent for Browse-and-Extract

The workflow has **no existing `## agent:` blocks** and has 4 major `##` sections. The **§Browsing Instructions** section is a strong sub-agent candidate.

#### Candidate: `browse-sources` sub-agent

| Dimension | Score | Rationale |
|-----------|-------|-----------|
| Independence | 2/3 | Can run before selection/output phases |
| Small-model adequacy | 3/3 | Pure extraction: find title, date, abstract, URL from HTML — no synthesis |
| Parallelism | 1/2 | Sequential browsing (early-exit pattern discourages parallel fetches) |
| Size | 2/2 | Substantial: 2+ web fetches, multi-page scanning |
| **Total** | **8/10** | Strong candidate |

**Why a smaller model fits:** The browsing phase is entirely extractive — the agent matches paper titles against a date filter and extracts structured metadata. No judgment or synthesis is needed. A smaller model can identify "this paper is from July 1, here is its title and abstract" as reliably as the main model.

**Proposed invocation change:** Add a new `## agent: browse-sources` block before `## Selection Criteria`:

```markdown
## agent: browse-sources

Fetch at most 2 of the following sources. For each page, extract all papers or announcements published in the last 7 days. Return a JSON array of objects with keys: title, url, date, one_sentence_summary. Stop after the first source if you find any item from the last 3 days.

Sources (priority order):
1. (huggingface.co/redacted)
2. (openai.com/redacted)

Respond with only the JSON array — no prose.
```

The main agent then receives the extracted candidates and applies the **§Selection Criteria** + **§Output** steps without reading raw HTML. This shifts ~60–70% of turn costs to a cheaper model.

**Estimated savings:** 30–50 AIC/run, depending on model cost differential. Conservative: ~30 AIC/run.
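
Because the main agent would consume the sub-agent's JSON directly, a small validation step on that hand-off may be worthwhile. The sketch below assumes the four keys named in the proposed block; `parse_candidates` is a hypothetical helper, not an existing framework function.

```python
import json

# Keys requested in the proposed browse-sources prompt.
REQUIRED_KEYS = {"title", "url", "date", "one_sentence_summary"}


def parse_candidates(raw: str) -> list[dict]:
    """Parse the sub-agent response and check its shape.

    Raises ValueError if the payload is not a JSON array of objects
    carrying every required key.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"item {index} is not an object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"item {index} is missing keys: {sorted(missing)}")
    return data


good = parse_candidates(
    '[{"title": "t", "url": "u", "date": "2026-07-01", "one_sentence_summary": "s"}]'
)
```

Failing fast on a malformed payload keeps the main agent from spending turns trying to interpret prose or partial output from the smaller model.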

---

### Caveats

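- Token estimates for runs #11–#12 (Jun 27–28) are derived from the AIC-per-token ratio (0.351 AIC / 1,000 tokens) measured across the other 5 runs; actual figures may differ. The turn estimates for those runs (~17 and ~20) imply 13.8 to 15.5 AIC/turn, above the 7.1 to 11.1 seen in measured runs, so the true turn counts may be higher.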
- The arxiv export API returns Atom XML rather than HTML; the agent prompt may need to be updated to mention this format if it causes parsing confusion.
- The sub-agent recommendation depends on the agentic workflow framework's inline sub-agent support and the cost tier of the smaller model available.
- Recommendations 1–3 are independent and safe to apply individually.







> Generated by [Agentic Workflow AIC Usage Optimizer](https://github.com/githubnext/agentic-ops/actions/runs/28600464637) · 345.4 AIC · ⊞ 21.6K · [◷](https://github.com/search?q=repo%3Agithubnext%2Fagentic-ops+is%3Aissue+%22gh-aw-workflow-call-id%3A+githubnext%2Fagentic-ops%2Fagentic-token-optimizer%22&type=issues)
> - [x] expires on Jul 9, 2026, 3:18 PM UTC
