User Manual

v1.0.0-rc-1 // Build 2026.06

1.0 Overview

aptselect is a local-first environment for prompt engineering. It is organized into four primary views:

Home View: Your dashboard for initiating new prompts and checkpoints.
LLM Task Explorer: The retrospective interface for auditing and grading individual, ad-hoc prompt executions across multiple models side-by-side.
Eval Task Explorer: The analytics interface for reviewing batch dataset evaluations, tracking pass rates, and comparing aggregate model performance via leaderboards.
Provider View: Where you connect LLM providers (Anthropic, Gemini, Mistral, OpenAI, xAI) and manage API keys.

2.0 Managing Providers (Provider View)

Before running prompts, you must configure your model providers. aptselect stores all API keys locally in a SQLite database (see section 6.2 for its location).

2.1 Adding a Provider

Navigate to the Provider View (Network Icon).
Select a service (e.g., OpenAI, Anthropic).
Paste your API Key. The connection is tested immediately.
Note: You can enable/disable specific models (e.g., turn off gpt-3.5-turbo if you only want to test gpt-4o).

3.0 The Prompt Explorer

The Prompt Explorer is where you build and test prompts. aptselect treats prompts like code, automatically saving checkpoints so you never lose an iteration.

3.1 Variables & Templating

You can inject dynamic data into your prompts using double curly braces (e.g., {{user_input}}). The sidebar inspector will automatically detect these variables so you can provide test values before running the prompt.
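To make the double-curly-brace convention concrete, here is a minimal Python sketch of how placeholders could be detected and filled. It is not aptselect's implementation: the regular expression, the tolerance for whitespace inside the braces, and the helper names `find_variables` and `render` are all assumptions made for illustration.

```python
import re

# Assumption: a variable is an identifier wrapped in {{ }}, with optional
# whitespace inside the braces. aptselect's exact matching rules may differ.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def find_variables(template: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in PLACEHOLDER.finditer(template):
        seen.setdefault(match.group(1), None)
    return list(seen)


def render(template: str, values: dict[str, str]) -> str:
    """Substitute every placeholder; raises KeyError if a value is missing."""
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


template = "Summarize {{user_input}} in a {{tone}} tone. Source: {{user_input}}"
print(find_variables(template))
# ['user_input', 'tone']
print(render(template, {"user_input": "the release notes", "tone": "neutral"}))
# Summarize the release notes in a neutral tone. Source: the release notes
```

A variable that appears twice is listed once, which mirrors the idea of supplying a single test value per variable in the sidebar inspector.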

3.2 Evaluations & Datasets

Instead of testing one prompt at a time, you can upload CSV datasets to run bulk evaluations. The app will automatically grade the outputs against your reference criteria and generate a leaderboard comparing model performance and latency.
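The manual does not specify the CSV schema, so the following is only a sketch of one plausible layout: one column per template variable plus a reference column. The column names `user_input` and `expected` and the helper `to_csv` are hypothetical; check the upload dialog for the columns aptselect actually expects.

```python
import csv
import io

# Hypothetical dataset: each row is one test case. "user_input" is assumed to
# match a {{user_input}} template variable; "expected" is an assumed name for
# the reference answer used in grading.
rows = [
    {"user_input": "2 + 2", "expected": "4"},
    {"user_input": "capital of France", "expected": "Paris"},
]


def to_csv(rows: list[dict[str, str]], fieldnames: list[str]) -> str:
    """Serialize test cases to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows, ["user_input", "expected"])
print(csv_text)

# Reading the text back confirms that the file round-trips cleanly.
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Using the `csv` module rather than manual string joining ensures that values containing commas or quotes are escaped correctly.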

4.0 The LLM Task Explorer

The LLM Task Explorer is your archive for ad-hoc prompt testing. It allows you to review every manual prompt you have ever run.

Side-by-Side Comparison: Review how different models answered the exact same prompt, with their responses shown next to each other.
Developer Details: Click any past response to inspect the raw JSON API response, exact token usage (Input/Output), and execution duration.
Curation: Mark specific outputs as "Good" or "Bad", or bookmark them to build a dataset of high-quality responses.

5.0 The Eval Task Explorer

The Eval Task Explorer houses your quantitative benchmarking data.

Leaderboards: Compare models head-to-head on pass rates, average token consumption, and response latency across hundreds of test cases.
Failure Analysis: Drill down into specific failed test cases to identify consistent edge-case failures or formatting breaks for specific models.
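The leaderboard metrics above can be illustrated with a short Python sketch. This is not how aptselect computes its leaderboards internally: the `Result` record, its field names, the model names, and the tie-breaking rule (lower latency wins) are assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Result:
    """One graded test case for one model (hypothetical record shape)."""
    model: str
    passed: bool
    tokens: int
    latency_ms: float


def leaderboard(results: list[Result]) -> list[dict]:
    """Aggregate per-model pass rate, average tokens, and average latency."""
    grouped: dict[str, list[Result]] = defaultdict(list)
    for result in results:
        grouped[result.model].append(result)

    rows = []
    for model, items in grouped.items():
        rows.append({
            "model": model,
            "pass_rate": sum(r.passed for r in items) / len(items),
            "avg_tokens": mean(r.tokens for r in items),
            "avg_latency_ms": mean(r.latency_ms for r in items),
        })
    # Highest pass rate first; ties are broken by lower average latency.
    rows.sort(key=lambda row: (-row["pass_rate"], row["avg_latency_ms"]))
    return rows


results = [
    Result("model-a", True, 100, 800.0),
    Result("model-a", True, 120, 1000.0),
    Result("model-b", True, 90, 400.0),
    Result("model-b", False, 110, 600.0),
]
for row in leaderboard(results):
    print(row)
```

Filtering the same records for `passed == False` and grouping by model gives a starting point for the kind of failure analysis described above.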

6.0 Troubleshooting

6.1 "Provider Not Available"

If a model is unavailable when you try to run a task, return to the Provider View and check that your API key is valid and that the specific model is toggled "ON".

6.2 Database Reset

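If you need to completely reset your local data, close aptselect and delete the encrypted SQLite database file. Because API keys are stored in this database (see section 2.0), you will need to re-enter them afterwards. The file is located at: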

Windows: %APPDATA%\aptselect\aptselect.db
macOS: ~/Library/Application Support/aptselect/aptselect.db
Linux: ~/.config/aptselect/aptselect.db
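If you prefer to locate the file from a script, for example to back it up before a reset, the paths listed above can be resolved as in the following Python sketch. The function name `database_path` and the fallback used when `APPDATA` is unset are assumptions; the directory layout itself is taken from the list above.

```python
import os
import sys
from pathlib import Path


def database_path(platform: str = sys.platform) -> Path:
    """Return the documented aptselect database location for a platform."""
    if platform.startswith("win"):
        # %APPDATA% normally points at the user's Roaming folder; the
        # fallback below is an assumption for when the variable is unset.
        base = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
    elif platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:
        # Treat every other platform as Linux, per the path listed above.
        base = Path.home() / ".config"
    return base / "aptselect" / "aptselect.db"


path = database_path()
print(path, "(exists)" if path.exists() else "(not found)")
```

The sketch only reports the location and does not delete anything, so it is safe to run before deciding whether to reset.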