Skip to content

olostep-api/CLI

Repository files navigation

Olostep CLI

Python npm License

Command-line interface for the Olostep API — search, map, scrape, crawl, answer, and batch the web from your terminal. Every command returns structured JSON so it pipes cleanly into jq, agents, and CI.

Available as a standalone binary via npm (no Python needed), or from source with Python.


Table of contents


Installation

npm (recommended)

Installs a standalone binary on postinstall (macOS arm64/x64, Linux x64, Windows x64). Node.js 16+ is required only for the install step.

npm install -g olostep-cli
olostep --version

Or run without installing:

npx -y olostep-cli@latest --help

Python (from source)

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -U pip
pip install -e .
olostep --version

Sign in

The fastest way:

olostep login

This opens your browser, you click Authorize, and the CLI saves your key to a local credentials file. You're done.

olostep login                  # browser flow
olostep login --no-browser     # print the URL (useful over SSH)
olostep login --env-file .env  # write to a project .env instead

Alternative — set an env var. Good for CI:

export OLOSTEP_API_KEY=your_key_here

Get a key from the API Keys dashboard.

Where the key is stored (after olostep login):

OS Path
macOS ~/Library/Application Support/olostep-cli/credentials.json
Linux ~/.config/olostep-cli/credentials.json
Windows %USERPROFILE%\AppData\Roaming\olostep-cli\credentials.json

Delete that file to sign out. Set OLOSTEP_CLI_CONFIG_DIR to override the directory.

Resolution order: OLOSTEP_API_KEY env var → project .env → saved credentials.


Quick start

olostep login

# Get a researched answer from the web
olostep answer "What does Olostep do?"

# Discover URLs on a site
olostep map "https://example.com" --top-n 20

# Pull one page as markdown
olostep scrape "https://example.com" --formats markdown

# Crawl a whole site
olostep crawl "https://docs.example.com" --max-pages 50

# Scrape many URLs from a CSV in parallel
olostep batch-scrape urls.csv --formats markdown,html

Every command prints its JSON result to the terminal by default. Pass --out <path> to save it to a file instead.


What can it do?

You want to… Command Olostep product
Get a researched answer to a question answer Answers
Discover the URLs on a site map Maps
Pull one page scrape Scrapes
Pull every page on a site crawl Crawls
Pull many URLs from a CSV batch-scrape Batches
Extract structured fields ({title, price, …}) --parser-id on batch-scrape Parsers
Refetch a previous result by ID scrape-get Scrapes
Tag/organize a batch batch-update Batches

Choosing between them:

  • answer — you want a synthesized answer, not raw page content. The CLI does the research for you.
  • scrape — you already have the URL and want clean content out.
  • crawl — you want every page on a site (or a filtered subset) without enumerating URLs by hand.
  • batch-scrape — you have a list of URLs and want them processed in parallel.

Commands

Run olostep <command> --help for the full flag list.

answer — ask a question

Polls until the researched answer is ready.

Option Description
--out File path or - for stdout
--json-format Optional JSON shape for structured output
--poll-interval, --poll-timeout Polling controls (seconds)
--timeout HTTP timeout (seconds)
olostep answer "Summarize this company's product"
olostep answer "Extract facts" --json-format '{"company":"","country":""}' --out -

map — discover URLs

Option Description
--out File or -
--top-n Max URLs to return
--search-query Optional query to guide discovery
--include-subdomain / --no-include-subdomain Subdomains
--include-url, --exclude-url Repeatable URL patterns
--cursor Pagination cursor
--timeout HTTP timeout (seconds)
olostep map "https://example.com" --top-n 100 --search-query "blog"
olostep map "https://example.com" --include-subdomain --out - | jq '.urls[:5]'

scrape — one URL

Formats (comma-separated): html, markdown, text, json, raw_pdf, screenshot.

Option Description
--formats Comma-separated (default: markdown)
--country Country code (e.g. US, GB)
--wait-before-scraping Wait before scrape (ms)
--payload-json / --payload-file Advanced options as a JSON object (mutually exclusive)
--out, --timeout Output / HTTP timeout
olostep scrape "https://example.com" --formats markdown,html
olostep scrape "https://example.com" --country US --wait-before-scraping 2000
olostep scrape "https://example.com" --out - | jq .result.markdown_content

scrape-get — fetch by ID

olostep scrape-get "scrape_abc123"
olostep scrape-get "scrape_abc123" --out - | jq .result.markdown_content

crawl — whole site

Starts a crawl, polls until finished, then retrieves page contents.

Retrieve formats (comma-separated): markdown, html, json.

Option Description
--max-pages, --max-depth Crawl scope
--include-subdomain / --include-external Domain scope
--include-url, --exclude-url Repeatable URL patterns
--search-query, --top-n Discovery filter and cap
--follow-robots-txt / --ignore-robots-txt robots.txt
--webhook Optional webhook URL
--formats Retrieve formats
--pages-limit, --pages-search-query Pages API paging and filter
--poll-seconds, --poll-timeout, --crawl-timeout Timing
--dry-run Print the API payload and exit
--out, --timeout Output / HTTP timeout
olostep crawl "https://docs.example.com" --max-pages 50 --formats markdown,html
olostep crawl "https://example.com" --max-pages 10 --dry-run

batch-scrape — CSV

Submit many URLs in parallel. The CSV needs two columns: custom_id (or id) and url.

Option Description
--formats markdown, html, json (comma-separated)
--country Optional country code
--parser-id Parser ID for structured extraction
--poll-seconds, --log-every, --items-limit Polling and paging
--dry-run Print payload and exit
--out Output path or -
olostep batch-scrape urls.csv --formats markdown,html --country US
olostep batch-scrape urls.csv --parser-id "<PARSER_ID>" --out results.json

batch-update — metadata

Update metadata on an existing batch. One of --metadata-json or --metadata-file is required (must be a JSON object).

olostep batch-update "batch_abc123" --metadata-json '{"team":"growth"}'
olostep batch-update "batch_abc123" --metadata-file meta.json --out -

Output

By default every command prints its JSON result to the terminal — no files created, nothing to go hunting for.

Flag Behavior
(none) Print JSON to stdout (UTF-8, indented) — the default
--out <path> Write JSON to that file instead (parent dirs created if needed)
--out - Explicitly stdout (same as the default)

Logs go to stderr, so stdout stays clean for pipes:

olostep map "https://example.com" --top-n 50 | jq '.urls[:10]'
olostep answer "What is Olostep?" | jq .result
olostep scrape "https://docs.example.com" --out result.json   # save to a file

Install the MCP server in your agent

The CLI can wire the Olostep MCP server into Cursor, Claude Code, Windsurf, VS Code, or Kilo for you — no JSON editing.

olostep mcp install                          # detect agents, hosted endpoint
olostep mcp install --agent cursor           # only Cursor
olostep mcp install --transport stdio        # local npx instead of hosted
olostep mcp install --no-global              # write into current project
olostep mcp install --dry-run --json         # plan only
olostep mcp uninstall                        # remove the olostep entry
olostep list mcp                             # show where it's installed

By default it uses the hosted endpoint at https://mcp.olostep.com/mcp with Authorization: Bearer <key> — no Node, no Docker. Pass --transport stdio to use the local npx -y olostep-mcp install instead.

The CLI merges the olostep entry into your existing MCP config without touching other servers or unrelated keys. Restart your agent after install to pick up the new server.


Windows / PowerShell notes

PowerShell tokenizes a few characters differently from bash. If you hit an error like Invalid scrape format(s): markdown html or see * expand unexpectedly, quote the argument:

# Quote comma-separated values
olostep scrape "https://example.com" --formats "markdown,html"

# Quote patterns containing *
olostep map "https://example.com" --include-url "/*"

# Quote JSON payload strings
olostep answer "Extract facts" --json-format '{"company":"","year":""}'

Single quotes ('...') are usually safest for JSON, since PowerShell won't interpolate $ inside them.


Skills for AI agents

The CLI ships Olostep skills — drop-in capabilities you can install into Claude Code, Cursor, and other agents so they can use Olostep tools natively (scrape, map, crawl, search, batch, research, and more).

olostep add skills              # install all 13 skills into every detected agent
olostep add skills --login      # log in first, then install
olostep add skills --skill scrape --skill map --exclude setup
olostep list skills             # show what's installed and where
olostep remove skills           # uninstall
Option Description
--agent Install for specific agent(s), repeatable
--all-agents / --no-all-agents Target all detected agents
--global / --no-global Install into global vs project-local agent dirs
--skill, --exclude Include/exclude skill names, repeatable
--overwrite / --no-overwrite Replace existing installs
--link-mode auto, symlink, or copy
--json Machine-readable output

Status & updates

olostep status            # version, auth state, what's installed (skills, MCP)
olostep update            # update to the latest version
olostep update --check    # just check — don't install

status shows your CLI version, whether you're authenticated (and from where), and a summary of installed skills + MCP agents. update checks the npm registry and upgrades via npm. Interactive sessions also get a one-line "update available" notice (silence it with OLOSTEP_NO_UPDATE_NOTICE=1).


Development

python -m venv .venv
source .venv/bin/activate
pip install -e ".[test]"
pytest
olostep --help

Release binaries are built with PyInstaller (pip install -e ".[build]"); see .github/workflows/release.yml.


Links

License

MIT — see LICENSE.

About

CLI for Olostep - The fastest way to get clean web data into your AI workflows. Search, scrape, and crawl the web from your terminal with Olostep — no headless browsers, no anti-bot headaches, no infra.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors