# Newsletter Link Catalog

`nlc` is a TypeScript/Node.js CLI for cataloging links from newsletters in a configured Gmail label into Google Sheets and/or a local Excel workbook.

## Commands

```bash
nlc init
nlc run --dry-run
nlc run
nlc run --from 2026-05-01 --to 2026-05-16
nlc run --last 30d
nlc run --enrich-only
nlc serve
```

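The `--from`/`--to` and `--last` options above both describe a date window. As a rough illustration of how such values could be turned into a concrete range, here is a minimal sketch. The helper names (`parseLastWindow`, `parseExplicitRange`) and the `DateRange` type are hypothetical, and the interpretation of `30d` as "the 30 days ending now" and of dates as UTC midnights are assumptions, not documented `nlc` behavior.

```typescript
// Hypothetical types and helpers for illustration; not part of nlc's documented API.
interface DateRange {
  from: Date;
  to: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parses a relative window such as "30d" into a range ending at `now`.
// Assumption: only whole-day windows ("<n>d") are supported.
function parseLastWindow(value: string, now: Date = new Date()): DateRange {
  const match = /^(\d+)d$/.exec(value);
  if (!match) {
    throw new Error(`Unsupported --last value: ${value}`);
  }
  const days = Number(match[1]);
  return { from: new Date(now.getTime() - days * MS_PER_DAY), to: now };
}

// Parses explicit YYYY-MM-DD bounds.
// Assumption: both bounds are interpreted as UTC midnights.
function parseExplicitRange(from: string, to: string): DateRange {
  const isoDay = /^\d{4}-\d{2}-\d{2}$/;
  if (!isoDay.test(from) || !isoDay.test(to)) {
    throw new Error("Dates must use the YYYY-MM-DD format");
  }
  const range: DateRange = {
    from: new Date(`${from}T00:00:00Z`),
    to: new Date(`${to}T00:00:00Z`),
  };
  if (isNaN(range.from.getTime()) || isNaN(range.to.getTime())) {
    throw new Error("Dates must be valid calendar dates");
  }
  if (range.from.getTime() > range.to.getTime()) {
    throw new Error("--from must not be later than --to");
  }
  return range;
}
```

Passing `now` explicitly keeps the relative window testable, since the result then no longer depends on the wall clock.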
## Setup

1. Install dependencies with `npm install`.
2. Run `npm run build`.
3. Run `node dist/index.js init` to create `config.yaml`.
4. Place OAuth client JSON files in the configured local paths, typically:
   - `~/.nlc/gmail-credentials.json`
   - `~/.nlc/sheets-credentials.json`
5. Run `node dist/index.js run --dry-run` before live writes.
6. Run `node dist/index.js run` to import into SQLite.
7. Run `node dist/index.js serve` and open <http://127.0.0.1:3000>.

Tokens are persisted locally under `~/.nlc` and must not be committed.

## Collect Google Credentials

`nlc` uses OAuth desktop-app credentials because it runs locally and opens a browser for user authorization. You can use one Google Cloud project for both Gmail and Sheets.

1. Create or select a Google Cloud project at <https://console.cloud.google.com/>.
2. Enable the APIs you plan to use:
   - Gmail API: <https://console.cloud.google.com/apis/library/gmail.googleapis.com>
   - Google Sheets API: <https://console.cloud.google.com/apis/library/sheets.googleapis.com>
3. Configure the OAuth consent screen:
   - Go to **Google Auth platform > Branding**.
   - Click **Get started** if the auth platform is not configured yet.
   - Enter an app name such as `Newsletter Link Catalog`.
   - Use your own email for user support and developer contact.
   - For personal use, choose **External** and add your Google account as a test user under **Audience**. For a Google Workspace-only internal tool, choose **Internal** if available.
4. Create the desktop OAuth client:
   - Go to **Google Auth platform > Clients**.
   - Click **Create client**.
   - Choose **Application type > Desktop app**.
   - Name it `Newsletter Link Catalog Local CLI`.
   - Click **Create**, then immediately download the JSON file.
5. Save the downloaded JSON file locally:
   - For Gmail reads, save it as `~/.nlc/gmail-credentials.json`.
   - For Google Sheets writes, either copy the same OAuth JSON to `~/.nlc/sheets-credentials.json` or point both config values to the same file.
   - On Windows, `~/.nlc` means `C:\Users\<you>\.nlc`.
6. Keep the JSON private. Do not commit it, paste it into issues, or share it. Google notes that OAuth client secrets may only be downloadable at creation time for newer clients; if you lose the secret, rotate or recreate the client.

On first live use, `nlc` opens a browser consent screen and writes token files such as `~/.nlc/gmail-token.json` and `~/.nlc/sheets-token.json`. Those token files are also secrets.

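The credential and token paths above all start with `~`, which Node.js does not expand on its own. The following is a minimal sketch of how such a path could be resolved against the user's home directory; `expandHome` is a hypothetical helper name, and whether `nlc` resolves paths this way internally is an assumption.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: expands a leading "~" to the user's home directory.
// `home` is injectable so the function can be exercised without touching the real environment.
function expandHome(p: string, home: string = os.homedir()): string {
  if (p === "~") {
    return home;
  }
  // Accept both separators so "~\.nlc" also works when typed on Windows.
  if (p.startsWith("~/") || p.startsWith("~\\")) {
    return path.join(home, p.slice(2));
  }
  // Paths without a leading "~" are returned unchanged.
  return p;
}

// Usage: resolve one of the credential paths mentioned above.
const gmailCredentialsPath = expandHome("~/.nlc/gmail-credentials.json");
```

`os.homedir()` returns the current user's home directory on each platform, which matches the note that `~/.nlc` corresponds to a folder under the user profile on Windows.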
### Google Sheets Destination

Create the target spreadsheet in Google Sheets and copy its spreadsheet ID from the URL:

```text
https://docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit
```

Put that value in `output.sheets_api.spreadsheet_id` when Sheets output is enabled. The Google account you authorize in the browser must have edit access to that spreadsheet.

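If you would rather paste the whole URL than pick the ID out by hand, the ID is the path segment after `/spreadsheets/d/`. Here is a small sketch of that extraction; `extractSpreadsheetId` is a hypothetical helper for illustration, not an `nlc` function.

```typescript
// Hypothetical helper: pulls the spreadsheet ID out of a Google Sheets URL.
// Returns null when the URL does not contain a /spreadsheets/d/<id> segment.
function extractSpreadsheetId(url: string): string | null {
  const match = /\/spreadsheets\/d\/([a-zA-Z0-9_-]+)/.exec(url);
  return match ? match[1] : null;
}

// Usage with a placeholder ID.
const id = extractSpreadsheetId(
  "https://docs.google.com/spreadsheets/d/abc123_XY-z/edit#gid=0",
);
console.log(id); // prints "abc123_XY-z"
```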
### Optional LLM Provider Keys

LLM categorization is bring-your-own-key (BYOK). Set the environment variable named in `categories.llm.api_key_env`, for example in PowerShell:

```powershell
$env:ANTHROPIC_API_KEY = "sk-ant-..."
$env:OPENAI_API_KEY = "sk-..."
```

For local providers such as Ollama or LM Studio, set `categories.llm.provider` to `local` or `openai-compatible` and configure `categories.llm.base_url`; no cloud API key is required unless that endpoint requires one.

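The lookup described above goes through one level of indirection: the config names an environment variable, and the key is read from that variable at run time. A minimal sketch of that logic follows. The `LlmConfig` shape and `resolveApiKey` are hypothetical, the provider value `anthropic` is an assumed example, and the actual error handling in `nlc` may differ.

```typescript
// Hypothetical config shape mirroring the `categories.llm` keys mentioned above.
interface LlmConfig {
  provider: string;
  api_key_env?: string;
  base_url?: string;
}

// Reads the API key from the environment variable named in the config.
// Assumption: "local" and "openai-compatible" providers may run without a key.
function resolveApiKey(
  llm: LlmConfig,
  env: Record<string, string | undefined> = process.env,
): string | undefined {
  const key = llm.api_key_env ? env[llm.api_key_env] : undefined;
  if (key) {
    return key;
  }
  const keyOptional =
    llm.provider === "local" || llm.provider === "openai-compatible";
  if (keyOptional) {
    return undefined;
  }
  throw new Error(
    `No API key found; set the ${llm.api_key_env ?? "configured"} environment variable`,
  );
}
```

Keeping only the variable name in `config.yaml` means the file can be shared or committed without exposing the key itself.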
## Configuration

Start from [config.example.yaml](config.example.yaml). The important choices are:

- `gmail.folder`: the single Gmail label/folder to process.
- `output.excel.enabled`: writes a local `.xlsx` file.
- `output.sheets_api.enabled`: enables Google Sheets integration when credentials and spreadsheet ID are configured.
- `database.enabled`: writes SQLite catalog data during `nlc run`; defaults to `true`.
- `database.path`: SQLite database path; defaults to `./data/newsletter-catalog.sqlite`.
- `links.tracking_params`: query parameters stripped during URL normalization.
- `categories.llm`: optional BYOK categorization provider.

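To make the effect of `links.tracking_params` concrete, here is a rough sketch of stripping a configured list of query parameters from a URL with the standard `URL` API. `stripTrackingParams` is a hypothetical name, the case-insensitive match is an assumption, and `nlc`'s real normalization may do more (or less) than this.

```typescript
// Hypothetical helper: removes the listed query parameters from a URL.
// Assumption: parameter names are matched case-insensitively.
function stripTrackingParams(rawUrl: string, trackingParams: string[]): string {
  const url = new URL(rawUrl);
  const blocked = new Set(trackingParams.map((p) => p.toLowerCase()));

  // Collect names first so the parameter list is not mutated while iterating.
  const toDelete: string[] = [];
  url.searchParams.forEach((_value, name) => {
    if (blocked.has(name.toLowerCase())) {
      toDelete.push(name);
    }
  });
  for (const name of toDelete) {
    url.searchParams.delete(name);
  }
  return url.toString();
}

// Usage with an example parameter list.
const cleaned = stripTrackingParams(
  "https://example.com/post?utm_source=nl&id=7&utm_medium=email",
  ["utm_source", "utm_medium"],
);
console.log(cleaned); // prints "https://example.com/post?id=7"
```

Stripping these parameters lets two links that differ only in campaign tags be treated as the same catalog entry.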
## SQLite and Web App

SQLite is the default catalog store. `nlc run` writes imported links, sponsors, dead links, and run history to the configured database even when spreadsheet outputs are disabled.

Start the local read-only web app with:

```bash
nlc serve --host 127.0.0.1 --port 3000
```

The web UI is intentionally functional and read-only. It includes dashboard totals, a two-pane newsletter browser, all links, sponsored links, dead links, and run history. The newsletter browser opens on the most recent issue to keep the default view focused; switch to **All Issues** when you want the full link history for the selected newsletter.

## Build and Distribution

The build uses `tsup` for the JavaScript bundle and `@yao-pkg/pkg` for the standalone executable:

```bash
npm run build
```

This bundles `src/index.ts` to `dist/index.js`, adds a Node shebang, emits types, and packages the current-platform executable as `dist/nlc.exe` on Windows or `dist/nlc` on macOS/Linux. The packaged artifact embeds the Node runtime for operational use without a separate Node install.

## Validation

Local validation does not need Gmail, Sheets, or LLM credentials:

```bash
npm run lint
npm run format:check
npm run typecheck
npm test
npm run build
npm run smoke
```

`npm run smoke` exercises `nlc --help`, `nlc init --help`, `nlc run --help`, and a fixture-backed dry run.

## Safety Notes

- Formula-like spreadsheet cells are escaped before output.
- Dry runs do not write output files or state.
- Live integrations are isolated behind adapters so tests use fakes.
- Individual email/link failures are counted and processing continues; critical config/write failures stop the command.

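As an illustration of the first safety note, a common way to neutralize formula-like cells is to prefix them with a single quote so spreadsheet applications treat the value as text. The sketch below follows that general approach; `escapeFormulaCell` is a hypothetical name, and the exact set of trigger characters and escape style used by `nlc` is an assumption.

```typescript
// Hypothetical helper: prefixes formula-like values with a single quote.
// Assumption: a cell is "formula-like" when it starts with =, +, -, @, tab, or carriage return.
function escapeFormulaCell(value: string): string {
  if (/^[=+\-@\t\r]/.test(value)) {
    return `'${value}`;
  }
  return value;
}

// Usage: a link title taken from an email should never be evaluated as a formula.
console.log(escapeFormulaCell('=HYPERLINK("http://example.com")'));
// prints '=HYPERLINK("http://example.com") with a leading single quote
console.log(escapeFormulaCell("Weekly digest")); // prints "Weekly digest"
```

This matters here because link titles and sender names come from untrusted email content and end up in spreadsheet cells.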