# SIM Water Ledger Manual Import Workflow

This pass keeps the public site live without database credentials. When MySQL is configured, the pages and APIs read from the database first and fall back to fixtures if tables are empty or unavailable.

## Configure MySQL

Set either a full DSN:

```bash
export SIM_DB_DSN='mysql:host=127.0.0.1;dbname=sim_water;charset=utf8mb4'
export SIM_DB_USER='sim_user'
export SIM_DB_PASSWORD='...'
```

Or separate variables:

```bash
export SIM_DB_HOST='127.0.0.1'
export SIM_DB_DATABASE='sim_water'
export SIM_DB_USER='sim_user'
export SIM_DB_PASSWORD='...'
```

Apply the schema:

```bash
mysql -u "$SIM_DB_USER" -p "$SIM_DB_DATABASE" < sim/schema/water-ledger.sql
```

Seed public source metadata, import targets, and default opportunity zones:

```bash
php sim/cli/seed-water-ledger.php
```

## Import Ledger CSV

Dry run:

```bash
php sim/cli/import-ledger-csv.php sim/imports/adwr-gwsi.csv \
  --source-agency=ADWR \
  --source-dataset='Groundwater Site Inventory' \
  --source-url='https://www.azwater.gov/gis-data-and-maps' \
  --dry-run
```

Import:

```bash
php sim/cli/import-ledger-csv.php sim/imports/adwr-gwsi.csv \
  --source-agency=ADWR \
  --source-dataset='Groundwater Site Inventory' \
  --source-url='https://www.azwater.gov/gis-data-and-maps'
```

Recognized CSV columns include:

```text
public_record_id, source_agency, source_dataset, source_url, record_type,
water_source, legal_class, holder_name, facility_name, district_name,
ama, basin, county, latitude, longitude, volume_af, reliability_class,
shortage_exposure, storage_status, recovery_status, delivery_status,
confidence_level, notes, data_year, last_source_update
```

Unknown columns are ignored. Missing source fields are filled from CLI options.

## Import Facility GeoJSON

Dry run:

```bash
php sim/cli/import-facilities-geojson.php sim/imports/recharge-facilities.geojson \
  --source-agency=ADWR \
  --source-dataset='Long-Term Storage Credits Map' \
  --facility-type='Storage Layer' \
  --dry-run
```

Import:

```bash
php sim/cli/import-facilities-geojson.php sim/imports/recharge-facilities.geojson \
  --source-agency=ADWR \
  --source-dataset='Long-Term Storage Credits Map' \
  --facility-type='Storage Layer'
```

The importer accepts `Point` and `MultiPoint` features. Recognized property names include:

```text
facility_name, name, site_name, title, facility_type, type, layer_type,
source_agency, agency, ama, basin, county, public_source_url,
source_url, url, notes, label, description
```

## Initial Public Source Targets

Start with curated manual files from:

- ADWR GWSI tabular download
- ADWR Well Registry layer
- ADWR Grandfathered Water Rights layer
- ADWR Long-Term Storage Credits map/dashboard data
- CAP Subcontracting Status Report
- Reclamation Lower Colorado River Water Accounting and CUL data

Every import writes an `import_batches` row with file name, SHA-256 hash, rows seen, rows imported, rows rejected, status, and timestamps.

