Files
TableProcessing/CLAUDE.md
Aiden_ 5b32208194 xx
2025-10-31 21:24:08 +08:00

263 lines
9.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
财务Excel数据处理系统 (Financial Excel Data Processing System) - A Python-based automation system for processing financial Excel data, extracting payment information, and generating standardized accounting entries with data validation and error marking capabilities.
**Language**: Chinese (中文) - All documentation, comments, and output are in Chinese.
## Development Commands
### Setup and Installation
```bash
# Install dependencies (system-wide)
pip install openpyxl --break-system-packages
# OR use virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # Linux/Mac
# venv\Scripts\activate # Windows
pip install openpyxl
```
### Run the Processing Pipeline
```bash
# Step 1: Extract data from Excel to JSON
python3 process_excel.py
# Step 2: Generate accounting entries Excel
python3 generate_accounting_entries.py
# Optional: Analyze Excel structure (debugging tool)
python3 analyze_excel.py
```
### Verify Installation
```bash
python3 -c "import openpyxl; print(openpyxl.__version__)"
```
## Architecture and Data Flow
### Processing Pipeline
```
data/data.xlsx (Raw financial data)
[process_excel.py] - Extract payment records
res.json (Intermediate JSON data)
[generate_accounting_entries.py] - Generate accounting entries
AccountingEntries.xlsx (Final accounting entry table)
```
### Key Components
1. **`process_excel.py`** - Excel Data Extraction Engine
- Handles merged and non-merged cells in column F (ReceivedAmount)
- Extracts orders from rows within merged cell ranges
- Validates amounts: `ReceivedAmount + HandlingFee ≈ Sum(Order[].Amount)` (tolerance: 0.01)
- Uses `data_only=True` to read formula results from column O
2. **`generate_accounting_entries.py`** - Accounting Entry Generator
- Creates debit/credit entries following Chinese accounting standards
- Merges cells for same ReceivedAmount groups
- Marks validation failures with pink background (#FAD1D4)
- Marks debit/credit imbalance with yellow background (#fff799)
- Applies fixed exchange rate to currency conversions (3 decimal places)
- Sets "核算项目" to empty for 到账金额 debit entries
3. **`analyze_excel.py`** - Structure Analysis Utility
- Debugging tool to inspect merged cells
- Preview data structure
### Data Structures
#### res.json Schema
```json
[
{
"ReceivedAmount": 12125, // Column F - supports merged cells
"HandlingFee": 25, // Column G - null becomes 0
"Order": [
{
"OrderNum": "XLRQD300T25", // Column H
"Amount": 550, // Column I
"AccountName": "24台湾长荣航运" // Column O - formula result
}
],
"checkRes": true // Validation: amount match within 0.01
}
]
```
#### Accounting Entry Rules
**For each ReceivedAmount record:**
1. **Debit Entry (到账金额)** - 1 record per ReceivedAmount
- Account: `1002.02` - 银行存款 - 中行USD
- Currency: 美元 (USD)
- Amount: `ReceivedAmount × EXCHANGE_RATE` (3 decimal places)
- 核算项目: Empty (left blank)
2. **Debit Entry (手续费)** - Only if HandlingFee > 0
- Account: `5603.03` - 财务费用-手续费
- Currency: 人民币 (RMB)
- Amount: `HandlingFee × EXCHANGE_RATE` (3 decimal places)
3. **Credit Entries (订单明细)** - 1 record per Order
- Account: `1122` - 应收账款
- Currency: 美元 (USD)
- Amount: `Order.Amount × EXCHANGE_RATE` (3 decimal places)
- **Display Order.Amount in "应收账款" column**
- Skip orders where Amount is null
**Balance Verification**:
- After generating debit and credit entries for each record, the system verifies: `|SUM(Debit) - SUM(Credit)| < 0.001`
- If imbalanced, all entries for that record are marked with yellow background (#fff799)
### Special Processing Logic
#### Merged Cell Handling (process_excel.py:33-69)
- `get_f_column_ranges()`: Identifies all data ranges in column F
- Handles mixed scenarios: merged and non-merged cells
- Non-merged cells are treated as single-row ranges
- Merged cell value read from top-left corner (min_row, min_col)
#### Validation and Error Marking
- **checkRes calculation**: `abs((ReceivedAmount + HandlingFee) - Sum(Order[].Amount)) < 0.01` (from extraction phase)
- **Balance verification**: Per-record debit/credit validation with tolerance: `abs(SUM(Debit) - SUM(Credit)) < 0.001`
- **Error marking priority**:
1. Yellow background (#fff799) for debit/credit imbalance
2. Pink background (#FAD1D4) for checkRes = false
- Background color applied **before** cell merging to ensure visibility
#### Cell Merging Strategy (generate_accounting_entries.py:179-215)
- Groups entries by `(ReceivedAmount, HandlingFee)` key
- Merges "到账金额" (column A) and "手续费" (column B) for consecutive rows
- Centers content vertically and horizontally
- Re-applies background color after merging
## Configuration
### Exchange Rate
**Priority**: Program reads exchange rate in the following order:
1. **From `exchange_rate.txt` file** (if exists in current directory)
- Create a text file named `exchange_rate.txt` containing only the exchange rate value
- Example: `echo "7.25" > exchange_rate.txt`
- Validation: Rate must be between 0.1 and 100, otherwise falls back to default
2. **From default constant** (if file doesn't exist or contains invalid value)
- Location: `generate_accounting_entries.py:13`
- Default value: `7.1072`
**Error Handling** (generate_accounting_entries.py:16-52):
- File not found → Use default rate
- Invalid format (non-numeric) → Use default rate
- Unreasonable value (<0.1 or >100) → Use default rate
- Any other error → Use default rate
**Examples**:
```bash
# Set custom exchange rate
echo "7.25" > exchange_rate.txt
# Program will display which rate is being used
python3 generate_accounting_entries.py
# Output: "从 exchange_rate.txt 读取汇率: 7.25"
# Remove file to use default
rm exchange_rate.txt
python3 generate_accounting_entries.py
# Output: "汇率文件 exchange_rate.txt 不存在,使用默认汇率: 7.1072"
```
### Column Mapping (data/data.xlsx)
| Field | Column | Notes |
|-------|--------|-------|
| ReceivedAmount | F (6) | Supports merged cells |
| HandlingFee | G (7) | Null → 0 |
| OrderNum | H (8) | Skip if empty |
| Amount | I (9) | Null orders skipped |
| AccountName | O (15) | Formula result (data_only=True) |
### Excel Output Format
**Column widths**: `[12, 10, 18, 12, 25, 25, 8, 15, 25, 25, 10, 10, 12, 15]`
**Headers**:
```
到账金额, 手续费, 订单号, 应收账款, 金蝶名称,
摘要, 借/贷, 科目代码(*), 科目名称(*),
核算项目, 币别, 汇率, 原币金额, 金额
```
**Header style**: Bold, blue background (#CCE5FF), centered
## Important Implementation Notes
1. **Data starts from row 2** (row 1 is header)
2. **Formula handling**: Always use `data_only=True` when loading workbook to read calculated values
3. **Order filtering**: Skip rows where OrderNum is None or empty string
4. **Amount precision**:
- Exchange rate calculations: **3 decimal places** (e.g., 25656.992)
- Debit/credit balance tolerance: **0.001**
5. **核算项目 (Accounting dimension)**:
- 到账金额 (Received amount debit): **Empty/blank**
- 手续费 (Fee debit): **Empty/blank**
- 订单明细 (Order credit): Account name
6. **UTF-8 encoding**: All files use UTF-8 encoding
7. **Error handling**:
- File not found: Exit with error message
- Invalid sheet: Exit with error message
- Invalid data: Log and skip row
- checkRes=false: Mark but continue processing
8. **Performance**: Handles 300+ rows of Excel data generating 500+ accounting entries in <10 seconds
## Testing Guidance
### Test Scenarios (from task.md:253-272)
1. **Single order, no fee**: ReceivedAmount=695, HandlingFee=0, Order[0].Amount=695
- Expected: 2 entries (debit + credit), checkRes=true
2. **Multiple orders with fee**: ReceivedAmount=12125, HandlingFee=25, Orders=[550, 11600]
- Expected: 4 entries, checkRes=true
3. **Amount mismatch**: ReceivedAmount=17270, HandlingFee=0, Orders=[5676, 11450]
- Expected: checkRes=false, pink background on all entries
4. **Null order amount**: ReceivedAmount=240, HandlingFee=25, Order[0].Amount=null
- Expected: Skip order credit entry, no error
## Version History
- **v1.3** (2025-10-20): Enhanced accounting entry validation
- Changed amount precision to **3 decimal places** for exchange rate calculations
- Added per-record debit/credit balance verification with 0.001 tolerance
- Added yellow background (#fff799) marking for imbalanced entries
- Set 核算项目 to empty for 到账金额 debit entries
- Updated balance verification tolerance to 0.001
- **v1.2** (2025-10-17): Added exchange rate file support (`exchange_rate.txt`), intelligent rate validation, improved error handling
- **v1.1** (2025-01-17): Optimized accounting rules - removed redundant debit entries, simplified single-order logic
- **v1.0** (2025-01-17): Initial release with extraction, generation, validation, and error marking features