First Commit

yangfan
2025-10-17 13:40:44 +08:00
commit c21e3189e3
16 changed files with 4477 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,236 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
财务Excel数据处理系统 (Financial Excel Data Processing System) - A Python-based automation system for processing financial Excel data, extracting payment information, and generating standardized accounting entries with data validation and error marking capabilities.
**Language**: Chinese (中文) - All documentation, comments, and output are in Chinese.
## Development Commands
### Setup and Installation
```bash
# Install dependencies (system-wide)
pip install openpyxl --break-system-packages
# OR use virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # Linux/Mac
# venv\Scripts\activate # Windows
pip install openpyxl
```
### Run the Processing Pipeline
```bash
# Step 1: Extract data from Excel to JSON
python3 process_excel.py
# Step 2: Generate accounting entries Excel
python3 generate_accounting_entries.py
# Optional: Analyze Excel structure (debugging tool)
python3 analyze_excel.py
```
### Verify Installation
```bash
python3 -c "import openpyxl; print(openpyxl.__version__)"
```
## Architecture and Data Flow
### Processing Pipeline
```
data/data.xlsx (Raw financial data)
        ↓
[process_excel.py] - Extract payment records
        ↓
res.json (Intermediate JSON data)
        ↓
[generate_accounting_entries.py] - Generate accounting entries
        ↓
AccountingEntries.xlsx (Final accounting entry table)
```
### Key Components
1. **`process_excel.py`** - Excel Data Extraction Engine
   - Handles merged and non-merged cells in column F (ReceivedAmount)
   - Extracts orders from rows within merged cell ranges
   - Validates amounts: `ReceivedAmount + HandlingFee ≈ Sum(Order[].Amount)` (tolerance: 0.01)
   - Uses `data_only=True` to read formula results from column O
2. **`generate_accounting_entries.py`** - Accounting Entry Generator
   - Creates debit/credit entries following Chinese accounting standards
   - Merges cells for rows that share the same ReceivedAmount group
   - Marks validation failures with a pink background (#FAD1D4)
   - Applies one exchange rate per run to all currency conversions (see Configuration)
3. **`analyze_excel.py`** - Structure Analysis Utility
   - Debugging tool for inspecting merged cells
   - Previews the data structure
### Data Structures
#### res.json Schema
```json
[
  {
    "ReceivedAmount": 12125,              // Column F - supports merged cells
    "HandlingFee": 25,                    // Column G - null becomes 0
    "Order": [
      {
        "OrderNum": "XLRQD300T25",        // Column H
        "Amount": 550,                    // Column I
        "AccountName": "24台湾长荣航运"    // Column O - formula result
      }
    ],
    "checkRes": true                      // Validation: amount match within 0.01
  }
]
```
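As a rough illustration of how this schema might be consumed, the sketch below parses `res.json` content into typed records. The class names `Order` and `PaymentRecord` and the helper `parse_records` are hypothetical and are not taken from the repository. The sample string copies the single-order example above as-is.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Order:
    order_num: str
    amount: Optional[float]        # null in JSON when column I is empty
    account_name: Optional[str]    # formula result from column O


@dataclass
class PaymentRecord:
    received_amount: float
    handling_fee: float
    orders: List[Order]
    check_res: bool


def parse_records(text: str) -> List[PaymentRecord]:
    """Parse res.json content into typed records (hypothetical helper)."""
    records = []
    for item in json.loads(text):
        orders = [
            Order(o["OrderNum"], o.get("Amount"), o.get("AccountName"))
            for o in item.get("Order", [])
        ]
        records.append(
            PaymentRecord(
                received_amount=item["ReceivedAmount"],
                handling_fee=item.get("HandlingFee") or 0,  # null becomes 0
                orders=orders,
                check_res=item["checkRes"],
            )
        )
    return records


# Sample copied from the schema example (comments removed so it is valid JSON).
sample = """[{"ReceivedAmount": 12125, "HandlingFee": 25,
  "Order": [{"OrderNum": "XLRQD300T25", "Amount": 550,
             "AccountName": "24台湾长荣航运"}],
  "checkRes": true}]"""
records = parse_records(sample)
print(records[0].orders[0].order_num)
```

Note that the `//` comments in the schema listing are annotations only; a real `res.json` file would have to omit them to be parseable by `json.loads`.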
#### Accounting Entry Rules
**For each ReceivedAmount record:**
1. **Debit Entry (到账金额, received amount)** - 1 record per ReceivedAmount
   - Account: `1002.02` - 银行存款 - 中行USD (bank deposits, Bank of China USD)
   - Currency: 美元 (USD)
   - Amount: `ReceivedAmount × EXCHANGE_RATE`
2. **Debit Entry (手续费, handling fee)** - Only if HandlingFee > 0
   - Account: `5603.03` - 财务费用-手续费 (finance expenses, handling fees)
   - Currency: 人民币 (RMB)
   - Amount: `HandlingFee × EXCHANGE_RATE`
3. **Credit Entries (订单明细, order details)** - 1 record per Order
   - Account: `1122` - 应收账款 (accounts receivable)
   - Currency: 美元 (USD)
   - Amount: `Order.Amount × EXCHANGE_RATE`
   - **Display Order.Amount in the "应收账款" column**
   - Skip orders where Amount is null
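The rules above can be sketched as a small function that turns one `res.json` record into a list of entry rows. This is a minimal illustration, not the code in `generate_accounting_entries.py`: the function name `build_entries`, the dictionary keys, and the second order number `ORDER-2` are placeholders invented for the example.

```python
from typing import Dict, List

EXCHANGE_RATE = 7.1072  # default rate quoted in this document


def build_entries(record: Dict, rate: float = EXCHANGE_RATE) -> List[Dict]:
    """Build debit/credit rows for one res.json record (illustrative only)."""

    def entry(side: str, account: str, currency: str, original: float) -> Dict:
        return {
            "side": side,
            "account": account,
            "currency": currency,
            "original": original,                  # amount before conversion
            "amount": round(original * rate, 2),   # converted, 2 decimals
        }

    # Rule 1: one debit entry per ReceivedAmount.
    entries = [entry("debit", "1002.02", "USD", record["ReceivedAmount"])]

    # Rule 2: a fee debit entry only when HandlingFee > 0 (null counts as 0).
    fee = record.get("HandlingFee") or 0
    if fee > 0:
        entries.append(entry("debit", "5603.03", "RMB", fee))

    # Rule 3: one credit entry per order, skipping orders with a null Amount.
    for order in record.get("Order", []):
        if order.get("Amount") is None:
            continue
        row = entry("credit", "1122", "USD", order["Amount"])
        row["order_num"] = order.get("OrderNum")
        entries.append(row)
    return entries


record = {
    "ReceivedAmount": 12125,
    "HandlingFee": 25,
    "Order": [
        {"OrderNum": "XLRQD300T25", "Amount": 550},
        {"OrderNum": "ORDER-2", "Amount": 11600},  # placeholder order number
    ],
}
for e in build_entries(record):
    print(e["side"], e["account"], e["amount"])
```

With the amounts used here, the record yields four rows: two debits followed by two credits.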
### Special Processing Logic
#### Merged Cell Handling (process_excel.py:33-69)
- `get_f_column_ranges()`: Identifies all data ranges in column F
- Handles mixed scenarios: merged and non-merged cells
- Non-merged cells are treated as single-row ranges
- The value of a merged range is read from its top-left cell (min_row, min_col)
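The range logic described above can be sketched without any Excel dependency. The function below is a hypothetical stand-in for `get_f_column_ranges()`, whose real signature is not shown in this document. It assumes the merged spans of column F are already available as `(min_row, max_row)` pairs; with openpyxl these could be collected from `ws.merged_cells.ranges` by keeping the ranges whose `min_col` and `max_col` are both 6.

```python
from typing import List, Tuple


def column_ranges(
    merged: List[Tuple[int, int]], first_row: int, last_row: int
) -> List[Tuple[int, int]]:
    """Split first_row..last_row into (start_row, end_row) ranges.

    `merged` lists the (min_row, max_row) of each merged range in the column.
    Rows outside any merged range become single-row ranges.
    """
    merged_end_by_start = {start: end for start, end in merged}
    ranges = []
    row = first_row
    while row <= last_row:
        # A merged range starting here spans several rows; otherwise one row.
        end = merged_end_by_start.get(row, row)
        ranges.append((row, end))
        row = end + 1
    return ranges


# Rows 3-5 are merged in the column; data occupies rows 2 through 7.
print(column_ranges([(3, 5)], first_row=2, last_row=7))
```

Each resulting range marks the rows whose orders belong to one ReceivedAmount value. Whether the real function also skips empty non-merged cells is not stated here, so this sketch does not attempt it.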
#### Validation and Error Marking
- **checkRes calculation**: `abs((ReceivedAmount + HandlingFee) - Sum(Order[].Amount)) < 0.01`
- **Error marking**: Pink background (#FAD1D4) applied to all entries where checkRes = false
- Background color applied **before** cell merging to ensure visibility
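The `checkRes` formula above can be written as a short function. This is a sketch under two assumptions that the document does not confirm: a null `HandlingFee` counts as 0 (consistent with the schema note), and orders with a null `Amount` contribute nothing to the sum. The name `check_res` is illustrative.

```python
from typing import Iterable, Optional

TOLERANCE = 0.01  # tolerance stated in this document


def check_res(
    received: float, fee: Optional[float], amounts: Iterable[Optional[float]]
) -> bool:
    """Return True when received + fee matches the order total within tolerance."""
    # Assumption: null order amounts are ignored in the sum.
    total = sum(a for a in amounts if a is not None)
    return abs((received + (fee or 0)) - total) < TOLERANCE


print(check_res(12125, 25, [550, 11600]))   # 12150 vs 12150
print(check_res(17270, 0, [5676, 11450]))   # 17270 vs 17126
```

The first call matches exactly and passes; the second differs by 144 and fails, so its entries would receive the pink background.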
#### Cell Merging Strategy (generate_accounting_entries.py:178-206)
- Groups entries by `(ReceivedAmount, HandlingFee)` key
- Merges the "到账金额" (column A, received amount) and "手续费" (column B, handling fee) cells across the consecutive rows of each group
- Centers content vertically and horizontally
- Re-applies background color after merging
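One way to compute which rows to merge is to group consecutive output rows by the `(ReceivedAmount, HandlingFee)` key and record the row span of each group. The sketch below uses `itertools.groupby` for this. The function name `merge_spans` and the row dictionaries are assumptions for illustration, not the repository's code.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def merge_spans(rows: List[Dict], first_row: int = 2) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) sheet spans for groups of 2+ consecutive rows.

    `rows` are the output rows in sheet order; data starts at `first_row`
    because row 1 holds the header.
    """
    spans = []
    row = first_row
    key = lambda r: (r["ReceivedAmount"], r["HandlingFee"])
    for _, group in groupby(rows, key=key):
        size = len(list(group))
        if size > 1:  # a single row needs no merge
            spans.append((row, row + size - 1))
        row += size
    return spans


rows = (
    [{"ReceivedAmount": 12125, "HandlingFee": 25}] * 4
    + [{"ReceivedAmount": 695, "HandlingFee": 0}] * 2
    + [{"ReceivedAmount": 240, "HandlingFee": 25}]
)
print(merge_spans(rows))
```

With openpyxl, each span could then be passed to `ws.merge_cells(start_row=..., start_column=..., end_row=..., end_column=...)` for columns A and B. One limitation of keying on the two amounts alone: two adjacent records with identical values would be merged into a single block.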
## Configuration
### Exchange Rate
**Priority**: The program resolves the exchange rate in the following order:
1. **From the `exchange_rate.txt` file** (if it exists in the current directory)
   - Create a text file named `exchange_rate.txt` containing only the exchange rate value
   - Example: `echo "7.25" > exchange_rate.txt`
   - Validation: The rate must be between 0.1 and 100; otherwise the program falls back to the default
2. **From the default constant** (if the file doesn't exist or contains an invalid value)
   - Location: `generate_accounting_entries.py:13`
   - Default value: `7.1072`
**Error Handling** (generate_accounting_entries.py:16-52):
- File not found → Use default rate
- Invalid format (non-numeric) → Use default rate
- Unreasonable value (<0.1 or >100) → Use default rate
- Any other error → Use default rate
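The fallback behaviour listed above can be sketched as follows. This is a minimal illustration, not the actual code at `generate_accounting_entries.py:16-52`: the function name `load_exchange_rate` is an assumption, and the status messages the real program prints are omitted.

```python
from pathlib import Path

DEFAULT_EXCHANGE_RATE = 7.1072   # default value quoted in this document
RATE_FILE = "exchange_rate.txt"


def load_exchange_rate(
    path: str = RATE_FILE, default: float = DEFAULT_EXCHANGE_RATE
) -> float:
    """Read the rate from `path`, falling back to `default` on any problem."""
    try:
        text = Path(path).read_text(encoding="utf-8")
        rate = float(text.strip())
    except Exception:
        # Missing file, non-numeric content, or any other error.
        return default
    # Reject unreasonable values, including NaN, which fails both comparisons.
    if not 0.1 <= rate <= 100:
        return default
    return rate


print(load_exchange_rate("no_such_exchange_rate_file.txt"))
```

Catching `Exception` broadly mirrors the "any other error" rule; a stricter version might catch only `OSError` and `ValueError` and let unexpected failures surface.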
**Examples**:
```bash
# Set custom exchange rate
echo "7.25" > exchange_rate.txt
# Program will display which rate is being used
python3 generate_accounting_entries.py
# Output: "从 exchange_rate.txt 读取汇率: 7.25" (exchange rate read from exchange_rate.txt: 7.25)
# Remove file to use default
rm exchange_rate.txt
python3 generate_accounting_entries.py
# Output: "汇率文件 exchange_rate.txt 不存在,使用默认汇率: 7.1072" (exchange rate file exchange_rate.txt not found, using default rate: 7.1072)
```
### Column Mapping (data/data.xlsx)
| Field | Column | Notes |
|-------|--------|-------|
| ReceivedAmount | F (6) | Supports merged cells |
| HandlingFee | G (7) | Null → 0 |
| OrderNum | H (8) | Skip if empty |
| Amount | I (9) | Null orders skipped |
| AccountName | O (15) | Formula result (data_only=True) |
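To make the mapping concrete, the sketch below extracts one order from a plain sequence of cell values, with column A at index 0. Such a sequence is what openpyxl's `ws.iter_rows(values_only=True)` yields for each row. The names `COLUMNS` and `parse_order` are hypothetical and not taken from `process_excel.py`.

```python
from typing import Dict, Optional, Sequence

# 1-based column numbers from the mapping table.
COLUMNS = {
    "ReceivedAmount": 6,   # F
    "HandlingFee": 7,      # G
    "OrderNum": 8,         # H
    "Amount": 9,           # I
    "AccountName": 15,     # O
}


def parse_order(row: Sequence) -> Optional[Dict]:
    """Return the order found in `row`, or None when OrderNum is empty."""

    def cell(field: str):
        index = COLUMNS[field] - 1  # convert to a 0-based index
        return row[index] if index < len(row) else None

    order_num = cell("OrderNum")
    if order_num is None or order_num == "":
        return None  # rows without an order number are skipped
    return {
        "OrderNum": order_num,
        "Amount": cell("Amount"),          # may be None; kept in the record
        "AccountName": cell("AccountName"),
    }


row = [None] * 15
row[7], row[8], row[14] = "XLRQD300T25", 550, "24台湾长荣航运"
print(parse_order(row))
```

ReceivedAmount and HandlingFee are not read here because, per the merged-cell handling, they belong to the whole row range rather than to a single order row.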
### Excel Output Format
**Column widths**: `[12, 10, 18, 12, 25, 25, 8, 15, 25, 25, 10, 10, 12, 15]`
**Headers**:
```
到账金额, 手续费, 订单号, 应收账款, 金蝶名称,
摘要, 借/贷, 科目代码(*), 科目名称(*),
核算项目, 币别, 汇率, 原币金额, 金额
```
English equivalents, in the same order: received amount, handling fee, order number, accounts receivable, Kingdee name, summary, debit/credit, account code (*), account name (*), accounting item, currency, exchange rate, original-currency amount, amount.
**Header style**: Bold, blue background (#CCE5FF), centered
## Important Implementation Notes
1. **Data starts from row 2** (row 1 is header)
2. **Formula handling**: Always use `data_only=True` when loading workbook to read calculated values
3. **Order filtering**: Skip rows where OrderNum is None or empty string
4. **Amount precision**: All calculations rounded to 2 decimal places
5. **UTF-8 encoding**: All files use UTF-8 encoding
6. **Error handling**:
   - File not found: Exit with error message (a missing `exchange_rate.txt` is handled separately; see Exchange Rate)
   - Invalid sheet: Exit with error message
   - Invalid data: Log and skip row
   - checkRes=false: Mark but continue processing
7. **Performance**: Processes 300+ Excel rows into 500+ accounting entries in under 10 seconds
## Testing Guidance
### Test Scenarios (from task.md:253-272)
1. **Single order, no fee**: ReceivedAmount=695, HandlingFee=0, Order[0].Amount=695
   - Expected: 2 entries (debit + credit), checkRes=true
2. **Multiple orders with fee**: ReceivedAmount=12125, HandlingFee=25, Orders=[550, 11600]
   - Expected: 4 entries, checkRes=true
3. **Amount mismatch**: ReceivedAmount=17270, HandlingFee=0, Orders=[5676, 11450]
   - Expected: checkRes=false, pink background on all entries
4. **Null order amount**: ReceivedAmount=240, HandlingFee=25, Order[0].Amount=null
   - Expected: Skip order credit entry, no error
## Version History
- **v1.2** (2025-10-17): Added exchange rate file support (`exchange_rate.txt`), intelligent rate validation, improved error handling
- **v1.1** (2025-01-17): Optimized accounting rules - removed redundant debit entries, simplified single-order logic
- **v1.0** (2025-01-17): Initial release with extraction, generation, validation, and error marking features