Files
TableProcessing/CLAUDE.md
2025-10-17 13:40:44 +08:00

7.8 KiB
Raw Blame History

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

财务Excel数据处理系统 (Financial Excel Data Processing System) - A Python-based automation system for processing financial Excel data, extracting payment information, and generating standardized accounting entries with data validation and error marking capabilities.

Language: Chinese (中文) - All documentation, comments, and output are in Chinese.

Development Commands

Setup and Installation

# Install dependencies (system-wide)
pip install openpyxl --break-system-packages

# OR use virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate   # Windows
pip install openpyxl

Run the Processing Pipeline

# Step 1: Extract data from Excel to JSON
python3 process_excel.py

# Step 2: Generate accounting entries Excel
python3 generate_accounting_entries.py

# Optional: Analyze Excel structure (debugging tool)
python3 analyze_excel.py

Verify Installation

python3 -c "import openpyxl; print(openpyxl.__version__)"

Architecture and Data Flow

Processing Pipeline

data/data.xlsx (Raw financial data)
    ↓
[process_excel.py] - Extract payment records
    ↓
res.json (Intermediate JSON data)
    ↓
[generate_accounting_entries.py] - Generate accounting entries
    ↓
AccountingEntries.xlsx (Final accounting entry table)

Key Components

  1. process_excel.py - Excel Data Extraction Engine

    • Handles merged and non-merged cells in column F (ReceivedAmount)
    • Extracts orders from rows within merged cell ranges
    • Validates amounts: ReceivedAmount + HandlingFee ≈ Sum(Order[].Amount) (tolerance: 0.01)
    • Uses data_only=True to read formula results from column O
  2. generate_accounting_entries.py - Accounting Entry Generator

    • Creates debit/credit entries following Chinese accounting standards
    • Merges cells for same ReceivedAmount groups
    • Marks validation failures with pink background (#FAD1D4)
    • Applies fixed exchange rate to currency conversions
  3. analyze_excel.py - Structure Analysis Utility

    • Debugging tool to inspect merged cells
    • Preview data structure

Data Structures

res.json Schema

[
  {
    "ReceivedAmount": 12125,      // Column F - supports merged cells
    "HandlingFee": 25,             // Column G - null becomes 0
    "Order": [
      {
        "OrderNum": "XLRQD300T25", // Column H
        "Amount": 550,              // Column I
        "AccountName": "24台湾长荣航运" // Column O - formula result
      }
    ],
    "checkRes": true               // Validation: amount match within 0.01
  }
]

Accounting Entry Rules

For each ReceivedAmount record:

  1. Debit Entry (到账金额) - 1 record per ReceivedAmount

    • Account: 1002.02 - 银行存款 - 中行USD
    • Currency: 美元 (USD)
    • Amount: ReceivedAmount × EXCHANGE_RATE
  2. Debit Entry (手续费) - Only if HandlingFee > 0

    • Account: 5603.03 - 财务费用-手续费
    • Currency: 人民币 (RMB)
    • Amount: HandlingFee × EXCHANGE_RATE
  3. Credit Entries (订单明细) - 1 record per Order

    • Account: 1122 - 应收账款
    • Currency: 美元 (USD)
    • Amount: Order.Amount × EXCHANGE_RATE
    • Display Order.Amount in "应收账款" column
    • Skip orders where Amount is null

Special Processing Logic

Merged Cell Handling (process_excel.py:33-69)

  • get_f_column_ranges(): Identifies all data ranges in column F
  • Handles mixed scenarios: merged and non-merged cells
  • Non-merged cells are treated as single-row ranges
  • Merged cell value read from top-left corner (min_row, min_col)

Validation and Error Marking

  • checkRes calculation: abs((ReceivedAmount + HandlingFee) - Sum(Order[].Amount)) < 0.01
  • Error marking: Pink background (#FAD1D4) applied to all entries where checkRes = false
  • Background color applied before cell merging to ensure visibility

Cell Merging Strategy (generate_accounting_entries.py:178-206)

  • Groups entries by (ReceivedAmount, HandlingFee) key
  • Merges "到账金额" (column A) and "手续费" (column B) for consecutive rows
  • Centers content vertically and horizontally
  • Re-applies background color after merging

Configuration

Exchange Rate

Priority: Program reads exchange rate in the following order:

  1. From exchange_rate.txt file (if exists in current directory)

    • Create a text file named exchange_rate.txt containing only the exchange rate value
    • Example: echo "7.25" > exchange_rate.txt
    • Validation: Rate must be between 0.1 and 100, otherwise falls back to default
  2. From default constant (if file doesn't exist or contains invalid value)

    • Location: generate_accounting_entries.py:13
    • Default value: 7.1072

Error Handling (generate_accounting_entries.py:16-52):

  • File not found → Use default rate
  • Invalid format (non-numeric) → Use default rate
  • Unreasonable value (<0.1 or >100) → Use default rate
  • Any other error → Use default rate

Examples:

# Set custom exchange rate
echo "7.25" > exchange_rate.txt

# Program will display which rate is being used
python3 generate_accounting_entries.py
# Output: "从 exchange_rate.txt 读取汇率: 7.25"

# Remove file to use default
rm exchange_rate.txt
python3 generate_accounting_entries.py
# Output: "汇率文件 exchange_rate.txt 不存在,使用默认汇率: 7.1072"

Column Mapping (data/data.xlsx)

Field Column Notes
ReceivedAmount F (6) Supports merged cells
HandlingFee G (7) Null → 0
OrderNum H (8) Skip if empty
Amount I (9) Null orders skipped
AccountName O (15) Formula result (data_only=True)

Excel Output Format

Column widths: [12, 10, 18, 12, 25, 25, 8, 15, 25, 25, 10, 10, 12, 15]

Headers:

到账金额, 手续费, 订单号, 应收账款, 金蝶名称,
摘要, 借/贷, 科目代码(*), 科目名称(*),
核算项目, 币别, 汇率, 原币金额, 金额

Header style: Bold, blue background (#CCE5FF), centered

Important Implementation Notes

  1. Data starts from row 2 (row 1 is header)

  2. Formula handling: Always use data_only=True when loading workbook to read calculated values

  3. Order filtering: Skip rows where OrderNum is None or empty string

  4. Amount precision: All calculations rounded to 2 decimal places

  5. UTF-8 encoding: All files use UTF-8 encoding

  6. Error handling:

    • File not found: Exit with error message
    • Invalid sheet: Exit with error message
    • Invalid data: Log and skip row
    • checkRes=false: Mark but continue processing
  7. Performance: Handles 300+ rows of Excel data generating 500+ accounting entries in <10 seconds

Testing Guidance

Test Scenarios (from task.md:253-272)

  1. Single order, no fee: ReceivedAmount=695, HandlingFee=0, Order[0].Amount=695

    • Expected: 2 entries (debit + credit), checkRes=true
  2. Multiple orders with fee: ReceivedAmount=12125, HandlingFee=25, Orders=[550, 11600]

    • Expected: 4 entries, checkRes=true
  3. Amount mismatch: ReceivedAmount=17270, HandlingFee=0, Orders=[5676, 11450]

    • Expected: checkRes=false, pink background on all entries
  4. Null order amount: ReceivedAmount=240, HandlingFee=25, Order[0].Amount=null

    • Expected: Skip order credit entry, no error

Version History

  • v1.2 (2025-10-17): Added exchange rate file support (exchange_rate.txt), intelligent rate validation, improved error handling
  • v1.1 (2025-01-17): Optimized accounting rules - removed redundant debit entries, simplified single-order logic
  • v1.0 (2025-01-17): Initial release with extraction, generation, validation, and error marking features