Week 2: Action Item Extractor

Project Overview

This week, you'll build a complete web application with both rule-based and LLM-powered extraction capabilities. The application allows users to:

✓ Create and manage notes
✓ Extract action items using rules
✓ Extract action items using AI
✓ Track completion status

Project Structure

week2/
├── app/
│ ├── main.py # FastAPI application entry
│ ├── config.py # Configuration management
│ ├── models.py # Pydantic models
│ ├── exceptions.py # Custom exceptions
│ ├── db.py # Database layer
│ ├── routers/
│ │ ├── action_items.py # Action items routes
│ │ └── notes.py # Notes routes
│ └── services/
│ └── extract.py # Extraction logic
├── frontend/
│ └── index.html # Web UI
├── tests/
│ └── test_extract.py # Unit tests
├── data/
│ └── app.db # SQLite database
└── README.md # Project documentation

TODO 1: Implement LLM Extraction

Implement LLM-driven extraction function

✅ Complete

Step 1: Set up Ollama client

import json  # needed to parse the model's reply in Step 2

from ollama import chat

from app.config import settings

def extract_action_items_llm(text: str) -> list[str]:
    """
    Extract action items from text using LLM.

    Args:
        text: Input text to extract from

    Returns:
        List of extracted action items
    """
    # Input validation
    if not text or len(text) > settings.max_text_length:
        raise ValueError(f"Text must be 1-{settings.max_text_length} characters")

    # Build prompt
    prompt = f"""Extract action items from the following text.

An action item is a task that needs to be done. Look for:
- Bullet points (-, *, •, or numbered lists)
- Keywords (todo:, action:, next:)
- Imperative sentences starting with verbs like: fix, implement, create, update
- Checkbox markers ([ ], [todo])

Text:
{text}

Return a JSON object with a single key "items" containing an array of action item strings.
If no action items are found, return {{"items": []}}."""

Step 2: Call LLM with structured output

    # Call LLM with JSON format
    response = chat(
        model=settings.ollama_model,
        messages=[{"role": "user", "content": prompt}],
        format="json",  # Enable structured output
        options={"temperature": settings.extraction_temperature}
    )

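    # Parse response
    result = json.loads(response["message"]["content"])

    # Handle different response formats
    if isinstance(result, list):
        return result
    if not isinstance(result, dict):
        # A bare string or number carries no action items
        return []
    if "items" in result:
        return result["items"]
    if "actionItems" in result:
        # Some replies wrap each item in an object with a "description" field
        return [item["description"] for item in result["actionItems"]]
    return []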

Key Design Decisions

✅ format="json" - Constrains the model to emit valid JSON
✅ temperature=0.3 - Low temperature for more consistent output
✅ Flexible parsing - Handles the JSON shapes the model may return (bare list, "items", "actionItems")
✅ Input validation - Rejects empty or oversized text to prevent abuse
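
The flexible-parsing decision can also be isolated in a small pure function, which makes it testable without a running model. The sketch below is an assumption rather than the tutorial's code: `normalize_items` is a hypothetical helper name, and returning an empty list on malformed JSON is only one possible policy.

```python
import json


def normalize_items(raw: str) -> list[str]:
    """Turn a raw JSON reply into a flat list of action-item strings.

    Hypothetical helper; the accepted shapes mirror those handled in Step 2.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []  # assumed policy: malformed output means "no items"

    if isinstance(data, dict):
        if "items" in data:
            data = data["items"]
        elif "actionItems" in data:
            data = data["actionItems"]
        else:
            return []
    if not isinstance(data, list):
        return []

    items: list[str] = []
    for entry in data:
        # Accept plain strings or objects carrying a "description" field
        if isinstance(entry, dict):
            entry = entry.get("description")
        if isinstance(entry, str) and entry.strip():
            items.append(entry.strip())
    return items
```

Dropping non-string entries here keeps the `list[str]` return type honest even when the model returns mixed content.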

TODO 2: Add Unit Tests

Write comprehensive unit tests

✅ Complete - 11 tests passing

Test Coverage

Test Class	Tests	Coverage
TestExtractActionItems	5 tests	Bullets, keywords, empty, no items, imperatives
TestExtractActionItemsLLM	6 tests	LLM extraction scenarios
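
The LLM tests themselves are not shown in this section. If they should run without a live Ollama server, one common approach is to replace the model call with a stub. The sketch below assumes a simplified extractor that receives the chat function as a parameter (`chat_fn`, `extract_with`, and `fake_chat` are hypothetical names); the real `extract_action_items_llm` imports `chat` directly, so it would need `unittest.mock.patch` instead.

```python
import json
from typing import Callable


def extract_with(chat_fn: Callable[..., dict], text: str) -> list[str]:
    """Simplified stand-in for the LLM extractor with an injectable chat call."""
    response = chat_fn(
        messages=[{"role": "user", "content": text}],
        format="json",
    )
    result = json.loads(response["message"]["content"])
    return result.get("items", [])


def fake_chat(**kwargs) -> dict:
    """Stub that mimics the response shape used in Step 2, with fixed content."""
    return {"message": {"content": json.dumps({"items": ["Fix the login bug"]})}}


def test_llm_extraction_with_stub():
    """The extractor should return exactly what the stubbed model replied."""
    items = extract_with(fake_chat, "- Fix the login bug")
    assert items == ["Fix the login bug"]
```

A stubbed test checks the parsing path only; it says nothing about how well a real model follows the prompt.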

Example Test

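# Import path assumed from the project layout (app/services/extract.py)
from app.services.extract import extract_action_items

def test_extract_bullets():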
    """Test extraction from bullet points."""
    text = """
    - Fix the login bug
    * Update the documentation
    • Review PR #123
    """
    items = extract_action_items(text)
    assert len(items) == 3
    assert "fix the login bug" in items[0].lower()

def test_extract_empty():
    """Test extraction from empty text."""
    items = extract_action_items("")
    assert len(items) == 0
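
The tests above target `extract_action_items`, whose rule-based implementation is not shown in this section. A minimal sketch of such an extractor, based only on the heuristics listed in the LLM prompt (bullets, keywords, checkboxes, imperative verbs), might look like the following. The regular expressions and the verb list are assumptions, not the project's actual rules.

```python
import re

# Assumed patterns, derived from the heuristics listed in the LLM prompt
BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+(.*)$")
CHECKBOX = re.compile(r"^\[(?: |todo)\]\s*", re.IGNORECASE)
KEYWORD = re.compile(r"^\s*(?:todo|action|next):\s*(.*)$", re.IGNORECASE)
IMPERATIVE_VERBS = {"fix", "implement", "create", "update"}


def extract_action_items(text: str) -> list[str]:
    """Return action items found by simple line-based rules."""
    items: list[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        bullet = BULLET.match(line)
        keyword = KEYWORD.match(line)
        if bullet:
            # Drop a leading checkbox marker such as "[ ]" if present
            candidate = CHECKBOX.sub("", bullet.group(1)).strip()
        elif keyword:
            candidate = keyword.group(1).strip()
        elif CHECKBOX.match(stripped):
            candidate = CHECKBOX.sub("", stripped).strip()
        elif stripped.split()[0].lower() in IMPERATIVE_VERBS:
            # Compare the whole first word so "Fixture ..." is not mistaken for "fix"
            candidate = stripped
        else:
            continue
        if candidate:
            items.append(candidate)
    return items
```

Because the bullet pattern tolerates leading whitespace, the indented triple-quoted text in `test_extract_bullets` still yields three items.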

Running Tests

# Run all tests
cd week2
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=app --cov-report=html

# Result: ======================== 11 passed in 14.30s =========================

TODO 3: Refactor Code

Refactor for better architecture

✅ Complete

Step 1: models.py - Pydantic Models

Define API contracts with automatic validation

Step 2: config.py - Configuration

Centralized settings with environment variable support

Step 3: exceptions.py - Custom Exceptions

Typed exception hierarchy

Step 4: db.py - Database Layer

Context managers and error handling
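
As an illustration of the context-manager pattern named in Step 4, the sketch below wraps a `sqlite3` connection so that it commits on success, rolls back on database errors, and always closes. `DatabaseError`, `get_connection`, `demo`, and the `notes` table layout are assumptions for illustration; the actual db.py is not shown here.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


class DatabaseError(Exception):
    """Hypothetical application-level error for database failures."""


@contextmanager
def get_connection(path: str) -> Iterator[sqlite3.Connection]:
    """Yield a connection; commit on success, roll back on sqlite errors."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows become accessible by column name
    try:
        yield conn
        conn.commit()
    except sqlite3.Error as exc:
        conn.rollback()
        raise DatabaseError(str(exc)) from exc
    finally:
        conn.close()


def demo() -> list[str]:
    """Create a throwaway in-memory table, insert a note, and read it back."""
    with get_connection(":memory:") as conn:
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT)")
        conn.execute("INSERT INTO notes (title) VALUES (?)", ("Standup",))
        return [row["title"] for row in conn.execute("SELECT title FROM notes")]
```

Wrapping `sqlite3.Error` in an application exception keeps driver details out of the routes, which then only need to know about the app's own exception types.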

Pydantic Models

from datetime import datetime

from pydantic import BaseModel, ConfigDict, Field

class NoteCreate(BaseModel):
    """Request model for creating a note."""
    title: str = Field(..., min_length=1, max_length=200)
    content: str = Field(..., min_length=1, max_length=10000)

class NoteResponse(BaseModel):
    """Response model for a note."""
    # Pydantic v2: allow building the model from objects with attributes
    model_config = ConfigDict(from_attributes=True)

    id: int
    title: str
    content: str
    created_at: datetime
    updated_at: datetime

Configuration

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        case_sensitive=False
    )

    app_name: str = "Action Item Extractor"
    ollama_model: str = "llama3.1:8b"
    max_text_length: int = 10000
    extraction_temperature: float = 0.3

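# Module-level instance that the rest of the app imports
settings = Settings()

# Usage (from another module)
from app.config import settings
model = settings.ollama_model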

TODO 4: Add API Endpoints

Implement new API endpoints

✅ Complete

New Endpoints

POST /extract-llm - Extract action items using LLM

GET /notes - List all notes

POST /notes - Create a new note

Frontend Integration

<!-- Extract with Rules -->
<button onclick="extractWithRules()">
    Extract (Rules)
</button>

<!-- Extract with LLM -->
<button onclick="extractWithLLM()">
    Extract LLM
</button>

<!-- List Notes -->
<button onclick="listNotes()">
    List Notes
</button>

TODO 5: Generate README

Create comprehensive documentation

✅ Complete

README Sections

✅ Project Overview
✅ Features
✅ Quick Start
✅ Installation
✅ API Documentation
✅ Testing Guide
✅ Project Structure
✅ Configuration
✅ Troubleshooting

Key Learnings

1. LLM Integration Best Practices

Prompt Engineering:

✅ Clear instructions beat vague descriptions
✅ Provide examples to help the model understand the task
✅ Specify the output format explicitly
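
To make the "provide examples" point concrete, here is one way a few-shot prompt could be assembled. The prompt in TODO 1 contains no worked examples, so the `EXAMPLES` data and the `build_prompt` helper below are illustrative assumptions.

```python
import json

# Hypothetical worked examples: (input text, expected action items)
EXAMPLES = [
    ("- Fix the login bug\nLunch was good.", ["Fix the login bug"]),
    ("Nothing to do today.", []),
]


def build_prompt(text: str) -> str:
    """Assemble instructions, worked examples, and the target text."""
    parts = [
        "Extract action items from the text.",
        'Return a JSON object with a single key "items" containing an array of strings.',
        "",
    ]
    for sample, expected in EXAMPLES:
        # Show each example in exactly the format the model should produce
        output = json.dumps({"items": expected})
        parts.append(f"Text:\n{sample}")
        parts.append(f"Output: {output}")
        parts.append("")
    parts.append(f"Text:\n{text}")
    parts.append("Output:")
    return "\n".join(parts)
```

Including an example with an empty result shows the model what to return when there is nothing to extract.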

API Calling:

✅ Use format="json" for structured output
✅ Set an appropriate temperature
✅ Handle multiple response formats

2. FastAPI Development Patterns

Type Safety:

✅ Pydantic models define API contracts
✅ Automatic data validation
✅ Auto-generated API docs

Error Handling:

✅ Custom exception classes
✅ Global exception handlers
✅ Appropriate HTTP status codes
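
A sketch of how custom exception classes can map onto HTTP status codes. The class names and the status choices are assumptions, since exceptions.py is not shown; in the FastAPI app such a mapping would typically live inside handlers registered with `@app.exception_handler`, which is omitted here to keep the example dependency-free.

```python
class AppError(Exception):
    """Hypothetical base class for application errors."""
    status_code = 500


class ValidationFailed(AppError):
    """Input was rejected before any processing."""
    status_code = 422


class NoteNotFound(AppError):
    """The requested note does not exist."""
    status_code = 404


class ExtractionFailed(AppError):
    """The LLM backend was unreachable or returned unusable output."""
    status_code = 502


def to_http_error(exc: Exception) -> tuple[int, dict]:
    """Translate an exception into (status code, JSON-serializable body)."""
    if isinstance(exc, AppError):
        return exc.status_code, {"error": type(exc).__name__, "detail": str(exc)}
    # Unknown errors: avoid leaking internals to the client
    return 500, {"error": "InternalServerError", "detail": "Unexpected error"}
```

Keeping the status code on the exception class means a single global handler can serve every subclass.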

3. Code Refactoring Principles

Separation of Concerns:

✅ Routes handle HTTP only
✅ Services contain business logic
✅ Data layer encapsulates the database

Configuration Management:

✅ Centralized configuration
✅ Environment variable overrides
✅ Type-safe access

Achievements Unlocked

🚀 First LLM App - Complete AI-powered application
✅ 11 Tests Passing - Comprehensive test coverage
🔧 Clean Architecture - Professional code structure
📚 Full Documentation - Complete README
