Skill Seekers: A Documentation-to-AI-Skill Converter
Have a library with excellent documentation but no ready-made skill? Skill Seekers by Yusuf Karaaslan automatically converts documentation, GitHub repositories, and PDFs into ready-to-use Claude AI skills.
Documentation → Skills Pipeline
Skill Seekers addresses a fundamental problem: 99% of useful tools and libraries have no ready-made skills for AI agents, and creating skills by hand is slow and error-prone.
Automatic pipeline:
- SCRAPE — documentation, GitHub repos, PDFs
- ANALYZE — extract knowledge patterns
- ENHANCE — AI-powered skill optimization
- PACKAGE — ready-to-use skill files
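The four stages above can be pictured as functions applied in order to a shared draft. The following is a minimal sketch of that idea only: `SkillDraft`, the stage functions, and `run_pipeline` are hypothetical names made up for illustration and are not part of the Skill Seekers codebase.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SkillDraft:
    """Hypothetical container handed from stage to stage (not a Skill Seekers type)."""
    source: str
    pages: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)


Stage = Callable[[SkillDraft], SkillDraft]


def scrape(draft: SkillDraft) -> SkillDraft:
    # Placeholder: a real scraper would fetch content from draft.source.
    draft.pages.append(f"raw content from {draft.source}")
    draft.history.append("scrape")
    return draft


def analyze(draft: SkillDraft) -> SkillDraft:
    # Placeholder for knowledge-pattern extraction.
    draft.history.append("analyze")
    return draft


def enhance(draft: SkillDraft) -> SkillDraft:
    # Placeholder for the AI-powered optimization step.
    draft.history.append("enhance")
    return draft


def package(draft: SkillDraft) -> SkillDraft:
    # Placeholder for writing the final skill files.
    draft.history.append("package")
    return draft


PIPELINE: list[Stage] = [scrape, analyze, enhance, package]


def run_pipeline(source: str, stages: list[Stage]) -> SkillDraft:
    """Apply each stage in order, passing the draft along."""
    draft = SkillDraft(source=source)
    for stage in stages:
        draft = stage(draft)
    return draft
```

Keeping each stage as a plain function with the same signature means a stage can be skipped or swapped (for example, running without the enhancement step) by editing the list rather than the control flow.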
Supported sources:
- Documentation websites (any structure)
- GitHub repositories (code + README + wiki)
- PDF documents (technical guides, manuals)
- Word documents (.docx)
- YouTube videos (transcription → skill)
- Multi-source unified scraping
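To make the variety of sources concrete, here is a rough sketch of how a tool could tell them apart from a single string. It is an assumption about one possible heuristic, not a description of how Skill Seekers actually detects sources; the function name `detect_source_type`, the returned labels, and the list of video extensions are all hypothetical.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# "owner/repo" shorthand such as "microsoft/TypeScript" (assumed convention).
REPO_SLUG = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")
VIDEO_SUFFIXES = {".mp4", ".mkv", ".webm"}  # illustrative, not exhaustive


def detect_source_type(source: str) -> str:
    """Classify a source string as docs, github, pdf, word, or video."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = parsed.netloc.lower().removeprefix("www.")
        if host in ("youtube.com", "youtu.be"):
            return "video"
        if host == "github.com":
            return "github"
        return "docs"  # any other website is treated as documentation

    # Not a URL: decide by file extension first, then by repo shorthand.
    suffix = PurePosixPath(source).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".docx":
        return "word"
    if suffix in VIDEO_SUFFIXES:
        return "video"
    if REPO_SLUG.match(source):
        return "github"
    raise ValueError(f"cannot determine source type for {source!r}")
```

File extensions are checked before the repository shorthand so that a local path like `guides/manual.pdf` is not mistaken for an `owner/repo` pair.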
Installation and Configuration
pip install skill-seekers
Initial configuration:
# Set up API keys and GitHub tokens
skill-seekers config
# Interactive setup:
# - GitHub Personal Access Token
# - AI enhancement API keys (optional)
# - Output preferences
# - Quality thresholds
Architecture:
Skill Seekers Pipeline:
Source → Scraper → Analyzer → Enhancer → Packager → Deploy
Core Commands
Auto-detection (recommended):
# Detect the source type automatically
skill-seekers create --source "https://docs.react.dev" --name "react-guide"
skill-seekers create --source "microsoft/TypeScript" --name "typescript"
skill-seekers create --source "python-guide.pdf" --name "python-guide"
Documentation scraping:
# Scrape documentation website
skill-seekers scrape --config configs/react.json
# Custom configuration
skill-seekers scrape \
--url "https://docs.nextjs.org" \
--name "nextjs-guide" \
--max-pages 100 \
--depth 3
GitHub repository analysis:
# Scrape GitHub repository
skill-seekers github --repo microsoft/TypeScript --name typescript
# Include specific patterns
skill-seekers github \
--repo facebook/react \
--name react-internals \
--include "*.md,*.ts,*.js" \
--exclude "test/*,example/*"
Configuration Files
Example: React documentation config
{
"name": "react-docs",
"base_url": "https://react.dev",
"start_urls": [
"https://react.dev/learn",
"https://react.dev/reference"
],
"allowed_domains": ["react.dev"],
"exclude_patterns": [
"/blog/",
"/versions/"
],
"max_pages": 200,
"depth_limit": 4,
"extract_code": true,
"include_examples": true,
"quality_threshold": 0.7
}
GitHub repository config:
{
"repo": "microsoft/TypeScript",
"name": "typescript-compiler",
"branches": ["main"],
"include_paths": [
"src/compiler/",
"src/services/",
"lib/"
],
"file_patterns": ["*.ts", "*.md"],
"max_file_size": "1MB",
"extract_comments": true,
"include_tests": false
}
Multi-Source Unified Scraping
Complete knowledge extraction:
# Unified: docs + GitHub + PDF
skill-seekers unified --config configs/complete_react.json
Unified config example:
{
"name": "react-complete",
"sources": [
{
"type": "docs",
"url": "https://react.dev",
"max_pages": 100
},
{
"type": "github",
"repo": "facebook/react",
"include": "packages/react/src/"
},
{
"type": "pdf",
"file": "react-patterns-guide.pdf"
}
],
"merge_strategy": "weighted",
"deduplicate": true
}
AI-Powered Enhancement
Auto-enhancement:
# AI-powered optimization
skill-seekers enhance output/react/
# Background processing (for large skills)
skill-seekers enhance output/react/ --daemon
# Check status
skill-seekers enhance-status
Enhancement features:
- Structure optimization: reorganize content for better flow
- Example generation: create practical usage examples
- Cross-referencing: link related concepts
- Quality scoring: assess completeness and accuracy
- Template generation: standardized skill format
Enhancement workflow presets:
# Show available presets
skill-seekers workflows list
# Use specific workflow
skill-seekers enhance output/react/ --workflow "documentation-to-skill"
# Custom workflow
skill-seekers enhance output/react/ --workflow custom.json
Specialized Extraction
PDF extraction:
# Technical documentation PDF
skill-seekers pdf --file "kubernetes-guide.pdf" --name "k8s-ops"
# With OCR for scanned PDFs
skill-seekers pdf \
--file "legacy-manual.pdf" \
--name "legacy-system" \
--ocr \
--lang en
Video content extraction:
# YouTube technical talks
skill-seekers video \
--url "https://youtube.com/watch?v=tech-talk-id" \
--name "advanced-react" \
--transcript-only
# Local video files
skill-seekers video \
--file "conference-talk.mp4" \
--name "microservices-patterns"
Word document extraction:
# Corporate documentation
skill-seekers word \
--file "api-guidelines.docx" \
--name "api-standards" \
--extract-tables \
--preserve-formatting
Quality Assessment
Built-in quality scoring:
# Assess skill quality
skill-seekers quality output/react/
# Detailed analysis report
skill-seekers quality output/react/ --detailed --report quality-report.html
Quality metrics:
- Completeness: coverage of source material
- Coherence: logical structure and flow
- Accuracy: factual correctness
- Usability: practical applicability
- Examples: quantity and quality of usage examples
Quality thresholds:
{
"quality_thresholds": {
"completeness": 0.8,
"coherence": 0.7,
"accuracy": 0.9,
"usability": 0.8,
"examples": 0.6
}
}
Advanced Features
Incremental updates:
# Update an existing skill without a full rescrape
skill-seekers update output/react/ --check-changes
# Smart update strategy
skill-seekers update output/react/ \
--strategy "delta" \
--preserve-customizations
Multi-language support:
# Documentation in multiple languages
skill-seekers multilang \
--base-url "https://docs.example.com" \
--languages "en,es,fr,de" \
--name "multilang-guide"
Stream processing (for large sources):
# Stream large repositories chunk-by-chunk
skill-seekers stream \
--repo "tensorflow/tensorflow" \
--chunk-size 1000 \
--parallel 4
Resume interrupted jobs:
# Resume after a network interruption
skill-seekers resume --job-id "react-docs-20240310"
# Show active jobs
skill-seekers jobs list
Complete Workflow Example
Example: Create Kubernetes skill
Step 1: Estimate scope
skill-seekers estimate --url "https://kubernetes.io/docs" --depth 3
# Output: ~450 pages, estimated 2-3 hours
Step 2: Multi-source scraping
skill-seekers unified --config k8s-config.json
k8s-config.json:
{
"name": "kubernetes-ops",
"sources": [
{
"type": "docs",
"url": "https://kubernetes.io/docs",
"max_pages": 300,
"focus_sections": ["concepts", "tutorials", "reference"]
},
{
"type": "github",
"repo": "kubernetes/kubernetes",
"include": "docs/",
"exclude": "vendor/"
},
{
"type": "pdf",
"file": "k8s-best-practices.pdf"
}
]
}
Step 3: AI enhancement
skill-seekers enhance output/kubernetes-ops/ --workflow "ops-guide"
Step 4: Quality check
skill-seekers quality output/kubernetes-ops/ --threshold 0.8
Step 5: Package and deploy
skill-seekers package output/kubernetes-ops/
skill-seekers install-agent output/kubernetes-ops.skill
Integration with Claude Code
Auto-install to agent directories:
# Install to Claude Code skills directory
skill-seekers install-agent output/react.skill
# Bulk install multiple skills
skill-seekers install-agent output/*.skill --directory /path/to/claude/skills
Testing generated skills:
# Extract test examples from source
skill-seekers extract-test-examples output/react/
# Generate validation prompts
skill-seekers test-gen output/react/ --count 10
Best Practices
For Documentation Scraping:
- Start with a small scope and expand gradually
- Run skill-seekers estimate before a full scrape
- Configure appropriate depth limits
- Exclude irrelevant sections (blog, changelog)
- Monitor rate limits
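The scoping advice above (depth limits, excluded sections) can be reasoned about offline before any scraping happens. The sketch below reuses the key names from the React config example (`allowed_domains`, `exclude_patterns`, `max_pages`, `depth_limit`), but the matching rules are assumptions: exact host match, substring match on the path, and start URLs counted as depth 0. The functions `in_scope` and `plan_crawl` are hypothetical and work on an in-memory link map rather than the network.

```python
from collections import deque
from urllib.parse import urlparse


def in_scope(url: str, config: dict) -> bool:
    """True if the URL is on an allowed host and hits no excluded path fragment."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in config["allowed_domains"]:
        return False
    return not any(pattern in parsed.path for pattern in config["exclude_patterns"])


def plan_crawl(start_urls: list[str], links: dict[str, list[str]], config: dict) -> list[str]:
    """Breadth-first walk over `links` (url -> outgoing urls), bounded by depth and page count."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque((url, 0) for url in start_urls)
    while queue and len(order) < config["max_pages"]:
        url, depth = queue.popleft()
        if url in seen or depth > config["depth_limit"] or not in_scope(url, config):
            continue
        seen.add(url)
        order.append(url)
        for target in links.get(url, []):
            queue.append((target, depth + 1))
    return order
```

Running a plan like this against a small sample of links is a cheap way to see how much a given depth limit or exclusion pattern cuts the page count before committing to a full scrape.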
For GitHub Repositories:
- Focus on documentation and core source
- Exclude test files unless specifically needed
- Use file pattern filtering
- Respect repository size limits
- Include README and wiki content
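File pattern filtering is easy to try out locally before pointing the tool at a large repository. The sketch below applies glob patterns like the ones in the earlier `--include` and `--exclude` example using Python's standard `fnmatch` module. How Skill Seekers itself interprets these patterns is not documented here, so the semantics shown (include patterns matched against the file name, exclude patterns matched against the full path from the repository root) are an assumption, and `select_files` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def select_files(paths: list[str], include: list[str], exclude: list[str]) -> list[str]:
    """Keep paths whose file name matches an include glob and whose full path matches no exclude glob."""
    selected = []
    for path in paths:
        # Exclude globs are anchored at the repository root, so "test/*"
        # does not match "packages/react/test/foo.js".
        if any(fnmatchcase(path, pattern) for pattern in exclude):
            continue
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in include):
            selected.append(path)
    return selected
```

For example, with include patterns `*.md`, `*.ts`, `*.js` and exclude patterns `test/*`, `example/*`, a top-level `README.md` and `src/index.ts` are kept, while `test/index.test.ts`, `example/app.js`, and `logo.png` are dropped.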
For Quality Enhancement:
- Always run AI enhancement on the final output
- Use workflow presets for consistent results
- Set appropriate quality thresholds
- Review generated examples for accuracy
- Validate cross-references
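One way to make quality thresholds actionable is to treat them as a gate that reports exactly which metrics fall short. The sketch below uses the threshold values from the Quality Assessment example; the score dictionary is invented sample data, and `failing_metrics` is a hypothetical helper rather than part of Skill Seekers. It assumes scores and thresholds are floats in the range 0 to 1 and that a missing metric counts as 0.0.

```python
# Threshold values copied from the quality_thresholds example in this article.
THRESHOLDS = {
    "completeness": 0.8,
    "coherence": 0.7,
    "accuracy": 0.9,
    "usability": 0.8,
    "examples": 0.6,
}


def failing_metrics(scores: dict[str, float], thresholds: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Return {metric: (score, required)} for every metric below its threshold."""
    failures = {}
    for metric, required in thresholds.items():
        score = scores.get(metric, 0.0)  # assumption: a missing metric fails
        if score < required:
            failures[metric] = (score, required)
    return failures


# Invented sample scores, for illustration only.
sample_scores = {
    "completeness": 0.85,
    "coherence": 0.75,
    "accuracy": 0.88,
    "usability": 0.8,
    "examples": 0.4,
}
failures = failing_metrics(sample_scores, THRESHOLDS)
```

With the sample data, `failures` contains `accuracy` and `examples`; a score exactly equal to its threshold (here `usability`) passes.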
For Maintenance:
- Set up incremental update schedules
- Monitor source changes
- Preserve manual customizations
- Version control skill changes
- Regular quality assessments
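Monitoring source changes usually comes down to comparing a stored fingerprint of each page with a fresh one. The sketch below shows that general technique with SHA-256 hashes. It is not a description of how the `update` command works internally, and `fingerprint` and `diff_pages` are hypothetical names.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash used to detect edits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_pages(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored hashes (url -> hash) with freshly fetched text (url -> text)."""
    added, changed = [], []
    for url, text in current.items():
        if url not in previous:
            added.append(url)
        elif previous[url] != fingerprint(text):
            changed.append(url)
    removed = [url for url in previous if url not in current]
    return {"added": sorted(added), "changed": sorted(changed), "removed": sorted(removed)}
```

Only the URLs listed under `added` and `changed` need to be reprocessed, which is what keeps an incremental update cheaper than a full rescrape.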
Advanced Use Cases
Enterprise documentation pipeline:
# Corporate knowledge extraction
skill-seekers unified --config enterprise-config.json
# Sources: internal docs + GitHub enterprise + Confluence + PDFs
# Automated skill generation for internal tools
skill-seekers github --org company-internal --bulk --auto-enhance
Open source project onboarding:
# Generate comprehensive project guide
skill-seekers create --source "https://github.com/apache/kafka" --complete
# Includes: docs, code analysis, examples, troubleshooting
Multi-version documentation:
# Track multiple versions
skill-seekers scrape --url "https://docs.react.dev" --versions "17,18,19"
# Generate version-aware skills
Conclusion
Skill Seekers revolutionizes the process of creating AI skills:
What makes it unique:
- Multi-source intelligence: docs, code, and PDFs unified
- AI-powered enhancement: intelligent processing, not just scraping
- Quality-driven approach: assessment and optimization built in
- Production-ready output: ready for Claude Code integration
- Maintenance automation: incremental updates, not full rewrites
Real impact:
- Reduces skill creation time from days to minutes
- Ensures comprehensive coverage of the sources
- Maintains quality standards automatically
- Keeps skills up to date with source changes
GitHub: https://github.com/yusufkaraaslan/Skill_Seekers
Result: any documentation or repository becomes a Claude Code skill in minutes, not days. The democratization of AI skill creation.
*Skill Seekers: when AI learns from any documentation, automatically.* 🧪