Engineering Team Playbook
Overview
This playbook defines the standards, processes, and best practices for SkyMirror's engineering team. It ensures consistency, quality, and efficiency across all product development efforts.
Team Lead: CTO (Eric)
Scope: All engineering activities across CheckMet, Traquiva, Software Solutions
Last Updated: December 2024
For detailed tool configurations and integrations, see the Workflow & Tooling Guide.
Engineering Tool Stack
| Tool | Purpose | When to Use |
|---|---|---|
| Linear | Sprint management, product issues | CheckMet, Traquiva, Platform work |
| Jira | Client project management | Software Solutions client work |
| GitHub | Code repository, CI/CD, PRs | All code-related work |
| Slack | Communication, alerts | Daily communication, incidents |
| Notion | Documentation, specs, ADRs | Technical documentation |
Linear Workflow for Product Teams
Backlog → Todo → In Progress → In Review → Done
Issue Naming Convention:
[Type] Brief description
Examples:
- [Feature] Add facial recognition for CheckMet
- [Bug] Fix login timeout on Traquiva
- [Chore] Update dependencies
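The naming convention above could be checked automatically, for example by a bot or a pre-submit script. The following is a minimal sketch under stated assumptions: the allowed types are taken only from the three examples and may be incomplete, and `parse_issue_title` is a hypothetical helper, not part of Linear.

```python
import re

# Assumption: only the types shown in the examples; the real list may be longer.
ALLOWED_TYPES = {"Feature", "Bug", "Chore"}

# "[Type] Brief description": bracketed type, one space, non-empty description.
TITLE_PATTERN = re.compile(r"^\[(?P<type>[A-Za-z]+)\] (?P<description>\S.*)$")


def parse_issue_title(title: str) -> tuple[str, str]:
    """Return (type, description), or raise ValueError if the title is malformed."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        raise ValueError(f"Title does not match '[Type] Brief description': {title!r}")
    issue_type = match.group("type")
    if issue_type not in ALLOWED_TYPES:
        raise ValueError(f"Unknown issue type: {issue_type!r}")
    return issue_type, match.group("description")
```

Calling `parse_issue_title("[Bug] Fix login timeout on Traquiva")` returns the type and description as a pair; a title without the bracketed prefix raises `ValueError`.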
Labels to Use:
- frontend, backend, api, ml, infra, docs
- urgent, high, medium, low
- xs, s, m, l, xl (size)
Jira Workflow for Client Projects
Backlog → Selected → In Progress → Review → Done
Required Fields:
- Client name
- Billable hours (for time tracking)
- Story points
- Sprint assignment
GitHub Integration
Branch Naming (linked to Linear/Jira):
feature/SKY-123-add-user-authentication
bugfix/CLIENT-456-fix-payment-flow
hotfix/SKY-789-critical-security-patch
Commit Messages (auto-link to issues):
feat(auth): add OAuth2 login support SKY-123
fix(payment): resolve timeout issue CLIENT-456
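Auto-linking depends on the issue key being recoverable from the branch name or commit message. The sketch below shows one way to extract it. It assumes that keys are an uppercase project prefix, a dash, and a number (as in `SKY-123` and `CLIENT-456`); both function names are hypothetical and do not describe how the GitHub integrations parse text internally.

```python
import re

# Assumption: issue keys look like SKY-123 or CLIENT-456.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]*-\d+\b")

# <kind>/<ISSUE-KEY>-<kebab-case-slug>, for the three linked branch kinds above.
BRANCH = re.compile(
    r"^(?P<kind>feature|bugfix|hotfix)/"
    r"(?P<key>[A-Z][A-Z0-9]*-\d+)-"
    r"(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def issue_key_from_branch(branch: str) -> str:
    """Return the issue key embedded in a branch name, or raise ValueError."""
    match = BRANCH.match(branch)
    if match is None:
        raise ValueError(f"Branch name does not follow the convention: {branch!r}")
    return match.group("key")


def issue_keys_in_commit(message: str) -> list[str]:
    """Return every issue key mentioned anywhere in a commit message."""
    return ISSUE_KEY.findall(message)
```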
PR Auto-Transitions:
- PR opened → Issue moves to "In Review"
- PR merged to develop → Issue stays "In Review" (QA)
- PR merged to main → Issue moves to "Done"
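The three transition rules can be read as a small lookup from a pull request event to an issue status. This sketch only restates those rules; the event names `"opened"` and `"merged"` and the function itself are illustrative assumptions, not the configuration format of Linear or Jira.

```python
from typing import Optional


def status_after_pr_event(event: str, target_branch: Optional[str] = None) -> str:
    """Map a PR event to the issue status described in the rules above."""
    if event == "opened":
        return "In Review"
    if event == "merged" and target_branch == "develop":
        # Merged to develop: the issue waits in review while QA tests it.
        return "In Review"
    if event == "merged" and target_branch == "main":
        return "Done"
    raise ValueError(f"No transition defined for {event!r} on {target_branch!r}")
```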
Slack Channels for Engineering
| Channel | Purpose |
|---|---|
| #team-engineering | General engineering discussions |
| #product-checkmet | CheckMet development |
| #product-traquiva | Traquiva development |
| #product-solutions | Software Solutions |
| #github-activity | GitHub notifications |
| #linear-updates | Linear sprint updates |
| #alerts-production | Production alerts |
| #alerts-security | Security notifications |
Team Structure
Engineering Organization
CTO (Eric)
├── CheckMet Engineering
│   ├── Tech Lead
│   ├── ML Engineer
│   ├── Backend Developers (2)
│   ├── Frontend Developer
│   └── QA Engineer
├── Traquiva Engineering
│   ├── Tech Lead
│   ├── AI/ML Engineer
│   ├── Full-Stack Developers (2)
│   └── QA Engineer
├── Software Solutions
│   ├── Delivery Manager
│   ├── Tech Lead
│   ├── Senior Developers (2)
│   ├── Mid Developers (2)
│   └── QA Engineer
└── Platform/DevOps
    ├── DevOps Lead
    └── DevOps Engineer
Role Definitions
| Role | Level | Responsibilities | Experience |
|---|---|---|---|
| Junior Developer | L1 | Feature implementation, bug fixes | 0-2 years |
| Mid Developer | L2 | Feature ownership, code reviews | 2-4 years |
| Senior Developer | L3 | Technical leadership, mentoring | 4-7 years |
| Tech Lead | L4 | Architecture, team leadership | 7+ years |
| Principal Engineer | L5 | Cross-team technical strategy | 10+ years |
Development Process
Agile Framework
We follow Scrum with 2-week sprints:
| Event | When | Duration | Purpose | Tool |
|---|---|---|---|---|
| Sprint Planning | Day 1 | 2 hours | Plan sprint work | Linear/Jira |
| Daily Standup | Daily | 15 min | Sync, blockers | Slack Huddle |
| Sprint Review | Day 10 | 1 hour | Demo to stakeholders | Meet + Linear |
| Retrospective | Day 10 | 1 hour | Process improvement | Notion |
| Backlog Refinement | Mid-sprint | 1 hour | Prepare future work | Linear/Jira |
Sprint Workflow
Backlog → Sprint Planning → In Progress → Code Review → QA → Done
Definition of Ready (DoR)
A story is ready for sprint when:
- User story follows format: "As a [user], I want [goal], so that [benefit]"
- Acceptance criteria are clear and testable
- Story is estimated (story points)
- Dependencies are identified
- Design/mockups available (if UI)
- Technical approach discussed
Definition of Done (DoD)
A story is done when:
- Code complete and follows standards
- Unit tests written (>80% coverage)
- Code reviewed and approved
- Integration tests passing
- Documentation updated
- QA tested and approved
- Deployed to staging
- Product owner accepted
Code Standards
General Principles
- Readability: Code should be self-documenting
- Simplicity: Prefer simple solutions over clever ones
- Consistency: Follow established patterns
- Testability: Write testable code
- Security: Security by design
Language-Specific Standards
TypeScript/JavaScript
// File naming: kebab-case
// user-service.ts
// Class naming: PascalCase
class UserService {
  // Method naming: camelCase
  async getUserById(id: string): Promise<User> {
    // Implementation
  }
}
// Constants: SCREAMING_SNAKE_CASE
const MAX_RETRY_ATTEMPTS = 3;
// Interfaces: PascalCase ('I' prefix optional)
interface UserProfile {
  id: string;
  email: string;
  createdAt: Date;
}
Python
# File naming: snake_case
# user_service.py
# Class naming: PascalCase
class UserService:
    # Method naming: snake_case
    # The return type is quoted so the snippet loads even before User is defined.
    def get_user_by_id(self, user_id: str) -> "User":
        """Get user by ID.
        Args:
            user_id: The unique user identifier
        Returns:
            User object if found
        Raises:
            UserNotFoundError: If user doesn't exist
        """
        pass
# Constants: SCREAMING_SNAKE_CASE
MAX_RETRY_ATTEMPTS = 3
Code Review Guidelines
For Authors
- Keep PRs small (under 400 lines)
- Write clear PR descriptions
- Self-review before requesting
- Respond to feedback promptly
- Don't take feedback personally
For Reviewers
- Review within 24 hours
- Be constructive and specific
- Approve when "good enough"
- Use conventional comments:
  - nit: Minor suggestion
  - suggestion: Optional improvement
  - question: Need clarification
  - blocker: Must fix before merge
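Because each review comment starts with a fixed prefix, tooling can tell whether a PR still has unresolved blockers. This is a rough sketch of that idea: the helper names are hypothetical, and it assumes comments are available as plain strings.

```python
from typing import Iterable, Optional

KNOWN_LABELS = {"nit", "suggestion", "question", "blocker"}
BLOCKING_LABELS = {"blocker"}


def comment_label(comment: str) -> Optional[str]:
    """Return the conventional-comment prefix, or None if there is none."""
    label, separator, _ = comment.partition(":")
    label = label.strip().lower()
    return label if separator and label in KNOWN_LABELS else None


def has_blockers(comments: Iterable[str]) -> bool:
    """True if any comment must be fixed before merge."""
    return any(comment_label(c) in BLOCKING_LABELS for c in comments)
```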
Git Workflow
Branch Naming
feature/TICKET-123-add-user-authentication
bugfix/TICKET-456-fix-login-error
hotfix/TICKET-789-critical-security-patch
chore/update-dependencies
Commit Messages
type(scope): subject
[optional body]
[optional footer]
Types: feat, fix, docs, style, refactor, test, chore
Example:
feat(auth): add OAuth2 login support
- Implement Google OAuth2 flow
- Add user session management
- Update login UI
Closes #123
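The header line of a commit can be validated against the `type(scope): subject` format, for instance in a commit hook. The sketch below is one possible check; treating the scope as optional is an assumption (the format above always shows one), and `parse_commit_header` is a hypothetical name.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

HEADER = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"  # assumption: scope may be omitted
    r": (?P<subject>\S.*)$"
)


def parse_commit_header(message: str) -> dict:
    """Parse the first line of a commit message, or raise ValueError."""
    lines = message.splitlines()
    header = lines[0] if lines else ""
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"Commit header is not 'type(scope): subject': {header!r}")
    return match.groupdict()
```

Only the first line is checked; the optional body and footer are left free-form.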
Testing Standards
Test Pyramid
| Level | Coverage | Responsibility |
|---|---|---|
| Unit Tests | 80%+ | Developers |
| Integration Tests | Key flows | Developers |
| E2E Tests | Critical paths | QA |
| Performance Tests | Key endpoints | DevOps |
Unit Testing
// Example: Jest test
describe('UserService', () => {
  describe('getUserById', () => {
    it('should return user when found', async () => {
      // Arrange
      const mockUser = { id: '1', email: 'test@example.com' };
      mockRepository.findById.mockResolvedValue(mockUser);
      // Act
      const result = await userService.getUserById('1');
      // Assert
      expect(result).toEqual(mockUser);
    });
    it('should throw when user not found', async () => {
      // Arrange
      mockRepository.findById.mockResolvedValue(null);
      // Act & Assert
      await expect(userService.getUserById('1'))
        .rejects.toThrow(UserNotFoundError);
    });
  });
});
Test Naming Convention
should_[expected behavior]_when_[condition]
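For Python test suites, the same pattern can be applied to function names, for example `test_should_return_user_when_found`. Below is a small sketch of a name check; the `test_` prefix handling is an assumption made to accommodate pytest-style discovery, and `follows_test_naming` is a hypothetical helper.

```python
import re

# should_<expected behavior>_when_<condition>, written in snake_case.
TEST_NAME = re.compile(r"^should_[a-z0-9_]+?_when_[a-z0-9_]+$")


def follows_test_naming(name: str) -> bool:
    """Check a test function name against the convention above."""
    # Assumption: a leading "test_" (needed for pytest discovery) is ignored.
    return TEST_NAME.match(name.removeprefix("test_")) is not None
```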
Architecture Standards
Microservices Guidelines
| Principle | Description |
|---|---|
| Single Responsibility | One service, one business domain |
| Loose Coupling | Services communicate via APIs |
| High Cohesion | Related functionality together |
| Independent Deployment | Deploy without affecting others |
| Decentralized Data | Each service owns its data |
API Design
REST Conventions
GET /api/v1/users # List users
GET /api/v1/users/:id # Get user
POST /api/v1/users # Create user
PUT /api/v1/users/:id # Update user
DELETE /api/v1/users/:id # Delete user
Response Format
{
  "success": true,
  "data": {
    "id": "123",
    "email": "user@example.com"
  },
  "meta": {
    "timestamp": "2024-12-01T10:00:00Z",
    "requestId": "req-abc-123"
  }
}
Error Response
{
  "success": false,
  "error": {
    "code": "USER_NOT_FOUND",
    "message": "User with ID 123 not found",
    "details": {}
  },
  "meta": {
    "timestamp": "2024-12-01T10:00:00Z",
    "requestId": "req-abc-123"
  }
}
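Both envelopes share the same `meta` block, so a pair of small builders keeps services consistent. The following is a sketch of that idea in Python rather than a prescribed implementation; the function names and the optional `now` parameter (useful for tests) are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _meta(request_id: str, now: Optional[datetime] = None) -> dict:
    # Normalize to UTC so the trailing "Z" is accurate.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "requestId": request_id,
    }


def success_response(data: Any, request_id: str, now: Optional[datetime] = None) -> dict:
    """Build the success envelope shown above."""
    return {"success": True, "data": data, "meta": _meta(request_id, now)}


def error_response(
    code: str,
    message: str,
    request_id: str,
    details: Optional[dict] = None,
    now: Optional[datetime] = None,
) -> dict:
    """Build the error envelope shown above."""
    return {
        "success": False,
        "error": {"code": code, "message": message, "details": details or {}},
        "meta": _meta(request_id, now),
    }
```

Serializing either result with `json.dumps` yields a body in the documented shape.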
Database Guidelines
| Type | Use Case | Technology |
|---|---|---|
| Relational | Transactional data | PostgreSQL |
| Document | Flexible schemas | MongoDB |
| Cache | Session, hot data | Redis |
| Search | Full-text search | Elasticsearch |
DevOps & CI/CD
Pipeline Stages
Code Push → Build → Test → Security Scan → Deploy Staging → Deploy Production
Environment Strategy
| Environment | Purpose | Deployment |
|---|---|---|
| Development | Local development | Manual |
| Staging | Integration testing | Auto on merge to main |
| Production | Live users | Manual approval |
Deployment Checklist
- All tests passing
- Security scan clean
- Performance benchmarks met
- Database migrations tested
- Rollback plan documented
- Monitoring alerts configured
- Stakeholders notified
Monitoring & Alerting
| Metric | Warning | Critical | Action |
|---|---|---|---|
| Error Rate | Over 1% | Over 5% | Investigate |
| Response Time | Over 500ms | Over 2s | Scale/optimize |
| CPU Usage | Over 70% | Over 90% | Scale |
| Memory Usage | Over 75% | Over 90% | Scale/investigate |
| Disk Usage | Over 80% | Over 95% | Cleanup/expand |
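The thresholds in this table translate directly into a classification rule. The sketch below encodes them as data, reading "Over" as a strict comparison. The metric keys and units (percent, milliseconds) are naming assumptions for illustration, not identifiers from any monitoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float
    action: str


# Values copied from the table above; keys and units are assumptions.
THRESHOLDS = {
    "error_rate_pct": Threshold(1, 5, "Investigate"),
    "response_time_ms": Threshold(500, 2000, "Scale/optimize"),
    "cpu_pct": Threshold(70, 90, "Scale"),
    "memory_pct": Threshold(75, 90, "Scale/investigate"),
    "disk_pct": Threshold(80, 95, "Cleanup/expand"),
}


def classify(metric: str, value: float) -> str:
    """Return "ok", "warning", or "critical" for a metric reading."""
    threshold = THRESHOLDS[metric]
    if value > threshold.critical:
        return "critical"
    if value > threshold.warning:
        return "warning"
    return "ok"
```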
Security Standards
OWASP Top 10 Checklist
- Injection prevention (parameterized queries)
- Authentication & session management
- Sensitive data encryption
- XML external entities (XXE) prevention
- Access control implementation
- Security misconfiguration prevention
- XSS prevention
- Insecure deserialization prevention
- Component vulnerability management
- Logging and monitoring
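As an illustration of the first checklist item, a parameterized query passes user input to the driver as a bound value instead of concatenating it into SQL. This sketch uses Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere; the `users` table and helper names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical users row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.execute("INSERT INTO users (email) VALUES (?)", ("user@example.com",))
    return conn


def find_user_by_email(conn: sqlite3.Connection, email: str):
    # The "?" placeholder binds `email` as data, so it is never parsed as SQL.
    cursor = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cursor.fetchone()
```

A classic injection string such as `' OR '1'='1` is compared as a literal email address and matches no row.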
Security Review Checklist
| Area | Check |
|---|---|
| Authentication | MFA, secure password storage |
| Authorization | Role-based access, least privilege |
| Data | Encryption at rest and in transit |
| Input | Validation, sanitization |
| Dependencies | Regular updates, vulnerability scanning |
| Logging | No sensitive data in logs |
| Secrets | Environment variables, secret management |
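For the Secrets row, one common pattern is to read required values from the environment at startup and fail fast when they are missing. A minimal sketch, with `require_env` as a hypothetical helper:

```python
import os


def require_env(name: str) -> str:
    """Read a required secret from the environment; fail fast if it is absent."""
    value = os.environ.get(name)
    if not value:
        # Only the variable name is reported, never a secret value.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

The error message mentions only the variable name, which also keeps it consistent with the Logging row.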
Documentation Standards
Code Documentation
/**
 * Authenticates a user with email and password.
 *
 * @param email - User's email address
 * @param password - User's password (plaintext)
 * @returns JWT token if authentication successful
 * @throws AuthenticationError if credentials invalid
 * @example
 * const token = await authService.login('user@example.com', 'password123');
 */
async login(email: string, password: string): Promise<string> {
  // Implementation
}
README Template
# Service Name
Brief description of the service.
## Getting Started
### Prerequisites
- Node.js 18+
- PostgreSQL 14+
### Installation
```bash
npm install
```
### Configuration
Copy .env.example to .env and configure.
### Running
```bash
npm run dev
```
## API Documentation
Link to API docs or OpenAPI spec.
## Testing
```bash
npm test
```
## Deployment
Deployment instructions.
## Contributing
Contribution guidelines.
On-Call & Incident Response
On-Call Rotation
- Weekly rotation among senior engineers
- Primary and secondary on-call
- Handoff meeting every Monday
Incident Severity Levels
| Level | Description | Response Time | Example |
|-------|-------------|---------------|---------|
| SEV1 | Service down | 15 min | Production outage |
| SEV2 | Major degradation | 30 min | 50% error rate |
| SEV3 | Minor issue | 4 hours | Non-critical bug |
| SEV4 | Low priority | Next business day | UI glitch |
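The response times in this table can be turned into a respond-by deadline for paging or reporting. The sketch below makes several assumptions that the table does not state: business days are Monday to Friday, holidays are ignored, and "next business day" is read as the end of that day.

```python
from datetime import datetime, timedelta

# Response times copied from the table above.
RESPONSE_TIME = {
    "SEV1": timedelta(minutes=15),
    "SEV2": timedelta(minutes=30),
    "SEV3": timedelta(hours=4),
}


def end_of_next_business_day(moment: datetime) -> datetime:
    """Assumption: Monday to Friday are business days; holidays are ignored."""
    day = moment + timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day.replace(hour=23, minute=59, second=59, microsecond=0)


def respond_by(severity: str, detected_at: datetime) -> datetime:
    """Return the latest acceptable first-response time for an incident."""
    if severity == "SEV4":
        return end_of_next_business_day(detected_at)
    return detected_at + RESPONSE_TIME[severity]
```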
Incident Response Process
1. Detect: Alert triggered or user report
2. Triage: Assess severity, assign owner
3. Communicate: Update status page, notify stakeholders
4. Mitigate: Implement quick fix or rollback
5. Resolve: Permanent fix deployed
6. Review: Post-incident review within 48 hours
Post-Incident Review Template
```markdown
# Incident Report: [Title]
**Date:** [Date]
**Duration:** [Duration]
**Severity:** [SEV1-4]
**Owner:** [Name]
## Summary
Brief description of what happened.
## Timeline
- HH:MM - Event 1
- HH:MM - Event 2
## Root Cause
What caused the incident.
## Impact
- Users affected: X
- Revenue impact: €X
- Data loss: None/Description
## Resolution
How was it fixed.
## Action Items
- [ ] Action 1 (Owner, Due Date)
- [ ] Action 2 (Owner, Due Date)
## Lessons Learned
What we learned and how to prevent recurrence.
```
Performance Standards
Response Time Targets
| Endpoint Type | Target | Max |
|---|---|---|
| API Read | Under 100ms | 500ms |
| API Write | Under 200ms | 1s |
| Page Load | Under 2s | 5s |
| Search | Under 500ms | 2s |
Performance Testing
- Load testing before major releases
- Baseline performance metrics
- Automated performance regression tests
Knowledge Sharing
Engineering Meetings
| Meeting | Frequency | Purpose |
|---|---|---|
| Tech Talk | Bi-weekly | Knowledge sharing |
| Architecture Review | Monthly | Design discussions |
| Retrospective | Sprint end | Process improvement |
| 1:1s | Weekly | Career development |
Documentation Requirements
- ADRs for significant decisions
- Runbooks for operational procedures
- API documentation (OpenAPI)
- System architecture diagrams
Career Development
Engineering Ladder
| Level | Title | Focus |
|---|---|---|
| L1 | Junior Developer | Learning, execution |
| L2 | Mid Developer | Independent delivery |
| L3 | Senior Developer | Technical leadership |
| L4 | Tech Lead | Team leadership |
| L5 | Principal Engineer | Organization impact |
Growth Framework
| Dimension | Description |
|---|---|
| Technical Skills | Depth and breadth of expertise |
| Delivery | Shipping quality work |
| Collaboration | Working with others |
| Leadership | Influencing and mentoring |
| Business Impact | Understanding and driving outcomes |
Daily Engineering Workflow
09:00 - Check Linear/Jira for sprint priorities
09:15 - Daily standup (Slack huddle)
09:30 - Deep work: coding, reviews
12:00 - Lunch
13:00 - Meetings, collaboration
15:00 - Code reviews (GitHub)
16:00 - Documentation (Notion)
17:00 - Update Linear/Jira, plan tomorrow
Weekly Engineering Rituals
| Day | Activity | Tool |
|---|---|---|
| Monday | Sprint planning | Linear/Jira |
| Wednesday | Tech talk | Notion + Slack |
| Friday | Retrospective | Notion |
| Friday | Backlog refinement | Linear/Jira |
| Friday | Update engineering docs | Notion |
Document Version: 1.1
Last Updated: December 2024
Owner: CTO (Eric)
Review Cycle: Quarterly