OSS
THIS IS A DRAFT
This text may not be complete.
- title: OSS Training Course
- author: Lukasz Sokolowski
OSS Training Materials
Copyright Notice
Copyright © 2004-2026 by NobleProg Limited All rights reserved.
This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise.
Introduction/Outline
- Code management, versioning, and licensing
- Automation and code quality (best practices)
- Continuous Integration (CI) on GitHub/GitLab
- Automated testing (unit, integration, end-to-end)
- Changelog (Keep a Changelog, Conventional Commits)
- Issue management and roadmap
- Best practices in issue creation (templates, labels, milestones)
- Documentation
- Effective README: objectives, installation, usage, contributions
- Contributing Guide (CONTRIBUTING.md)
- API documentation (Swagger, Sphinx, Docusaurus, etc.)
Main Keys/Concerns
Open Source Software in Research: Strategy, Practice, and Impact
This training introduces Open Source Software (OSS) as a strategic, technical, and scientific asset for research institutes.
Slide 1 — The Strategic Role of OSS in a Research Institute
Open Source Software is a core enabler of modern research.
Key benefits:
- Visibility – public code increases discoverability and citations
- Scientific impact – software becomes a first-class research output
- Collaboration – enables cross-institutional and interdisciplinary work
- Transparency – supports reproducibility and open science
- Sustainability – shared maintenance reduces long-term costs
OSS is not just dissemination — it is research infrastructure.
Slide 2 — OSS and Publicly Funded Research
For publicly funded research, OSS aligns with policy and ethics.
Key considerations:
- Public money → public value
- Increasing mandates for:
  - Open access
  - Open data
  - Open software
- OSS supports:
  - Reproducibility of results
  - Verification and reuse
  - Long-term preservation
Publishing software openly increases return on public investment.
Slide 3 — Licensing Choices: Why They Matter
A license defines how others can use, modify, and redistribute software.
Without a license:
- Default copyright applies, so others have no legal permission to use, modify, or redistribute the code
- Collaboration is blocked
Licensing is a strategic decision, not a technical afterthought.
Slide 4 — Permissive vs. Copyleft Licenses
Two main license families are commonly used in research.
Permissive Licenses
Examples:
- MIT
- BSD
- Apache 2.0
Characteristics:
- Minimal restrictions
- Allow reuse in proprietary software
- Maximize adoption and reuse
Copyleft Licenses
Examples:
- GPL
- LGPL
Characteristics:
- Derivative works must be released under the same license terms (the LGPL relaxes this for software that only links to the library)
- Protect long-term openness
- May limit industrial uptake
Slide 5 — Licensing Implications for Research Institutes
Choosing a license affects downstream impact.
Consider:
- Institutional IP policies
- Industry collaboration goals
- Community expectations
- Compatibility with dependencies
Common guidance:
- MIT / Apache 2.0 for broad dissemination
- GPL / LGPL when enforcing openness is a priority
Always involve:
- Technology transfer office
- Legal/IP advisors
Slide 6 — Minimum Best Practices for Publishing Research Software
Every piece of published research software should meet a minimum quality bar.
Required elements:
- Public version-controlled repository
- Clear license
- Basic documentation
- Versioning and releases
- Citation information
“Working code” is not enough — usable code is the goal.
Slide 7 — Recommended Repository Structure
A simple, understandable structure improves reuse.
Minimum structure:
project/
├── src/ or app/
├── tests/
├── README.md
├── LICENSE
├── CITATION.cff
├── CHANGELOG.md
└── VERSION / tags
Clarity beats complexity.
Slide 8 — Documentation as a Research Output
Documentation enables reuse and scientific validation.
Minimum documentation:
- Project purpose and scope
- Installation instructions
- Usage examples
- Limitations and assumptions
- Contact or maintainer info
Good documentation:
- Reduces support burden
- Increases citations
- Enables reproducibility
Slide 9 — Versioning, Releases, and Citation
Stable versions are essential for scientific referencing.
Best practices:
- Semantic Versioning (MAJOR.MINOR.PATCH)
- Tagged releases in Git
- Archived releases (e.g., Zenodo integration)
Citation:
- Provide CITATION.cff
- Enable DOI minting for releases
Software should be citable like a paper.
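A CITATION.cff file is plain YAML. As a minimal sketch, the snippet below renders one from a few fields. The project title, author name, version, and date are made-up placeholders, and only a small subset of the Citation File Format keys is shown. Values containing double quotes would need escaping, which this sketch does not do.

```python
def render_citation_cff(title, authors, version, date_released):
    """Render a minimal CITATION.cff document as a string.

    `authors` is a list of (given_names, family_names) tuples.
    All values passed in below are illustrative placeholders.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f"date-released: {date_released}",
        "authors:",
    ]
    for given, family in authors:
        lines.append(f'  - given-names: "{given}"')
        lines.append(f'    family-names: "{family}"')
    return "\n".join(lines) + "\n"


cff_text = render_citation_cff(
    title="Example Research Tool",   # hypothetical project name
    authors=[("Ada", "Example")],    # hypothetical author
    version="1.2.0",
    date_released="2024-01-15",
)
print(cff_text)
```

Saving the returned text as CITATION.cff in the repository root is what lets hosting platforms and archives pick up the citation metadata.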
Slide 10 — Governance When Opening Internal Code
Opening code changes responsibilities.
Key governance questions:
- Who can approve changes?
- Who sets project direction?
- How are conflicts resolved?
- What happens if maintainers leave?
Governance should be:
- Lightweight
- Transparent
- Documented
Slide 11 — Roles and Responsibilities
Clearly defined roles prevent burnout and confusion.
Typical roles:
- Maintainers – technical direction, reviews, releases
- Contributors – code, documentation, issues
- Users – feedback and validation
Document roles in:
- CONTRIBUTING.md
- GOVERNANCE.md
Slide 12 — Managing External Contributions
External contributions require structure.
Best practices:
- Use Pull Requests for all changes
- Enforce code review
- Require CI to pass
- Provide contribution guidelines
- Adopt a Code of Conduct
Good processes enable safe and scalable collaboration.
Slide 13 — Positioning OSS as Scientific Impact
Research software is a measurable output.
Recognized impact signals:
- Citations of software DOIs
- External contributors
- Downstream reuse in other projects
- Inclusion in workflows or infrastructures
- Adoption by industry or public bodies
OSS extends impact beyond publications.
Slide 14 — Reporting OSS Impact
To make OSS visible in evaluations:
Track:
- Releases and versions
- Citations (DOIs)
- GitHub/GitLab metrics (stars, forks, contributors)
- Known reusers (projects, institutions)
Describe:
- Scientific problems the software helped address
- Communities served
- Longevity and maintenance
Slide 15 — Technical Repository Management
Sound engineering practices support sustainability.
Key elements:
- Branching model (main + feature branches)
- Pull Request–based workflow
- Automated testing
- Continuous Integration (CI)
- Tagged releases
- Dependency management
Automation protects quality as teams change.
Slide 16 — Branching and PR Workflows
Recommended model:
- main branch always stable
- Feature branches for development
- All changes via Pull Requests
Pull Requests should:
- Reference issues
- Include tests
- Be reviewed
- Pass CI
Slide 17 — CI, Testing, and Dependencies
Minimum technical safeguards:
- Automated tests on every PR
- CI pipelines for reproducibility
- Dependency version pinning
- Regular dependency updates
These practices:
- Reduce regressions
- Support long-term reuse
- Enable external trust
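Dependency version pinning can be checked automatically. The following is a rough sketch, assuming a pip-style requirements file where a pinned entry has the form name==version. The function name and the sample content are illustrative, and environment markers or URL requirements are not handled.

```python
import re

# Matches "name==version", optionally with extras such as "pkg[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = """\
requests==2.31.0
numpy>=1.24      # a range, not a pin
pytest
uvicorn[standard]==0.29.0
"""
print(unpinned_requirements(sample))  # ['numpy>=1.24', 'pytest']
```

A check like this could run as a CI step so that an unpinned dependency fails the pipeline instead of silently changing results later.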
Slide 18 — Key Takeaways
- OSS is a strategic research asset
- Licensing decisions shape impact
- Minimum quality standards matter
- Governance enables sustainable openness
- Software impact is measurable
- Technical discipline supports scientific credibility
Extended Details
Modern Software Development Practices (Python & JavaScript)
Best practices for managing, testing, and documenting software projects built with Python and JavaScript.
Code Management, Versioning, and Licensing
- Use Git for source control
- Branching strategy:
  - main – stable production code
  - feature branches – new development
- Use Semantic Versioning (MAJOR.MINOR.PATCH)
- Add a LICENSE file (MIT or Apache 2.0 commonly used)
- Protect main branches with:
  - Pull / Merge Request reviews
  - Mandatory CI checks
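To make the MAJOR.MINOR.PATCH rule concrete, here is a small sketch of parsing and bumping a version. The Version class is a hypothetical helper written for this example, not part of any library, and it ignores pre-release and build metadata suffixes.

```python
import re
from typing import NamedTuple

_PATTERN = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        """Parse 'MAJOR.MINOR.PATCH', tolerating a leading 'v' as in Git tags."""
        match = _PATTERN.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
        return cls(*(int(group) for group in match.groups()))

    def bump(self, part):
        """Return the next version; lower-order parts reset to zero."""
        if part == "major":
            return Version(self.major + 1, 0, 0)
        if part == "minor":
            return Version(self.major, self.minor + 1, 0)
        if part == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown part: {part!r}")

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"


current = Version.parse("v1.4.2")
print(current.bump("minor"))  # 1.5.0
print(current.bump("major"))  # 2.0.0
```

Because the class is a tuple of integers, versions compare numerically, so 1.10.0 sorts after 1.9.3, which a plain string comparison would get wrong.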
Automation and Code Quality (Python & JS)
Python
- Linters: flake8, pylint
- Formatter: black
- Import sorting: isort
- Type checking: mypy
JavaScript
- Linter: ESLint
- Formatter: Prettier
- Type checking: TypeScript (recommended)
Best practices:
- Run linters and formatters automatically
- Keep functions small and readable
- Follow PEP 8 (Python) and standard JS style guides
Continuous Integration (CI)
CI pipelines automatically validate code on each push or pull request.
Example: GitHub Actions
name: CI
on:
  pull_request:
  push:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install Python dependencies
        run: |
          pip install -r requirements.txt
      - name: Lint Python
        run: |
          flake8 .
          black --check .
      - name: Run Python tests
        run: pytest
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"
      - name: Install JS dependencies
        run: npm ci
      - name: Lint JS
        run: npm run lint
      - name: Run JS tests
        run: npm test
Example: GitLab CI
stages:
  - lint
  - test
python_lint:
  stage: lint
  image: python:3.11
  script:
    - pip install flake8 black
    - flake8 .
    - black --check .
python_test:
  stage: test
  image: python:3.11
  script:
    - pip install -r requirements.txt
    - pytest
js_lint:
  stage: lint
  image: node:20
  script:
    - npm ci
    - npm run lint
js_test:
  stage: test
  image: node:20
  script:
    - npm ci
    - npm test
Benefits:
- Early detection of issues
- Enforced quality standards
- Reliable and repeatable builds
Automated Testing
Python
- Frameworks: pytest, unittest
- Tools:
  - pytest-cov (coverage)
  - requests-mock / responses (API mocking)
JavaScript
- Unit & integration: Jest, Vitest
- End-to-end (E2E): Cypress, Playwright
Best practices:
- Run tests automatically in CI
- Test behavior, not implementation details
- Keep test execution fast
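As an illustration of testing behavior rather than implementation, the sketch below pairs a small function with pytest-style tests. The function normalize_doi and the DOI value are invented for this example. The tests only assert on input and output, so the function body can be rewritten without touching them.

```python
def normalize_doi(raw):
    """Return a bare, lower-case DOI from a DOI string or a doi.org URL."""
    doi = raw.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


# pytest collects functions named test_*; each one checks observable
# behavior (input -> output), not how normalize_doi works internally.
def test_strips_resolver_url():
    assert normalize_doi("https://doi.org/10.1234/EXAMPLE.567") == "10.1234/example.567"


def test_accepts_bare_doi():
    assert normalize_doi(" 10.1234/example.567 ") == "10.1234/example.567"


def test_strips_doi_prefix():
    assert normalize_doi("doi:10.1234/example.567") == "10.1234/example.567"
```

Saved as a file whose name starts with test_, these functions are picked up by running pytest with no further configuration.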
Changelog and Commit Standards
- Maintain CHANGELOG.md
- Follow Keep a Changelog structure, grouping entries under:
  - Added
  - Changed
  - Deprecated
  - Removed
  - Fixed
  - Security
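A release entry in this style can be generated from structured data. The following is a minimal sketch using the change types defined by Keep a Changelog (Added, Changed, Deprecated, Removed, Fixed, Security). The function name and the sample entries are illustrative.

```python
# Section order used by Keep a Changelog.
SECTIONS = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def render_release(version, date, changes):
    """Render one release as a Keep a Changelog style Markdown section.

    `changes` maps a section name to a list of entry strings.
    """
    unknown = set(changes) - set(SECTIONS)
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    lines = [f"## [{version}] - {date}"]
    for section in SECTIONS:
        entries = changes.get(section, [])
        if entries:
            lines += ["", f"### {section}"] + [f"- {entry}" for entry in entries]
    return "\n".join(lines) + "\n"


entry = render_release(
    "1.2.0",
    "2024-01-15",
    {"Fixed": ["Handle empty input files"], "Added": ["CSV export"]},
)
print(entry)
```

Sections are emitted in a fixed order regardless of the order in the input, which keeps successive releases visually consistent.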
Conventional Commits
- feat: new feature
- fix: bug fix
- docs: documentation
- test: tests
- chore: maintenance
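Commit headers in this format are easy to check mechanically, for example in a commit hook. The sketch below is a simplified parser that accepts only the five types listed above. The regular expression is an approximation for illustration, not the full Conventional Commits grammar.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "docs", "test", "chore")

HEADER = re.compile(
    r"^(?P<type>[a-z]+)"        # commit type, e.g. feat
    r"(\((?P<scope>[^)]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"         # optional breaking-change marker
    r": (?P<subject>.+)$"       # colon, space, then the description
)


def parse_header(header):
    """Return the parts of a commit header, or None if it is not valid."""
    match = HEADER.match(header)
    if match is None or match.group("type") not in ALLOWED_TYPES:
        return None
    return {
        "type": match.group("type"),
        "scope": match.group("scope"),
        "breaking": match.group("breaking") is not None,
        "subject": match.group("subject"),
    }


print(parse_header("feat(api): add export endpoint"))
print(parse_header("Fixed stuff"))  # None
```

A project would typically extend ALLOWED_TYPES to match whatever types its contributing guide permits.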
Issue Management and Roadmap
- Use issues to track bugs, features, and technical debt
- Organize work using milestones and boards
- Reference issues in commits and merge requests
Best Practices in Issue Creation
- Use issue templates (bug / feature)
- Apply labels:
  - python
  - javascript
  - bug
  - enhancement
  - documentation
- Always include clear reproduction steps for bugs
Documentation
Effective README
A strong README.md includes:
- Project overview
- Python / Node.js requirements
- Installation steps
- Usage examples
- Testing instructions
- License
Contributing Guide (CONTRIBUTING.md)
Should define:
- Environment setup
- Coding standards
- Commit conventions
- Pull Request workflow
API Documentation
Python
- Sphinx – documentation from docstrings
- FastAPI – automatic OpenAPI / Swagger
- MkDocs – lightweight docs
JavaScript
- Swagger / OpenAPI – REST APIs
- JSDoc – inline documentation
- Docusaurus – documentation portals
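For the Python side, the example below shows a docstring written with the reStructuredText field lists that Sphinx's autodoc extension can turn into API documentation. The function itself is a made-up illustration.

```python
def moving_average(values, window):
    """Compute the simple moving average of a sequence.

    :param values: numeric samples in chronological order
    :type values: list[float]
    :param window: number of samples per average, at least 1
    :type window: int
    :returns: one average per full window
    :rtype: list[float]
    :raises ValueError: if ``window`` is smaller than 1

    >>> moving_average([1, 2, 3, 4], 2)
    [1.5, 2.5, 3.5]
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


print(moving_average([1, 2, 3, 4], 2))
```

The interactive example inside the docstring doubles as a test when run through Python's doctest module, which keeps the documentation from drifting away from the code.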
Recommended Project Structure
Example structure for a combined Python + JavaScript repository:
project-root/
├── backend/
│ ├── app/
│ │ ├── __init__.py
│ │ ├── main.py
│ │ ├── api/
│ │ └── services/
│ ├── tests/
│ ├── requirements.txt
│ └── pyproject.toml
│
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ └── services/
│ ├── tests/
│ ├── package.json
│ └── package-lock.json
│
├── docs/
│ ├── api/
│ └── guides/
│
├── .github/
│ └── workflows/
├── .gitlab-ci.yml (GitLab alternative to .github/workflows/)
│
├── CHANGELOG.md
├── CONTRIBUTING.md
├── README.md
└── LICENSE
Key Takeaways
- CI enforces quality for Python and JavaScript
- Automated testing reduces regressions
- Clear structure improves maintainability
- Documentation is part of the codebase
Open Source Best Practices (Python & JavaScript)
This section extends the project guidelines with patterns commonly used in successful open source projects.
Separate CI Pipelines per Service
In multi-service or monorepo projects, each service should have an independent CI pipeline.
Benefits:
- Faster CI execution
- Clear ownership per service
- Reduced coupling between frontend and backend
GitHub Actions (Per Service)
Each service has its own workflow file.
.github/workflows/
├── backend-ci.yml
└── frontend-ci.yml
Example: Backend CI
name: Backend CI
on:
  push:
    paths:
      - "backend/**"
  pull_request:
    paths:
      - "backend/**"
jobs:
  backend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r backend/requirements.txt
      - run: flake8 backend
      - run: pytest backend/tests
Example: Frontend CI
name: Frontend CI
on:
  push:
    paths:
      - "frontend/**"
  pull_request:
    paths:
      - "frontend/**"
jobs:
  frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: cd frontend && npm ci
      - run: cd frontend && npm run lint
      - run: cd frontend && npm test
GitLab CI (Per Service)
backend:
  stage: test
  rules:
    - changes:
        - backend/**/*
  image: python:3.11
  script:
    - pip install -r backend/requirements.txt
    - pytest backend/tests
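The path filters above boil down to a simple rule: a service's pipeline runs only if a changed file lies under that service's directory. The sketch below models that rule in Python. The mapping and function name are hypothetical, and fnmatch patterns are only an approximation of the glob syntax the CI platforms use.

```python
from fnmatch import fnmatch

# Hypothetical mapping of service name to the path patterns that trigger it.
# In fnmatch, "*" also matches "/" so "backend/*" covers nested files.
SERVICE_PATHS = {
    "backend": ["backend/*"],
    "frontend": ["frontend/*"],
}


def services_to_build(changed_files):
    """Return the sorted names of services touched by the changed files."""
    hit = set()
    for path in changed_files:
        for service, patterns in SERVICE_PATHS.items():
            if any(fnmatch(path, pattern) for pattern in patterns):
                hit.add(service)
    return sorted(hit)


print(services_to_build(["backend/app/main.py", "README.md"]))  # ['backend']
print(services_to_build(["docs/guides/setup.md"]))              # []
```

A documentation-only change triggers neither pipeline, which is where the faster CI execution listed under the benefits comes from.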
Monorepo vs Multirepo
Choosing the right repository strategy is critical for scalability.
| Aspect | Monorepo | Multirepo |
|---|---|---|
| Code location | Single repository | One repository per service |
| CI complexity | Higher | Lower |
| Dependency sharing | Easy | Requires versioning |
| Access control | Unified | Granular |
| Tooling | Requires advanced CI | Simpler |
| Open source friendliness | Good for small teams | Best for large ecosystems |
Recommendations:
- Monorepo – small teams, tight coupling, shared releases
- Multirepo – independent services, different release cycles, large communities
Issue Templates (Wiki Format)
Clear issue templates improve collaboration and contributor experience.
Bug Report
== Description ==
A clear and concise description of the bug.
== Steps to Reproduce ==
# Step 1
# Step 2
# Step 3
== Expected Behavior ==
What you expected to happen.
== Actual Behavior ==
What actually happened.
== Environment ==
* OS:
* Python / Node.js version:
* Browser (if applicable):
== Additional Context ==
Logs, screenshots, or links.
Feature Request
== Summary ==
Short description of the requested feature.
== Motivation ==
Why is this feature needed?
== Proposed Solution ==
Describe the preferred solution.
== Alternatives ==
Other approaches considered.
== Additional Context ==
Links, mockups, or references.
Documentation Issue
== Documentation Section ==
Which page or file needs improvement?
== Problem ==
What is unclear, missing, or incorrect?
== Suggested Improvement ==
Proposed text or structure.
Open Source Project Best Practices
These practices help attract and retain contributors.
Governance and Transparency
- Define maintainers and roles
- Use public roadmaps
- Make decisions in issues and PRs
Contribution Experience
- Clear README and CONTRIBUTING.md
- Friendly issue templates
- Label beginner issues (e.g. good first issue)
Licensing and Legal
- Always include a LICENSE file
- Ensure dependencies are license-compatible
- Avoid committing secrets or credentials
Community Standards
- Add a Code of Conduct (e.g. Contributor Covenant)
- Enforce respectful communication
- Moderate discussions consistently
Release Management
- Use semantic versioning
- Maintain a changelog
- Tag releases
- Automate releases where possible
Security
- Provide a SECURITY.md
- Define responsible disclosure process
- Keep dependencies up to date
Open Source Checklist
- README.md
- CONTRIBUTING.md
- CHANGELOG.md
- LICENSE
- CODE_OF_CONDUCT.md
- SECURITY.md
- CI pipelines enabled
- Issue and PR templates
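The file-based items in this checklist can be verified with a short script. This is a rough sketch: the function name is illustrative, and it only checks that the files exist in the repository root, not that their content is adequate.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = [
    "README.md",
    "CONTRIBUTING.md",
    "CHANGELOG.md",
    "LICENSE",
    "CODE_OF_CONDUCT.md",
    "SECURITY.md",
]


def missing_files(repo_root):
    """Return the checklist files that are absent from the repository root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demonstration on a throwaway directory holding only two of the files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("# Demo\n")
    (Path(tmp) / "LICENSE").write_text("MIT\n")
    demo_missing = missing_files(tmp)

print(demo_missing)
```

Run against a real checkout, for example with missing_files("."), an empty result means the file part of the checklist is satisfied.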
Open Source Collaboration and Release Management
This section defines contribution workflows, security policies, community standards, and automated releases.
Pull Request Templates
Pull Request templates help reviewers and contributors align on expectations.
Pull Request Template (General)
## Description
Brief summary of the changes introduced by this PR.
## Related Issue
Closes #<issue-number>
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Documentation update
- [ ] Refactoring
- [ ] CI / tooling
## How Has This Been Tested?
Describe the tests that you ran.
## Checklist
- [ ] Code follows project style guidelines
- [ ] Tests added or updated
- [ ] Documentation updated (if applicable)
- [ ] CI pipeline passes
Best practices:
- Require PR templates for all contributions
- Enforce reviews via branch protection
- Keep PRs small and focused
Security Policy (SECURITY.md)
Open source projects should clearly define how to report vulnerabilities.
Example SECURITY.md
# Security Policy
## Supported Versions
Only the latest major version is actively supported with security updates.
## Reporting a Vulnerability
If you discover a security vulnerability, please do NOT open a public issue.
Instead, report it by emailing: security@project-domain.example
Please include:
- A description of the vulnerability
- Steps to reproduce
- Potential impact
- Suggested remediation (if available)
We aim to respond within 72 hours.
Best practices:
- Never discuss vulnerabilities publicly before a fix
- Acknowledge reporters responsibly
- Publish security advisories after resolution
Code of Conduct (CODE_OF_CONDUCT.md)
A Code of Conduct creates a safe and welcoming community.
Example CODE_OF_CONDUCT.md
# Code of Conduct
## Our Pledge
We are committed to providing a respectful and inclusive environment for everyone.
## Expected Behavior
- Be respectful and considerate
- Use welcoming and inclusive language
- Accept constructive criticism
- Focus on what is best for the community
## Unacceptable Behavior
- Harassment or discrimination
- Trolling or personal attacks
- Publishing private information
## Enforcement
Project maintainers are responsible for enforcing this code of conduct.
## Reporting
Report incidents to: conduct@project-domain.example
Recommendation:
- Use the Contributor Covenant as a base
- Enforce consistently and transparently
Release Automation (semantic-release)
Automated releases reduce human error and ensure consistency.
What semantic-release Does
- Determines next version from commit messages
- Generates changelog entries
- Creates Git tags and releases
- Publishes artifacts automatically
Commit Requirements
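semantic-release derives the release type from commit messages. By default it follows the Angular commit convention, on which Conventional Commits is based:
- feat: introduces a new feature (MINOR)
- fix: bug fix (PATCH)
- a BREAKING CHANGE: footer in the commit body (MAJOR); the feat!: shorthand is recognized only if the configured commit-analyzer preset supports it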
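The mapping from commits to a release type can be sketched as follows. This is a simplified model written for illustration, not semantic-release's actual implementation: it treats both the ! marker and a BREAKING CHANGE: footer as major, and picks the highest level found across all commits since the last release.

```python
import re

BUMP_ORDER = {"patch": 1, "minor": 2, "major": 3}


def release_type(messages):
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    level = None
    for message in messages:
        header = message.splitlines()[0] if message else ""
        if re.match(r"^\w+(\([^)]*\))?!:", header) or "BREAKING CHANGE:" in message:
            candidate = "major"
        elif header.startswith(("feat:", "feat(")):
            candidate = "minor"
        elif header.startswith(("fix:", "fix(")):
            candidate = "patch"
        else:
            continue  # docs, chore, test, etc. do not trigger a release
        if level is None or BUMP_ORDER[candidate] > BUMP_ORDER[level]:
            level = candidate
    return level


print(release_type(["fix: handle empty file", "feat(api): add export"]))  # minor
print(release_type(["docs: update README"]))                              # None
```

A None result corresponds to the case where no release is published because nothing user-visible changed.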
Example semantic-release Configuration (.releaserc.json)
{
  "branches": ["main"],
  "plugins": [
    "@semantic-release/commit-analyzer",
    "@semantic-release/release-notes-generator",
    "@semantic-release/changelog",
    "@semantic-release/github"
  ]
}
GitHub Actions: Automated Release
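name: Release
on:
  push:
    branches:
      - main
jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write       # create tags and GitHub releases
      issues: write         # comment on released issues
      pull-requests: write  # comment on released pull requests
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # semantic-release needs the full history and tags
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm ci
      - run: npx semantic-release
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}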
Python + semantic-release Notes
- semantic-release manages versions and tags
- Python packages should:
  - Read version from git tags
  - Or inject version during build (setuptools_scm)
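When the version is derived from tags at build time, the package should not hard-code it a second time. One common pattern, sketched below, reads the version from the installed package metadata at runtime. The distribution name and the fallback string are placeholders for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def package_version(dist_name, default="0.0.0+unknown"):
    """Return the installed version of a distribution, or a fallback.

    The fallback covers running from a source checkout that has not
    been installed, where no package metadata exists.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return default


# "example-research-tool" is a hypothetical distribution name.
print(package_version("example-research-tool"))
```

Assigning the result to __version__ in the package's __init__.py keeps a single source of truth: the Git tag that the release automation created.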
Open Source Release Best Practices
- Use automated releases
- Never manually edit versions
- Always release from main branch
- Keep CHANGELOG.md generated automatically
- Tag every release
Final Open Source Readiness Checklist
- README.md
- CONTRIBUTING.md
- CHANGELOG.md
- LICENSE
- CODE_OF_CONDUCT.md
- SECURITY.md
- Issue templates
- Pull Request templates
- CI pipelines per service
- Automated releases enabled
Advanced Open Source Project Setup
This section completes the open source framework with publishing automation, governance, labeling standards, and repository structure.
Automated Package Publishing
Automated publishing ensures consistent, repeatable releases.
PyPI Publishing (Python)
Best practice:
- Publish only from tagged releases
- Use CI for trusted publishing
GitHub Actions: Publish to PyPI
name: Publish Python Package
on:
  release:
    types: [published]
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write  # required for PyPI Trusted Publishing (OIDC)
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install build
      - run: python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
Requirements:
- pyproject.toml configured
- Trusted Publishing enabled in PyPI
npm Publishing (JavaScript)
Best practice:
- Use semantic-release
- Publish only from main branch
GitHub Actions: Publish to npm
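# Use this instead of the Release workflow, not alongside it,
# so that semantic-release runs only once per push.
# Publishing to npm assumes "@semantic-release/npm" is in the plugins list.
name: Publish npm Package
on:
  push:
    branches:
      - main
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      contents: write       # create tags and GitHub releases
      issues: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # semantic-release needs the full history and tags
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          registry-url: https://registry.npmjs.org
      - run: npm ci
      - run: npx semantic-release
        env:
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}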
GitHub Labels Taxonomy
A consistent label system improves triage and contributor onboarding.
Type Labels
- bug
- enhancement
- documentation
- refactor
- security
- question
Priority Labels
- priority: critical
- priority: high
- priority: medium
- priority: low
Status Labels
- status: triage
- status: blocked
- status: in progress
- status: ready for review
Scope / Stack Labels
- python
- javascript
- frontend
- backend
- api
- ci
Community Labels
- good first issue
- help wanted
- breaking change
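A taxonomy is only useful if it is applied consistently, and that can be checked during triage. The sketch below assumes a project policy of exactly one type label and at most one priority and one status label per issue. That policy is an assumption made for this example, not something the taxonomy itself requires.

```python
TYPE_LABELS = {
    "bug", "enhancement", "documentation", "refactor", "security", "question",
}


def label_problems(labels):
    """Return a list of human-readable problems with an issue's labels."""
    problems = []
    types = [label for label in labels if label in TYPE_LABELS]
    if len(types) != 1:
        problems.append(f"expected exactly one type label, found {len(types)}")
    for prefix in ("priority: ", "status: "):
        matching = [label for label in labels if label.startswith(prefix)]
        if len(matching) > 1:
            problems.append(f"more than one '{prefix.strip()}' label: {matching}")
    return problems


print(label_problems(["bug", "priority: high", "python"]))   # []
print(label_problems(["priority: high", "priority: low"]))
```

Scope and community labels are left unconstrained here because an issue can reasonably carry several of them.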
Maintainers and Governance Model
Clear governance improves trust and sustainability.
Roles
- Maintainers
  - Own project direction
  - Review and merge PRs
  - Manage releases
- Contributors
  - Submit issues and PRs
  - Improve code and documentation
Decision Making
- Decisions are made publicly in issues or PRs
- Maintainers aim for consensus
- Maintainer vote is final when consensus cannot be reached
Becoming a Maintainer
- Consistent high-quality contributions
- Community engagement
- Invitation by existing maintainers
Governance File
Recommended file:
- GOVERNANCE.md
Complete Open Source Starter Repository Structure
Recommended structure for a Python + JavaScript open source project:
project-root/
├── backend/
│ ├── app/
│ │ ├── __init__.py
│ │ ├── main.py
│ │ ├── api/
│ │ └── services/
│ ├── tests/
│ ├── pyproject.toml
│ └── README.md
│
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ └── services/
│ ├── tests/
│ ├── package.json
│ └── README.md
│
├── docs/
│ ├── api/
│ ├── guides/
│ └── README.md
│
├── .github/
│ ├── workflows/
│ │ ├── backend-ci.yml
│ │ ├── frontend-ci.yml
│ │ └── release.yml
│ ├── ISSUE_TEMPLATE/
│ └── PULL_REQUEST_TEMPLATE.md
│
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── GOVERNANCE.md
├── LICENSE
├── README.md
├── SECURITY.md
└── .releaserc.json
Open Source Maturity Checklist
- Automated CI per service
- Automated releases
- PyPI and npm publishing
- Clear contribution workflow
- Governance defined
- Labels and templates configured
- Security policy documented
- Code of conduct enforced
Final Notes
Well-maintained open source projects prioritize:
- Automation over manual work
- Transparency over private decisions
- Documentation over tribal knowledge
- Community over individual ownership